1 Introduction and Brief Literature Review

Underground coal mining induces movement and deformation of the overlying strata and eventually causes ground subsidence, which has been a prominent concern for both local governments and coal mines, especially in China, where considerable coal deposits are buried under buildings, railways, and other infrastructure. Land reclamation, foundation stability evaluation, and safety assessment of structures all demand an accurate prediction of surface subsidence in mining-affected areas (Malinowska and Hejmanowski 2010; Cheng et al. 2017; Sun et al. 2021). In this regard, numerical methods are promising for understanding the mechanism behind strata movement and for predicting surface subsidence in a more realistic and direct manner. One clear benefit over other methods is their capacity to merge large amounts of field and laboratory test data into one model that reflects the complex interaction between rock masses from deep underground to the surface.

The mining-affected rock masses, whose mechanical properties vary not only along the gravity direction but may also differ greatly along the horizontal directions (see, for example, Fig. 1), can extend from several kilometers to tens of kilometers (Stavropoulou et al. 2007; Satter and Iqbal 2016), depending on the coal depth and mining area. At such a scale, the variation of rock mass properties and strata thickness along horizontal directions should be carefully modeled, but it is often neglected in numerical modeling because of the difficulty of obtaining sufficient data (Bieniawski 1989; Zhang and Einstein 2004) and the limitations of software and computing hardware (Barla and Barla 2000; Shreedharan and Kulatilake 2016).

Fig. 1 Measured elastic modulus of rock specimens cored from three boreholes (see Fig. 8 for locations) in the Ying-Pan-Hao coal mine at depths from 575 to 725 m below the present surface

In fact, compromises have long been made between modeling detail and these limitations. For instance, Adhikary and Guo (2014) built a COSFLOW model (4.5 × 4.5 × 0.75 km) with ten rock units, on the basis of a typical geological log simplified from borehole investigations, to study mining-induced changes in strata permeability. Zhang et al. (2016) built a 3DEC model (1.63 × 1.0 × 0.55 km) incorporating ten stratigraphic units, based upon a generalized stratigraphy column, to analyze mining-induced valley closure movements. Xu et al. (2013a) also used generalized strata to construct a FLAC3D model (10.5 × 9.0 × 1.8 km) with ten representative rock formations of different properties when studying mining-induced surface subsidence. Other similar studies can be found as well (Adhikary et al. 2016; Ma et al. 2017a, b; Pongpanya et al. 2017; Cheng et al. 2018; Li et al. 2018; Zhang et al. 2018). All of these relied only on a so-called generalized or typical borehole log and ignored both the variation of rock mass properties along horizontal directions and the lateral variation of strata thickness at bed level. We call this approach the conventional numerical modeling method (CNMM). To be clear, the CNMM of large-scale rock masses has two major characteristics: (1) it depends on the equivalent continuum concept (Barla and Barla 2000; Sherizadeh and Kulatilake 2016; Shreedharan and Kulatilake 2016; Craig and Jackson 2017) to keep the computational demand manageable; and (2) it is based on a single borehole log or a generalized borehole log and only exhibits property variation along the gravity direction.

On the other hand, borehole investigation is the most reliable approach (Bieniawski 1989), and borehole logs should be fully utilized if a more realistic and accurate numerical analysis is expected. According to Chinese standards and regulations, shaft inspection holes, which are boreholes near the main shaft, auxiliary shaft, and air shaft of a coal mine, need to be drilled prior to construction to provide the mechanical parameters necessary for the design of shaft walls, and at least one rock specimen should be taken for physical–mechanical property tests when drilling through each rock bed. Taking the Ying-Pan-Hao coal mine as an example, 16 boreholes within 4.99 km³ (3.8 × 1.8 × 0.73 km) revealed 519 rock beds (some of which may belong to the same rock bed from the viewpoint of lithostratigraphic correlation), and 278 rock specimens from three shaft inspection holes were tested. These tests and borehole investigations produce a large amount of readily available data, which makes it possible to represent the spatial variation of properties and thickness and to build a data-intensive numerical model. In addition, many studies have focused on the utilization of multiple borehole logs, but their methods were proposed mostly for 3D visualization (Lemon and Jones 2003; Calcagno et al. 2008; Guillen et al. 2008; Kaufmann and Martin 2008; Shao et al. 2011; Zu et al. 2012; Li et al. 2013), the representation of complex geologic phenomena (Wu et al. 2005; Xu et al. 2013a), and other purposes (Zhu et al. 2013; Zhang and Zhu 2018), where numerical modeling limitations were not considered.

In our previous study (Gong and Guo 2019), a conceptual method for making full use of multiple boreholes was proposed without modeling details. It was designed specifically for FLAC3D models via FISH (short for “FLAC-ISH”, the language of FLAC) because the primary concern was to introduce the application of geospatial big data in underground coal mining. Moreover, the mesh size effect (e.g., Davies et al. 1984; Turon et al. 2007; Deng et al. 2012; Sande and Ray 2014; Alañón et al. 2018) and rock property calibration were not considered, which undermined the findings.

In this paper, we present a universal and detailed modeling method, the data-intensive numerical modeling method (DINMM), implemented via BlockRanger (ITASCA 2016), a grid generation tool, and we report new findings obtained after careful calibration of mesh size and rock mass properties. The method is given in Sect. 2, where we focus on the accurate description of the modeling process through a mathematical model and 3D schematics, on the parting of the modeling space for irregularly distributed boreholes, and on the estimation of rock mass properties by a four-step procedure. Two FLAC3D models based on the CNMM and the DINMM are built and calibrated in Sect. 3, and their predictions of surface subsidence, stress, and failure propagation are compared with field observations and theoretical understanding in Sect. 4. The objectives of this paper are (1) to make full use of borehole logs and rock specimen test results in coal mines, thereby improving the accuracy of mining subsidence prediction, and (2) to provide a new perspective for the numerical simulation of large-scale rock strata.

2 The Data-Intensive Numerical Modeling Method

2.1 Modeling Space Parting and Mathematical Description

Although rock mass properties vary spatially, they may nevertheless be uniform within regions, as noted by Bieniawski (1989) and Priest (1993). With this in mind, we assume that the rock mass properties at any location in the modeling space coincide with those revealed by the nearest borehole; that is, the large-scale model is an assemblage of smaller regions. Figure 2 shows a guideline for determining the modeling space using a 45° angle, which in most cases is a conservative estimate (State Bureau of Coal Industry 2000) that guarantees that the modeling space exceeds the mining-influenced range. In this way, the target borehole locations on the ABCD surface can be captured.

Fig. 2 Estimation of the modeling space
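As a rough illustration of the two ideas above, the following Python sketch expands a rectangular panel footprint by the horizontal reach of the 45° line and assigns a plan-view location to its nearest borehole. The function names, the rectangular footprint, and the reading of the angle as measured at the panel edge are assumptions made for illustration; they are not taken from the modeling workflow itself.

```python
import math


def modeling_extent(panel_min, panel_max, mining_depth, angle_deg=45.0):
    """Expand a rectangular panel footprint by the reach of the influence angle.

    panel_min, panel_max: (x, y) corners of the panel footprint, in metres.
    mining_depth: depth of the seam below the surface, in metres.
    angle_deg: influence angle at the panel edge (assumed measured from the
               horizontal; at 45 degrees the choice of reference is immaterial).
    """
    margin = mining_depth / math.tan(math.radians(angle_deg))
    (x0, y0), (x1, y1) = panel_min, panel_max
    return (x0 - margin, y0 - margin), (x1 + margin, y1 + margin)


def nearest_borehole(point, boreholes):
    """Return the id of the borehole closest to `point` in plan view.

    boreholes: dict mapping a borehole id to its (x, y) location.
    """
    px, py = point
    return min(
        boreholes,
        key=lambda b: math.hypot(boreholes[b][0] - px, boreholes[b][1] - py),
    )
```

With a 45° angle the margin simply equals the mining depth, so a deeper seam widens the model by the same amount on every side.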

Attention then turns to the geometry in connection with the borehole distribution and the computing grid. In practice, six-sided solids (hexahedron-like) and five-sided solids (prism-like), which appear as quadrangles and triangles in plan view, are often used because their quality is sufficiently high for modeling and they can be converted directly into a computing grid through BlockRanger (ITASCA 2016). The ABCD surface in Fig. 2 is therefore divided into subareas consisting of triangles and quadrangles. Note that each subarea contains only one borehole and that the subareas should be of equal area where possible. For boreholes distributed on a uniform grid, the ABCD surface can be divided into purely quadrilateral subareas along the prospecting lines; otherwise, it can be split into a combination of triangles and quadrangles via the following suggested steps (see Fig. 3).

Fig. 3 Parting of the ABCD surface for irregularly distributed boreholes

(1) The boreholes on the ABCD surface (Fig. 3a) are first connected to form a triangular mesh (Fig. 3b).

(2) The midpoints of the sides of the triangles are marked as triangle points (Fig. 3c), and a straight line is drawn from each triangle point on the outer boundary of the triangular mesh to the border of the ABCD surface to determine the square points.

(3) The triangle points and the square points are then connected to form closed polygons (Fig. 3d), each of which is regarded as a subarea and may contain only one borehole.

(4) The polygons are further simplified by reducing their number of edges, and the areas of all polygons should be as similar as possible (Fig. 3e).

(5) Finally, the polygons are divided into combinations of triangles and quadrangles (Fig. 3f).
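The two constraints stated for the parting, one borehole per subarea and subareas of similar area, can be checked automatically once the polygons have been drawn. The Python sketch below is one possible check and is not part of the parting procedure itself; the function names are hypothetical, polygons are assumed to be simple and given as ordered vertex lists, and boreholes lying exactly on a polygon edge are not treated specially.

```python
def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def contains(vertices, point):
    """Ray-casting point-in-polygon test for a simple polygon."""
    x, y = point
    inside = False
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        if (y0 > y) != (y1 > y):  # edge crosses the horizontal line through the point
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def check_parting(subareas, boreholes):
    """Return (one_borehole_each, area_ratio) for a list of polygons.

    one_borehole_each: True when every subarea holds exactly one borehole.
    area_ratio: largest subarea area divided by the smallest (1.0 is ideal).
    """
    counts = [sum(contains(poly, bh) for bh in boreholes) for poly in subareas]
    areas = [polygon_area(poly) for poly in subareas]
    return all(c == 1 for c in counts), max(areas) / min(areas)
```

An area ratio close to 1 indicates that step (4) has balanced the subareas; what ratio is acceptable is a modeling judgment that this sketch does not decide.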

After the ABCD surface has been parted, each subarea must be further constructed along the gravity direction to represent the lithological variability revealed by the corresponding borehole. In general, a model with \(M\) boreholes and \(N\) strata (rock formation or rock group) interfaces contains \(M \times N\) points, as shown schematically in Fig. 4. We denote by \({\varvec{P}}_{{\varvec{i}}}\) the point set on the \(i\)-th strata interface and by \(P_{i,j}\) the three-dimensional coordinates of the \(j\)-th borehole on the \(i\)-th strata interface. \({\varvec{P}}_{{\varvec{i}}}\) is represented as follows:

$${\varvec{P}}_{{\varvec{i}}} = \left\{ {\left( {x,y,z} \right)\left| {P_{i,j} \left( {x_{i,j} ,y_{i,j} ,z_{i,j} } \right),1 \le i \le N,{ }1 \le j \le M, i \in {\mathbb{N}},j \in {\mathbb{N}}} \right.} \right\},$$
(1)

where \({\mathbb{N}}\) represents the set of natural numbers. The strata interfaces \({\varvec{S}}_{{\varvec{i}}}\) are created via the NURBS (non-uniform rational basis spline) mathematical model (Piegl and Tiller 1996), which can be expressed as

$${\varvec{S}}_{{\varvec{i}}} \left( {x,y,z} \right) = F\left( {{\varvec{P}}_{{\varvec{i}}} } \right).$$
(2)

Fig. 4 Schematic of a computing model with 16 boreholes and 6 strata interfaces based on the geological conditions of the Ying-Pan-Hao coal mine. The points \(P_{i,j}\) on the ABCD surface, which has been divided into 16 quadrilateral subareas, represent the locations of the 16 boreholes. \(S_{i}\) and \(S_{i + 1}\) represent two adjacent strata interfaces

Since rock lithology varies spatially, strata revealed by different boreholes at identical elevations may have different lithologies. Even if their lithologies are the same across subareas, the strata thicknesses may not be identical. To connect strata across subareas, the processing method illustrated schematically in Fig. 5 is used, in which the spatial points of the \(j\)-th borehole on the adjacent strata interfaces \(S_{i}\) and \(S_{i + 1}\) are first connected (Fig. 5a) and the connecting line is then split equally into \(n\) sections (Fig. 5b). Provided that \(m\) rock beds of different lithologies occur between \(S_{i}\) and \(S_{i + 1}\) at a given borehole \(j\), we define

$$\Delta z = \left( {z_{i,j} - z_{i + 1,j} } \right)/n$$
(3)
$$\alpha_{s,j} = \frac{{L_{s,j} }}{\Delta z},{ }0 < \alpha_{s,j} < n,0 < s \le m,\alpha_{s,j} \in {\mathbb{N}},s \in {\mathbb{N}}$$
(4)

where \(L_{s,j}\) (Fig. 6b) is the thickness of the \(s\)-th rock bed of the \(j\)-th borehole between \(S_{i}\) and \(S_{i + 1}\). Therefore, \(\left( {n - 1} \right)\) new point sets \({\varvec{P}}_{{\varvec{k}}}\) and new surfaces \({\varvec{S}}_{{\varvec{k}}}\) (Fig. 6a) are generated between \(S_{i}\) and \(S_{i + 1}\), which are represented as

$${\varvec{P}}_{{\varvec{k}}} = \left\{ {\left( {x,y,z} \right)\left| {P_{i,j} \left( {x_{i,j} ,y_{i,j} ,z_{i,j} + k \times \Delta z} \right),0 < k < n,k \in {\mathbb{N}}} \right.} \right\}$$
(5)
$${\varvec{S}}_{{\varvec{k}}} \left( {x,y,z} \right) = F\left( {{\varvec{P}}_{{\varvec{k}}} } \right).$$
(6)

Fig. 5 Schematic of two modeling steps: (a) the points on two adjacent strata interfaces are connected and (b) each of the connected lines is equally split into 30 sections. In this figure, cylinders are used to represent the connected lines for convenience of display, and the circular section of each cylinder represents the location of a split point. This figure is developed from Fig. 4 as an example of how to handle spatial points on any two adjacent strata interfaces

Fig. 6 Schematic of two modeling steps: (a) generation of new interfaces \({\varvec{S}}_{{\varvec{k}}}\) between any two adjacent strata interfaces and (b) lithology spatial distribution across subareas. The 29 new surfaces, \(S_{1}\) to \(S_{29}\), are obtained by fitting the split points of each connected line in Fig. 5b, as described by Eqs. 5 and 6; the yellow lines represent the outer boundary of subarea 1. Six rock beds of different lithologies between \(S_{i}\) and \(S_{i+1}\) are revealed by borehole 5, and \(L_{2,5}\) represents the thickness of the second bed. \(\Delta V_{1,1}\) represents the space enclosed by \(S_{i}\), \(S_{1}\), and the outer boundary of subarea 1; \(\Delta V_{30,1}\) represents the space enclosed by \(S_{i+1}\), \(S_{29}\), and the outer boundary of subarea 1; \(V_{1,1}\), indicated by the red lines, represents the space of the 1st rock bed revealed by borehole 1, which is equal to \(\sum_{i=1}^{6} \Delta V_{i,1}\) in volume, as described by Eqs. 7–9
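The splitting in Eqs. 3 to 5 can be illustrated for a single borehole with a few lines of Python. This is a minimal sketch under stated assumptions: positions are expressed as depth below the surface (positive downward), the split points are stepped from \(S_{i}\) toward \(S_{i+1}\), and \(\Delta z\) is used as a positive section length. The function names and the tolerance are hypothetical.

```python
def split_depths(z_top, z_bottom, n):
    """Return the n - 1 intermediate depths between two interfaces at one borehole.

    z_top, z_bottom: depths of S_i and S_(i+1) at the borehole (assumed
    positive downward); n: number of equal sections.
    """
    dz = (z_bottom - z_top) / n
    return [z_top + k * dz for k in range(1, n)]


def section_counts(bed_thicknesses, z_top, z_bottom, n, tol=1e-6):
    """Number of sections (alpha) occupied by each bed between two interfaces.

    Raises ValueError when a bed is not a whole multiple of the section
    length or when the beds do not fill the interval exactly.
    """
    dz = abs(z_bottom - z_top) / n
    counts = []
    for thickness in bed_thicknesses:
        alpha = round(thickness / dz)
        if alpha < 1 or abs(alpha * dz - thickness) > tol:
            raise ValueError(
                f"bed thickness {thickness} is not a positive multiple of dz={dz}"
            )
        counts.append(alpha)
    if sum(counts) != n:
        raise ValueError("beds do not fill the interval between the interfaces")
    return counts
```

In practice, bed thicknesses read from a log will rarely be exact multiples of \(\Delta z\), so some rounding of the thicknesses (or a larger \(n\)) is needed before the check above passes; how that rounding is done is a choice this sketch leaves open.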

The space enclosed by \(S_{i}\), \({\varvec{S}}_{{\varvec{k}}}\), \(S_{i + 1}\), and the outer boundary of subareas is called the model subunit \(\Delta {\varvec{V}}_{{{\varvec{k}},{\varvec{j}}}}\) (Fig. 6b). It is assumed that \(\Delta {\varvec{V}}_{{{\varvec{k}},{\varvec{j}}}}\) and the borehole range \(\left( {z_{i,j} + \left( {k - 1} \right) \times \Delta z,\; z_{i,j} + k \times \Delta z} \right)\) have consistent lithologies. Then, the expression for the entire model is obtained as

$${\varvec{V}}_{{{\varvec{s}},{\varvec{j}}}} = \mathop \sum \limits_{{\theta = \theta_{1} + 1}}^{{\theta = \theta_{2} }} \Delta {\varvec{V}}_{{{\varvec{\theta}},{\varvec{j}}}} ,$$
(7)
$$\theta_{1} = \mathop \sum \limits_{\theta = 1}^{\theta = s - 1} \alpha_{\theta ,j} ,$$
(8)
$$\theta_{2} = \mathop \sum \limits_{\theta = 1}^{\theta = s} \alpha_{\theta ,j} ,$$
(9)

where \({\varvec{V}}_{{{\varvec{s}},{\varvec{j}}}}\) (Fig. 6b) is the space of the \(s\)-th rock bed revealed by borehole j between \(S_{i}\) and \(S_{i + 1}\).
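The bookkeeping behind Eqs. 7 to 9 amounts to a running sum of the section counts \(\alpha_{s,j}\). The Python sketch below lists, for each bed, the inclusive range of subunit indices that it occupies. It assumes that subunits are numbered from 1 and that bed \(s\) starts at subunit \(\theta_{1} + 1\), so that consecutive beds do not share a subunit; this agrees with the Fig. 6 caption, where the first bed covers subunits 1 to 6. The function name is hypothetical.

```python
from itertools import accumulate


def bed_subunit_ranges(alphas):
    """Map each bed s = 1..m to the inclusive range of subunits it occupies.

    alphas: section counts alpha_(s,j) of the beds at one borehole, top to bottom.
    """
    ends = list(accumulate(alphas))  # theta_2 for s = 1..m (Eq. 9)
    starts = [0] + ends[:-1]         # theta_1 for s = 1..m (Eq. 8)
    return [(t1 + 1, t2) for t1, t2 in zip(starts, ends)]
```

Each returned pair can then be used to assign one lithology to all subunits \(\Delta V_{\theta,j}\) in that range.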

The above modeling process has two major benefits. First, the resulting 3D model satisfies BlockRanger’s geometry requirements, namely grid conformity and continuity with no dangling nodes, so that it can easily be converted into a computing grid not only for FLAC3D but also for 3DEC, ABAQUS, and ANSYS (ITASCA 2016). This greatly reduces modeling complexity compared with the direct method via FISH, especially when establishing a data-intensive numerical model. Second, all borehole logs in the mining-affected area are fully used, so that the variation in thickness and lithology of rock strata is reflected along both the gravity direction and the horizontal directions.

2.2 Procedure for Rock Property Estimation in the DINMM

Another concern in the DINMM is how to determine the rock mass properties of strata of different lithologies, noting that rock masses of the same lithology may have significantly different properties (Liu et al. 2020). At present, the estimation of rock mass properties depends on the data available, as discussed by many researchers. For example, in some cases RQD rather than RMR or Q is used because of insufficient data (Zhang and Einstein 2004), and this is a way of life rather than a simple difficulty in rock mechanics and engineering design (to quote Jing 2003). Bieniawski (1989) also remarked that drilling investigation for geotechnical purposes provides more detailed information but is much more expensive than that for mineral exploration purposes, which accounts for the lack of data.

The data available for large-scale strata modeling in underground coal mining have their own characteristics. The shaft inspection holes are drilled for geotechnical purposes, and laboratory tests are conducted for each rock bed, whereas the other boreholes are mostly drilled for mineral exploration, and only geological descriptions are available. Although additional tests on rock specimens cored from the roof and floor of mining panels and roadways are also data sources, they do not cover the whole strata from the ground surface to the mining level because coal mines pay more attention to rocks near coal seams. The difficulty therefore lies in estimating the properties of the rock masses revealed during mineral exploration. In this paper, this is done by a four-step strategy: statistical analysis, analogy analysis, rock mass classification, and orthogonal testing.

The statistical analysis seeks relationships between geological descriptions (such as burial depth, color, mineral composition, mineral grain size, cementation, and structure) and mechanical parameters (such as density, elastic modulus, Poisson’s ratio, cohesion, compressive strength, friction angle, and tensile strength) on the basis of a local database whose data come mostly from shaft inspection holes. In the Ying-Pan-Hao coal mine, we found it difficult to establish a multi-parameter equation that reflects the detailed property variation; nevertheless, curve fitting revealed some clear trends, as shown in Fig. 7, where the density, elastic modulus, cohesion, and compressive strength of the intact rock are closely related to depth.

Fig. 7 The intact rock properties versus depth in the Ying-Pan-Hao coal mine
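The kind of curve fitting behind such depth trends can be sketched in Python as follows. The functional form of the fitted curves in Fig. 7 is not stated in the text, so a straight line is assumed here purely for illustration, and the function names are hypothetical.

```python
def fit_line(depths, values):
    """Ordinary least-squares fit of values ~ slope * depth + intercept."""
    n = len(depths)
    mean_x = sum(depths) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in depths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(depths, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(depths, values, slope, intercept):
    """Coefficient of determination of the fitted line."""
    mean_y = sum(values) / len(values)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(depths, values))
    ss_tot = sum((y - mean_y) ** 2 for y in values)
    return 1.0 - ss_res / ss_tot
```

A low coefficient of determination for a given property would suggest that depth alone is not a sufficient predictor and that the analogy step is needed for that property.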

The analogy analysis infers intact rock properties by comparing the geological description of an intact rock with those in the local database. We denote the input (the geological description of an intact rock named “X”) and the output (the corresponding mechanical properties) as GD-X and MP-X, respectively. The suggested steps are:

1. If a record GD-1 exists in the database such that GD-1 is the same as GD-X, that is, the depth, color, mineral composition, mineral grain size, structure, and cementation level of “X” conform with record “1” in the database, then MP-X is equal to MP-1. Otherwise, go to Step 2.

2. Compare GD-X with each of the recorded geological descriptions in the database to find the record GD-2 that is most similar to GD-X. Then, MP-2 is used for MP-X. Since depth is the most influential factor in this case, we first look for a few candidate records in terms of the given depth and then select the best match according to the other factors, which are considered equally important (i.e., the majority wins).
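The two steps above can be sketched in Python as a small lookup routine. This is one possible reading of the procedure and not the implementation used in this study: the record layout, the attribute names, the number of depth candidates (`n_candidates`), and the tie-breaking rule (the candidate closest in depth wins a tie) are all assumptions introduced for illustration.

```python
# Hypothetical attribute names for the non-depth parts of a geological description.
ATTRIBUTES = ("color", "mineral_composition", "grain_size", "structure", "cementation")


def infer_properties(target, database, n_candidates=3):
    """Return the mechanical properties assigned to `target` by analogy.

    target: dict with "depth" and the keys in ATTRIBUTES (GD-X).
    database: list of dicts with the same keys plus "properties" (MP).
    """
    # Step 1: an identical description exists, so its properties are reused.
    for record in database:
        if record["depth"] == target["depth"] and all(
            record[a] == target[a] for a in ATTRIBUTES
        ):
            return record["properties"]

    # Step 2: shortlist the records closest in depth, then let the majority of
    # matching attributes decide; max() keeps the first (closest) record on ties.
    by_depth = sorted(database, key=lambda r: abs(r["depth"] - target["depth"]))
    candidates = by_depth[:n_candidates]
    best = max(candidates, key=lambda r: sum(r[a] == target[a] for a in ATTRIBUTES))
    return best["properties"]
```

Shortlisting by depth first reflects the observation that depth is the most influential factor, while the equal-weight count implements the "majority wins" rule.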

The above steps may seem rough, but it is better to be roughly right than precisely wrong (to quote Carveth Read), especially considering that, in some cases, a simple analogy based only on lithology has been used. The significance of the proposed analogy process is twofold: (1) it provides an approach for determining intact rock properties in data-intensive numerical modeling, which can serve as a baseline; and (2) it uses all the mechanical test data and geological descriptions obtained during coal mine construction, which helps to estimate rock properties more reasonably. Its drawback is that its ability to capture rock property variation depends on how well the tested rock specimens represent the property changes across the modeling domain.

Once intact rock properties have been assigned, classification systems together with a calibration routine will help derive sound inputs of numerical modeling, and these two steps will be discussed in the subsequent sections for reasons of relevance.

3 Development of the FLAC3D Models

The increasing computing power available today at a reasonable cost and the continuing improvement of numerical software in parallel computing are the main enablers of a data-intensive numerical model (ITASCA 2017). This section explains the engineering background, model setup, and calibration procedure.

3.1 Engineering Background and Data

As shown in Fig. 8, the study focuses on the Ying-Pan-Hao coal mine located in the Uxin Banner of Inner Mongolia, western China. This is a newly built mine with few buildings or structures nearby, which is beneficial for arranging subsidence monitoring points and for obtaining reliable observation results. The 2201 panel, the first mining panel of the mine, uses a full-seam longwall mining method with a panel length, panel width, average mining thickness, average mining depth, and average dip angle of 300 m, 2500 m, 6.5 m, 730 m, and 1°, respectively. In Fig. 8, the green area indicates Inner Mongolia, the red pin marks the location of the Ying-Pan-Hao coal mine, the black arrow shows the mining direction, and the black dotted lines denote the observation lines, with monitoring points numbered from B10 to B81 across the panel and from C1 to C81 along the panel. When the 2201 panel had been mined to 361, 478, 594, and 665 m, subsidence was measured at each monitoring point; these measurements are designated the 1st-, 2nd-, 3rd-, and 4th-period measurements.

Fig. 8 Distribution of boreholes and surface monitoring points in the Ying-Pan-Hao coal mine

Sixteen boreholes, three shaft inspection holes, and two surface observation lines are distributed on the ground surface. Geological investigation revealed that (1) the coal seam overburden ranges in age from ancient to recent periods, including the Yan’an formation, the Zhiluo formation, the Anding formation, the Zhidan formation, and the Quaternary system; (2) the coal-bearing strata are primarily composed of sedimentary rock with a layered or blocky structure; and (3) the study area has a simple geological structure, with discontinuities being primarily beddings and joints and no faults. The 16 borehole logs were used to build a DINMM-based FLAC3D model, while a comprehensive stratigraphy column (see Fig. 9), provided by geologists through simplification of the borehole logs, was used to build a CNMM-based FLAC3D model.

Fig. 9 The comprehensive stratigraphy column used for the CNMM-based model. Red numbers indicate depth below the ground surface, and black numbers are strata notations that are also used in Fig. 12. The thickness of each stratum reflects the average of the 16 borehole logs

A total of 278 specimens from the three shaft inspection holes were tested, and some of them are shown in Fig. 10. These tests, together with the corresponding geological descriptions, are the data sources for building the local database. Following the procedure described in Sect. 2.2, intact rock properties were obtained for each stratum revealed by the 16 boreholes. Subsequently, the basic GSI chart (Hoek and Brown 2019) was used to roughly scale the intact rock properties down to rock mass properties (Barla and Barla 2000), mainly because the form in which rock structure is described in our geological reports is consistent with that in the GSI chart. In general, the higher the categorization level in the GSI chart, the closer the mechanical properties of a rock mass are to those of the intact rock. In the Ying-Pan-Hao coal mine, the rock structure can be categorized as “INTACT OR MASSIVE” or “BLOCKY”, and the rock surface conditions can be categorized as “VERY GOOD” or “GOOD”. These classification results imply that the difference between rock masses and intact rocks is relatively small. Therefore, we first presume that the initial rock mass properties are basically consistent with those of the intact rock and then correct them by the calibration procedure described in Sect. 3.3.

Fig. 10 Rock specimens from three shaft inspection holes of the Ying-Pan-Hao coal mine

3.2 FLAC3D Model Setup

As shown in Fig. 11, the two FLAC3D models share the same geometry (3800 × 1800 × 760 m), boundary conditions, and constitutive relationship, i.e., a linear-elastic, perfectly plastic model with the Mohr–Coulomb failure criterion. The bottom, left, right, front and back surfaces of both models were constrained to move along the x-, y-, and z-axes; the top boundary was a free surface without load; and the in situ stress was applied through gravity and the lateral stress coefficient at rest, \({k}_{0}\), given by \({k}_{0}=\mu /(1-\mu )\), where \(\mu\) is Poisson’s ratio (Xu et al. 2013b). The color variation indicates different properties, which shows that model (b) incorporates far more detail than model (a) owing to the use of the DINMM.

Fig. 11 Comparison of the two FLAC3D models based upon the CNMM and DINMM
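The relation \(k_{0}=\mu /(1-\mu )\) used for the in situ stress can be evaluated with a short Python sketch. The uniform overburden density and the value of gravitational acceleration are simplifying assumptions made here for illustration; in the layered models the density varies from stratum to stratum. The function names are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def lateral_coefficient(poisson_ratio):
    """Lateral stress coefficient at rest, k0 = mu / (1 - mu)."""
    return poisson_ratio / (1.0 - poisson_ratio)


def in_situ_stress(density, depth, poisson_ratio):
    """Return (vertical, horizontal) stress in Pa at a given depth.

    density: overburden density in kg/m^3, taken as uniform (an assumption).
    depth: depth below the surface in metres.
    """
    sigma_v = density * G * depth
    return sigma_v, lateral_coefficient(poisson_ratio) * sigma_v
```

For a Poisson’s ratio of 0.25, for example, the horizontal stress is one third of the vertical stress.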

Figure 12 presents the values used for each stratum in the CNMM-based model. These values are the thickness-weighted averages of those used for the corresponding strata in the DINMM-based model. Note that a stratum in the CNMM-based model usually corresponds to \(n\) strata in the DINMM-based model that occupy the same spatial position. Thus, we obtain

$$\overline{{{\text{MP}}}}_{i} = \frac{{\mathop \sum \nolimits_{j = 1}^{n} MP_{i,j} \times H_{j} }}{{\mathop \sum \nolimits_{j = 1}^{n} H_{j} }},$$
(10)

where \(\overline{{{\text{MP}}}}_{i}\) (\(i = 1\), 2, 3, 4, 5) represents the weighted average of the \(i\)-th mechanical parameter (i.e., bulk modulus, shear modulus, friction angle, cohesion, and tensile strength, respectively) of a stratum in the CNMM-based model; \(H_{j}\) represents the thickness of the \(j\)-th (\(1 \le j \le n\)) stratum in the DINMM-based model; and \(MP_{i,j}\) represents the \(i\)-th mechanical parameter of the \(j\)-th stratum.
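Equation 10 is a plain thickness-weighted mean, which the following Python sketch evaluates for one mechanical parameter. The function name is hypothetical.

```python
def thickness_weighted_average(values, thicknesses):
    """Thickness-weighted mean of one mechanical parameter (Eq. 10).

    values: the parameter for each of the n DINMM strata.
    thicknesses: the thickness H_j of each of those strata.
    """
    if not values or len(values) != len(thicknesses):
        raise ValueError("values and thicknesses must be non-empty and of equal length")
    total = sum(thicknesses)
    return sum(v * h for v, h in zip(values, thicknesses)) / total
```

Applying the function once per parameter (bulk modulus, shear modulus, friction angle, cohesion, and tensile strength) gives the five averaged inputs of one CNMM stratum.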

Fig. 12 Initial rock mass properties used in the CNMM-based model. Numbers from 1 to 38 correspond to those in Fig. 9

In addition, mesh size affects modeling accuracy, and its effect is widely recognized (e.g., Davies et al. 1984; Turon et al. 2007; Deng et al. 2012; Sande and Ray 2014; Alañón et al. 2018). To avoid mesh size bias, FLAC3D simulations were performed using various meshing schemes (see Table 1) for sensitivity analysis. The relationship between mesh size and ground surface subsidence is presented in Fig. 13. For both models, the computed maximum surface subsidence first increases with the number of zones, in other words with decreasing mesh size, and then converges to a stable value when the total number of zones exceeds 1 million. Hence, meshing scheme 4 is applied hereafter.

Table 1 Meshing schemes for the two FLAC3D models

Fig. 13 Influence of mesh size on ground surface subsidence prediction. The correlation between the total number of zones and mesh size is shown in Table 1
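A convergence check of this kind can be expressed in Python as below. This is only a sketch of one possible criterion: the relative tolerance `rel_tol` and the rule of comparing each scheme with the next coarser one are assumptions, not the criterion used to select meshing scheme 4, and the function name is hypothetical.

```python
def converged_scheme(results, rel_tol=0.02):
    """Return the zone count of the first meshing scheme that has converged.

    results: list of (zone_count, max_subsidence) pairs, one per scheme,
             with non-zero subsidence values.
    A scheme counts as converged when its result differs from that of the
    next coarser scheme by less than rel_tol (relative change).
    Returns None when no scheme meets the tolerance.
    """
    ordered = sorted(results)
    for (_, previous), (zones, current) in zip(ordered, ordered[1:]):
        if abs(current - previous) / abs(current) < rel_tol:
            return zones
    return None
```

A tighter tolerance pushes the selection toward finer meshes and longer run times, so the tolerance should be chosen with the computing budget in mind.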

3.3 Calibration of Rock Mass Properties

For equivalent continuum modeling, especially at a large scale, it is imperative to calibrate the rock mass properties, because rock masses essentially comprise intact rock and discontinuities of different scales, which cannot be fully characterized by geological investigations at present. Among the many back analysis techniques, an orthogonal experimental design method following Xu et al. (2013b) is chosen because of its satisfactory application in mining subsidence prediction. The calibration steps are briefly summarized below, and the results are shown.

(1) Experimental factors and orthogonal experimental table. Elastic modulus, Poisson’s ratio, cohesion, and friction angle are the orthogonal test factors, each having five levels that represent its range of variation. Although a broader representation could be achieved with more levels, the sharply increasing computing time limits such an expansion. The values of the five levels are presented in Table 2, and they constitute the orthogonal experimental table (see Table 3) according to the orthogonal experimental method (Taguchi 1987).

Table 2 Selected values for the five levels of the experimental factors
Table 3 The orthogonal experimental table and results of the tested schemes
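For readers unfamiliar with orthogonal tables, the Python sketch below builds a 25-run, five-level orthogonal array with modular arithmetic and verifies its defining property, namely that every pair of columns contains each combination of levels exactly once. This is a standard construction offered for illustration; the row order and column assignment of Table 3 may differ, and the function names are hypothetical.

```python
from itertools import combinations, product


def orthogonal_array_5(n_factors=4):
    """Build a 25-run orthogonal array with five levels (coded 0..4).

    Columns are a, b, and (a + m * b) mod 5 for m = 1..4, which supports up
    to six factors; only the first n_factors columns are returned.
    """
    if not 2 <= n_factors <= 6:
        raise ValueError("this construction supports 2 to 6 factors")
    rows = []
    for a, b in product(range(5), repeat=2):
        row = [a, b] + [(a + m * b) % 5 for m in range(1, 5)]
        rows.append(row[:n_factors])
    return rows


def is_orthogonal(rows):
    """True when every pair of columns shows all 25 level combinations."""
    n_columns = len(rows[0])
    for c1, c2 in combinations(range(n_columns), 2):
        if len({(row[c1], row[c2]) for row in rows}) != 25:
            return False
    return True
```

With four factors at five levels, a full factorial design would need 625 runs, whereas the orthogonal array needs 25 per modeling method.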

(2) Calculation coefficients of each rock bed. The calculation coefficient of the i-th rock bed, such as \({\lambda }_{i}^{E}\), is defined as the ratio of the elastic modulus of the i-th rock bed (\({E}_{i}\)) to the average elastic modulus of all rock beds (\(\overline{E }\)) in the numerical model. Specifically, \(\overline{E}\) is calculated as follows:

$$\overline{E} = \frac{{\mathop \sum \nolimits_{i = 1}^{n} E_{i} \times H_{i} }}{{\mathop \sum \nolimits_{i = 1}^{n} H_{i} }},$$
(11)

where \(n\) represents the total number of rock beds, and \({H}_{i}\) represents the thickness of the i-th rock bed. The other calculation coefficients (\({\lambda }_{i}^{\mu }\), \({\lambda }_{i}^{C}\), and \({\lambda }_{i}^{\varphi }\)) are defined in the same way as \({\lambda }_{i}^{E}\).

(3) The relationship between the calculation coefficients and the schemes. Each scheme in Table 3 corresponds to a FLAC3D model in which the mechanical parameters of the \(i\)-th rock bed are equal to the calculation coefficients of the \(i\)-th rock bed multiplied by the corresponding levels of the scheme. Taking scheme 8 as an example, the elastic modulus of the \(i\)-th rock bed in the corresponding FLAC3D model is \(\lambda_{i}^{E} \times E_{\mathrm{II}}\), where \(E_{\mathrm{II}}\) equals 3.45 GPa according to Table 2.
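Steps (2) and (3) can be sketched in Python as follows. The function names are hypothetical. The sketch also makes one property of the scaling explicit: because the coefficients are ratios to the thickness-weighted mean, the thickness-weighted mean of the scaled values equals the level value of the scheme.

```python
def calculation_coefficients(values, thicknesses):
    """Ratio of each bed's parameter to the thickness-weighted mean (Eq. 11)."""
    mean = sum(v * h for v, h in zip(values, thicknesses)) / sum(thicknesses)
    return [v / mean for v in values]


def scheme_values(coefficients, level_value):
    """Parameter of each bed for a scheme: coefficient times the level value."""
    return [c * level_value for c in coefficients]
```

In other words, each level of a factor fixes the model-wide average of that parameter, while the coefficients preserve the relative contrast between rock beds.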

(4) The test indicator. Maximum surface subsidence is used as the test indicator for finding the most reasonable scheme because it is sensitive to changes in rock mass properties and is available from surface subsidence observation. Corresponding to Table 3, a total of 50 FLAC3D models were established for the CNMM and DINMM. The most reasonable scheme is then selected by comparing the predicted and measured maximum subsidence. Since the maximum subsidence of the 1st-period measurement is 63.6 mm, Table 3 and Fig. 14 show that the most justifiable schemes for the CNMM-based and DINMM-based models are schemes 16 and 21, respectively.

Fig. 14 Predictions versus the 1st-period measurement. Locations of the monitoring points can be found in Fig. 8
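The selection rule in step (4) can be written as a one-line search in Python. The sketch assumes that the scheme with the smallest absolute difference between predicted and measured maximum subsidence is preferred; the function name and the mapping from scheme number to predicted value are hypothetical.

```python
def best_scheme(predicted, measured):
    """Return the scheme whose predicted maximum subsidence is closest to the measurement.

    predicted: dict mapping a scheme number to its computed maximum subsidence (mm).
    measured: the observed maximum subsidence (mm).
    """
    return min(predicted, key=lambda scheme: abs(predicted[scheme] - measured))
```

The function is applied separately to the CNMM-based and DINMM-based results, since each modeling method has its own set of 25 predictions.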

4 Results and Discussion

4.1 Overestimation Effect in FLAC3D

FLAC3D, which is designed for modeling continuous media, can simulate rock mass deformation well but can barely reflect the behavior of joints and bedding planes directly, especially when the modeling range is large and contains many joints and bedding planes (ITASCA 2017). In longwall coal mining, when the goaf area (Zhu et al. 2016) expands beyond a certain range, the overlying roof ruptures and collapses, which causes differences between the simulation and the actual condition, as shown in Fig. 15. In practice, a balanced arch structure (Yang et al. 2015) forms to support the overlying weight after the rock collapses (see Fig. 15a); in the simulation, however, the rocks remain attached to the overlying stratum, which is equivalent to adding more weight to the balanced arch structure in Fig. 15a and thus leads to greater ground subsidence (see also Gong et al. 2021). Hence, FLAC3D is more suitable and accurate for simulating the initial stage of coal mining and overestimates subsidence after the strata rupture and collapse. On this basis, the first four periods of measurement, corresponding to the 2201 panel advancing to 361 m, 478 m, 594 m, and 655 m, were used for comparison in this study. It is also worth noting that, despite the overestimation effect in FLAC3D, a higher computed surface subsidence is conservative and therefore beneficial for safer engineering design.

Fig. 15 Schematic of the balanced arch structure and the unfallen rocks. a For the actual situation where roof rocks have fallen down and stacked together after mining. b For the situation in FLAC3D where rocks are still attached to the overlying strata

4.2 Comparison of Surface Subsidence Prediction

After the calibration, further comparisons between the computed and measured subsidence help to assess the effectiveness of the proposed modeling method. As shown in Fig. 16, the predictions of both models generally reflect the subsidence variation trend and are larger than the measured values. Most importantly, the DINMM-based model is more accurate in predicting the 2nd- and 3rd-period measurements, even though both models were calibrated and showed good consistency with the 1st-period measurement. This is more pronounced at monitoring points B35 to B50, where the better prediction of the CNMM-based model in the 1st period turns into a larger error in the following periods, which supports that the DINMM-based model is not only more accurate in general but also more robust locally. This can be attributed to the representation of property variations along the horizontal directions and to the more realistic properties assigned to the related rock beds, both of which are merits of the DINMM.

Fig. 16 Predictions versus measurements. a The 2nd period; b the 3rd period; c the 4th period. Locations of the monitoring points can be found in Fig. 8

We further compared the root mean square error over all 70 monitoring points, as shown in Table 4. The error of the DINMM-based model is smaller in the 2nd, 3rd, and 4th periods, even though the CNMM-based model was better calibrated at the beginning. In addition, the root mean square error of both models increases suddenly in the 3rd period, which suggests that a strata fracture might have occurred and brought the overestimation effect described in Sect. 4.1 into play. Nevertheless, the DINMM reduces the root mean square error by at least 27.4%.
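For reference, the comparison metric can be sketched as below. This is a minimal illustration, assuming the standard definition of the root mean square error over the monitoring points and a reduction expressed relative to the CNMM-based error. The function names and the four-point sample data are hypothetical and do not reproduce Table 4.

```python
import math


def rmse(predicted, measured):
    """Root mean square error between predicted and measured subsidence."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(rmse_reference, rmse_new):
    """Percentage decline of rmse_new relative to rmse_reference."""
    return (rmse_reference - rmse_new) / rmse_reference * 100.0


# Hypothetical subsidence values in mm at four monitoring points
measured = [10.0, 20.0, 30.0, 40.0]
cnmm_pred = [14.0, 24.0, 34.0, 44.0]   # every point overestimated by 4 mm
dinmm_pred = [13.0, 23.0, 33.0, 43.0]  # every point overestimated by 3 mm

rmse_cnmm = rmse(cnmm_pred, measured)
rmse_dinmm = rmse(dinmm_pred, measured)
print(f"CNMM: {rmse_cnmm:.1f} mm, DINMM: {rmse_dinmm:.1f} mm, "
      f"reduction: {relative_reduction(rmse_cnmm, rmse_dinmm):.1f}%")
```

With the real data, each list would contain the 70 monitoring-point values of one measurement period.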

Table 4 Comparison of the root mean square error of the 70 monitoring points

The prediction accuracy of maximum subsidence is another concern when evaluating numerical models. Table 5 shows that the relative error of the predicted maximum subsidence of both models increases as mining proceeds, which agrees with the overestimation effect, and that the DINMM-based model generally performs better. Statistically, the relative error of maximum subsidence is reduced by 5.1% in absolute terms through the DINMM when compared with the CNMM, which corresponds to an average error reduction rate of 66.5%. In addition, we must stress that the rock properties used in the CNMM-based model were derived from those used in the DINMM-based model, which avoided some biases. In previous studies, by contrast, such properties are often determined by analogy based only on lithology, which probably induces more error (also see Sect. 2.2). From this point of view, the better prediction of a DINMM-based model is all the more significant. Considering that the prediction of a CNMM-based model also seems acceptable, the four-step procedure together with Eq. 10 should be quite helpful in identifying the best CNMM-based model.
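The two statistics quoted above can be read as follows. The formulas are not stated in the text, so the definitions below are an assumption: $e_k$ denotes the relative error of maximum subsidence in period $k$, $\Delta e$ the mean absolute reduction, and $r$ the mean reduction rate over the $n$ compared periods. All symbols are introduced here for illustration only.

```latex
% Assumed definitions; the text reports the values but not the formulas.
% S_k^{pred}, S_k^{meas}: predicted and measured maximum subsidence in period k
\begin{align}
e_k &= \frac{\left| S_k^{\mathrm{pred}} - S_k^{\mathrm{meas}} \right|}{S_k^{\mathrm{meas}}} \times 100\% \\
\Delta e &= \frac{1}{n} \sum_{k=1}^{n} \left( e_k^{\mathrm{CNMM}} - e_k^{\mathrm{DINMM}} \right) \\
r &= \frac{1}{n} \sum_{k=1}^{n} \frac{e_k^{\mathrm{CNMM}} - e_k^{\mathrm{DINMM}}}{e_k^{\mathrm{CNMM}}} \times 100\%
\end{align}
```

Under this reading, $\Delta e$ would correspond to the reported 5.1% and $r$ to the reported 66.5%; whether the averages were taken in exactly this way is not stated in the text.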

Table 5 Comparison of the maximum subsidence predictions

4.3 Displacement, Stress and Failure Propagation Maps

Here, we present and describe the displacement, stress, and failure propagation along a vertical section (same direction as indicated by the black arrow in Fig. 8) to show the potential advantages of a data-intensive model. Since the z-displacement contours are very similar for different mining stages, we only illustrate a typical one in Fig. 17, where the marked feature of the DINMM-based model is the asymmetric subsidence across the subareas divided by the vertical blue line. This asymmetry is more pronounced near the excavated zones and changes to near symmetry toward the ground surface. Apparently, the representation of strata thickness and property variation is the major driving force behind the phenomenon. Since asymmetric subsidence of the roadway roof and ground surface has been frequently reported (Li et al. 2016; Wang et al. 2020; Wu et al. 2020; Sun et al. 2021), a data-intensive model may provide new insight into its interpretation.

Fig. 17 A typical z-displacement contour at the mining distance of 361 m. The vertical blue line indicates the boundary between subareas

The same patterns are also observed in the z-stress contour maps (see Fig. 18). More markedly, a sudden stress variation occurs in the vicinity of the boundary line. This is most likely due to the sudden property change from one subarea to another. We thus suggest a linear property assignment along the bedding plane, which would probably mitigate this issue; as a side effect, it demands a more complex modeling process, which merits further study. On the other hand, since rock properties are essentially anisotropic, the contour map in Fig. 18b sheds more light on the real vertical stress distribution in the field and may provide an alternative explanation of the mechanism of rockburst from the perspective of property variation along bedding planes. We also note that asymmetric distributions of the fracture zone and roadway deformation were observed in the recent literature (Su et al. 2020; Wang et al. 2020), which supports that an asymmetric stress distribution should be more realistic.
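The suggested linear property assignment is not specified further in the text. One possible reading, sketched here purely as an assumption, is to blend a property linearly across a transition zone centered on the subarea boundary instead of switching it abruptly. The function name, the transition half-width, and the modulus values are hypothetical, and the sketch does not use any FLAC3D interface.

```python
def blended_property(x, x_boundary, half_width, value_left, value_right):
    """Linearly blend a rock property across a subarea boundary.

    Outside the transition zone [x_boundary - half_width, x_boundary + half_width]
    the property of the respective subarea is returned unchanged; inside it,
    the value varies linearly from value_left to value_right.
    """
    if half_width <= 0:
        raise ValueError("half_width must be positive")
    x_start = x_boundary - half_width
    x_end = x_boundary + half_width
    if x <= x_start:
        return value_left
    if x >= x_end:
        return value_right
    weight = (x - x_start) / (x_end - x_start)  # 0 at x_start, 1 at x_end
    return (1.0 - weight) * value_left + weight * value_right


# Hypothetical example: elastic modulus (GPa) of one rock bed in two subareas
E_LEFT, E_RIGHT = 12.0, 20.0
BOUNDARY_X, HALF_WIDTH = 500.0, 50.0  # metres, illustrative only

for x in (400.0, 475.0, 500.0, 525.0, 600.0):
    e = blended_property(x, BOUNDARY_X, HALF_WIDTH, E_LEFT, E_RIGHT)
    print(f"x = {x:5.0f} m -> E = {e:.1f} GPa")
```

In a zone-based model, such a function would be evaluated at each zone centroid of the rock bed concerned. Interpolating directly between borehole locations would be an alternative reading of the same suggestion.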

Fig. 18 A typical z-stress contour at the mining distance of 361 m. The vertical blue line indicates the boundary between subareas. Numbers in MPa

The failure propagation in FLAC3D is represented by the yield zones shown in Fig. 19. A widely accepted understanding of the mining-induced deformation and movement of overburden is the “four zones” theory (Peng 1992), which holds that failure initiates at the roof after sufficient extraction, gradually propagates upward with further extraction, and finally forms the caved zone, fractured zone, continuous bending zone, and soil zone. It is therefore unrealistic that, in Fig. 19A1–A4, failed zones occur simultaneously near the ground surface and near the coal seam with no propagation between them. This is likely related to the inappropriate rock properties assigned to some individual strata to offset the oversimplification of a CNMM-based model. We note that the same phenomenon also occurs in the right subarea of the DINMM-based model (Fig. 19B1–B4) but is much slighter, while in the left subarea the failure propagation agrees well with current knowledge. It appears that, although a CNMM-based model can provide adequate subsidence prediction after careful parameter calibration, it does so at the cost of rationality in other aspects.

Fig. 19 Failure propagation with mining process of 2201 panel. A1–A4 The CNMM-based model mining after 361 m, 478 m, 594 m, and 660 m, respectively. B1–B4 The same for the DINMM-based model

5 Summary and Conclusion

This study stems from the concept of big data, i.e., we follow the simple guideline that the more data are integrated into a numerical model, the more accurate the result. Over the years, mining-affected large-scale rock masses have often been modeled on the basis of a single or a comprehensive borehole log, so that the models exhibit rock property and thickness variation only along the gravity direction. Meanwhile, multiple borehole logs are readily available and can be used to build a data-intensive numerical model. Doing so is important because it captures rock property and thickness variation along both the gravity direction and the horizontal directions, and in turn leads to a better understanding of strata movement and a better prediction of mining subsidence. However, such efforts are seldom seen in the literature.

In this paper, we addressed this knowledge gap by proposing a data-intensive numerical modeling method that takes account of modeling limitations and the scale of data available. Specifically, we proposed a method to partition the modeling space for both regularly and irregularly distributed boreholes, a mathematical model of the spatial distribution of lithology and thickness at the rock bed level, and a property estimation procedure for large-scale rock masses. These efforts built the foundation for the notion of data-intensive modeling and led to the application in the Ying-Pan-Hao coal mine via FLAC3D. After a careful calibration of the meshing scheme and rock mass properties, detailed comparisons between computed and measured values and between the CNMM- and DINMM-based models were carried out to verify the effectiveness of the proposed method.

We find that, compared with the CNMM-based prediction, the root mean square error of the 70 monitoring points decreases by at least 27.4% in the DINMM-based prediction, and the relative error of maximum subsidence is reduced by 5.1% with a reduction rate of 66.5% on average, even though the CNMM-based model was originally better calibrated. We also find that the subsidence prediction of the DINMM-based model is not only more accurate in general but also more robust locally. Regarding the displacement, stress, and failure propagation, the pronounced characteristic of the DINMM-based model is the asymmetric z-displacement and z-stress distribution, which is more in line with field observations reported in the literature. We also observe that a DINMM-based model is more realistic in terms of the failure propagation pattern, which agrees well with the theoretical understanding. We suggest using the four-step procedure together with Eq. 10 to identify the best CNMM-based model, but note that, although a CNMM-based model can provide adequate subsidence prediction after careful parameter calibration, it does so at the cost of rationality in other aspects. We also suggest a linear property assignment along bedding planes in a DINMM-based model, which would probably mitigate the sudden stress variation across subareas.

Overall, the DINMM proved to be effective, at least in the context of this study, which is significant for large-scale rock strata modeling and enables a more accurate prediction of mining-induced subsidence. In addition, the notion of data-intensive modeling seems promising and merits further study as the scale of data in rock engineering increases.