Introduction

Coal mining plays a crucial role in China’s economy, which has substantial coal reserves and high production. However, coal mining is associated with many geological disasters and environmental problems. Subsidence, cracks1, collapses, and similar disasters have severely damaged land resources and the ecological environment in mining areas2,3,4. It is therefore particularly important to establish accurate and effective prediction models and to understand in advance the spatio-temporal evolution of surface subsidence in mining areas.

Traditional methods for monitoring surface subsidence, such as leveling, global positioning system (GPS), and triangulation elevation measurement, have high monitoring accuracy5. However, these methods are costly, labor-intensive, and inefficient, and they cannot retrospectively capture subsidence that has already occurred. To overcome these limitations, interferometric synthetic aperture radar (InSAR) techniques have emerged, including differential InSAR (DInSAR), small baseline subset InSAR (SBAS InSAR), and persistent scatterer InSAR (PS InSAR)6,7,8,9,10,11. InSAR offers all-weather capability, high precision, high spatio-temporal resolution, wide coverage, and low cost. Many studies have shown that InSAR monitoring of mining surface subsidence greatly helps in understanding the mechanisms and laws of subsidence in mining areas12,13,14. Moreover, as InSAR has matured for regional subsidence monitoring in recent years, combining InSAR with machine learning and deep learning to predict regional surface subsidence accurately has gradually become a research hot spot15,16,17,18,19,20.

In InSAR deformation prediction applications, convolutional neural network (CNN) methods were used at an early stage to detect the slow deformation of volcanoes and to forecast short-term InSAR deformation maps21. Nukala et al. presented a recurrent neural network (RNN) method to forecast time series deformation maps from Sentinel-1 images, which achieved good predictive performance22. The long short-term memory (LSTM) network addresses the exploding and vanishing gradient problems that limit older RNN variants in learning long-range dependencies in the data. Some researchers established time series InSAR deformation forecasting models and reported that the LSTM network has better prediction performance23,24. Combination forecasting methods have also become a hot topic among scholars25,26. Muhammad Kamran et al. developed a decision intelligence-based predictive model for assessing the stability of hard rock pillars in underground engineering structures. The model integrates K-Nearest Neighbors with the Grey Wolf Optimization algorithm (KNN-GWO) to enhance accuracy in stability predictions27. An innovative decision intelligence-driven predictive modeling approach integrating K-means clustering and random forest algorithms has also been used to predict the air quality index (AQI) in surface coal mining, achieving a prediction accuracy of 97%28.

However, in the practical prediction of surface subsidence in mining areas, existing deep learning prediction models for regional mining surface subsidence do not consider the spatial correlation and temporal nonlinear characteristics of subsidence, which limits the interpretability of the prediction results29,30. In addition, prediction accuracy is strongly influenced by the parameters of the deep learning models, and achieving high-precision prediction requires repeated manual adjustment of the parameter values, which is cumbersome.

Therefore, this study considers the spatio-temporal correlation and temporal nonlinearity of mining surface subsidence, exploits the advantage of the gated recurrent unit (GRU) network in learning the nonlinear characteristics of subsidence time series, and proposes a combination method of spatio-temporal prediction for regional surface subsidence in mining areas. The method combines spatial partitioning with local modeling to weaken the impact of spatial differences, uses the GRU network to capture the nonlinearity of surface subsidence in mining areas, and uses an intelligent optimization algorithm (the snake optimization algorithm) to globally optimize the parameters of the GRU model and adaptively determine their optimal values. This avoids tedious and arbitrary manual parameter adjustment and gives the model strong adaptability and higher accuracy. The final objective is high-precision prediction of regional surface subsidence in mining areas.

Materials and methods

Study area

A mining area in Heze City, Shandong Province, China, was selected as the study area; its geographical location and scope are shown in Fig. 1. The area belongs to the Yellow River alluvial plain, with flat terrain, a ground elevation of +41 m to +46 m, a natural terrain slope of 0.2‰, and a total area of approximately 259 km². The main coal types are fat coal, one-third coking coal, and gas coal. The mining depth is 938 m, the mining thickness is 6.8 m, and mining has been under way since 2006. The surface is mostly farmland, with common crops such as wheat, corn, and sweet potato, along with economic crops such as soybean, cotton, and vegetables. Furthermore, the industrial, construction, animal husbandry, and tertiary industries are relatively developed. These natural conditions and the coal mining have caused serious surface subsidence in the study area. Resolving the conflict between underground coal mining and surface protection is important for efficient and civilized production, so time series monitoring and prediction are particularly important.

Figure 1

Location and scope of the study area: (a) location of the study area; (b) enlarged image and scope of the study area; and (c) location and distribution of the leveling points.

In addition, subsidence values measured by leveling at 211 leveling points were compiled to compare, analyse, and verify the accuracy and reliability of the SBAS InSAR surface subsidence time series. The distribution and extent of the 211 leveling points are shown in Fig. 1c. The leveling data were measured according to the fourth-order leveling survey rules using a Trimble DiNi03 electronic level, and the allowable closing error was \(20\sqrt L\) mm (\(L\) is the length of the leveling line in km).

Accuracy analysis of SBAS InSAR

The Sentinel-1 satellite constellation is an earth observation mission of the European Space Agency Copernicus Program (Global Monitoring for Environment and Security) and is composed of the Sentinel-1A and Sentinel-1B satellites. The two satellites were launched on April 3, 2014, and April 25, 2016, respectively. They carry a C-band synthetic aperture radar and fly at an orbital altitude of 693 km, which allows all-weather, continuous radar imaging of the ground. More importantly, Sentinel-1A and Sentinel-1B can work in different modes, e.g., single and dual polarization modes, with excellent timeliness and reliability31. In this study, 66 Sentinel-1A SAR images from January 10, 2019, to April 5, 2021, were selected and processed using SBAS InSAR to obtain the mining surface subsidence time series of the study area. Because of its high accuracy and low time cost, the GAMMA software was used for the SBAS processing, with a temporal baseline of 365 days, a maximum normal baseline of 45%, and an unwrapping coherence threshold of 0.3 for the first unwrapping and 0.2 for the second. To obtain a uniformly sampled time series of mining surface subsidence and to minimize the surface subsidence prediction error, linear interpolation was performed across the three 24-day intervals (January 10, 2019 to February 3, 2019; February 3, 2019 to February 27, 2019; and August 26, 2019 to September 19, 2019). The resulting time series comprised subsidence values for 69 periods at a 12-day interval. Figure 2 shows the surface subsidence time series of six periods.

Figure 2

Mining surface subsidence time series of six periods monitored by SBAS InSAR: (a) January 10, 2019–March 11, 2019, (b) January 10, 2019–July 9, 2019, (c) January 10, 2019–November 18, 2019, (d) January 10, 2019–March 17, 2020, (e) January 10, 2019–September 13, 2020, and (f) January 10, 2019–April 5, 2021.
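The gap-filling step described above can be illustrated with a minimal sketch. It assumes acquisition times are expressed as days since the first image and that cumulative subsidence is in millimetres; the function name `resample_to_regular_grid` and the sample values are hypothetical and are not taken from the GAMMA workflow.

```python
import numpy as np

def resample_to_regular_grid(days, values, step=12):
    """Linearly interpolate a cumulative-subsidence series onto a regular grid.

    days   : acquisition times in days since the first image (increasing)
    values : cumulative subsidence (mm) at those times
    step   : target sampling interval in days (12 for the series used here)
    """
    days = np.asarray(days, dtype=float)
    values = np.asarray(values, dtype=float)
    grid = np.arange(days[0], days[-1] + step, step)
    grid = grid[grid <= days[-1]]  # guard against overshooting the last date
    return grid, np.interp(grid, days, values)

# Hypothetical series with one 24-day gap between day 0 and day 24.
days = [0, 24, 36, 48]
subsidence_mm = [0.0, -4.0, -5.0, -7.0]
grid, filled = resample_to_regular_grid(days, subsidence_mm)
print(grid)    # regular 12-day grid
print(filled)  # day 12 is filled with the midpoint of its neighbours
```

Only the missing epochs receive interpolated values; dates that already have an observation keep it unchanged.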

The leveling monitoring covered 19 periods from January 26, 2019, to April 4, 2021, with data from 211 leveling points. The SBAS InSAR monitoring spanned 69 periods from January 10, 2019, to April 5, 2021. To align the data, a piecewise linear interpolation was applied to the leveling results to match the SBAS InSAR dates. The difference between the two monitoring methods, i.e., leveling and SBAS InSAR, was then calculated for analysis. Figures 3 and 4 compare the surface subsidence results from leveling and SBAS InSAR. Figure 3 displays surface subsidence curves for points H2, Q10, and Q56 over 69 dates from January 10, 2019, to April 5, 2021. Figure 4 shows histograms of surface subsidence values at the 211 leveling points for three monitoring periods.

Figures 3 and 4 show that both leveling and SBAS InSAR successfully monitored the continuous surface subsidence from January 10, 2019, to April 5, 2021, and that the surface subsidence time series in the mining area exhibited obvious nonlinear characteristics.

Figure 3

Curves of mining surface subsidence time series on three leveling points: (a) H2, (b) Q10, (c) Q56.

Figure 4

Histogram of mining surface subsidence values on 211 leveling points: (a) January 10, 2019–May 5, 2019, (b) January 10, 2019–April 22, 2020, and (c) January 10, 2019–April 5, 2021.

For example, at leveling point H2 in Fig. 3a, the subsidence values from the two methods were similar and highly correlated during the first 16 monitoring periods. From the 17th to the 29th monitoring period, subsidence continued but at different rates for leveling and SBAS InSAR. From the 30th to the 41st period, leveling showed rapid subsidence with an increased velocity, while SBAS InSAR also showed rapid subsidence but with a lower velocity. From the 41st to the 69th period, both leveling and SBAS InSAR showed a slowdown in surface subsidence, with SBAS InSAR having a higher velocity than leveling. In Fig. 4, the histograms of mining surface subsidence values monitored by leveling and SBAS InSAR at the 211 leveling points had similar shapes, which suggests that the subsidence values from the two methods were relatively close to each other.

Overall, the comparison between SBAS InSAR and leveling demonstrated the consistency and accuracy of the SBAS InSAR monitoring. The spatio-temporal distribution of surface subsidence in the mining area monitored by SBAS InSAR was therefore considered reliable.

Spatio-temporal distance and K-means clustering

This study used a combination strategy of spatio-temporal distance and K-means clustering algorithm (hereinafter referred to as SDK) to determine the spatial partitions of the mining surface subsidence time series. This method comprehensively considers and describes the spatio-temporal correlation and similarity of surface subsidence time series between adjacent pixels, and obtains more reasonable partitioning results.

Firstly, the spatio-temporal distance between the nearest neighboring pixels is calculated. The spatio-temporal distance refers to the weighted sum of temporal and spatial distances between the two pixels.

Assuming two pixels are \(P_{i}\) and \(P_{j}\), with longitude and latitude coordinates of \(\left( {lon_{i} ,lat_{i} } \right)\) and \(\left( {lon_{j} ,lat_{j} } \right)\), respectively, and the average values of the surface subsidence time series monitored by SBAS InSAR are \(t_{i}\) and \(t_{j}\), the spatio-temporal distance \(d_{i,j}\) between them can be calculated using the following equation:

$$ d_{i,j} = \omega_{s} \sqrt {\left( {\frac{{lon_{i} - lon_{j} }}{\alpha }} \right)^{2} + \left( {\frac{{lat_{i} - lat_{j} }}{\beta }} \right)^{2} } + \omega_{t} \gamma \left| {t_{i} - t_{j} } \right|, $$
(1)

where \(\omega_{s}\) and \(\omega_{t}\) refer to the weights of the spatial and temporal distances, \(\alpha\) and \(\beta\) refer to the coefficients for scaling longitude and latitude, and \(\gamma\) refers to the coefficient for scaling the temporal distance. In this study, \(\omega_{s}\) and \(\omega_{t}\) were set to 0.5, and \(\alpha\), \(\beta\), and \(\gamma\) were set to 1.
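As an illustration of Eq. (1), the following minimal sketch evaluates the spatio-temporal distance for a pair of pixels. The tuple layout `(lon, lat, t)` and the function name `spatio_temporal_distance` are assumptions made for this example, and the input values are invented.

```python
import math

def spatio_temporal_distance(p_i, p_j, w_s=0.5, w_t=0.5,
                             alpha=1.0, beta=1.0, gamma=1.0):
    """Eq. (1): weighted sum of a scaled spatial distance and a temporal distance.

    Each pixel is a tuple (lon, lat, t), where t is the mean of the pixel's
    subsidence time series. Defaults follow the values stated in the text.
    """
    lon_i, lat_i, t_i = p_i
    lon_j, lat_j, t_j = p_j
    spatial = math.hypot((lon_i - lon_j) / alpha, (lat_i - lat_j) / beta)
    temporal = gamma * abs(t_i - t_j)
    return w_s * spatial + w_t * temporal

# Invented pixels: spatial term = 5, temporal term = 6.
d = spatio_temporal_distance((3.0, 4.0, -10.0), (0.0, 0.0, -4.0))
print(d)  # 0.5 * 5 + 0.5 * 6 = 5.5
```

With all scaling coefficients equal to 1, the two terms are added in their native units, so \(\alpha\), \(\beta\), and \(\gamma\) are the natural place to rebalance them if one term dominates.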

Then, K-means clustering is performed based on the above spatio-temporal distance. K-means clustering is widely recognized as an effective clustering method32,33. The algorithm first randomly selects k pixels as cluster centers, where k is the desired number of clusters and can be determined by the elbow method34,35,36. It then calculates the spatio-temporal distance between each pixel and each initial cluster center and assigns each pixel to the nearest center to form the initial clusters. Next, the center of each cluster is recalculated from the pixels currently in that cluster, the spatio-temporal distance between each pixel and each new center is computed, and each pixel is again assigned to its nearest center to form new clusters. This iterative process continues until one of the following termination conditions is met: (1) no pixels (or a minimum number of pixels) are reassigned to different clusters, (2) no cluster centers (or a minimum number of centers) change, or (3) the sum of squared errors (SSE) is minimized. In this study, the third condition was selected as the termination condition, and the objective function of K-means clustering was defined as37:

$$ \min \left( {SSE} \right) = \min \left( {\sum\limits_{i = 1}^{k} {\sum\limits_{{x \in C_{i} }} {\left( {D_{i} - x} \right)^{2} } } } \right), $$
(2)

where \(C_{i}\) refers to the ith cluster, \(D_{i}\) refers to the cluster center of \(C_{i}\), \(x\) refers to the pixels of \(C_{i}\).
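The clustering loop and the SSE of Eq. (2) can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes each pixel is a row `(lon, lat, t)`, uses Eq. (1) with \(\alpha = \beta = \gamma = 1\), updates centers with the arithmetic mean, and stops when no pixel changes cluster (condition 1) rather than on the SSE criterion. The function names and synthetic data are hypothetical.

```python
import numpy as np

def st_distance(points, center, w_s=0.5, w_t=0.5):
    """Eq. (1) with unit scaling, from every point to a single center."""
    spatial = np.hypot(points[:, 0] - center[0], points[:, 1] - center[1])
    temporal = np.abs(points[:, 2] - center[2])
    return w_s * spatial + w_t * temporal

def kmeans_sse(points, k, n_iter=100, seed=0):
    """Lloyd-style K-means on (lon, lat, t) rows; returns labels, centers, SSE."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    labels = np.full(len(points), -1)
    for _ in range(n_iter):
        dist = np.stack([st_distance(points, c) for c in centers], axis=1)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # no pixel changed cluster
        labels = new_labels
        for c in range(k):
            members = points[labels == c]
            if len(members):  # keep the old center if a cluster empties
                centers[c] = members.mean(axis=0)
    dist = np.stack([st_distance(points, c) for c in centers], axis=1)
    sse = float((dist[np.arange(len(points)), labels] ** 2).sum())
    return labels, centers, sse

# Synthetic data: two compact groups of pixels that are far apart.
rng = np.random.default_rng(1)
group_a = rng.normal([0.0, 0.0, 0.0], 0.1, size=(20, 3))
group_b = rng.normal([10.0, 10.0, -50.0], 0.1, size=(20, 3))
points = np.vstack([group_a, group_b])

sse_by_k = {k: kmeans_sse(points, k)[2] for k in (1, 2, 3, 4)}
print(sse_by_k)  # the SSE curve used by the elbow method
```

Plotting `sse_by_k` against k gives the curve on which the elbow method looks for the point where adding clusters stops reducing the SSE appreciably.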

LSTM and GRU

LSTM and GRU are two types of recurrent neural networks38,39. The main difference between them lies in their gating mechanisms. LSTM uses three gates to control the information flow: the input gate, forget gate, and output gate. The input gate and forget gate respectively control whether new input data are written and whether previous memory is retained, and the output gate controls whether the output values are passed to the next layer40. GRU uses only two gates: the reset gate and update gate. The update gate controls the degree to which the previous state information is retained in the current state, while the reset gate determines how the current input is combined with previous memory. Compared with LSTM, GRU is simpler: its fewer gating structures reduce the risk of overfitting, and its fewer parameters reduce computational complexity and improve operational efficiency. Overall, the advantages of GRU are its simplicity, faster training, computational efficiency, and better generalization ability. Figure 5 illustrates the basic structures of LSTM and GRU neural networks.

Figure 5

Basic structures of LSTM and GRU neural networks.
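To make the two GRU gates concrete, a single time step can be written out as below. This is a sketch of one common formulation, in which the update gate z weights the previous state, matching the description above; some libraries and papers swap the roles of z and 1 - z. The parameter names and the zero-weight example are assumptions for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, params):
    """One GRU time step for input x and previous hidden state h_prev."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    z = sigmoid(Wz @ x + Uz @ h_prev + bz)              # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev + br)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev) + bh)  # candidate state
    return z * h_prev + (1.0 - z) * h_tilde             # retain vs. overwrite

# Hypothetical sizes; all-zero weights make both gates equal to 0.5.
n_in, n_hid = 2, 3
params = (np.zeros((n_hid, n_in)), np.zeros((n_hid, n_hid)), np.zeros(n_hid)) * 3
h = gru_step(np.ones(n_in), np.array([1.0, -2.0, 0.5]), params)
print(h)  # half of the previous state, since the candidate state is zero
```

An LSTM step would add a separate cell state and a third gate, which is where its extra parameters come from.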

Snake optimization algorithm

The number of neurons, learning rate, dropout rate, and batch size of training samples are key parameters that affect the performance of the GRU network model. A novel meta-heuristic optimization algorithm, the snake optimization (SO) algorithm, was used to globally optimize these parameters41. It simulates the feeding and breeding behaviors of snakes to reduce the average prediction error and achieve efficient parameter combination optimization, with advantages such as faster computation, higher precision, and robustness42,43. It is widely used in the fields of machine learning and deep learning. Its process involves the following steps44:

  1. Parameter definition and population initialization: Determine the parameters that need to be optimized. In this study, the parameters were the number of neurons, learning rate, dropout rate, and batch size of the GRU. Then apply the SO algorithm to generate an initial set of positions (parameter combinations) for the GRU model, with each position corresponding to an individual.

  2. Fitness calculation: Calculate the RMSE of the model’s predicted subsidence values. Lower RMSE values indicate better fitness. The fitness function can be expressed as follows45,46,47:

    $$ {\text{RMSE}} = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^{n} {\left( {\hat{y}_{i} - y_{i} } \right)^{2} } } , $$
    (3)

    where n denotes the number of samples, \(y_{i}\) denotes the actual subsidence values, and \(\hat{y}_{i}\) denotes the predicted values.

  3. Iterative optimization and model training: Use the SO algorithm to simulate the feeding and breeding behaviors of a snake and adjust its position to find the parameter combination with the best fitness value. Obtain the optimal combination of network model parameters, and use these optimal parameters to train the GRU prediction model.
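The three steps above can be sketched in code, with one important caveat: the example below does not implement the snake position-update rules. Plain random resampling is used as a stand-in so that the structure (population of parameter combinations, RMSE fitness, keep the best individual) is visible. The search ranges, function names, and toy fitness function are all hypothetical.

```python
import math
import random

def rmse(predicted, actual):
    """Eq. (3): root-mean-square error between predictions and observations."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

# Hypothetical search ranges for the four GRU hyperparameters.
SEARCH_SPACE = {
    "neurons": (16, 256),
    "learning_rate": (1e-4, 1e-2),
    "dropout": (0.0, 0.5),
    "batch_size": (256, 8192),
}

def sample_position(rng):
    """Draw one candidate parameter combination (one individual)."""
    return {
        "neurons": rng.randint(*SEARCH_SPACE["neurons"]),
        "learning_rate": rng.uniform(*SEARCH_SPACE["learning_rate"]),
        "dropout": rng.uniform(*SEARCH_SPACE["dropout"]),
        "batch_size": rng.randint(*SEARCH_SPACE["batch_size"]),
    }

def search(fitness, population=20, generations=10, seed=0):
    """Keep the lowest-fitness individual seen across all generations.

    Random resampling stands in for the snake optimizer's position updates.
    """
    rng = random.Random(seed)
    best, best_fit = None, float("inf")
    for _ in range(generations):
        for _ in range(population):
            candidate = sample_position(rng)
            fit = fitness(candidate)
            if fit < best_fit:
                best, best_fit = candidate, fit
    return best, best_fit

# Toy fitness: in practice this would train a GRU with the candidate
# parameters and return the RMSE of its predicted subsidence values.
def toy_fitness(params):
    return abs(params["dropout"] - 0.2)

best, best_fit = search(toy_fitness)
print(best, best_fit)
```

Replacing `sample_position` inside the loop with the exploration and exploitation updates of the SO algorithm would turn this skeleton into the optimizer described in the text.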

Deep learning-based combination method of spatio-temporal prediction

To address the failure of existing models to account for the spatio-temporal correlation of subsidence, this study proposed a spatio-temporal prediction method (Fig. 6) for regional mining surface subsidence. The method adaptively determines the optimal parameters, which avoids tedious and random manual parameter adjustment and gives the model strong adaptability and higher accuracy. Its implementation involves three main steps. First, the SDK method is used to divide the surface subsidence time series into a group of partitions. Then, the different subsidence patterns are learned and a local model is constructed within each partition. Finally, the well-trained model is used to make short-term predictions of future regional mining surface subsidence.

Figure 6

Spatio-temporal prediction combination model and data processing flow of regional mining surface subsidence based on deep learning.

Results

Pretreatment

The experimental data were the SBAS InSAR surface subsidence time series of 30,965 pixels in the study area, with subsidence values on 69 dates (at an interval of 12 days) from January 10, 2019, to April 5, 2021. The preceding section provides a detailed description and accuracy verification of the data.

To better utilize the spatial correlation characteristics of the surface subsidence time series in the mining area, the study area was first divided into several partitions by the SDK strategy.

As shown in Fig. 7a, there was a distinct inflection point (i.e., elbow point) when the number of clusters was 3. Therefore, the study area was clustered into three partitions. The partition results of K-means clustering are shown in Fig. 8a: partition 1 had 3718 sample pixels, partition 2 had 9056 sample pixels, and partition 3 had 18,191 sample pixels. Comparing Fig. 8a,b shows that each partition not only aggregated time series with similar subsidence characteristics but also closely matched the spatial distribution of surface subsidence in the mining area, further verifying the reliability of the SDK strategy.

Figure 7

(a) Relationship curve between SSE and number of clusters. (b) Relationship curves between sample input length L and RMSE of the predicted subsidence value of the testing set sample labels within each partition. (c) Decomposition process of the surface subsidence time series.

Figure 8

Results of spatial partitioning: (a) three partitions divided by the SDK strategy, and (b) final cumulative surface subsidence in the study area monitored by SBAS InSAR on April 5, 2021.

To enable the GRU network model to learn these nonlinear subsidence patterns accurately and to increase the number of training samples, it was assumed that the surface subsidence time series of the sample pixels within each partition share the same subsidence pattern and that their subsidence velocities are roughly the same within L time intervals; L is called the optimal sample input length38,39. If the surface subsidence time series of each sample pixel is denoted as \(D_{m} = \left\{ {d_{1} ,d_{2} , \ldots ,d_{m} } \right\}\) (m equals 69 in this study), then \(2 \le L < m\), and the series is decomposed into \(m - L + 1\) subsidence time series, expressed as \(\left\{ {D_{1} ,D_{2} , \ldots ,D_{m - L + 1} } \right\}\) with \(D_{i} = \left\{ {d_{i} ,d_{i + 1} , \ldots ,d_{i + L - 1} } \right\}\)38,39.

In Fig. 7c, for each decomposed surface subsidence time series \(D_{i}\), the first \(L - 1\) values \(\left\{ {d_{i} ,d_{i + 1} , \ldots ,d_{i + L - 2} } \right\}\) and the last value \(d_{i + L - 1}\) are used as the sample input and label for training the GRU network model, respectively. For each sample pixel, the first 80% and last 20% of \(\left\{ {D_{1} ,D_{2} , \ldots ,D_{m - L + 1} } \right\}\) are used as the training and testing sets for the GRU network model, respectively.
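The decomposition and the chronological split can be sketched as follows for a single pixel. The function names are hypothetical, the series is a placeholder ramp rather than real subsidence data, and rounding the 80% boundary down to an integer is an assumption, since the text does not state how fractional counts are handled.

```python
import numpy as np

def make_windows(series, L):
    """Decompose a length-m series into m - L + 1 windows of length L.

    The first L - 1 values of each window are the input and the last value
    is the label.
    """
    series = np.asarray(series, dtype=float)
    m = len(series)
    if not 2 <= L < m:
        raise ValueError("L must satisfy 2 <= L < m")
    windows = np.stack([series[i:i + L] for i in range(m - L + 1)])
    return windows[:, :-1], windows[:, -1]

def chronological_split(inputs, labels, train_fraction=0.8):
    """Earliest windows for training, latest windows for testing."""
    n_train = int(len(inputs) * train_fraction)  # assumption: round down
    return (inputs[:n_train], labels[:n_train]), (inputs[n_train:], labels[n_train:])

series = np.arange(69, dtype=float)  # placeholder for one pixel, m = 69
X, y = make_windows(series, L=8)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)
print(X.shape, y.shape, len(train_X), len(test_X))  # (62, 7) (62,) 49 13
```

With m = 69 and L = 8, each pixel yields 62 samples, which is how the sliding decomposition multiplies the number of training samples per partition.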

As shown in Fig. 7b, when the sample input lengths L in partitions 1, 2, and 3 were 8, 14, and 6, respectively, the RMSE of predicted surface subsidence values on the sample labels of the testing set was the smallest. Therefore, the optimal sample input lengths for these three partitions should be 8, 14, and 6.

Finally, the training and testing series were standardized using the z-score to eliminate the influence of the order of magnitude of the surface subsidence time series on the training results48. The standardization is given by the following equation:

$$ Z = {\left( {X - \mu } \right)}/\sigma, $$
(4)

where \(X\) refers to the surface subsidence time series, \(\mu\) and \(\sigma\) refer to the average value and the standard deviation of the time series, respectively.

The standardized surface subsidence time series was recorded as \(\left\{ {Z_{1} ,Z_{2} , \ldots ,Z_{m - L + 1} } \right\}\), with \(Z_{i} = \left\{ {z_{i} ,z_{i + 1} , \ldots ,z_{i + L - 1} } \right\}\) and \(i = 1,2, \ldots ,m - L + 1\).
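A minimal sketch of the z-score standardization in Eq. (4), together with the inverse transform needed to turn model outputs back into millimetres, is shown below. The use of the population standard deviation and the function names are assumptions; the text does not state whether the statistics are computed per pixel or per partition.

```python
import numpy as np

def zscore(x, mean=None, std=None):
    """Eq. (4): standardize x, optionally reusing previously computed statistics."""
    x = np.asarray(x, dtype=float)
    mean = x.mean() if mean is None else mean
    std = x.std() if std is None else std  # population standard deviation
    return (x - mean) / std, mean, std

def inverse_zscore(z, mean, std):
    """Map standardized values back to the original units."""
    return np.asarray(z, dtype=float) * std + mean

raw = np.array([-2.0, -4.0, -6.0])  # invented subsidence values (mm)
z, mu, sigma = zscore(raw)
restored = inverse_zscore(z, mu, sigma)
print(z, restored)
```

Passing the training-set `mean` and `std` when standardizing the testing set keeps the two sets on the same scale.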

Spatio-temporal analysis of surface subsidence

In each partition, the first 80% of standardized time series \(\left\{ {Z_{1} ,Z_{2} , \ldots ,Z_{m - L + 1} } \right\}\) of each sample pixel was used as the training set to train GRU network model, and the last 20% was used as the testing set to test GRU network model.

Training the GRU network model involves many network parameters, among which the number of neurons in the GRU hidden layer, learning rate, dropout rate, and batch size of training samples are the most critical. In this study, the SO algorithm was used to globally optimize these four parameters, and the optimal parameters were then used to train the GRU network model.

As shown in Fig. 9, the coefficients of determination within the three partitions were all as high as 0.99. This indicated that the predicted surface subsidence values within each partition were strongly positively correlated with the SBAS InSAR monitoring values and that the combination prediction model trained in this experiment was effective.

Figure 9

Relationship and coefficient of determination between the predicted and SBAS InSAR monitored surface subsidence values on the last day of the testing set (i.e., April 5, 2021): (a) Partition1, (b) Partition2, and (c) Partition3.

To further validate the effectiveness of the proposed combination prediction method in predicting regional mining surface subsidence, the LSTM and GRU were also employed to predict the surface subsidence values in the study area without spatial partitioning or parameter optimization. The first 80% of the standardized time series \(\left\{ {Z_{1} ,Z_{2} , \ldots ,Z_{m - L + 1} } \right\}\) of each sample pixel was used as the training set for model training, and the last 20% was used as the testing set for model testing. The critical parameters used for training the LSTM and GRU network models were as follows: number of neurons, 128; learning rate, 0.001; dropout rate, 0.20; and batch size, 4096.

As shown in Fig. 10a–c, the surface subsidence distribution and values predicted by the three methods in the mining area were very consistent with the results of SBAS InSAR monitoring. The prediction results of the three methods were reliable. However, as shown in Fig. 10d–f, the prediction accuracy of our method was significantly higher than the other two methods. Specifically, in the prediction results of our method, there were 30,686 sample points (99.1% of all sample pixels) with an absolute error of less than 8 mm. In the prediction results of GRU and LSTM methods, there were 29,975 sample points (96.8% of all sample pixels) and 29,757 sample points (96.1% of all sample pixels) with an absolute error of less than 8 mm, respectively. The prediction accuracy of our method was higher than that of LSTM and GRU.

Figure 10

Regional mining surface subsidence and their absolute error predicted by the combined prediction method, GRU, and LSTM on the last day of the testing set (i.e. April 5, 2021): (a) combination prediction method’s predicted subsidence, (b) GRU’s predicted subsidence, (c) LSTM’s predicted subsidence, (d) combination prediction method’s prediction error, (e) GRU’s prediction error, and (f) LSTM’s prediction error.
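The accuracy shares quoted above follow directly from the pixel counts. The short check below recomputes them from the reported numbers; the helper `share_within`, which derives the same share from arrays of predicted and observed values, is a hypothetical illustration with an assumed 8 mm threshold argument.

```python
import numpy as np

TOTAL_PIXELS = 30965  # sample pixels in the study area

def share_below(count, total=TOTAL_PIXELS):
    """Percentage of pixels whose absolute error is below the threshold."""
    return round(100.0 * count / total, 1)

def share_within(predicted, observed, tol_mm=8.0):
    """Fraction of pixels with absolute error below tol_mm."""
    err = np.abs(np.asarray(predicted, dtype=float) - np.asarray(observed, dtype=float))
    return float((err < tol_mm).mean())

reported = {"combined": 30686, "GRU": 29975, "LSTM": 29757}
shares = {name: share_below(n) for name, n in reported.items()}
print(shares)  # {'combined': 99.1, 'GRU': 96.8, 'LSTM': 96.1}
```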

Spatio-temporal prediction of surface subsidence

Let the input data set be X and the prediction result be Y. In the cyclic prediction mode, the subsidence value at the (L + 1)th moment is predicted from the subsidence values of the first L moments. The predicted value is then appended to the input data set to form a new X, from which the value at the (L + 2)th moment is predicted, and so on.
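The cyclic prediction mode can be sketched as a loop that feeds each prediction back into the input window. The callable `predict_next` is a hypothetical stand-in for the trained GRU model, and the linear extrapolation used in the example only demonstrates the mechanics.

```python
from collections import deque

def cyclic_forecast(history, predict_next, window, steps):
    """Roll a one-step model forward, feeding each prediction back as input.

    history      : observed values, at least `window` long
    predict_next : callable mapping a list of `window` values to the next value
    window       : number of past values the model consumes
    steps        : number of future values to produce
    """
    if len(history) < window:
        raise ValueError("history is shorter than the input window")
    buffer = deque(history[-window:], maxlen=window)
    forecast = []
    for _ in range(steps):
        nxt = predict_next(list(buffer))
        forecast.append(nxt)
        buffer.append(nxt)  # the oldest value drops out automatically
    return forecast

# Stand-in model (hypothetical): repeat the last observed step.
def linear_model(w):
    return w[-1] + (w[-1] - w[-2])

forecast = cyclic_forecast([0.0, -2.0, -4.0], linear_model, window=2, steps=3)
print(forecast)  # [-6.0, -8.0, -10.0]
```

Because every step after the first consumes earlier predictions, errors can accumulate as the horizon grows, which is one reason to restrict this mode to short-term prediction.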

The combination of cyclic prediction mode and well-trained spatio-temporal GRU network model was used to make short-term surface subsidence prediction in regional mining areas for 15 time periods within 180 days (April 5, 2021 to October 2, 2021), with an equal time interval of 12 days. Figure 11 shows the regional mining surface subsidence and their growth values during three periods predicted by the presented deep learning-based combination prediction method.

Figure 11

Predicted regional surface subsidence and their growth values within 180 days in the mining area: (a) January 10, 2019–April 17, 2021, (b) January 10, 2019–June 28, 2021, (c) January 10, 2019–October 2, 2021, (d) April 5, 2021–April 17, 2021, (e) April 5, 2021–June 28, 2021, and (f) April 5, 2021–October 2, 2021.

As shown in Fig. 11a–f, the mining surface subsidence in the study area would gradually increase over the next 6 months, and the maximum cumulative subsidence value at the subsidence center would reach − 801 mm. The range of surface subsidence would continuously expand, and the subsidence basin would expand noticeably as well. The surface subsidence trend at the center of the basin would become gentle, while the subsidence at the edges would become more severe.

As shown in Fig. 12a–c, the predicted surface subsidence time series after April 5, 2021, were very consistent with the previous subsidence trend shown in the SBAS InSAR monitoring results. The maximum cumulative subsidence values at the three leveling feature points H2, Q10, and Q56 would reach − 327 mm, − 747 mm, and − 188 mm in the next 6 months, respectively. At leveling points Q10 and Q56, the surface would show a sustained and slow downward trend over the next 6 months, while at H2, the surface would first continue to subside and then gradually stabilize.

Figure 12

Predicted surface subsidence time series on the three leveling points: (a) H2, (b) Q10, and (c) Q56.

Discussion

Comparison experiment

To validate the prediction performance of our combination prediction method, we quantitatively compared it with three well-known models: MLR, RNN, and SVR. The mean absolute error (MAE), RMSE, and coefficient of determination (R²) were used as evaluation metrics, with all methods using the same data source. As shown in Table 1, our combined method outperforms the other algorithms, as evidenced by the highest R² (0.9918) and lowest RMSE (2.35).

Table 1 Comparison of experimental results of different prediction models.
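For reference, the three evaluation metrics can be computed as in the sketch below. The function name and the sample values are hypothetical and do not reproduce any entry of Table 1.

```python
import numpy as np

def regression_metrics(predicted, observed):
    """Return MAE, RMSE, and the coefficient of determination R2."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    residual = predicted - observed
    mae = float(np.abs(residual).mean())
    rmse = float(np.sqrt((residual ** 2).mean()))
    ss_res = float((residual ** 2).sum())                        # unexplained variation
    ss_tot = float(((observed - observed.mean()) ** 2).sum())    # total variation
    return {"MAE": mae, "RMSE": rmse, "R2": 1.0 - ss_res / ss_tot}

metrics = regression_metrics([1.0, 2.0, 4.0], [1.0, 2.0, 3.0])
print(metrics)  # MAE = 1/3, RMSE = sqrt(1/3), R2 = 0.5
```

MAE and RMSE carry the units of the subsidence values, whereas R² is dimensionless, so a model should rank best on all three before it is called the most accurate.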

Research limitations

While the feasibility and high accuracy of our prediction algorithm have been validated in a coal mining area, some shortcomings of our combination strategy need to be discussed and addressed.

  1. The comparison experiments above show that the combination method proposed in this study has the highest prediction accuracy. However, the algorithm depends on complex data pre-processing, and its accuracy can only be ensured when these pre-processing steps are applied. Future work should simplify the data processing while maintaining high accuracy.

  2. The GRU model used for time series prediction in this study can be improved. Like other deep learning (DL)-based models, e.g., LSTM, the GRU model relies on the quantity and quality of samples to improve its performance in predicting surface subsidence. Therefore, obtaining enough data at a low cost is an issue that needs to be addressed.

  3. Additionally, the GRU model shows high accuracy and robustness in the short term but performs worse in predicting surface subsidence in the long term, which may be attributed to its structure.

  4. Our algorithm assumed that the surrounding geologic environment (e.g., sediment thickness and land use type) would not change significantly over the study period, and we focused on predicting slow surface subsidence. Therefore, our algorithm is not applicable to transient scenarios (e.g., landslides and earthquakes).

Conclusion

In regions affected by coal mining, surface subsidence poses significant risks to human life and property. Therefore, it is essential to establish accurate and effective prediction models and grasp the spatio-temporal evolution laws of surface subsidence in mining areas in advance. In this study, the spatio-temporal correlation and temporal nonlinear characteristics of mining surface subsidence were considered, and a DL-based combination method of spatio-temporal prediction for regional mining surface subsidence was proposed. Consequently, high-precision regional surface subsidence prediction of a mining area in Heze City, Shandong Province, China, was obtained based on the SBAS InSAR monitoring surface subsidence time series. The main conclusions are as follows:

  1. To account for the spatio-temporal relationship of surface subsidence in mining areas, three partitions were obtained using the spatio-temporal distance, K-means clustering algorithm, and elbow method. The partitions were highly consistent with the spatial distribution of surface subsidence monitored by SBAS InSAR, which highlights the strengths and flexibility of the partitioning method.

  2. Because the surface subsidence time series in mining areas exhibited obvious nonlinear characteristics, the optimal sample input length L of each partition was determined based on the minimum RMSE. The subsidence time series of each sample pixel was then decomposed into several subsidence time series at L time intervals to construct the training and testing sets of the GRU network model. The resulting sample sets not only increase the number of training and testing samples but also enable the GRU network model to learn the nonlinear surface subsidence patterns in mining areas more accurately. Consequently, the distribution and values of the predicted surface subsidence were reliable and closely consistent with the monitoring results of SBAS InSAR.

  3. Using RMSE as the fitness, the SO algorithm was used to globally optimize the number of neurons, learning rate, dropout rate, and batch size of training samples for training the GRU network model. A well-trained and optimal GRU prediction network model was then constructed. With this model, high-precision prediction of surface subsidence in regional mining areas can be obtained. This adaptive method of determining the optimal parameters eliminated tedious and random manual parameter adjustment, giving the prediction model strong adaptability and high accuracy.

  4. Compared with the LSTM and the existing GRU model, our DL-based combination method of spatio-temporal prediction for regional mining surface subsidence has higher prediction accuracy. The predicted surface subsidence time series in the next 6 months exhibited good agreement with the subsidence trend observed in the earlier SBAS InSAR monitoring results.

Overall, this study expands the applicability of GRU network models in the field of surface subsidence prediction and the application scope of deep learning techniques, which can support accurate prediction of surface subsidence, especially in coal mining areas.