Introduction

Ground subsidence around abandoned coal mines has recently become a serious social problem in Korea, where almost all underground coal mines have been abandoned since 1989 and few remain in operation. However, few attempts have been made to assess predicted ground subsidence areas quantitatively, especially in coal mining areas where the geological and mining structures are complex. For this reason, the purpose of the present study was to assess and predict ground subsidence for hazard mapping near an abandoned underground coal mine (AUCM) area using artificial neural networks and GIS.

A method that predicts the probability of ground subsidence empirically, within surprisingly narrow limits considering the form of the input data, has been suggested (Goel and Page 1982). It uses (1) the intact strength of the rock, (2) the stress field, (3) the geological structure, (4) the depth of the mining horizon, (5) the extent of the mined area, and (6) the volume extracted per unit area of mining. The National Coal Board has published a basic technique to determine the estimated area influenced by ground subsidence based on the height of the cavity, the width of the mined panel, and the angle of inclination of the coal seam (National Coal Board 1975). The method used to predict the subsidence area depends on the structure of the local geology and the coal-mining method used, and the empirical methods discussed above were developed for horizontal coal seams and longwall working, which are predominant in Europe. In Korea, however, the geology is heterogeneous: coal seams vary in width, and both the seams and the strata are irregularly inclined, so the slant-chute block caving method has been used. As a result, sinkhole-type subsidence is usual, and a different approach to estimating ground subsidence is necessary. Table 1 shows the factors that commonly affect sinkhole-type ground subsidence over time (Coal Industry Promotion Board 1997). Quantitative analysis of potential ground subsidence near AUCMs in Korea has not been well studied to date, although Kim et al. (2006) studied it using probabilistic and statistical models in a GIS environment. The fundamental difference between the present study and that of Kim et al. (2006) is the application of artificial neural networks in a GIS environment.

Table 1 Factors affecting sinkhole-type ground subsidence (Coal Industry Promotion Board 1997)

When choosing a study area, field investigations and reinforcement reports related to ground subsidence were carefully assessed. In this study, a site called Magyori was chosen, where 21 signs of ground subsidence have been identified near an AUCM at Samcheok City (Coal Industry Promotion Board 1999). The study site is located between longitudes 129° 2′ 40″ and 129° 3′ 30″ and latitudes 37° 14′ 26″ and 37° 15′ 24″. The coal resource of South Korea consists almost entirely of anthracite, 85% of which was deposited during the Upper Paleozoic and the Lower Mesozoic in the Jangseong Formation of the Pyeongan Supergroup (The Geological Society of Korea 1999). The Oship Fault, Youngdong railroad, and no. 38 local road pass along the study area (Coal Industry Promotion Board 1997). The location map of this study site with ground subsidence areas is given in Fig. 1.

Fig. 1 Study area

Theory: artificial neural networks

An artificial neural network is a “computational mechanism able to acquire, represent, and compute a mapping from one multivariate space of information to another, given a set of data representing that mapping” (Garrett 1994). The back-propagation training algorithm is the most frequently used neural network method (Lee et al. 2004; Sonmez et al. 2006; Tunusluoglu et al. 2007; Zorlu et al. 2008; Nefeslioglu et al. 2008) and is the method used in this study. It trains the network using a set of examples of associated input and output values. The purpose of an artificial neural network is to build a model of the data-generating process, so that the network can generalize and predict outputs from inputs that it has not previously seen. The algorithm operates on a multi-layered neural network, which consists of an input layer, hidden layers, and an output layer. The hidden and output layer neurons process their inputs by multiplying each input by a corresponding weight, summing the products, then processing the sum using a nonlinear transfer function to produce a result. The network “learns” by adjusting the weights between the neurons in response to the errors between the actual output values and the target output values. At the end of this training phase, the neural network provides a model that should be able to predict a target value from a given input value.

Paola and Schwengerdt (1995) indicated that there are two stages involved in using neural networks for multi-source classification: the training stage, in which the internal weights are adjusted, and the classifying stage. Typically, the back-propagation algorithm trains the network until some targeted minimal error is achieved between the desired and actual output values of the network. Once the training is complete, the network is used as a feed-forward structure to produce a classification for the entire data set (Paola and Schwengerdt 1995).

A neural network consists of a number of interconnected nodes. Each node is a simple processing element that responds to the weighted inputs it receives from other nodes. The arrangement of the nodes is referred to as the network architecture (Fig. 2). The receiving node sums the weighted signals from all the nodes that it is connected to in the preceding layer. Formally, the input that a single node receives is weighted according to Eq. 1.

$$ {\text{net}}_{j} = \sum\limits_{i} {w_{ij} } \cdot o_{i} $$
(1)

where $w_{ij}$ represents the weight between nodes i and j, and $o_{i}$ is the output from node i. The output from node j is given by

$$ o_{j} = f({\text{net}}_{j} ). $$
(2)

Fig. 2 Architecture of neural networks for ground subsidence hazard analysis

The function f is usually a nonlinear sigmoid function that is applied to the weighted sum of inputs before the signal propagates to the next layer. One advantage of a sigmoid function is that its derivative can be expressed in terms of the function itself:

$$ f^{\prime}({\text{net}}_{j} ) = f({\text{net}}_{j} )(1 - f({\text{net}}_{j} )). $$
(3)
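Equations 1 to 3 can be illustrated with a minimal sketch for a single receiving node. The logistic form of the sigmoid is an assumption here (the text only states that f is a nonlinear sigmoid function, and Eq. 3 holds for the logistic form), and the weights and incoming outputs are hypothetical values chosen for illustration.

```python
import math

def sigmoid(x):
    """Logistic sigmoid, assumed here as the transfer function f."""
    return 1.0 / (1.0 + math.exp(-x))

def node_output(weights, inputs):
    """Eq. 1 then Eq. 2: weighted sum of incoming outputs, passed through f."""
    net_j = sum(w * o for w, o in zip(weights, inputs))
    return sigmoid(net_j)

def sigmoid_derivative(net_j):
    """Eq. 3: f'(net_j) = f(net_j) * (1 - f(net_j))."""
    f = sigmoid(net_j)
    return f * (1.0 - f)

# Hypothetical values for one node with three incoming connections.
w = [0.2, -0.5, 0.3]   # w_ij
o = [0.9, 0.1, 0.5]    # o_i from the preceding layer
o_j = node_output(w, o)
```

Because the derivative is expressed through f itself, a value already computed in the forward pass can be reused when the error is propagated backwards.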

The network used in this study consisted of three layers. The first layer is the input layer, where the nodes are the elements of a feature vector. The second layer is the internal or “hidden” layer. The third layer is the output layer that presents the output data. Each node in the hidden layer is interconnected to nodes in both the preceding and following layers by weighted connections (Atkinson and Tatnall 1997).

The error, E, for an input training pattern is a function of the desired output vector, d, and the actual output vector, o, given by:

$$ E = \frac{1}{2}\sum\limits_{k} {(d_{k} - o_{k} )^{2} } . $$
(4)

The error is propagated back through the neural networks and is minimized by adjusting the weights between layers. The weight adjustment is expressed as:

$$ \Delta w_{ij} (n + 1) = \eta (\delta _{j} \cdot o_{i} ) + \alpha \Delta w_{ij} (n) $$
(5)

where η is the learning rate parameter (set to η = 0.01 in this study), $\delta_{j}$ is an index of the rate of change of the error, and α is the momentum parameter (set to α = 0.01 in this study).

The factor $\delta_{j}$ depends on the layer type. For example,

$$ {\text{for hidden layers}},\,\delta _{j} = (\sum\limits_{k} {\delta _{k} w_{jk} } )f^{\prime}({\text{net}}_{j} ) $$
(6)
$$ {\text{and for output layers}},\,\delta _{k} = (d_{k} - o_{k} )f^{\prime}({\text{net}}_{k} ). $$
(7)

This process of feeding forward signals and back-propagating the error is repeated iteratively until the error of the networks as a whole is minimized or reaches an acceptable magnitude.
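The feed-forward and back-propagation cycle described above can be sketched for a single training pattern as follows. This is an illustrative sketch, not the program used in the study: it assumes a logistic sigmoid, the usual squared-error form of E, and a reading of Eq. 5 as a weight adjustment with a momentum term. The 2 × 2 × 1 network, its weights, the pattern, and the larger learning rate in the usage lines are all hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_ih, w_ho):
    """Feed-forward pass: returns hidden-layer and output-layer outputs."""
    hidden = [sigmoid(sum(w_ih[i][j] * x[i] for i in range(len(x))))
              for j in range(len(w_ih[0]))]
    out = [sigmoid(sum(w_ho[j][k] * hidden[j] for j in range(len(hidden))))
           for k in range(len(w_ho[0]))]
    return hidden, out

def train_step(x, d, w_ih, w_ho, dw_ih, dw_ho, eta=0.01, alpha=0.01):
    """One pattern update; weights change in place. Returns the error E."""
    hidden, out = forward(x, w_ih, w_ho)
    # Output-layer deltas (Eq. 7), with f' written as o * (1 - o) from Eq. 3.
    delta_o = [(d[k] - out[k]) * out[k] * (1.0 - out[k])
               for k in range(len(out))]
    # Hidden-layer deltas (Eq. 6), computed before any weight is changed.
    delta_h = [sum(delta_o[k] * w_ho[j][k] for k in range(len(out)))
               * hidden[j] * (1.0 - hidden[j]) for j in range(len(hidden))]
    # Weight adjustment (Eq. 5): eta * delta * o plus alpha * previous change.
    for j in range(len(hidden)):
        for k in range(len(out)):
            dw_ho[j][k] = eta * delta_o[k] * hidden[j] + alpha * dw_ho[j][k]
            w_ho[j][k] += dw_ho[j][k]
    for i in range(len(x)):
        for j in range(len(hidden)):
            dw_ih[i][j] = eta * delta_h[j] * x[i] + alpha * dw_ih[i][j]
            w_ih[i][j] += dw_ih[i][j]
    return 0.5 * sum((d[k] - out[k]) ** 2 for k in range(len(out)))

# Hypothetical 2 x 2 x 1 network and a single training pattern.
w_ih = [[0.1, -0.2], [0.3, 0.4]]     # input -> hidden weights
w_ho = [[0.5], [-0.3]]               # hidden -> output weights
dw_ih = [[0.0, 0.0], [0.0, 0.0]]     # previous adjustments (momentum)
dw_ho = [[0.0], [0.0]]
errors = [train_step([0.9, 0.1], [0.9], w_ih, w_ho, dw_ih, dw_ho, eta=0.5)
          for _ in range(200)]
```

In practice the loop would run over all training patterns in each epoch and stop once the error reaches the chosen threshold or the epoch limit.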

Using the back-propagation training algorithm, the weights of each factor can be determined and may be used to classify data (input vectors) that the network has not seen before. Zhou (1999) described a method for determining the weights using back propagation. From Eq. 2, the effect of an output, $o_{j}$, from a hidden layer node, j, on the output, $o_{k}$, from an output layer node, k, can be represented by the partial derivative of $o_{k}$ with respect to $o_{j}$ as

$$ \frac{{\partial o_{k} }}{{\partial o_{j} }} = f^{\prime}({\text{net}}_{k} ) \cdot \frac{{\partial ({\text{net}}_{k} )}}{{\partial o_{j} }} = f^{\prime}({\text{net}}_{k} ) \cdot w_{jk} . $$
(8)

Equation (8) produces both positive and negative values. If the effect’s magnitude is all that is of interest, then the importance (weight) of node j relative to another node j0 in the hidden layer may be calculated as the ratio of the absolute values derived from Eq. 8:

$$ \frac{{\left| {\partial o_{k} } \right|}}{{\left| {\partial o_{j} } \right|}}/\frac{{\left| {\partial o_{k} } \right|}}{{\left| {\partial o_{j0} } \right|}} = \frac{{\left| {f^{\prime}({\text{net}}_{k} ) \cdot w_{jk} } \right|}}{{\left| {f^{\prime}({\text{net}}_{k} ) \cdot w_{j0k} } \right|}} = \frac{{\left| {w_{jk} } \right|}}{{\left| {w_{j0k} } \right|}}. $$
(9)

It should be noted that $w_{j0k}$ is simply another weight in $w_{jk}$ other than $w_{ik}$.

For a given node in the output layer, the results of Eq. 9 show that the relative importance of a node in the hidden layer is proportional to the absolute value of the weight connecting the node to the output layer. When the output layer of the network has more than one node, Eq. 9 cannot be used to compare the importance of two nodes in the hidden layer. Instead, the reference weight $w_{j0k}$ is taken as the average absolute weight over all J hidden nodes (Eq. 10), and the importance of node j with respect to node k is normalized by this average (Eq. 11):

$$ w_{j0k} = \frac{1}{J} \cdot \sum\limits_{j = 1}^{J} {\left| {w_{jk} } \right|} $$
(10)
$$ t_{jk} = \frac{{\left| {w_{jk} } \right|}}{{\frac{1}{J} \cdot \sum\nolimits_{j = 1}^{J} {\left| {w_{jk} } \right|} }} = \frac{{J \cdot \left| {w_{jk} } \right|}}{{\sum\nolimits_{j = 1}^{J} {\left| {w_{jk} } \right|} }} $$
(11)

Therefore, with respect to node k, each node in the hidden layer has a value that is greater or smaller than unity, depending on whether it is more or less important, respectively, than an average value. All the nodes in the hidden layer have a total importance with respect to the same node, given by

$$ \sum\limits_{j = 1}^{J} {t_{jk} } = J. $$
(12)

Consequently, the overall importance of node j with respect to all the nodes in the output layer can be calculated by

$$ t_{j} = \frac{1}{K} \cdot \sum\limits_{k = 1}^{K} {t_{jk} } . $$
(13)

Similarly, with respect to node j in the hidden layer, the normalized importance of node i in the input layer can be defined by

$$ s_{ij} = \frac{{\left| {\omega _{ij} } \right|}}{{\frac{1}{I} \cdot \sum\nolimits_{i = 1}^{I} {\left| {\omega _{ij} } \right|} }} = \frac{{I \cdot \left| {\omega _{ij} } \right|}}{{\sum\nolimits_{i = 1}^{I} {\left| {\omega _{ij} } \right|} }}. $$
(14)

The overall importance of node i with respect to the hidden layer is

$$ s_{i} = \frac{1}{J} \cdot \sum\limits_{j = 1}^{J} {s_{ij} } . $$
(15)

Correspondingly, the overall importance of input node i with respect to the output layer is given by

$$ st_{i} = \frac{1}{J} \cdot \sum\limits_{j = 1}^{J} {s_{ij} \cdot t_{j} } . $$
(16)
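The chain from Eq. 10 to Eq. 16 can be sketched as a short calculation on the trained weights. The function names, the nested-list weight layout, and the example weights of a 3 × 2 × 1 network are hypothetical; the sketch also assumes that no column of weights is entirely zero, since the averages are used as divisors.

```python
def hidden_importance(w_ho):
    """t_j (Eq. 13). w_ho[j][k]: weight from hidden node j to output node k."""
    J, K = len(w_ho), len(w_ho[0])
    t_jk = [[0.0] * K for _ in range(J)]
    for k in range(K):
        mean_abs = sum(abs(w_ho[j][k]) for j in range(J)) / J   # Eq. 10
        for j in range(J):
            t_jk[j][k] = abs(w_ho[j][k]) / mean_abs             # Eq. 11
    return [sum(t_jk[j]) / K for j in range(J)]                 # Eq. 13

def input_importance(w_ih, t):
    """st_i (Eq. 16). w_ih[i][j]: weight from input node i to hidden node j."""
    I, J = len(w_ih), len(w_ih[0])
    st = []
    for i in range(I):
        total = 0.0
        for j in range(J):
            mean_abs = sum(abs(w_ih[m][j]) for m in range(I)) / I
            s_ij = abs(w_ih[i][j]) / mean_abs                   # Eq. 14
            total += s_ij * t[j]
        st.append(total / J)                                    # Eq. 16
    return st

# Hypothetical trained weights for a 3 x 2 x 1 network.
w_ih = [[0.4, -0.2], [0.1, 0.6], [-0.3, 0.1]]
w_ho = [[0.8], [-0.2]]
t = hidden_importance(w_ho)
st = input_importance(w_ih, t)
```

For these illustrative weights the hidden-node importances sum to J = 2, as Eq. 12 requires, and each entry of `st` is the relative importance of one input factor.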

Data

Many studies have identified important factors that contribute to ground subsidence around coal mines (Coal Industry Promotion Board 1997; Waltham 1989), including the depth and height of the mined cavities, excavation method, degree of inclination of the excavation, scope of mining, structural geology and flow of groundwater. Therefore, the factors governing the occurrence of ground subsidence were collected in a vector-type spatial database. The sources included a 1:50,000 scale geological map, 1:5,000 scale topographic maps, 1:5,000 scale land use maps, 1:1,200 scale mined-tunnel maps, and borehole data. A reliably accurate spatial database is indispensable in a GIS environment. For this reason, accurate maps authorized by Korean government agencies were collected even though the scales of the maps differed. The data layers are shown in Table 2.

Table 2 Constructed GIS database including factors connected with ground subsidence of study area

The geology data were extracted from a 1:50,000 scale geological map of the Korea Institute of Geoscience and Mineral Resources. Contours and survey base points with elevation values were extracted from the topographic map, and a digital elevation model (DEM) was constructed. Using the DEM, the slope gradients were calculated. There are 14 classes of land use, which were extracted from the land use map of the National Geographic Information Institute. Most of the literature (Goel and Page 1982; National Coal Board 1975; Waltham 1989) maintains that the major factor in ground subsidence is the scope of the mined cavities. Therefore, constructing a database of the depths and widths of mined cavities was very important in this study. To achieve this objective, (1) GPS (ProMark2 GPS system, static survey accuracy better than 10 mm) measurements were used to determine the exact positions of mine heads; (2) these were used to vectorize a hard copy of the mined tunnel map; and (3) the vectorized mined tunnel map was converted to an ASCII grid file and subtracted from the DEM raster data. There were 35 boreholes at the study site, but some boreholes did not have values, so an inverse distance weighting (IDW) interpolation method was used to contour groundwater levels and permeability factors.
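The IDW interpolation mentioned above can be sketched as follows. The study does not report the distance exponent or search settings, so the power of 2 used here is an assumption (a common default), and the borehole coordinates and groundwater levels are hypothetical.

```python
import math

def idw(x, y, samples, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples: list of (xs, ys, value) tuples for boreholes with a measurement.
    power: distance exponent; 2 is assumed, not taken from the study.
    """
    numerator = 0.0
    denominator = 0.0
    for xs, ys, value in samples:
        dist = math.hypot(x - xs, y - ys)
        if dist == 0.0:
            return value  # the point coincides with a measured borehole
        weight = 1.0 / dist ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Hypothetical boreholes: (x in m, y in m, groundwater level in m).
boreholes = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 14.0)]
level_mid = idw(5.0, 0.0, boreholes[:2])   # midway between two boreholes
level_any = idw(3.0, 4.0, boreholes)
```

Evaluating `idw` at the center of every grid cell yields a continuous raster that fills in the boreholes without values.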

Method

This study was conducted using GIS and artificial neural networks with factors that may cause ground subsidence. An image database and an attribute database for ground subsidence were constructed. The principal assumption of this approach is that the potential for ground subsidence (occurrence possibility) will be similar to the actual frequency of ground subsidence. After the study site was selected, areas of ground subsidence were detected at the study site by field surveys. A map of existing ground subsidence was developed, and this was used to evaluate the frequency and distribution of ground subsidence at the study site.

First, maps relevant to ground subsidence occurrence were used to construct a vector-type spatial database using the GIS software package ARC/INFO. Second, ground subsidence occurrence areas were detected in the study area by interpretation of field surveys, and a map of the ground subsidence locations was added to the spatial database using GIS. Third, for the calculation of the weights, the ground subsidence factors were converted to grids (ARC/INFO grid type) and then to ASCII data for use with an artificial neural network program. The ASCII datasets were normalized between 0.1 and 0.9, since the value of the sigmoid function used in artificial neural networks varies from 0 to 1. Then, using the detected ground subsidence locations, the weight of each factor was determined by training an artificial neural network program developed using MATLAB (Demuth et al. 2005). For the weight determination, the locations where ground subsidence had occurred were assigned as a training area and the artificial neural network was trained. When the weights converged to a proper value, they were determined by back propagation between the neural network layers. The results of the analysis were then converted to grid data using the GIS. Finally, ground subsidence hazard mapping was carried out using the weights, and the analytical results were verified using the ground subsidence locations.
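The normalization to the 0.1–0.9 range can be sketched as a linear rescaling. The text does not give the formula, so min-max scaling is an assumption, and the function name and sample values are hypothetical.

```python
def normalize(values, low=0.1, high=0.9):
    """Linearly rescale values so the minimum maps to low and the maximum to high."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # A constant layer carries no contrast; place it mid-range.
        return [(low + high) / 2.0 for _ in values]
    scale = (high - low) / (vmax - vmin)
    return [low + (v - vmin) * scale for v in values]

# Hypothetical slope values in degrees for five cells.
slopes = [0.0, 5.0, 10.0, 25.0, 40.0]
scaled = normalize(slopes)
```

Keeping the inputs away from 0 and 1 avoids the flat tails of the sigmoid, where its derivative is close to zero and the weight adjustments become very small.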

In this study, the GIS software ArcView 3.3 and ARC/INFO version 9.0 were used as the basic analysis tools for spatial management and data manipulation.

Prediction of ground subsidence using the artificial neural networks

Figure 3 is the flowchart of the neural network training for the weight determination. The weights between layers that were acquired by training the neural network were calculated in reverse, and from them the contribution or importance of each factor was determined. For the calculation of the weights, the program developed by Hines (1997) was used, and for the interpretation of the weights, a newly developed program was used.

Fig. 3 The flow chart of neural networks training for weight determination

Fig. 4 Input factors

The seven factors listed in Table 2 were used as the input data. Using GIS software, the factors were converted to a grid of 1,207 rows and 1,742 columns with a cell size of 1 m × 1 m (Fig. 4), giving 2,102,594 cells in total, of which 10,369 were ground subsidence occurrence cells.

Both subsidence-prone (occurrence) locations and locations that were not prone to subsidence were selected as training sites. Areas where subsidence had not occurred were classified as “areas not prone to subsidence”, and areas where subsidence was known to exist were assigned to an “areas prone to subsidence” training set. Cells from each of the two classes were randomly selected as training cells, with 3,000 cells denoting areas where subsidence had or had not occurred. The training sites were processed ten times to identify any changes that might occur due to the assignment of random initial weights.
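The random selection of training cells from the two classes can be sketched as follows. The function name, the dictionary layout of the labels, and the toy grid are hypothetical, and the number of cells drawn per class is left as a parameter because the text does not state how the 3,000 cells were divided between the classes.

```python
import random

def select_training_cells(labels, n_per_class, seed=None):
    """Draw n_per_class cells at random from each class, without replacement.

    labels: dict mapping cell id -> 1 (subsidence occurred) or 0 (not occurred).
    """
    rng = random.Random(seed)
    prone = [cell for cell, value in labels.items() if value == 1]
    not_prone = [cell for cell, value in labels.items() if value == 0]
    return rng.sample(prone, n_per_class), rng.sample(not_prone, n_per_class)

# Hypothetical toy grid of 20 cells, the first 4 marked as subsidence cells.
labels = {cell: 1 if cell < 4 else 0 for cell in range(20)}
prone_cells, not_prone_cells = select_training_cells(labels, 3, seed=0)
```

Repeating the draw with different seeds gives one way to check how sensitive the trained weights are to the choice of training cells.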

The back-propagation algorithm was then applied to calculate the weights between the input layer and the hidden layer, and between the hidden layer and the output layer, by modifying the number of hidden nodes and the learning rate. A three-layered feed-forward network was implemented using the MATLAB software package. Here, “feed-forward” denotes that the interconnections between the layers propagate forward to the next layer. The number of hidden layers and the number of nodes in a hidden layer required for a particular classification problem are not easy to deduce. In this study, a 7 × 15 × 1 structure was selected for the network, with input data normalized in the range 0.1–0.9. The nominal and interval class group data were converted to continuous values ranging between 0.1 and 0.9. For these data, the values are therefore nominal rather than ordinal: the numbers only denote the classification of the input data.

The learning rate was set to 0.01, and the initial weights were randomly assigned values between 0.1 and 0.9. The weights calculated from ten test cases were compared to determine whether the variation in the final weights depended on the selection of the initial weights. The back-propagation algorithm was used to minimize the error between the target output values and the calculated output values. The algorithm propagated the error backwards and iteratively adjusted the weights. The maximum number of epochs was set to 5,000, and the root mean square error (RMSE) value used for the stopping criterion was set to 0.1. Most of the training data sets met the 0.1 RMSE goal. If the RMSE goal was not achieved, training was terminated at 5,000 epochs; in those cases, the maximum RMSE value was <0.216. The final weights between layers acquired during training of the neural network and the contribution or importance of each of the seven factors used to predict ground subsidence hazard are shown in Table 3. The results differed between runs because the initial weights were assigned random values, so the calculations were repeated ten times. The standard deviation of the results was in the range 0.0153–0.0343, and therefore the random sampling did not have a large effect on the results. For easy interpretation, the average values were calculated and then divided by the average weight of the factor that had the minimum value. The slope had the minimum value, 1.00, and the depth of groundwater had the maximum value, 2.0908. Finally, the weights were applied to the entire study area, and the ground subsidence hazard map was created (Fig. 5). The values were grouped into four equal-area classes for visual interpretation.
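The averaging over repeated runs and the division by the smallest average weight can be sketched as below. The two runs and their weight values are hypothetical placeholders, not the values of Table 3, and the use of the population standard deviation is an assumption because the text does not say which estimator was used.

```python
from statistics import mean, pstdev

def relative_weights(runs):
    """Average each factor's weight over runs and scale so the minimum is 1.

    runs: list of dicts, one per training run, mapping factor name -> weight.
    Returns (relative weights, standard deviation per factor).
    """
    factors = list(runs[0].keys())
    average = {f: mean(run[f] for run in runs) for f in factors}
    spread = {f: pstdev([run[f] for run in runs]) for f in factors}
    smallest = min(average.values())
    relative = {f: average[f] / smallest for f in factors}
    return relative, spread

# Hypothetical weights from two training runs for two of the factors.
runs = [
    {"slope": 0.10, "groundwater_depth": 0.20},
    {"slope": 0.12, "groundwater_depth": 0.24},
]
relative, spread = relative_weights(runs)
```

Scaling by the minimum makes the weights directly comparable: a relative weight of about 2 means the factor contributes roughly twice as much as the least important one.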

Table 3 Weights of each factor estimated by neural networks considered in this study

Fig. 5 Ground subsidence hazard map using artificial neural networks

Verification

The subsidence hazard analysis results were verified using known ground subsidence locations. Verification was performed by comparing the known ground subsidence location data with the subsidence hazard map. Each factor used and its frequency ratio were compared. Rate curves were created and the areas under the curves were calculated for two cases. The rate explains how well the model and the factor predict the subsidence. To obtain the relative rank for each prediction pattern, the calculated index values for all cells in the study area were sorted in descending order. The ordered cell values were then divided into 100 classes, at accumulated 1% intervals. The rate verification results appear as a line in Fig. 6. For example, the 90–100% (10%) class of the study area where the subsidence hazard index had a high rank could explain 91% of all subsidence. The 80–100% (20%) class of the study area where the subsidence hazard index had a high rank could explain 95% of subsidence.

Fig. 6 Cumulative frequency diagram showing ground subsidence hazard rank occurring in cumulative percent of ground subsidence occurrence

The area under the curve can be used to estimate the prediction accuracy quantitatively. To compare the results, the area under the curve was rescaled so that a total area of 1 corresponds to perfect prediction accuracy. The area ratio was 0.9606, so the prediction accuracy can be taken as 96.06%.
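The rate curve and its area can be sketched as follows. The trapezoidal rule and the handling of class boundaries by rounding are assumptions, since the text does not describe how the area was computed, and the ten-cell example is hypothetical.

```python
def rate_curve(index_values, occurred, n_classes=100):
    """Cumulative share of subsidence cells captured by the top-ranked cells.

    index_values: hazard index per cell; occurred: 1 or 0 per cell.
    Returns n_classes + 1 points, from 0% to 100% of the study area.
    """
    order = sorted(range(len(index_values)),
                   key=lambda cell: index_values[cell], reverse=True)
    total = sum(occurred)
    curve = [0.0]
    for step in range(1, n_classes + 1):
        cutoff = round(step * len(order) / n_classes)
        captured = sum(occurred[cell] for cell in order[:cutoff])
        curve.append(captured / total)
    return curve

def area_under_curve(curve):
    """Trapezoidal area with both axes scaled to [0, 1]."""
    width = 1.0 / (len(curve) - 1)
    return sum((a + b) / 2.0 * width for a, b in zip(curve, curve[1:]))

# Hypothetical ten-cell example: the two highest-ranked cells subsided.
index_values = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
occurred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
curve = rate_curve(index_values, occurred, n_classes=10)
auc = area_under_curve(curve)
```

The closer the curve rises to 100% within the first few classes, the closer the area is to 1. Cells with equal index values keep their original order in this sketch, which can shift the curve slightly when ties are frequent.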

Results and discussions

Ground subsidence is one of the most hazardous man-made disasters. Governments and research institutions worldwide have attempted for years to assess subsidence hazards and risks, and to show their spatial distribution. In this study, a data mining approach to identifying areas at hazard of subsidence using GIS showed considerable promise. A ground subsidence hazard map was constructed using artificial neural networks, and its prediction accuracy was 96.06%, which is very high. GIS was used to analyze the large volume of data efficiently, and the artificial neural networks proved to be an effective tool for analyzing ground subsidence hazard.

The weights calculated from ten test cases were compared to determine whether the variation in the final weights was dependent on the selection of the initial weights. The results show that the initial weights did not have an influence on the final weight under the conditions used.

As shown in Table 3, depth of groundwater, land use, permeability and geology have relatively high weights in the analysis of ground subsidence near the AUCM. The surveyed subsidence areas of this study are located around a railroad, a road, and other facilities above a shallow mined tunnel. Therefore, land use, geology and depth of the mined tunnel are important factors, as well as the groundwater level.

The correlation between the flow of groundwater and the depth of the mined tunnel is meaningful in this study: areas where the groundwater is shallower than the mined tunnel have a higher subsidence hazard index than other areas. This explains the relatively high weights for the depth of groundwater and permeability factors (Table 3).

The data on groundwater levels were obtained during field surveys, without considering the amount of rainfall at the time. Nevertheless, the groundwater level is a meaningful value and should be considered when assessing the safety of the base rock.