Abstract
This study constructs a hazard map for presumptive ground subsidence around abandoned underground coal mines (AUCMs) at Samcheok City, Korea, using an artificial neural network with a geographic information system (GIS). To evaluate the factors governing ground subsidence, an image database was constructed from a topographical map, geological map, mining tunnel map, global positioning system (GPS) data, land use map, digital elevation model (DEM) data, and borehole data. An attribute database was also constructed from field investigations and reinforcement working reports for the existing ground subsidence areas at the study site. Seven major factors controlling ground subsidence were determined from the probability analysis of the existing ground subsidence area: depth of drift from the mining tunnel map, DEM and slope gradient obtained from the topographical map, groundwater level and permeability from borehole data, geology, and land use. These factors were used with artificial neural networks to analyze ground subsidence hazard. Each factor’s weight was determined by the back-propagation training method. The ground subsidence hazard indices were then calculated using the trained back-propagation weights, and the ground subsidence hazard map was created using GIS. Known ground subsidence locations were used to verify the hazard map, and the verification showed 96.06% accuracy, indicating good agreement between the presumptive hazard map and the existing data on ground subsidence areas.
Introduction
Ground subsidence around abandoned coal mine areas has recently become a serious social problem in Korea, because almost all underground coal mines have been abandoned since 1989 and few remain. However, there have been very few efforts to quantitatively assess areas of predicted ground subsidence, especially in coal mining areas where the geological and mining structures are complex. For this reason, the purpose of the present study was to assess and predict ground subsidence for hazard mapping near an abandoned underground coal mine (AUCM) area using artificial neural networks and GIS.
A method that predicts the probability of ground subsidence empirically, within surprisingly narrow limits considering the form of the input data, has been suggested (Goel and Page 1982) using (1) the intact strength of the rock, (2) the stress field, (3) the geological structure, (4) the depth of the mining horizon, (5) the extent of the mined area, and (6) the volume extracted per unit area of mining. The National Coal Board has published a basic technique to estimate the area influenced by ground subsidence based on the height of the cavity, the width of the mined panel, and the angle of inclination of the coal seam (National Coal Board 1975). The method used to predict the subsidence area depends on the structure of the local geology and the coal-mining method used, and the empirical methods discussed above were developed for conditions involving horizontal coal seams and longwall working, which are predominant in Europe. In Korea, however, the heterogeneous geological structure produces coal seams of various widths and irregularly inclined coal seams and strata, so the slant-chute block caving method has been used. As a result, sinkhole-type subsidence is usual, and a different approach to estimating ground subsidence is necessary. Table 1 shows the factors that commonly affect sinkhole-type ground subsidence over time (Coal Industry Promotion Board 1997). Quantitative analysis of presumptive ground subsidence near AUCMs in Korea has not been well studied so far, although Kim et al. (2006) studied it using probabilistic and statistical models in a GIS environment. The fundamental difference between the present study and that of Kim et al. (2006) is the application of artificial neural networks in a GIS environment.
When choosing a study area, field investigations and reinforcement reports related to ground subsidence were carefully assessed. In this study, a site called Magyori was chosen, where 21 signs of ground subsidence have been identified near an AUCM at Samcheok City (Coal Industry Promotion Board 1999). The study site is located between longitudes 129° 2′ 40″ and 129° 3′ 30″ and latitudes 37° 14′ 26″ and 37° 15′ 24″. The coal resource of South Korea consists almost entirely of anthracite, 85% of which was deposited during the Upper Paleozoic and the Lower Mesozoic in the Jangseong Formation of the Pyeongan Supergroup (The Geological Society of Korea 1999). The Oship Fault, Youngdong railroad, and no. 38 local road pass along the study area (Coal Industry Promotion Board 1997). The location map of this study site with ground subsidence areas is given in Fig. 1.
Theory: artificial neural networks
An artificial neural network is a “computational mechanism able to acquire, represent, and compute a mapping from one multivariate space of information to another, given a set of data representing that mapping” (Garrett 1994). The back-propagation training algorithm is the most frequently used neural network method (Lee et al. 2004; Sonmez et al. 2006; Tunusluoglu et al. 2007; Zorlu et al. 2008; Nefeslioglu et al. 2008) and is the method used in this study. The network is trained using a set of examples of associated input and output values. The purpose of an artificial neural network is to build a model of the data-generating process, so that the network can generalize and predict outputs from inputs that it has not previously seen. The algorithm is applied to a multi-layered neural network, which consists of an input layer, hidden layers, and an output layer. The hidden and output layer neurons process their inputs by multiplying each input by a corresponding weight, summing the products, and then processing the sum using a nonlinear transfer function to produce a result. The artificial neural network “learns” by adjusting the weights between the neurons in response to the errors between the actual output values and the target output values. At the end of this training phase, the neural network provides a model that should be able to predict a target value from a given input value.
Paola and Schowengerdt (1995) indicated that there are two stages involved in using neural networks for multi-source classification: the training stage, in which the internal weights are adjusted, and the classifying stage. Typically, the back-propagation algorithm trains the network until some targeted minimal error is achieved between the desired and actual output values of the network. Once the training is complete, the network is used as a feed-forward structure to produce a classification for the entire data set (Paola and Schowengerdt 1995).
The neural networks consist of a number of interconnected nodes. Each node is a simple processing element that responds to the weighted inputs it receives from other nodes. The arrangement of the nodes is referred to as the network architecture (Fig. 2). The receiving node sums the weighted signals from all the nodes that it is connected to in the preceding layer. Formally, the input that a single node j receives is weighted according to Eq. 1:

$$\mathrm{net}_j = \sum_i w_{ij}\, o_i \qquad (1)$$

where w_ij represents the weight between nodes i and j, and o_i is the output from node i. The output from node j is given by

$$o_j = f(\mathrm{net}_j) \qquad (2)$$

The function f is usually a nonlinear sigmoid function that is applied to the weighted sum of inputs before the signal propagates to the next layer. One advantage of a sigmoid function is that its derivative can be expressed in terms of the function itself:

$$f'(\mathrm{net}_j) = f(\mathrm{net}_j)\,\bigl(1 - f(\mathrm{net}_j)\bigr) \qquad (3)$$
The networks used in this study consisted of three layers. The first layer is the input layer, where the nodes are the elements of a feature vector. The second layer is the internal or “hidden” layer. The third layer is the output layer, which presents the output data. Each node in the hidden layer is interconnected to nodes in both the preceding and following layers by weighted connections (Atkinson and Tatnall 1997).
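The node computation described above (a weighted sum of inputs passed through a sigmoid transfer function) can be sketched as follows. This is an illustrative sketch only: the logistic form of the sigmoid and the function names are assumptions, not taken from the authors' program.

```python
import math

def sigmoid(x):
    """Logistic sigmoid transfer function (assumed form of f)."""
    return 1.0 / (1.0 + math.exp(-x))

def node_output(weights, inputs):
    """Output of one node: weighted sum of the incoming outputs, then sigmoid."""
    net = sum(w * o for w, o in zip(weights, inputs))
    return sigmoid(net)

def sigmoid_derivative(x):
    """Derivative of the sigmoid expressed through the function itself."""
    fx = sigmoid(x)
    return fx * (1.0 - fx)

# Hypothetical node with three incoming connections
print(node_output([0.2, 0.5, 0.3], [0.1, 0.9, 0.5]))
```

Because the derivative reuses the already computed node output, no extra exponential has to be evaluated during back propagation.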
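The error, E, for an input training pattern is a function of the desired output vector, d, and the actual output vector, o, given by:

$$E = \frac{1}{2}\sum_k (d_k - o_k)^2 \qquad (4)$$

The error is propagated back through the neural networks and is minimized by adjusting the weights between layers. The weight adjustment is expressed as:

$$\Delta w_{ij}(n+1) = \eta\,(\delta_j\, o_i) + \alpha\, \Delta w_{ij}(n) \qquad (5)$$

where Δw_ij(n) is the change applied to weight w_ij at iteration n, η is the learning rate parameter (set to η = 0.01 in this study), δ_j is an index of the rate of change of the error, and α is the momentum parameter (set to α = 0.01 in this study).
The factor δ_j is dependent on the layer type. For example,

$$\delta_k = (d_k - o_k)\, f'(\mathrm{net}_k) \quad \text{for output layers} \qquad (6)$$

$$\delta_j = \Bigl(\sum_k w_{jk}\, \delta_k\Bigr) f'(\mathrm{net}_j) \quad \text{for hidden layers} \qquad (7)$$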
This process of feeding forward signals and back-propagating the error is repeated iteratively until the error of the networks as a whole is minimized or reaches an acceptable magnitude.
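The feed-forward and back-propagation cycle can be illustrated with a small three-layer network trained by gradient descent with momentum. This is a minimal sketch under stated assumptions: a sum-of-squares error, a logistic sigmoid, no bias terms, and a toy 2 × 3 × 1 network with made-up training patterns. Only η = 0.01, α = 0.01 and the 0.1 to 0.9 value range come from the text; it is not the authors' MATLAB program.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(patterns, n_in, n_hid, eta=0.01, alpha=0.01, epochs=500, seed=0):
    """Train an n_in x n_hid x 1 network; returns weights and per-epoch error."""
    rng = random.Random(seed)
    w_ih = [[rng.uniform(0.1, 0.9) for _ in range(n_hid)] for _ in range(n_in)]
    w_ho = [rng.uniform(0.1, 0.9) for _ in range(n_hid)]
    dw_ih = [[0.0] * n_hid for _ in range(n_in)]  # previous weight changes
    dw_ho = [0.0] * n_hid
    history = []
    for _ in range(epochs):
        sse = 0.0
        for x, d in patterns:
            # Feed forward: input -> hidden -> output
            o_h = [sigmoid(sum(w_ih[i][j] * x[i] for i in range(n_in)))
                   for j in range(n_hid)]
            o_k = sigmoid(sum(w_ho[j] * o_h[j] for j in range(n_hid)))
            sse += 0.5 * (d - o_k) ** 2
            # Error terms for the output node and for each hidden node
            delta_k = (d - o_k) * o_k * (1.0 - o_k)
            delta_h = [w_ho[j] * delta_k * o_h[j] * (1.0 - o_h[j])
                       for j in range(n_hid)]
            # Weight changes: learning-rate term plus momentum term
            for j in range(n_hid):
                dw_ho[j] = eta * delta_k * o_h[j] + alpha * dw_ho[j]
                w_ho[j] += dw_ho[j]
                for i in range(n_in):
                    dw_ih[i][j] = eta * delta_h[j] * x[i] + alpha * dw_ih[i][j]
                    w_ih[i][j] += dw_ih[i][j]
        history.append(sse)
    return w_ih, w_ho, history

# Hypothetical patterns: the target follows the first input
PATTERNS = [([0.1, 0.1], 0.1), ([0.9, 0.9], 0.9),
            ([0.1, 0.9], 0.1), ([0.9, 0.1], 0.9)]
w_ih, w_ho, history = train(PATTERNS, n_in=2, n_hid=3)
print(history[0], history[-1])
```

In practice the loop would also stop early once the error falls below a chosen threshold, rather than always running the full number of epochs.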
Using the back-propagation training algorithm, the weights of each factor can be determined and may be used for classification of data (input vectors) that the networks have not seen before. Zhou (1999) described a method for determining the weights using back propagation. From Eq. 2, the effect of an output, o_j, from a hidden layer node, j, on the output, o_k, from an output layer node, k, can be represented by the partial derivative of o_k with respect to o_j as

$$\frac{\partial o_k}{\partial o_j} = f'(\mathrm{net}_k)\,\frac{\partial\, \mathrm{net}_k}{\partial o_j} = f'(\mathrm{net}_k)\, w_{jk} \qquad (8)$$

Equation (8) produces both positive and negative values. If the effect’s magnitude is all that is of interest, then the importance (weight) of node j relative to another node j_0 in the hidden layer may be calculated as the ratio of the absolute values derived from Eq. 8:

$$\frac{\left|\partial o_k / \partial o_j\right|}{\left|\partial o_k / \partial o_{j_0}\right|} = \frac{f'(\mathrm{net}_k)\,|w_{jk}|}{f'(\mathrm{net}_k)\,|w_{j_0 k}|} = \frac{|w_{jk}|}{|w_{j_0 k}|} \qquad (9)$$

Here w_{j_0 k} is simply the weight that connects another hidden layer node, j_0, to output node k.
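For a given node in the output layer, the results of Eq. 9 show that the relative importance of a node in the hidden layer is proportional to the absolute value of the weight connecting the node to the output layer. When the output layer consists of more than one node, Eq. 9 cannot be used to compare the importance of two nodes in the hidden layer. Instead, the importance of node j with respect to output node k can be normalized by the average absolute weight over the J nodes of the hidden layer:

$$t_{jk} = \frac{|w_{jk}|}{\frac{1}{J}\sum_{j'=1}^{J} |w_{j'k}|} = \frac{J\,|w_{jk}|}{\sum_{j'=1}^{J} |w_{j'k}|} \qquad (10)$$

Therefore, with respect to node k, each node in the hidden layer has a value that is greater or smaller than unity, depending on whether it is more or less important, respectively, than an average value. All the nodes in the hidden layer have a total importance with respect to the same node, given by

$$\sum_{j=1}^{J} t_{jk} = J \qquad (11)$$

Consequently, the overall importance of node j with respect to all K nodes in the output layer can be calculated by

$$t_j = \frac{1}{K}\sum_{k=1}^{K} t_{jk} \qquad (12)$$

Similarly, with respect to node j in the hidden layer, the normalized importance of node i in the input layer, which has I nodes, can be defined by

$$s_{ij} = \frac{|w_{ij}|}{\frac{1}{I}\sum_{i'=1}^{I} |w_{i'j}|} = \frac{I\,|w_{ij}|}{\sum_{i'=1}^{I} |w_{i'j}|} \qquad (13)$$

The overall importance of node i with respect to the hidden layer is

$$s_i = \frac{1}{J}\sum_{j=1}^{J} s_{ij} \qquad (14)$$

Correspondingly, the overall importance of input node i with respect to output node k is given by

$$st_i = \frac{1}{J}\sum_{j=1}^{J} s_{ij}\, t_j \qquad (15)$$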
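The weight-interpretation procedure can be sketched in a few lines. The sketch assumes the normalization attributed to Zhou (1999), in which each absolute weight is divided by the average absolute weight feeding the same receiving node; the function name and the example weights are hypothetical.

```python
def input_importance(w_ih, w_ho):
    """Overall importance of each input node.

    w_ih[i][j]: weight from input node i to hidden node j (I x J)
    w_ho[j][k]: weight from hidden node j to output node k (J x K)
    """
    n_in, n_hid, n_out = len(w_ih), len(w_ho), len(w_ho[0])
    # Importance of hidden node j for output node k, relative to the average
    t = [[n_hid * abs(w_ho[j][k]) / sum(abs(w_ho[m][k]) for m in range(n_hid))
          for k in range(n_out)] for j in range(n_hid)]
    # Overall importance of hidden node j across all output nodes
    t_j = [sum(t[j]) / n_out for j in range(n_hid)]
    # Importance of input node i for hidden node j, relative to the average
    s = [[n_in * abs(w_ih[i][j]) / sum(abs(w_ih[m][j]) for m in range(n_in))
          for j in range(n_hid)] for i in range(n_in)]
    # Combine both layers: importance of input node i for the output
    return [sum(s[i][j] * t_j[j] for j in range(n_hid)) / n_hid
            for i in range(n_in)]

# Hypothetical 3 x 2 x 1 network
w_ih = [[0.2, -0.4], [0.6, 0.4], [0.2, 0.8]]
w_ho = [[1.0], [3.0]]
st = input_importance(w_ih, w_ho)
print(st)
```

With this normalization the importances average to one, so a value above one marks an input that matters more than average.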
Data
Many studies have identified important factors that contribute to ground subsidence around coal mines (Coal Industry Promotion Board 1997; Waltham 1989), including the depth and height of the mined cavities, the excavation method, the degree of inclination of the excavation, the scope of mining, the structural geology, and the flow of groundwater. The factors governing the occurrence of ground subsidence were therefore collected in a vector-type spatial database. These included a 1:50,000 scale geological map, 1:5,000 scale topographic maps, 1:5,000 scale land use maps, 1:1,200 scale mined-tunnel maps, and borehole data. An accurate spatial database is indispensable in a GIS environment. For this reason, accurate maps authorized by Korean government agencies were collected even though the scales of the maps differed. The data layers are shown in Table 2.
The geology data were extracted from a 1:50,000 scale geological map of the Korea Institute of Geoscience and Mineral Resources. Contours and survey base points with elevation values were extracted from the topographic map, and a digital elevation model (DEM) was constructed. The slope gradients were calculated from the DEM. There are 14 classes of land use, which were extracted from the land use map of the National Geographic Information Institute. Most of the literature (Goel and Page 1982; National Coal Board 1975; Waltham 1989) maintains that the major factor in ground subsidence is the scope of the mined cavities. Constructing a database of the depths and widths of the mined cavities was therefore very important in this study. To this end, (1) GPS measurements (ProMark2 GPS system, less than 10 mm static survey accuracy) were used to determine the exact positions of the mine heads; (2) these were used to vectorize a hard copy of the mined tunnel map; and (3) the vectorized mined tunnel map was converted to an ASCII grid file and subtracted from the DEM raster data. There were 35 boreholes at the study site, but some boreholes did not have values, so an inverse distance weighting (IDW) interpolation method was used to contour the groundwater levels and permeability factors.
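Inverse distance weighting estimates a value at an unsampled location as a weighted average of nearby measurements, with weights that decrease with distance. A minimal sketch follows; the power of 2, the function name, and the borehole values are assumptions for illustration, since the interpolation settings are not reported.

```python
import math

def idw(samples, query, power=2.0):
    """Inverse distance weighted estimate at `query`.

    samples: list of ((x, y), value) pairs, e.g. borehole positions and levels
    query:   (x, y) location to estimate
    """
    num = 0.0
    den = 0.0
    for (x, y), value in samples:
        dist = math.hypot(query[0] - x, query[1] - y)
        if dist == 0.0:
            return value  # the query coincides with a measured borehole
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den

# Hypothetical groundwater levels (m) at two boreholes 10 m apart
boreholes = [((0.0, 0.0), 10.0), ((10.0, 0.0), 20.0)]
print(idw(boreholes, (2.0, 0.0)))
```

Evaluating the function at every grid cell yields a continuous surface that can then be contoured or used directly as an input layer.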
Method
This study was conducted using GIS and artificial neural networks with factors that may cause ground subsidence. An image database and an attribute database for ground subsidence were constructed. The principal assumption of this approach is that the potential for ground subsidence (occurrence possibility) will be similar to the actual frequency of ground subsidence. After the study site was selected, areas of ground subsidence were detected at the site by field surveys. A map of existing ground subsidence was developed, and this was used to evaluate the frequency and distribution of ground subsidence at the study site.
For the study, first, maps relevant to ground subsidence occurrence were used to construct a vector-type spatial database using the GIS software package ARC/INFO. Second, ground subsidence occurrence areas were detected in the study area by field surveys, and a map of the ground subsidence locations was added to the spatial database using GIS. Third, for the calculation of the weights, the ground subsidence factors were converted to grids (ARC/INFO grid type) and then to ASCII data for use with an artificial neural network program. The ASCII datasets were normalized between 0.1 and 0.9, since the value of the sigmoid function used in the artificial neural networks varies from 0 to 1. Then, using the detected ground subsidence locations, the weight of each factor was determined by training an artificial neural network program that was developed using MATLAB (Demuth et al. 2005). For the weight determination, the locations where ground subsidence had occurred were assigned as training areas and the artificial neural networks were trained. When the weights converged, they were determined by back propagation between the neural network layers. The results of the analysis were then converted to grid data using the GIS. Finally, ground subsidence hazard mapping was carried out using the weights, and the analytical results were verified using the ground subsidence locations.
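The normalization step can be illustrated with a linear min-max rescaling into the range 0.1 to 0.9. This is a sketch under the assumption that the rescaling is linear; the function name and the sample values are hypothetical.

```python
def rescale(values, lo=0.1, hi=0.9):
    """Linearly rescale values so the minimum maps to lo and the maximum to hi."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant layer carries no contrast; place it mid-range
        return [(lo + hi) / 2.0 for _ in values]
    span = v_max - v_min
    return [lo + (hi - lo) * (v - v_min) / span for v in values]

# Hypothetical slope gradients in degrees
print(rescale([0.0, 15.0, 30.0, 60.0]))
```

Keeping the inputs away from 0 and 1 avoids the flat tails of the sigmoid, where its derivative is close to zero and learning slows down.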
In this study, the GIS software ArcView 3.3 and ARC/INFO version 9.0 were used as the basic analysis tools for spatial management and data manipulation.
Prediction of ground subsidence using the artificial neural networks
Figure 3 is the flowchart of the neural network training for the weight determination. The weights between layers that were acquired by training the neural networks were back-calculated, and the contribution or importance of each factor was determined from them. For the calculation of the weights, a program developed by Hines (1997) was used, and for the interpretation of the weights, a newly developed program was used.
The seven factors listed in Table 2 were used as the input data. The factors were converted to a grid with a cell size of 1 m × 1 m. Using GIS software, a grid of 1,207 rows and 1,742 columns was used (Fig. 4), giving a total of 2,102,594 cells, of which 10,369 were ground subsidence occurrence cells.
The subsidence-prone (occurrence) locations and the locations that were not prone to subsidence were selected as training sites. Cells from each of the two classes were randomly selected as training cells, with 3,000 cells denoting areas where subsidence had or had not occurred. Areas where subsidence had not occurred were classified as “areas not prone to subsidence”, and areas where subsidence was known to exist were assigned to an “areas prone to subsidence” training set. The training sites were processed ten times to identify any changes that might occur due to the assignment of random initial weights.
The back-propagation algorithm was then applied to calculate the weights between the input layer and the hidden layer, and between the hidden layer and the output layer, by modifying the number of hidden nodes and the learning rate. Three-layered feed-forward networks were implemented using the MATLAB software package. Here, “feed-forward” denotes that the interconnections between the layers propagate forward to the next layer. The number of hidden layers and the number of nodes in a hidden layer required for a particular classification problem are not easy to deduce. In this study, a 7 × 15 × 1 structure was selected for the networks, with input data normalized in the range 0.1–0.9. The nominal and interval class group data were converted to continuous values ranging between 0.1 and 0.9. These continuous values are therefore not ordinal data but nominal data, and the numbers denote the classification of the input data.
The learning rate was set to 0.01, and the initial weights were randomly set to values between 0.1 and 0.9. The weights calculated from ten test cases were compared to determine whether the variation in the final weights was dependent on the selection of the initial weights. The back-propagation algorithm was used to minimize the error between the target output values and the calculated output values. The algorithm propagated the error backwards and iteratively adjusted the weights. The maximum number of epochs was set to 5,000, and the root mean square error (RMSE) value used for the stopping criterion was set to 0.1. Most of the training data sets met the 0.1 RMSE goal. If the RMSE goal was not achieved, training was terminated at the maximum of 5,000 epochs; in those cases, the maximum RMSE value was <0.216. The final weights between layers acquired during training of the neural networks and the contribution or importance of each of the seven factors used to predict ground subsidence hazard are shown in Table 3. The results were not identical from run to run, because the initial weights were assigned random values. Therefore, in this study, the calculations were repeated ten times to check that the results reached similar values. The standard deviation of the results was in the range 0.0153–0.0343, so the random initialization did not have a large effect on the results. For easy interpretation, the average values were calculated, and these values were divided by the average weight of the factor that had the minimum value. The slope value was the minimum value, 1.00, and the depth of groundwater value was the maximum value, 2.0908. Finally, the weights were applied to the entire study area, and the ground subsidence hazard map was created (Fig. 5). The values were classified by equal areas and grouped into four classes for visual interpretation.
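The averaging and rescaling of the factor weights over repeated training runs can be sketched as follows. The three factors and their values below are invented placeholders, not the values of Table 3, and the function name is hypothetical.

```python
from statistics import mean, stdev

def relative_weights(runs):
    """Average each factor's weight over runs and rescale so the minimum is 1.

    runs: list of dicts mapping factor name -> weight from one training run
    """
    factors = list(runs[0].keys())
    avg = {f: mean(run[f] for run in runs) for f in factors}
    sd = {f: stdev(run[f] for run in runs) for f in factors}
    smallest = min(avg.values())
    rel = {f: avg[f] / smallest for f in factors}
    return avg, sd, rel

# Hypothetical weights from three runs with different random initial weights
runs = [
    {"slope": 0.10, "geology": 0.15, "groundwater": 0.21},
    {"slope": 0.11, "geology": 0.14, "groundwater": 0.22},
    {"slope": 0.09, "geology": 0.16, "groundwater": 0.20},
]
avg, sd, rel = relative_weights(runs)
print(rel)
```

Dividing by the smallest average weight makes the least influential factor the reference (1.00), so the other values read directly as multiples of it.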
Verification
The subsidence hazard analysis results were verified using known ground subsidence locations. Verification was performed by comparing the known ground subsidence location data with the subsidence hazard map. Each factor used and its frequency ratio were compared. Rate curves were created and the areas under the curves were calculated for two cases. The rate explains how well the model and the factor predict the subsidence. To obtain the relative rank for each prediction pattern, the calculated index values for all cells in the study area were sorted in descending order. The ordered cell values were then divided into 100 classes, at accumulated 1% intervals. The rate verification results appear as a line in Fig. 6. For example, the 90–100% (10%) class of the study area where the subsidence hazard index had a high rank could explain 91% of all subsidence. The 80–100% (20%) class of the study area where the subsidence hazard index had a high rank could explain 95% of subsidence.
The area under the curve can be used to estimate the prediction accuracy quantitatively. To compare the results, the areas under the curves were recalculated so that a total area of 1 denotes perfect prediction accuracy. The area ratio was 0.9606, so the prediction accuracy is 96.06%.
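The rate curve and its area can be sketched as follows: cells are sorted by hazard index in descending order, split into equal-sized classes, and the cumulative share of known subsidence cells is accumulated class by class. The function names, the trapezoidal integration, and the ten-cell example are assumptions for illustration only.

```python
def success_rate_curve(index, occurred, n_classes=100):
    """Cumulative fraction of subsidence cells captured per ranked class.

    index:    hazard index of each cell
    occurred: 1 if subsidence is known in the cell, else 0
    """
    order = sorted(range(len(index)), key=lambda i: index[i], reverse=True)
    total = sum(occurred)
    n = len(order)
    curve = [0.0]
    captured = 0
    pos = 0
    for c in range(1, n_classes + 1):
        cut = (c * n) // n_classes  # number of cells in the top c classes
        while pos < cut:
            captured += occurred[order[pos]]
            pos += 1
        curve.append(captured / total)
    return curve

def area_under_curve(curve):
    """Trapezoidal area with the horizontal axis scaled to a total width of 1."""
    width = 1.0 / (len(curve) - 1)
    return sum((a + b) / 2.0 * width for a, b in zip(curve, curve[1:]))

# Hypothetical ten-cell map: both subsidence cells have the highest indices
index = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
occurred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
curve = success_rate_curve(index, occurred, n_classes=10)
print(area_under_curve(curve))
```

A map that ranks cells no better than chance gives an area near 0.5, while values close to 1 indicate that the known subsidence cells are concentrated in the highest-ranked classes.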
Results and discussions
Ground subsidence is one of the most hazardous man-made disasters. Governments and research institutions worldwide have attempted for years to assess subsidence hazards and risks and to show their spatial distribution. In this study, a data mining approach to identifying areas of hazardous subsidence using GIS shows considerable promise. A ground subsidence hazard map was constructed using artificial neural networks, and it showed a very high prediction accuracy of 96.06%. GIS was used to analyze the large volume of data efficiently, and the artificial neural networks proved to be an effective tool for analyzing ground subsidence hazard.
The weights calculated from ten test cases were compared to determine whether the variation in the final weights was dependent on the selection of the initial weights. The results show that the initial weights did not have an influence on the final weight under the conditions used.
As shown in Table 3, depth of groundwater, land use, permeability, and geology have relatively high weights in the analysis of ground subsidence near the AUCM. The surveyed subsidence areas of this study are located around the railroad, road, and other facilities above a shallow mined tunnel. Therefore, land use, geology, and depth of the mined tunnel are important factors, as well as the groundwater level.
The correlation between the flow of groundwater and the depth of the mined tunnel is meaningful in this study, because the parts of the study area where the groundwater is shallower than the mined tunnel have a higher subsidence hazard index than the other areas. This explains the relatively high weights for the depth of groundwater and permeability factors (Table 3).
The data on groundwater levels were obtained during field surveys, without considering the amount of rainfall at the time. Nevertheless, the groundwater level is a meaningful value and should be considered in calculating the safety of the base rock.
References
Atkinson PM, Tatnall ARL (1997) Neural networks in remote sensing. Int J Remote Sens 18:699–709
Coal Industry Promotion Board, CIPB (1997) A study on the mechanism of subsidence over abandoned mine area and the construction method of subsidence prevention. Coal Industry Promotion Board, Seoul, 97–06, pp 1–67
Coal Industry Promotion Board, CIPB (1999) Fundamental investigation report of the stability test for Gosari. Coal Industry Promotion Board, Seoul, 99–06, pp 7–22
Demuth H, Beale M, Hagan M (2005) MATLAB version 7.3.0.267; Neural network toolbox for use with Matlab, the Mathworks, p 348
Garrett J (1994) Where and why artificial neural networks are applicable in civil engineering. J Comput Civil Eng 8:129–130
Goel SC, Page CH (1982) An empirical method for predicting the probability of Chimney Cave occurrence over a mining area. Int J Rock Mech Min Sci Geomech Abstr 19:325–337
Kim KD, Lee S, Oh HJ, Choi JK, Won JS (2006) Assessment of ground subsidence hazard near an abandoned underground coal mine using GIS. Environ Geol 50:1183–1191
Hines JW (1997) Fuzzy and neural approaches in engineering. Wiley, New York, p 209
Lee S, Ryu JH, Won JS, Park HJ (2004) Determination and application of the weights for landslide susceptibility mapping using an artificial neural network. Eng Geol 71(3/4):289–302
Nefeslioglu HA, Gokceoglu C, Sonmez H (2008) An assessment on the use of logistic regression and artificial neural networks with different sampling strategies for the preparation of landslide susceptibility maps. Eng Geol 97(3/4):171–191
National Coal Board (1975) Subsidence engineer’s handbook. National Coal Board Mining Department, London, p 111
Paola JD, Schowengerdt RA (1995) A review and analysis of backpropagation neural networks for classification of remotely sensed multi-spectral imagery. Int J Remote Sens 16:3033–3058
Sonmez H, Gokceoglu C, Kayabaşı A, Nefeslioğlu HA (2006) Estimation of rock modulus: for intact rocks with an artificial neural network and for rock masses with a new empirical equation. Int J Rock Mech Min Sci 43(2):224–235
The Geological Society of Korea (1999) Geology of Korea. Sigma Press, Seoul, pp 550–556
Tunusluoglu MC, Gokceoglu C, Sonmez H, Nefeslioglu HA (2007) An artificial neural network application to produce debris source areas of Barla, Besparmak, and Kapi Mountains (NW Taurids, Turkey). Nat Hazards Earth Syst Sci 7:557–570
Waltham AC (1989) Ground subsidence. Blackie & Son Ltd, New York, pp 49–97
Zhou W (1999) Verification of the nonparametric characteristics of backpropagation neural networks for image classification. IEEE Trans Geosci Remote Sens 37:771–779
Zorlu K, Gokceoglu C, Ocakoglu F, Nefeslioglu HA, Acikalin S (2008) Prediction of uniaxial compressive strength of sandstones using petrography-based models. Eng Geol 96(3–4):141–158
Acknowledgments
The authors thank the Coal Industry Promotion Board for providing the investigation reports and the basic GIS database. This research was supported by the Basic Research Project of the Korea Institute of Geoscience and Mineral Resources (KIGAM) funded by the Ministry of Science and Technology of Korea.
Additional information
An erratum to this article can be found at http://dx.doi.org/10.1007/s00254-009-1734-5
Cite this article
Kim, KD., Lee, S. & Oh, HJ. Prediction of ground subsidence in Samcheok City, Korea using artificial neural networks and GIS. Environ Geol 58, 61–70 (2009). https://doi.org/10.1007/s00254-008-1492-9
DOI: https://doi.org/10.1007/s00254-008-1492-9