1 Introduction

Sinkholes are a common geological hazard in karst regions and are known as one of the most dangerous natural hazards (Yechieli et al. 2006). There are six basic types of sinkholes: solution, soluble bedrock collapse, caprock collapse, cover subsidence, buried, and cover collapse. Cover collapse sinkholes are the most destructive geological hazards among sinkholes and generally result from the development of soil cavities in areas underlain by limestone bedrock (Waele et al. 2011; Siska et al. 2016; Jia et al. 2019). The unexpected and rapidly developing phenomena are widespread, especially in populated areas, and pose serious safety, economic, and environmental hazards (Galve et al. 2009; Filin et al. 2011; Xiao et al. 2016; Panno and Luman, 2018).

Karst collapse occurrence is induced by numerous factors that, in turn, influence each other and are difficult to predict (Wu et al. 2018). Cover collapse sinkhole formation and development are closely related to many factors, such as the overburden conditions, groundwater level, and karst topography. Karst aquifers have unusual flow and storage conditions compared to granular aquifers that create a complex subsurface system for groundwater flow. In recent years, anthropogenic activities, in particular groundwater withdrawal, have increased the frequency of sinkholes in many regions (Taheri et al. 2018). Research has been focused on the mechanisms and development features of karst collapses (Tharp 1999; Santo et al. 2019). However, it is difficult and complex to obtain regional data. Karst collapse susceptibility assessment is a useful and easily accessible tool for decision-makers and engineers for managing this hazard, particularly in the case of regions without sufficient research (Galve et al. 2009).

Initially, research on sinkhole susceptibility was mainly based on qualitative analysis. The wide application of geographic information systems (GIS) and various algorithms has provided the conditions for quantitative evaluation and prediction. Several studies of sinkhole susceptibility have been carried out around the world, and they can be roughly divided into single-factor and multifactor approaches (Duman et al. 2005; Galve et al. 2009; Taheri et al. 2015; Zhou et al. 2016; Wu et al. 2018; Jia et al. 2019). Methods include the one-factor model, spatial overlay analysis, probabilistic modeling, conditional probability, analytic hierarchy process (AHP), logistic regression modeling, and machine learning. The results of related studies have shown reasonable prediction ability and can be used to assess sinkhole susceptibility problems.

Using AHP to generate a geological hazard susceptibility map is one of the most widely used methods (Ercanoglu et al. 2003; Chowdary et al. 2013; Feizizadeh et al. 2014a). AHP has the apparent value of permitting a hierarchical structure of the criteria, which can provide a better focus on specific criteria and sub-criteria (Saaty 1980). Considering the interactive effects of the disaster-causing factors on a sinkhole, the AHP method is indeed suitable and effective. However, the weights of the conditioning factors are derived from a pairwise comparison matrix, which depends on the opinions or experience of the assessment experts (Chowdary et al. 2013). Many authors have proposed improved models to overcome this issue, such as integrating fuzzy logic mathematics, ordered weighted averaging, Monte Carlo simulations, catastrophe theory, sensitivity and uncertainty analysis, global sensitivity analysis, Dempster–Shafer theory, and machine learning into the model (Feizizadeh et al. 2014a, b; Sadeghfam et al. 2016; Ghorbanzadeh et al. 2017; Negahban 2018; Shokati and Feizizadeh 2018; Taheri et al. 2018). However, little has been published on the actual evaluation of karst collapse susceptibility using these models. It is necessary to explore highly reliable evaluation approaches for sinkhole susceptibility to improve prediction capabilities.

Catastrophe theory is designed for discontinuous dynamic systems governed by a potential energy-like function. This fits well with karst collapses, as they have obvious sudden characteristics that can be adequately represented by catastrophe theory (Kuniansky et al. 2018; Santo et al. 2019). The catastrophe approach also does not involve the decision-maker’s opinion in assigning weights to the factors; rather, it calculates the importance of one criterion over others by its inner mechanism. Many studies have applied this method to avoid subjectivity in assigning weights, for example in groundwater, flood, and landslide susceptibility assessments (Sadeghfam et al. 2016; Sun et al. 2019). However, there is an inevitable disadvantage to this theory: the order of the factors during its normalization process has a significant influence on the final result.

In information theory, entropy is a measure of uncertainty in a random variable. The entropy index can represent the relative importance of factors without any consideration of the decision-makers’ preferences, allowing for an objective evaluation method (Kornejady et al. 2017; Zhao et al. 2017). When the value of a factor to be evaluated is small, and the entropy value is considerable, this index has no obvious effect on the system (Constantin et al. 2011). Entropy theory can be used to calculate the relative importance of the index system to determine the importance order of parameters. Therefore, to overcome the disadvantage of the catastrophe model, the order of factors can be objectively and reasonably determined first using the entropy theory; then, the catastrophe theory can be integrated into the final model for prediction.

The main objective of this study is to develop a novel approach for tackling uncertainty and imprecision when analyzing sinkhole susceptibility. AHP was first selected to build the structure model according to the judgment of experts, which fully considers the sinkhole occurrence mechanisms. The objective methods (catastrophe theory and information theory) were then combined to calculate weights based only on the observation data, without any consideration of subjectivity. The innovative aspect of the present work is the joint use of subjective and objective approaches to better divide the spatial regions of sinkhole susceptibility. The methods adopted and the results obtained can be used as a tool to help select suitable measures for sinkhole management.

2 Study area

The study area is located in the Qixia economic development zone, Shandong Province, China, as shown in Fig. 1. The study area, within 120° 59′ 03″–121° 05′ 09″ east longitude and 37° 25′ 41″–37° 32′ 16″ north latitude, is an area with a high probability of karst collapse. The region mainly consists of alluvial plains and piedmonts, covering an area of 105 km². The altitude reaches more than 260 m above sea level in the piedmont region and decreases toward the central area in the river beds. The climate is semiarid, with 648 mm of average annual precipitation and a mean annual temperature of 11.4 °C. The major river is the Baiyang River, which is an intermittent stream during seasonal droughts, as shown in Fig. 2.

Fig. 1 Location and geological condition of the study area

Fig. 2 Features within the study area

The exposed bedrock consists of Proterozoic to Mesozoic successions that include sedimentary and volcanic rocks (Fig. 1). The oldest rocks exposed in this area are Paleo-proterozoic schist, marble, quartzite, and amphibolite. Paleozoic sediments are not exposed in this region. The Neo-proterozoic succession mainly consists of limestone, marlstone, and shale. The main lithologies of the Cretaceous units include sandstone, andesitic tuff breccia, and sandstone, which are only exposed sporadically in the study area. The Quaternary sediments consist of alluvial deposits from rivers, followed by proluvial and diluvial deposits, which mainly consist of clay, silt, and sand. Fracture structures are relatively developed in this region, especially in the central area (Fig. 2). Furthermore, cracks are well developed in the rock near the fault zones.

Groundwater in the study area mainly includes pore water in loose sediment (unconfined aquifers) and fracture karst water in the carbonates (semi-confined and confined aquifers). Karst water is the target aquifer of intensive exploitation. The plains aquifer consists of Pliocene to Holocene-aged loose sediment aquifers, mainly distributed in the central and valley areas. It is primarily composed of sand, sandy gravel, and gravel. The aquifer thickness varies from less than 5 m in the low-lying areas to as much as 30 m in the valley region, with burial depths varying from 1.6 to 10 m. The slope primarily governs the general flow direction of the groundwater in the Quaternary aquifer. Precipitation and surface runoff are the main sources of groundwater recharge. Groundwater is discharged mainly through extraction, runoff, and evaporation. The karst aquifer is mostly located in the eastern and southern regions within the study area and is also the aquifer in the central-southern region below the Quaternary overburden. The burial depth of this aquifer varies from 5 to 60 m, and it has favorable water abundance. In the overburden region, the karst aquifer receives recharge from the Quaternary aquifer. Topographic and tectonic conditions mainly control the flow of the karst groundwater in the area. Mine dewatering of the cement plant is the major groundwater exploitation activity. The location of the cement plant is shown in Fig. 2.

According to the statistical data, annual groundwater extraction has increased continuously, especially through the mining activities of the cement plant since 2005, causing a dramatic lowering of the water level. The first sinkhole appeared in 2005, corresponding to an intense period of exploitation. At present, two groundwater depression cones induced by over-pumping have formed in the northeast and east of Zhongqiao village. Figure 2 shows the distribution of all 12 recorded karst collapses in the study area; nine of them (75% of the total) have occurred since 2016. Most of the karst collapses are located in or around the village region, and more than one hundred houses have been cracked or tilted since 2016. Overall, the hazard appears to have become more serious in recent years. Most of the sinkholes are circular or elliptical in plan view. Furthermore, sinkholes occur more frequently in the summer than in the other seasons. The karst and overlying layer conditions, such as the lithology of the karst formation, the condition of the overlying layers, and the tectonic conditions, are the primary conditions for cover collapse. Groundwater extraction is the primary inducing factor. The buried karst area, covered by rocks that are not easily dissolved and by clastic rocks, is not threatened by this hazard. Generally, the basic pattern of karst collapse occurrence shows that the sinkholes in the study area are typical cover collapse sinkholes. The boundary of the susceptibility assessment is shown in Fig. 2.

3 Methodology

The occurrence of karst collapse is induced by many factors and is a complex process. The complexity and obvious anisotropy of karst aquifer systems pose challenges for analyzing the susceptibility of cover collapse. Hazard susceptibility is, therefore, a multi-objective and multi-criteria problem. A novel method for karst collapse susceptibility consists of combining both subjective and objective methods to consider the advantages and disadvantages of both methods. AHP was first introduced into the model, which offers a flexible, effective, and easy implementation for intricate decision-making problems (Saaty 1980). To avoid the subjectivity of decision-makers during the process of AHP, the combined objective methods based on entropy and catastrophe theories were introduced to further determine the factor weights. GIS software was used, and all factors were prepared as a raster dataset in the GIS environment.

Generally, the following steps were applied to reach the final susceptibility map:

  1. a.

    The effective factors related to the cover collapse susceptibility map were selected and collected according to the occurrence mechanism and the knowledge of local experts, and those factors were then arranged in hierarchic order. Meanwhile, the rating was synthesized to determine the score of the factors in the index layer of AHP.

  2. b.

    The factors were ranked based upon the importance, and the weights of sub-factors were determined in the index layer based on entropy theory.

  3. c.

    The weighting of the criterion factors of the AHP model was calculated based on the catastrophe theory according to the rank result of the index layer using the entropy model.

  4. d.

    All the spatial data were then overlaid on to a map, and the susceptibility map was generated in the GIS environment.

3.1 The AHP model

The AHP model developed by Saaty (1980) is a decision support system designed for dealing with complex, multi-criteria decisions through a hierarchical structure. AHP consists of targets to be attained, criteria to be used for decision making, and alternatives to be selected (Pourghasemi et al. 2012; Mallick et al. 2018). A satisfactory evaluation consists of a karst collapse susceptibility model and the related factors that induce cover collapses. This method is coupled with the knowledge of experts or users to determine the parameters and criteria (Neshat et al. 2014).

According to the formation conditions and mechanisms of karst collapse, an evaluation factor set for this hazard was determined based on expert knowledge as well as a literature review (Tharp, 1999; Kaufman and Quinif 2002; Galve et al. 2009; He et al. 2013; Papadopoulou-Vrynioti et al. 2013; Siska et al. 2016; Taheri et al. 2018; Jia et al. 2019; Wu et al. 2018). The occurrence of sinkholes in the study area was the combined effect of several factors typical of cover collapse sinkholes. The major influencing factors were the karst development conditions, overburden characteristics, and groundwater level. Karst development was the most fundamental condition for collapse, as it provides the storage space and transfer channel for the collapse materials. Overburden conditions provided the material source and were the main controlling factor of soil holes. Groundwater extraction and mine dewatering were the primary inducing factors for karst collapse. The dynamic load generated by the fluctuation of the groundwater table impacted the soil cap and could also cause soil erosion damage.

Finally, the hierarchical structure included three layers. The topmost layer of the structure was the goal layer, called the susceptibility of karst sinkhole (A). The second layer comprised the selected category factors, which had significant impacts on layer A, and was called the criterion layer (B). The third layer explained the concrete meaning of the second layer and was called the index layer (C). Generally, B was mainly composed of three main factor groups: karst condition (B1), overlying layer condition (B2), and hydrological condition (B3). Each criterion factor also consisted of several sub-factors, i.e., B1 included the degree of karst development (C1) and the distance from faults (C2). B2 included the lithology of the overlying layer (C3), structures within the overlying layer (C4), and the thickness of the overlying layer (C5). B3 included the relative location between the groundwater and bedrock (C6), groundwater abundance (C7), the rate of groundwater level drawdown (C8), and the distance from the river (C9). Figure 3 shows the detailed assessment index system of this model.

Fig. 3 AHP model structure

Four classes were defined in this process: very high, high, medium, and low, with corresponding score values of 6, 4, 2, and 1, respectively. The rating score of each selected factor in the AHP model was then determined according to its subjective relevance to the occurrence of karst collapse.
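The hierarchy and class scores described above can be written down as a small lookup structure. The following is a minimal sketch only: the labels A, B1–B3, and C1–C9 and the scores 6, 4, 2, and 1 come from the text, while the variable and function names are hypothetical and chosen for illustration.

```python
# Goal layer (A), criterion layer (B) and index layer (C) as named in the text.
GOAL = "A: susceptibility of karst sinkhole"

# Criterion code -> (description, member index factors).
CRITERIA = {
    "B1": ("karst condition", ["C1", "C2"]),
    "B2": ("overlying layer condition", ["C3", "C4", "C5"]),
    "B3": ("hydrological condition", ["C6", "C7", "C8", "C9"]),
}

# Index factor code -> description.
INDEX_FACTORS = {
    "C1": "degree of karst development",
    "C2": "distance from faults",
    "C3": "lithology of the overlying layer",
    "C4": "structures within the overlying layer",
    "C5": "thickness of the overlying layer",
    "C6": "relative location between groundwater and bedrock",
    "C7": "groundwater abundance",
    "C8": "rate of groundwater level drawdown",
    "C9": "distance from the river",
}

# Susceptibility class -> score value used for every index factor.
CLASS_SCORES = {"very high": 6, "high": 4, "medium": 2, "low": 1}


def criterion_of(factor):
    """Return the criterion code (B1, B2 or B3) that an index factor belongs to."""
    for code, (_, members) in CRITERIA.items():
        if factor in members:
            return code
    raise KeyError(factor)
```

Keeping the grouping explicit is convenient later, because the entropy ranking is carried out separately within each criterion group before the catastrophe normalization.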

3.2 Entropy theory

Entropy information is a concept from information theory and indicates the amount of information contained in the data (Constantin et al. 2011; Kornejady et al. 2017; Zhao et al. 2017). The entropy method is an unbiased and accurate quantitative weight-assigning method compared to subjective techniques. The entropy index, the information content, is the average unpredictability in a random variable. If an event can be predicted precisely, the value of entropy will be small. Therefore, the entropy weights also can present the relative importance of factors.

In this study, to overcome the differences caused by the ranking of indicators during the evaluation process of the catastrophe model, the entropy method was used to rank the importance of the indicators before the final catastrophe evaluation. The entropy index and weights of the nine factors C1, C2, C3, C4, C5, C6, C7, C8, and C9 were calculated separately within each of the three criterion groups of the built AHP model.

The procedure for assigning weights and relative importance of factors using entropy theory was as follows:

  1. a.

    The matrix formation was first built. Supposing that the target area was divided into n areas, and m was the number of evaluation parameters, the data related to the factors would then be in the form of matrix X. The decision-making matrix X could be defined using Eq. (1).

    $$X = \left( {\begin{array}{*{20}c} {x_{11} } & {x_{12} } & \ldots & {x_{1m} } \\ {x_{21} } & {x_{22} } & \ldots & {x_{2m} } \\ \vdots & \vdots & \ddots & \vdots \\ {x_{n1} } & {x_{n2} } & \ldots & {x_{nm} } \\ \end{array} } \right)$$
    (1)

    where i = 1,2,…,n; j = 1,2,…,m; and $x_{ij}$ is the value of the ith region with regard to parameter j.

  2. b.

    The data were then standardized. In order to eliminate the influence resulting from the difference in the units of the characteristic parameters, values were normalized and transformed using Eq. (2).

    $$r_{ij} = \frac{{x_{ij} }}{{\frac{1}{n}\sum\nolimits_{i = 1}^{n} {x_{ij} } }}$$
    (2)

    where $r_{ij}$ is the standardized value of $x_{ij}$; $x_{ij}$ is the actual value of the sample; and n is the number of evaluating objects.

  3. c.

    The entropy value for index Ej (for the jth evaluating indicator) was then defined using Eqs. (3) and (4).

    $$E_{j} = - \frac{1}{\ln (n)}\sum\nolimits_{i = 1}^{n} {f_{ij} } \ln (f_{ij} )$$
    (3)
    $$f_{ij} = \frac{{r_{ij} }}{{\sum\nolimits_{i = 1}^{n} {r_{ij} } }}$$
    (4)

    where n is the number of evaluating objects. Note that if $f_{ij} = 0$, the term $f_{ij} \ln (f_{ij})$ is taken as 0.

  4. d.

    The weight and relative rank for each index (variables) were finally estimated using Eq. (5).

    $$w_{j} = \frac{{1 - E_{j} }}{{m - \sum\nolimits_{k = 1}^{m} {E_{k} } }}$$
    (5)

    where $w_j$ is the weight of the jth indicator, $E_j$ is its entropy value from Eq. (3), and m is the number of evaluating indicators.
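The four steps above can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the authors' implementation: the function name `entropy_weights` and the example matrix are hypothetical, the input values are assumed to be strictly positive scores, and terms with a zero proportion are skipped in the entropy sum.

```python
import math


def entropy_weights(X):
    """Entropy weights for an n x m matrix X (rows: areas, columns: factors).

    Returns (E, w, ranking): entropy per factor, weight per factor, and the
    factor indices ordered from most to least important.
    """
    n, m = len(X), len(X[0])
    if n < 2:
        raise ValueError("at least two evaluation areas are needed (ln(n) > 0)")

    # Eq. (2): divide every column by its mean to remove units.
    col_mean = [sum(row[j] for row in X) / n for j in range(m)]
    R = [[row[j] / col_mean[j] for j in range(m)] for row in X]

    # Eqs. (3) and (4): proportion f_ij within each column, then entropy E_j.
    E = []
    for j in range(m):
        col_sum = sum(R[i][j] for i in range(n))
        total = 0.0
        for i in range(n):
            f = R[i][j] / col_sum
            if f > 0:  # a zero proportion contributes nothing to the sum
                total += f * math.log(f)
        E.append(-total / math.log(n))

    # Eq. (5): weight from the information content 1 - E_j.
    denom = m - sum(E)
    if abs(denom) < 1e-12:
        raise ValueError("all factors are uniform; weights are undefined")
    w = [(1.0 - e) / denom for e in E]

    ranking = sorted(range(m), key=lambda j: w[j], reverse=True)
    return E, w, ranking


# Hypothetical scores for four areas and two factors (second factor is constant).
E, w, ranking = entropy_weights([[6, 1], [1, 1], [2, 1], [4, 1]])
```

In this made-up example the second factor takes the same score everywhere, so its entropy is 1 and it receives essentially zero weight, which is the behavior the method relies on when ranking factors.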

3.3 Catastrophe theory

Catastrophe theory originated from the topology branch of mathematics. This theory was developed to study the nonlinear behavior of a dynamic system, particularly to assess discontinuities (Kuniansky et al. 2018; Santo et al. 2019). In catastrophe theory, the system function variables can be divided into dependent state variables and control variables. The state variable generally represents the behavior state of a system, while control variables are the influencing factors. The dependency of state variables on the control variables is determined by catastrophic fuzzy membership functions rather than weights assigned by the users. Table 1 shows the number of control parameters and state variables for the standard catastrophe model. For the butterfly function, xa, xb, xc, and xd are the state variables corresponding to the control parameters (a, b, c, and d). In addition, different control variables have different impacts on the state of the variables (Wang et al. 2011). Therefore, the weight determined by this theory is more realistic.

Table 1 Normalization formulas for catastrophe theory

To delineate the susceptibility zones by using the catastrophe theory, four steps were involved: (a) data preprocessing; (b) data standardization; (c) normalization; and (d) calculation of susceptibility index.

  1. (a)

    Data preprocessing: To avoid the catastrophe progression of a control variable that may be too dispersed, which would influence the calculation accuracy, the average value of the index was used. The original data were preprocessed using Eq. (6).

    $$x_{i} = \frac{{r_{ij} }}{{\frac{1}{m}\sum\nolimits_{j = 1}^{m} {r_{ij} } }}$$
    (6)

    where $r_{ij}$ is the original value of the ith indicator for the jth evaluating object, m is the number of evaluating objects, and $x_i$ is the value of indicator i (for a given evaluating object) after scaling by the average value of that indicator.

  2. (b)

    Data standardization: The different thematic indices have different units of measurement. The standardization process makes the data dimensionless, which is a necessary step in catastrophe theory. To avoid the catastrophe progression of a control variable being too high, the following equations were used to standardize the indices.

For the efficiency (benefit) type, where larger is better, data values were directly proportional to the susceptibility index. The data were standardized using Eq. (7).

$$y_{i} = \frac{{x_{i} - x_{i(\min )} }}{{x_{i(\max )} - x_{i(\min )} }}$$
(7)

For the cost type, where smaller is better, the susceptibility index varied inversely with the data layer values. This type of layer was standardized using Eq. (8).

$$y_{i} = \frac{{x_{i(\max )} - x_{i} }}{{x_{i(\max )} - x_{i(\min )} }}$$
(8)

where i is the index or attribute, $x_i$ is the original value of index i, and $x_{i(\max)}$ and $x_{i(\min)}$ are the maximum and minimum values of the original data of index i, respectively.

  3. (c)

    Normalization: The data normalization was implemented using the catastrophe models listed in Table 1. The catastrophe progression of each control variable was computed from the initial fuzzy subordinate function based on the normalization formulas. When applying the normalizing formula to perform recursive computations, one of two principles was selected: complementary or non-complementary (Sadeghfam et al. 2016). When the control variables were independent of each other, and no control variable could offset another, the non-complementary principle was selected. In this case, the minimum of the normalized control parameters was taken as the state variable value of the whole system, i.e., x = min(xa, xb, xc, xd) for the butterfly function. Conversely, when the control variables were complementary to each other, the mean value was selected, i.e., x = (xa + xb + xc + xd)/4 for the butterfly function.

In the sinkhole susceptibility assessment, the complementary principle is satisfied. Considering the structure of the AHP model, the swallowtail function was selected to calculate the final weight of the criterion layer. The weights of the selected nine index factors based on the above entropy theory were further used in this process. Adding the weight of the factors in the same index layer, the weights of the three criterion factors B1, B2, and B3 were obtained using the normalization formulas (Table 1). Finally, the rank of the nine basic factors was determined based on the superimposed principle.

  4. (d)

    Calculation of susceptibility index: In the final step, the total susceptibility index of an element in a given location was obtained by integrating individual variable scores with estimated weight values, as shown in the following equation.

    $$S_{i} = \sum\limits_{j = 1}^{n} {w_{j} } \times r_{ij}$$
    (9)

    where $S_i$ is the sinkhole susceptibility index for area i, $w_j$ is the final calculated weight of parameter j, $r_{ij}$ is the grading value of area i for factor j, and n is the total number of criteria parameters.
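The standardization and normalization steps can be illustrated with a short sketch. Table 1 is not reproduced here, so the exponents below are an assumption: they follow the standard catastrophe progression forms, in which the control variables, ordered from most to least important, are raised to the powers 1/2, 1/3, 1/4, and 1/5 (fold, cusp, swallowtail, and butterfly models). All function names are hypothetical.

```python
def min_max(values, larger_is_better=True):
    """Standardize values to [0, 1] following Eq. (7) or Eq. (8)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant layer carries no contrast, so return zeros.
        return [0.0 for _ in values]
    if larger_is_better:
        return [(v - lo) / (hi - lo) for v in values]  # Eq. (7)
    return [(hi - v) / (hi - lo) for v in values]      # Eq. (8)


def catastrophe_normalize(controls):
    """Apply the assumed normalization formulas to 1-4 control variables.

    `controls` must already be standardized to [0, 1] and ordered from the
    most important to the least important factor.
    """
    if not 1 <= len(controls) <= 4:
        raise ValueError("fold, cusp, swallowtail and butterfly use 1-4 controls")
    # k-th control (k = 0, 1, 2, 3) is raised to the power 1 / (k + 2).
    return [u ** (1.0 / (k + 2)) for k, u in enumerate(controls)]


def state_value(controls, complementary=True):
    """Combine the normalized controls into one state variable value."""
    xs = catastrophe_normalize(controls)
    if complementary:
        return sum(xs) / len(xs)  # complementary principle: mean
    return min(xs)                # non-complementary principle: minimum


# The same two values in a different order give a different state value.
first = state_value([0.9, 0.1])
second = state_value([0.1, 0.9])
```

The last two lines show why the order of the factors matters: swapping the two control variables changes the result, which is the motivation for fixing the order with the entropy ranking before this step.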

4 Sensitivity analysis

Sensitivity analysis can reduce the uncertainty in model accuracy and evaluate the relative importance of different parameters for the assessment results (Feizizadeh and Blaschke 2013). In the present work, two methods for sensitivity analysis were implemented: single-parameter sensitivity analysis and map removal sensitivity analysis. These were used to test the sensitivity of the index factors in the AHP model (Huan et al. 2018; Patra et al. 2018).

The single-parameter sensitivity analysis was used to assess the influence of the individual input parameter on the susceptibility index. Equation (10) is used to calculate the effective weight of each parameter.

$$W_{i} = \frac{{w_{i} \cdot r_{i} }}{S}$$
(10)

where $W_i$ is the effective weight of the ith parameter; $r_i$ and $w_i$ are the given rating and weight of the ith parameter, respectively; and S is the susceptibility index.
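As a small worked illustration of Eq. (10), the sketch below computes the effective weights for a single evaluation unit. The function name and the example weights and ratings are hypothetical and are not taken from the study.

```python
def effective_weights(weights, ratings):
    """Effective weight of each parameter in one evaluation unit, Eq. (10)."""
    if len(weights) != len(ratings):
        raise ValueError("weights and ratings must have the same length")
    # S is the susceptibility index of the unit (weighted sum of ratings).
    S = sum(w * r for w, r in zip(weights, ratings))
    if S == 0:
        raise ValueError("susceptibility index is zero")
    return [w * r / S for w, r in zip(weights, ratings)]


# Three hypothetical parameters with theoretical weights 0.5, 0.3 and 0.2.
W = effective_weights([0.5, 0.3, 0.2], [6, 2, 1])
```

In this example the first parameter has a theoretical weight of 0.5 but an effective weight of about 0.79, because its rating in this unit is much higher than the others; the effective weights always sum to 1.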

The map removal sensitivity analysis quantifies the sensitivity associated with the removal of one or more indicator layers. It can also be used to examine the contribution of each parameter to the susceptibility index.

The index of the removal of a parameter was defined as follows.

$${\text{SA}} = \frac{{\left| {\frac{S}{N} - \frac{{S^{\prime}}}{n}} \right|}}{S} \times 100$$
(11)

where SA denotes the sensitivity index; S and S′ are the unperturbed and perturbed susceptibility indices, respectively; and N and n are the numbers of factors used to compute S and S′.
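Equation (11) can be sketched as follows for one evaluation unit. The text does not state whether the remaining weights are rescaled after a layer is removed, so this sketch assumes they are left unchanged; the function name and the example values are hypothetical.

```python
def map_removal_sensitivity(weights, ratings, removed):
    """Sensitivity index SA (in percent) after removing the given layers.

    `removed` is a collection of parameter indices to drop. Assumption: the
    weights of the remaining layers are used as they are, without rescaling.
    """
    N = len(weights)
    kept = [j for j in range(N) if j not in set(removed)]
    n = len(kept)
    if n == 0:
        raise ValueError("at least one layer must remain")

    S = sum(w * r for w, r in zip(weights, ratings))       # unperturbed index
    S_prime = sum(weights[j] * ratings[j] for j in kept)   # perturbed index
    return abs(S / N - S_prime / n) / S * 100.0


# Remove the third of three hypothetical layers.
sa = map_removal_sensitivity([0.5, 0.3, 0.2], [6, 2, 1], removed=[2])
```

Repeating the call for each layer in turn, and averaging SA over all evaluation units, gives a simple way to compare how strongly each layer influences the susceptibility index.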

5 Results

5.1 Data collection and preparation

To identify karst collapse susceptibility zones, the criterion layers of karst conditions, overlying layer conditions, and hydrological conditions were prepared in the AHP model. Nine important causative factors were selected, and the corresponding thematic data layers were prepared in GIS with a resolution of 10 m. Furthermore, the kriging spatial interpolation technique was applied to transform the discrete spatial data into a continuous surface, which simultaneously considered both the distance and the degree of variation between values of known data points (Goovaerts 1997). Input data were mainly obtained from field investigations, topographic maps, borehole logs, well logs, and field tests.

5.1.1 Karst conditions

Karst collapse is affected by the degree of karst development and geological structures. Karstification in carbonate rocks is significant, causing the appearance of karstic formations (Papadopoulou-Vrynioti et al. 2013). Karstification of the underlying bedrock was active in the study area, especially in the covered karst area. Of the 56 boreholes drilled in active karstification, 47 (approximately 84%) encountered cavities. The cavity height ranged from 0.1 to 18.9 m, and approximately 87% of the heights were less than 5 m. Generally, the degree of karst development, as shown in Table 2, can be divided into four susceptibility degrees according to the situation of the study area. A higher rating value indicates more substantial karst development. The data were obtained using boreholes and a geological map at 1:10,000 scale, as shown in Fig. 4a.

Table 2 Reclassified criteria and their score values

Fig. 4 Karst conditions of the study area

Geological structures affect the continuity of the bedrocks and soil masses and may act as conduits for seepage (Tharp 1999; Wu et al. 2018). Fault zones are commonly characterized by higher hydraulic conductivity and more cavernous conditions. According to geophysical and borehole data, the rock surrounding fault structures was severely fragmented and showed karst development. Most recorded collapses occurred near fault zones. Therefore, faults are an essential factor in karst collapse susceptibility. Classes were assigned based upon the distance from a fault (Table 2): ≤ 50 m, 50–200 m, 200–500 m, and ≥ 500 m. The lithology map was obtained from geological maps at a scale of 1:10,000. The map indicating the distance from a fault was produced by a buffering method in GIS (Fig. 4b).
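The reclassification of a distance layer into score values can be sketched as below. Table 2 is not reproduced in this section, so two points are assumptions: that the nearest class receives the highest score (6, 4, 2, and 1 from nearest to farthest), and that a distance exactly on an inner class boundary falls into the farther class. The function name is hypothetical.

```python
def score_by_distance(distance_m, breaks=(50.0, 200.0, 500.0), scores=(6, 4, 2, 1)):
    """Map a distance in metres to a class score.

    Default classes follow the fault distances in the text:
    <= 50 m, 50-200 m, 200-500 m and >= 500 m.
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    if len(scores) != len(breaks) + 1:
        raise ValueError("need exactly one more score than class breaks")

    first, *inner = breaks
    if distance_m <= first:          # nearest class includes its upper bound
        return scores[0]
    for k, upper in enumerate(inner, start=1):
        if distance_m < upper:       # inner classes exclude their upper bound
            return scores[k]
    return scores[-1]                # farthest class includes its lower bound


# Cells at 30 m, 120 m, 350 m and 800 m from the nearest fault.
fault_scores = [score_by_distance(d) for d in (30, 120, 350, 800)]
```

The same function can be reused for other buffered layers by passing different class breaks, for example `breaks=(100.0, 200.0, 400.0)`.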

5.1.2 Overlying layer conditions

The overburden is the material basis of soil holes in the covered area (Tharp 1999; Santo et al. 2019). An apparent phenomenon of this region was that all of the sinkholes occurred in zones where the limestone bedrock was covered by Quaternary alluvium. The type, formation, and thickness of the overburden were vital factors of the karst sinkhole development processes.

The available borehole data indicated that the overburden could be divided into several types, including alluvial–fluvial sandy clay, clayey sand, medium-coarse sand with gravels, etc. The material of the overburden influenced the formation of soil caves. Nine sinkholes occurred in sand or in a mixed material consisting of sand, gravel, and clay, and three occurred in a mixture of gravel and clay. Meanwhile, according to statistics, more than 90% of the karst collapses occurred where the overburden overlying the karst contained multilayered structures. Only one sinkhole occurred in a monolayer structure that was mainly composed of sands. Therefore, the lithology of the soil layer was classified into four classes (Table 2). Sand was given the highest rating value, followed by a mixed composition and clay, as shown in Fig. 5a. The structure of the overlying layer was divided into four categories based on the vertical stratification of the soil (Table 2). A more complex structure was given a higher rating value (Fig. 5b). The lithology and structure of the overlying layer were derived from 116 data points collected at boreholes and production wells.

Fig. 5 Overlying layer conditions of the study area

Seven recorded karst collapses (58% of the total) occurred in the covered area with a soil thickness of less than 15 m. Two sinkholes occurred where the soil thickness ranged between 15 and 25 m. The other three sinkholes occurred in a region with a thickness greater than 25 m. Accordingly, four classes were created based on the thickness of the overburden layer, as noted in Table 2. The overlying thickness map was extracted from the collected well and borehole logs (Fig. 5c).

5.1.3 Hydrological conditions

Groundwater exploitation and the hydrographic network contribute to the occurrence of cover collapse (He et al. 2013). The over-extraction of groundwater in the area has led to a sharp decline in the water table since 2016, which was the peak period of karst collapses. A groundwater table fluctuating within the bedrock can easily induce collapse of the formation. Half of the sinkholes occurred where the groundwater level was around the bedrock depth (< 5 m). Five sinkholes emerged where the water table was above the bedrock (by more than 5 m). Only two collapses occurred where the water table was more than 5 m below the bedrock. Therefore, the relative location between groundwater and bedrock was divided into four classes (Table 2), obtained by the thematic mapping of overburden thickness and groundwater depth (Fig. 6a). In this computational process, the groundwater level was taken as the lowest level in a year (just before rainfall in June 2018).

Fig. 6 Hydrological conditions of the study area

The basic pattern of sinkhole occurrence showed that the region of karst collapses is related to groundwater exploitation. Groundwater resources play an essential role in the formation of sinkholes. Groundwater abundance indirectly reflects the degree of karst development and groundwater potential zones, which are related to the occurrence of collapse. Areas with higher groundwater abundance are more likely to develop sinkholes. Therefore, a larger value was assigned to a region having greater groundwater abundance, as listed in Table 2. Groundwater abundance was obtained from drilling, wells, and field tests, as shown in Fig. 6b.

Excessive pumping of groundwater leads to a decline in the piezometric level and increases the possibility of sinkholes. The rate of groundwater level drawdown was calculated between June 2017 and June 2018. Based on the available data of the groundwater table and mechanical analysis, the classifications of the drawdown rate are shown in Table 2. Groundwater level data from boreholes and monitoring wells were entered into GIS and separately interpolated across the whole assessment area, as shown in Fig. 6c.

Previous studies have shown that hydrologic flow channels are essential predictors of cover collapse sinkholes. Three recorded sinkholes emerged along the riverbed and banks in the investigation area. An intermittent river in the assessment region acts as a source for the aquifer system. Therefore, the distance from the river was also selected as a factor, and its thematic layer was divided into four groups (Table 2): ≤ 100 m, 100–200 m, 200–400 m, and ≥ 400 m. This factor was obtained by buffering the river network from topographic maps at a scale of 1:10,000 (Fig. 6d).
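The four distance groups above can be sketched as a simple binning step. This is a minimal illustration, not the GIS buffering workflow used in the study: the function name `river_distance_class` is hypothetical, the class scores of Table 2 are not reproduced, and the handling of the shared boundaries (100 m goes to the first group, 400 m to the last, 200 m to the second) is an assumption based on the inequalities as printed.

```python
import numpy as np

# Group labels as given in the text.
RIVER_LABELS = ["<= 100 m", "100-200 m", "200-400 m", ">= 400 m"]

def river_distance_class(distance_m):
    """Return the 0-based group index for each distance to the river (metres)."""
    d = np.asarray(distance_m, dtype=float)
    # right=True assigns a value equal to a break to the lower group,
    # so exactly 100 m falls in "<= 100 m" and 200 m in "100-200 m".
    idx = np.digitize(d, [100.0, 200.0], right=True)
    # The text states ">= 400 m" for the last group, so 400 m itself goes there.
    return np.where(d >= 400.0, 3, idx)

classes = river_distance_class([50, 100, 150, 399, 400, 900])
print([RIVER_LABELS[i] for i in classes])
```

In a raster workflow the same function could be applied to a distance grid, after which each group index would be mapped to its Table 2 score.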

5.2 Analytic hierarchy process evaluation model

As mentioned above, there were four classes in this process: very high, high, medium, and low, with corresponding class score values of 6, 4, 2, and 1, respectively. The values, given in Table 2, were assigned based on published information and the experience of local experts in the study area.

The relationship matrix R was ascertained using Eq. (12).

$$R = \left[ {\begin{array}{*{20}c} {r_{11} } & {r_{12} } & \ldots & {r_{1n} } \\ {r_{21} } & {r_{22} } & \ldots & {r_{2n} } \\ \vdots & \vdots & \vdots & \vdots \\ {r_{m1} } & {r_{m2} } & \ldots & {r_{mn} } \\ \end{array} } \right]$$
(12)

where $r_{ij}$ indicates the score value of the $j$th index belonging to the $i$th evaluation unit, $m$ is the number of evaluation units, and $n$ is the number of factors within the index layer (in this study, $n = 9$).

5.3 The rank of factors based on entropy theory

Because the AHP model had two parameter layers, the weights of the factors were determined separately for the criterion layer and the index layer. To avoid the effect of factor ranking on the weights when using catastrophe theory, the entropy method was applied before the weights of the criterion layer were calculated.

The criterion layer contained three first-grade factors: B1, B2, and B3. Each criterion factor had a different number of second-grade factors in the index layer. First, the ranks of the second-grade factors were determined. The indicator score values were standardized (given in the fourth column of Table 3) to calculate the weights of the index factors. Using Eq. (1) through Eq. (4), the weights of the indices belonging to the same criterion layer (ws) were determined, as shown in Table 3. Because the weight also represents the relative intensity coefficient in the competitive sense, the priority of each index factor (the sixth column of Table 3) was determined as well for further use in the catastrophe model.
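The ranking step can be illustrated with the standard entropy weight method. Eqs. (1) through (4) are not reproduced in this section, so the sketch below assumes the commonly used formulation (column proportions, Shannon entropy scaled by ln m, weights proportional to 1 − entropy); the function name `entropy_weights` and the example scores are hypothetical, and the paper's own standardization step may differ.

```python
import numpy as np

def entropy_weights(scores):
    """Entropy weights for an (m units x n indices) array of positive scores."""
    x = np.asarray(scores, dtype=float)
    m = x.shape[0]
    # Proportion of each unit within every index column.
    p = x / x.sum(axis=0, keepdims=True)
    # Treat 0 * ln(0) as 0 by replacing zeros inside the logarithm.
    safe_p = np.where(p > 0, p, 1.0)
    entropy = -(p * np.log(safe_p)).sum(axis=0) / np.log(m)
    divergence = 1.0 - entropy
    # Undefined if every column is constant (all divergences are zero).
    return divergence / divergence.sum()

# Hypothetical class scores (6, 4, 2, 1) of two indices over five evaluation units.
scores = [[6, 2], [4, 2], [1, 1], [2, 2], [6, 4]]
w = entropy_weights(scores)
priority = np.argsort(-w)  # index positions ordered from highest to lowest weight
print(w, priority)
```

An index whose scores vary more across the evaluation units receives a larger weight and therefore a higher priority in the subsequent catastrophe model.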

Table 3 Rank of the index in criterion layer based on entropy theory

5.4 Calculating the weight of factors based on catastrophe theory

As noted earlier, the weights of the criterion factors were further calculated using catastrophe theory. The catastrophe model for each criterion was selected based on its number of index factors. For example, criterion factor B1 had two sub-factors, C1 and C2, and therefore met the requirements of the cusp catastrophe model (Table 1). Furthermore, index factor C1 had a higher priority than C2 according to the results of the entropy model above (Table 3). In this study, larger values were considered better for all nine index factors, so Eq. (7) was selected. Based on the selected catastrophe model (for example, the cusp model for B1), the weights of the criterion factors ($w^{c}$) were finally calculated following the complementary principle, as shown in Table 4.
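A minimal sketch of this step is given below. Eq. (7) and Table 1 are not reproduced in this section, so the code assumes the normalization formulas commonly used with catastrophe theory (exponents 1/2, 1/3, 1/4, and 1/5 for control variables in decreasing priority) and takes the mean of the normalized values under the complementary principle. The function name `catastrophe_value` and the input values are hypothetical.

```python
def catastrophe_value(controls, complementary=True):
    """Combine 1 to 4 control variables in [0, 1], ordered by decreasing priority.

    Two controls correspond to the cusp model, three to the swallowtail
    model, and four to the butterfly model.
    """
    if not 1 <= len(controls) <= 4:
        raise ValueError("expected between 1 and 4 control variables")
    if any(c < 0 or c > 1 for c in controls):
        raise ValueError("control variables must be standardized to [0, 1]")
    # The k-th control (k = 0, 1, ...) is normalized with exponent 1 / (k + 2).
    normalized = [c ** (1.0 / (k + 2)) for k, c in enumerate(controls)]
    if complementary:
        # Complementary principle: the controls can offset one another.
        return sum(normalized) / len(normalized)
    # Non-complementary principle: the weakest control governs.
    return min(normalized)

# Cusp model for B1 with hypothetical standardized values for C1 and C2.
print(catastrophe_value([0.25, 0.125]))
```

Because the exponent depends on position, the order of the controls matters, which is why the entropy-based priority is fixed before this calculation.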

Table 4 Results of the catastrophe model

Using the following equation, the calculated weight was further normalized (given in the eighth column of Table 4).

$$w_{j}^{c * } = \frac{{w_{j}^{c} }}{{\sum\nolimits_{j = 1}^{m} {w_{j}^{c} } }}$$
(13)

where $w_{j}^{c}$ is the calculated weight of criterion factor $j$, $w_{j}^{c*}$ is the normalized weight, and $m$ is the number of criterion factors in the AHP model.

Finally, based on the hierarchical structure of the AHP model, the weights of all nine factors were further calculated using Eq. (14). The weight results are listed in Table 4.

$$w_{i} = w_{j}^{c*} \times w_{ij}^{c}$$
(14)

where $w_{i}$ is the final calculated weight of factor $i$; $w_{j}^{c*}$ is the normalized weight of criterion layer $j$; and $w_{ij}^{c}$ is the weight of index factor $i$, which belongs to criterion layer $j$.
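Eqs. (13) and (14) can be sketched together as follows. All numerical values are hypothetical placeholders rather than the entries of Tables 3 and 4, and the assignment of C3 through C9 to B2 and B3 is an assumption made only for illustration (the text confirms only that C1 and C2 belong to B1).

```python
def normalize(weights):
    """Eq. (13): divide each criterion weight by the sum of all criterion weights."""
    total = sum(weights.values())
    return {name: value / total for name, value in weights.items()}

def final_weights(criterion_weights, index_weights):
    """Eq. (14): normalized criterion weight times the index weight within it."""
    wc_star = normalize(criterion_weights)
    return {
        index: wc_star[criterion] * w
        for criterion, members in index_weights.items()
        for index, w in members.items()
    }

# Hypothetical catastrophe-model outputs for the three criterion factors.
criterion_weights = {"B1": 0.90, "B2": 0.80, "B3": 0.70}
# Hypothetical index weights; each group sums to 1 within its criterion.
index_weights = {
    "B1": {"C1": 0.6, "C2": 0.4},
    "B2": {"C3": 0.3, "C4": 0.3, "C5": 0.4},
    "B3": {"C6": 0.25, "C7": 0.25, "C8": 0.25, "C9": 0.25},
}
weights = final_weights(criterion_weights, index_weights)
print(weights)
```

When the index weights within each criterion sum to 1, the nine final weights also sum to 1, so they can be used directly in a weighted mean.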

5.5 Results of sensitivity analysis and validation

Sensitivity analysis was used to examine the influence of the rate and weight values assigned to each selected factor. Because the weight calculation was based on the rate values of the factors in the assessment region, and the rate values were assigned according to local expert opinion and conditions, the rate values are a source of sensitivity. Therefore, the summary statistics of the nine evaluation parameters in the AHP model are presented first, as shown in Table 5. The highest mean rate value was that of C5 (mean value = 5.29), followed by C3, C4, and C1. In comparison, C2, C6, C7, C8, and C9 had lower mean rate values.

Table 5 Summary statistics of the nine factors of AHP

The single-parameter sensitivity analysis represents the relative importance of a single factor compared with the other eight parameters. The results (Table 6) show that C1, C3, C4, and C5 had effective weights (16.02%, 12.03%, 11.56%, and 19.05%, respectively) that were higher than their empirical weights, especially C5 and C1, indicating that these parameters had the most significant impact on sinkhole susceptibility. C2, C6, C7, C8, and C9 tended to be less effective parameters in the susceptibility assessment, with low effective weights (8.25%, 8.55%, 9.16%, 6.01%, and 6.78%, respectively). This may be because parameters C5, C3, C4, and C1 had larger mean score values for the subregion class values.
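The effective weight of a parameter in one evaluation unit can be sketched as below. The equation used in the paper is not reproduced in this section, so the code assumes the usual single-parameter form, in which the effective weight is the rate times the weight divided by the susceptibility index, expressed as a percentage. The weights are the final normalized weights reported in the paper; the rate values of the example cell are hypothetical.

```python
# Final normalized weights of the nine factors, as reported in the paper.
WEIGHTS = {
    "C1": 0.1555, "C2": 0.1385, "C3": 0.0995, "C4": 0.1024, "C5": 0.1121,
    "C6": 0.1000, "C7": 0.1019, "C8": 0.0980, "C9": 0.0921,
}

def effective_weights(rates, weights=WEIGHTS):
    """Effective weight (%) of each parameter in a single evaluation unit."""
    s = sum(weights[k] * rates[k] for k in weights)
    return {k: 100.0 * weights[k] * rates[k] / s for k in weights}

# Hypothetical rate values (class scores 6, 4, 2, or 1) for one grid cell.
cell = {"C1": 4, "C2": 2, "C3": 4, "C4": 4, "C5": 6,
        "C6": 2, "C7": 2, "C8": 1, "C9": 1}
for name, value in effective_weights(cell).items():
    print(f"{name}: {value:.2f}%")
```

If every parameter had the same rate value, each effective weight would equal its theoretical weight, so departures between the two reflect the spatial pattern of the rate values.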

Table 6 Statistical summary of single-parameter sensitivity analysis

Map removal sensitivity analysis, in which one layer is removed at a time, can help to identify the most significant parameter. Table 7 summarizes the change in index S resulting from the removal of a single parameter at a time using Eq. (11). C5 was identified as the most sensitive parameter, with an average variation index of 1.37%, followed by C1. Furthermore, due to the high rate value, C2, C8, and C9 also exerted relatively more significant influences on sinkhole susceptibility. The influence of C3 was small according to its low mean variation index (0.4%), as was that of C6 and C7 (0.38% and 0.43%, respectively). The results of the map removal sensitivity analysis were consistent with those of the single-parameter sensitivity analysis.
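A minimal sketch of the variation index for one evaluation unit follows. Eq. (11) is not reproduced in this section, so the code assumes the widely used map removal form, in which the index per layer of the full model (S/N) is compared with the index per layer after one layer is removed (S'/n), relative to S. The weights are the final normalized weights reported in the paper; the rate values are hypothetical.

```python
# Final normalized weights of the nine factors, as reported in the paper.
WEIGHTS = {
    "C1": 0.1555, "C2": 0.1385, "C3": 0.0995, "C4": 0.1024, "C5": 0.1121,
    "C6": 0.1000, "C7": 0.1019, "C8": 0.0980, "C9": 0.0921,
}

def variation_index(rates, removed, weights=WEIGHTS):
    """Variation index (%) when the layer `removed` is dropped from one unit."""
    n_full = len(weights)
    s_full = sum(weights[k] * rates[k] for k in weights)
    s_reduced = s_full - weights[removed] * rates[removed]
    # Assumed form: |S / N - S' / n| / S * 100, with n = N - 1.
    return abs(s_full / n_full - s_reduced / (n_full - 1)) / s_full * 100.0

# Hypothetical rate values for one grid cell.
cell = {"C1": 4, "C2": 2, "C3": 4, "C4": 4, "C5": 6,
        "C6": 2, "C7": 2, "C8": 1, "C9": 1}
for name in WEIGHTS:
    print(f"remove {name}: {variation_index(cell, name):.2f}%")
```

Under this form, a layer whose weighted rate equals the average contribution of all layers yields a variation index of zero, so large values flag layers that contribute far more or far less than average.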

Table 7 Statistics of map removal sensitivity analysis

In general, the difference between the theoretical weight and effective weight was not significant, and the variation index of the one map removal sensitivity was relatively reasonable, which indicates that the calculated assessment results are relatively reliable.

However, the sensitivity analysis also indicated that C5 and C1 were the most sensitive parameters in the AHP model, which was mainly due to their high mean score values in the study area. Thus, more accurate and objective rate values are required for better assessment.

5.6 The region of karst collapse susceptibility

Using the weights calculated by catastrophe theory combined with the entropy index in the AHP model, the final comprehensive susceptibility index S was calculated by the weighted mean method according to Eqs. (1) through (9) and Eqs. (12) through (14). A higher susceptibility index S indicates a higher potential for karst collapse. The natural breaks classification scheme can determine the best arrangement of values into different classes (Yalcin and Gul 2017). Using the natural breaks classification method, the susceptibility was low when S ≤ 2.4, medium when 2.4 < S ≤ 3, high when 3 < S ≤ 3.6, and very high when S > 3.6. The sinkhole susceptibility map produced by this method in a GIS environment is presented in Fig. 7.
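The weighted combination and the class thresholds above can be sketched for a single evaluation unit as follows. The weights are the final normalized weights reported in the paper and the thresholds are those stated in the text; the function names and the example scores are hypothetical, and the actual computation was carried out on raster layers in GIS.

```python
# Final normalized weights of the nine factors, as reported in the paper.
WEIGHTS = {
    "C1": 0.1555, "C2": 0.1385, "C3": 0.0995, "C4": 0.1024, "C5": 0.1121,
    "C6": 0.1000, "C7": 0.1019, "C8": 0.0980, "C9": 0.0921,
}

def susceptibility_index(scores, weights=WEIGHTS):
    """Weighted sum of the class scores; the weights sum to 1."""
    return sum(weights[k] * scores[k] for k in weights)

def susceptibility_class(s):
    """Map the index S to the four classes using the stated break values."""
    if s <= 2.4:
        return "low"
    if s <= 3.0:
        return "medium"
    if s <= 3.6:
        return "high"
    return "very high"

# Hypothetical class scores (6, 4, 2, or 1) for one grid cell.
cell = {"C1": 4, "C2": 2, "C3": 4, "C4": 4, "C5": 6,
        "C6": 2, "C7": 2, "C8": 1, "C9": 1}
s = susceptibility_index(cell)
print(round(s, 3), susceptibility_class(s))
```

Because the class scores range from 1 to 6 and the weights sum to 1, S is bounded between 1 and 6 for every cell.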

Fig. 7 Results of sinkhole susceptibility modeling

5.6.1 Very high-susceptibility area

All of the recorded karst collapses are distributed in the very high-susceptibility area, which has abundant geological structures, strong karstification, and intensive human activities. This area is mainly located in the villages of Yandi, Zhongqiao, Shuidaoguan, Sishui, Beichengzi, and Gaogezhuang, and covers 9.04 km² (about 20.09% of the total area). The degree of karst and fault development in this area is also extreme. The Quaternary system, whose thickness is less than 25 m, is mainly composed of dual or multivariate structures. Furthermore, this region is near the cement plant, where the groundwater dynamic conditions have changed dramatically and the groundwater table is near the bedrock level. Overall, this area has evident conditions for sinkhole occurrence, so there is a high possibility of cover collapse and related environmental problems.

5.6.2 High-susceptibility area

This area includes regions with strong karstification and/or regions near fault zones. It is mainly close to the villages of Caogezhuang, Baichengzi, and Sishui, and comprises a total area of 8.92 km² (about 19.82% of the total area). The Quaternary system, whose thickness is less than 15 m in the region, is mainly composed of dual structures mixed with sand, gravel, and clay. Due to water use for industry and agriculture, especially near the village of Sishui, the groundwater level fluctuates relatively consistently, although not very intensely. When the dynamic conditions change, for example through drawdown of the groundwater level, a soil cave may continue to expand, and a new sinkhole may occur in this region.

5.6.3 Medium-susceptibility area

This area includes regions with low karstification, few geological structures, and low groundwater exploitation. Regions of this type are scattered across the assessment area and are mainly affected by surface water or fault structures. Tectonic movement and the hydrographic network are the critical factors for the occurrence of karst collapses in this region. The region covers about 17.36 km² (about 38.58% of the total area).

5.6.4 Low-susceptibility area

This type of area occurs where the geological conditions are simple and groundwater extraction has no effect. The groundwater level is relatively stable in the region. Generally, the necessary conditions for collapse do not exist in this region, so the karst collapse susceptibility is low. The low-susceptibility zones cover a total area of 9.68 km² (about 21.51% of the total area).
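As a quick consistency check, the area shares quoted for the four classes can be recomputed from the reported areas. The sketch below uses only the areas stated above; the variable names are hypothetical.

```python
# Areas (km^2) of the four susceptibility classes, as reported above.
areas_km2 = {"very high": 9.04, "high": 8.92, "medium": 17.36, "low": 9.68}

total_km2 = sum(areas_km2.values())
shares_pct = {name: round(100.0 * area / total_km2, 2)
              for name, area in areas_km2.items()}

print(round(total_km2, 2), shares_pct)
```

The recomputed shares agree with the percentages given for each class, and the four areas together account for the whole assessment area.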

6 Discussion and conclusions

Sinkhole hazards have expanded dramatically since 2005 in the study area and have caused severe damage across the region. Only preliminary work related to sinkholes has been performed in the study area since 2016. Karst collapse susceptibility assessment is an effective and urgently needed tool for hazard mitigation. This article presented a novel integrated method for the assessment of karst collapse vulnerability by applying the analytic hierarchy process approach combined with catastrophe theory and entropy theory. The locations of the recorded sinkholes, as well as the sensitivity analysis, showed that the AHP model combined with catastrophe and entropy theories could reduce the subjectivity in karst collapse vulnerability assessment, which would be helpful for the prevention and control of karst collapse hazards in the study area. Additionally, the methodology applied here could also be used in other geological hazard susceptibility assessments.

The occurrence of sinkholes in the study area resulted from the combined effect of several factors typical of cover collapse sinkholes. The AHP model was composed of three criterion layers (B1, B2, and B3) covering nine factors: C1, C2, C3, C4, C5, C6, C7, C8, and C9. The sinkhole susceptibility was classified into four classes: very high, high, moderate, and low. The scale values were based on the opinions of experienced experts. The thematic layers of the nine factors were produced using a GIS platform with a 10 × 10 m grid size. Rather than constructing a matrix using systematic pairwise comparisons, as the normal AHP would, the relative weights were assigned to each layer and factor using the catastrophe model combined with entropy theory. The use of catastrophe theory reduced the subjectivity that is a typical disadvantage of the AHP model. Furthermore, the entropy theory improved the rationality and reliability of the calculated weights, which resolved the disadvantage of the catastrophe model. The final normalized weights of C1, C2, C3, C4, C5, C6, C7, C8, and C9 were 0.1555, 0.1385, 0.0995, 0.1024, 0.1121, 0.1000, 0.1019, 0.0980, and 0.0921, respectively. Finally, a weighted linear combination method was used to obtain the susceptibility zoning.

Based on the classification results in GIS, the assessment area was divided into very high-, high-, moderate-, and low-susceptibility areas, covering 9.04, 8.92, 17.36, and 9.68 km², respectively. All 12 recorded cover collapses were located within the very high-susceptibility area, indicating that the model could potentially predict sinkholes. The very high- and high-susceptibility areas, which make up 39.91% of the total area, were concentrated in the vicinity of the mine and/or towns, where human activities, especially the extraction of groundwater, were intensive. According to observation and survey results, sinkholes have caused property damage in this region, primarily within the densely populated areas. In general, the conditions of the soil, rock, and groundwater are closely related to cover collapse sinkhole formation and development in this kind of region. Groundwater extraction is the primary factor that induces sinkholes.

Furthermore, environmental problems associated with sinkholes could lead to severe disasters in regions subject to long-term excessive exploitation of groundwater. The medium- and low-susceptibility zones, with moderate karstification or few geological structures, were mainly outside the groundwater drawdown region and covered 38.58% and 21.51% of the total area, respectively. The groundwater level was relatively stable in these zones, which are far from the cement plant and from the intense extraction of other industrial facilities. Tectonic structures and the hydrographic network may be the underlying factors contributing to the occurrence of sinkholes in these zones. In general, the drawdown of the groundwater level is the most important influencing factor in the study area. Therefore, with continued intense groundwater exploitation, karst collapses may occur more frequently, especially in the very high- and high-susceptibility areas. Groundwater exploitation should be strictly controlled, especially in the very high- and high-susceptibility zones, to avoid the occurrence of sinkholes.

Numerous studies have been conducted to produce sinkhole susceptibility maps. Subjective and objective methodologies both have merits and limitations in the assessment of hazard susceptibility. The AHP technique is a broadly used subjective method in the field of natural hazard susceptibility assessment, including karst collapse. Expert opinions are indeed very useful in solving complex problems such as sinkholes. Considering the key disadvantage of the AHP method, namely its subjective weights, an objective method should also be established to obtain reliable results. The combined use of objective and subjective methods could fully reflect the merits of each. In the present work, the combined method not only placed particular importance on the experience of local experts, fully considering the sinkhole occurrence mechanisms, but also captured differences between indices using objective methods. Furthermore, to overcome the disadvantage of catastrophe theory, another objective method, the entropy model, was selected to rank the nine factors before the catastrophe theory was applied.

The use of the subjective AHP methodology considers the practical situation, which results in more suitable and practical outcomes that apply specifically to the selected study region. As there are no standard rules for assigning the values of classified factors, the rate values had inevitable subjective features. Therefore, not only were the weights determined from the experts' judgment of relative importance subjective, but the classified values of the parameters were also assigned by experienced local experts. The grade values of the nine selected parameters (C1, C2, C3, C4, C5, C6, C7, C8, and C9) were also the input to the objective methods in the present work. To further analyze the results of the present methodology, the single-parameter sensitivity analysis and the map removal sensitivity analysis were used to examine the influence of the rate and weight values of the nine selected factors. The results of both sensitivity analysis methods showed that the assigned rate values affected the effectiveness of the calculated weights. Parameter C5 (the overlying layer) was the most sensitive parameter, with the highest mean effective weight (19.05%) and the largest mean variation index (1.37%), followed by karst development (C1). The mean effective weight and mean variation index of C1 were 16.02% and 0.81%, respectively. Owing to their higher mean score values (5.29 and 3.21, respectively), C5 and C1 were the most sensitive factors in the AHP model. Furthermore, C3 and C4, with relatively large average values of 3.59 and 3.34, showed higher effective weights. Therefore, the rate values were a source of uncertainty in the assessment, which should also be noted in future studies.