Introduction

Karst can be defined as calcareous or dolomitic areas with distinctive landforms and hydrology, formed from the dissolution of soluble rocks (e.g., limestone, dolomites, or gypsum) (Christofoletti 1980; Ford and Williams 2007). Other authors extend the definition of karst to non-carbonatic rocky landscapes (pseudokarst), such as quartzite, sandstone, and even granites (Grimes 1975; Halliday 2007; Hardt and Pinto 2009; Hardt et al. 2009; Fabri et al. 2014). Karst landscapes have singular hydrology and geomorphologic forms with a high hydraulic connection between surface water and groundwater through numerous features, such as sinkholes and swallow holes, which result from the dissolution of carbonates with high solubility and well-developed secondary porosity (Ford and Williams 1989, 2007). The water precipitated within the limits of a karst watershed can infiltrate directly into swallow holes or sinkholes, which act as direct recharge points, or diffusely into rock fractures or overlying soils (Goldscheider 2005).

Sinkholes (or dolines) are important features developed in karst landscapes and can be considered the most common forms of exokarst. Classified as closed depressions with a circular to semicircular aspect, they have been described since the nineteenth century as zones where surface drainage converges, being unique and extremely important in environmental studies that involve this type of geomorphological landscape (Cvijić 1893a, b; Piló 2000). Furthermore, sinkholes can evolve into uvalas (nested sinkholes, compound dolines, or polygenetic sinkholes), defined as significant closed depressions formed by the coalescence of two or more sinkholes (Sweeting 1979; Piló 2000; Florenzano 2008; Ćalić 2011). Another common karst feature is the polje, a very large closed depression, in some places several kilometers long and wide, with a flat floor either of limestone or covered by recent material and steep slopes concentrated on its edges (Monroe 1970). Subtypes and combinations of these features are also frequently identified in karst geomorphology. Hence, the identification and mapping of karst geomorphologic elements are essential, since they express this landscape’s geogenic processes.

Due to favorable conditions, a highly evolved karst landscape can be observed in the Carste Lagoa Santa Environmental Protection Unit (CLSPU), in which the more typical expressions are: sinkholes, uvalas, swallow holes, poljes, limestone walls, and blind valleys leading to a typical fluviokarst landscape where the drainage becomes a mixture between surface water and groundwater (Monroe 1970; Ford and Williams 2007). The significant development of typical karst features in this region is mainly attributed to the purity of carbonatic rocks (i.e., high calcite proportions) and the relatively high rainfall (IBAMA 1998). These factors directly influence the constant dissolution of carbonatic rocks and the consequent development of the typical features of the karst landscape.

The integrated use of remote sensing and geographic information systems (GIS) is becoming increasingly efficient and accessible due to improved data accuracy and resolution, together with a continuous reduction of acquisition costs. The capacity for analyzing large areas in short periods is another advantage of geoprocessing resources. Meneses (2003) reinforces that computer tools facilitate complex analysis with the interaction of data from various sources and the creation of georeferenced databases. This type of application has proven very useful for the geomorphological mapping of karst landscapes (Oliveira 2001; Sallun Filho and Karmann 2007; Siart et al. 2009; Silva et al. 2015). However, many of these investigations have been performed based on aerial photography interpretation and manual delineation, which leads to long execution periods and greater subjectivity (Williams 1972; IBAMA 1998; Telbisz et al. 2009; Uagoda et al. 2011; Zumpano et al. 2019). Although aerial photographs are useful tools for the visual identification of sinkholes, they are limited by image resolution and by the experience and judgment of the GIS professional; therefore, it may not be possible to locate features with a small area, very soft edges, and/or unfavorable soil cover characteristics (Gutiérrez et al. 2008).

In contrast, the application of Digital Elevation Models (DEMs) allows the detection of closed depressions and improves classification coverage by providing an indispensable numeric component for spatial analysis (Lyew-Ayee et al. 2007). In this context, the size of the objects to be identified is the variable that determines the adequate image resolution to be used in the analysis. Automatic methods to identify surface features in karst environments found in the literature are mainly based on multi-spectral classification, digital elevation models, digital terrain models, and RGB satellite imagery, or a combination of two or more of these four resources (Slater and Brown 2000; Oliveira 2001; Denizman 2003; Lyew-Ayee et al. 2007; Kokalj and Ostir 2007; Siart et al. 2009; Basso et al. 2013; Doctor and Young 2013; de Carvalho et al. 2014; Liang et al. 2014; Bauer 2015; Al-Fugara et al. 2016; Jeanpert et al. 2016; Pardo-Igúzquiza et al. 2016; Wall et al. 2016; Wu et al. 2016; Hofierka et al. 2018; Šegina et al. 2018; Theilen-Willige et al. 2014; Verbovšek and Gabor 2019; Zumpano et al. 2019).

Aiming to identify closed depressions in the CLSPU and to contribute to the characterization of its different geomorphological features, the semi-automatic method developed by Rodrigues (2011) was modified and partially employed in this study with additional interpretation, such as manual removal of features considered outliers, attribute table manipulation, analysis of circularity index, and comparison between the results obtained in this study with those presented by IBAMA (1998). For that, several geoprocessing tools were used to optimize technical steps. This set of routines shows the advantages of quantitative methodological procedures and guarantees their reproducibility when performed by GIS professionals with experience in karst environments. Moreover, this study aims to apply and evaluate the proposed method’s performance to identify closed depressions that may correspond to sinkholes and their various forms.

Study area

Legally established in the early 1990s as a result of the need to control pressure from economic development activities (e.g., mining, pasture, and urban expansion), the CLSPU, located in the Metropolitan Region of Belo Horizonte, Minas Gerais, Brazil (Fig. 1), is considered the cradle of archeology and speleology in Brazil due to the vast number of karst features that hold large amounts of fossil records (Baeta 2011). The study area covers the entire protection unit, approximately 400 km², intersecting the municipalities of Confins, Funilândia, Lagoa Santa, Matozinhos, Pedro Leopoldo, and Prudente de Morais, and including the Tancredo Neves International Airport. The CLSPU is part of the Velhas River basin, located in the upper portion of the São Francisco River basin (IBAMA 1998).

Fig. 1

Location of the study area and the most important geological and hydrological features, originally mapped at 1:50,000. Adapted from CPRM (2003) and IBAMA (1998)

The study area climate is associated with general conditions of atmospheric circulation under the dominance of the stationary system called Subtropical Anticyclone of the South Atlantic, which presents a high degree of absolute humidity and high temperature as a result of intense incident solar radiation (IEF 2010). The protection unit altitude varies between 620 and 910 m above sea level, with an average annual temperature of 20.9 °C and average annual precipitation of 1328 mm (INMET 2014). The rainy season is well defined between October and March, and the dry season occurs between April and September. The minimum monthly average precipitation is 10 mm (June and August), and the maximum is 289 mm (January) (INMET 2014).

The vegetation cover is predominantly Cerrado (tropical savanna ecoregion), semideciduous forest, and Atlantic forest remnants. The latter consist of isolated zones of varying sizes. To a limited extent, there are areas of extensive Eucalyptus sp. and Pinus sp. cultivation, and vegetation similar to Caatinga (semiarid biome in Northeast Brazil) in zones of limestone outcrops. The environmental pressure of real-estate expansion represents a significant factor in landscape and vegetation alteration. Urbanized areas occupy approximately 10.3% of the area, or 41.35 km², which indicates that urbanization is not particularly dense; however, the region suffers from disorderly occupation in some places, in addition to limestone mining, clay and sand extraction, and agriculture (IBAMA 1998).

The study area is characterized by a high degree of karst evolution in which surface water and groundwater flows are mixed; it is therefore classified as a fluviokarst landscape. The presence of interconnected dissolution conduits and caverns indicates a developed endokarst, while the surface exhibits sinkholes, uvalas, swallow holes, blind valleys, and resurgences (IBAMA 1998). According to this same publication, the geomorphology of the Carste Lagoa Santa Environmental Protection Unit can be subdivided, depending on the diversity of the forms, into four distinct compartments: (a) gorges with high limestone walls; (b) great depression belt (uvalas); (c) small depression landscape (sinkholes); and (d) karst plains or poljes. The unconsolidated substrates are mainly composed of clayey layers, represented predominantly by latosols (45% of the total area), followed by cambisols (38.6%) and podzolic soils (15%) (IBAMA 1998; EMBRAPA 2013; Tayer 2016). These soils are well-drained, usually deep, very porous, and permeable (Shinzato 1998).

Geologically, the CLSPU is located in the southern portion of the São Francisco Craton and consists of a metasedimentary sequence of Neoproterozoic age, which forms the Bambuí Group (Costa and Branco 1961; Dardenne 1978; Schobbenhaus et al. 1984; Ribeiro et al. 2003; Tuller et al. 2009). Two formations of the Bambuí Group occur in the area (in stratigraphic sequence, from bottom to top): the Sete Lagoas Formation (composed of the Pedro Leopoldo and Lagoa Santa Members) and the Serra de Santa Helena Formation. Locally, in the area’s SW portion, gneissic Archean basement rocks of the Belo Horizonte Complex occur (CPRM 2003). The geological map is shown in Fig. 1.

The Pedro Leopoldo Member is formed by metasiltstone, phyllites, calcium phyllites, and dark gray siliceous limestone (calcium carbonate between 60 and 90%) (CPRM 2003). The Lagoa Santa Member has a higher concentration of calcite, reaching over 94% purity (IBAMA 1998), which facilitates the dissolution of limestones at the surface and in the subsurface and accelerates the formation of typical karst features, such as sinkholes and swallow holes. The Serra de Santa Helena Formation (SSHF) overlies the Sete Lagoas Formation and occurs locally, as two patches in the northeast and southeast regions of the study area. The SSHF comprises pelitic rocks such as argillaceous siltstones, subordinate sandstones, fine calcarenite, and marl lenses (CPRM 2003; Pessoa 2005; Tuller et al. 2009), and presents a lower potential for the development of sinkholes. The Tertiary and Quaternary sediments, which overlie the Bambuí Group, occur as detrital cover, alluvial terraces and alluvium in active rivers, abandoned river meanders, and low topographic areas (Ribeiro et al. 2003; Tuller et al. 2009).

Materials and methods

To identify closed depressions and, subsequently, analyze their possible classification as sinkholes, the method described in an unpublished internal report by Rodrigues (2011) was modified and partially applied using the software ArcGIS 10 (Spatial Analyst and Analysis toolboxes). The flowchart with all steps of the proposed method is illustrated in Fig. 2.

Fig. 2

Steps executed in the application of the proposed semi-automatic spatial analysis method to identify and map potential sinkholes. For cases (a) and (f), see the “Morphological analysis and validation” section

Input data

For the elevation data processing, two adjacent images from the Shuttle Radar Topography Mission (SRTM) were used as input data, located between the coordinates 19° and 20°S, and 43° and 45°W, obtained from the United States Geological Survey (USGS) image catalog (Earth Explorer). Each image covers approximately 110 km (N–S) by 105 km (E–W) at a pixel resolution of one arc-second (about 30 m). It is essential to mention that, owing to the mapping technology used by the SRTM (radar interferometry), the recorded altimetry data do not distinguish trees or buildings from the ground surface and mix them indiscriminately.
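The quoted tile dimensions can be checked with a simple spherical-Earth approximation. The sketch below is illustrative only: the mean Earth radius of 6371 km, the mid-tile latitude of 19.5° and the function name are assumptions that do not come from the study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius (spherical approximation)


def degree_extent_km(lat_deg):
    """Return (north-south, east-west) length in km of one degree at a latitude."""
    ns = math.pi * EARTH_RADIUS_KM / 180.0
    ew = ns * math.cos(math.radians(lat_deg))
    return ns, ew


# Hypothetical mid-tile latitude for a tile spanning 19-20 degrees S.
ns_km, ew_km = degree_extent_km(-19.5)

# One arc-second is 1/3600 of a degree; convert km to m.
pixel_ns_m = ns_km * 1000.0 / 3600.0
pixel_ew_m = ew_km * 1000.0 / 3600.0

print(round(ns_km, 1), round(ew_km, 1))
print(round(pixel_ns_m, 1), round(pixel_ew_m, 1))
```

Under these assumptions, a one-degree tile is roughly 111 km by 105 km and a one arc-second pixel is close to 30 m on a side, which agrees with the approximate figures given above.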

Data processing

First, the SRTM pixel data (elevation values) were converted to centroid points (shape-points), making it possible to refine the original data through Spline interpolation (an interpolation method that considers lateral tendencies in the calculation). The result was a refined image with 10 × 10 m pixels, in contrast to the original resolution of 30 × 30 m pixels. The 10 m pixel size was chosen empirically after a few tests to understand the pros and cons of different refined pixel resolutions. A higher resolution could amplify noise features when applying the method, while a lower resolution could miss smaller depressions and depressions with smooth edges. The choice of image and of its possible refinement will depend on factors such as the study area and the judgment of the GIS professional.
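The first step, converting raster cells into centroid points, can be sketched in a few lines. This is a minimal illustration and not the ArcGIS routine used in the study: the function name, the grid layout (northernmost row first) and the sample elevations are assumptions, and the Spline interpolation itself is not reproduced here.

```python
def raster_to_centroids(grid, x_min, y_max, cell_size):
    """Convert a row-major elevation grid into (x, y, z) centroid points.

    grid[0] is assumed to be the northernmost row, and (x_min, y_max) the
    upper-left corner of the raster in projected units (m).
    """
    points = []
    for row_idx, row in enumerate(grid):
        for col_idx, z in enumerate(row):
            # The centroid sits half a cell in from the cell's upper-left corner.
            x = x_min + (col_idx + 0.5) * cell_size
            y = y_max - (row_idx + 0.5) * cell_size
            points.append((x, y, z))
    return points


# Hypothetical 2 x 2 tile of 30 m cells with made-up elevations (m).
demo = raster_to_centroids([[700.0, 705.0], [698.0, 702.0]], 0.0, 60.0, 30.0)
print(demo[0])   # (15.0, 45.0, 700.0)
print(demo[-1])  # (45.0, 15.0, 702.0)
```

The resulting point set is what an interpolator would then resample onto the finer 10 m grid.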

From this new image, each pixel elevation value was converted to shape-points, and contour lines were extracted using the Contour tool found within ArcGIS Spatial Analyst Tools. The interval between contour lines was set at 5 m to enhance the topographical analysis. Note that the one arc-second SRTM used has a minimum vertical accuracy of 8.52 m (root-mean-square error) within the Brazilian territory, and the height bias has a direct correlation with slope, meaning that the highest errors occur on steep slopes and in forested areas (Orlandi et al. 2019). The study area presents overall gentle-to-moderate slopes (mean of 13.57%, with slopes of less than 15% covering 65.3% of the area) and sparse, variable-sized fragments of medium-to-densely forested areas (24.2% of the study area); hence, the adopted contour values can be considered a good representation of the topography for this study.

Method application

Following the results of the ‘data-processing’, contour lines were converted into polygons (Feature to Polygon). At this stage, obvious outliers caused by the edge effect, which generates polygons that consider the edge of the image as the natural end of a polygon resulting in extremely large and unnatural features, were manually excluded. The resulting polygons were clipped to the extent of the study area.

The Feature to Polygon tool converts contour lines into filled polygon ranges; that is, the final polygons are the fill between a given external contour and the subsequent internal contour line and do not overlap. However, the resulting polygon feature did not incorporate any information from the parent file (i.e., contour lines). Thus, to import the respective elevation values from the parent contour lines into the generated contour polygons, the Spatial Join tool was used. Then, the Join tool, based on contour ID, was used to link the Spatial Join output and the contour polygon attribute tables (see Fig. 2).

The Spatial Join tool’s output yielded the contour values of all polygon borders that intersect contour lines (both internal and external). However, for the application of the proposed method, only the external boundary (i.e., the longest contour) of each contour polygon is needed; for that, the Summary Statistics tool was used to obtain the maximum contour length for each unique contour polygon (ID). Again, the Join tool merged the output statistics values into the contour polygon feature attribute table. Then, within the contour polygon attribute table, only the records in which the maximum length column matched the contour length column were selected and exported; these records represent the outermost contour value of each filled contour polygon.
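The selection logic described above (maximum length per polygon ID, then keeping the matching record) can be expressed outside ArcGIS as a simple grouping operation. The sketch below assumes a hypothetical flat table of joined records; the function name, the tuple layout and the sample values are illustrative and not taken from the study.

```python
def outermost_contour(join_rows):
    """For each polygon ID, keep the joined contour with the greatest length.

    join_rows: iterable of (polygon_id, contour_elevation_m, contour_length_m).
    Returns {polygon_id: contour_elevation_m} for the longest contour.
    """
    best = {}
    for poly_id, elevation, length in join_rows:
        # Equivalent to a per-ID MAX(length) followed by a join on that value.
        if poly_id not in best or length > best[poly_id][1]:
            best[poly_id] = (elevation, length)
    return {poly_id: elev for poly_id, (elev, _) in best.items()}


# Made-up spatial-join output: polygon 1 touches an inner and an outer contour.
rows = [(1, 745.0, 310.0), (1, 750.0, 920.0), (2, 760.0, 400.0)]
print(outermost_contour(rows))  # {1: 750.0, 2: 760.0}
```

Note that ties in length are resolved here by keeping the first record encountered, which is an arbitrary choice for the purposes of the example.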

Subsequently, the Intersect tool was used between the elevation shape-points of the refined SRTM and the contour polygons. The individual elevation value of each DEM pixel was subtracted from the value of the surrounding contour polygon, resulting in a new field with two possibilities: positive and negative values. Positive values represent depressions and negative values represent topographic highs (or acclivities). The Summary Statistics tool was then applied to compute the average of the subtraction results located within each contour polygon. The results were imported to the contour polygon attribute table using the Join tool, and the polygons were classified as depressions or topographic highs. Polygons classified as depressions were symbolized in blue and topographic highs in red, which greatly facilitates the visual distinction between the two classes. Hypsometric colors (ArcGIS color ramp Elevation #1) with 40% transparency over hill-shading (Hillshade tool) were applied to further improve this distinction. The polygons classified as topographic highs (negative values in our analysis) were then excluded from the process.
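The sign convention used in this step can be illustrated with a short sketch: the pixel elevation is subtracted from the outer contour value, and the per-polygon mean decides the class. The function name, the data layout and the sample values are hypothetical, and treating a mean of exactly zero as a topographic high is an assumption made only for this example.

```python
from collections import defaultdict


def classify_polygons(points, contour_value):
    """Label each polygon 'depression' or 'high' from its mean difference.

    points: iterable of (polygon_id, pixel_elevation_m).
    contour_value: {polygon_id: outer_contour_elevation_m}.
    """
    diffs = defaultdict(list)
    for poly_id, z in points:
        # Contour value minus pixel elevation: positive when the pixel lies
        # below the surrounding contour.
        diffs[poly_id].append(contour_value[poly_id] - z)
    labels = {}
    for poly_id, values in diffs.items():
        mean_diff = sum(values) / len(values)
        # Assumption: a mean of exactly zero is not counted as a depression.
        labels[poly_id] = "depression" if mean_diff > 0 else "high"
    return labels


# Made-up pixels: polygon 1 lies below its 750 m contour, polygon 2 above 760 m.
pixels = [(1, 748.0), (1, 746.5), (2, 762.0), (2, 761.0)]
contours = {1: 750.0, 2: 760.0}
print(classify_polygons(pixels, contours))  # {1: 'depression', 2: 'high'}
```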

Morphological analysis and validation

A series of morphological analyses were carried out to identify patterns that could help explain the results generated by applying this method. Various types of depressions were identified and characterized as ‘special cases’, according to their morphological characteristics, including (a) depressions located within dry streambeds and flood plains; (b) undersized depressions in ‘blind spots’; (c) topographic highs within depressions; (d) depressions within topographic highs; (e) depressions influenced by tree canopy; and (f) a single depression (or topographic high) contour polygon involving a topographic high (or a depression).

The approach adopted for each one of these exceptional cases is explained next. The features identified and mapped within river channels and flood plains (a), and as a single depression (or topographic high) contour polygon involving a topographic high (or a depression) (f) were manually removed from the process using case-by-case visual analysis as criteria. Depressions that were undersized in blind spots (b), depressions within topographic highs (d), and the depressions influenced by the vegetation cover (e) were used without any other type of treatment (no exclusion of these features).

The remaining polygons, represented only by depressions, were then grouped (using the Merge function) into the outermost closed contour and separated into unique features (using the Explode function). Next, minimum and maximum area thresholds were applied as a filter for polygon selection. For the minimum area threshold, a limit of more than 500 m² was adopted. After applying the minimum area filter, the smallest identified feature had an area of 500.09 m². We suggest this minimum threshold of 500 m² for the study area empirically, based on the analysis of several individual features, which showed that a significant number of noise features were generated below this area. Hence, features smaller than 500 m² were not considered in subsequent analyses. For the maximum area threshold, the area of the largest depression identified by previous work (IBAMA 1998) was assumed, namely the Lagoa do Sumidouro sinkhole, with 4.67 km².
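The area filter can be sketched as a one-line selection. The threshold values are those quoted above; the function name and sample areas are hypothetical, and treating the upper limit as inclusive is an assumption, since the text does not state how a feature exactly equal to 4.67 km² would be handled.

```python
MIN_AREA_M2 = 500.0        # lower threshold quoted in the text (exclusive)
MAX_AREA_M2 = 4_670_000.0  # 4.67 km2 expressed in m2 (assumed inclusive)


def filter_by_area(areas_m2):
    """Keep polygon areas strictly above the minimum and up to the maximum."""
    return [a for a in areas_m2 if MIN_AREA_M2 < a <= MAX_AREA_M2]


# Made-up polygon areas in m2.
candidates = [120.0, 500.0, 500.09, 25_000.0, 5_000_000.0]
print(filter_by_area(candidates))  # [500.09, 25000.0]
```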

At this stage, depression polygons classified as the special case “topographic highs within depressions” (c) had obvious gaps caused by the exclusion of topographic highs during the method application described in the previous section. A topological rule (Must Not Have Gaps) was applied to all features, thus filling these gaps.

As a first approach toward method validation, a detailed analysis of the identified depressions was performed using high-resolution RGB images (IKONOS and Google Earth Pro) and a comparison with the hydrographic mapping produced at 1:50,000 scale by IBAMA (1998). In some cases, elevation cross-sections were also analyzed, using the 3D Analyst/Interpolate Line tool, as a visual aid to make elevation details clearer.

Once all closed depressions were consistently identified, twenty-three (23) features were randomly chosen for further validation in the field, using as main parameters, in ascending order of importance: (1) geological domain; (2) detectable edges; (3) local geomorphology; (4) type of vegetation within the evident and surrounding limits; and (5) presence or traces of water.

Comparative analysis

Complementary to the analysis of ‘special cases’, a comparative assessment was performed against the sinkhole map produced by IBAMA (1998). The comparative verification was done visually and by the use of GIS tools to inspect the overlap of the two maps, added to the search for possible unique cases. The methodology used by IBAMA (1998) for sinkhole mapping (scale 1:50,000) was based on the identification of features through aerial photography combined with the analysis of contour lines (Topographic maps: Lagoa Santa and Pedro Leopoldo in scale 1:50,000; and Baldim and Sete Lagoas in scale 1:100,000) for later manual delineation. Part of the mapped sinkholes was confirmed in subsequent field trips.

Finally, the Circularity Index (Ci), which indicates how close the shape of a polygon is to a circle, was calculated for the identified depressions. Then, patterns were identified based on a statistical and comparative evaluation of the present study results and the map produced by IBAMA (1998) to identify a geometric pattern between the features mapped in both studies.

Circularity index

The Circularity Index (Ci) is a mathematical resource for numerically defining the similarity of a feature to a perfect circle. The more circular the shape of the feature, the closer the index approaches one. Therefore, Ci’s value tends to unity as the feature approaches the perfectly circular shape and departs from one as the feature becomes elongated or irregular. The circularity index was calculated considering all features resulting from the present study and IBAMA (1998) using the following equation (Denizman 2003; Borges et al. 2004; Bauer 2015):

$$Ci = \frac{PA}{\pi \times \left( \frac{PP}{2 \times \pi} \right)^{2}}$$

where Ci = Circularity Index; PA = Polygon Area (m²); PP = Polygon Perimeter (m).
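The equation compares the polygon area with the area of a circle that has the same perimeter, and it is algebraically equivalent to Ci = 4π·PA / PP². A minimal sketch of the calculation follows; the function name and the two test shapes are illustrative and not part of the study.

```python
import math


def circularity_index(area_m2, perimeter_m):
    """Ci = PA / (pi * (PP / (2 * pi)) ** 2)."""
    # Radius of the circle whose circumference equals the polygon perimeter.
    equivalent_radius = perimeter_m / (2.0 * math.pi)
    return area_m2 / (math.pi * equivalent_radius ** 2)


# A circle of radius 10 m and a 10 m x 10 m square as reference shapes.
circle = circularity_index(math.pi * 10.0 ** 2, 2.0 * math.pi * 10.0)
square = circularity_index(100.0, 40.0)
print(round(circle, 3), round(square, 3))  # 1.0 0.785
```

A perfect circle gives Ci = 1, while the square gives π/4 (about 0.785), which shows how the index decreases as a shape departs from circularity.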

Results and discussion

Following the SRTM image refinement and subsequent extraction of contour lines at a 5 m interval, results such as those shown in Fig. 3a were obtained. Figure 3b exhibits the results of the contour line conversion to polygons. Outlier polygons were observed mainly at the edges of the image, and the unnatural features resulting from this border effect were eliminated from the process. The analysis of the polygon attributes resulted in a preliminary map of elevations and depressions, illustrated in Fig. 3c. The features classified as depressions can be seen in Fig. 3d. The depressions identified as potential sinkholes through detailed morphological analyses (visual analysis using hydrographic control, topography, cross-section elevation profiles, and high-resolution satellite imagery) are illustrated in the final map of Fig. 3e.

Fig. 3

Steps performed, results, and the final map of the potential sinkholes identified by applying the proposed method. The area in the red rectangle at the center of (e) is detailed as an example in the smaller maps on the left. a Extraction of equidistant contour lines; b contour line conversion to polygons; c preliminary map of elevations and depressions; d features classified as depressions; and e depressions identified as potential sinkholes after analysis

The proposed semi-automatic method to identify depressions using SRTM data resulted in the identification of 1076 features considered as potential sinkholes, of which 23 were confirmed in the field. Note that before limiting the minimum/maximum area, eliminating noise, excluding special cases (a) and (f), and performing manual analysis with the help of cross-section elevation profiles and high-resolution satellite imagery, the total number of depressions mapped was 1523.

The maximum boundary displacement of a sinkhole in the field compared with a depression mapped by the method was approximately 96 m, which does not disqualify the partial validation, given the difficulty of visually identifying the edges of these features in situ when the slope of the terrain is very low. Moreover, certain limiting conditions were also encountered in the field, such as poor accessibility or difficulties in acquiring licenses to enter private property, which significantly reduced the possibility of increasing the number of control points. From the analysis of the pre-selected 23 control sites composed of real sinkholes in field campaigns, we observed that all identified in situ sinkholes were also successfully detected by the method.

In addition, this method also allowed the identification of some special cases (Fig. 4), such as (a) depressions located within dry streambeds and flood plains; (b) undersized depressions in ‘blind spots’; (c) topographic highs within depressions; (d) depressions within topographic highs; (e) depressions influenced by vegetation cover (canopy effect, mentioned previously); and (f) a single depression (or topographic high) contour polygon involving a topographic high (or a depression). The coverage area and the number of features identified as special cases are presented in Table 1.

Fig. 4

Special cases found in the preliminary map analysis of closed depressions. a Depressions located within dry streambeds and flood plains; b undersized depressions in ‘blind spots’; c topographic highs within depressions; d closed depressions within topographic highs; e depressions influenced by vegetation cover; and f a single depression (or topographic high) contour polygon involving a topographic high (or a depression)

Table 1 Coverage area and number of features identified as special cases

Depressions located within dry river beds and floodplains (a) were mostly associated with blind valleys, a common feature in highly evolved fluviokarst landscapes, as they are defined as closed river basins; it is plausible that the proposed method includes this type of feature. In cases where the method erroneously classifies open watersheds as blind valleys or depressions, a higher elevation pixel downstream caused by the canopy effect or any other factor may be responsible for closing the contour polygon, leading to this result. This problem can potentially be solved using an image with a different capture method than the SRTM, such as LiDAR.

Undersized features in “blind spots” (b) occurred mostly when the elevation of the outer limits of a depression was within the minimum contour interval adopted for this routine. This is because the edges of some features may be located within the minimum range of contour considered that might not capture the actual sinkhole perimeter, thus decreasing the final feature boundaries and, consequently, its final area by excluding the higher surrounding contour. One solution to this problem would be to use a finer scale DEM, such as LiDAR or ALOS, to process an even smaller spacing of the contour lines. These cases have not been quantified given the difficulty in identifying such a situation without manually delineating and describing every feature considering other topographic sources and high-resolution RGB images.

Closed depressions within topographic highs and closed topographic highs within depressions are common features in most environments; for that reason, they were used without any further treatment (no exclusion).

To quantify sinkholes influenced by vegetation cover (e), we first classified a Sentinel-2 image (10 m pixel) using the red (Band 4) and the near-infrared (Band 8) bands to compute the Normalized Difference Vegetation Index (NDVI). The NDVI values that best represented the medium-to-dense vegetation cover of the study area were between 0.42 and 0.99, corresponding to 24.21% of the study area. Then, we selected all depressions that were completely within a vegetation polygon.
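NDVI is computed per pixel as (NIR − Red) / (NIR + Red). The sketch below applies that standard formula together with the 0.42 to 0.99 range quoted above; the function names and reflectance values are hypothetical, and returning 0.0 when both bands are zero is an assumption made only to avoid a division by zero.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: treat an empty pixel as non-vegetated
    return (nir - red) / total


def is_vegetated(red, nir, low=0.42, high=0.99):
    """True when the pixel falls in the medium-to-dense vegetation range."""
    return low <= ndvi(red, nir) <= high


# Made-up surface reflectances for a forested pixel and a bare-soil pixel.
print(round(ndvi(0.05, 0.45), 2), is_vegetated(0.05, 0.45))  # 0.8 True
print(round(ndvi(0.20, 0.25), 2), is_vegetated(0.20, 0.25))  # 0.11 False
```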

The undersized features influenced by the vegetation canopy effect (e) are mainly related to the SRTM image capture method, which uses radar interferometry. This technique is subject to many physical factors that may compromise the realism of the final product. For example, in areas with dense forest cover, canopy height appears as ground elevation, since the radar signals are scattered by vegetation before reaching the ground; therefore, SRTM data yield a digital elevation model of the canopy and ground surface together, not a digital terrain model. Thus, morphological features can be misrepresented in areas with extensive dense vegetation. For our final sinkhole map, these altered features were used without additional treatment. Still, we acknowledge that the replicability of the method can be compromised if it is applied in areas with extensive coverage of dense vegetation using the same type of imagery (SRTM).

Statistical and comparative analyses

A comparative analysis of the 1076 features identified by the method with the depressions in IBAMA (1998) found that the present study recognized almost twice as many features. The total intersection area between the two mappings was around 56%, representing 313 overlapping features, which corresponds to 55% of the total features identified by IBAMA (1998). The comparative analysis numbers are shown in Table 2. Taking into account the difference in sample size, total area, and intersection area, it can be inferred that the wider spatial distribution of the depressions identified by the method is a result of the greater refinement of the input altimetry data. One of the consequences of using SRTM data was the subdivision of large depressions mapped by IBAMA (1998) into smaller and more spatially distributed features. This also explains the large number of features mapped by the method.

Table 2 Comparative analysis between the results of this study and IBAMA (1998)

When considering the special cases identified in the first comparative analysis between the results of this study and those of IBAMA (1998), the great majority were a direct consequence of the objectivity of the method and the use of a refined elevation dataset, in contrast to the subjectivity of the method adopted by IBAMA (1998), in which the topographic dataset used was relatively less detailed and the edges of sinkholes were difficult to trace visually where the slope becomes very gentle. This disparity can be typified according to the following observed cases, as shown in Fig. 5: (a) IBAMA (1998) identified larger features than the proposed method; (b) IBAMA (1998) identified smaller features than the proposed method; (c) the proposed method identified features where IBAMA (1998) did not; (d) the features of the proposed method and of IBAMA (1998) are close but displaced; and (e) IBAMA (1998) identified false positives. We could not find any cases of false positives produced by the method developed in the present study.

Fig. 5

Special cases identified in the comparative analysis between the sinkholes mapped in the present study and those presented by IBAMA (1998). a IBAMA (1998) identified larger features than the method; b IBAMA (1998) identified smaller features than the method; c method identified features where IBAMA (1998) did not; d method and IBAMA (1998) are close but displaced, and e IBAMA (1998) identified false positives

Occasional features mapped by the method as smaller than in IBAMA (1998) could be influenced by the adopted equidistance between contour lines (5 m), as explained earlier. The cases in which IBAMA (1998) identified smaller features than the method could be explained by the difficulty of visually mapping depressions with very smooth edges. The cases identified as false positives in IBAMA (1998) were verified through elevation cross-section profiles and were essential points for in situ checking. For the other cases, the difference between mappings may be associated with differences in methodological sensitivity and in the databases used.
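As a rough illustration of how an elevation cross-section profile can flag a possible false positive, the sketch below checks whether a profile sampled across a mapped feature actually dips below its rim. The function name `profile_depth`, the depth criterion, and the elevation values are hypothetical; the source does not state which criterion was applied to the profiles.

```python
def profile_depth(elevations):
    """Depth of the lowest interior point below the lower of the two profile ends.

    elevations: elevations (m) sampled along a line that crosses the mapped
    feature from one edge to the other. A positive result suggests a closed
    depression along this profile; zero or negative suggests none.
    """
    if len(elevations) < 3:
        raise ValueError("a profile needs two end points and at least one interior point")
    rim = min(elevations[0], elevations[-1])   # lower of the two rim points
    floor = min(elevations[1:-1])              # lowest interior point
    return rim - floor


# Hypothetical profiles (values are illustrative, not from the study).
bowl = [750.0, 748.0, 744.0, 747.0, 751.0]    # dips below both ends
slope = [750.0, 748.0, 746.0, 744.0, 742.0]   # monotonic hillside

print(profile_depth(bowl), profile_depth(slope))
```

A single profile only tests closure along one direction, so in practice several profiles with different orientations would be needed before treating a feature as a false positive.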

Finally, the patterns in the Circularity Index of the depressions mapped in the present study and in IBAMA (1998) were compared, and basic statistics for both are presented in Table 3. A descriptive analysis was also performed, in which frequency distribution histograms were generated for the two studies (Fig. 6). The results showed that the mean, maximum, and median Circularity Index values are very close in both groups. The standard deviation, which is 37.7% higher in the group of sinkholes identified in this study, is also compatible with the increased quality of the altimetry data provided by SRTM (and the refined image).
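The statistics compared above can be sketched as follows. The example assumes the common definition of circularity, 4πA/P² (equal to 1 for a perfect circle), which may differ from the index used in this study; the helper names and the toy polygons are likewise illustrative and do not reproduce the values in Table 3.

```python
import math
import statistics


def circularity_index(vertices):
    """Circularity of a simple polygon, assumed here to be 4*pi*A / P**2.

    vertices: list of (x, y) tuples in order, without repeating the first point.
    """
    n = len(vertices)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1          # shoelace formula term
        perimeter += math.hypot(x2 - x1, y2 - y1)
    area = abs(twice_area) / 2.0
    return 4.0 * math.pi * area / perimeter ** 2


def summarize(values):
    """Basic descriptive statistics of a list of circularity values."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "max": max(values),
        "stdev": statistics.stdev(values),       # sample standard deviation
    }


def percent_higher(a, b):
    """How much higher a is than b, in percent of b."""
    return (a - b) / b * 100.0


# Toy polygons: a unit square and a 4 x 1 rectangle.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
rectangle = [(0, 0), (4, 0), (4, 1), (0, 1)]

values = [circularity_index(square), circularity_index(rectangle)]
print(summarize(values))
```

Calling `percent_higher` on the two groups' standard deviations gives the kind of relative difference quoted in the text, and the per-feature values are what the frequency histograms would be built from.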

Table 3 Basic statistics between the circularity indices of the features mapped by IBAMA (1998) and the present study

Fig. 6 Circularity index frequency distribution histograms for IBAMA (1998) and the present study

Conclusion

The interpretation of the method’s final output, combined with field validation, resulted in the mapping of 1076 potential sinkholes within the Carste Lagoa Santa Environmental Protection Unit. It was confirmed that the semi-automatic sinkhole identification method using DEM images, with the multi-component processing developed here, is feasible for the detection and morphometric refinement of karst features, providing an efficient tool that saves time and financial resources. The possibility of statistical analysis, greatly facilitated by geoprocessing resources, also represents a remarkable gain of information and may contribute to future geogenic studies of these depressions.

The considerable increase in the number of features mapped compared with the other method considered is compatible with the refinement of the altimetry data. Increased topographic detail also allowed better identification of small sinkholes and of sinkholes with low-slope edges, with noticeably less subjectivity. The study indicated that although the use of SRTM data identified hundreds of new features, they resemble previously mapped morphologies, confirming the convergence of results while endorsing the advantages of the method.

We conclude that this method provides another insight into sinkhole delineation with significant results, contributing to more reliable guidelines for land use and occupation by management agencies and for the conservation of karst heritage, especially sinkholes. For example, Tayer and Velásques (2017) used the results of this study as one of the parameters to evaluate aquifer vulnerability at the Carste Lagoa Santa Environmental Protection Unit.

However, we recommend further investigation of the special cases indicated in this study to detail their causes and consequences using field analysis and statistics for each case. It is also advisable to apply the presented method in other areas with preliminary sinkhole mapping, both to corroborate the results and to investigate new exceptional cases that may further refine the method. Future studies may increase the number of related field control areas and use other DEM databases (e.g., ALOS and LiDAR), thereby further expanding the benefits of the methodology presented here. It should also be noted that further automation (using the ArcGIS Query Builder routine or Python, for example) could allow even greater optimization of the sinkhole identification process.