1 Introduction

Due to rainfall shortage, groundwater is of great importance in arid and semiarid areas. Accordingly, the investigation and exploration of shallow aquifers have received more attention than the study of groundwater potentiality. Groundwater potential maps are a spatial representation of groundwater occurrence (Jha et al. 2010; Mohammadi-Behzad et al. 2019) and can be obtained by several methods. The multi-criteria decision analysis (MCDA) approach is appropriate in terms of cost and time. Therefore, many scientists have used MCDA to produce groundwater potential maps worldwide (Machiwal et al. 2011; Kumar et al. 2014; Fenta et al. 2015) and in arid regions of Tunisia (Chenini et al. 2010; Mokadem et al. 2018; Msaddek et al. 2019).

This method relies on GIS, the most widely used tool for generating groundwater potential maps by overlaying weighted thematic maps (Oh et al. 2011; Fenta et al. 2015; Zabihi et al. 2016; Chen et al. 2018). Several statistical approaches are used to determine the weighting and ranking of these maps and to better represent the real potential of the water table. Indeed, statistical approaches are becoming more frequent in multi-criteria mapping (Ozdemir 2011; Rahmati et al. 2015; Zeinivand and Nejad 2018; Kordestani et al. 2019).

In vulnerability studies, research has been conducted to improve the standard weights of the original methods (DRASTIC, GOD, GALDIT, etc.) and adapt them to specific case studies using logistic regression (LR) (Antonakos and Lambrakis 2007; Jmal et al. 2017; Douglas et al. 2018; Hagedorn et al. 2018), AHP (Karan et al. 2018; Ebrahimi et al. 2019), and FAHP (Şener and Şener 2015). Statistical approaches were also used to study groundwater quality by multiple correspondence analysis (Reis et al. 2004) and to determine the potential occurrence of springs (Corsini et al. 2009; Ozdemir 2011; Moghaddam et al. 2015).

For groundwater potential assessment, several statistical methods have been employed, such as AHP (Shekhar and Pandey 2015; Rahmati et al. 2015), frequency ratio (FR) (Oh et al. 2011; Elmahdy and Mohamed 2015; Zeinivand and Nejad 2018), weights of evidence (WOE) (Al-Abadi 2015; Chen et al. 2018; Kordestani et al. 2019), FAHP (Aryafar et al. 2013; Şener et al. 2018), and certainty factor (Razandi et al. 2015). These methods involve several factors related to topography, such as altitude, slope (value, aspect, curvature), and topographic wetness index; to soil characteristics, such as lithology, geomorphology, land use, and land cover; and to basin geomorphology, namely drainage density, fault density, distance from rivers, and distance from faults (Oh et al. 2011; Al-Abadi 2015; Şener et al. 2018).

In this context, the delineation of groundwater potential becomes an important component of aquifer management and planning. Therefore, the fuzzy AHP, FR, and WOE methods were used to map the groundwater potential of the Sfax region, located in southeastern Tunisia, using six thematic maps, namely lithology, topography, slope, fault density, drainage density, and land use. The three methods were then validated by ROC curves.

The present paper suggests three new statistical models (fuzzy AHP, FR, and WOE) to determine the groundwater potential in the Sfax region and validates their efficiency. This low-cost approach can contribute significantly to groundwater management and planning in the face of high consumption for regional agricultural needs.

2 Study area

The Sfax region is a plain located in southeastern Tunisia (Fig. 1) between latitudes 34° 10′ 16.47″ N and 35° 17′ 20.14″ N and longitudes 9° 54′ 53.35″ E and 10° 46′ 47.76″ E. The region exhibits a monotonous topography, with altitudes ranging from 0 to 250 m. Because of its arid to semiarid climate, the average annual rainfall is low, at 215 mm for the period 1963–2017. The mean annual temperature and humidity are 19.4 °C (1961–2017) and 65% (1950–2017), respectively (NIM 2018). The Sfax region has a developed stream network (DGRE 2005). However, due to the rainfall shortage, its activity is temporary, making groundwater the main resource for agricultural activities. Indeed, the major land use consists of olives, almonds, vineyards, greenhouse crops, and vegetables. Because of its economic and demographic importance, several studies have been devoted to the shallow aquifer of Sfax, addressing its vulnerability (Smida 2010; Brahim et al. 2013; Allouche et al. 2015; Gontara et al. 2016) as well as its salinization (Trabelsi 2005; Boughariou 2018). The recharge components were considered in other studies (Bouri 2008; Boughariou 2015; Ayadi 2016). Water resources are exploited by 11,884 wells, and the piezometric evolution is monitored by 96 piezometers. The aquifer is currently overexploited, which makes the study of its potential highly important. In fact, 53.65 Mm³ was extracted in 2013, whereas extraction should not exceed 39.12 Mm³ (RCAD 2014).

Fig. 1 Study area according to Ghribi (2010)

The plain is mostly covered by a sandy and sandy clay soil. The geology of Sfax region consists of Mio-Pliocene outcrops and Quaternary deposits composed of recent alluvial deposits like gravels, sands, conglomerates and silts (Bouaziz et al. 2002).

3 Methodology

The groundwater potential maps were generated using the main conditioning factors of groundwater potentiality: lithology, lineament, drainage density, altitude, slope, and land use. Because these conditioning factors vary spatially, they were given different ranking scores according to their contribution and significance to the infiltration process. Data were obtained from the Regional Commissionership of Agricultural Development of Sfax (RCAD) and processed in ArcGIS 10.3. The methodology is illustrated by the flowchart in Fig. 2. Consumption data and well locations were also collected from RCAD. The well location data set comprises 11,868 wells. Of these, 8308 (70%) were selected by the random function in GIS as a training set, while the remaining 3560 wells (30%) were used as a validation set (Fig. 3).
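The 70/30 random split described above can be sketched in a few lines. This is only an illustration in Python with NumPy, not the GIS random function used in the study; the function name `split_wells` and the seed are assumptions introduced here.

```python
import numpy as np

def split_wells(n_wells, train_fraction=0.7, seed=0):
    """Randomly split well indices into training and validation subsets."""
    rng = np.random.default_rng(seed)       # seed is arbitrary, for repeatability
    shuffled = rng.permutation(n_wells)     # random order of indices 0..n_wells-1
    n_train = round(n_wells * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# 11,868 wells, as stated in the text
train_idx, valid_idx = split_wells(11868)
print(len(train_idx), len(valid_idx))
```

With the data set size given in the text, the two subsets contain 8308 and 3560 wells, which matches the reported figures.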

Fig. 2 Flowchart of the adopted methodology

Fig. 3 Wells location data set

The obtained GWP map was classified into three classes (low, moderate, and high potential) using the quantile classification method of ArcGIS.
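As a rough illustration of quantile classification, the sketch below assigns index values to three classes of approximately equal size. It is written in Python with NumPy and is not the ArcGIS implementation; the function name `quantile_classify` and the sample values are hypothetical.

```python
import numpy as np

def quantile_classify(index_values, n_classes=3):
    """Assign each value to one of n_classes quantile classes (1 = lowest)."""
    values = np.asarray(index_values, dtype=float)
    # Interior break points, e.g. the 1/3 and 2/3 quantiles for three classes.
    probs = np.linspace(0, 1, n_classes + 1)[1:-1]
    breaks = np.quantile(values, probs)
    # right=True places a value equal to a break point in the lower class.
    classes = np.digitize(values, breaks, right=True) + 1
    return classes, breaks

# Hypothetical index values for nine pixels
gwpi_values = np.array([1.7, 2.9, 3.3, 3.6, 3.9, 4.1, 4.4, 5.2, 6.8])
classes, breaks = quantile_classify(gwpi_values)
print(classes)
```

Because the break points are taken from the data themselves, each class covers roughly the same share of the pixels, which is why the class areas reported for quantile-classified maps tend to be close to one third each.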

3.1 Groundwater conditioning factors

First, lithology is a main factor used for recharge calculation since it controls the conductivity and, therefore, water infiltration (Razandi et al. 2015). The lithology map of Sfax was obtained from RCAD (2014) and shows 11 classes of lithological units, namely clay and sand, oolitic and bioclastic limestone, limestone encrust, alluvial fan, oued deposits, phosphogypsum deposits, sandy deposits, red silty sand deposits, gypsum and silt encrust, oueds, and saline soil (Fig. 4a).

Fig. 4 Conditioning factors thematic layers: a lithology, b land use, c topography, d slope, e drainage density, and f lineament density

Second, the studied region is mainly covered by agricultural categories, reflecting an interaction between irrigation water and the aquifer. Eleven land use classes were identified: olives, palms, orchards, vineyards, cereals, field crops, greenhouses, forest, built-up area, uncultivated area, and water body (Fig. 4b). Then, altitude has the greatest importance among the factors influencing the groundwater potential because of its influence on infiltration and runoff (Şener et al. 2018). The altitude in Sfax varies between 0 and 250 m. Quantile classification led to four classes: [0–60], [60–100], [100–140], and [140–250] (Fig. 4c). Moreover, in this case study the slope parameter is represented by its value in percent. The study area was reclassified into five classes according to slope values (Allouche et al. 2015; Boughariou et al. 2015). The first class ranges from 0 to 3% and covers almost the entire study area. Such gentle slopes limit runoff and favor the highest rate of infiltration. The second class corresponds to slopes of 3–5%, while the third one covers intermediate slope values ranging from 5 to 10%. A fourth class is defined for higher slope values ranging from 10 to 15%. The fifth class covers the highest slope values, which exceed 15% (Fig. 4d). The drainage density map was classified into five intervals: [0–0.29], [0.29–0.68], [0.68–1.14], [1.14–1.17], and [1.17–3.68] (Fig. 4e). A lineament density map was produced, and four classes were obtained by quantile classification: [0–0.05], [0.05–0.1], [0.1–0.15], and [0.15–0.41] (Fig. 4f). The ranks and weights are generated by the following statistical methods.

3.2 Statistical methods

3.2.1 Fuzzy AHP (FAHP)

The AHP was first introduced by Saaty (1980) as an analytic tool for multivariate decision-making to solve unstructured problems in several fields. This hierarchical structure compares the suitability of alternatives by pairwise comparison. However, its crisp scale cannot fully reflect human thinking, in contrast to the FAHP method, which uses ranges of values called triangular fuzzy numbers (TFNs) (Şener and Şener 2015) and considers the uncertainty related to the cartography (Cheng et al. 1999). In their comparison of both methods, Özdağoğlu and Özdağoğlu (2007) provided evidence that FAHP describes decision-making more satisfactorily than the traditional AHP method.

FAHP is an extension of Saaty's AHP (Özdağoğlu and Özdağoğlu 2007; Kishore and Padmanabhan 2016) since it also relies on a pairwise comparison matrix and weight calculation, and it can handle both qualitative and quantitative data. There is therefore an analogy between the crisp and the fuzzy numbers, as shown in the comparison scale in Table 1. Moreover, FAHP is a recognized analytical method that can be employed to solve unstructured problems not only in geology but also in social and economic studies (Özdağoğlu and Özdağoğlu 2007). It has recently been used in the environmental field, particularly for vulnerability (Şener and Şener 2015) and groundwater potential (Şener et al. 2018) mapping. Several researchers (Buckley 1985; Chang 1992; Demirel et al. 2008) have proposed different fuzzy AHP calculation methods.

Table 1 Saaty scale and triangular fuzzy scale

The steps of Chang’s analysis (Chang 1996), as applied by Kahraman et al. (2004) and used by Aryafar et al. (2013), Şener and Şener (2015), and Şener et al. (2018), are as follows:

I: The value of the fuzzy synthetic extent is defined as

$$S_{i} = \sum_{j = 1}^{m} U_{k_{i}}^{j} \otimes \left[ \sum_{i = 1}^{n} \sum_{j = 1}^{m} U_{k_{i}}^{j} \right]^{ - 1}.$$
(1)

II: For two triangular fuzzy numbers (TFNs) U1 = (x1, y1, z1) and U2 = (x2, y2, z2), the degree of possibility of U2 ≥ U1 is defined as

$$V\left( U_{2} \ge U_{1} \right) = \sup_{y \ge x} \left[ \min \left( \mu_{U_{1}} \left( x \right), \mu_{U_{2}} \left( y \right) \right) \right],$$
(2)

and also expressed as:

$$V\left( U_{2} \ge U_{1} \right) = \operatorname{hgt}\left( U_{1} \cap U_{2} \right) = \mu_{U_{2}} \left( d \right)$$
(3)
$$= \begin{cases} 1 & \text{if } y_{2} \ge y_{1}, \\ 0 & \text{if } x_{1} \ge z_{2}, \\ \dfrac{x_{1} - z_{2}}{\left( y_{2} - z_{2} \right) - \left( y_{1} - x_{1} \right)} & \text{otherwise,} \end{cases}$$
(4)

where d is the ordinate of the highest intersection point between μU1 and μU2.

III: The degree of possibility for a convex fuzzy number U to be greater than l convex fuzzy numbers Ui (i = 1, 2, …, l) can be expressed as

$$\begin{aligned} V\left( U \ge U_{1}, U_{2}, \ldots, U_{l} \right) &= V\left[ \left( U \ge U_{1} \right) \text{ and } \left( U \ge U_{2} \right) \ldots \text{ and } \left( U \ge U_{l} \right) \right] \\ &= \min V\left( U \ge U_{i} \right), \quad i = 1, 2, 3, \ldots, l. \end{aligned}$$
(5)

Assuming that d′(Ai) = min V(Si ≥ Sl) for l = 1, 2, …, n; l ≠ i, the weight vector W′ is given by

$$W^{\prime} = \left( d^{\prime}\left( A_{1} \right), d^{\prime}\left( A_{2} \right), \ldots, d^{\prime}\left( A_{n} \right) \right)^{T},$$
(6)

where Ai (i = 1, 2, …, n) are the n elements.

IV: The normalized weight vectors are

$${\text{W }} = \, \left( {{\text{d}}\left( {{\text{A}}_{{1}} } \right),{\text{ d}}\left( {{\text{A}}_{{2}} } \right),...,{\text{ d}}\left( {{\text{A}}_{{\text{n}}} } \right)} \right)^{{\text{T}}}$$
(7)

where W is a non-fuzzy number.
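Steps I to IV can be sketched as follows. This is a minimal illustration of the extent analysis described above, assuming each TFN is stored as (x, y, z) and that the pairwise matrix is complete; the function names and the 3 × 3 example judgments are hypothetical and are not taken from Table 3.

```python
import numpy as np

def tfn_possibility(u2, u1):
    """Degree of possibility V(U2 >= U1) for two TFNs (x, y, z), as in Eq. 4."""
    x1, y1, z1 = u1
    x2, y2, z2 = u2
    if y2 >= y1:
        return 1.0
    if x1 >= z2:
        return 0.0
    return (x1 - z2) / ((y2 - z2) - (y1 - x1))

def chang_weights(matrix):
    """Normalized crisp weights from an n x n matrix of TFNs (shape n, n, 3)."""
    m = np.asarray(matrix, dtype=float)
    row_sums = m.sum(axis=1)            # fuzzy sum of each row, shape (n, 3)
    total = row_sums.sum(axis=0)        # fuzzy sum of all entries (x, y, z)
    # Eq. 1: multiply by the inverse of the total, i.e. (1/z, 1/y, 1/x).
    s = row_sums * (1.0 / total[::-1])
    n = len(s)
    # Eqs. 5 and 6: minimum degree of possibility against every other extent.
    d = np.array([
        min(tfn_possibility(s[i], s[k]) for k in range(n) if k != i)
        for i in range(n)
    ])
    return d / d.sum()                  # Eq. 7: normalization

# Hypothetical judgments for three factors; reciprocal of (x, y, z) is (1/z, 1/y, 1/x).
one = (1, 1, 1)
example = [
    [one,             (1, 2, 3),       (2, 3, 4)],
    [(1/3, 1/2, 1),   one,             (1, 2, 3)],
    [(1/4, 1/3, 1/2), (1/3, 1/2, 1),   one],
]
weights = chang_weights(example)
print(weights)
```

The resulting weights sum to 1 and follow the order of the judgments, with the first factor receiving the largest weight. If every degree of possibility for a factor is zero, its weight is zero, which is a known property of this procedure.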

3.2.2 Frequency ratio calculations

The FR model is a bivariate statistical technique, since it measures the probability of occurrence between two variables (Bonham-Carter 1994; Moghaddam et al. 2015). For groundwater potential mapping, the variables are the well locations and the conditioning factors. The FR is calculated from the correlation between the thematic maps and the spatial distribution of wells (Razandi et al. 2015), as follows:

$${\text{FR}} = \frac{E/F}{G/H},$$
(8)

where E is the number of pixels with groundwater wells in each class of a parameter, F is the total number of wells, G is the number of pixels in the class area of the parameter, and H is the total number of pixels.
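Equation 8 can be illustrated with a short Python sketch. The function name and the well and pixel counts below are hypothetical; they only show how the ratio behaves for a factor with three classes.

```python
import numpy as np

def frequency_ratio(wells_per_class, pixels_per_class):
    """FR per class: share of wells in the class divided by share of area (Eq. 8)."""
    wells = np.asarray(wells_per_class, dtype=float)    # E for each class
    pixels = np.asarray(pixels_per_class, dtype=float)  # G for each class
    return (wells / wells.sum()) / (pixels / pixels.sum())

# Hypothetical counts for one conditioning factor with three classes
fr = frequency_ratio([600, 300, 100], [4000, 3000, 3000])
print(fr)
```

In this made-up case, the first class holds 60% of the wells on 40% of the area, so its FR is 1.5; the second class has an FR of 1, and the third falls below 1.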

4 Weights of Evidence calculations

The WOE model is based on Bayes' rule for predicting the probability of events. This model has been used for mapping groundwater potentiality (Lee et al. 2012; Al-Abadi 2015; Daneshfar and Zeinivand 2015; Tahmassebipoor et al. 2016; Kordestani et al. 2019; Zeinivand and Nejad 2018) through the correlation between well occurrence and the conditioning factors. Unlike the arbitrary attribution of weights in other methods, the weights are calculated relative to the study area, which makes it easier to update their values when new data become available and reduces the influence of subjective opinions (Masetti et al. 2007). First, the positive (W+) and negative (W−) weights are determined. They reflect the measured association between well locations and each class of each factor.

$$W^{+} = \ln \frac{P\left( B \mid A \right)}{P\left( B \mid A^{\prime} \right)}$$
(9)
$$W^{-} = \ln \frac{P\left( B^{\prime} \mid A \right)}{P\left( B^{\prime} \mid A^{\prime} \right)}$$
(10)

where P is the probability, B: existence of a conditioning factor, B′: absence of a conditioning factor, A: existence of a well, and A′: absence of a well.

Then, the contrast (C) is obtained by computing the difference between W+ and W−.

When the value of C is zero, the considered class is insignificant for the analysis; a positive value of C indicates a positive spatial correlation, and a negative value indicates a negative one (Tahmassebipoor et al. 2016).

Also, the standard deviation S(C) is expressed as follows:

$$S\;\left( C \right) = \sqrt {\left( {\left( {S^{2} \left( {W^{ + } } \right) + S^{2} \left( {W^{ - } } \right)} \right)} \right)}$$
(11)

where S² is the variance of the influencing parameters’ weights (Zeinivand and Nejad 2018) and N{·} denotes the number of pixels in the indicated overlap:

$$S^{2}\left( W^{+} \right) = \frac{1}{N\left\{ B \cap A \right\}} + \frac{1}{N\left\{ B \cap A^{\prime} \right\}}$$
(12)
$$S^{2}\left( W^{-} \right) = \frac{1}{N\left\{ B^{\prime} \cap A \right\}} + \frac{1}{N\left\{ B^{\prime} \cap A^{\prime} \right\}}.$$
(13)

The studentized contrast τ is a measure of confidence expressed as follows:

$$\tau = \frac{C}{S\left( C \right)}.$$
(14)

After overlaying the conditioning factor maps with the training well location map, the values of W+, W−, C, S(C), and τ were calculated.
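The chain of WOE quantities for a single class can be sketched from four pixel counts. This is an illustrative Python version under the standard definitions used above; the function name, argument names, and counts are hypothetical, and the formulas are undefined when any of the four overlap counts is zero.

```python
import math

def weights_of_evidence(n_ba, n_b, n_a, n_total):
    """Return W+, W-, C, S(C) and tau for one class of one factor.

    n_ba: pixels in the class that contain a well (B and A)
    n_b: pixels in the class (B); n_a: pixels with a well (A)
    n_total: all pixels in the study area
    """
    n_b_na = n_b - n_ba                  # B and A'
    n_nb_a = n_a - n_ba                  # B' and A
    n_nb_na = n_total - n_b - n_nb_a     # B' and A'
    n_na = n_total - n_a                 # A'
    w_plus = math.log((n_ba / n_a) / (n_b_na / n_na))      # Eq. 9
    w_minus = math.log((n_nb_a / n_a) / (n_nb_na / n_na))  # Eq. 10
    contrast = w_plus - w_minus
    # Eqs. 11 to 13: standard deviation of the contrast
    s_c = math.sqrt(1 / n_ba + 1 / n_b_na + 1 / n_nb_a + 1 / n_nb_na)
    return w_plus, w_minus, contrast, s_c, contrast / s_c  # last item: Eq. 14

# Hypothetical counts: 60 of 100 well pixels fall in a class of 400 pixels out of 1000
w_plus, w_minus, contrast, s_c, tau = weights_of_evidence(60, 400, 100, 1000)
print(round(w_plus, 3), round(w_minus, 3), round(contrast, 3), round(tau, 2))
```

Here the class is over-represented among wells, so W+ is positive, W− is negative, and both the contrast and τ are positive.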

5 Results and discussion

5.1 Groundwater potential maps

5.1.1 Application of the FAHP

To establish the FAHP groundwater potential map, the thematic maps of the conditioning factors were classified and ranks were attributed to each class using the Saaty scale (Table 2). The ranks of the lineament classes increase with the numerical values of the thematic map, since the presence of lineaments favors water infiltration. The lowest class (0–0.05) has a rank of 1; the moderate classes (0.05–0.1) and (0.1–0.15) have ranks of 3 and 5, respectively; and the highest class (0.15–0.41) has a rank of 7. For the slope classes, the rating decreases as the slope increases: the highest rank (7) is attributed to the lowest slope values (0–3%), whereas the lowest ranks are given to the highest slope values. The topography classes also have decreasing ranks: the highest rank (7) is given to the lowest altitudes (0–60), ranks 5 and 3 to the moderate classes (60–100) and (100–140), and the lowest rank (1) to the (140–250) class. The rating of the lithology classes is based on the conductivity of the lithological units. Therefore, the highest rank (7) is assigned to sandy deposits, alluvial fan, oueds, and oued deposits; rank 5 to clay and sand, red silty sand deposits, and oolitic and bioclastic limestone; a low rank (3) to limestone encrust, saline soil, and gypsum and silt encrust; and the lowest rank (1) to the phosphogypsum deposits. The land use classes are rated according to the infiltration capacity of each class. The highest rank (9) is for the water system class; moderate ratings were attributed according to the irrigation rates: 7 for greenhouses; 5 for cereals and field crops; and 3 for forests, olives, orchards, and vineyards. The lowest values are 2 for the uncultivated area and 1 for sebkha and built-up areas.

Table 2 Assigned rating of different classes of the conditioning factor

The thematic parameters were compared using fuzzy numbers. The pairwise comparison matrix was filled in on the basis of a literature review (Table 3), and the normalized weights of the conditioning factors were calculated. As a result, the highest weights are attributed to the lithology and drainage density parameters (0.24 and 0.23, respectively). The weights of the lineament density and land use thematic maps are moderate (0.19 and 0.15, respectively), while the lowest weights are obtained for the slope and topography maps (0.10 and 0.09, respectively).

Table 3  Pairwise comparison matrix for the FAHP process

The GWPI is defined as a dimensionless quantity used to delineate groundwater potential (Razandi et al. 2015). Once the weights are normalized, the groundwater potential index (GWPI) map can be calculated as in Eq. 15 and generated by overlaying the thematic layers according to their FAHP weights (Fig. 5).

$${\text{GWPI }} = {\text{ W}}_{{{\text{DD}}}} *{\text{DD}} + {\text{ W}}_{{\text{L}}} *{\text{L}} + {\text{W}}_{{{\text{Lith}}}} *{\text{Lith}} + {\text{ W}}_{{\text{S}}} *{\text{S}} + {\text{W}}_{{\text{T}}} *{\text{T}} + {\text{W}}_{{{\text{LU}}}} *{\text{LU}}$$
(15)

where DD is the drainage density, L: lineaments, Lith: lithology, S: slope, T: topography, LU: land use, and W: the corresponding normalized weight.
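The weighted overlay of Eq. 15 can be sketched on small arrays. The weights are those reported in the text; the dictionary keys, the function name, and the 2 × 2 rank rasters are hypothetical stand-ins for the classified thematic layers.

```python
import numpy as np

# FAHP weights as reported in the text; the key names are illustrative only.
FAHP_WEIGHTS = {
    "lithology": 0.24, "drainage_density": 0.23, "lineament": 0.19,
    "land_use": 0.15, "slope": 0.10, "topography": 0.09,
}

def weighted_overlay(rank_layers, weights=FAHP_WEIGHTS):
    """Eq. 15: per-pixel sum of weight times rank over all thematic layers."""
    weighted = [weights[name] * np.asarray(rank_layers[name], dtype=float)
                for name in weights]
    return np.sum(weighted, axis=0)

# Hypothetical 2 x 2 rasters of class ranks (1 to 7)
ranks = {
    "lithology":        [[7, 5], [3, 1]],
    "drainage_density": [[7, 5], [3, 1]],
    "lineament":        [[7, 1], [3, 5]],
    "land_use":         [[7, 3], [5, 1]],
    "slope":            [[7, 7], [7, 7]],
    "topography":       [[7, 5], [3, 1]],
}
gwpi_map = weighted_overlay(ranks)
print(gwpi_map)
```

Since the six weights sum to 1, a pixel ranked 7 in every layer receives an index of 7, so the index stays on the same scale as the ranks.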

Fig. 5 Groundwater potential map using the FAHP model

The groundwater potential map of the FAHP model was obtained from the weighted overlay of the thematic maps, regardless of well occurrence. The groundwater potential index was first generated for every pixel in the study area; the map was then classified by the quantile method into three classes: low [1.68–3.47], moderate [3.47–4.16], and high potential [4.16–7]. Several researchers (Nampak et al. 2014; Tehrany et al. 2014; Rahmati et al. 2016) have shown that the quantile classification method is efficient for groundwater potential mapping, considering it a good classifier that yields the most accurate distribution of wells. The high groundwater potential zones are mostly located in the coastal area, where the lithology favors infiltration, the altitude is low, and the slope is gentle. The low potential class occurs in the central and southeastern parts, where drainage density and altitude are high. The three classes of the GWP map cover almost equal areas: 34.2% for the moderate class, followed by the low and high classes with 33.4% and 32.4%, respectively.

5.1.2 Application of the FR

Unlike the FAHP, the ratings of the classes are not given according to the properties of the conditioning factor but according to the spatial occurrence of the wells in each class. The frequency ratios were therefore calculated for all the conditioning factors, as shown in Table 4. Finally, the groundwater potential index (GWPI) was calculated (Eq. 16) and mapped on the basis of the FR values (Fig. 6).

$${\text{GWPI}} = {\text{FR}}_{1} + {\text{FR}}_{2} + \cdots + {\text{FR}}_{n}$$
(16)

where FR is the final weight.

Table 4 Spatial relationship between the conditioning factors and wells locations using FR and WOE models

Fig. 6 Groundwater potential map using the FR model

FR values can be interpreted on the basis of the correlation between the presence of wells and the conditioning factors. Since the average value is 1, greater values indicate a higher correlation and a high groundwater potential, while lower values indicate a lower correlation and a lower groundwater potential (Manap et al. 2014). The analysis of FR between well occurrence and the slope thematic map shows that the lowest slope class (0–3%) has the highest FR value (1.02), followed by the class above 15% (0.2), the 5–10% class (0.17), and the 3–5% class (0.11), while the 10–15% class has an FR value of 0. Accordingly, wells are most abundant in the coastal area, where the gentle slope eases infiltration, reflecting a higher potential of the aquifer. In the case of topography, the (0–60) class coincides with the highest FR value (2.7), which reflects a strong groundwater potential in the coastal area; the potential becomes moderate in the (60–100) interval, with an FR close to 1 (0.8). The lowest FR values, in the high-altitude classes (100–140) and (140–250), indicate a low groundwater potential. The lineament class (0.15–0.41) has the highest value (1.47); for the other classes, FR values are close to 1, reflecting a low contribution of the lineament factor to groundwater potentiality and consequently to well locations. The FR values of the lithology classes show a low occurrence of wells in sandy deposits, gypsum and silt encrust, and oolitic and bioclastic limestone (0.1, 0.21, and 0.22, respectively). However, the highest FR values are found in clay and sand, limestone encrust, alluvial fan, and phosphogypsum deposits. For the drainage density, FR values decrease as the density increases, from 2.25 in the low-density class (0–0.29) to 0.44 in the highest class (1.17–3.68).
This value is explained by the necessity of irrigation in this agricultural activity, which is also noticed for the orchards (FR = 1.6) and field crops (1.18). The GWP map obtained by the FR model was also classified into three classes using the quantile method. The high GWP zones are located in the coastal area, while the low potential class is located in the central part. In terms of spatial distribution, the low potential class covers only 31% of the study area, while the moderate class covers 36% and the high potential class 33%.

5.1.3 Application of the WOE

The groundwater potential index (GWPI) was calculated (Eq. 17) and mapped on the basis of τ values (Fig. 7).

$${\text{GWPI}} = \tau_{1} + \tau_{2} + \cdots + \tau_{n}$$
(17)

where τ is the final weight for the WOE model.

Fig. 7 Groundwater potential map using the WOE model

The WOE model is built on the strength of the spatial association with the set of known well locations, through the calculation of W+, W−, C, and τ, the studentized value of C (Table 4), which reflects the relative certainty of the posterior probability (Ghorbani Nejad et al. 2017).

According to the τ values, the slope class (0–3%) has the highest τ value, indicating a high potential for flat slopes. The lithology thematic map has only two classes with a positive value of τ (clay and sand, and phosphogypsum deposits). Likewise, for the land use parameter, both the greenhouse and orchards classes have a positive studentized value of C. For topography, the highest value of τ is given by the lowest altitude class, in the coastal zone. The lowest drainage density class also has a high value of τ, while for the lineament density the highest τ value is found in the highest class.

For the WOE model, the quantile method was also adopted to classify the groundwater potential map into three classes: low, moderate, and high. According to the produced map, the high GWP zones are situated mostly in the coastal area. The high and moderate potential classes cover equal areas (33% each). The low potential class is slightly larger, with 34% of the area.

5.2 Validation models

The distribution of wells among the classes of the GWP maps is an indicator of the efficiency of the adopted methods, since a high potential area is expected to contain the largest number of wells. Therefore, the percentage of wells was calculated for each class (Fig. 8). In the case of FAHP, the distribution shows a clear contrast: the high potential class, which covers 32.4% of the area, contains 54.2% of the wells, while the moderate and low classes contain 28.6% and 17.2% of the wells, respectively. This distribution is significant and demonstrates coherence between groundwater potentiality and well occurrence. The contrast is even higher for the FR and WOE models, since for both of them the high potential class contains 66% of the wells in 33% of the study area. The difference between these two methods lies in the distribution of wells between the low and moderate classes. The moderate groundwater potential class contains 24% and 23% of the wells for the FR and WOE models, respectively (Table 5). According to these findings, FR seems to be the best method to determine GWP.
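As a quick check of this contrast, the share of wells in the high potential class can be divided by the share of area that the class occupies, using the percentages reported in the text. The ratio R is introduced here only for illustration and is not part of the original method.

$$R_{\text{FAHP}} = \frac{54.2}{32.4} \approx 1.67, \qquad R_{\text{FR}} = R_{\text{WOE}} = \frac{66}{33} = 2.0$$

A value above 1 means that the class holds more wells than its area alone would suggest.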

Fig. 8 Wells location on groundwater potential maps

Table 5 Wells and area distribution of groundwater potential zones for FAHP, FR, and WOE

To deal with the uncertainty of incomplete data, validation becomes essential to avoid erroneous conclusions (Kura et al. 2015).

Since validation is a principal step of modeling (Razandi et al. 2015), receiver operating characteristic (ROC) curves were used to further validate the obtained groundwater potential maps and to assess the accuracy of the three models (Moghaddam et al. 2015; Miraki et al. 2019).

ROC curves are obtained from a confusion matrix of possible outcomes: true positive (TP) for the correct occurrence of wells, true negative (TN) for the correct non-occurrence of wells, false positive (FP) for the incorrect occurrence of wells, and false negative (FN) for the incorrect non-occurrence of wells (Miraki et al. 2019). Then, the sensitivity and the specificity were calculated using Eqs. 18 and 19. To obtain the ROC curves, sensitivity is plotted against 1 − specificity (the false positive rate).

$${\text{Sensitivity }} = \frac{TP}{{TP + FN}}$$
(18)
$${\text{Specificity }} = \frac{TN}{{TN + FP}}.$$
(19)

The area under the ROC curve (AUC) provides a quantitative measure of the prediction precision (Rahmati et al. 2016). AUC values range between 0 and 1 and are generally expressed as a percentage. The prediction accuracy increases with the AUC value. According to Yesilnacar and Topal (2005), the prediction accuracy can be considered low (0.5–0.6), moderate (0.6–0.7), good (0.7–0.8), very good (0.8–0.9), or excellent (0.9–1) (Rahmati et al. 2016; Tahmassebipoor et al. 2016; Miraki et al. 2019).
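The construction of a ROC curve and its AUC can be sketched as follows. This Python example is an illustration under simple assumptions (one potential score per validation location and a binary label for well presence); the function names and the sample scores are hypothetical, and the accuracy labels follow the ranges quoted above.

```python
import numpy as np

def roc_points(scores, labels):
    """Sensitivity and 1 - specificity for every threshold on the scores."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    fpr, tpr = [0.0], [0.0]
    for t in np.unique(scores)[::-1]:            # thresholds from high to low
        predicted = scores >= t
        tp = np.sum(predicted & labels)
        fn = np.sum(~predicted & labels)
        tn = np.sum(~predicted & ~labels)
        fp = np.sum(predicted & ~labels)
        tpr.append(tp / (tp + fn))               # sensitivity, Eq. 18
        fpr.append(1.0 - tn / (tn + fp))         # 1 - specificity, Eq. 19
    return np.array(fpr), np.array(tpr)

def area_under_curve(fpr, tpr):
    """AUC by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

def accuracy_class(auc_value):
    """Qualitative label for an AUC value; values below 0.5 also return 'low'."""
    for upper, label in [(0.6, "low"), (0.7, "moderate"),
                         (0.8, "good"), (0.9, "very good")]:
        if auc_value < upper:
            return label
    return "excellent"

# Hypothetical scores and well labels (1 = well present)
fpr, tpr = roc_points([0.9, 0.8, 0.7, 0.6, 0.4, 0.3], [1, 1, 0, 1, 0, 0])
auc_value = area_under_curve(fpr, tpr)
print(round(auc_value, 3), accuracy_class(auc_value))
```

Applied to the AUC values reported in this study, `accuracy_class(0.651)` gives "moderate" and `accuracy_class(0.714)` gives "good".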

Figure 9 shows the ROC curves of the groundwater potential maps generated by the FAHP, FR, and WOE models. The WOE model result (AUC = 71.4%) is very close to that of the FR model (71.1%), and both are more accurate than the FAHP model (AUC = 65.1%). The FR and WOE methods therefore have better validation results than the FAHP model. It might seem that the fuzzy analytical hierarchy process takes into account the spatial distribution of the study area characteristics; however, the two other methods (FR and WOE) are indirectly linked to the study area properties, since the creation of each well requires preliminary geophysical and hydrogeological prospection. Therefore, these statistical methods are highly recommended for groundwater potential assessment, especially since their calculation methods are practical and require only consumption data, while the FAHP method is recommended when the number of wells is small and their spatial distribution is random.

Fig. 9 Receiver operating characteristics (ROC) curves calculated for FAHP, FR, and WOE

Besides their statistical similarity, the obtained maps also have similar zoning, since the three classes of groundwater potential have the same spatial distribution, which corroborates the validation results. The groundwater potential maps are also spatially similar to the topography map, which could be explained by the importance of recharge at low altitude, where infiltration prevails over runoff.

Understanding the groundwater potential and its close relationship with the spatial distribution of extraction wells encourages further study of aquifer management toward sustainable and practical exploitation strategies. The groundwater resources of Sfax, being essential for agricultural activities, should be protected from over-exploitation and climate change (Boughariou et al. 2018). Indeed, the high potential coastal zones are considered endangered zones, threatened not only by over-exploitation but also by seawater intrusion (Trabelsi et al. 2005; Boughariou et al. 2018). Intervention is needed there, starting with the prohibition of well creation in these areas and their designation as protected perimeters by the RCAD. It is also recommended to monitor the pumping rate in these zones as a precautionary step, together with water and soil conservation works and the planning of irrigated perimeters. To this end, some innovative methods could be employed, such as modern portfolio theory, which was applied as a planning paradigm based on vulnerability studies and possible climate change scenarios (Hua et al. 2015).

6 Conclusion

Three models (FAHP, FR, and WOE) were used for groundwater potential zone delineation using GIS. The application of each model requires the classification of six conditioning factor thematic maps as essential input parameters. Then, weights were calculated and the GWP map was constructed on the basis of the obtained groundwater potential index values. The FAHP method is based on the influence of the conditioning factors on the study area, as derived from the literature and expert knowledge. In contrast, the FR and WOE models require the distribution of wells to calculate the weights of each class.

The obtained maps were divided into three zones, representing the low, moderate, and high potential classes obtained by the quantile classification method.

The results are acceptable for the FAHP model, since the high potential class, covering 32.4% of the area, includes 54% of the total groundwater wells. The results are better for both the FR and WOE models, where the high potential class, covering 33% of the area, includes 66% of the total groundwater wells. For the three adopted models, the high groundwater potential is mainly located in the coastal area, in the eastern part of the Sfax region.

Moreover, ROC curves were established and the areas under the curves were calculated. The accuracy of the FAHP model is 65.1%, while the AUC value is higher for the FR model, at 71.1%. The best validation result was obtained for the WOE model, with an AUC value of 71.4%. These findings suggest that the three models are suitable for groundwater potential delineation in arid and semiarid regions.

This study is important not only for water resource management; it could also contribute to regional land use planning, future well creation, and groundwater protection, since the shallow aquifer of Sfax is over-exploited, particularly in the coastal area. Thus, the application of statistical methods is considered a good decision support tool for water exploitation planning and water management in phreatic aquifers. In addition, some innovative methods could be employed to better manage water resources in the face of over-exploitation and climate change scenarios.