Introduction

Mapping of undiscovered, highly favorable landscapes where the sought deposit-type likely exists is a sophisticated procedure in regional-scale mineral exploration. It involves simultaneous consideration of multiple geoscience spatial datasets (e.g., geochemical, geophysical and geological) (e.g., Carranza 2008). Mineral prospectivity mapping (MPM), a process for delineating targets for exploration of the sought deposit-type, integrates spatial evidence layers such as stream sediment geochemical signatures, geological-structural evidence and surface outcrops of hydrothermal alterations (e.g., Bonham-Carter 1994; Harris et al. 2001, 2008; Zuo and Carranza 2011). For this purpose, geospatial datasets should be initially compiled and analyzed in order to select and prepare spatial evidence layers in a geographical information system (GIS) (e.g., Zuo et al. 2009; McCuaig et al. 2010; Gao et al. 2016). In other words, the following steps are essential in MPM (cf. Bonham-Carter 1994; Ghezelbash et al. 2019a): (1) recognition of exploration criteria according to a conceptual model of prospectivity for the sought deposit-type; (2) incorporating objective weights into spatial evidence layers; and (3) employing robust numerical techniques for producing a predictive model of mineral prospectivity. Thus, MPM can be deemed a multiple criteria decision-making (MCDM) problem (Abedi and Norouzi 2016; Ghezelbash et al. 2019b), because in MCDM procedures several diverse exploratory attributes in spatial evidence layers are integrated to generate a prospectivity model for a certain deposit-type. MPM techniques are generally classified into data-driven and knowledge-driven methods (Nykänen et al. 2008; Carranza 2017).

Three main steps are involved in data-driven MPM, namely (Oh and Lee 2010; Joly et al. 2012; Carranza and Laborte 2015): (1) identification and selection of training sites; (2) generation of a predictive model of mineral prospectivity; and (3) evaluation of the success rate of the predictive model. In the first step, the training sites (locations of deposits and non-deposits) are selected with the tacit assumption that deposit locations have features that are strongly similar to, if not the same as, deposit-type features, whereas non-deposit locations have features that are completely dissimilar to deposit-type features. In the second step, quantitative relationships between training data and individual spatial evidence layers are established in order to create a predictive map of mineral prospectivity. In the third step, the predictive map is evaluated in terms of goodness-of-fit with training deposit locations. Because they rely on known deposit locations, such techniques are suitable for well-explored areas (Lewkowski et al. 2010; Parsa et al. 2017; Ghezelbash et al. 2019a). Examples of these methods that have been used for MPM are weights-of-evidence (Bonham-Carter and Agterberg 1990), logistic regression (Carranza and Hale 2001), artificial neural networks (Brown et al. 2000; Ghezelbash et al. 2019a), support vector machines (SVMs) (Zuo and Carranza 2011; Ghezelbash et al. 2019a), Bayesian classifiers (Porwal et al. 2006) and random forests (RF) (Carranza and Laborte 2016; Parsa et al. 2018). Despite the many advantages of MPM, data-driven methods are subject to exploration biases and limitations due to accessibility factors (Hronsky and Kreuzer 2019) as well as targeting criteria, because the spatial characteristics of known mineral deposits/occurrences are utilized as the training dataset. Therefore, data-driven MPM methods, being supervised, are influenced by the locations of known mineral occurrences.

Knowledge-driven methods rely on the experience and expertise of geoscientists and their judgment of geospatial relations among evidence layers (i.e., exploration criteria) and known mineral deposits. These methods are suitable for under-explored or less-explored regions (Carranza 2011). There are many practical methods in this category, in which the function parameters are estimated conceptually according to knowledge and experience of geoscientists about mineralization controls or mineral systems (cf. Bonham-Carter 1994; Carranza 2008; Wang 2008; Ghezelbash and Maghsoudi 2018; Ghezelbash et al. 2019a). Despite their widespread application in MPM, these methods suffer from systemic exploration bias and uncertainty, which arise from over- or underestimation of arbitrary weights of spatial evidence layers based on expert judgment (Ghezelbash et al. 2019b). MCDM techniques are often applied in the form of knowledge-driven MPM. Recently, Ghezelbash et al. (2019b) proposed an improved data-driven MCDM technique for mapping of porphyry-Cu prospectivity in the Varzaghan District, NW Iran, by assigning data-driven (and, thus, not arbitrary) weights to spatial evidence layers and to their discretized classes, considering the locations of porphyry-Cu deposits in the study area and applying prediction-area (P-A) plot and normalized density function, respectively.

Concerning the nature of prediction, MPM as a predictive tool is mostly beset with prediction uncertainty, which must be modulated to obtain precise and reliable outcomes (Carranza et al. 2008; Kreuzer et al. 2008). Different factors such as fallacious selection of targeting criteria, unsuitable exploration datasets and inappropriate methodology may lead to prediction uncertainties, which are divided into two groups, known as stochastic and systemic uncertainties. Stochastic uncertainty is the result of inherent properties of a dataset and usually arises from inefficacious and inadequate exploration data used for MPM (Lisitsin et al. 2013; Ghezelbash et al. 2020a). Conversely, inaccurate selection of targeting criteria, sensitivity of the predictive model to inefficient evidence layers and unsuitable selection or application of numerical methods for establishing the interrelations between geospatial properties and known mineral occurrence locations may propagate systemic uncertainties to MPM (Pirajno 2012).

From another perspective, MPM is a classification problem because every location of the area of interest needs to be categorized into favorable or non-favorable classes (Zuo and Carranza 2011; Parsa et al. 2018). Machine learning algorithms are known as very efficient classification tools that provide sensible solutions to MPM (Porwal et al. 2003). Two main types of task are considered in machine learning: supervised and unsupervised. The main difference between these two types is that the former is done using a “ground truth.” In other words, there is prior knowledge about what the output for predictor variables (here spatial evidence layers) should be. Thus, the main aim of supervised learning is to train a function using a training set of deposit locations and non-deposit locations, by establishing the relationship between input vectors and output targets (Sun et al. 2019; Chen et al. 2019; Ghezelbash et al. 2019a). In contrast, the main aim of unsupervised learning is to infer the natural structure inherent to a dataset, which classifies the favorability of the area under investigation based solely on the statistical features of spatial evidence layers (Daviran et al. 2020; Ghezelbash et al. 2020b). In the last decade, machine learning algorithms have been applied extensively to MPM for supervised data-driven classification (Rodriguez-Galiano et al. 2015; Carranza and Laborte 2016; Daviran et al. 2021). The principal task of machine learning algorithms in MPM is to approximate precisely the relationships between spatial evidence layers and known mineral deposit occurrences, because these relationships are complex and nonlinear. In addition, machine learning algorithms tend to be more successful when the dimensionality of the input feature space is high.

SVM (Vapnik 1998) is one of the most well-known supervised machine learning algorithms. It is a discriminative classifier defined by separating hyperplanes. Indeed, for labeled training data features, SVM outputs an optimal hyperplane that is able to classify new feature vectors in a supervised way (Suykens and Vandewalle 1999). The learning procedure in SVM is done by using kernel functions. The type of kernel function and its relevant parameters are vital in deriving suitable results. The most common kernel functions used in SVM are linear function, radial basis function (RBF) and sigmoid function.

The main objectives of this study were twofold. Firstly, to introduce an improved data-driven MCDM technique, called data-driven simple additive weight (data-driven SAW) for MPM, whereby locations of known mineral deposit occurrences are considered to assign objective weights to exploration criteria (i.e., spatial evidence layers) and associated sub-criteria (discretized classes) using P-A plots and the frequency-ratio (FR) method, respectively. Secondly, to apply an SVM with RBF kernel as a supervised data-driven classification method to generate another predictive model of mineral prospectivity. The basic condition for implementing supervised data-driven classification methods (e.g., SVM) is that the known mineral occurrences (or deposits) of the type sought in the area under investigation must have genetically similar features. There are many epithermal vein-type Cu-Au deposit occurrences with roughly similar features in the study area. Thus, these multi-attribute deposit features, which are used as training data for machine learning algorithms (e.g., SVM), can provide suitable conditions for classification of the study area into favorable or non-favorable. There were three main reasons for using SVM in this study (cf. Suykens and Vandewalle 1999). Firstly, unlike many other machine learning algorithms (e.g., ANNs), SVM has a regularization parameter (λ), which makes it less prone to over-fitting. Secondly, while ANNs may suffer from multiple local minima, the solution of an SVM is global and unique. Thirdly, SVMs use the kernel trick, so expert knowledge about the problem can be incorporated by engineering the kernel. Besides, SVM with RBF kernel usually performs quite well compared to other kernel functions (Zuo and Carranza 2011; Han et al. 2012). To reach the above-mentioned goals, we used spatial evidence layers that are genetically and spatially relevant to Cu-Au deposits within the Moalleman District, NE Iran. 
These spatial evidence layers include: (1) multi-element geochemical signatures derived from principal component analysis (PCA); (2) proximity to host rocks of mineralization (Eocene volcano sedimentary units); (3) proximity to N, E, NW and NE-trending faults and fault density; and (4) proximity to hydrothermal alterations.

Geological Setting

The study region is located in NE Iran, within the 1:100,000 scale quadrangle map of Moalleman (Fig. 1) (Eshraghi and Jalali 2006). The district covers approximately 1800 km². In terms of structural geology, the district is located in the Central Iran zone (Fig. 1). The northern part of the region is called the Torud-Chah Shirin belt, which is situated as a part of the Great Kavir block between the principal Torud sinistral and Anjilow dextral strike-slip faults (Fig. 2) (Hushmandzadeh et al. 1978). The Torud-Chah Shirin volcano-plutonic complex extends more than 10 km in width and 100 km in length along a NE-SW belt. The oldest lithological units in this area are represented by metamorphosed Precambrian basement rocks such as gneisses, amphibolites and mica schists, which are covered by Paleozoic and Mesozoic metamorphic sedimentary sequences and Tertiary volcano-plutonic rock units (Hushmandzadeh et al. 1978).

Figure 1

Location of the study area in NE Iran

Figure 2

Simplified geological map of Moalleman 1:100,000 scale sheet (Modified after Eshraghi and Jalali 2006)

The Eocene–Oligocene volcano-plutonic assemblage is the broadest lithological unit in the belt, which consists of middle Eocene tuff, shale, marl and sandstone, middle-to-upper Eocene andesite and dacite and Oligocene intrusive rocks (Hushmandzadeh et al. 1978; Zolfaghari 1998; Kohansal 1998). The Torud-Chah Shirin volcano-plutonic complex hosts numerous mineral occurrences and some abandoned mines, such as Gandy Au (Ag + Pb + Zn + Cu), Cheshmeh Hafez Pb + Zn + Cu (Au), Chalu Cu (Au), Chah Messi (Cu), Pousideh (Cu), Abolhassani Pb + Zn + Cu (Au), Zeresh Koh (Cu) and Baghu-Darestan Au (Cu).

The main host rocks of Gandy deposit are middle-to-upper Eocene volcanic, volcano-clastic and terrigenous sedimentary rocks (Fard et al. 2006). The Baghu-Darestan gold deposit consists dominantly of Eocene intermediate to acidic lava flows of basaltic andesite, andesite, trachyandesite, and dacite; and volcanic breccias and sub-volcanic intrusions, such as micro-quartz diorite, quartz monzodiorite, micro-granodiorite and micro-granite, which are cut by several dykes (Rashidnejad-Omran 1992; Niroomand et al. 2018). Andesite and basaltic andesite lavas in Cheshmeh Hafez area and trachyandesite of basalt in Chalu region host hydrothermal mineralization in these areas (Mehrabi and Siani 2012).

The intrusion-related copper- and gold-bearing epithermal veins, quartz-base metal veins and associated gold placers in ancient times at the Baghu-Darestan mine are typical styles of mineralization throughout this Tertiary volcano-plutonic complex. As an example, mineralization at Gandy has occurred in quartz sulfide veins and breccias, consisting mainly of carbonate minerals, quartz, barite, galena, sphalerite, pyrite and chalcopyrite (Shamanian et al. 2004). Middle to possibly late Eocene was the zenith of magmatic activity, which has been split into two sets (Shamanian et al. 2004). Firstly, Eocene volcano-clastic rocks comprising andesite, andesite-basalt, trachyte, basalt, dacite and rhyolite with intercalated tuff strata, sandstone, siltstone and conglomerate among them. Secondly, late Eocene-early Oligocene shallow and dome-shaped intrusive bodies that consist primarily of andesite, andesite-dacite and diorite porphyry compositions. Ore fluids mainly produced distinct quartz ± sulfide veins and veinlets that cross-cut different types of country rocks. A common feature of this mineralization is its close spatial association with late Eocene-early Oligocene magmatism, which is interpreted to be the source of mineralized fluid during the Pyrenean phase of the Middle Alpine orogenic activity (Eshraghi and Jalali 2006).

The magmatic-related ore deposits (e.g., vein-type Cu-Au deposits) in Torud-Chah Shirin belt were structurally controlled, because the fault system acted as pathways for the transport of ore-bearing fluids with magmatic origin. The penetration of sub-volcanic acidic to intermediate intrusions into andesitic volcanic sequences in the form of dykes and sills caused hydrothermal alterations with vein-type mineralization in some parts of the Torud-Chah Shirin belt, which are genetically and spatially associated with the fault system (Fard et al. 2006). The main hydrothermal alteration assemblages within this area include intermediate and advanced argillic (kaolinite, alunite, illite, montmorillonite and quartz), phyllic (sericite, pyrite and quartz), and Fe-oxide and extensive propylitic (chlorite, epidote and calcite). Argillic and phyllic hydrothermal alterations were developed mostly in the west and center of the Torud-Chah Shirin belt, especially in areas where metallic mineralization occurred such as at the Gandy, Chah Messi and Cheshmeh Hafez mines (Imamjomeh 2005).

Data Used

A systematic geochemical exploration program within the area covered by the Moalleman geological map (at scale 1:100,000) was conducted by the Geological Survey of Iran (GSI) in 1993. Basically, a regular network of sampling locations with 1400 m × 1400 m cell size (or ~ 2 km²) was designed and then 2–4 subsamples of stream sediments were collected over the first- or second-order streams within each cell (Azmi et al. 2020). All of the collected subsamples within each cell were composited into one sample for analysis (representing ~ 2 km²), which was attributed to the center of the cell. This is because these composite stream sediment samples can acceptably provide information relevant not only to the upstream sources of the samples but also to the immediate vicinity of the sample locations. In total, 819 composite stream sediment samples were collected from the study area (Fig. 3).

Figure 3

Locations of the systematically collected stream sediment samples in the study area

For each composite sample, the concentrations of 44 major and trace elements were measured by inductively coupled plasma optical emission spectrometry (ICP-OES) except Au, which was separately analyzed by fire assay method. Finally, among the 44 major and trace elements, 6 elements (i.e., As, Au, Cu, Pb, Sb and Zn) which are directly associated with the known epithermal vein-type Cu-Au deposits in the study area (Imamjomeh 2005) were selected for the data analysis in this study.

One may argue that stream sediment sampling provides information pertinent to an upstream source and cannot be used to predict prospectivity at the location at which the sample was taken. However, the collection of stream sediment samples from first- or second-order streams (but not from higher order streams) ensures that any recognized geochemical anomaly is coupled to the anomalous source (e.g., mineralization) (cf. Carranza and Hale 1997; Moon 1999; Carranza 2010). Besides, stream sediment geochemical anomalies usually are, and should be, integrated with geological data (e.g., proximity to faults, proximity to hydrothermally altered rocks) to distinguish between significant (i.e., deposit-related) and false anomalies (cf. Carranza and Hale 1997; Ali et al. 2015; Yilmaz et al. 2015).

Therefore, the geological map of the study area (at 1:100,000 scale) was digitized, from which the recorded lithological units and faults/lineaments were derived in vector format (Fig. 2). In addition, remote sensing data (ASTER and Landsat 8 OLI) were processed for detecting rock outcrops with phyllic–argillic and Fe-oxide alterations and for validating the faults extracted from the geological map.

Methodology

Techniques for Multiple-Criteria Decision-Making

MCDM deals with the selection of the best alternative from several different options or with prioritization and weighting of alternatives according to the final objective (Triantaphyllou 2000). In other words, decision makers attempt to select an optimal solution using several criteria or attributes. Several MCDM techniques have been proposed and developed for MPM such as AHP (Saaty 1990), TOPSIS (Hwang and Yoon 1981), VIKOR (Opricovic and Tzeng 2004) and SAW. These techniques have been implemented in many studies for knowledge-driven MPM according to expert opinion (Asadi et al. 2016; Ghezelbash and Maghsoudi 2018). However, the knowledge-driven MPM described in the cited references suffers from systemic uncertainties resulting from over- or underestimation of the rating or weighting of spatial evidence layers and their relevant classes. In this study, such uncertainties were avoided by quantifying the geospatial associations among known mineral deposit occurrences and spatial evidence layers (Ghezelbash et al. 2019b) and, finally, by calculating objective weights for exploration criteria and their associated classes. To reach this goal, the performances of P-A plots as well as the FR method were evaluated.

P-A Plots for Calculation of Exploration Criteria Weights

Measuring the degree of efficiency of each MSEL that contributes to MPM is a crucial stage because it allows the most efficient MSEL to be recognized. In other words, the ability of each MSEL to predict mineralized areas can be estimated by utilizing the exact locations of known mineral deposit occurrences. For this purpose, P-A plots are helpful (Yousefi and Carranza 2015). The main aim of drawing P-A plots in this study is to quantify the predictive ability of each MSEL by determining objective or empirical weights according to the exact locations of known mineral deposit occurrences (Yousefi and Carranza 2015). To generate a P-A plot, each MSEL map must be classified or re-classified. A P-A plot consists of two curves in opposite directions: one represents the prediction rate based on known mineral deposit occurrences and the other represents the proportion of area occupied by different classes of spatial evidence layers. To calculate the degree of efficiency of spatial evidence layers and thus their weights, the normalized density index (Nd) and weight of each MSEL (We) can be calculated from the parameters (i.e., Pr (prediction rate) and Oa (occupied area)) derived from the intersection point of each P-A plot (Mihalasky and Bonham-Carter 2001). The Nd is a measure of the rank or relative importance of individual spatial evidence layers with respect to mineral deposit occurrences. Thus, an MSEL with Nd > 1 (We > 0) has a positive spatial relationship with mineral deposit occurrences of the type sought, whereas an MSEL with Nd < 1 (We < 0) has a negative spatial relationship with the mineral deposit occurrences of the type sought (Parsa et al. 2016a).
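As a rough illustration of how an intersection point can be turned into a weight, the following sketch assumes the relations Nd = Pr/Oa and We = ln(Nd) given in the P-A plot literature (Yousefi and Carranza 2015); these relations are not written out in the text above, but they agree with the statement that Nd > 1 corresponds to We > 0. The function name and the input values are hypothetical.

```python
import math

def pa_weight(pr, oa):
    """Return (Nd, We) for one evidence layer.

    pr: prediction rate (%) at the intersection point of the P-A plot
    oa: occupied area (%) at the same point
    Assumed relations (not stated in the text): Nd = Pr / Oa, We = ln(Nd).
    """
    if not (0.0 < pr <= 100.0 and 0.0 < oa <= 100.0):
        raise ValueError("pr and oa must be percentages in (0, 100]")
    nd = pr / oa          # normalized density
    we = math.log(nd)     # natural logarithm gives the layer weight
    return nd, we

# Hypothetical intersection point: 70% of deposits predicted in 30% of the area.
nd_example, we_example = pa_weight(70.0, 30.0)
print(round(nd_example, 3), round(we_example, 3))
```

A layer whose intersection point lies at Pr = Oa = 50% would receive Nd = 1 and We = 0 under these assumed relations, i.e., no predictive value.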

Frequency Ratio (FR) for Assigning Sub-Criteria Weights

The FR method was applied in this study to model the relationships between the locations of mineral deposit occurrences and classes of spatial evidence layers (as sub-criteria). For a given class, the FR compares the proportion of mineral deposit occurrences falling in that class with the proportion of the study area that the class occupies. The outstanding advantages of this method are its simplicity of use and the plain and straightforward interpretation of outcomes (Oh et al. 2011). The FR for each sub-criterion can be measured through the following steps (Lee and Talib 2005; Yilmaz 2007). Firstly, calculate the ratio of the area of each sub-criterion (class) to the total map area (\(R_{a}\)). Secondly, determine the ratio of the number of known mineral deposit occurrences contained by each sub-criterion (class) to the number of all mineral deposit occurrences in the study area (\(R_{mo}\)). Thirdly, calculate the FR value for each sub-criterion (class) by dividing \(R_{mo}\) by \(R_{a}\) (i.e., \(FR = R_{mo}/R_{a}\)). Fourthly, rescale the range of the derived FR values of all classes of a MSEL into the [0,1] range for better comparison of the efficiency of each sub-criterion (class).
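The four steps can be sketched as follows. The class areas and deposit counts are invented for illustration, and min-max scaling is assumed for the fourth step because the text does not specify how the rescaling to [0,1] was done.

```python
def frequency_ratios(class_areas, class_deposits):
    """FR per class: (share of deposits in class) / (share of area in class)."""
    total_area = sum(class_areas)
    total_deposits = sum(class_deposits)
    ratios = []
    for area, n_dep in zip(class_areas, class_deposits):
        ra = area / total_area          # step 1: area ratio Ra
        rmo = n_dep / total_deposits    # step 2: deposit ratio Rmo
        ratios.append(rmo / ra)         # step 3: FR = Rmo / Ra
    return ratios

def rescale_01(values):
    """Step 4 (assumed min-max scaling): map values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical evidence layer with four classes (areas in km2, deposit counts).
areas = [500.0, 300.0, 150.0, 50.0]
deposits = [1, 2, 3, 4]
fr_values = frequency_ratios(areas, deposits)
fr_scaled = rescale_01(fr_values)
print([round(v, 3) for v in fr_values])
print([round(v, 3) for v in fr_scaled])
```

In this invented case the smallest class holds the largest share of deposits, so it receives the highest FR and a rescaled value of 1.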

Data-Driven SAW MCDM Procedure

The SAW technique, which is a weighted linear scoring method, is a simple but useful MCDM method for calculating final weights of alternatives based on the weighted average (Afshari et al. 2010). In other words, quantitative weights are calculated for all alternatives by multiplying the scaled values assigned to alternatives with the weights derived directly by expert decision makers. However, in this study, we introduce a data-driven SAW technique in which the objective or empirical weights are derived using a P-A plot per criterion and the FR method per sub-criterion, considering the exact locations of known mineral deposit occurrences, instead of weights derived from the judgments of expert decision makers. The procedure of data-driven SAW consists of the following four main steps:

  1. Construction of a decision matrix \(X\) from multi-attribute dataset as:

$$X = \left[ {x_{ij}} \right]_{m \times n}$$
(1)

where \(x_{ij}\) is the performance of the ith alternative regarding the jth criterion, \(m\) is the number of alternatives (here the pixel values of spatial evidence layers) and \(n\) is the number of criteria (here spatial evidence layers).

  2. Calculating the objective weights using the FR method and assigning these weights to the locations of alternatives in the constructed decision matrix in step 1.

  3. Normalizing the components of the decision matrix through the Max method according to the following equation:

$$d_{ij} = \begin{cases} \dfrac{x_{ij}}{x_{j}^{+}}, & j \in \Omega_{\max} \\ \dfrac{x_{j}^{-}}{x_{ij}}, & j \in \Omega_{\min} \end{cases}$$
(2)

where \(d_{ij}\) refers to the normalized performance of the ith alternative with respect to the jth criterion, \(x_{j}^{+}\) is the highest value of \(x_{ij}\) in column \(j\) for a prospectivity criterion, \(x_{j}^{-}\) is the lowest value of \(x_{ij}\) in column \(j\) for a non-prospectivity criterion, and \(\Omega_{\max}\) and \(\Omega_{\min}\) are the sets of prospectivity and non-prospectivity criteria, respectively.

  4. Calculating the ranking scores of final MPM as:

$$S_{i} = \sum\limits_{j = 1}^{n} {w_{j}d_{ij}}$$
(3)

where \(S_{i}\) refers to the ranking score of the ith alternative, \(w_{j}\) represents the weight of jth criterion calculated using the parameters Pr and Oa of the intersection point on a P-A plot (Wang et al. 2016).
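The normalization in Eq. (2) and the scoring in Eq. (3) can be sketched for a toy decision matrix as below. The pixel values and criterion weights are hypothetical, and the function name is ours; in the study the matrix entries would be the FR-based class weights and the criterion weights would come from the P-A plots.

```python
def saw_scores(matrix, weights, is_prospectivity):
    """Ranking scores S_i = sum_j w_j * d_ij after Max normalization.

    matrix: m rows (alternatives, e.g., pixels) by n columns (criteria)
    weights: n criterion weights w_j
    is_prospectivity: n booleans, True if column j belongs to Omega_max
    """
    columns = list(zip(*matrix))
    col_max = [max(col) for col in columns]   # x_j^+
    col_min = [min(col) for col in columns]   # x_j^-
    scores = []
    for row in matrix:
        s = 0.0
        for j, x in enumerate(row):
            if is_prospectivity[j]:
                d = x / col_max[j]            # Eq. (2), first case
            else:
                d = col_min[j] / x            # Eq. (2), second case (x must be nonzero)
            s += weights[j] * d               # Eq. (3)
        scores.append(s)
    return scores

# Hypothetical example: three pixels, two prospectivity criteria.
X = [[0.2, 1.0],
     [1.0, 0.5],
     [0.6, 0.25]]
w = [0.6, 0.4]
scores = saw_scores(X, w, [True, True])
best_pixel = max(range(len(scores)), key=lambda i: scores[i])
print([round(s, 2) for s in scores], best_pixel)
```

Pixels are then ranked by their scores; here the second pixel would be the most prospective of the three.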

Support Vector Machine (SVM)

The SVM was invented by Vapnik and Chervonenkis (1964) based on statistical learning theory as a supervised classification method. The SVM creates a hyperplane in a high-dimensional feature space to classify a set of data vectors into sensible classes if the data in the original space are not linearly separable. In other words, an optimal classification can be derived via the created hyperplane having the maximum distance to the closest training sample point of any class (Fig. 4) (Zuo and Carranza 2011). To describe the SVM technique related to the two-class problem, suppose the training data comprise N data pairs in Eq. (4):

$$D = \left\{ {(x_{i},y_{i})|x_{i} \in R^{n} ,y_{i} \in \left\{ { - 1,1} \right\}} \right\}_{i = 1}^{N}$$
(4)
Figure 4

Support vectors and optimum hyperplane for the binary case of linearly separable data sets (after Zuo and Carranza 2011)

where \(x_{i}\) represents the independent variable, which is labeled in two classes of \(y_{i} = + 1\) and \(y_{i} = - 1\) (Kavzoglu and Colkesen 2009). In the case of linearly separable data, the separation hyperplane equations of the two classes are:

$$\begin{gathered} wx_{i} + b \ge + 1\;{\text{for}}\;y_{i} = + 1 \hfill \\ wx_{i} + b \le - 1\;{\text{for}}\;y_{i} = - 1 \hfill \\ \end{gathered}$$
(5)

which are equivalent to:

$$y_{i}\left( {wx_{i} + b} \right) \ge 1,\;i = 1,2, \ldots ,N$$
(6)

The separation hyperplanes can then be formalized as a decision function, thus:

$$f\left( x \right) = {\text{sgn}} \left( {wx + b} \right)$$
(7)

where sgn represents a sign function, which is defined as:

$${\text{sgn}} \left( x \right) = \left\{ {\begin{array}{*{20}c} 1 &\quad {{\text{if}}} & x\,>\,0 \\ 0 & {{\text{if}}} & x\,=\,0 \\ { - 1} & {{\text{if}}} & x\,<\,0 \\ \end{array} } \right.$$
(8)

where w and b are parameters of separation hyperplane decision-making, which are derived through the following optimization function:

$${\text{Minimize}}\;\tau \left( w \right) = \frac{1}{2}\left\| w \right\|^{2}$$
(9)

Subject to

$$y_{i} \left( {\left( {wx_{i}} \right) + b} \right) \ge 1,\;i = 1, \ldots ,l$$
(10)

Transforming the problem into the equivalent Lagrangian dual problem can simplify the calculation. The solution to this optimization problem is the saddle point of the Lagrangian function, thus:

$$\frac{\partial }{\partial b}L\left( {w,b,\alpha } \right) = 0,\;\frac{\partial }{\partial w}L\left( {w,b,\alpha } \right) = 0$$
(11)

where \(\alpha_{i}\) represents a Lagrangian multiplier and \(l\) denotes the number of training samples. The following optimization function defines the Lagrangian multipliers \(\alpha_{i}\):

$${\text{Maximize}}\;\sum\limits_{i = 1}^{l} {\alpha_{i} } - \frac{1}{2}\sum\limits_{i,j = 1}^{l} {\alpha_{i} \alpha_{j} y_{i} y_{j} \left( {x_{i} x_{j} } \right)}$$
(12)

subject to

$$\alpha_{i} \ge 0,\;i = 1,...,l\,{\text{and}}\,\sum\limits_{i = 1}^{l} {\alpha_{i} } y_{i} = 0$$
(13)

The following decision function represents the separation rule according to the optimized hyperplane (Zuo and Carranza 2011):

$$f\left( x \right) = {\text{sgn}} \left( {\sum\limits_{i = 1}^{l} {\alpha_{i} } y_{i} \left( {x \cdot x_{i} } \right) + b} \right)$$
(14)

A MATLAB-based program was employed to execute the SVM algorithm. Among several kernels (linear, polynomial, sigmoid and RBF) that have been frequently used in the SVM algorithm, the RBF kernel was used in this study because of its lower error and fewer parameters to be estimated (Rodriguez-Galiano et al. 2015; Ghezelbash et al. 2019a). The RBF kernel for two samples \(x\) and \(x^{\prime}\) is calculated as:

$$K(x,x^{\prime } ) = \exp \left( { - \frac{{\left\| {x - x^{\prime } } \right\|^{2} }}{{2\sigma^{2} }}} \right)$$
(15)
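To make Eq. (15) concrete, the sketch below evaluates the RBF kernel and uses it in a decision function of the form of Eq. (14), with the dot product replaced by the kernel as is usual for kernel SVMs. The support vectors, multipliers, bias and kernel width are hypothetical values chosen by hand, not the output of a trained model, and the code is not the MATLAB program used in the study.

```python
import math

def rbf_kernel(x, x_prime, sigma):
    """K(x, x') = exp(-||x - x'||^2 / (2 * sigma^2)), as in Eq. (15)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def svm_decision(x, support_vectors, labels, alphas, b, sigma):
    """sgn(sum_i alpha_i * y_i * K(x, x_i) + b); kernelized form of Eq. (14)."""
    total = b
    for sv, y, alpha in zip(support_vectors, labels, alphas):
        total += alpha * y * rbf_kernel(x, sv, sigma)
    if total > 0:
        return 1
    if total < 0:
        return -1
    return 0

# Hypothetical two-feature example: one deposit-like and one non-deposit-like vector.
svs = [(0.0, 0.0), (2.0, 2.0)]
ys = [1, -1]
alphas = [1.0, 1.0]   # satisfies sum(alpha_i * y_i) = 0, as required by Eq. (13)
label_near_deposit = svm_decision((0.1, 0.1), svs, ys, alphas, b=0.0, sigma=1.0)
label_near_barren = svm_decision((1.9, 2.0), svs, ys, alphas, b=0.0, sigma=1.0)
print(label_near_deposit, label_near_barren)
```

A smaller value of sigma makes the kernel more local, so each support vector influences only its immediate neighborhood in feature space.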

A specific portion of the available dataset, called training data, is needed for training machine learning algorithms. In this case, known deposit and non-deposit locations are utilized as training data, and the numbers of deposit and non-deposit locations must be equal because the performance of SVM depends strongly on this balance (Zuo and Carranza 2011). The left-out portion of data, which did not participate in the training procedure and is called out-of-the-bag (OOB) data, is used after the learning procedure has terminated to evaluate the performance of the model; at this stage, one can decide whether the proposed method is performing properly. Confusion matrix analysis was conducted to calculate the SVM accuracy and the performance of the learning procedure. This method is an effective tool for assessing the results of classification problems, including multi-class ones. Indeed, it demonstrates the relationship between the predicted outputs and the true ones. The numbers of correctly and incorrectly classified data are summarized, and the confusion matrix illustrates the ways in which the classification method is confused when predicting the true class. In a two-class confusion matrix, four results are possible. These are: (a) true positive (TP), which refers to correct prediction of deposit locations as prospective; (b) true negative (TN), which refers to correct prediction of non-deposit locations as non-prospective; (c) false positive (FP), which refers to incorrect prediction of non-deposit locations as prospective; and (d) false negative (FN), which refers to incorrect prediction of deposit locations as non-prospective. Classification accuracy of a trained model can be described and formulated as follows:

$${\text{Sensitivity}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FN}}}}$$
(16)

$${\text{Specificity}} = \frac{{{\text{TN}}}}{{{\text{TN}} + {\text{FP}}}}$$
(17)

$${\text{Precision}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FP}}}}$$
(18)

$${\text{Accuracy}} = \frac{{{\text{TP}} + {\text{TN}}}}{{{\text{TP}} + {\text{TN}} + {\text{FP}} + {\text{FN}}}}$$
(19)

$$F{\text{-measure}} = \frac{{2 \times {\text{Sensitivity}} \times {\text{Precision}}}}{{{\text{Sensitivity}} + {\text{Precision}}}}$$
(20)
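Equations (16) to (20) can be evaluated directly from the four counts of a two-class confusion matrix, as in the following sketch. The counts are hypothetical and are not results of this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute Eqs. (16)-(20) from the four confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                       # Eq. (16)
    specificity = tn / (tn + fp)                       # Eq. (17)
    precision = tp / (tp + fp)                         # Eq. (18)
    accuracy = (tp + tn) / (tp + tn + fp + fn)         # Eq. (19)
    f_measure = (2 * sensitivity * precision
                 / (sensitivity + precision))          # Eq. (20)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }

# Hypothetical counts for a held-out set of 10 deposit and 10 non-deposit locations.
metrics = classification_metrics(tp=8, tn=7, fp=3, fn=2)
for name, value in metrics.items():
    print(name, round(value, 3))
```

The function assumes that none of the denominators is zero, which holds whenever the evaluation set contains both classes and the model predicts at least one positive.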

Results

Evidence Layers

Definition of multi-element geochemical signatures, which significantly represent the simultaneous distribution of concentrations of mineralization-related elements, is crucial for deriving geochemical exploration evidence to be used in MPM. In other words, geochemical data of stream sediments generally require multivariate analysis to derive enhanced multi-element geochemical layers of the deposit-type sought. For this, PCA, as an effective tool in exploratory data analysis, has been used extensively to reduce the dimension of a dataset and to incorporate several correlated variables into a single variable (Jolliffe 2002), which facilitates the interpretation of stream sediment geochemical data. In other words, PCA can exhibit significant associations among chemical elements via decomposition of the correlation or covariance matrix of variables into component loadings and component scores. Before that, centered-logratio (clr) transformation (Aitchison 1986) was applied to the measured concentration values of the six geochemical elements in order to take into account the compositional nature of geochemical data (Carranza 2011). Then, a one-stage PCA was conducted on the clr-transformed data of the six geochemical elements. The derived results are summarized in Table 1. Two components with significant eigenvalues (> 1) were extracted. Together, these two components account for ~ 64% of the total variance. The first component represents a Pb-Zn assemblage with positive loadings and Cu enrichment with a negative loading (Table 1). The second component represents an As-Sb elemental assemblage with positive loadings and Au enrichment with a negative loading (Table 1). The study region is geologically prone to Cu-Au mineralization. 
Thus, the negative scores of PC1 and PC2, which represent the mineralization of Cu and Au, respectively, were considered as significant multi-element geochemical signatures of the sought deposit-type to be used in MPM.
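The clr step that precedes PCA can be illustrated with a short sketch. It assumes strictly positive concentrations and uses the standard definition clr(x)_i = ln(x_i / g(x)), where g(x) is the geometric mean of the composition. The sample values are hypothetical, and the PCA itself is not reproduced here.

```python
import math


def clr(composition):
    """Centered-logratio transform of one strictly positive composition."""
    logs = [math.log(v) for v in composition]
    mean_log = sum(logs) / len(logs)   # log of the geometric mean g(x)
    return [lv - mean_log for lv in logs]


# Hypothetical concentrations of six elements in one stream sediment sample.
sample = [35.0, 80.0, 120.0, 12.0, 1.5, 0.02]
clr_values = clr(sample)
```

By construction, the clr values of a sample sum to zero, which is a quick check that the transform was applied row by row.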

Table 1 Rotated component matrix of ordinary PCA

Known mineral deposit occurrences in the study area are related spatially and genetically to a wide range of volcano-sedimentary rocks of Eocene age (Hushmandzadeh et al. 1978). Most outcrops of these rock units are ore-forming geological clues to Cu-Au deposits in the study area. Therefore, the presence of and proximity to these rock units can provide favorable conditions for exploring Cu-Au-related deposits within this region. Two sets of geological units of middle-to-upper Eocene age were separated from the 1:100,000 geological map of Moalleman District. The first set includes intermediate lavas and volcano-clastic rocks, andesite, andesitic dacite and trachyandesite, while the second includes spilitic basalt and keratophyre with a few beds of sandstone and volcano-clastic rocks. Accordingly, two maps of the presence of and proximity to these units were generated for use in MPM.

Pathways through which ore-bearing fluids are transported are strongly influenced by temperature, pressure, composition and permeability of rocks (Cox et al. 1987). The permeability of rocks is controlled by faults/lineaments, which provide favorable conditions for deposition of large volumes of mineralization near the surface. Faults with specific directions could be directly associated with certain mineralization (Faulkner et al. 2010). Therefore, the existence of faults, their directions and intersections could be considered as structural controls on Cu-Au mineralization in the study area. For this purpose, faults were grouped by direction (here, N-trending (350°–10° or 170°–190°), E-trending (80°–100° or 260°–280°), NW-trending (100°–170° or 280°–350°) and NE-trending (10°–80° or 190°–260°)), and a spatial evidence layer of proximity to each group was derived. In addition, fault density was generated as an evidence layer representing fluid pathway control. In total, five structural layers were generated for MPM.
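The azimuth ranges listed above can be expressed as a small classifier. This is only a sketch: the function name is hypothetical, and the handling of azimuths that fall exactly on a class boundary (10°, 80°, 100°, 170°) is an assumption, since the text does not state which class receives them.

```python
def fault_trend(azimuth_deg):
    """Assign a fault strike azimuth (degrees) to one of four trend classes."""
    # A line striking 190 degrees is the same line as one striking 10 degrees,
    # so fold all azimuths into the range [0, 180).
    a = azimuth_deg % 180.0
    if a <= 10.0 or a >= 170.0:
        return "N"    # 350-10 or 170-190
    if a < 80.0:
        return "NE"   # 10-80 or 190-260
    if a <= 100.0:
        return "E"    # 80-100 or 260-280
    return "NW"       # 100-170 or 280-350


trends = [fault_trend(a) for a in (355.0, 45.0, 270.0, 300.0, 225.0)]
```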

Remote sensing is the procedure of detecting and monitoring the physical characteristics of an area by measuring its reflected and emitted radiation from satellite or aircraft. A significant capability of remote sensing images in mineral exploration is the recognition of hydrothermal alterations (e.g., potassic, phyllic, argillic, Fe-oxide and propylitic), which are considered primary exploration guides for hydrothermal vein-type Cu-Au deposits at the regional exploration stage. Image processing of remote sensing data (e.g., ASTER and Landsat 8 OLI) is a suitable way to gain valuable information on surface characteristics of exploration targets, which can be used for mapping hydrothermal alterations (Tangestani and Moore 2001). Argillic (kaolinite, alunite, illite, montmorillonite and quartz) and phyllic (sericite, pyrite and quartz) alterations, which are present mainly near veins, are the most important alterations associated with mineralization in this region of interest (Mehrabi et al. 2014). Therefore, phyllic–argillic alteration was extracted from ASTER data using the matched filtering (MF) approach (Moore et al. 2008). Iron oxide alteration, which is another important alteration associated with Cu-Au mineralization in the study area due to oxidation of sulfide minerals, especially pyrite and chalcopyrite (Bahrampour et al. 2017), was detected from Landsat 8 OLI data by the PCA technique (Crosta et al. 2003). Accordingly, spatial evidence layers of proximity to argillic-phyllic alterations and proximity to Fe-oxide alterations were generated.

As the ranges of minimum–maximum values of the spatial evidence layers of geochemical, geological, structural and alteration data are not the same, they were all transformed into the same domain before being used in MPM. In this regard, multiple fuzzy membership functions (e.g., fuzzy linear, fuzzy Small, fuzzy Large, fuzzy MS-Large, fuzzy MS-Small, fuzzy Near and fuzzy Gaussian) have been used to transform the values of rasterized maps into the fuzzy domain [0, 1] (Beucher et al. 2014). MS-Large, a nonlinear fuzzy function, is more applicable than the others when large input values are expected to have higher membership. In contrast, MS-Small is more applicable when small input values are expected to have higher membership (Demir et al. 2016), as is the case for spatial evidence layers of proximity to certain spatial features (e.g., hydrothermal alterations, host rock and faults). These two functions (MS-Large and MS-Small) are similar to the fuzzy Large and Small functions, respectively, except that their definitions are based on a specified mean and standard deviation. On the one hand, the MS-Large function was used to convert the values of the multi-element geochemical layers of PC1 and PC2 as well as the fault density layer into the range [0, 1]. On the other hand, the original values of the spatial evidence layers of proximity to host rocks, N-, E-, NW- and NE-trending faults, and phyllic–argillic and Fe-oxide alterations were transformed to the fuzzy domain using the MS-Small function in ArcGIS software.
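For illustration, the two membership functions can be sketched as below. The formulas follow the MS-Large and MS-Small definitions documented for the ArcGIS Fuzzy Membership tool, in which a and b are multipliers of the mean and standard deviation. The multiplier values used in this study are not stated, so a = b = 1 is an assumption, and the distances are hypothetical.

```python
import statistics


def fuzzy_ms_large(x, mean, std, a=1.0, b=1.0):
    """Membership rises toward 1 for values well above a * mean."""
    if x <= a * mean:
        return 0.0
    return 1.0 - (b * std) / (x - a * mean + b * std)


def fuzzy_ms_small(x, mean, std, a=1.0, b=1.0):
    """Membership is 1 up to a * mean and decays toward 0 above it."""
    if x <= a * mean:
        return 1.0
    return (b * std) / (x - a * mean + b * std)


# Hypothetical distances (m) from pixels to the nearest fault.
distances = [0.0, 500.0, 1000.0, 2000.0, 4000.0]
m = statistics.mean(distances)
s = statistics.pstdev(distances)
proximity_membership = [fuzzy_ms_small(d, m, s) for d in distances]
```

With these definitions the two functions are complementary, so pixels close to a feature receive high MS-Small membership while distant pixels approach zero.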

Mineral Prospectivity Mapping Using Data-driven SAW

One of the main objectives of this study is to explore the performance of the data-driven SAW procedure for recognizing exploration targets of the deposit-type sought. Prior to data-driven SAW MPM, continuous-value spatial evidence layers need to be discretized so that meaningful weights can be assigned according to the locations of known mineral deposit occurrences. This was applied to the fuzzified spatial evidence layers of (1) enhanced multi-element geochemical signatures of PC1 and PC2, (2) proximity to two sets of host rocks, (3) proximity to four sets of trending faults as well as fault density and (4) proximity to phyllic–argillic and Fe-oxide alterations.

At the first stage, the concentration-area (C-A) fractal method, which was first introduced by Cheng et al. (1994) for separation of geochemical populations, was employed to derive the anomaly and background classes (Parsa et al. 2016b; Ghezelbash et al. 2019c, d) of the fuzzified geochemical layers of PC1 and PC2. The C-A log–log plots of the geochemical layers of PC1 and PC2 are shown in Figure 5. These log–log plots show the fuzzified PC1 and PC2 scores vs. the areas occupied by fuzzified scores greater than the corresponding contour values. Breaks between the straight-line segments of the log–log plots were used as thresholds to classify the PC1 and PC2 scores. As shown in Figure 5, there were four different thresholds and, thus, five different geochemical classes for the fuzzified multi-element geochemical layers of PC1 and PC2 (Fig. 6a, b).
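The points of a C-A log–log plot can be computed as in the following sketch, which assumes a raster flattened to a list of cell values and a uniform cell area. The values, cell area and contour thresholds are hypothetical, and the breaks between line segments still have to be picked on the resulting plot.

```python
import math


def concentration_area(values, cell_area, thresholds):
    """Return (log10 threshold, log10 area) pairs, where area is the total
    area of cells whose value is at or above each threshold."""
    points = []
    for t in thresholds:
        n_cells = sum(1 for v in values if v >= t)
        if n_cells > 0 and t > 0:   # logarithms need positive arguments
            points.append((math.log10(t), math.log10(n_cells * cell_area)))
    return points


# Hypothetical fuzzified scores and contour values.
scores = [0.1, 0.2, 0.2, 0.3, 0.5, 0.7, 0.9, 0.95]
ca_points = concentration_area(scores, cell_area=1.0,
                               thresholds=[0.1, 0.3, 0.5, 0.9])
```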

Figure 5

C-A fractal log–log plots for classification of fuzzified values of PC1 and PC2 scores

Figure 6

Classified maps of fuzzified evidential layers: (a) PC1 scores; (b) PC2 scores; (c) proximity to phyllic–argillic alterations; (d) proximity to Fe-oxide alteration; (e) proximity to Eocene volcanic rocks; and (f) proximity to Eocene spilitic basalt and keratophyre rocks; (g) fault density; (h) proximity to N-trending faults; (i) proximity to E-trending faults; (j) proximity to NW-trending faults; and (k) proximity to NE-trending faults

At the second stage, equal intervals of 0.2 were used to classify the fuzzified geological, structural and alteration maps (Fig. 6c–k).

At the third stage, the P-A plots and the FR method were employed for calculating and assigning meaningful weights to the exploration criteria and their relevant sub-criteria, respectively. The P-A plots for the 11 classified spatial evidence layers were drawn based on the occupied areas and the locations of known mineral deposit occurrences (Fig. 7). Then, the intersection point of each plot was determined, from which the weight (We) and the degree of efficiency of each exploration criterion (here, each classified spatial evidence layer) were measured. According to Table 2, all evidence layers exhibit positive geospatial association with Cu-Au deposit occurrences in Moalleman District. However, proximity to phyllic–argillic alteration, proximity to the two sets of Eocene host rocks and fault density are the most efficient criteria, as they have the highest positive We values (We = 1.45, 1.32, 1.26 and 1.15, respectively). In contrast, proximity to NE-trending faults, the geochemical layer of PC1, proximity to E-trending faults, proximity to Fe-oxide alteration, the geochemical layer of PC2 and proximity to NW- and N-trending faults are the least efficient criteria, as they have the lowest positive We values (We = 0.94, 0.94, 0.75, 0.7, 0.53, 0.4 and 0.36, respectively). Moreover, the FR method was employed to calculate the weights of the sub-criteria (here, the classes of each spatial evidence layer); the FR values of all classes are shown in Table 3.
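The two weighting steps can be sketched as follows. The code assumes the definitions commonly used with P-A plots, namely normalized density Nd = Pr / Oa at the intersection point and We = ln(Nd), and the usual frequency ratio of deposit share to area share. All input numbers are hypothetical and are not the entries of Tables 2 and 3.

```python
import math


def pa_weight(prediction_rate, occupied_area):
    """Return (Nd, We) from the P-A intersection point, both in percent."""
    nd = prediction_rate / occupied_area
    return nd, math.log(nd)


def frequency_ratio(deposits_in_class, total_deposits, class_area, total_area):
    """Share of deposits in a class divided by the share of area it covers."""
    return (deposits_in_class / total_deposits) / (class_area / total_area)


# Hypothetical intersection point: 81% of deposits predicted in 19% of the area.
nd, we = pa_weight(81.0, 19.0)

# Hypothetical class holding 8 of 20 deposits while covering 10% of the area.
fr = frequency_ratio(8, 20, class_area=10.0, total_area=100.0)
```

A criterion whose intersection lies at Pr = Oa = 50% gives Nd = 1 and We = 0, so positive We values indicate a positive spatial association.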

Figure 7

P-A plots for fuzzified classified maps of evidential layers: a PC1 scores; b PC2 scores; c proximity to phyllic–argillic alterations; d proximity to Fe-oxide alteration; e proximity to Eocene volcanic rocks; and f proximity to Eocene spilitic basalt and keratophyre rocks; g fault density; h proximity to N-trending faults; i proximity to E-trending faults; j proximity to NW-trending faults; and k proximity to NE-trending faults

Table 2 Prediction rate (Pr), occupied area (Oa), normalized density (Nd) and normalized weight (We) of each exploration criterion. Values in bold represent the efficient exploration criteria
Table 3 Frequency ratio values of discretized classes per exploration criterion

At the final stage, the data-driven SAW procedure was applied to define the favorable targets of Cu-Au mineralization. For this purpose, a decision matrix of exploration criteria vs. alternatives was first constructed, in which the columns represent the five spatial evidence layers of geochemistry, geology, tectonic and hydrothermal alterations and the rows represent the pixel values of spatial evidence layers with specific coordinates. Then, the weights derived from P-A plots (Table 2) were assigned to the exploration criteria. Furthermore, the weighted pixel values of the five spatial evidence layers derived from the FR method (Table 3) were placed as alternatives in the decision matrix. Finally, the model of mineral prospectivity was produced using the data-driven SAW procedure (Fig. 11a).
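A minimal sketch of the SAW aggregation is given below. It assumes the usual linear normalization for benefit criteria (each column divided by its maximum) and weights rescaled to sum to one, which may differ in detail from the formulation used here. The decision matrix is hypothetical; the three weights are We values quoted in the text and are used only as an example.

```python
def saw_scores(decision_matrix, weights):
    """Simple additive weighting: weighted sum of max-normalized criteria."""
    total_w = sum(weights)
    norm_w = [w / total_w for w in weights]
    n_criteria = len(weights)
    # Column maxima, used to scale every criterion to (0, 1].
    col_max = [max(row[j] for row in decision_matrix) for j in range(n_criteria)]
    return [
        sum(norm_w[j] * row[j] / col_max[j] for j in range(n_criteria))
        for row in decision_matrix
    ]


# Rows are pixels (alternatives), columns are criteria holding FR-based values.
matrix = [
    [4.0, 2.0, 1.0],
    [1.0, 1.0, 1.0],
    [0.5, 2.0, 0.2],
]
prospectivity = saw_scores(matrix, weights=[1.45, 1.32, 0.36])
```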

Mineral Prospectivity Mapping Using Data-driven SVM

To portray the high-potential areas of related mineralization, an RBF kernel-based SVM was used in this paper as a supervised data-driven classifier for modeling mineral prospectivity. The supervised SVM model requires two sets of training data: (1) deposit locations, which represent the presence of mineral deposit occurrences and take the value 1, and (2) non-deposit locations, which represent the absence of mineral deposit occurrences and take the value 0. Determining and extracting deposit features is simple wherever the number of known mineral deposit occurrences is high, whereas determining and extracting non-deposit features is challenging. Three basic points must be considered for this purpose (Carranza et al. 2008):

  • To improve the efficiency and accuracy of the classification procedure, the number of non-deposit locations must be equal to that of deposit locations; this is 20 in this paper.

  • The locations of non-deposit features must be selected as far as possible from the locations of deposit features.

  • Unlike deposit locations, which usually have a clustered or regular pattern, non-deposit locations must be randomly distributed.

Point pattern analysis (Fig. 8a) was carried out in this paper to determine how far non-deposit locations should be from deposit locations. It can be seen that all deposit pairs are located within ~ 5000 m, which demonstrates that there is 100% probability that another deposit occurs within this distance. Instead, 2987 m was selected in this study as the buffer distance within which there is an 88% probability of finding a neighboring deposit next to any given deposit (Fig. 8). In order to enhance the accuracy of selection of non-deposit locations, a 1000 m buffer around the volcanic host rocks of Eocene age was also considered. Accordingly, 20 non-deposit locations were randomly selected from the remaining regions (Fig. 8b).
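The distance constraint on non-deposit locations can be illustrated with simple rejection sampling. This sketch covers only the buffer around deposits; the additional 1000 m host-rock buffer is omitted, and the deposit coordinates and study-area bounds are hypothetical.

```python
import math
import random


def sample_non_deposits(deposits, bounds, min_distance, n_samples,
                        seed=0, max_tries=100000):
    """Draw random points that lie at least min_distance from every deposit."""
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bounds
    picked = []
    tries = 0
    while len(picked) < n_samples and tries < max_tries:
        tries += 1
        x, y = rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)
        # Keep the candidate only if it is outside every deposit buffer.
        if all(math.hypot(x - dx, y - dy) >= min_distance for dx, dy in deposits):
            picked.append((x, y))
    return picked


# Hypothetical deposit coordinates (m) and rectangular study area.
deposits = [(10000.0, 10000.0), (12000.0, 11000.0), (30000.0, 25000.0)]
non_deposits = sample_non_deposits(
    deposits, bounds=(0.0, 0.0, 50000.0, 40000.0),
    min_distance=2987.0, n_samples=20,
)
```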

Figure 8

Locations of deposits and selection of non-deposit samples

Eleven fuzzified spatial evidence layers (Fig. 6), namely multi-element geochemical layers of PC1 and PC2, proximity to two sets of host rocks, proximity to N-, E-, NE- and NW-trending faults, fault density, and proximity to phyllic–argillic and Fe-oxide alterations, were used for SVM prospectivity modeling in this study. Then, the predictive prospectivity model of the RBF kernel-based SVM was created. For this, a total of 1800 pixel values (those of spatial evidence layers at the locations of 40 deposits and non-deposits) were used for training and OOB evaluation. Of this dataset, 75% (1350 values) was used as the training dataset, while the remaining 25% (450 values) was used as OOB data for validation. To optimize the RBF kernel in SVM, two parameters, namely C and λ, should be appropriately selected. In this study, various sets of C and λ were tested in a trial-and-error procedure, and a trained model was generated for each set. Then, the accuracy of each trained model was calculated based on the OOB data (Fig. 9). The optimum RBF-kernel parameters were found to be C = 0.25 and λ = 0.2 (Fig. 9), at which the OOB accuracy is 93.55% (error of 6.45%) (Fig. 10b). These parameters were then fixed for constructing the optimum SVM model. As shown in Figure 10a, the accuracy of the trained SVM model is 96.07%, corresponding to an error of 3.93%. Classification indices such as sensitivity, specificity, precision and F-measure were used to measure the accuracy of the classification. Sensitivity, which refers to the proportion of correctly classified deposit locations, was 93.62% (Table 4). This shows that the generated model is capable of identifying the deposit locations. Conversely, specificity demonstrates the capability of the model in predicting the non-deposit sites. The specificity of the trained SVM-RBF model is 98.51% (Table 4). Moreover, this model achieves a precision of 96.07% (Table 4), meaning that 96.07% of the cells predicted as deposit are true deposit locations. The F-measure is the harmonic mean of precision and sensitivity, so both false positives and false negatives are taken into account. As shown in Table 4, the F-measure is 94.82% for the trained model, showing that the SVM-RBF model has significant prediction capability and reliability in modeling the Cu-Au mineralization in the Moalleman District. In the final step, all pixel values of the 11 spatial evidence layers were extracted and used as test data for generating the data-driven MPM model based on the constructed SVM-RBF (Fig. 11b).
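The trial-and-error search over the kernel parameters can be sketched with scikit-learn, which is an assumption because the software used for the SVM is not named. The kernel parameter written as λ in the text is mapped here to scikit-learn's `gamma`, also an assumption. The data are synthetic stand-ins for the 11 fuzzified evidence values, and the parameter grid is hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in data: 11 fuzzified evidence values per location.
n = 200
X = np.vstack([
    rng.uniform(0.5, 1.0, size=(n // 2, 11)),   # deposit-like locations
    rng.uniform(0.0, 0.5, size=(n // 2, 11)),   # non-deposit-like locations
])
y = np.array([1] * (n // 2) + [0] * (n // 2))

# 75% for training, 25% held out for validation.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Try each (C, gamma) pair and keep the one with the best held-out accuracy.
best = None
for C in (0.25, 1.0, 4.0):
    for gamma in (0.05, 0.2, 1.0):
        model = SVC(kernel="rbf", C=C, gamma=gamma).fit(X_train, y_train)
        accuracy = model.score(X_val, y_val)
        if best is None or accuracy > best[0]:
            best = (accuracy, C, gamma)
```

After the search, the selected pair would be fixed and the model refitted before it is applied to every pixel of the evidence layers.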

Figure 9

3-D plot of the trial-and-error procedure for selecting optimum RBF kernel parameters C and λ

Figure 10

Graphical confusion matrix, accuracy and error of SVM model based on training and OOB dataset

Table 4 Classification accuracy indices of SVM model
Figure 11

Continuous-value mineral prospectivity models derived by a data-driven SAW and b RBF-based-kernel SVM models

Defuzzification and Performance Evaluation of Prospectivity Models

For quantitative assessment of the two predictive prospectivity models derived from the data-driven SAW and RBF kernel-based SVM methods, and for measuring the degree of success or failure of these models, the weights-of-evidence method was applied. This method supplies a statistical t-value that quantifies the strength of spatial association between known mineral deposit occurrences and discretized classes of prospectivity models (Bonham-Carter 1994). A larger t-value represents a stronger spatial association. Empirically, t = 1.96 is an acceptable cutoff value for a statistically significant correlation and is called “the lower level of significance.” In addition, the highest t-value (tmax) among the t-values calculated for the different classes of a mineral prospectivity model is called “the highest level of significance.”

In this study, for defuzzification of the continuous-value prospectivity models derived from the data-driven SAW and RBF kernel-based SVM methods, threshold values at five-percentile intervals were used. Then, the Student's t-value for each class of the prospectivity models was calculated and the two significant thresholds were determined. As shown in Figure 12, the tmax values for the two prospectivity models were found at the 85th percentile of prospectivity scores. Finally, classified predictive prospectivity models of the data-driven SAW and RBF kernel-based SVM methods were generated, and highly favorable, favorable and non-favorable classes were derived (Fig. 13).
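The t-value used for thresholding can be sketched with the standard weights-of-evidence formulas: W+ and W− are log ratios of conditional probabilities, the contrast is C = W+ − W−, and t = C / s(C). The sketch assumes unit cells with at most one deposit each and requires all four cell counts to be positive. The counts are hypothetical.

```python
import math


def studentized_contrast(n_class_dep, n_class, n_dep, n_total):
    """Weights-of-evidence t-value for one binary class of a prospectivity map.

    n_class_dep: deposit cells inside the class
    n_class:     all cells inside the class
    n_dep:       all deposit cells
    n_total:     all cells in the study area
    """
    b_d = n_class_dep                    # class present, deposit present
    b_nd = n_class - n_class_dep         # class present, deposit absent
    nb_d = n_dep - n_class_dep           # class absent, deposit present
    nb_nd = (n_total - n_class) - nb_d   # class absent, deposit absent
    n_non_dep = n_total - n_dep
    w_plus = math.log((b_d / n_dep) / (b_nd / n_non_dep))
    w_minus = math.log((nb_d / n_dep) / (nb_nd / n_non_dep))
    contrast = w_plus - w_minus
    std_contrast = math.sqrt(1 / b_d + 1 / b_nd + 1 / nb_d + 1 / nb_nd)
    return contrast / std_contrast


# Hypothetical class: the top 15% of 10,000 cells holding 17 of 20 deposits.
t_value = studentized_contrast(n_class_dep=17, n_class=1500,
                               n_dep=20, n_total=10000)
```

Repeating this calculation for classes defined at successive percentile thresholds gives the curve from which the t = 1.96 and tmax thresholds are read.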

Figure 12

Discretization of prospectivity scores based on the thresholds of calculated t-values of a data-driven SAW and b RBF-based-kernel SVM models

Figure 13

Classified mineral prospectivity models derived by a data-driven SAW and b RBF-based-kernel SVM models

Discussion and Conclusions

In this study, two prospectivity models, one from an MCDM method (data-driven SAW) and one from a supervised machine learning method (RBF kernel-based SVM), were generated by integration of subjective geological knowledge and empirical mineralization-related datasets. The results show that both models (Fig. 13a, b) succeeded in delineating favorable targets associated with Cu-Au mineralization in Moalleman District. However, the SVM model is more reliable in delineating the mineralization-related targets in the study area, because it predicts ~ 95% of known mineral deposit occurrences (19 out of 20) in only ~ 10% of the study area (highly favorable class), while the data-driven SAW model predicts ~ 65% of known mineral deposit occurrences (13 out of 20) in ~ 8% of the study area (highly favorable class).

The derived predictive prospectivity models (especially the SVM model) are able not only to predict known areas of Cu-Au mineralization accurately but also to identify highly favorable areas where no mineral deposit has yet been discovered. Accordingly, the following conclusions are drawn from this paper:

  • The contribution of inefficient exploration criteria can significantly increase the bias and uncertainty in MPM. Thus, retaining the most efficient targeting criteria that properly represent the mineralization-related characteristics to be used in MPM can significantly increase the accuracy and efficiency of the predictive model and success of prospectivity modeling.

  • Using data-driven weights based on the locations of known mineral deposit occurrences can greatly enhance the efficiency of MCDM methods (e.g., SAW) for generating MPM models and thus can reduce the systematic uncertainty in MPM.

  • Implementation of machine learning algorithms, such as support vector machines as supervised classifiers where there are a large number of training deposit locations, is very useful in data-driven predictive modeling of mineral prospectivity.

  • Although machine learning algorithms are able to predict highly favorable areas in spatial prospectivity modeling, the matter of the limitations of datasets, especially the non-uniform nature of certain input data (because data collection is typically denser around known deposits and outcrops), remains a challenging problem. This problem leads undoubtedly to bias and uncertainty in any predictive model of mineral prospectivity. However, this challenging problem is not directly related to the weakness of the machine learning algorithms used but it is directly related to the availability and the selection of input datasets (Hronsky and Kreuzer 2019). In this study, we believe that the bias and uncertainty in the result are mainly due to availability of data because we used mostly legacy data that are available in the study area. Therefore, like any other predictive model, the final prospectivity model achieved in this study needs to be updated once new relevant spatial data become available.