Introduction

With the current interest in deep mine prospecting and the ongoing improvements to subsurface geological and geophysical mapping methods (Schamper et al. 2014a, b; Chen and Wu 2017), vast amounts of geological exploration data in three dimensions are being generated. Analysis and modeling of 3D geological objects by integrating geophysical and geological datasets provides new insights into exploration targeting, but uncertainty in mineral exploration cannot be eliminated (Lindsay et al. 2012; Zuo and Xiong 2018; Wang et al. 2015; Li et al. 2019). How to use these large three-dimensional datasets to reveal the spatial distribution of the mineralization and the determinants of ore genesis is becoming an important research topic in metallogeny.

The goal of this research is to uncover something new about “why things are the way they are.” Exploring the spatial relationships between various geological features and mineralization is not only useful in understanding the ore genesis of deposits but can also help to guide mineral exploration by providing predictive mineral maps (Liu et al. 2013). Recently, spatial issues associated with mineralization and its determining factors have been of interest to many geologists. Most current methods are based on models that assume that the determinants of mineralization are constant over space (Zhang et al. 2012; Chen et al. 2005; Mao et al. 2009, 2010; Lin et al. 2019; Chen et al. 2019). Such models can only produce “average” or “global” parameter estimates (Batisani and Yarnal 2009; Geri and Amici 2010) and are unable to detect possible spatially non-stationary relationships. If the processes affecting ore mineralization are spatially non-stationary, predictive modeling based on these classical global statistical methods will have limited accuracy.

In past research that aimed to explore spatial trends and non-stationarity, a series of statistical techniques and modeling approaches were proposed. The solutions have focused on local spatial analysis and spatial modeling (Zhang et al. 2018). Early contributions introduced location coordinates or their functions as direct or indirect independent variables in predictive models to express linear or nonlinear trends in space (Agterberg 1964, 1970; Casetti 1972). Well-known local window statistics models in geosciences, such as local singularity analysis, reduce the effect of spatial non-stationarity to some degree by removing spatial trends and minimizing the effects of high and low values of the variables on predictions (Cheng 1997, 1999; Zuo et al. 2016; Zhang et al. 2016). Geographically weighted regression (GWR) is a recently developed spatial analysis technique that considers the non-stationarity of variables. It is a relatively simple but effective technique for exploring spatially varying relationships (Fotheringham et al. 1996, 1998, 1999; Brunsdon et al. 1996, 1999). It has attracted a great deal of attention in different fields such as resources and environment (Tu and Xia 2008; Gao and Li 2011; Gilbert and Chakraborty 2011; Clement et al. 2009; Harris and Brunsdon 2010; Zhao et al. 2014; Zhang et al. 2019), and economics (Lu et al. 2011; Lu and Bo 2014; Lee and Schuett 2014; Nilsson 2014; Breetzke and Cohn 2012; Andrew et al. 2015). In the field of metallogenic prediction, Zhang et al. (2018) developed a spatially weighted logistic regression technique where the dependent variable is a binary variable. Liu et al. (2013) quantified the spatial relationships between gold mineralization and plausible controlling factors in the central part of the St Ives area, Western Australia. Zhao et al. (2014) applied geographically weighted regression to identify spatially non-stationary relationships between Fe mineralization and its controlling factors in eastern Tianshan, China.

While GWR shows strength in modeling non-stationary spatial relationships, most of the existing work is limited to two-dimensional (2D) space. Tobler’s first law of geography (Tobler 1979), which argues that “everything is related to everything else, but near things are more related than distant things,” applies equally to 3D geological space. Therefore, the influence of ore-controlling factors on mineralization is multidimensional. In a metallogenic system, the distribution of geological bodies and fluids, as well as temperature, pressure, acidity and alkalinity, and redox conditions, differ greatly with depth. The spatial non-stationarity of mineralization in the vertical direction therefore cannot be ignored, and modelers need to introduce variables that reflect spatial non-stationarity at different depths.

Given that real geological space is a three-dimensional (3D) space, extension of the GWR to 3D will give us a new perspective to explore spatially non-stationary relationships in the determinants of mineralization. In this study, we extend the current GWR model from 2D to 3D. The main advantage of the 3D GWR is that it can simulate the spatial relationships between the mineralization and its determinants in real geological space. The spatial relationships can be quantified and calculated by using the 3D GWR model, and the geological information obtained from the simulation results has three-dimensional attributes. To examine the use of GWR in 3D space to detect spatially non-stationary relationships between mineral concentration and its determinants, we performed a case study of the Dingjiashan Pb–Zn deposit by using the ore grades of Zn and Pb as the dependent variables and potential determinants as explanatory variables in the regression. After investigating the adaptability of GWR in modeling the relationships between the mineralization and its driving factors, we applied the GWR to the 3D ore deposit and made some comparisons with the predictions. Finally, we discussed the degree of non-stationary influence for all the controlling factors on mineralization.

The main original contributions of this study can be summarized as follows:

  • We introduced a new method to analyze quantitatively the spatial relationship among the complex geological factors in real 3D space. This innovation will contribute to the techniques of metallogenic prediction.

  • We extended the GWR model from 2D to 3D space and implemented it in the MATLAB language.

  • We applied the GWR model in 3D geological space to explore the spatial relationships between mineralization and its controlling factors.

Data

Study Area

The approach proposed in this paper was tested on the Dingjiashan Pb–Zn deposit, which is a nonferrous mine and is well known for its polymetallic mineralization. It is located in the northeastern part of the Wuyi–Yunkai fold belt in the eastern region of the South China fold system. The study area measures 880 m from east to west and 730 m from south to north, and has a depth of 375 m based on an elevation change from −75 m to 285 m above sea level. The Pb–Zn ore in this district occurs in one of the largest deposits in eastern China (Fig. 1) (Zhang et al. 2011). The exposed strata are mainly Mesozoic continental magmatic rocks, which are underlain by middle and upper Proterozoic metamorphic rocks. The geological structure of the district includes many folds, faults, and unconformities.

Figure 1

Geological map of the Dingjiashan Pb–Zn deposit (Zhang et al. 2011)

Data and Variables

In this study, we choose mineralization, measured by the ore grades of Zn and Pb, as the dependent variable. Figure 2 shows the three-dimensional spatial distribution of the grades of Zn (a) and Pb (b) ore. The explanatory variables are selected based on a block model of the region, which was built using the Datamine (Changsha, Hunan province, China) software in a geocentric 3D coordinate system. The block model consists of a set of regular blocks, or voxels, each measuring 10 m × 10 m × 10 m. Each voxel carries attributes such as grade and stratigraphic type, and has a value for ore concentration and for each potential explanatory variable (Shao et al. 2010). In this study, the coordinates of each voxel correspond to its center point.

Figure 2

Spatial distribution of ore grade: (a) Zn and (b) Pb

For each voxel that contains samples, the Zn or Pb grade is taken as the length-weighted mean grade of the samples in the voxel:

$$C = \mathop \sum \limits_{i = 1}^{n} C_{i} H_{i} /\mathop \sum \limits_{i = 1}^{n} H_{i} \quad \left( {x_{i} ,y_{i} ,z_{i} } \right) \in v$$

where \(C_{i}\) is the Zn or Pb content of sample \(i\), \(H_{i}\) is the length of sample \(i\), \(\left( {x_{i} ,y_{i} ,z_{i} } \right)\) represents the coordinates of the center point, and \(v\) represents the voxel space. The ore grade of voxels that do not include samples is obtained by kriging interpolation.
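The length-weighted mean above can be illustrated with a minimal sketch. The function name and the sample values are hypothetical and serve only to show the arithmetic; they are not taken from the Dingjiashan data.

```python
def length_weighted_grade(grades, lengths):
    """Return sum(C_i * H_i) / sum(H_i) for the samples inside one voxel."""
    if len(grades) != len(lengths):
        raise ValueError("grades and lengths must have the same length")
    total_length = sum(lengths)
    if total_length <= 0:
        raise ValueError("total sample length must be positive")
    return sum(c * h for c, h in zip(grades, lengths)) / total_length


# Hypothetical voxel holding three samples (grade in %, length in m).
voxel_grade = length_weighted_grade([2.0, 4.0, 1.0], [1.0, 2.0, 1.0])
print(voxel_grade)  # (2*1 + 4*2 + 1*1) / 4 = 2.75
```

Longer samples contribute proportionally more to the voxel grade, which is why the 2 m sample dominates the result here.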

In this study, the explanatory variables were selected to represent the determinants of mineralization. The variables are a series of quantified ore-controlling factors based on various geological conditions and were originally chosen for Pb–Zn deposit metallogenic prognosis in which high accuracy was obtained by comparison with the measured data (Shao et al. 2010; Zhang et al. 2011; Mao et al. 2016). A detailed description of all the variables is given in Table 1.

Table 1 Description of variables

A comprehensive analysis of the distribution of macroscale mineralization and emplacement of ore deposits and orebodies together with information on metallogenic evolution, ore genesis, and source of ore-forming materials shows that the Dingjiashan Pb–Zn deposit is jointly controlled by magmatic activities associated with the emplacement of granitoid rocks, stratigraphic fabrics and lithology, tectonic deformation, and important geological interfaces in the early and late Yanshanian period (Shao et al. 2010). The ore-controlling factors are selected mainly according to the stratigraphy expressed as the quantified factor dZ1L3_Z1L2, unconformity surface structure described by the quantified factors dU, gU, aU_S, waU, and wbU, and folds and buried uplifts of ore-bearing rock series under volcanic caprock, which are represented by the quantified factors waZ1L3_Z1L2 and wbZ1L3_Z1L2. These quantified ore-controlling factors are described as follows (Mao et al. 2016; Shao et al. 2010):

dU, the distance field of the unconformity surface, is measured as the minimum distance from the voxel to the unconformity surface.

waU and wbU, the trend–undulation factors of the unconformity surface, which represent the effect of the undulation on its surrounding geological space, are measured by the Euclidean distance from the voxel to the trend of the nearest unconformity surface. Variable waU is the first-level undulation factor while wbU represents the second-level undulation factor (Mao et al. 2016; Shao et al. 2010).

gU is the slope of the unconformity. A steeply dipping unconformity is the most favorable structure for the late-stage reformation of strata-bound Pb–Zn deposits. As ore bodies on both sides of the steep unconformable structure show a trend of thickening and enrichment, the slope of the unconformity is very important to the mineralization. The variable gU is measured by the slope from the voxel to the nearest place on the unconformity surface.

aU_S is the angle of the unconformity. A large angle of intersection of the unconformity surface with the underlying greenschist strata is a favorable condition for mineralization. The angle of the unconformity is an important factor in metallogenesis, and the unconformity can intersect with multiple strata. The variable aU_S is measured with the closest angle between the voxel and the nearest distance to the unconformity surface.

dZ1L3_Z1L2 is the distance to the stratigraphic interface. The transition zone between the greenschist belt (Z1L3) and the light schist belt (Z1L2) is the most favorable ore-containing position of the strata-bound ore. This kind of ore-controlling factor of the stratum can be described by the distance to the Z1L3_Z1L2 stratigraphic interface. The variable dZ1L3_Z1L2 is measured as the minimum distance from the voxel to the Z1L3_Z1L2 stratigraphic interface. The value is positive for voxels located above the interface and negative for voxels located below it.
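A signed distance factor of this kind might be computed as in the following sketch. It assumes, purely for illustration, that the interface is available as a set of sampled 3D points and that "above" can be decided by comparing the voxel elevation with that of the nearest interface point; the block model used in the study may represent the surface differently. The unsigned distance field dU would correspond to the same calculation without the sign.

```python
import math


def signed_distance_to_interface(voxel_center, interface_points):
    """Minimum distance to the interface; positive above it, negative below."""
    # Nearest sampled interface point (assumed point-cloud representation).
    nearest = min(interface_points, key=lambda p: math.dist(p, voxel_center))
    distance = math.dist(nearest, voxel_center)
    # Sign convention: compare elevations (z) of the voxel and the nearest point.
    return distance if voxel_center[2] >= nearest[2] else -distance


# Hypothetical flat interface at z = 0, sampled every 10 m.
interface = [(float(x), float(y), 0.0)
             for x in range(0, 30, 10) for y in range(0, 30, 10)]
above = signed_distance_to_interface((10.0, 10.0, 5.0), interface)    # 5 m above
below = signed_distance_to_interface((10.0, 10.0, -15.0), interface)  # 15 m below
```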

waZ1L3_Z1L2 and wbZ1L3_Z1L2 are the trend–undulation factors of the stratigraphic interface (Mao et al. 2016; Shao et al. 2010). Folds and concealed uplift of ore rock under the volcanic rock can be expressed by the bending and fluctuating form of the interlayer interface of the ore rock. The variables waZ1L3_Z1L2 and wbZ1L3_Z1L2 represent the first-level and second-level degree of undulation of Z1L3_Z1L2 stratigraphic interface and are measured as the Euclidean distance from the voxel to the trend of the nearest place of the Z1L3_Z1L2 stratigraphic boundary.

Data Analysis Procedure

Five stages of analysis are undertaken in this study. In the first stage, an ordinary least squares (OLS) regression model associating ore grade with eight explanatory variables is calibrated to generate a baseline global set of results and to examine potential multi-collinearity effects among the predictor variables. In the second stage, the GWR model is calibrated by using different kernel functions to explore possible spatial variation in the processes affecting ore concentration. In the third stage, the OLS and GWR results are analyzed and compared, and an examination of spatial dependency in 3D space of the regression residuals from both models is presented. In the fourth stage, a spatial stationarity test is performed and the non-stationarity of different variables is demonstrated. Finally, the influences of the degree of non-stationarity of controlling factors on mineralization are discussed.

Methods

Geographically Weighted Regression in 3D Space

In this paper, OLS regression and GWR were used to investigate the relationships between ore concentration and its determinants. GWR is an extension of the OLS model that allows local parameters to be estimated (Fotheringham et al. 2001, 2002; Brunsdon et al. 2002; Yao and Fotheringham 2015). The standard GWR formulation can be represented as:

$$y_{i} = \beta_{0} \left( {u_{i} ,v_{i} } \right) + \mathop \sum \limits_{j = 1}^{k} \beta_{j} \left( {u_{i} ,v_{i} } \right)x_{ij} + \varepsilon \left( {u_{i} ,v_{i} } \right)\quad i = 1 \ldots n$$
(1)

where \(\left( {u_{i} ,v_{i} } \right)\) is the 2D coordinate location of the ith point, \(y_{i}\) is the value of the dependent variable at point i, \(x_{ij}\) is the value of the variable \(x_{j}\) at point i, \(\beta_{0} \left( {u_{i} ,v_{i} } \right)\) is the constant estimated for point i, \(\beta_{j} \left( {u_{i} ,v_{i} } \right)\) represents the local parameter estimate for independent variable \(x_{j}\) at point i, and \(\varepsilon \left( {u_{i} ,v_{i} } \right)\) is the ith value of a normally distributed error vector with mean equal to zero.

In this paper, this model is extended to three-dimensional space as follows:

$$y_{i} = \beta_{0} \left( {u_{i} ,v_{i} ,w_{i} } \right) + \mathop \sum \limits_{j = 1}^{k} \beta_{j} \left( {u_{i} ,v_{i} ,w_{i} } \right)x_{ij} + \varepsilon \left( {u_{i} ,v_{i} ,w_{i} } \right)\quad i = 1 \ldots n$$
(2)

where \(\left( {u_{i} ,v_{i} ,w_{i} } \right)\) denotes the 3D coordinate location of the ith point.

Parameter estimates in GWR are obtained by weighting all observations around a specific point i using a distance decay function based on their spatial proximity to it. Observations closer to point i have a greater influence on the local parameter estimates for that location and are weighted more heavily than observations located farther from point i. The parameters are estimated from:

$${\hat{\mathbf{\beta }}}\left( {u,v,w} \right) = \left( {\varvec{X}^{T} \varvec{W}\left( {u,v,w} \right)\varvec{X}} \right)^{ - 1} \varvec{X}^{T} \varvec{W}\left( {u,v,w} \right)\varvec{y}$$
(3)

where the bold type denotes a matrix, \({\hat{\mathbf{\beta }}}\left( {u,v,w} \right)\) represents the unbiased estimate of \({\varvec{\upbeta}}\), and \(\varvec{W}\left( {u,v,w} \right)\) is the weighting matrix, which acts to ensure that observations closer to the specific point have a higher weight. It can be determined by a kernel function.
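To make Eq. 3 concrete, the following is a minimal sketch of one local calibration at a single 3D location. It is not the MATLAB implementation used in this study: the function names, the fixed 50 m bandwidth, and the toy data are assumptions chosen for illustration, and the normal equations are solved with a small hand-written Gaussian elimination so that the example needs only the standard library.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular normal equations; widen the bandwidth")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def gwr_local_beta(coords, X, y, target, weight_fn):
    """Estimate beta(u, v, w) = (X^T W X)^-1 X^T W y at one 3D location."""
    w = [weight_fn(math.dist(p, target)) for p in coords]  # diagonal of W
    rows = [[1.0] + list(x) for x in X]                    # add intercept column
    k, n = len(rows[0]), len(rows)
    xtwx = [[sum(w[i] * rows[i][a] * rows[i][b] for i in range(n))
             for b in range(k)] for a in range(k)]
    xtwy = [sum(w[i] * rows[i][a] * y[i] for i in range(n)) for a in range(k)]
    return solve_linear(xtwx, xtwy)


# Toy data (hypothetical): y = 1 + 2x exactly, observed at five 3D points.
coords = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0),
          (0.0, 0.0, 10.0), (10.0, 10.0, 10.0)]
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [1.0, 3.0, 5.0, 7.0, 9.0]


def kernel(d):
    """Bi-square weight with an assumed fixed bandwidth of 50 m."""
    return (1.0 - (d / 50.0) ** 2) ** 2 if d < 50.0 else 0.0


beta = gwr_local_beta(coords, X, y, (0.0, 0.0, 0.0), kernel)
```

Because the toy relationship is exactly linear and identical everywhere, the local estimate recovers an intercept of 1 and a slope of 2 regardless of the weights. With real data, repeating the call at every voxel center yields one coefficient vector per location.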

There are two types of kernels. The fixed kernel assumes that the bandwidth at each regression point is a constant across the study area while the adaptive kernel permits use of a variable bandwidth and can adapt the bandwidth size to variations according to data density. As the data used in this study are unevenly distributed, the following two adaptive kernels are employed:

$$\begin{aligned} {\text{bi-square}}{:}\,w_{ij} & = \left[ {1 - \left( {d_{ij} /b} \right)^{2} } \right]^{2} \,\,{\text{if}}\,\,d_{ij}\,<\,b \\ & = 0\, {\text{otherwise}} \\ \end{aligned}$$
(4)
$$\begin{aligned} {\text{tri-cube}}{:}\,w_{ij} & = \left[ {1 - \left( {d_{ij}^{2} /b^{2} } \right)^{3} } \right]^{3/2} \,\,{\text{if}}\,\,d_{ij}\,<\,b \\ & = 0 \,{\text{otherwise}} \\ \end{aligned}$$
(5)

where \(w_{ij}\) represents the weight of observation \(j\) for point \(i\), \(d_{ij}\) expresses the Euclidean distance between points \(i\) and \(j\), \(N\) is the optimal number of nearest neighbors, and \(b\) is the distance to the \(N\)th nearest neighbor, which governs the decay rate of \(w_{ij}\) and the degree of locality of the regression model. An appropriate number of nearest neighbors can be determined by minimizing the cross-validation (CV) or Akaike information criterion (AIC) scores (Fotheringham et al. 2002).
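The two adaptive kernels can be sketched as follows, transcribing Eqs. 4 and 5 as printed. The function names are hypothetical, and counting the regression point itself among its nearest neighbors (when it coincides with an observation) is an assumption of this sketch.

```python
import math


def adaptive_bandwidth(coords, target, n_neighbors):
    """Return b, the distance from target to its N-th nearest observation."""
    dists = sorted(math.dist(p, target) for p in coords)
    return dists[n_neighbors - 1]


def bisquare(d, b):
    """Eq. 4: [1 - (d/b)^2]^2 inside the bandwidth, 0 otherwise."""
    return (1.0 - (d / b) ** 2) ** 2 if d < b else 0.0


def tricube(d, b):
    """Eq. 5 as printed: [1 - (d^2/b^2)^3]^(3/2) inside the bandwidth, 0 otherwise."""
    return (1.0 - (d * d / (b * b)) ** 3) ** 1.5 if d < b else 0.0


# Hypothetical observations spaced 1 m apart along the x-axis.
coords = [(float(i), 0.0, 0.0) for i in range(5)]
b = adaptive_bandwidth(coords, (0.0, 0.0, 0.0), n_neighbors=5)  # 4.0
weights_bisquare = [bisquare(math.dist(p, coords[0]), b) for p in coords]
weights_tricube = [tricube(math.dist(p, coords[0]), b) for p in coords]
```

In a dense part of the block model the N-th neighbor is close and the kernel is narrow; in a sparse part the same N gives a wider kernel, which is the sense in which the bandwidth adapts to data density.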

Non-stationarity Tests

Stationary Index

In this paper, we calculate the stationary index (Brunsdon et al. 1998) which is designed to measure the spatial non-stationarity of each variable. Values smaller than one indicate stationarity (Brunsdon et al. 2002).

The calculation includes three steps: First, the interquartile range of GWR local parameter estimates for each explanatory variable is computed; second, twice the standard error of the corresponding global estimate is obtained; finally, the ratio of these two quantities is calculated as the stationary index. If the interquartile range is bigger than twice the standard error of the global estimate, it may suggest that the relationship is non-stationary (Brunsdon et al. 2002).
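The three steps can be written as a short sketch. The function name and the numbers are hypothetical, and the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, since the text does not state which convention was used.

```python
import statistics


def stationary_index(local_estimates, global_std_error):
    """Interquartile range of local estimates divided by twice the global SE."""
    q1, _, q3 = statistics.quantiles(local_estimates, n=4, method="inclusive")
    return (q3 - q1) / (2.0 * global_std_error)


# Hypothetical local estimates of one coefficient and its global standard error.
index = stationary_index([1.0, 2.0, 3.0, 4.0, 5.0], global_std_error=0.5)
print(index)  # IQR = 4 - 2 = 2, so 2 / (2 * 0.5) = 2.0, suggesting non-stationarity
```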

Monte Carlo Stationarity Test

The Monte Carlo significance testing procedure employs a pseudo-random number generator to reallocate the observations across the spatial voxels. The Monte Carlo stationarity test is an approach to examine the validity of any inferences drawn from the local results (Fotheringham et al. 2002). The test result depends on the rank of the observed data relative to the random samples.

Given the number of local model calibrations n, the specific process in this paper is as follows (Yao and Fotheringham 2015):

  (1) For each variable, obtain the local parameter estimates and compute the variance of the estimates.

  (2) Rearrange the data randomly and at the same time keep \(y_{i}, x_{i1}, x_{i2}, \ldots, x_{ik}\) together.

  (3) Perform the GWR calculation and compute a new set of local parameter estimates based on the rearranged data.

  (4) For each variable, calculate the variance of the local parameter estimates.

  (5) Repeat steps (2) to (4) n times.

  (6) For each variable, compare the variance of local parameter estimates in step (1) with those from steps (2) to (4), and calculate the p value associated with (1), which is the proportion of variances that lie above that for (1) in a list of variances sorted high to low.
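The procedure above can be sketched as a generic permutation loop. Everything named here is hypothetical: `local_fit` stands for any routine that returns one list of local estimates per variable, and the toy stand-in (a local mean of y over the three nearest locations) is used only to keep the example short; it is not a GWR calibration. Counting the observed variance in its own ranking, which gives p = (count + 1) / (n + 1), is one common convention and is an assumption of this sketch.

```python
import math
import random
import statistics


def monte_carlo_p_values(coords, records, local_fit, n_permutations, seed=0):
    """Permutation test for spatial variability of local parameter estimates.

    coords    : list of (u, v, w) locations
    records   : list of (y_i, x_i1, ..., x_ik) tuples, shuffled as whole units
    local_fit : callable(coords, records) -> one list of local estimates per variable
    """
    rng = random.Random(seed)
    # Step (1): variance of the local estimates on the observed arrangement.
    observed = [statistics.pvariance(est) for est in local_fit(coords, records)]
    exceed = [0] * len(observed)
    for _ in range(n_permutations):          # step (5)
        shuffled = records[:]
        rng.shuffle(shuffled)                # step (2): reassign records to locations
        simulated = [statistics.pvariance(est)
                     for est in local_fit(coords, shuffled)]  # steps (3) and (4)
        for j, var in enumerate(simulated):
            if var >= observed[j]:
                exceed[j] += 1
    # Step (6): share of variances at or above the observed one.
    return [(count + 1) / (n_permutations + 1) for count in exceed]


def toy_local_fit(coords, records):
    """Stand-in for a GWR run: local mean of y over the 3 nearest locations."""
    estimates = []
    for target in coords:
        nearest = sorted(range(len(coords)),
                         key=lambda i: math.dist(coords[i], target))[:3]
        estimates.append(sum(records[i][0] for i in nearest) / 3.0)
    return [estimates]


# Hypothetical data with a strong spatial trend in y along the x-axis.
coords = [(float(i), 0.0, 0.0) for i in range(20)]
records = [(float(i), 0.0) for i in range(20)]
p_values = monte_carlo_p_values(coords, records, toy_local_fit, n_permutations=99)
```

A small p value means that few random rearrangements produce local estimates as variable as those observed, which is evidence against spatial stationarity for that variable.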

Results

Statistical Hypothesis Tests and OLS Diagnosis

Data should be examined before further analysis. The results of statistical hypothesis testing are shown in Tables 2 and 3. All values of the variance inflation factor (VIF) in Table 2 are less than 7.5, which indicates that there is no serious multi-collinearity among the explanatory variables. The statistics for the explanatory variables in Table 2, including the OLS model intercepts, indicate that the regression coefficients are statistically significant at the 95% confidence level, suggesting that all the explanatory variables are important in the regression model. The OLS model diagnostic results are given in Table 3. The joint F-statistic and joint Wald statistic listed in Table 3 indicate that the regression model is significant. The value of the Jarque–Bera statistic in Table 3 shows that the residuals are not normally distributed. The multiple R-squared of the model for Zn grade is 0.150 and that for Pb grade is 0.165. These values indicate that the models explain only about 15% and 16.5% of the variation, respectively, and cannot express the relationship between the distribution of mineralization and its affecting factors very well. The Koenker (BP) statistic in Table 3 indicates that the model has statistically significant heteroscedasticity and/or non-stationarity.

Table 2 Results of significance testing of coefficient variation
Table 3 OLS diagnosis

All of these results show that the OLS model needs to be extended using the GWR model in order to describe better the non-stationary relationship between mineralization and its determinants.

Comparison between GWR and OLS Results

In this study, tricube and bisquare kernel functions of the GWR model are employed in 3D space, and both CV and AIC methods are adopted to determine the optimal nearest neighbors. The simulation calculation is implemented using MATLAB and the Econometrics Toolbox 7.0 (LeSage and Pace 2009) with modifications and extension to 3D made by us.

Comparison of Model Performance Between OLS and GWR

In this study, we measure the performance of models using R-squared (R2) and adjusted R2 values, which show how well the regression model fits the dependent variable (Lee and Schuett 2014). The R2 and adjusted R2 values of GWR and OLS are shown in Table 4. The GWR models clearly perform better than the OLS model. All adjusted R2 values in the GWR models for Zn grade are greater than 0.70, far greater than the adjusted R2 value of 0.15 in the OLS model for Zn grade. Similarly, all adjusted R2 values in the GWR models for Pb grade are much greater than that in the OLS model. It can be concluded that the GWR models provide more specific and reliable information than the OLS model (Table 4). This shows that models considering local differences of factors can achieve higher reliability and, therefore, that spatial non-stationarity between the mineralization and its affecting factors does exist. However, there may be some other factors not considered in our models.
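For reference, the two fit measures can be computed as in the sketch below. The adjusted form shown is the usual OLS one with k explanatory variables; GWR software typically substitutes an effective number of parameters for k, so this is an illustrative assumption rather than the exact formula behind Table 4. The data are hypothetical.

```python
def r_squared(observed, predicted):
    """R2 = 1 - SS_res / SS_tot."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_y) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_vars):
    """Adjusted R2 for n_vars explanatory variables (intercept not counted)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_vars - 1)


# Hypothetical observed and predicted grades.
r2 = r_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])   # about 0.98
adj_r2 = adjusted_r_squared(r2, n_obs=4, n_vars=1)           # about 0.97
```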

Table 4 Performance comparison of different models

Table 4 also shows the GWR results of two different kernel functions. The adjusted R2 values of the GWR model for Zn grade with tricube and bisquare kernels are about 0.71 and 0.74, respectively, and those of the model for Pb grade are 0.68 and 0.77. The bisquare kernel function therefore performs better than the tricube kernel function. These results suggest that the bisquare kernel function better reflects how the influence of the affecting factors attenuates with distance, and illustrate that the spatial non-stationarity between the mineralization and its affecting factors is very prominent. Because the bisquare kernel performs better than the tricube kernel, the former is selected for GWR models in the subsequent analysis.

Comparison of Residuals Obtained by OLS and GWR

Residuals are the differences between the observed y values and the predicted y values. They provide a simple way to detect the spatially varying relationships between mineralization and its related factors for the GWR or OLS model. Figure 3a and b depicts the residual distributions of the OLS and GWR models for Zn grade, and Figure 3c and d depicts the residual distributions of the OLS and GWR models for Pb grade. The residuals derived from the GWR models are relatively small (Fig. 3b and d), indicating better performance in predicting the dependent variable, whereas the larger residuals of the OLS model (Fig. 3a and c) indicate lower performance in predicting the dependent variable.

Figure 3

Residuals of (a) OLS and (b) GWR models for Zn, (c) OLS and (d) GWR models for Pb

Moreover, the spatial distribution of residuals of the GWR models may provide some useful clues about the factors affecting mineralization in the Dingjiashan Pb–Zn deposit. If the residual equals zero, it may indicate that the eight independent variables can fully describe the mineralization; otherwise, it can be inferred that the mineralization might be affected by other geological processes and more factors should be taken into consideration (Zhao et al. 2013).

The sizes of the residuals change with the spatial position. Larger residuals are present in two regions (the red elliptical regions in Fig. 3), which may be another indication of the existence of spatial non-stationarity between the mineralization and the factors that affect it.

Comparison of Spatial Autocorrelations of Residuals Obtained by OLS and GWR

If an OLS model has a spatial autocorrelation problem, GWR can help reduce it. On the other hand, if an OLS model does not have this problem, the application of GWR may increase spatial autocorrelation (Tu and Xia 2008). The global Moran’s I and local Moran’s I of the residuals for both the OLS and GWR models are computed to measure their ability to deal with spatial dependence.

Table 5 shows global Moran’s I statistics on the residuals from the OLS and GWR models. The global Moran’s I of the OLS residuals is 0.0725 for Zn and 0.0650 for Pb, which indicates a slight positive spatial autocorrelation. The Moran’s I of the GWR residuals is 0.0296 for Zn and 0.0286 for Pb, which shows a very weak spatial autocorrelation. The global Moran’s I obtained in the GWR model is less than half of that in the OLS model, which illustrates that GWR models reduce the autocorrelation in comparison with OLS models.
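As a reminder of what the statistic measures, the global Moran's I can be sketched in its standard form. The binary neighbor weights and the toy values below are hypothetical; the spatial weights actually used for the 3D voxel residuals are not specified in the text.

```python
def morans_i(values, weights):
    """Global Moran's I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]               # deviations from the mean
    s0 = sum(sum(row) for row in weights)        # sum of all weights
    num = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * num / den


# Four hypothetical voxels in a row; neighbors share a face (binary weights).
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
trend = morans_i([1.0, 2.0, 3.0, 4.0], w)          # similar neighbors, I = 1/3
alternating = morans_i([1.0, -1.0, 1.0, -1.0], w)  # dissimilar neighbors, I = -1
```

Values near zero, such as those reported for the GWR residuals, indicate that neighboring residuals are close to spatially independent.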

Table 5 Global spatial autocorrelation of residuals

Global autocorrelation is a general description of the whole space, but is only valid for homogeneous space. It becomes unreliable when the spatial process is heterogeneous. Local autocorrelation can address this problem. Table 6 displays the statistical results for local Moran’s I, in which the indices of the OLS models indicate that the distribution is not uniform. Figure 4 shows local Moran’s I distributions in the OLS and GWR models for Zn and Pb. The Moran’s I distributions in the OLS model (Fig. 4a and c) vary through space. Although the global Moran’s I is small, some regions with high local values (e.g., the red elliptical regions in Fig. 4a and c) are consistent with the distribution of ore grade. The spatial distribution of Moran’s I in the GWR model (Fig. 4b and d) is clearly more even than that in the OLS model.

Table 6 Local spatial autocorrelation of residuals
Figure 4

Comparison of local Moran’s I: (a) OLS and (b) GWR models for Zn; (c) OLS and (d) GWR models for Pb

All of the above-described results illustrate that the GWR models represent better the relationships by reducing the spatial autocorrelations in residuals.

Spatially Varying Relationships Between Mineralization and its Determinants

The local R2 values and the values of t tests on the local parameter estimates for the GWR model can be calculated to explore the spatial variability between the mineralization and the controlling factors (Tu and Xia 2008). The local R2 values for GWR change with the spatial position, ranging from − 0.2826 to 0.9992 (Fig. 5), which may be an indication of the existence of spatial variability between the mineralization and the factors that affect it. Higher R2 values are mainly present in two regions (the red elliptical regions in Fig. 5), which are consistent with the mineral concentrations.

Figure 5

Local R2 of GWR models for (a) Zn and (b) Pb

In statistics, the t-statistic is the ratio of the departure of the estimated value of a parameter from its hypothesized value to its standard error. Generally, a t-value that is greater than 1.96 or less than − 1.96 indicates a significant difference at the 95% confidence level. In this study, the t-statistic values for GWR exhibit an obvious spatial variability. The differences are mainly significant in two regions (the red elliptical regions in Fig. 6), and not significant in most of the other sites, suggesting that these controlling factors are less important for mineralization outside of the two concentrated areas.

Figure 6

T-statistic of the parameter estimates (a) dU, (b) waU, (c) wbU, (d) gU, (e) aU_S, (f) dZ1L3_Z1L2, (g) waZ1L3_Z1L2, (h) wbZ1L3_Z1L2

Spatial Stationarity Test

Spatial Stationary Index Test

A spatial stationary index test of local parameter estimates was conducted to determine whether each explanatory variable shows significant geographical variability. The results of the test are presented in Table 7. The values are far bigger than 1, which confirms that the relationships between the mineralization and the eight explanatory variables are not uniform across space, and therefore, these variables should be modeled as local terms.

Table 7 Stationary index of explanatory variables

Monte Carlo Non-stationarity Test

Monte Carlo significance test procedures consist of the comparison of the observed data with random samples generated by the hypothesis being tested (Hope 1968). As a Monte Carlo significance test is rather computationally intensive, we run the test only on the Zn grade model, which has the lower R2 value. In this paper, we treat the practical dataset as the observed data. The randomized dataset is generated by randomly reordering the variable records relative to the coordinates. In Table 8, we report the results of a Monte Carlo test on the local parameter estimates for 1000 random samples. The p values of all the variables are less than 0.05, which indicates that there is significant spatial variation in the local parameter estimates for all the variables.

Table 8 Monte Carlo test for spatial non-stationarity on Dingjiashan practical data

Figure 7 shows the comparison of GWR results between the practical dataset and the randomized dataset. Figure 7a shows GWR simulations on the randomized dataset performed with maximum bandwidths of 100, 300, 2000, and 5900 (simulations 1, 2, 3, and 4, respectively), which generate different results. From the radar chart in Figure 7a, we can see that the calculated optimal bandwidth increases with the maximum bandwidth on the randomized dataset. The optimal bandwidth is almost equal to the given maximum bandwidth on the randomized dataset, while it is fixed on the practical dataset. As the bandwidth tends to the maximum, the local model tends to the global model. We interpret this to mean that the relationships in the randomized dataset are uniform in space while they are non-stationary in the practical dataset.

Figure 7

Monte Carlo simulation results of (a) optimal bandwidth, (b) R-squared, and (c) stationary index

As the maximum bandwidth should be set to n in order to obtain an optimal bandwidth in a GWR simulation, simulation 4 is adopted for comparison in the following analysis. Figure 7b shows the R-squared values for the randomized dataset and the practical dataset. The GWR model fits the practical dataset better, which suggests that the relationships are more strongly non-stationary in the practical dataset than in the randomized dataset. From Figure 7c, we can see that the stationary indexes of all the explanatory variables on the randomized dataset are less than 1, which indicates their spatial stationarity. On the other hand, the indexes of all the explanatory variables are far greater than 1 on the practical dataset, which illustrates their non-stationarity in the practical dataset.

Together, these results reinforce the above conclusion that non-stationarity exists between the mineralization and its controlling factors in the Dingjiashan practical dataset.

Discussion and Future Work

Local parameter estimates of the GWR analysis indicate the relationships between the mineralization and its controlling factors. The results described above demonstrate the existence of non-stationary influence between the mineralization and the ore-controlling factors in the Dingjiashan practical dataset. Here we further discuss the degree of this non-stationary influence. Because of space limitations, we take only the Zn grade model as an example.

Table 9 shows the statistics for the local parameter estimates in the GWR model. The parameter estimates of all the variables vary considerably from negative to positive values, which underlines the non-stationary influence of the ore-controlling factors on mineralization. Figure 8 displays the spatial distribution of each explanatory variable and its parameter estimates. From these data and figures, we observe the following:

Table 9 Statistics of local parameter estimates for the GWR model

Figure 8 Spatial distribution of explanatory variables (a-1) dU, (b-1) waU, (c-1) wbU, (d-1) gU, (e-1) aU_S, (f-1) dZ1L3_Z1L2, (g-1) waZ1L3_Z1L2, (h-1) wbZ1L3_Z1L2 and the parameter estimates (a-2) dU, (b-2) waU, (c-2) wbU, (d-2) gU, (e-2) aU_S, (f-2) dZ1L3_Z1L2, (g-2) waZ1L3_Z1L2, (h-2) wbZ1L3_Z1L2

(1) Unlike the explanatory variables themselves, the parameter estimates of every variable range from negative to positive across space. For most variables, the larger parameter estimates are concentrated in two areas (the red elliptical regions in Fig. 8) that largely coincide with the mineral concentrations. This indicates that the ore-controlling factors have more influence on the mineralization in these two areas.

(2) The parameter estimates also differ in magnitude from one variable to another. Generally, larger values represent a closer relationship and a larger influence. In Table 9, both inner and outer interquartile ranges are used to explore the distribution of the parameter estimates further. The inner interquartile range is the distance from the first to the third quartile and is used to evaluate the less influential non-concentration areas. The outer interquartile range is the summed distance from the minimum to the first quartile and from the third quartile to the maximum, and is used to evaluate the highly concentrated areas.
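The inner and outer interquartile ranges defined above are simple to compute from the local parameter estimates of one variable. The following minimal sketch uses made-up numbers, and the function name `inner_outer_ranges` is hypothetical. It assumes the default linear-interpolation quartiles of NumPy, which may differ slightly from the quartile convention behind Table 9.

```python
import numpy as np

def inner_outer_ranges(estimates):
    """Return (inner, outer) interquartile ranges of a 1D array of local estimates.

    inner = Q3 - Q1
    outer = (Q1 - min) + (max - Q3)
    """
    minimum, q1, q3, maximum = np.percentile(estimates, [0, 25, 75, 100])
    inner = q3 - q1
    outer = (q1 - minimum) + (maximum - q3)
    return inner, outer

# Illustrative local estimates for a single variable (not data from this study).
estimates = np.array([-4.0, -1.0, 0.0, 1.0, 10.0])
inner, outer = inner_outer_ranges(estimates)
print(inner, outer)  # 2.0 12.0
```

By construction the two ranges sum to the full range of the estimates (maximum minus minimum). A large outer range relative to the inner range therefore signals that the extreme estimates, which the text associates with the highly concentrated areas, lie far from the central half of the distribution.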

In the areas of high mineral concentration, the eight ore-controlling factors can be roughly divided into three classes by the outer interquartile range. The first class includes waZ1L3_Z1L2 and wbZ1L3_Z1L2, which represent the trend–undulation of the Z1L3_Z1L2 stratigraphic interface. The second class includes wbU, waU, and aU_S, which represent the trend–undulation of the unconformity and the angle of the unconformity. The third class consists of dU, dZ1L3_Z1L2, and gU, which are the distance to the unconformity, the distance to the Z1L3_Z1L2 stratigraphic interface, and the slope of the unconformity. From these analyses, we conclude that the trend–undulation of the stratigraphic interface has the greatest influence, the trend–undulation and angle of the unconformity come second, and the distance to the stratigraphic interface, the distance to the unconformity, and the slope of the unconformity have the weakest influence.

In the low mineral concentration areas, the ore-controlling factors are classified by the inner interquartile range. The first class includes waZ1L3_Z1L2, wbZ1L3_Z1L2, and wbU; the second class includes waU, dZ1L3_Z1L2, and dU; the third class includes gU and aU_S. This suggests that the trend–undulation of the Z1L3_Z1L2 stratigraphic interface and the second-degree trend–undulation factor of the unconformity have the most influence on the mineralization, while the slope and angle of the unconformity may be less important in low mineral concentration areas. Across the whole Dingjiashan Zn deposit, waZ1L3_Z1L2 and wbZ1L3_Z1L2, the trend–undulation of the Z1L3_Z1L2 stratigraphic interface, consistently have the greatest influence on the mineralization, whereas the slope of the unconformity has the weakest.

The results and discussion presented here improve our understanding of the formation of Zn deposits. Although the GWR model performs better than the other models, it still has deficiencies. In variable selection, other factors not considered here may account for the mineralization more appropriately. In model application, the main weakness lies in the in-depth analysis and reasonable interpretation of the spatial variability of the regression coefficients from the GWR model. An appropriate solution would be to associate the regression coefficients with the metallogenic mechanism.

Future research could enhance the interpolation and prediction abilities of GWR by introducing a kriging method to describe the structure of spatial variation in the GWR weight function. Another approach could focus on improving the predictions through integration with machine learning methods such as discriminant analysis, support vector regression, kernel regression, or neural networks.

Conclusions

Geographically weighted regression, which incorporates spatial location information into the regression model, is more conducive to exploring the interaction of spatial relations within geologically complex regions than ordinary linear regression. In this study, the results of statistical hypothesis tests and OLS fitting reveal that the selected ore-controlling factors are highly statistically significant for the mineralization and that spatial non-stationarity between mineralization and its determinants does exist. The model comparisons show that the GWR model has a better fit and higher prediction accuracy than the OLS model.

Quantitatively understanding the spatially varying relationships between geological factors and mineralization is crucial to metallogenic predictions. The regression coefficients obtained by GWR provide more information for geological interpretation. In this study, the parameter estimates indicate that the most influential controlling factor for the mineralization is the trend of the Z1L3_Z1L2 stratigraphic interface, and the weakest factor is the slope of the unconformity. The results reveal that the influence of the ore-controlling factors on mineralization varies considerably across the whole three-dimensional space, and it is stronger closer to the ore bodies and weaker further away.

In summary, this study described a new case study for the application of GWR in a three-dimensional area of geological significance. The conclusions from this study provide a reference for further research in predictive modeling.