Introduction

A classical problem in geostatistical applications is the prediction of a physical quantity over the whole study region from a finite set of indexed continuous spatial data. This problem involves modeling and estimating the underlying spatial dependence structure of the measured data. Commonly, this is achieved through statistical tools such as computing a variogram or covariogram for the whole study domain under the stationarity assumption. However, the stationarity assumption, which states that the spatial dependence structure is translation invariant over the whole study domain, is more the exception than the rule. It is driven more by mathematical convenience than by reality. In practice, it can be doubtful because of many factors, such as the specific landscape of the study region or other localized effects. These local influences can be revealed by computing local stationary variograms whose characteristics may vary across the study domain. In such cases, a stationary geostatistical approach is not appropriate because it could produce less accurate predictions, including an incorrect assessment of the prediction error.

In this paper, we are interested in the problem of predicting the elevation of the breccia pipe named Braden at the El Teniente mine in Chile. This mine is located approximately 70 km southeast of Santiago on the western margin of the Andean Cordillera and within the confines of the central Chilean porphyry Cu belt. As described by Skewes et al. (2002), Maksaev et al. (2004), and Spencer et al. (2015), the supergiant El Teniente deposit is one of the world’s largest and most complex porphyry-copper ore systems, containing an estimated premining resource of approximately 95 million metric tons Cu and 2.5 million metric tons Mo. The center of the deposit is composed of a late-stage diatreme known as the Braden pipe, which is 1200 m in diameter at the surface and close to 600 m at a depth of 1800 m. The pipe is poorly mineralized and surrounded by different kinds of mineralized geological units. As the edge of the pipe constitutes the limit of the deposit and of the mining operation, predicting it accurately is important. To achieve that, a non-stationary geostatistical method based on space deformation is applied (Fouedjio et al. 2015). Our aim in using a non-stationary geostatistical approach is to provide a more accurate prediction and a more reliable prediction uncertainty than those obtained with a stationary geostatistical approach.

The space deformation approach introduced by Sampson and Guttorp (1992) is one of the most studied non-stationary methods. It involves transforming the study region into a new region where a classical stationary geostatistical approach is more suitable. Data from the study region are mapped into the deformed region, and standard geostatistical techniques for prediction can then be applied. The predicted results are then mapped back into the original region. Some variants of this approach have been proposed by Mardia and Goodall (1993), Smith (1996), Meiring et al. (1997), Perrin and Monestiez (1998), Damian et al. (2001), Schmidt and O’Hagan (2003), Iovleff and Perrin (2004), Vera et al. (2008, 2009), Schmidt et al. (2011), Bornn et al. (2012), and Castro Morales et al. (2013). Some theoretical aspects have been established by Perrin and Meiring (1999), Perrin and Senoussi (2000), Perrin and Meiring (2003), Genton and Perrin (2004), and Porcu et al. (2010). Although this approach was introduced a long time ago, it has not been used much in geostatistical applications because it requires replicated data, whereas many geostatistical applications involve only one measurement at each location. Anderes and Stein (2008) and Anderes and Chatterjee (2009) were the first authors to propose a space deformation approach under the single-realization framework. However, their approach requires very dense data and has not been applied to real datasets. Following the work of Sampson and Guttorp (1992), Fouedjio et al. (2015) propose a space deformation method based on a unique realization. In the spirit of the space deformation approach, non-Euclidean distance-based approaches have been developed. The idea is to define a new (non-Euclidean) distance measure between locations in the study domain in order to better account for non-stationarity.
Since the use of a non-Euclidean distance can invalidate stationary second-order models, locations in the study domain are transformed into a Euclidean space where the Euclidean distance between locations approximates the non-Euclidean distance. Almendral et al. (2008) and Boisvert and Deutsch (2011) define a new distance that honors the study area in terms of varying local anisotropy. However, the calculation of this distance requires that the anisotropy parameters be known at each location of the study area. McBratney and Minasny (2013) also propose a similar approach based on the calculation of a certain geographical distance. Although non-Euclidean distance-based approaches can be regarded as space deformation approaches, they do not provide a second-order non-stationary model.

The rest of the paper is organized as follows. The non-stationary geostatistical method based on space deformation is described in Sect. 2. The results obtained by applying this non-stationary geostatistical approach on the breccia pipe elevation dataset are presented in Sect. 3. Section 4 concludes with some remarks.

Space Deformation Approach

Model

Consider \(Z(.)=\{Z(\mathbf{x}): \mathbf{x} \in G \subseteq \mathbb{R}^p, p\ge 1\}\) a random field defined on a fixed continuous study domain G and reflecting the underlying studied phenomenon. Z(.) is governed by the following equation:

$$\begin{aligned} {} Z(\mathbf{x})=Y(f(\mathbf{x})), \ \forall \mathbf{x} \in G, \end{aligned}$$
(1)

where \(Y(.)=\{Y(\mathbf{u}): \mathbf{u} \in D \subseteq \mathbb{R}^{q},q\ge p\}\) represents an isotropic stationary random field; \(f: G \rightarrow D\) is a deterministic non-linear smooth bijective function from the study domain G onto the deformed domain D. In principle, any \(q\ge p\) is allowed, although most frequently \(q=p\).

The resulting valid non-stationary spatial dependence structure (non-stationary variogram) from Eq. (1) is given by

$$\begin{aligned} {} \gamma (\mathbf{x},\mathbf{y})\equiv \frac{1}{2}\mathbb{V}(Z(\mathbf{x})-Z(\mathbf{y}))=\gamma _0(\Vert f(\mathbf{x})-f(\mathbf{y}) \Vert ), \ \forall (\mathbf{x},\mathbf{y}) \in G \times G, \end{aligned}$$
(2)

where \(\gamma _0(.)\) is the isotropic stationary variogram of Y(.) and \(\Vert .\Vert \) is the Euclidean norm in \(\mathbb{R}^q\).

The space deformation model represented by Eq. (2) allows the use of all valid isotropic stationary variogram models. It thus keeps the simplicity of an isotropic stationary variogram structure. As described in Fouedjio et al. (2015), the deformation function f(.) shrinks the study domain G in areas of relatively high spatial continuity and stretches it in areas of relatively low spatial continuity, such that an isotropic stationary variogram can model the spatial dependence structure in the deformed domain D.
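As a minimal sketch of Eq. (2), the following Python snippet composes a toy deformation with a spherical variogram. The function `toy_deformation` and the unit sill and range are hypothetical choices made only for illustration; they are not the deformation or model estimated in this paper.

```python
import math

def spherical_variogram(h, sill=1.0, range_=1.0):
    """Isotropic spherical variogram gamma_0(h); stands in for any valid model."""
    if h >= range_:
        return sill
    r = h / range_
    return sill * (1.5 * r - 0.5 * r ** 3)

def toy_deformation(x):
    """Hypothetical smooth bijection on [0, inf)^2 (illustration only):
    it stretches the first coordinate more and more away from the origin."""
    return (x[0] + 0.5 * x[0] ** 2, x[1])

def nonstationary_variogram(x, y, f=toy_deformation, gamma0=spherical_variogram):
    """Eq. (2): gamma(x, y) = gamma_0(||f(x) - f(y)||)."""
    return gamma0(math.dist(f(x), f(y)))

# Two pairs with the same separation (0.2) in G but different dissimilarities:
near_origin = nonstationary_variogram((0.0, 0.0), (0.2, 0.0))
far_from_origin = nonstationary_variogram((1.0, 0.0), (1.2, 0.0))
print(near_origin, far_from_origin)
```

In this toy setting, the pair located where the deformation stretches the domain receives the larger variogram value, which is the mechanism by which stretching encodes lower spatial continuity.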

Inference

Let \(\mathbf{Z}={(Z(\mathbf{s}_1),\ldots ,Z(\mathbf{s}_n))}^{\text{T}}\) be an \((n \times 1)\) vector of observations from the random field Z(.) and associated with locations \(\{\mathbf{s}_1,\ldots ,\mathbf{s}_n\}\subset G\). The objective is to estimate the non-stationary spatial dependence structure model depicted by Eq. (2) based on measured data. Since the two parameters f(.) and \(\gamma _0(.)\) in Eq. (2) are unknown, they need to be estimated. This is achieved using a step-by-step estimation procedure proposed by Fouedjio et al. (2015). Firstly, a non-stationary variogram kernel estimator is defined. Secondly, a dissimilarity matrix is built from the non-stationary variogram kernel estimator, and then it is used to construct the deformed domain through the non-metric multidimensional scaling (NMDS) procedure. Thirdly, the estimation of the deformation function f(.) is carried out by interpolating between a set of locations in the study domain G and the estimations of their deformations in the deformed domain D using the class of thin-plate spline radial basis functions. Fourthly, the estimation of \(\gamma _0(.)\) is carried out by calculating the classical experimental variogram on transformed data in the deformed domain D and then by fitting it from a mixture of basic isotropic stationary variogram models.

Step 1

A non-stationary variogram kernel estimator of the model depicted by Eq. (2) is defined as (Fouedjio et al. 2015)

$$\begin{aligned} \widehat{\gamma }(\mathbf{x},\mathbf{y};\lambda ) = \frac{\sum _{i,j=1}^{n}{K_\lambda \left( (\mathbf{x},\mathbf{y}),(\mathbf{s}_{i},\mathbf{s}_{j})\right) \left( Z(\mathbf{s}_{i})-Z(\mathbf{s}_{j})\right) ^{2}}}{2\sum _{i,j=1}^{n}K_\lambda \left( (\mathbf{x},\mathbf{y}),(\mathbf{s}_{i},\mathbf{s}_{j})\right) }{\mathbbm{1}}_{\{\mathbf{x} \ne \mathbf{y}\}}, \quad \forall (\mathbf{x},\mathbf{y}) \in G \times G, \end{aligned}$$
(3)

where \(K_\lambda \left( (\mathbf{x},\mathbf{y}),(\mathbf{s}_{i},\mathbf{s}_{j})\right) =K(\mathbf{x},\mathbf{s}_{i};\lambda )K(\mathbf{y},\mathbf{s}_{j};\lambda )\), with \(K(.,.;\lambda )\) a non-negative, symmetric kernel on \(\mathbb{R}^p \times \mathbb{R}^p\) with bandwidth \(\lambda >0\).

The estimator defined by Eq. (3) is a kernel-weighted local average of squared increments of the regionalized variable. It measures the spatial dissimilarity between two arbitrary locations in the study domain G. The role of the kernel is to weight observations with respect to a reference location so that nearby observations get more weight, while remote observations receive less. Several kernels can be chosen; the most common are (Wand and Jones 1995)

  1. uniform kernel \(K(\mathbf{x},\mathbf{y};\lambda )\propto {\mathbbm{1}}_{\Vert \mathbf{x}-\mathbf{y}\Vert \le \lambda }\);

  2. triangular kernel \(K(\mathbf{x},\mathbf{y};\lambda )\propto (\lambda -\Vert \mathbf{x}-\mathbf{y}\Vert ){\mathbbm{1}}_{\Vert \mathbf{x}-\mathbf{y} \Vert \le \lambda }\);

  3. Epanechnikov kernel \(K(\mathbf{x},\mathbf{y};\lambda )\propto (\lambda ^2 -\Vert \mathbf{x}-\mathbf{y}\Vert ^2){\mathbbm{1}}_{\Vert \mathbf{x}-\mathbf{y} \Vert \le \lambda }\);

  4. Gaussian kernel \(K(\mathbf{x},\mathbf{y};\lambda )\propto \exp (-\frac{1}{2\lambda ^2}{\Vert \mathbf{x}-\mathbf{y}\Vert }^2)\).

Figure 1 shows different kernels and how they distribute the weight of a target location over a region. In this work, our choice is the Epanechnikov kernel, an isotropic kernel with compact support that has optimality properties in density estimation (Wand and Jones 1995). According to Fouedjio et al. (2015), the computational burden of the estimator defined by Eq. (3) is greatly reduced using a compactly supported kernel, as it reduces the number of terms to compute.
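The estimator of Eq. (3) with the Epanechnikov kernel can be sketched in plain Python as follows. The function names and the four-point toy dataset are assumptions made for illustration. The kernel is left unnormalized because any proportionality constant cancels in the ratio of Eq. (3).

```python
def epanechnikov(x, s, lam):
    """Unnormalized Epanechnikov kernel: lam^2 - ||x - s||^2 if ||x - s|| <= lam, else 0."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, s))
    return max(lam ** 2 - d2, 0.0)

def variogram_kernel_estimate(x, y, sites, values, lam, kernel=epanechnikov):
    """Eq. (3): kernel-weighted average of squared increments; 0 when x == y."""
    if tuple(x) == tuple(y):
        return 0.0
    wx = [kernel(x, s, lam) for s in sites]
    wy = [kernel(y, s, lam) for s in sites]
    num = den = 0.0
    for i, wi in enumerate(wx):
        if wi == 0.0:
            continue  # compact support: sites outside the window around x drop out
        for j, wj in enumerate(wy):
            if wj == 0.0:
                continue  # likewise for sites outside the window around y
            w = wi * wj
            num += w * (values[i] - values[j]) ** 2
            den += w
    if den == 0.0:
        raise ValueError("no data pair within the bandwidth; increase lam")
    return num / (2.0 * den)

# Toy data (hypothetical): a linear trend sampled at four sites on a line.
sites = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [0.0, 1.0, 2.0, 3.0]
estimate = variogram_kernel_estimate((0.0, 0.0), (3.0, 0.0), sites, values, lam=1.5)
print(estimate)
```

The two `continue` statements illustrate the computational remark above: with a compactly supported kernel, only the pairs falling inside both windows contribute to the sums.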

Figure 1 Kernel functions: a uniform, b triangular, c Epanechnikov, and d Gaussian

Step 2

Given the non-stationary variogram kernel estimator defined by Eq. (3) in step 1 and a representative set of \(m \ll n\) locations \(\mathbf{X}={[\mathbf{x}_{1},\ldots , \mathbf{x}_{m}]}^{\text{T}}\) referred to as anchor locations over the study domain G, a dissimilarity matrix \(\varvec{\Delta }_{(\lambda ,\omega )}=[\delta _{ij}(\lambda ,\omega )]\) is built as (Fouedjio et al. 2015)

$$\begin{aligned} \varvec{\Delta }_{(\lambda ,\omega )}=\omega {\tilde{\varvec{\Gamma}}_\lambda } + (1-\omega ){\tilde{\mathbf{D}}}, \end{aligned}$$
(4)

where \({\tilde{\varvec{\Gamma}}_\lambda }=\left[ \frac{{\widehat{\gamma }_{ij}}(\lambda )-\min ({\widehat{\gamma }_{ij}}(\lambda ))}{\max ({\widehat{\gamma }_{ij}}(\lambda ))-\min ({\widehat{\gamma }_{ij}}(\lambda ))}\right] , \ {\tilde{\mathbf{D}}}=\left[ \frac{d_{ij}-\min (d_{ij})}{\max (d_{ij})-\min (d_{ij})}\right] \), with \({\widehat{\gamma }_{ij}}(\lambda )\) being the non-stationary variogram estimator defined by Eq. (3) at locations \(\mathbf{x}_i\) and \(\mathbf{x}_j\); \(d_{ij}\) is the Euclidean distance between \(\mathbf{x}_i\) and \(\mathbf{x}_j\); \(\omega \in [0,1]\) is a mixing parameter.
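A minimal sketch of Eq. (4) is given below. It works on the list of anchor pairs \(i<j\) rather than on full matrices, and it assumes that the minimum and maximum are taken over those off-diagonal pairs; both choices, like the function names, are assumptions of this illustration.

```python
def min_max(values):
    """Rescale a list of pairwise values to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; min-max scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]

def composite_dissimilarity(gamma_hat, dist, omega):
    """Eq. (4) on the pairs i < j: omega * scaled variogram + (1 - omega) * scaled distance."""
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must lie in [0, 1]")
    g = min_max(gamma_hat)
    d = min_max(dist)
    return [omega * a + (1.0 - omega) * b for a, b in zip(g, d)]

# Toy values (hypothetical) for three anchor pairs.
delta = composite_dissimilarity([10.0, 30.0, 20.0], [1.0, 2.0, 3.0], omega=0.9)
print(delta)
```

Setting `omega=0` returns the scaled Euclidean distances alone, in which case the dissimilarities carry no information on the regionalized variable.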

The dissimilarity matrix \(\varvec{\Delta }_{(\lambda ,\omega )}\) includes not only dissimilarities observed in the regionalized variable but also spatial proximities. This combination also makes it possible to reduce, via \(\omega \), the risk that the deformation function f(.) is not a bijection. From \(\varvec{\Delta }_{(\lambda ,\omega )}\), we seek a new configuration of q-dimensional locations \(\mathbf{U}={[\mathbf{u}_{1},\ldots , \mathbf{u}_{m}]}^{\text{T}} \subset D\) such that the following relations are satisfied as much as possible:

$$\begin{aligned} \phi (\delta _{ij}(\lambda ,\omega )) \approx \Vert \mathbf{u}_{i}-\mathbf{u}_{j}\Vert \equiv h_{ij}(\mathbf{U}), \end{aligned}$$
(5)

where \(\phi (.)\) is a monotonic function that preserves the rank order of the dissimilarities.

In other words, we look for a configuration of anchor locations \(\mathbf{U}\) in a given dimension space such that the rank order of the configuration distances is consistent with the rank order of the dissimilarities. This is achieved by minimizing the loss function called stress defined as (Fouedjio et al. 2015)

$$\begin{aligned} S_{(\lambda ,\omega )}(\mathbf{U})=\min _{\phi }{\left[ \sum _{i < j}{\frac{{p_{ij}(\lambda )[\phi (\delta _{ij}(\lambda ,\omega ))-h_{ij}(\mathbf{U})]}^2}{\sum _{i < j}{p_{ij}(\lambda )h_{ij}^2(\mathbf{U})}}}\right] }^\frac{1}{2}, \end{aligned}$$
(6)

where \(p_{ij}(\lambda )={\sum _{k,l=1}^{n}K_\lambda \left( (\mathbf{x}_i,\mathbf{x}_j),(\mathbf{s}_{k},\mathbf{s}_{l})\right) }/{\Vert \mathbf{x}_i-\mathbf{x}_j\Vert }\) are weights that account for a variable sampling density over the domain.

The minimization problem described by Eq. (6) is solved using the non-metric multidimensional scaling (NMDS) iterative algorithm of Kruskal and Shepard (Kruskal 1964a). Globally, the method operates as follows. We start from an initial configuration \(\mathbf{U}^{(0)}\), taken to be the anchor locations \(\mathbf{X}\). We seek the \({\phi }(\delta _{ij}(\lambda ,\omega ))\) such that \(\sum _{i < j}{p_{ij}(\lambda )[h_{ij}(\mathbf{U}^{(0)})-\phi (\delta _{ij}(\lambda ,\omega ))]}^2\) is minimum. This problem has a unique solution, given by isotonic regression (Kruskal 1964b). The value of the stress follows. We then modify the configuration via small displacements of locations according to a gradient method to decrease the stress. We return to the isotonic regression step, and so on until convergence. For details on the NMDS algorithm, see Cox and Cox (2000) and Borg and Groenen (2005).
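The inner step of the algorithm, namely the isotonic regression followed by the evaluation of the stress of Eq. (6) for a fixed configuration, can be sketched as follows. This is a hedged illustration and not the implementation used in the paper: the pool-adjacent-violators routine, the function names, and the handling of ties (ties in the dissimilarities keep their input order) are assumptions. The gradient update of the configuration is not shown.

```python
import math

def weighted_isotonic(y, w):
    """Pool-adjacent-violators: weighted least-squares non-decreasing fit to y."""
    blocks = []  # each block is [weighted mean, total weight, number of points]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge backwards while the non-decreasing constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit

def stress(delta, h, p):
    """Eq. (6) for a fixed configuration; delta, h, p are lists over the pairs i < j."""
    order = sorted(range(len(delta)), key=lambda k: delta[k])
    # phi(delta) is the isotonic fit of the distances ordered by dissimilarity.
    phi = weighted_isotonic([h[k] for k in order], [p[k] for k in order])
    num = sum(p[k] * (f - h[k]) ** 2 for k, f in zip(order, phi))
    den = sum(pk * hk ** 2 for pk, hk in zip(p, h))
    return math.sqrt(num / den)

# Toy example (hypothetical): the second and third distances are out of rank order.
value = stress(delta=[1.0, 2.0, 3.0], h=[1.0, 3.0, 2.0], p=[1.0, 1.0, 1.0])
print(value)
```

A configuration whose distances are already in the same rank order as the dissimilarities gives a stress of zero, whatever the actual distance values.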

The stress defined in Eq. (6) assesses the concordance between dissimilarities and corresponding distances. It is invariant under translation, rotation, or rescaling of the configuration. It is normalized and therefore does not depend on the size of the configuration. Kruskal (1964a) proposed an assessment of the goodness of fit of any NMDS solution through different levels of stress values: 0.20 = poor, 0.10 = fair, 0.05 = good, 0.025 = excellent, and 0.00 = perfect. However, this evaluation must be regarded just as an indication of the goodness of fit of an NMDS solution. The goodness of fit of an NMDS solution is visualized in a Shepard diagram, a scatter plot of the dissimilarities against the corresponding distances in the deformed domain, together with the isotonic regression. In most geostatistical applications, the dimension of the study region is two or three. For illustrative purposes, the preferred number of dimensions to be chosen for NMDS is 2 or 3. However, if the results of the deformation are not satisfactory (poor goodness of fit or non-significant difference between original and deformed domains), NMDS in higher dimensions can be considered at the expense of the visualization and the interpretation of the deformed domain.

The construction of the deformed domain depends partly on anchor locations. The set of anchor locations may be chosen as a sparse grid over the study domain or a reduced subset of data locations. The sampling density, which may vary over the domain, can be accounted for by a non-uniform distribution of the anchor locations. It is not necessary to work with a very dense grid of anchor locations, since the dissimilarities calculated for pairs of anchor locations that are very close are largely redundant because of their high correlation. The number of anchor locations is a trade-off between the computing time and the accuracy of the resulting deformation. The computational burden of the NMDS algorithm is roughly proportional to the number of anchor locations. From our experience, 100 to 300 anchor locations can be sufficient to build the deformed domain.

Step 3

The estimation of the deformation function f(.) is carried out by interpolating between the configuration of anchor locations \(\mathbf{X}\) in the study domain G and the estimations of their deformations \(\mathbf{U}\) (step 2) in the domain D using the class of thin-plate spline radial basis functions. Specifically, the thin-plate spline estimator of f(.) is given by

$$\begin{aligned} {\widehat{f}}(\mathbf{x}) = ({\widehat{f}}_{1}(\mathbf{x}),\ldots ,{\widehat{f}}_{q}(\mathbf{x}))^{T} = \mathbf{c} + \mathbf{A}\mathbf{x} + \mathbf{V}^{T}\varvec{\sigma }(\mathbf{x}), \end{aligned}$$
(7)

where \( \mathbf{c} \text{ is } (q\times 1), \mathbf{A} \text{ is } (q\times p), \mathbf{V} \text{ is } (m\times q), \varvec{\sigma }(\mathbf{x})=(\sigma (\mathbf{x}-\mathbf{x}_1),\ldots ,\sigma (\mathbf{x}-\mathbf{x}_m))^{T}\) and \(\mathbf{x}_1,\ldots ,\mathbf{x}_m\) are the anchor locations seen as the centers of the radial basis function \(\sigma (\mathbf{h})={\Vert \mathbf{h} \Vert }^2 \log (\Vert \mathbf{h} \Vert ){\mathbbm{1}}_{\Vert \mathbf{h} \Vert > 0}\).

The parameters \(\mathbf{c}, \mathbf{A}, \) and \(\mathbf{V}\) are determined by solving the system of equations \({\widehat{f}}(\mathbf{X})= \mathbf{U}\) under the constraints \(\mathbf{1}^{\text{T}}\mathbf{V}=0 \text{ and } \mathbf{X}^{\text{T}}\mathbf{V}=0\) (Dryden and Mardia 1998). Specifically, we have

$$\begin{aligned} \varvec{\Sigma } \begin{bmatrix} \mathbf{V} \\ \mathbf{c}^{\text{T}} \\ \mathbf{A} ^{\text{T}} \end{bmatrix} = \begin{bmatrix} \mathbf{U} \\ 0 \\ 0 \end{bmatrix}, \end{aligned}$$
(8)

where \(\varvec{\Sigma }= \begin{bmatrix} \mathbf{S}&\mathbf{1}&\mathbf{X} \\ \mathbf{1}^{\text{T}}&0&0 \\ \mathbf{X}^{\text{T}}&0&0 \end{bmatrix}\) with \(\mathbf{S}={[\sigma (\mathbf{x}_i-\mathbf{x}_j)]}_{i,j=1\ldots m}\).

The matrix \(\varvec{\Sigma }\) is symmetric. It is non-singular provided that the anchor locations are distinct and do not all lie in a common hyperplane of \(\mathbb{R}^p\) (i.e., they are not collinear when \(p=2\)). In this case, its inverse exists and we get

$$\begin{aligned} \begin{bmatrix} \mathbf{V} \\ \mathbf{c}^{\text{T}} \\ \mathbf{A} ^{\text{T}} \end{bmatrix} = \varvec{\Sigma }^{-1} \begin{bmatrix} \mathbf{U} \\ 0 \\ 0 \end{bmatrix}. \end{aligned}$$
(9)
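The system of Eqs. (8) and (9) can be assembled and solved directly, as in the following plain-Python sketch. The helper `solve` (Gauss-Jordan elimination with partial pivoting), the function names, and the five anchor locations are assumptions made for illustration; in practice a linear algebra library would normally replace the hand-written solver.

```python
import math

def tps_sigma(a, b):
    """Radial basis sigma(h) = ||h||^2 log(||h||), with sigma(0) = 0."""
    r = math.dist(a, b)
    return r * r * math.log(r) if r > 0 else 0.0

def solve(A, B):
    """Solve A X = B for a matrix right-hand side B (Gauss-Jordan, partial pivoting)."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < 1e-12:
            raise ValueError("singular system: check the anchor locations")
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [[M[i][n + k] / M[i][i] for k in range(len(B[0]))] for i in range(n)]

def fit_tps(X, U):
    """Return f_hat of Eq. (7) from anchors X (m x p) and their images U (m x q)."""
    m, p, q = len(X), len(X[0]), len(U[0])
    # Block matrix Sigma of Eq. (8): [[S, 1, X], [1^T, 0, 0], [X^T, 0, 0]].
    Sigma = [[tps_sigma(X[i], X[j]) for j in range(m)] + [1.0] + list(X[i])
             for i in range(m)]
    Sigma.append([1.0] * m + [0.0] * (p + 1))
    for k in range(p):
        Sigma.append([X[i][k] for i in range(m)] + [0.0] * (p + 1))
    rhs = [list(u) for u in U] + [[0.0] * q for _ in range(p + 1)]
    coef = solve(Sigma, rhs)
    V, c, At = coef[:m], coef[m], coef[m + 1:]  # At holds A^T, of size (p x q)

    def f_hat(x):
        out = []
        for k in range(q):
            val = c[k] + sum(At[d][k] * x[d] for d in range(p))
            val += sum(V[i][k] * tps_sigma(x, X[i]) for i in range(m))
            out.append(val)
        return tuple(out)

    return f_hat

# Toy anchors (hypothetical) and their images in the deformed domain.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
U = [(0.0, 0.0), (1.2, 0.1), (0.0, 1.0), (1.1, 1.3), (0.6, 0.4)]
f_hat = fit_tps(X, U)
print(f_hat((0.25, 0.75)))
```

By construction, `f_hat` interpolates the anchor images exactly (up to rounding), and it reduces to an affine map when the images are themselves an affine function of the anchors.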

Step 4

The estimation of the isotropic stationary spatial dependence structure \(\gamma _0(.)\) is carried out by calculating the classical experimental variogram of transformed data \({(Y({\widehat{f}}(\mathbf{s}_1)),\ldots ,Y({\widehat{f}}(\mathbf{s}_n)))}^{\text{T}}\) in the deformed domain D and then by fitting it with a combination of theoretical isotropic stationary variograms. We use a robust method developed by Desassis and Renard (2012), which automatically finds a model that fits the experimental variogram. From a linear combination of some authorized basic structures, a numerical algorithm is used to estimate a parsimonious model that minimizes a weighted distance between the model and the experimental variogram.

Specifically, given the experimental variogram of transformed data \(\{\hat{\gamma }_0(\Vert \mathbf{h}_j\Vert ), \mathbf{h}_j\in \mathbb{R}^p, j=1,\ldots ,J \}\) and a family of parametric basic structures (normalized) \(\{\gamma _1^{(\theta _1)},\ldots ,\gamma _K^{(\theta _K)}\}\), the goal is to find a linear combination \(\gamma _{0}^{(\varvec{\Psi })}(\Vert \mathbf{h} \Vert )=\sum _{k=1}^{K}\beta _k\gamma _k^{(\theta _k)}(\Vert \mathbf{h} \Vert )\) with positive coefficients such that

$$\begin{aligned} S(\varvec{\Psi })=\frac{1}{2}\sum _{j=1}^J\omega _j{\left( \gamma _{0}^{(\varvec{\Psi })}(\Vert \mathbf{h}_j\Vert )-\hat{\gamma }_0(\Vert \mathbf{h}_j\Vert )\right) }^2, \end{aligned}$$
(10)

is minimal for the vector of parameters \(\varvec{\Psi }=(\theta _1,\ldots ,\theta _K,\beta _1,\ldots ,\beta _K)\), with \(\{\omega _j, j=1,\ldots ,J\}\) a set of weights, e.g., \(\omega _j=N_j/\Vert \mathbf{h}_j\Vert \), where \(N_j\) is the number of pairs used in the computation of \(\hat{\gamma }_0(\Vert \mathbf{h}_j\Vert )\).
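The objective of Eq. (10) is straightforward to evaluate for a candidate nested model, as the following sketch shows. It covers only the objective function, not the automatic fitting algorithm of Desassis and Renard (2012). The parameterization of the exponential structure by a scale parameter, the function names, and the synthetic experimental variogram are assumptions of this illustration.

```python
import math

def exponential(h, a):
    """Normalized exponential structure; a is a scale parameter (assumed convention)."""
    return 1.0 - math.exp(-h / a)

def spherical(h, a):
    """Normalized spherical structure with range a."""
    if h >= a:
        return 1.0
    r = h / a
    return 1.5 * r - 0.5 * r ** 3

def nested_model(h, betas, thetas, structures):
    """Linear combination sum_k beta_k * gamma_k^(theta_k)(h)."""
    return sum(b * g(h, t) for b, t, g in zip(betas, thetas, structures))

def wls_objective(lags, gamma_exp, n_pairs, betas, thetas, structures):
    """Eq. (10) with weights omega_j = N_j / ||h_j||."""
    total = 0.0
    for h, g, n in zip(lags, gamma_exp, n_pairs):
        resid = nested_model(h, betas, thetas, structures) - g
        total += (n / h) * resid ** 2
    return 0.5 * total

# Synthetic experimental variogram (hypothetical) generated from a known model.
structures = [exponential, spherical]
lags = [25.0, 50.0, 100.0, 200.0, 400.0]
n_pairs = [120, 260, 480, 700, 650]
true_betas, true_thetas = [2.0, 5.0], [40.0, 150.0]
gamma_exp = [nested_model(h, true_betas, true_thetas, structures) for h in lags]

at_truth = wls_objective(lags, gamma_exp, n_pairs, true_betas, true_thetas, structures)
perturbed = wls_objective(lags, gamma_exp, n_pairs, [2.5, 4.0], [60.0, 120.0], structures)
print(at_truth, perturbed)
```

The weights \(N_j/\Vert \mathbf{h}_j\Vert \) give more influence to short lags computed from many pairs, which are the lags that matter most for kriging.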

Kriging

Since \(Z(.)=Y(f(.))\), kriging with unknown constant mean (ordinary kriging) of the non-stationary random field Z(.) in the study domain G can be transposed to the isotropic stationary random field Y(.) in the deformed domain D where standard stationary geostatistical techniques already exist (Chilès and Delfiner 2012). Given measured data \(\mathbf{Z}={(Z(\mathbf{s}_1),\ldots ,Z(\mathbf{s}_n))}^{\text{T}}\), the prediction of Z(.) at unsampled location \(\mathbf{s}_0 \in G\) is given by the ordinary kriging estimator:

$$\begin{aligned} \widehat{Z}(\mathbf{s}_{0})=\widehat{Y}(\mathbf{u}_{0})=\sum _{i=1}^{n}\alpha _{i}{(\mathbf{u}_{0})}Y(\mathbf{u}_{i}), \end{aligned}$$
(11)

which minimizes the mean square error \(\mathbb{E}{(\widehat{Z}(\mathbf{s}_{0})-Z(\mathbf{s}_{0}))}^2\) under the constraint \(\sum _{i=1}^{n}\alpha _{i}{(\mathbf{u}_{0})}=1\), where \(\mathbf{u}_{i}=\widehat{f}(\mathbf{s}_{i}), Y(\mathbf{u}_{i})=Z(\mathbf{s}_{i}), \ i=0,\ldots ,n\) represents the transformed data. The kriging weights \({[\alpha _{i}(\mathbf{u}_{0})]}_{i=1,\ldots ,n}\) are computed by solving the well-known ordinary kriging system in the stationary framework (Chilès and Delfiner 2012).

Note that, compared with the stationary approach, the action of the deformation function modifies the conventional kriging system through the following items: (i) the distance between a predicted location and a sampled location is changed; (ii) the geometric configuration of locations is modified, especially the support of the block model; and (iii) the spatial dependence structure of the regionalization is changed.

To predict the non-stationary random field Z(.) on target grid locations, we can proceed as follows:

  1. obtain the image of the target grid locations and data locations through \(\widehat{f}(.)\);

  2. krige the transformed target grid locations from \(\widehat{\gamma }_0(.)\) and transformed data locations;

  3. obtain the kriging on the target grid locations by simple correspondence.
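The three steps above can be sketched as follows, with ordinary kriging written in its variogram form. The deformation `toy_f_hat`, the variogram `toy_gamma0`, the hand-written solver, and the four data values are all hypothetical stand-ins for the estimated \(\widehat{f}(.)\), \(\widehat{\gamma }_0(.)\), and the measured data.

```python
import math

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ordinary_kriging(u_data, z_data, u0, gamma0):
    """Ordinary kriging in the deformed domain; returns (prediction, kriging variance)."""
    n = len(u_data)
    A = [[gamma0(math.dist(u_data[i], u_data[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    A.append([1.0] * n + [0.0])            # unbiasedness constraint: weights sum to one
    b = [gamma0(math.dist(u, u0)) for u in u_data] + [1.0]
    sol = solve(A, b)
    alpha, mu = sol[:n], sol[n]            # kriging weights and Lagrange multiplier
    pred = sum(a * z for a, z in zip(alpha, z_data))
    var = sum(a * g for a, g in zip(alpha, b[:n])) + mu
    return pred, var

def predict_nonstationary(s_data, z_data, s0, f_hat, gamma0):
    """Step 1: map data and target through f_hat; steps 2 and 3: krige and map back."""
    u_data = [f_hat(s) for s in s_data]
    return ordinary_kriging(u_data, z_data, f_hat(s0), gamma0)

def toy_f_hat(s):
    """Hypothetical deformation standing in for the estimated one."""
    return (s[0] + 0.5 * s[0] ** 2, s[1])

def toy_gamma0(h, sill=1.0, range_=2.0):
    """Hypothetical spherical variogram in the deformed domain."""
    if h >= range_:
        return sill
    r = h / range_
    return sill * (1.5 * r - 0.5 * r ** 3)

s_data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
z_data = [1.0, 2.0, 3.0, 4.0]
pred, var = predict_nonstationary(s_data, z_data, (0.5, 0.5), toy_f_hat, toy_gamma0)
print(pred, var)
```

Step 3 requires no computation here: the value kriged at \(\widehat{f}(\mathbf{s}_0)\) is simply assigned to \(\mathbf{s}_0\).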

The space deformation method relies on two hyper-parameters \((\lambda ,\omega )\) used in the computation of the dissimilarity matrix \(\varvec{\Delta }_{(\lambda ,\omega )}\). Since the estimation of the spatial dependence structure is rarely a goal per se but rather an intermediate step before kriging, the hyper-parameters are selected by a data-driven method that chooses the hyper-parameter values minimizing the cross-validation mean square prediction error (Fouedjio et al. 2015):

$$\begin{aligned} MSPE(\lambda ,\omega ) = \frac{1}{n}\sum _{i=1}^{n}{\left( Z(\mathbf{s}_{i}) - \widehat{Z}_{-i}(\mathbf{s}_{i};\lambda ,\omega )\right) }^2, \end{aligned}$$
(12)

where \(\widehat{Z}_{-i}(\mathbf{s}_{i};\lambda ,\omega )\) is the kriging at location \(\mathbf{s}_{i}\) using all measured data excluding \(\{Z(\mathbf{s}_{i})\}\).
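The selection rule of Eq. (12) amounts to a grid search over \((\lambda ,\omega )\), which the following sketch illustrates. The callback `toy_predict` is a purely hypothetical stand-in (a kernel smoother blended with the global mean) and is not the space deformation predictor: in the actual method, each candidate pair requires rebuilding the deformation and the variogram before kriging. The candidate grids and data are likewise invented for illustration.

```python
import math

def loo_mspe(sites, values, predict, lam, omega):
    """Eq. (12): leave-one-out mean square prediction error for one (lam, omega)."""
    n = len(sites)
    total = 0.0
    for i in range(n):
        train_s = sites[:i] + sites[i + 1:]
        train_z = values[:i] + values[i + 1:]
        total += (values[i] - predict(train_s, train_z, sites[i], lam, omega)) ** 2
    return total / n

def select_hyperparameters(sites, values, predict, lams, omegas):
    """Return the (lam, omega) pair of the grid with the smallest leave-one-out MSPE."""
    grid = [(lam, om) for lam in lams for om in omegas]
    return min(grid, key=lambda pair: loo_mspe(sites, values, predict, *pair))

def toy_predict(train_s, train_z, target, lam, omega):
    """Hypothetical predictor used only to make the sketch runnable."""
    w = [math.exp(-0.5 * (math.dist(s, target) / lam) ** 2) for s in train_s]
    local = sum(wi * z for wi, z in zip(w, train_z)) / sum(w)
    overall = sum(train_z) / len(train_z)
    return omega * local + (1.0 - omega) * overall

sites = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
values = [1.0, 2.0, 4.0, 1.5, 2.5, 4.5]
lams, omegas = [0.5, 1.0, 2.0], [0.0, 0.5, 1.0]
best = select_hyperparameters(sites, values, toy_predict, lams, omegas)
print(best, loo_mspe(sites, values, toy_predict, *best))
```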

Application to Breccia Pipe Elevation

Data Description

The non-stationary geostatistical approach based on space deformation described in Sect. 2 is applied to the elevation data of the breccia pipe at El Teniente Mine, Chile. The representation of the data in Figure 2a shows a circular configuration, with high values located at the margins of the mine, whereas the central part exhibits low values. The data are denser at the margin of the domain and less dense at the center. The dataset contains \(n=816\) measurements divided into a training set (\(n_1=616\) measurements) and a validation set (\(n_2=200\) measurements) as shown in Figure 2b. The training set serves to calibrate the model, and the validation set serves to assess the prediction performance. Kriging with unknown constant mean (ordinary kriging) under the stationary and non-stationary approaches is compared on the validation dataset. Summary statistics of training, validation, and whole data are given in Table 1. The distribution of data values is slightly skewed, as the histogram and boxplot show, with values ranging from 1429 to 2906 m, a mean of 2392 m, and a median of 2401 m (Figure 2c, d). The data present some outliers corresponding to the lowest values, which are located at the center of the domain.

Figure 2 Elevation data, El Teniente Mine, Chile

Table 1 Summary statistics of measured elevation data, El Teniente Mine, Chile

Exploring Evidence of Variogram Non-stationarity

To look for evidence of non-stationarity in the underlying spatial dependence structure of the observed data, a local stationary variogram (Lloyd 2010) is computed at some locations across the domain under study (Figure 3). There is clear evidence of variogram non-stationarity, as the variographic parameters (sill and range) vary spatially. Specifically, we can see that locations 1, 2, and 3 (East margin area) form an area of relatively high spatial variability or low spatial correlation, while locations 4, 5, and 6 (West margin area) constitute an area of relatively low spatial variability or high spatial correlation. Indeed, the latter area has a long range (~300 m) and low variance (~45,000 m\(^2\)) compared to the former area, which has a small range (~200 m) and high variance (~60,000 m\(^2\)). This difference between the two sub-areas may be related to lithologic conditions. Thus, a non-stationary geostatistical approach based on space deformation is appropriate because its aim is to deal with this type of non-stationarity. Indeed, it operates by contracting the study domain in areas of relatively high spatial continuity and by stretching it in areas of relatively low spatial continuity, such that an isotropic stationary variogram is suitable to model the spatial dependence structure in the deformed domain.

Figure 3 The study domain: a measured data and b–d local stationary variograms at some locations

Space Deformation Results

Figure 4a and c shows, respectively, the data locations in the study domain G and their image in the deformed domain D. The deformed domain is constructed using only a reduced set of 179 anchor locations, as presented in Figure 4a (black crosses), instead of all data points, which reduces the computational burden. We observe that the deformation shrinks the study domain in the West margin region and stretches it in the East margin region. This means that these regions correspond, respectively, to areas of relatively high and low spatial correlation (or low and high spatial variability), thereby confirming the result obtained during the exploratory analysis of non-stationarity in Sect. 3.2.

Figure 4b and d presents the variograms corresponding, respectively, to an isotropic stationary model in the study and deformed domains. The two stationary variograms are quite different. Indeed, by deforming the study domain, more correlated data locations become closer, while less correlated data locations become more distant. The range of the stationary variogram in the deformed domain is larger than the range of the stationary variogram in the original domain. The difference between the sills is small. We note that the small nugget effect component present in the stationary variogram in the original domain is absent in the stationary variogram in the deformed domain. Indeed, the nugget effect component is present in the region of relatively high spatial variability (East margin region). Thus, when the deformation stretches this region, the nugget effect component vanishes. The non-stationary modeling by space deformation and the stationary one lead, respectively, to the following models:

$$\begin{aligned} {\widehat{\gamma }}_{0}(\Vert \mathbf{h}\Vert )= & {} 9947\times \text{ Exp }_{96}(\Vert \mathbf{h}\Vert )+ 32{,}757\times \text{ Sph }_{323}(\Vert \mathbf{h}\Vert ),\end{aligned}$$
(13)
$$\begin{aligned} {\widehat{\gamma }}_{1}(\Vert \mathbf{h}\Vert )= & {} 3050\times \text{ Nug }(\Vert \mathbf{h}\Vert ) + 40{,}999\times \text{ Sph }_{181}(\Vert \mathbf{h}\Vert ), \end{aligned}$$
(14)

where \({\widehat{\gamma }}_{0}(.)\) is a nested isotropic stationary variogram (exponential and spherical) with total variance 42,704 m\(^2\) and \({\widehat{\gamma }}_{1}(.)\) is a nested isotropic stationary variogram (small nugget effect and spherical) with total variance 44,049 m\(^2\).
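The fitted models of Eqs. (13) and (14) can be evaluated numerically, which also confirms the total variances as the sums of the component sills. The sketch below assumes that the subscripts 96, 323, and 181 are a scale parameter for the exponential structure and ranges for the spherical structures; this parameterization, in particular the convention for the exponential structure, is an assumption, since it is not spelled out in the text.

```python
import math

def exponential(h, a):
    """Normalized exponential structure; a is read as a scale parameter (assumption)."""
    return 1.0 - math.exp(-h / a)

def spherical(h, a):
    """Normalized spherical structure with range a."""
    if h >= a:
        return 1.0
    r = h / a
    return 1.5 * r - 0.5 * r ** 3

def nugget(h):
    """Nugget effect: 0 at the origin, 1 elsewhere."""
    return 1.0 if h > 0 else 0.0

def gamma0_hat(h):
    """Eq. (13): model in the deformed domain (units: m^2)."""
    return 9947 * exponential(h, 96) + 32757 * spherical(h, 323)

def gamma1_hat(h):
    """Eq. (14): stationary model in the original domain (units: m^2)."""
    return 3050 * nugget(h) + 40999 * spherical(h, 181)

sill0 = 9947 + 32757   # total variance of Eq. (13)
sill1 = 3050 + 40999   # total variance of Eq. (14)
print(sill0, sill1, gamma0_hat(100.0), gamma1_hat(100.0))
```

The sill of the spherical structure is reached exactly at its range, whereas the exponential structure approaches its sill only asymptotically, so `gamma0_hat` stays slightly below 42,704 at any finite distance.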

For the hyper-parameter selection presented in Sect. 2.3, Figure 5a and b shows, respectively, the mean square prediction error for cross-validation and external validation. The optimum values in cross-validation correspond to \(\lambda =1446\) m and \(\omega =0.90\). These optimum values are consistent with those given by the external validation. The Shepard diagram of the NMDS algorithm associated with the deformed domain is shown in Figure 6. The corresponding value of stress is equal to 8 %.

A visualization of the variogram at certain points (with all other points) through the level contours for estimated stationary and non-stationary models is shown in Figure 7. We can see how the non-stationary spatial dependence structure changes from one place to another as compared to the stationary one. The region of high spatial continuity (West margin area) has a long-radius contour level (long range), while the region of low spatial continuity (East margin area) has a small-radius contour level (small range). This difference between the estimated stationary and non-stationary models will reflect on both the prediction accuracy and the prediction uncertainty accuracy (Sect. 3.4).

Figure 4 Study domain G: a data locations and anchor locations and b the estimated isotropic stationary variogram. Deformed domain D: c data locations and d the estimated isotropic stationary variogram

Figure 5 Hyper-parameter selection through the mean square prediction error \(MSPE(\lambda ,\omega )\) for a cross-validation and b external validation

Figure 6 Shepard diagram of the NMDS algorithm corresponding to the deformed domain

Figure 7 Variogram level contours at a few points for a the estimated stationary model and b the estimated non-stationary model. Level contours correspond to the values: 10,000 m\(^2\) (black), 20,000 m\(^2\) (red), and 30,000 m\(^2\) (green)

Model Assessment

To assess the predictive performance of the space deformation non-stationary geostatistical approach, an external validation procedure is adopted: the regionalized variable is predicted at the 200 validation data locations. Some well-known prediction performance criteria are considered: mean absolute error (MAE), root mean square error (RMSE), normalized mean square error (NMSE), logarithmic score (LogS), and continuous ranked probability score (CRPS). If \(\widehat{Z}(\mathbf{s}_{i})\) denotes the kriging at a validation data location \(\mathbf{s}_{i}\) computed from all training data and \(\hat{\sigma }^2(\mathbf{s}_i)\) the corresponding kriging variance, we have

$$\begin{aligned}&\displaystyle {{\text{MAE}}=\frac{1}{n_2}\sum _{i=1}^{n_2}\left| \hat{Z}(\mathbf{s}_i)-Z(\mathbf{s}_i)\right| }, \displaystyle {{\text{RMSE}}=\left[ {\frac{1}{n_2}\sum _{i=1}^{n_2}{\left( \hat{Z}(\mathbf{s}_i)-Z(\mathbf{s}_i)\right) }^2}\right] }^\frac{1}{2},\\&\displaystyle {{\text{NMSE}}=\frac{1}{n_2}\sum _{i=1}^{n_2}{\left( \frac{\hat{Z}(\mathbf{s}_i)-Z(\mathbf{s}_i)}{\hat{\sigma }(\mathbf{s}_i)}\right) }^2}, \displaystyle {{\text{LogS}}=\frac{1}{n_2}\sum _{i=1}^{n_2}\left( \frac{1}{2}\log \left( 2\pi \hat{\sigma }^2(\mathbf{s}_i)\right) +\frac{1}{2}{\left( \frac{\hat{Z}(\mathbf{s}_i)-Z(\mathbf{s}_i)}{\hat{\sigma }(\mathbf{s}_i)}\right) }^2\right) }\\&\displaystyle {{\text{CRPS}}=\frac{1}{n_2}\sum _{i=1}^{n_2}\int _{-\infty }^{+\infty }{\left( F_i(z)-\mathbbm{1}\{Z(\mathbf{s}_i)\le z\}\right) }^2dz},\ F_i(z)=P(Z(\mathbf{s}_i)\le z| \text{ training } \text{ data }). \end{aligned}$$

For MAE, RMSE, LogS, and CRPS, the smaller the better; for NMSE, the nearer to one the better. Prediction accuracy is measured through the MAE and RMSE criteria. The NMSE, LogS, and CRPS scores take into account both the prediction and the prediction variance, and thus allow the accuracy of the prediction uncertainty to be assessed. The MAE, RMSE, and NMSE criteria do not depend on the distribution of the measured data. In the Gaussian framework, the LogS score corresponds to the negative log pseudo-likelihood. The CRPS criterion corresponds to the distance between the distribution function of the predicted variable and the measured data (itself expressed as a distribution function). It is generally calculated in the Gaussian setting, where it admits a closed-form expression. Although the LogS and CRPS scores are usually calculated in the Gaussian context, they are quite robust. The Gaussian-type 95 % confidence interval is also calculated at each validation location (i.e., using \(\hat{Z}(\mathbf{s}_i) \pm 1.96\hat{\sigma }(\mathbf{s}_i) \)), and the proportion of validation locations where this interval actually includes the true value is computed (PCI). This proportion should be near 95 % if the uncertainty is modeled accurately. The correlation between true and predicted values (Rho) is also computed; the closer to one the better. A description of these different goodness-of-fit measures is given, for example, in Chilès and Delfiner (2012), Zhang and Wang (2010), and Gneiting and Raftery (2007).
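The criteria above can be computed directly from the validation values, the kriging predictions, and the kriging standard deviations. The following is a minimal sketch in Python with NumPy, not the code used in this study. The function name `validation_scores` and its argument names are hypothetical. The CRPS is evaluated with the Gaussian closed form given by Gneiting and Raftery (2007), which assumes that the predictive distribution \(F_i\) is Gaussian with mean \(\hat{Z}(\mathbf{s}_i)\) and standard deviation \(\hat{\sigma }(\mathbf{s}_i)\).

```python
import math
import numpy as np


def validation_scores(z_true, z_pred, sd_pred):
    """Validation criteria for kriging predictions (illustrative sketch).

    z_true  : measured values at the validation locations
    z_pred  : kriging predictions at the same locations
    sd_pred : kriging standard deviations (must be > 0)
    """
    z_true = np.asarray(z_true, dtype=float)
    z_pred = np.asarray(z_pred, dtype=float)
    sd_pred = np.asarray(sd_pred, dtype=float)

    err = z_pred - z_true          # prediction errors
    u = err / sd_pred              # standardized errors

    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    nmse = np.mean(u ** 2)
    logs = np.mean(0.5 * np.log(2.0 * np.pi * sd_pred ** 2) + 0.5 * u ** 2)

    # Gaussian closed form of the CRPS (assumption: Gaussian predictive law):
    # sd * [ w (2 Phi(w) - 1) + 2 phi(w) - 1/sqrt(pi) ], w = (z_true - z_pred) / sd
    w = -u
    big_phi = 0.5 * (1.0 + np.vectorize(math.erf)(w / math.sqrt(2.0)))
    small_phi = np.exp(-0.5 * w ** 2) / math.sqrt(2.0 * math.pi)
    crps = np.mean(
        sd_pred * (w * (2.0 * big_phi - 1.0) + 2.0 * small_phi - 1.0 / math.sqrt(math.pi))
    )

    # Proportion of true values inside the Gaussian-type 95 % interval
    pci = np.mean(np.abs(err) <= 1.96 * sd_pred)

    # Correlation between true and predicted values
    rho = np.corrcoef(z_true, z_pred)[0, 1]

    return {"MAE": mae, "RMSE": rmse, "NMSE": nmse, "LogS": logs,
            "CRPS": crps, "PCI": pci, "Rho": rho}
```

Calling `validation_scores(z_val, z_krig, sd_krig)` with three arrays of equal length returns a dictionary holding the seven criteria; the array names here are placeholders for the user's own validation data.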

Scatterplots of predicted values versus measured values for the stationary and non-stationary approaches are presented in Figure 8. The comparison shows that the space deformation approach provides a more accurate prediction. This is evidenced by a reduced mean absolute error (MAE) and root mean square error (RMSE), and by an increased correlation coefficient between true and predicted values (Rho), as Table 2 shows. The cost of not using the non-stationary approach in this case is substantial: on average, the prediction at validation locations is about 20 % better for the non-stationary approach than for the stationary approach in terms of RMSE. The reliability of the prediction variances, measured through the NMSE, LogS, CRPS, and PCI criteria (Table 2), shows that the space deformation approach models the uncertainty more accurately than the stationary approach. Regarding the proportion of validation locations included in the 95 % confidence interval, the space deformation approach leaves 12 locations outside (94 % of locations included in the interval), while the stationary approach leaves 16 locations outside (92 % of locations included in the interval), as shown in Figure 8 and reported in Table 2 (we expect about \(200 \times 0.05=10\) locations outside). Specifically, the stationary approach has more difficulty than the non-stationary approach in predicting the lowest values (located at the center).

Figure 8

Scatterplots of predicted versus measured values for (a) the stationary approach and (b) the non-stationary approach

Table 2 Validation statistics of the prediction performance

The kriged values and kriging standard deviations of the stationary and non-stationary methods are shown in Figure 9. The overall appearance of the predicted values associated with each method differs notably (Fig. 9a and c). This difference is particularly marked at the center of the domain, where there are not enough data locations. The stationary and non-stationary methods also differ sharply in describing the uncertainty associated with the predictions (Fig. 9b and d). The space deformation approach provides low prediction standard deviations in the area of low spatial variability or high spatial correlation (West margin area), while it gives high prediction standard deviations in the area of high spatial variability or low spatial correlation (East margin area). Thus, its prediction standard deviations reflect not only the configuration and availability of samples around the prediction locations, but also the local variability. In contrast, the map of kriging standard deviations for the stationary approach shows only slight differences over the margin areas of the mine, and these differences depend on the sampling intensity. Such a pattern is expected, as the stationary approach assumes the same variogram model over the whole area. The space deformation method takes into account the local characteristics of the regionalization (spatially varying range and variance) that the stationary method is unable to capture. This feature allows the space deformation approach to outperform the stationary approach in terms of prediction accuracy and reduced uncertainty.

Figure 9

Predictions and prediction standard deviations based on (a, b) the estimated stationary model and (c, d) the estimated non-stationary model

Concluding Remarks

This paper demonstrated the added value of using the generic non-stationary geostatistical method based on space deformation to predict the elevation of the Braden breccia pipe at the El Teniente mine, Chile. In this case study, the non-stationary approach provided more accurate predictions and a more reliable assessment of the prediction uncertainty than the stationary approach. The non-stationary geostatistical method based on space deformation integrates local characteristics of the regionalization that the stationary approach is unable to capture. Moreover, as an exploratory tool for non-stationarity, it makes it possible to identify areas of high and low spatial continuity, giving a better understanding of the spatial behavior of the regionalized variable of interest. This approach can potentially bring major improvements to decision-making procedures such as the delineation of areas.

Like any non-stationary geostatistical approach, the space deformation method requires enough data to capture the non-stationarity properly, and it is computationally intensive compared to a stationary method. Where there are enough data to allow reliable inference, it outperforms a stationary method in terms of prediction accuracy and prediction uncertainty accuracy, as in this case study. Furthermore, it works well only for smoothly varying non-stationarity. It can therefore be difficult to apply to sparse data or to data with abrupt variations in spatial structure. In such cases, it may be advisable to proceed under a stationary framework.