1 Introduction

Covariance functions are central to many disciplines, including spatial statistics (Cressie 1993; Chilés and Delfiner 2012; Hristopulos 2020), stochastic processes (Porcu et al. 2018a, b), machine learning (Schaback and Wendland 2006; James et al. 2013; Barp et al. 2022), numerical analysis (Pazouki and Schaback 2011; Cockayne et al. 2019), and stochastic mechanics (Ostoja-Starzewski 2006, with the references therein). Recent applications in climatology (Guinness and Hammerling 2018; Edwards et al. 2019), oceanography (Furrer et al. 2007; Di Lorenzo et al. 2014), environmental sciences (Cressie and Kornak 2003; Stein 2007), and natural resources engineering (Chen et al. 2018; Emery and Séguret 2020) evidence the importance of covariance functions.

The covariance function is customarily assumed to depend on the distance between any pair of random variables located at two different points in the input space. Such an assumption is referred to as isotropy in spatial statistics and machine learning, and it is known as radial symmetry in other areas of applied mathematics. The behavior of the covariance function at short and long distances (we call these the local and global properties, respectively) is crucial to understanding the properties of random processes with a given covariance function. Specifically, the local properties are related to both the fractal dimension and the geometric properties (e.g., mean square differentiability) of the associated random process, as well as to its sample paths. On the other hand, the global behavior of the covariance function allows one to characterize persistence or anti-persistence (i.e., the long-term behavior) of the associated process. Another global behavior of great interest is the so-called hole effect, which means that the covariance function can take negative values in a certain interval.

Finding parametric families of isotropic covariance functions that allow us to index both local and global behavior is a major challenge that has been addressed to a very limited extent. The Matérn family has been the cornerstone in spatial statistics for over half a century now (Stein 1999). Its popularity is due to a parameter that controls the degree of mean square differentiability and fractal dimension of the corresponding random field (Stein 1999). Recently, Bevilacqua et al. (2022) showed that the Matérn class is a special case of a richer class of models that, in addition to indexing local properties, make it possible to switch between compact and global supports. In turn, compactly supported models lead to sparse covariance matrices (Furrer et al. 2006; Kaufman et al. 2008), and this implies considerable computational gains in both estimation and prediction. Unfortunately, the Matérn class does not allow one to index global behavior of the associated random process. The generalized Cauchy family (Gneiting and Schlather 2004) allows for indexing of the fractal dimension and the long memory behavior, that is, it allows for power-law tail behavior in the covariance function, and this is reflected in the so-called Hurst parameter (Berg et al. 2008). Notably, it does not allow one to index mean square differentiability, as the model is either non-differentiable or infinitely differentiable at the origin. The same properties are shared by the Dagum model (Berg et al. 2008), which also does not allow one to index mean square differentiability. None of the aforementioned models allow one to obtain negative spatial dependencies.

Spectral approaches can be a promising avenue for finding flexible families of covariance functions. Laga and Kleiber (2017) proposed a modified version of the spectral density associated with the Matérn family. The new class has two additional parameters that can be loosely interpreted as a continuous version of a moving average process. More recently, Ma and Bhadra (2022) proved that a twofold application of Gaussian scale mixtures can provide models with polynomial decays while preserving the local properties of the candidate covariance function. Other nonconventional properties of covariance functions have been studied by Alegría (2020) and Alegría et al. (2021), who proposed some modified scale mixture representations to obtain classes of cross-covariance functions with non-monotonic behavior (the so-called cross-dimple effect) for vector-valued random fields. Schlather and Moreva (2017) derived models that enable a smooth transition between stationary and intrinsically stationary Gaussian random fields.

All the previously mentioned parametric classes of covariance functions admit a scale mixture representation of a Gaussian kernel against a continuous, positive and bounded measure. Our paper starts from the Schoenberg integral representation of isotropic covariance functions on \({\mathbb {R}}^d\) (Schoenberg 1938), for all natural numbers, d. We specifically assume the Schoenberg measures to be parametric families of measures that are defined piecewise. Such a strategy is then shown to provide hybrid classes that generalize classes proposed in earlier literature. We illustrate this methodology by constructing a model that combines the global attributes of the Cauchy class and the local properties of the Matérn class. We show that the proposed model admits a closed-form expression and examine its theoretical properties. Additionally, we study a more flexible formulation where the Gaussian kernel involved in the scale mixture is replaced with a covariance kernel that is also defined piecewise. Following this approach, we derive a hybrid model with local behavior of Matérn type, and global behavior that allows for covariance functions with negative values. We conduct numerical experiments with both simulated and real data in order to assess the statistical performance of the proposed models.

The remainder of the article is organized as follows. Section 2 presents a concise review of random fields and covariance functions coming from scale mixtures. Section 3 discusses general methods for building hybrid covariance models. The hybrid Cauchy–Matérn and the hybrid hole-effect–Matérn classes are then derived. Section 4 guides the reader through some numerical studies. Finally, a critical discussion is presented in Sect. 5, including a description of technical extensions of the present work such as the multivariate case where covariance functions are matrix-valued, and the case of spherically indexed fields where isotropy is defined in terms of the geodesic distance.

2 Background

Let \(\{Z(\textbf{s}): \textbf{s}\in {\mathbb {R}}^d\}\) be a (centered) second-order stationary Gaussian random field on \({\mathbb {R}}^d\). Such a field is completely characterized by its covariance function (or kernel). The isotropy of the covariance function is defined through a mapping \(\varphi :[0,\infty ) \rightarrow {\mathbb {R}}\) such that \(\text {cov}[ Z(\textbf{s}),Z(\textbf{s}')] = \varphi (h),\) for every \(\textbf{s}, \textbf{s}' \in {\mathbb {R}}^d\), where \(h = \Vert \textbf{s} - \textbf{s}'\Vert \). The covariance function must satisfy the positive (semi)-definiteness condition: for any \(k\in {\mathbb {N}}\), \(\{a_1,\ldots ,a_k\}\subset {\mathbb {R}}\) and \(\{\textbf{s}_1,\ldots ,\textbf{s}_k\}\subset {\mathbb {R}}^d\),

$$\begin{aligned} \sum _{i,j=1}^k a_ia_j\varphi (\Vert \textbf{s}_i-\textbf{s}_j\Vert ) \ge 0. \end{aligned}$$
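As a minimal numerical sketch of this condition, the snippet below evaluates the quadratic form for the exponential covariance \(\exp (-h/\alpha )\) at randomly placed sites with random weights. The helper names (`exponential_cov`, `quadratic_form`), the number of sites, and the seed are illustrative assumptions. A random check of this kind can only falsify positive semidefiniteness; it does not prove it.

```python
import math
import random

def exponential_cov(h, alpha=1.0):
    """Exponential covariance exp(-h / alpha), a valid model in every dimension."""
    return math.exp(-h / alpha)

def quadratic_form(points, weights, cov):
    """Evaluate sum_{i,j} a_i a_j cov(||s_i - s_j||)."""
    total = 0.0
    for s_i, a_i in zip(points, weights):
        for s_j, a_j in zip(points, weights):
            total += a_i * a_j * cov(math.dist(s_i, s_j))
    return total

random.seed(0)  # arbitrary seed, used only to make the sketch reproducible
points = [(random.random(), random.random()) for _ in range(20)]  # sites in R^2

# Smallest value of the quadratic form over 200 random weight vectors.
smallest = min(
    quadratic_form(points, [random.uniform(-1.0, 1.0) for _ in points], exponential_cov)
    for _ in range(200)
)
```

For a valid covariance function, `smallest` should be nonnegative up to rounding error.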

We use the notation \(\varphi (\cdot ; \varvec{\lambda })\) for a parametric family of continuous covariance functions, where \(\varvec{\lambda }\in {\mathbb {R}}^p\) is a vector of parameters. Further, we make use of the celebrated Schoenberg theorem (Schoenberg 1938), whereby the functions \(\varphi \) that are valid in any dimension \(d\in {\mathbb {N}}\) are uniquely written as Gaussian scale mixtures of positive and bounded measures, that is,

$$\begin{aligned} \varphi (h; \varvec{\lambda }) = \int _0^\infty \exp (-uh^2) G(\textrm{d} u;\varvec{\lambda }), \qquad h \ge 0, \end{aligned}$$

where \(\{ G(\textrm{d}\cdot ; \varvec{\lambda }), \; \varvec{\lambda } \in {\mathbb {R}}^p \}\) is a parametric family of measures, which are termed Schoenberg measures in Daley and Porcu (2014). Most of the covariance classes listed in the introduction admit such a representation against a measure that is absolutely continuous with respect to the Lebesgue measure, that is,

$$\begin{aligned} \varphi (h; \varvec{\lambda }) = \int _0^\infty \exp (-uh^2) g(u;\varvec{\lambda }) \text {d}u, \qquad h \ge 0, \end{aligned}$$
(2.1)

for \(\{ g(\cdot ; \varvec{\lambda }), \; \varvec{\lambda } \in {\mathbb {R}}^p \}\) a parametric family of nonnegative functions. Throughout, we call g the mixing function.

We now describe examples of some parametric classes of functions \(\varphi \) that are determined according to (2.1). Special attention is devoted to the Matérn, Cauchy, and generalized Cauchy models. Other examples, including the stable and generalized hyperbolic models, can be found in Yaglom (1987), Barndorff-Nielsen (1978), Schlather (2010), and Porcu et al. (2018b).

Example 1

(Matérn) This class of covariance functions is defined as (Matérn 1986)

$$\begin{aligned} \varphi _{{\mathscr {M}}}\left( h; \varvec{\lambda } \right) = \frac{2^{1-\nu }}{\varGamma (\nu )} (h/\alpha )^\nu K_\nu (h /\alpha ), \qquad h \ge 0, \end{aligned}$$
(2.2)

where \(\varGamma \) is the gamma function and \(K_\nu \) is the modified Bessel function of the second kind (Abramowitz and Stegun 1972). Here, \(\varvec{\lambda }=[\alpha ,\nu ]^{\top }\), with \(\alpha \) and \(\nu \) being positive parameters that control the scale (the rate of decay of the covariance in terms of h) and shape of (2.2), respectively. More precisely, \(\nu \) regulates the degree of mean square differentiability of the random field (larger values of \(\nu \) are associated with smoother sample paths) (Stein 1999). When \(\varvec{\lambda }=[\alpha ,1/2]^{\top }\), (2.2) simplifies into the exponential model, \(\exp (- h / \alpha )\). On the other hand, as \(\nu \rightarrow \infty \), a reparameterization of (2.2) tends to the Gaussian covariance function, defined as \(\exp ( - h^2/\alpha )\).
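The two half-integer special cases can be checked numerically. The sketch below evaluates (2.2) with \(K_\nu \) approximated from the classical integral representation \(K_\nu (x) = \int _0^\infty \exp (-x\cosh t)\cosh (\nu t)\,\text {d}t\), \(x>0\), using a composite Simpson rule. The function names, the truncation point `upper`, and the grid size are assumptions made for illustration; the truncation is only adequate when \(x\) is not very close to zero.

```python
import math

def bessel_k(nu, x, upper=12.0, n=4000):
    """Approximate K_nu(x), x > 0, from
    K_nu(x) = int_0^inf exp(-x cosh t) cosh(nu t) dt (composite Simpson rule).
    The truncation at `upper` is an illustrative choice for moderate x."""
    step = upper / n
    total = 0.0
    for i in range(n + 1):
        t = i * step
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * step / 3.0

def matern(h, alpha, nu):
    """Matern correlation (2.2) with scale alpha > 0 and smoothness nu > 0."""
    if h == 0.0:
        return 1.0  # limiting value at the origin
    x = h / alpha
    return 2.0 ** (1.0 - nu) / math.gamma(nu) * x ** nu * bessel_k(nu, x)

h, alpha = 0.7, 0.5  # illustrative values, so that h / alpha = 1.4
exponential_case = matern(h, alpha, 0.5)   # should match exp(-h / alpha)
smoother_case = matern(h, alpha, 1.5)      # should match (1 + h/alpha) exp(-h/alpha)
```

The second comparison uses the well-known closed form of the Matérn model for \(\nu = 3/2\).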

Example 2

(Cauchy) This class of covariance functions is given by (Chilés and Delfiner 2012)

$$\begin{aligned} \varphi _{{\mathscr {C}}}\left( h; \varvec{\lambda }\right) = \left( 1 + h^2/\alpha \right) ^{-\nu /2}, \qquad h \ge 0, \end{aligned}$$
(2.3)

with \(\varvec{\lambda }=[\alpha ,\nu ]^{\top }\). As in the Matérn model, \(\alpha >0\) is a scale parameter. However, unlike the Matérn model, which decays exponentially with distance (Stein 1999), (2.3) has a polynomial decay regulated by \(\nu >0\). When \(\nu \in (0,2)\), such a polynomial decay is connected with the Hurst parameter, a measure of long-term memory, given by \(H = 1-\nu /2\).
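The polynomial decay and its link with the Hurst parameter can be illustrated with a short sketch. The helper names and the parameter values below are assumptions chosen for illustration; the log-log slope of (2.3) between two large distances should be close to \(-\nu \).

```python
import math

def cauchy(h, alpha, nu):
    """Cauchy correlation (2.3): (1 + h^2 / alpha)^(-nu / 2)."""
    return (1.0 + h * h / alpha) ** (-nu / 2.0)

def hurst_parameter(nu):
    """H = 1 - nu / 2, meaningful for nu in (0, 2)."""
    if not 0.0 < nu < 2.0:
        raise ValueError("the Hurst interpretation requires 0 < nu < 2")
    return 1.0 - nu / 2.0

def loglog_slope(f, h_near, h_far):
    """Slope of log f(h) against log h between two distances."""
    return (math.log(f(h_far)) - math.log(f(h_near))) / (
        math.log(h_far) - math.log(h_near)
    )

alpha, nu = 0.125, 0.75  # illustrative values
slope = loglog_slope(lambda h: cauchy(h, alpha, nu), 1e3, 1e4)  # close to -nu
H = hurst_parameter(nu)
```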

Example 3

(Generalized Cauchy) This class of covariance functions is defined as (Gneiting and Schlather, 2004 and references therein)

$$\begin{aligned} \varphi _{{{\mathscr {G}}}{{\mathscr {C}}}}\left( h; \varvec{\lambda }\right) = \left( 1 + h^\delta /\alpha \right) ^{-\nu /\delta }, \qquad h \ge 0, \end{aligned}$$
(2.4)

with \(\varvec{\lambda }=[\alpha ,\nu ,\delta ]^{\top }\), where \(\delta \in (0,2]\), \(\alpha >0\), and \(\nu >0\). This generalized class preserves the polynomial decay of (2.3) but is more flexible in the sense that the fractal dimension can be arbitrarily regulated through \(\delta \) (see Gneiting and Schlather 2004 for details). Perhaps surprisingly, this model does not allow one to control the mean square differentiability of the respective random field, as the model is either non-differentiable or infinitely differentiable at the origin.

We note that a unified version of the function g that encompasses the three cases altogether can be found in equation (4) of Porcu et al. (2018b).

Additional classes of covariance functions can be obtained from the more general mixture

$$\begin{aligned} \varphi (h; \varvec{\lambda },\varvec{\vartheta })= \int _0^\infty \phi (h; u,\varvec{\vartheta }) g(u;\varvec{\lambda }) \text {d}u, \qquad h \ge 0, \end{aligned}$$
(2.5)

where \(\phi (\cdot ; u,\varvec{\vartheta })\) is an arbitrary covariance kernel for every \(u>0\), and \(\varvec{\vartheta }\) is a vector of parameters. Since positive definiteness is preserved under products, linear combinations with nonnegative weights, and limits (see, e.g., Chilés and Delfiner 2012, p. 62), if \(\phi \) is valid (positive definite) in \({\mathbb {R}}^d\) for \(d \le d'\) for some \(d'\in {\mathbb {N}}\), then \(\varphi \) is valid in \({\mathbb {R}}^d\) for \(d \le d'\) as well. We refer the reader to Emery and Lantuéjoul (2006) for several explicit examples.

3 Hybrid Classes of Covariance Functions

3.1 General Construction

In this study, we propose new parametric classes of isotropic covariance functions, \({\widetilde{\varphi }}(\cdot ; \varvec{\lambda },\varvec{\omega },\varvec{\xi })\), determined according to

$$\begin{aligned} {\widetilde{\varphi }}(h; \varvec{\lambda },\varvec{\omega },\varvec{\xi }) = \omega _1 \int _0^{\xi _1} \exp (-uh^2) {g}_1(u; \varvec{\lambda }_1) \text {d}u + \omega _2 \int _{\xi _2}^\infty \exp (-uh^2) {g}_2(u;\varvec{\lambda }_2) \text {d}u,\nonumber \\ \end{aligned}$$
(3.1)

where \(g_1\) and \(g_2\) are nonnegative functions on \([0,\xi _1)\) and \([\xi _2,\infty )\), respectively, and \(\varvec{\lambda } = [\varvec{\lambda }_1^\top ,\varvec{\lambda }_2^\top ]^\top \), \(\varvec{\omega } = [\omega _1,\omega _2]^\top \) and \(\varvec{\xi } = [\xi _1,\xi _{2}]^\top \) are vectors of parameters, with \(\omega _i, \xi _i>0\) for \(i=1,2\). In other words, we replace the mixing function, g, in Eq. (2.1) with a function \({\widetilde{g}}\) that is defined piecewise as

$$\begin{aligned} {\widetilde{g}}(u;\varvec{\lambda },\varvec{\omega },\varvec{\xi }) = \omega _1 \, g_1(u;\varvec{\lambda }_1) 1_{[0,\xi _1)}(u) + \omega _2 \, g_2(u;\varvec{\lambda }_2) 1_{[\xi _2,\infty )}(u), \qquad u \ge 0,\qquad \end{aligned}$$
(3.2)

with \(1_A(\cdot )\) standing for the indicator function of a set A. Note that \({\widetilde{g}}\) may have discontinuities as it is built by gluing two individual pieces. If the functions \(g_i\) are continuous and bounded on their domains, a direct application of the dominated convergence theorem (which allows us to exchange limit with integral) implies that the proposed covariance function (3.1) is continuous on \([0,\infty )\). Throughout this manuscript, each function \(g_i\) is positively proportional to a continuous probability density function. Hence, the parametric family proposed in Eq. (3.1) belongs to the Schoenberg class as defined through Eq. (2.1).

A more general construction considers different kernels in each segment of the mixture

$$\begin{aligned} {\widetilde{\varphi }}(h; \varvec{\lambda },\varvec{\omega },\varvec{\xi },\varvec{\vartheta })= & {} \omega _1 \int _0^{\xi _1} \phi _1(h; u, \varvec{\vartheta }_1) {g}_1(u; \varvec{\lambda }_1) \text {d}u\nonumber \\ {}{} & {} + \omega _2 \int _{\xi _2}^\infty \phi _2(h; u, \varvec{\vartheta }_2) {g}_2(u;\varvec{\lambda }_2) \text {d}u, \end{aligned}$$
(3.3)

where \(\varvec{\vartheta } =[\varvec{\vartheta }^\top _1,\varvec{\vartheta }^\top _2]^\top \). If \(\phi _i\) is a valid covariance function in \({\mathbb {R}}^d\) for \(d\le d'_i\), for some \(d_i'\in {\mathbb {N}}\), \(i=1,2\), then (3.3) is a valid model in \({\mathbb {R}}^d\) for \(d\le \text {min}(d'_1,d'_2)\). The validity (positive definiteness) of such a construction is guaranteed by the fact that positive definite functions are closed under linear combinations with nonnegative weights. The continuity of (3.3) can be justified by following the same arguments used for the continuity of (3.1).

Remark 1

Let us point out some additional remarks on this methodology:

  (1) When \(\xi _1 = \xi _2 = \xi \), this parameter creates a continuous bridge between two apparently disunited models. Indeed, as it goes from 0 to \(\infty \), we gradually go from \(\omega _2 \int _{0}^\infty \phi _2(h; u, \varvec{\vartheta }_2) {g}_2(u;\varvec{\lambda }_2) \text {d}u\) to \(\omega _1 \int _{0}^\infty \phi _1(h; u, \varvec{\vartheta }_1) {g}_1(u;\varvec{\lambda }_1) \text {d}u.\) We will use the term marginal models to refer to these limit models.

  (2) When \(\xi _1>\xi _2\), instead, there is a superposition of the marginal structures in the interval \([\xi _2,\xi _1)\). As \(\xi _2\rightarrow 0\) and \(\xi _1\rightarrow \infty \), we obtain the greatest possible superposition, which corresponds to a linear combination of the marginal models, \(\omega _1 \int _{0}^\infty \phi _1(h; u, \varvec{\vartheta }_1) {g}_1(u;\varvec{\lambda }_1) \text {d}u + \omega _2 \int _{0}^\infty \phi _2(h; u, \varvec{\vartheta }_2) {g}_2(u;\varvec{\lambda }_2) \text {d}u.\)

While the spectral density is not required throughout the manuscript, it is worth noting that explicit expressions for it can be derived depending on the functions \(g_i\), leveraging the fact that \({\widetilde{\varphi }}\) is written as a scale mixture and making use of Fubini’s theorem. The apparent flexibility of the proposed mixtures is justified by classical theory on local and global behavior of covariance functions. In particular, classical results from probability theory (see Stein 1999, for instance) prove that the local properties of the covariance functions (hence the differentiability at the origin) are uniquely determined by the tails of the function \(g_2\). On the other hand, direct inspection in concert with equation (4) in Gneiting and Schlather (2004) shows that the behavior of \({\widetilde{\varphi }}\) at long distances is determined by \(g_1\). The proofs of the main results below will elaborate on these aspects. The next sections also show that it is possible to provide examples in algebraically closed form that allow one to attain the desired flexibility.

3.2 A Hybrid Cauchy–Matérn Class

We present a hybrid Cauchy–Matérn model, for which the acronym \(\mathcal{C}\mathcal{M}\) is used. This model is a special case of (3.1). Let us first introduce the generalized incomplete gamma function (Chaudhry and Zubair 1994),

$$\begin{aligned} \varGamma (a;b;c)=\int _{b}^{\infty } t^{a-1} \exp (-t-c t^{-1}) \, \text {d}t, \end{aligned}$$

and the lower incomplete gamma function, \(\gamma (a,b) = \varGamma (a;0;0)-\varGamma (a;b;0)\).
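For readers who wish to evaluate these special functions without a dedicated library, the following sketch approximates \(\varGamma (a;b;c)\) by a composite Simpson rule after the change of variable \(t=s^2\). The function names, the truncation width, and the restriction \(a \ge 1/2\) are assumptions of this illustration, not part of the definitions above.

```python
import math

def gen_inc_gamma(a, b, c, n=20000, width=8.0):
    """Approximate Gamma(a; b; c) = int_b^inf t^(a-1) exp(-t - c/t) dt.
    Uses t = s^2 and truncates at sqrt(b) + width, where exp(-s^2) is negligible.
    Illustrative sketch: assumes a >= 1/2, b >= 0 and c >= 0."""
    lower = math.sqrt(b)
    step = width / n

    def integrand(s):
        if s == 0.0:
            # Limit of 2 s^(2a-1) exp(-s^2 - c/s^2) as s -> 0 under the assumptions.
            return 2.0 if (c == 0.0 and a == 0.5) else 0.0
        return 2.0 * s ** (2.0 * a - 1.0) * math.exp(-s * s - c / (s * s))

    total = 0.0
    for i in range(n + 1):
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * integrand(lower + i * step)
    return total * step / 3.0

def lower_inc_gamma(a, b):
    """gamma(a, b) = Gamma(a; 0; 0) - Gamma(a; b; 0)."""
    return gen_inc_gamma(a, 0.0, 0.0) - gen_inc_gamma(a, b, 0.0)

complete = gen_inc_gamma(1.5, 0.0, 0.0)   # should match math.gamma(1.5)
partial = lower_inc_gamma(1.0, 0.7)       # should match 1 - exp(-0.7)
```

With \(b=c=0\) the function reduces to the ordinary gamma function, which gives a quick sanity check.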

Proposition 1

Let \(\varvec{\lambda } = {[}\varvec{\lambda }_1^\top ,\varvec{\lambda }_2^\top ]^\top \), where \(\varvec{\lambda }_i={[}\alpha _i,\nu _i]^\top \), \(\varvec{\omega } = {[}\omega _1,\omega _2]^\top \), and \(\varvec{\xi } = [{\xi }_1,\xi _2]^\top \) are vectors having positive elements. Let

$$\begin{aligned} {\widetilde{\varphi }}_{{{\mathscr {C}}}{{\mathscr {M}}}}(h; \varvec{\lambda },\varvec{\omega },\varvec{\xi }) = \omega _1 \, {\widetilde{\varphi }}^{\, (1)}_{{\mathscr {C}}}(h; \varvec{\lambda }_1,\xi _1) + \omega _2 \, {\widetilde{\varphi }}^{\, (2)}_{{\mathscr {M}}}(h; \varvec{\lambda }_2,\xi _2), \qquad h \ge 0, \end{aligned}$$
(3.4)

where

$$\begin{aligned} {\widetilde{\varphi }}^{\, (1)}_{{\mathscr {C}}}(h; \varvec{\lambda }_1,\xi _1) = \frac{\gamma (\nu _1/2,(h^2+\alpha _1)\xi _1)}{\varGamma (\nu _1/2)} \varphi _{{\mathscr {C}}}(h; \varvec{\lambda }_1) \end{aligned}$$
(3.5)

and

$$\begin{aligned} {\widetilde{\varphi }}^{\, (2)}_{{\mathscr {M}}}(h; \varvec{\lambda }_2,{\xi }_2) = \varphi _{{\mathscr {M}}}(h; \varvec{\lambda }_2) - \frac{1}{\varGamma (\nu _2)} \varGamma \left( \nu _2; \frac{1}{4\xi _2 \alpha _2^2}; \frac{h^2}{4 \alpha _2^2}\right) , \end{aligned}$$
(3.6)

where \(\varphi _{{{{\mathscr {M}}}}}\) and \(\varphi _{{{{\mathscr {C}}}}}\) are the Matérn and Cauchy models defined at (2.2) and (2.3), respectively. Then, \({\widetilde{\varphi }}_{{{\mathscr {C}}}{{\mathscr {M}}}}\) is positive definite in \({\mathbb {R}}^d\) for all \(d\in {\mathbb {N}}\).

Proof

We provide a proof of the constructive type by showing that \({\widetilde{\varphi }}_{\mathcal{C}\mathcal{M}}\) admits the representation (3.1), with \(g_1(u;\varvec{\lambda }_1) = g_{{\mathscr {C}}}(u;\varvec{\lambda }_1)\) and \(g_2(u;\varvec{\lambda }_2) = g_{{\mathscr {M}}}(u;\varvec{\lambda }_2)\), in which \(g_{{{{\mathscr {C}}}}}\) and \(g_{{{{\mathscr {M}}}}}\) are respectively defined as

$$\begin{aligned} g_{{\mathscr {C}}}(u;\varvec{\lambda }_1) = \frac{\alpha _1^{\nu _1/2}}{\varGamma (\nu _1/2)} u^{\nu _1/2-1}\exp \left( -\alpha _1 u\right) , \end{aligned}$$
(3.7)

and

$$\begin{aligned} g_{{\mathscr {M}}}(u;\varvec{\lambda }_2) = \frac{1}{\varGamma (\nu _2)} \left( \frac{1}{2 \alpha _2} \right) ^{2\nu _2} u^{-\nu _2-1} \exp \left( -\frac{1}{4u \alpha _2^2}\right) , \end{aligned}$$
(3.8)

where for both cases all the parameters are positive. To obtain the analytical expression of \({\widetilde{\varphi }}_{{{{\mathscr {C}}}}}^{(1)}\), we note that

$$\begin{aligned} \int _{0}^{\xi _1} \exp (- u h^2) {g}_{{\mathscr {C}}}(u;\varvec{\lambda }_1) \text {d}u= & {} \varphi _{{\mathscr {C}}}(h; \varvec{\lambda }_1) \int _{0}^{\xi _1} \frac{(h^2+\alpha _1)^{\nu _1/2}}{\varGamma (\nu _1/2)} u^{\nu _1/2-1}\exp (-(h^2+\alpha _1) u) \text {d}u \\= & {} \varphi _{{\mathscr {C}}}(h; \varvec{\lambda }_1) \frac{\gamma (\nu _1/2,(h^2+\alpha _1)\xi _1)}{\varGamma (\nu _1/2)}, \end{aligned}$$

where the second equality is due to the fact that the integrand on the right-hand side of the first line is the probability density function of a gamma random variable with rate \(h^2+\alpha _1\) and shape \(\nu _1/2\), so that the integral amounts to its cumulative distribution function evaluated at \(\xi _1\).
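The identity just derived can be checked numerically. In the sketch below, the closed form (3.5) is compared with a direct quadrature of the truncated mixture of \(g_{{\mathscr {C}}}\) in (3.7). The series used for the regularized lower incomplete gamma function, the substitution \(u=w^{2/\nu _1}\) that removes the singularity of the integrand at the origin, and all function names are choices made for this illustration.

```python
import math

def reg_lower_gamma(a, x, tol=1e-15, max_terms=10000):
    """gamma(a, x) / Gamma(a) via the series x^a e^(-x) sum_n x^n / Gamma(a+n+1)."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / math.gamma(a + 1.0)
    total = term
    n = 0
    while abs(term) > tol * abs(total) and n < max_terms:
        n += 1
        term *= x / (a + n)
        total += term
    return total * math.exp(a * math.log(x) - x)

def cauchy(h, alpha, nu):
    return (1.0 + h * h / alpha) ** (-nu / 2.0)

def cauchy_piece_closed(h, alpha, nu, xi):
    """Closed form (3.5)."""
    return reg_lower_gamma(nu / 2.0, (h * h + alpha) * xi) * cauchy(h, alpha, nu)

def cauchy_piece_quad(h, alpha, nu, xi, n=20000):
    """int_0^xi exp(-u h^2) g_C(u) du by Simpson's rule after u = w^(1/a), a = nu/2."""
    a = nu / 2.0
    rate = h * h + alpha
    step = xi ** a / n
    total = 0.0
    for i in range(n + 1):
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * math.exp(-rate * (i * step) ** (1.0 / a))
    # alpha^a / Gamma(a) * (1/a) * integral, using a * Gamma(a) = Gamma(a + 1)
    return alpha ** a / math.gamma(a + 1.0) * total * step / 3.0

closed = cauchy_piece_closed(0.5, 0.125, 0.75, 2.0)
numeric = cauchy_piece_quad(0.5, 0.125, 0.75, 2.0)
```

For large \(\xi _1\) the incomplete gamma ratio tends to one and the piece reduces to the Cauchy model itself.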

To obtain the expression of \({\widetilde{\varphi }}_{{{{\mathscr {M}}}}}^{(2)}\), we invoke equation (10) in Alegría et al. (2021), so that

$$\begin{aligned} \int _{0}^{\xi _2} \exp (- u h^2) {g}_{{\mathscr {M}}}(u;\varvec{\lambda }_2) \text {d}u = \frac{1}{\varGamma (\nu _2)} \varGamma \left( \nu _2; \frac{1}{4\xi _2 \alpha _2^2}; \frac{h^2}{4\alpha _2^2}\right) . \end{aligned}$$
(3.9)

The function \({\widetilde{\varphi }}_{{{{\mathscr {M}}}}}^{(2)}\) is thus obtained by invoking formula 3.471.9 in Gradshteyn and Ryzhik (2007), for which we have \(\int _{0}^{\infty } \exp (- u h^2) {g}_{{\mathscr {M}}}(u;\varvec{\lambda }_2) \text {d}u = \varphi _{{\mathscr {M}}}(h;\varvec{\lambda }_2).\) \(\square \)
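Combining (3.6) with the identity \(\varphi _{{\mathscr {M}}}(h;\varvec{\lambda }_2) = \varGamma (\nu _2;0;h^2/(4\alpha _2^2))/\varGamma (\nu _2)\), which follows from (3.9) by letting \(\xi _2\rightarrow \infty \), shows that \({\widetilde{\varphi }}_{{\mathscr {M}}}^{\, (2)}\) equals \(\varGamma (\nu _2)^{-1}\int _0^{b} t^{\nu _2-1}\exp (-t-c/t)\,\text {d}t\) with \(b = 1/(4\xi _2\alpha _2^2)\) and \(c = h^2/(4\alpha _2^2)\). The sketch below evaluates this finite-range form by Simpson's rule after setting \(t=s^2\). The function name, the cap on the integration range, and the restriction \(\nu _2\ge 1/2\) are assumptions of the illustration.

```python
import math

def matern_piece(h, alpha, nu, xi, n=20000):
    """Truncated Matern piece (3.6), written as
    (1 / Gamma(nu)) int_0^b t^(nu-1) exp(-t - c/t) dt with t = s^2.
    Illustrative sketch: assumes nu >= 1/2 and caps s at 8, where exp(-s^2) vanishes."""
    b = 1.0 / (4.0 * xi * alpha ** 2)
    c = h * h / (4.0 * alpha ** 2)
    upper = min(math.sqrt(b), 8.0)
    step = upper / n

    def integrand(s):
        if s == 0.0:
            # Limit of 2 s^(2 nu - 1) exp(-s^2 - c/s^2) as s -> 0 under the assumptions.
            return 2.0 if (c == 0.0 and nu == 0.5) else 0.0
        return 2.0 * s ** (2.0 * nu - 1.0) * math.exp(-s * s - c / (s * s))

    total = 0.0
    for i in range(n + 1):
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * integrand(i * step)
    return total * step / (3.0 * math.gamma(nu))

# nu = 1/2 at the origin: the piece reduces to erf(sqrt(b)), here b = 0.5.
value_at_origin = matern_piece(0.0, 0.5, 0.5, 2.0)
# Very small xi: essentially the full Matern (exponential) model exp(-h / alpha).
near_marginal = matern_piece(0.7, 0.5, 0.5, 1e-4)
```

The second evaluation illustrates that, as \(\xi _2\rightarrow 0\), the truncated piece approaches the marginal Matérn model.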

When \(\nu _2 = n + 1/2\), for some \(n\in {\mathbb {N}}\), (3.6) can be expressed in terms of complementary error functions and modified Bessel functions of the first and second kinds. We refer the reader to Alegría et al. (2021) for a more detailed study of these special cases.

The flexibility of the proposed structure is now illustrated through the following result, where we use the notation \(f_1(h) \sim f_2(h)\), \(h\rightarrow \infty \), to indicate that, for some positive constant \(c_0\), the asymptotic relationship \(\lim _{h\rightarrow \infty } f_1(h) / f_2(h) = c_0\) holds. For an isotropic covariance function \(\varphi \), if for some \(\beta \in (0,2)\) we have \(\varphi (h) \sim h^{-\beta }\), \(h\rightarrow \infty \), then the process is said to have a long memory with Hurst effect (parameter) H that is equal to \(H= 1 -\beta /2\). If \(H \in (1/2,1)\), the covariance is called persistent, and if \( H \in (0,1/2)\), the covariance is called anti-persistent.

Proposition 2

Let Z be a Gaussian random field with covariance function of the form (3.4). Then, Z is \(\kappa \)-times mean square differentiable if and only if \(\nu _2 > \kappa \ge 0\). Moreover, it is true that \({\widetilde{\varphi }}_{{{\mathscr {C}}}{{\mathscr {M}}}}(h; \varvec{\lambda },\varvec{\omega },\varvec{\xi }) \sim h^{-\nu _1}\), \(h\rightarrow \infty .\) Hence, the Hurst parameter associated with Z is solely indexed by the parameter \(\nu _1\).

Proof

Arguments in chapter 2 of Stein (1999) show that an isotropic random field with covariance function \(\varphi \) is \(\kappa \)-times mean square differentiable if and only if \(\varphi ^{(2\kappa )}(0;\varvec{\lambda })\) exists and is finite. See also Adler (2010). In turn, a direct application of dominated convergence proves that

$$\begin{aligned} \varphi ^{(2\kappa )}(0; \varvec{\lambda }) = \int _{0}^{\infty } \Big ( \frac{\text {d}^{2\kappa }}{\text {d} h^{2\kappa }} \exp \left( - uh^2\right) \Big |_{h=0} \Big ) g(u;\varvec{\lambda })\text {d}u, \end{aligned}$$

which in turn proves that \(\varphi ^{(2\kappa )}(0; \varvec{\lambda })\) is well defined if and only if

$$\begin{aligned} \int _0^\infty u^\kappa g(u;\varvec{\lambda })\text {d}u < \infty . \end{aligned}$$
(3.10)

We use the latter argument for the special case of the function \({\widetilde{\varphi }}_{{\mathscr {C}\mathscr {M}}}\), for which the tail of the resulting mixing function is uniquely determined by the mixing function associated with \(\varphi _{{{{\mathscr {M}}}}}^{(2)}\) as in Proposition 1. Direct inspection shows that (3.10) is true if and only if \(\nu _2 > \kappa \). The first part of the proposition is established.
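The moment condition (3.10) can be explored numerically for the Matérn mixing function (3.8). The sketch below computes truncated moments \(\int _{\ell }^{U} u^\kappa g_{{\mathscr {M}}}(u)\,\text {d}u\) by Simpson's rule in the variable \(\log u\). The function names, the lower cutoff \(\ell \), and the parameter values are illustrative assumptions; the cutoff is harmless for the value of \(\alpha \) used here because \(g_{{\mathscr {M}}}\) vanishes extremely fast near the origin.

```python
import math

def g_matern(u, alpha, nu):
    """Matern mixing function (3.8)."""
    return (
        (1.0 / (2.0 * alpha)) ** (2.0 * nu) / math.gamma(nu)
        * u ** (-nu - 1.0) * math.exp(-1.0 / (4.0 * u * alpha ** 2))
    )

def truncated_moment(kappa, alpha, nu, upper, lower=1e-4, n=20000):
    """int_lower^upper u^kappa g_M(u) du, by Simpson's rule in x = log u."""
    x_lo, x_hi = math.log(lower), math.log(upper)
    step = (x_hi - x_lo) / n
    total = 0.0
    for i in range(n + 1):
        u = math.exp(x_lo + i * step)
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * u ** (kappa + 1.0) * g_matern(u, alpha, nu)  # du = u dx
    return total * step / 3.0

alpha, nu = 0.5, 1.5  # illustrative values
mass = truncated_moment(0, alpha, nu, 1e8)  # g_M is a probability density
# kappa = 1 < nu: the truncated moment stabilizes as the upper limit grows.
gain_kappa_1 = truncated_moment(1, alpha, nu, 1e8) - truncated_moment(1, alpha, nu, 1e4)
# kappa = 2 > nu: the truncated moment keeps growing like sqrt(upper).
ratio_kappa_2 = truncated_moment(2, alpha, nu, 1e8) / truncated_moment(2, alpha, nu, 1e4)
```

With \(\nu _2 = 3/2\), the first moment settles while the second diverges, in agreement with one-time mean square differentiability.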

For the second part, note that (3.5) behaves as \(h^{-\nu _1}\), as \(h\rightarrow \infty \), because the lower incomplete gamma function involved in such an equation tends to \(\varGamma (\nu _1/2)\), and the Cauchy class with parameter \(\nu _1\) decays as \(h^{-\nu _1}\). The result follows by noting that (3.6) is dominated by the traditional Matérn model, which decays exponentially. \(\square \)

To wrap up, the hybrid Cauchy–Matérn model allows us to index both the mean square differentiability and long-term behavior of the associated Gaussian random field. We also note that these properties are independently addressed by the two parameters \(\nu _1\) and \(\nu _2\), and hence those parameters are statistically identifiable and allow us to decouple local and global properties.

From a statistical viewpoint, a parsimonious choice may be considered by setting \(\omega _1 = \omega _2=\omega \), \(\alpha _1 = \alpha _2=\alpha \) and \(\xi _1=\xi _2=\xi \). Thus, we obtain that Proposition 1 provides a five-parameter family where \(\omega \) indexes the variance, \(\alpha \) the scale, \(\nu _2\) the mean square differentiability, and \(\nu _1\) the Hurst effect, whereas \(\xi \) is a parameter that balances the shapes of the marginal structures involved in this model. Hence, (3.4) generalizes the Matérn model in that it allows for polynomial decay while indexing continuously mean square differentiability.
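A minimal sketch of the parsimonious model is given below. Rather than coding the closed forms (3.5) and (3.6), it integrates the two pieces of the mixture (3.1) numerically, which is an implementation choice of this illustration and not a method prescribed above. The function names and grid sizes are assumptions, \(\nu _2\ge 1/2\) is assumed, and the model is rescaled to a correlation function so that \(\omega \) cancels.

```python
import math

def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi] with an even number n of panels."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * step)
    return total * step / 3.0

def cauchy_piece(h, alpha, nu1, xi):
    """int_0^xi exp(-u h^2) g_C(u) du, after the substitution u = w^(1/a), a = nu1/2."""
    a = nu1 / 2.0
    rate = h * h + alpha
    integral = simpson(lambda w: math.exp(-rate * w ** (1.0 / a)), 0.0, xi ** a)
    return alpha ** a / math.gamma(a + 1.0) * integral

def matern_piece(h, alpha, nu2, xi):
    """int_xi^inf exp(-u h^2) g_M(u) du, after t = 1/(4 u alpha^2) = s^2 (nu2 >= 1/2)."""
    b = 1.0 / (4.0 * xi * alpha ** 2)
    c = h * h / (4.0 * alpha ** 2)

    def f(s):
        if s == 0.0:
            return 2.0 if (c == 0.0 and nu2 == 0.5) else 0.0
        return 2.0 * s ** (2.0 * nu2 - 1.0) * math.exp(-s * s - c / (s * s))

    return simpson(f, 0.0, min(math.sqrt(b), 8.0)) / math.gamma(nu2)

def hybrid_cm_correlation(h, alpha, nu1, nu2, xi):
    """Parsimonious hybrid Cauchy-Matern model, rescaled so that its value at 0 is 1."""
    value = cauchy_piece(h, alpha, nu1, xi) + matern_piece(h, alpha, nu2, xi)
    at_zero = cauchy_piece(0.0, alpha, nu1, xi) + matern_piece(0.0, alpha, nu2, xi)
    return value / at_zero

alpha, nu1, nu2, xi = 0.125, 0.75, 0.5, 2.0  # illustrative values
r20 = hybrid_cm_correlation(20.0, alpha, nu1, nu2, xi)
r40 = hybrid_cm_correlation(40.0, alpha, nu1, nu2, xi)
tail_slope = (math.log(r40) - math.log(r20)) / math.log(2.0)  # close to -nu1
```

The log-log slope at large distances is governed by \(\nu _1\), whereas \(\nu _2\) only affects the behavior near the origin.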

Figure 1 shows the parsimonious hybrid Cauchy–Matérn model for different values of \(\xi \). The traditional Matérn and traditional Cauchy, as well as their average, which are also special cases of the hybrid construction, are reported for comparison purposes. Note that the curves have a linear or parabolic decay near the origin according to \(\nu _2=1/2\) or \(\nu _2=3/2\), respectively, and then the decay is more gradual (polynomial rate) for large distances according to \(\nu _1\), which is consistent with the local and global patterns that coexist. We observe that \(\xi \) has a manifest impact on the shape of the covariance function, as it produces some interesting forms (apparent changes of concavity) that could be useful in practice.

Fig. 1 Parsimonious hybrid Cauchy–Matérn model for \(\omega =1/2\), \(\alpha =1/8\), \(\nu _1=3/4\), and different values of \(\xi \). (Left) \(\nu _2=1/2\) and (Right) \(\nu _2=3/2\). The dashed lines represent the purely Cauchy, purely Matérn, and their average. All the models have been appropriately rescaled in order to obtain correlation functions

3.3 A Hybrid Hole-Effect–Matérn Class

We now present a hybrid class of covariance functions with local attributes of the Matérn type that can attain negative values at large distances. We use the acronym \({{\mathscr {H}}}{{\mathscr {M}}}\) for this model, termed hybrid hole-effect–Matérn. The proposed class comes from the mixture (3.3), where \(\phi _1\) is chosen in such a way that the resulting model can take negative values.

Proposition 3

Let \(\varvec{\lambda } = [\varvec{\lambda }_1^\top ,\varvec{\lambda }_2^\top ]^\top \), where \(\varvec{\lambda }_i=[\alpha _i,\nu _i]^\top \), \(\varvec{\omega } = [\omega _1,\omega _2]^\top \) and \(\varvec{\xi } = [{\xi }_1,\xi _2]^\top \) are vectors having positive elements, and \(\varvec{\vartheta } = [\tau ,\eta ]^\top \) is a vector of additional parameters. Let

$$\begin{aligned} {\widetilde{\varphi }}_{{{\mathscr {H}}}{{\mathscr {M}}}}(h; \varvec{\lambda },\varvec{\omega },\varvec{\xi },\varvec{\vartheta }) = \omega _1 \, {\widetilde{\varphi }}_{{\mathscr {H}}}^{\, (1)}(h; \varvec{\lambda }_1,\xi _1,\varvec{\vartheta }) + \omega _2 \, {\widetilde{\varphi }}_{{\mathscr {M}}}^{\, (2)}(h; \varvec{\lambda }_2,\xi _2), \qquad h \ge 0,\nonumber \\ \end{aligned}$$
(3.11)

where

$$\begin{aligned} {\widetilde{\varphi }}^{\, (1)}_{{\mathscr {H}}}(h; \varvec{\lambda }_1,\xi _1,\varvec{\vartheta }) = \frac{\tau }{\varGamma (\nu _1)} \varGamma \left( \nu _1; \frac{1}{4\xi _1\alpha _1^2}; \frac{\eta h^2}{4\alpha _1^2}\right) - \frac{1}{\varGamma (\nu _1)} \varGamma \left( \nu _1; \frac{1}{4\xi _1 \alpha _1^2}; \frac{h^2}{4 \alpha _1^2}\right) ,\nonumber \\ \end{aligned}$$
(3.12)

and \({\widetilde{\varphi }}^{\, (2)}_{{\mathscr {M}}}\) as in (3.6). Then, \({\widetilde{\varphi }}_{{{\mathscr {H}}}{{\mathscr {M}}}}\) is positive definite in \({\mathbb {R}}^d\) provided that \(1<\eta < \tau ^{2/d}\).

Proof

We consider the construction (3.3), with both \(g_1\) and \(g_2\) of the form (3.8), and \(\phi _2\) of Gaussian type. Thus, the derivation of \({\widetilde{\varphi }}^{\, (2)}_{{\mathscr {M}}}\) follows the same arguments employed in the proof of Proposition 1. Before deriving (3.12), let us introduce the following lemma, which is a combination of Corollaries 4, 8, and 11 in Posa (2023).

Lemma 1

The mapping \(h\mapsto A \exp (-a h^2) - B \exp (-b h^2)\) is positive definite in \({\mathbb {R}}^d\) if and only if

$$\begin{aligned} 1< \frac{a}{b} <\left( \frac{A}{B} \right) ^{2/d}. \end{aligned}$$
(3.13)

The proof of this lemma relies on Bochner’s theorem. Specifically, under condition (3.13), Posa (2023) proved that the spectral density of \(h\mapsto A \exp (-a h^2) - B \exp (-b h^2)\) is nonnegative for almost all frequencies. Although Posa (2023) focused on dimensions \(d\le 3\), the same proof can be used in arbitrary dimensions. To obtain the expression (3.12), we take the following covariance kernel in the first segment of the scale mixture

$$\begin{aligned} \phi _1(h; u, \varvec{\vartheta }) = \tau \exp (-u\eta h^2) - \exp (-uh^2), \qquad h\ge 0. \end{aligned}$$
(3.14)

Lemma 1 ensures that (3.14) is positive definite in \({\mathbb {R}}^d\), provided that \(u>0\) and \(1< \eta <\tau ^{2/d}\). Thus,

$$\begin{aligned} {\widetilde{\varphi }}^{\, (1)}_{{\mathscr {H}}}(h; \varvec{\lambda }_1,\xi _1,\varvec{\vartheta })= & {} \tau \int _0^{\xi _1} \exp (-u\eta h^2) g_{{\mathscr {M}}}(u;\varvec{\lambda }_1) \text {d}u\nonumber \\{} & {} - \int _0^{\xi _1} \exp (-u h^2) g_{{\mathscr {M}}}(u;\varvec{\lambda }_1) \text {d}u. \end{aligned}$$
(3.15)

Finally, we invoke the identity (3.9), and we apply it to each integral involved in the right-hand side of Eq. (3.15). \(\square \)
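The Bochner argument behind Lemma 1 can be illustrated numerically for the kernel (3.14). The sketch below uses the spectral density of \(h\mapsto \exp (-ah^2)\) in \({\mathbb {R}}^d\), namely \((4\pi a)^{-d/2}\exp (-\omega ^2/(4a))\) under the convention that includes the factor \((2\pi )^{-d}\) in the Fourier transform; the sign of the result does not depend on this convention. Function names, the frequency grid, and the parameter values are assumptions of the illustration.

```python
import math

def gaussian_spectral_density(omega, a, d):
    """Spectral density of h -> exp(-a h^2) in R^d at frequency norm omega."""
    return (4.0 * math.pi * a) ** (-d / 2.0) * math.exp(-omega * omega / (4.0 * a))

def hole_kernel_spectral_density(omega, u, tau, eta, d):
    """Spectral density of phi_1 in (3.14): tau exp(-u eta h^2) - exp(-u h^2)."""
    return (
        tau * gaussian_spectral_density(omega, u * eta, d)
        - gaussian_spectral_density(omega, u, d)
    )

def satisfies_condition(tau, eta, d):
    """Condition 1 < eta < tau^(2/d) from Lemma 1 with A/B = tau and a/b = eta."""
    return 1.0 < eta < tau ** (2.0 / d)

def min_density_on_grid(u, tau, eta, d, omega_max=20.0, n=2000):
    """Smallest spectral density value on an equispaced frequency grid."""
    return min(
        hole_kernel_spectral_density(i * omega_max / n, u, tau, eta, d)
        for i in range(n + 1)
    )

valid_2d = min_density_on_grid(1.0, 3.0, 2.0, 2)    # condition holds in R^2
invalid_2d = min_density_on_grid(1.0, 3.0, 4.0, 2)  # condition fails in R^2
valid_1d = min_density_on_grid(1.0, 3.0, 4.0, 1)    # same parameters, valid in R^1
```

The last two evaluations show how the same pair \((\tau ,\eta )\) can be admissible in one dimension and not in two.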

The covariance function (3.14) always takes negative values (Posa 2023), so it is a natural building block to achieve hybrid models with the hole effect. Figure 4a in Posa (2023) can assist the reader in gaining insight into the behavior of this mapping for different combinations of parameters. The parameters in \(\varvec{\vartheta }\) are responsible for the sharpness of the hole effect. More precisely, as \(\eta \) approaches \(\tau ^{2/d}\), the hole effect is more pronounced because the positive term on the right-hand side of (3.14) has less dominance. Moreover, when \(d=1\), we have the least restrictive condition on \(\eta \), and the resulting hole effect is more marked. It is well known that the possibility for significant negative correlations vanishes as the dimension increases (see, e.g., p. 45 in Stein, 1999).

The next proposition characterizes the local attributes of (3.11) and provides a lower bound for this model.

Proposition 4

Let Z be a Gaussian random field with covariance function of the form (3.11). Then, Z is \(\kappa \)-times mean square differentiable if and only if \(\nu _2 > \kappa \ge 0\). Moreover, we have the lower bound

$$\begin{aligned} {\widetilde{\varphi }}_{{{\mathscr {H}}}{{\mathscr {M}}}}(h; \varvec{\lambda },\varvec{\omega },\varvec{\xi },\varvec{\vartheta }) \ge \omega _1 (\tau \eta )^{-1/(\eta -1)} \left( \frac{1 - \eta }{\eta }\right) \left( 1 - \frac{\gamma (\nu _1;\alpha _1/\xi _1)}{\varGamma (\nu _1)} \right) , \qquad h\ge 0.\nonumber \\ \end{aligned}$$
(3.16)

Proof

The fact that \(\nu _2\) controls the mean square differentiability is a direct consequence of the arguments used in the proof of Proposition 2. On the other hand, to find a lower bound, we note that

$$\begin{aligned} {\widetilde{\varphi }}_{{{\mathscr {H}}}{{\mathscr {M}}}}(h; \varvec{\lambda },\varvec{\omega },\varvec{\xi },\varvec{\vartheta })\ge & {} \omega _1 \, \inf _{h\ge 0} {\widetilde{\varphi }}_{{\mathscr {H}}}^{\, (1)}(h; \varvec{\lambda }_1,\xi _1,\varvec{\vartheta }) + \omega _2 \, \inf _{h\ge 0} {\widetilde{\varphi }}_{{\mathscr {M}}}^{\, (2)}(h; \varvec{\lambda }_2,\xi _2)\\= & {} \omega _1 \int _0^{\xi _1} \inf _{h\ge 0} \phi _1(h; u, \varvec{\vartheta }) g_1(u;\varvec{\lambda }_1)\text {d}u. \end{aligned}$$

In the second line, we employed the fact that the infimum of the Gaussian covariance kernel, and consequently of \({\widetilde{\varphi }}_{{\mathscr {M}}}^{(2)}\), equals zero. A straightforward calculation shows that \(\phi _1\) attains its lowest value at \(h^*= \sqrt{\frac{\log (\tau \eta )}{u(\eta -1)}}\). Thus,

$$\begin{aligned}{} & {} \phi _1(h; u, \varvec{\vartheta }) \ge \phi _1(h^*; u, \varvec{\vartheta }) = \tau \exp \left( - \frac{\eta \log (\tau \eta )}{\eta -1}\right) - \exp \left( - \frac{\log (\tau \eta )}{\eta -1} \right) \\{} & {} \quad = (\tau \eta )^{-1/(\eta -1)} \left( \frac{1 - \eta }{\eta }\right) . \end{aligned}$$

Since \(g_1\) is given by (3.8), we invoke the formula of the cumulative distribution function of an inverse gamma random variable to establish that

$$\begin{aligned} \int _{0}^{\xi _1} g_{1}(u;\varvec{\lambda }_1) \text {d}u = 1 - \frac{\gamma (\nu _1;\alpha _1/\xi _1)}{\varGamma (\nu _1)}. \end{aligned}$$

This completes the proof. \(\square \)
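The closed-form minimizer \(h^*\) and the minimum value of \(\phi _1\) used in the proof can be checked numerically. The sketch below compares them with a brute-force grid search; the helper names, the grid, and the value \(u=0.7\) are illustrative assumptions and are not part of the original derivation.

```python
import math

def phi1(h, u, tau, eta):
    """Kernel (3.14): tau * exp(-u * eta * h^2) - exp(-u * h^2)."""
    return tau * math.exp(-u * eta * h * h) - math.exp(-u * h * h)

def h_star(u, tau, eta):
    """Minimizer stated in the proof: sqrt(log(tau * eta) / (u * (eta - 1)))."""
    return math.sqrt(math.log(tau * eta) / (u * (eta - 1.0)))

def phi1_min(tau, eta):
    """Minimum value stated in the proof: (tau*eta)^(-1/(eta-1)) * (1 - eta) / eta."""
    return (tau * eta) ** (-1.0 / (eta - 1.0)) * (1.0 - eta) / eta

tau, eta, u = 2.0, 3.5, 0.7  # illustrative values with eta > 1

# Brute-force search on a fine grid over [0, 5].
grid = [0.001 * k for k in range(5001)]
h_grid = min(grid, key=lambda h: phi1(h, u, tau, eta))

print("closed-form h*   :", h_star(u, tau, eta))
print("grid minimizer   :", h_grid)
print("closed-form min  :", phi1_min(tau, eta))
print("phi1 at h*       :", phi1(h_star(u, tau, eta), u, tau, eta))
```

Note that the minimum value does not depend on \(u\), which is why the infimum can be pulled out of the integral over \(u\) in the proof.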

Note that as \(\xi _1\rightarrow \infty \) (i.e., as the hole effect predominates), the lower bound in Eq. (3.16) decreases to \(\omega _1 (\tau \eta )^{-1/(\eta -1)} (1-\eta )/\eta \). Conversely, as \(\xi _1\rightarrow 0\), the bound increases to zero, that is, the hole effect becomes negligible, which is not surprising because the Matérn class is predominant in this case. A similar conclusion holds in the limit case \(\eta \rightarrow 1\).
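The dependence of the bound (3.16) on \(\xi _1\) can be illustrated in the special case \(\nu _1=1/2\), where, assuming that \(\gamma \) denotes the lower incomplete gamma function, the factor \(1-\gamma (1/2;x)/\varGamma (1/2)\) reduces to the complementary error function \(\text {erfc}(\sqrt{x})\). The following sketch relies on that identity; the function name and the parameter values are chosen only for illustration.

```python
import math

def lower_bound_nu_half(xi1, omega1, alpha1, tau, eta):
    """Right-hand side of (3.16) in the special case nu_1 = 1/2."""
    c = (tau * eta) ** (-1.0 / (eta - 1.0)) * (1.0 - eta) / eta
    # For nu_1 = 1/2: 1 - gamma(1/2; x) / Gamma(1/2) = erfc(sqrt(x)).
    mass = math.erfc(math.sqrt(alpha1 / xi1))
    return omega1 * c * mass

omega1, alpha1, tau, eta = 0.5, 0.125, 2.0, 3.5  # illustrative values

for xi1 in (1e-3, 1e-1, 1.0, 1e2, 1e6):
    print(f"xi_1={xi1:g}: bound={lower_bound_nu_half(xi1, omega1, alpha1, tau, eta):.6f}")

# Limiting value as xi_1 grows without bound (the erfc factor tends to 1).
limit = omega1 * (tau * eta) ** (-1.0 / (eta - 1.0)) * (1.0 - eta) / eta
print("limit as xi_1 -> infinity:", limit)
```

The printed values move from essentially zero for small \(\xi _1\) toward the most negative value as \(\xi _1\) grows, matching the two limit regimes described above.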

A parsimonious variant of this model consists in taking \(\omega _1 = \omega _2=\omega \) (variance parameter), \(\alpha _1=\alpha _2=\alpha \) (scale parameter), and \(\nu _1=\nu _2=\nu \) (smoothness parameter), whereas \(\varvec{\vartheta }\) regulates the hole effect (as discussed above) and \(\xi _1=\xi _2=\xi \) has a similar interpretation as in the hybrid Cauchy–Matérn model.

Figure 2 shows the parsimonious hybrid hole-effect–Matérn model for different values of \(\xi \). The limit cases described in Remark 1 are also reported, in a similar fashion as in Fig. 1. It can be seen that negative values coexist with different levels of smoothness at the origin, as expected.

Fig. 2 Parsimonious hybrid hole-effect–Matérn model in dimension 1, for \(\omega =1/2\), \(\alpha =1/8\), \(\tau = 2\), \(\eta =7/2\), and different values of \(\xi \). (Left) \(\nu =1/2\) and (Right) \(\nu =3/2\). The dashed lines represent the limit cases reported in Remark 1. All the models have been appropriately rescaled in order to obtain correlation functions

4 Numerical Experiments

4.1 Simulated Data

We conduct simulation studies to assess the performance of maximum likelihood inference when a hybrid covariance structure is present. We focus on the parsimonious hybrid Cauchy–Matérn dependence structure, as it will be applied to real data in the next section. We consider \(\omega = 1\), \(\alpha =1/8\), \(\nu _1=3/4\) and the following scenarios for \([\nu _2,\xi ]\): (a) [1/2, 40], (b) [1/2, 120], (c) [3/2, 40], and (d) [3/2, 120]. The choice of these scenarios is motivated by the role of the unconventional parameter \(\xi \) in our formulation: analyzing different values of \(\xi \) is of particular interest, whereas \(\nu _2=1/2\) and \(\nu _2=3/2\) are typical and realistic choices when examining spatial data.

All numerical experiments were conducted in R. For each scenario, we simulate 200 independent realizations of a Gaussian random field on 100 uniformly sampled points in the square \([0,3]^2\) and estimate the parameters through maximum likelihood. We then repeat the experiment with 256 spatial locations. We only estimate \(\omega \), \(\alpha \), and \(\xi \), whereas \(\nu _1\) and \(\nu _2\) are fixed, which is common practice in geostatistics. Instead of directly estimating \(\xi \), we consider the alternative parameterization \({\widetilde{\xi }} = \sqrt{\xi } \alpha \), which seems to be a natural choice according to Eqs. (3.5) and (3.6). To sum up, for each scenario and simulated sample, we estimate the vector of parameters \([\omega ,\alpha ,{\widetilde{\xi }}]\). The estimates are obtained by maximizing the likelihood function numerically using the default Nelder–Mead method (Nelder and Mead 1965), a direct search method known for its effectiveness in nonlinear optimization problems. In all our experiments, the method converged to a solution, and we did not encounter any degeneracy issues. The starting values of the algorithm are randomly selected within a broad interval around the true parameter values. The computation times for evaluating the objective function are standard in the context of maximum likelihood inference in spatial statistics. Because we worked with moderate sample sizes, we did not experience the computational challenges associated with the cubic computational cost of maximum likelihood. For a more detailed discussion of these computational aspects, we refer the reader to Bevilacqua and Gaetan (2015), where several likelihood-based methods are analyzed from computational and statistical perspectives.
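To make the objective function concrete, the following Python sketch evaluates the negative log-likelihood of a zero-mean Gaussian random field through a Cholesky factorization. It is not the code used in the experiments, which were run in R. The exponential covariance \(h\mapsto \omega \exp (-h/\alpha )\) is only a stand-in for the hybrid Cauchy–Matérn model, and all function names are hypothetical.

```python
import math

def exponential_cov(h, omega, alpha):
    """Stand-in covariance (not the hybrid model): omega * exp(-h / alpha)."""
    return omega * math.exp(-h / alpha)

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def neg_log_likelihood(params, coords, z):
    """Negative Gaussian log-likelihood for zero-mean data z observed at coords."""
    omega, alpha = params
    if omega <= 0.0 or alpha <= 0.0:
        return math.inf  # keep a derivative-free search inside the parameter space
    n = len(z)
    cov = [[exponential_cov(math.dist(p, q), omega, alpha) for q in coords]
           for p in coords]
    low = cholesky(cov)
    # Forward substitution: solve low * y = z, so that z' cov^{-1} z = y' y.
    y = []
    for i in range(n):
        y.append((z[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return 0.5 * (n * math.log(2.0 * math.pi) + log_det + sum(v * v for v in y))

# Tiny illustrative data set (three locations in the plane).
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
z = [0.3, -0.1, 0.8]
print(neg_log_likelihood((1.0, 0.5), coords, z))
```

A function of this kind can be handed to any derivative-free optimizer, such as a Nelder–Mead routine. The Cholesky step is the source of the cubic cost mentioned above, which is negligible for samples of a few hundred locations.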

Figure 3 displays the results. The estimates are approximately unbiased, and their variance decreases as the sample size increases from 100 to 256, as expected. The variability of the estimates decreases substantially in scenarios (c) and (d), that is, when the random field is smoother, which is a typical attribute of likelihood-based estimates in this context (Bevilacqua and Gaetan 2015). In contrast, the variability increases as \(\xi \) increases from 40 to 120. Figure 4 shows the log-likelihood as a function of \(\xi \) and \(\alpha \), with fixed \(\omega \), for a single realization of the random field under scenario (b). Although the surface has a clear maximum, the objective function appears flatter in the direction of \(\xi \). This could explain the increased variability in scenarios (b) and (d) relative to (a) and (c). Despite these remarks, the estimates appear reasonable in each scenario, and no identifiability issues are observed.

Fig. 3 Centered boxplots of the maximum likelihood estimates for the parsimonious hybrid Cauchy–Matérn model in scenarios (a)–(d)

Fig. 4 Log-likelihood function with respect to \(\alpha \) and \(\xi \) for scenario (b). Left and right panels correspond to the same plot from different viewpoints

We now explore the predictive performance of the proposed class through a cross-validation analysis. We simulate 200 independent realizations on 100 uniformly sampled locations in \([0,3]^2\) according to scenarios (a) to (d) described above. We assess the accuracy through a leave-one-out prediction strategy in terms of the mean squared error (MSE), mean absolute error (MAE), log-score (LSCORE), and continuous ranked probability score (CRPS) (see Zhang and Wang 2010). Small values of these indicators suggest superior predictions. We evaluate the performance of the hybrid Cauchy–Matérn model, using the generalized Cauchy class as a benchmark. Thus, for each realization, we estimate the parameters with both models and make the predictions through a simple kriging approach. The generalized Cauchy model (2.4) has been augmented with a multiplicative parameter \(\omega \), namely \(h \mapsto \omega (1+ h^\delta /\alpha )^{-\nu /\delta }\), so it is parameterized by \(\omega \) and \({\alpha }\), whereas \(\nu =3/4\) is fixed and \(\delta \) is fixed at either 1 or 2.
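For reference, the four scores can be computed from the leave-one-out kriging predictions and their variances as in the sketch below. It uses the standard closed-form expressions for Gaussian predictive distributions, which are assumed here and may differ in sign or scaling from the exact definitions in Zhang and Wang (2010). The function name and the toy inputs are hypothetical.

```python
import math

def gaussian_scores(obs, pred_mean, pred_var):
    """Average MSE, MAE, log-score, and CRPS for Gaussian predictive distributions."""
    n = len(obs)
    mse = mae = lscore = crps = 0.0
    for z, m, v in zip(obs, pred_mean, pred_var):
        sd = math.sqrt(v)
        err = z - m
        t = err / sd  # standardized prediction error
        pdf = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
        cdf = 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
        mse += err * err
        mae += abs(err)
        # Negative log predictive density (smaller is better).
        lscore += 0.5 * math.log(2.0 * math.pi * v) + 0.5 * t * t
        # Closed-form CRPS of a Gaussian forecast (smaller is better).
        crps += sd * (t * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))
    return {"MSE": mse / n, "MAE": mae / n, "LSCORE": lscore / n, "CRPS": crps / n}

# Toy example with made-up observations, predictions, and kriging variances.
obs = [0.2, -0.5, 1.1]
pred_mean = [0.0, -0.3, 0.9]
pred_var = [0.5, 0.4, 0.6]
print(gaussian_scores(obs, pred_mean, pred_var))
```

MSE and MAE only assess the point predictions, whereas LSCORE and CRPS also penalize a badly calibrated kriging variance, so the four indicators are complementary.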

Table 1 shows that the proposed hybrid model outperforms its competitor in every scenario. All the cross-validation scores decrease substantially in scenarios (c) and (d). From this brief study, we observe that when the true underlying covariance has a hybrid structure, an incorrect specification of the spatial association has a negative impact on the resulting predictions. Since the behavior of an isotropic covariance function near the origin has a strong impact on the quality of predictions (Stein 1999), our simulation experiment suggests that in some circumstances the local shape of the proposed model cannot be replicated by other appealing existing structures.

Table 1 Cross-validation scores for the parsimonious hybrid Cauchy–Matérn and generalized Cauchy (with \(\delta =1,2\)) models in scenarios (a)–(d)

4.2 A Real Data Illustration

The estimation of recoverable resources is a task of fundamental importance in modern mining processes. A sound evaluation of such resources is crucial from an economic viewpoint and is critical for assessing the long-term availability of mineral resources and its impact on society. Geostatistical models offer a valuable framework for addressing this challenge by considering the spatial distribution and inherent variability of these resources. This approach enables well-informed decision-making for resource management and operational planning. Next, we will investigate how the versatility of the models outlined in this manuscript can contribute to obtaining more accurate resource estimations, thereby facilitating a more precise analysis.

We consider a data set from a lateritic nickel deposit mined by open pit in Colombia, which contains measurements of the grades of nickel, iron, chrome, alumina, magnesia, and silica. This study focuses on nickel concentrations located at an elevation of about 120 m, where 199 irregularly spaced observations are available. We apply a log transformation to reduce the skewness and then subtract the sample mean. Figure 5 displays histograms of the original and transformed data. The resulting values are approximately Gaussian. The left panel of Fig. 7 shows the transformed data set.

Fig. 5 Histograms of the original (left) and transformed (right) data

We fit two covariance models: the first is the parsimonious hybrid Cauchy–Matérn, parameterized by \(\omega \), \({\alpha }\), and \({\widetilde{\xi }}\), with fixed \(\nu _1=1/4\) and \(\nu _2=1/2\), and the second is the generalized Cauchy, parameterized as in Sect. 4.1, with fixed \(\nu =1/4\) and \(\delta =0.95\). The values of the fixed parameters were selected after some experimental trials, taking into account the local behavior of the sample covariance (see Fig. 6). Table 2 reports the maximum likelihood estimates, with the corresponding standard errors, and the Akaike information criterion (AIC). We observe that the hybrid Cauchy–Matérn model outperforms its competitor in terms of the AIC. Figure 6 (left panel) shows that the fitted covariance models are reasonably close to the sample covariance. The fitted models differ substantially near the origin (distances less than 3 m), since the hybrid model decays more quickly. In contrast, for larger distances, the hybrid model decays more slowly, although the difference between the curves becomes slight for distances greater than 15 m. In brief, this graph shows a marked break in the curve of the hybrid model, whereas its competitor maintains a fixed structure; this flexibility is what makes the hybrid model versatile. In Fig. 6 (right panel), we also offer a detailed comparison of the fitted models, with a specific focus on distances less than 8 m. This range is particularly important for predictions. To visualize this, consider a circle with an 8-m radius centered at each spatial location: it contains numerous data points. Any disparities between the models within this range can significantly impact their predictive accuracy, as kriging predictions are highly dependent on the neighboring data points. We will delve deeper into the prediction problem below.

Table 2 Parameter estimates and Akaike information criterion (AIC) of fitted covariance models

Fig. 6 (Left) Sample (circles) and modeled (solid lines) covariances of log-nickel concentrations. (Right) More detailed illustration of the fitted models for distances less than 8 m

To compare the models in terms of predictive performance, we conduct a cross-validation study based on simple kriging, in a similar fashion to the experiments performed with simulated data. Table 3 shows evidence, based on a leave-one-out cross-validation scheme, that the hybrid model performs better for this specific data set. In percentage terms, the MSE shows an improvement of approximately \(3.4\%\). The largest difference occurs when we compare the LSCOREs (about \(9\%\) improvement). We conclude this section with an illustration of a downscaled map of log-nickel concentrations (see Fig. 7), using the hybrid Cauchy–Matérn model. The interpolated spatial map, which is obtained through simple kriging, is displayed on a spatial grid with a resolution of approximately 1 m (7,500 locations). This kriged surface could be useful in small-scale mining processes, as it is a crucial step for industrial exploration and quantifying mineral reserves. More precisely, it could play a pivotal role in directing mining operations, optimizing resource allocation, and ensuring the efficient extraction of minerals, thus making a substantial contribution to the overall success of the mining process.

Fig. 7 Log-nickel concentrations (left), with the kriged surface (middle) and the corresponding variance (right)

Table 3 Scores for the leave-one-out cross-validation study of log-nickel concentrations

5 Conclusions and Perspectives

We introduced a simple formalism to build sophisticated parametric families of covariance functions. We focused on a combination of the Matérn and Cauchy models, where local (mean square differentiability) and global (long memory) properties coexist in a single family. We have also illustrated the use of our methodology by constructing a model that behaves as the Matérn class at short distances and attains negative values at large distances. Simulation studies show that a parsimonious hybrid Cauchy–Matérn model has statistically identifiable parameters. Moreover, this model improves predictive performance relative to existing models when an inherently hybrid dependence structure is present. We reach similar conclusions when we apply this methodology to a mining data set. While similar numerical studies could be performed for the hybrid hole-effect–Matérn model, we omit them for the sake of simplicity and brevity. Additional interesting extensions of this work can be tackled in future investigations. We now outline two concrete research lines that could emerge from this work.

In many practical situations, two or more variables are simultaneously recorded. Thus, our findings can be generalized to the case of multivariate fields \(\{ \textbf{Z}(\textbf{s})= (Z_1(\textbf{s}),\ldots , Z_p(\textbf{s}))^{\top }, \; \textbf{s} \in {\mathbb {R}}^d \}\), having an isotropic matrix-valued covariance function \(\varPhi : [0,\infty ) \rightarrow {\mathbb {R}}^{p \times p}\), that is, \( \textrm{cov}[Z_i(\textbf{s}),Z_j(\textbf{s}')] = \varPhi _{ij}(h)\), \(h\ge 0\), where \(h=\Vert \textbf{s} - \textbf{s}'\Vert \) and \(i,j=1,\ldots ,p\). We propose the hybrid model

$$\begin{aligned} {\widetilde{\varPhi }}(h; \varvec{\lambda },\varvec{\omega },\varvec{\xi }) = \omega _1 \int _{0}^{\xi _1} \exp (-uh^2) \textbf{G}_1(u;\varvec{\lambda }_1) \text {d}u + \omega _2 \int _{\xi _2}^{\infty } \exp (-uh^2) \textbf{G}_2(u;\varvec{\lambda }_2) \text {d}u, \end{aligned}$$

which generalizes (3.1), where the vectors of parameters \(\varvec{\lambda }_i\) must be chosen in such a way that the \(p\times p\) matrices \(\textbf{G}_i(u;\varvec{\lambda }_i)\) are positive semi-definite for every fixed \(u \ge 0\). Hence, a direct application of Proposition 4 in Porcu and Zastavnyi (2011) would ensure that \({\widetilde{\varPhi }}\) is positive semi-definite. A multivariate version of the hybrid Cauchy–Matérn covariance function is a natural candidate. The works of Gneiting et al. (2010) and Moreva and Schlather (2022) are relevant for tackling this challenge. A multivariate version of the formulation (3.3) could be similarly deduced.

For random fields that are indexed by the d-dimensional unit sphere, \({\mathbb {S}}^d\), which is a useful framework when analyzing global data (\({\mathbb {S}}^2\) is used as an approximation of the Earth), the isotropy assumption is given by \(\textrm{cov}[Z(\textbf{s}),Z(\textbf{s}')] = \psi (\theta )\), \(\textbf{s},\textbf{s}'\in {\mathbb {S}}^d\), where \(\psi :[0,\pi ]\rightarrow {\mathbb {R}}\) is a continuous mapping and \(\theta = \arccos (\textbf{s}^\top \textbf{s}') \in [0,\pi ]\) is the geodesic distance. Schoenberg’s characterization (Schoenberg 1942) establishes that a parametric isotropic covariance function \(\psi (\cdot ;\varvec{\lambda })\) is valid in any dimension d if and only if it can be written as \({\psi }(\theta ; \varvec{\lambda }) = \sum _{\ell =0}^\infty \beta _{\ell }(\varvec{\lambda }) (\cos \theta )^{\ell }\), \(\theta \in [0,\pi ]\), for some nonnegative and summable parametric sequence \(\{\beta _{\ell }(\varvec{\lambda }) \}_{\ell =0}^\infty \). Thus, the hybrid models can be adapted to the spherical context by considering a modified sequence of the form

$$\begin{aligned} {\widetilde{\beta }}_{\ell }(\varvec{\lambda },\varvec{\omega },\varvec{\xi }) = \omega _1 \, \beta _{\ell }^{(1)}(\varvec{\lambda }_1) 1_{ [0, \left\lfloor \xi _1 \right\rfloor ) }(\ell ) + \omega _2 \, \beta _{\ell }^{(2)}(\varvec{\lambda }_2) 1_{ [ \left\lfloor \xi _{2} \right\rfloor , \infty ) }(\ell ), \qquad \ell =0,1,\ldots , \end{aligned}$$

where \( \left\lfloor \xi _{i} \right\rfloor \ge 0\) for \(i=1,2\), with \(\left\lfloor \cdot \right\rfloor \) standing for the floor function and \(\beta _{\ell }^{(i)}\) being a nonnegative and summable sequence. The local properties of spherically indexed random fields and their connections with the covariance function have been studied in past works (Bingham 1973; Guinness and Fuentes 2016). However, global properties such as long memory are less intuitive in this scenario, as the spatial domain is a compact set. Covariance functions with the hole effect for low-dimensional spheres could be obtained by adapting formulation (3.3).
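As a minimal sketch of this construction, the snippet below builds the modified sequence \({\widetilde{\beta }}_{\ell }\) and evaluates a truncated version of the resulting series in \(\cos \theta \). The geometric sequences \(\beta _{\ell }^{(i)} = (1-a_i)a_i^{\ell }\) with \(0<a_i<1\), the truncation at 200 terms, and all names are assumptions made for illustration only.

```python
import math

def hybrid_coeff(ell, omega, xi, a):
    """Modified Schoenberg coefficient built from two geometric sequences (assumed)."""
    omega1, omega2 = omega
    xi1, xi2 = xi
    a1, a2 = a
    beta1 = (1.0 - a1) * a1 ** ell  # hypothetical choice for beta^(1)
    beta2 = (1.0 - a2) * a2 ** ell  # hypothetical choice for beta^(2)
    first = omega1 * beta1 if ell < math.floor(xi1) else 0.0
    second = omega2 * beta2 if ell >= math.floor(xi2) else 0.0
    return first + second

def psi(theta, omega, xi, a, n_terms=200):
    """Truncated series sum_l coeff_l * cos(theta)^l on [0, pi]."""
    c = math.cos(theta)
    return sum(hybrid_coeff(ell, omega, xi, a) * c ** ell for ell in range(n_terms))

omega, xi, a = (1.0, 1.0), (5.0, 5.0), (0.3, 0.8)  # illustrative values
for theta in (0.0, math.pi / 4, math.pi / 2, math.pi):
    print(f"theta={theta:.3f}: psi={psi(theta, omega, xi, a):.6f}")
```

Since every coefficient is nonnegative and the truncated sequence is summable, the resulting function satisfies Schoenberg's condition by construction. Low-order terms are governed by the first sequence and high-order terms by the second.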