1 Introduction

Structural geology is partly concerned with the characterization of geological bodies and structures. From the millimetric to the kilometric scale, these descriptions help in understanding the local and regional geological settings (Fossen 2016). The geometries of folds and geological discontinuities, such as faults and unconformities, are key for developing an understanding of the subsurface geology, which is essential to a range of applications such as the estimation of natural resources.

Structural modeling intends to accurately reproduce the geometry of geological structures with a numerical model (Caumon et al. 2009). The quality of the data acquired in the field depends on the acquisition tools, the operator skills and the rock exposure. Field observations are then interpreted by geologists and geophysicists to obtain the spatial points, lines and vectors used as input data in structural modeling. Depending on both acquisition and interpretation, these numerical inputs may be noisy, sparse and unevenly sampled (both scattered and clustered) (Carmichael and Ailleres 2016; Houlding 1994; Mallet 1992, 1997, 2002). Reproducing complex geological structures from such data requires simplifications and empirical rules based on analog structures.

Implicit structural modeling algorithms have drawn significant attention during the past 30 years (Caumon et al. 2013; Hillier et al. 2014; Lajaunie et al. 1997; Mallet 1988, 2014). They represent a model by an implicit function, also called a stratigraphic function, defined on the entire volume of interest. A horizon, which is an interface between two stratigraphic layers, is given by a single iso-value of the stratigraphic function, and a structural discontinuity, such as a fault or an unconformity, corresponds to a discontinuous jump in the function. The advantage of algorithms building implicit functions is that they take all data constraining the geological structures into account at once, and without involving projections. Geological structures can, therefore, be interpolated and extrapolated away from the data everywhere in the studied area.

The discrete smooth interpolation (DSI) is a class of explicit (Mallet 1992, 1997, 2002) and implicit (Frank et al. 2007; Mallet 1988, 2014; Souche et al. 2014) methods to construct structural models, which is well known in the oil and gas industry [software: SKUA-Gocad by Paradigm (2018) and volume based modeling by Schlumberger (2018)]. The implicit DSI variant discretizes the stratigraphic function on a volumetric mesh. The function’s coefficients are centered on the mesh vertices and are linearly interpolated within the mesh elements.

By assuming that the expected model should be as smooth as possible, DSI introduces a global roughness factor minimized in the least squares sense. By locally changing the weight of the roughness, it is possible to control the model features away from the data. Caumon (2009) and Caumon et al. (2013) generate kink folds with this principle. This roughness factor has been formulated with constant gradient equations (Frank et al. 2007) or smooth gradient equations (Souche et al. 2014). As this roughness factor has only been described discretely, its relationship with continuous physical principles remains unclear.

In DSI, the implicit function is continuous on the mesh elements. Therefore, the mesh elements should not intersect structural discontinuities (i.e., the triangles of the discontinuities should be faces of the mesh elements). While unstructured meshes can efficiently handle discontinuities, their construction represents a challenge in some geological settings (Karimi-Fard and Durlofsky 2016; Pellerin et al. 2014). For instance, meshing algorithms cannot always guarantee a mesh in which the gradient of the implicit function can be correctly computed (Shewchuk 2002), especially when dealing with fault networks of complex geometries and intersections. Mesh construction may also be computationally and memory expensive and may require significant user interactions. Moreover, DSI results depend on the mesh and its quality (Laurent 2016).

The potential field method (PFM) (Calcagno et al. 2008; Chilès et al. 2004; Cowan et al. 2003; Lajaunie et al. 1997) is another class of numerical methods that can be used to create implicit structural models and which is well known in the mining industry [software: Geomodeller by Intrepid-Geophysics (2018) and LeapFrog by ARANZ Geo (2018)]. PFM can be formulated as a dual cokriging interpolation or as a radial basis functions interpolation [Hillier et al. (2014), based on Matheron (1981)’s proof of equivalence between kriging and splines]. Some radial basis functions link PFM to physical principles. For instance, the thin plate splines (Duchon 1977) are the Green’s functions of the bending energy, which means that an implicit function defined in a PFM scheme by a sum of thin plate splines [as in Jessell et al. (2014)] intrinsically minimizes the bending energy (also called thin plate energy) (Dubrule 1984; Wahba 1990).

In PFM, no mesh is involved in the computation of the implicit function, although the results are generally evaluated on a grid for visualization. Instead, the interpolation is supported by the data and their position: each data point is associated with an interpolant and a coefficient. The implicit function is thus dependent on the data distribution and the range of influence of the chosen interpolants. With global interpolants, such as thin plate splines, the data coefficients have an influence everywhere in space. With local interpolants, such as compactly supported radial basis functions (Wendland 1995; Wu 1995), the data coefficients have a restricted influence centered on their position. In structural modeling, defining the implicit function everywhere in the domain of study is a requirement, as horizons are supposed to be infinite surfaces also existing in wide areas where data are missing. This can be ensured even with local interpolants in PFM by a global polynomial drift added to the solution. However, the transition between sampled areas, influenced by local interpolants, and empty areas, defined by the drift alone, often leads to high curvature artifacts in the solution. Therefore, although local interpolants are common in PFM, such as the cubic covariance (Aug 2004; Calcagno et al. 2008; De la Varga et al. 2019), they are used like global interpolants in structural modeling applications, with a range of influence scaled on the domain’s dimensions. This creates a dense system whose size increases with the number of data and becomes unsolvable for more than a few thousand data without optimization techniques (Cowan et al. 2003; Cuomo et al. 2013; Yokota et al. 2010).

In PFM, discontinuities are given specific treatments, as introduced in Marechal (1984) (Calcagno et al. 2008; Chilès et al. 2004). Faults are handled by enriching the implicit function with polynomial drifts weighted by jump functions. This implies defining a fault zone or a fault influence radius for the jump functions, which is difficult to determine (Godefroy et al. 2018) and may require many manual interactions. Stratigraphic unconformities are handled by computing an implicit function for each conformable stratigraphic series, and by using Boolean operations to reconstruct a unique model.

In this article, an algorithm is suggested to build implicit functions using locally defined moving least squares interpolants. These interpolants are centered on regularly sampled nodes and ensure the definition of the implicit function everywhere in the domain. The discontinuities are handled with meshless techniques, limiting user interactions even in complex cases. The proposed framework also explicitly minimizes the bending energy, as a continuous regularization to solve the structural modeling problem. Everything is described in two dimensions for simplicity, but the formalism is adaptable to three dimensions.

The proposed method is described in Sect. 2. Section 3 emphasizes the ability of the method to handle some well-known issues with data inputs in structural modeling applications. Section 4 discusses the limits of the method, the proposed related solutions, and some perspectives.

2 Locally Based Structural Modeling with Meshless Concepts

2.1 Construction of the Implicit Function

In two dimensions, the implicit function \(u(\mathbf {x})\) is a function in the vector space V defined as

$$\begin{aligned} V = \left\{ u(\mathbf {x}) = \sum _{\mathrm{l}=1}^{N} \varPhi _\mathrm{l}(\mathbf {x}) \ u_\mathrm{l} = {\varvec{\Phi }}(\mathbf {x})^T \cdot \mathbf {U} \ | \ \mathbf {x} \in \varOmega \right\} , \end{aligned}$$
(1)

where \(\mathbf {x} (x, y)\) is a position in \(\mathbb {R}^2\), \(\varOmega \) is the domain of study, \({\varvec{\Phi }}^T = [\varPhi _{1}, \ldots , \varPhi _{N}]\) is a basis of linearly independent functions, \(\mathbf {U}^T = [u_{1}, \ldots , u_{N}]\) is a set of scalar coefficients, and N is the number of terms in \({\varvec{\Phi }}\) and \(\mathbf {U}\). This section explains how the N shape functions \(\varPhi _\mathrm{l}\) and coefficients \(u_\mathrm{l}\) are defined.

2.1.1 Implicit Function Coefficients: The Discretization of the Domain

For simplicity, the domain of study \(\varOmega \) is discretized regularly. The grid is not stored, but its N cell corner points are key to the method and are referred to as interpolation nodes in the remainder of the paper. The implicit function is constructed on these nodes: an interpolant \(\varPhi _\mathrm{l}\) and a coefficient \(u_\mathrm{l}\) are centered on each interpolation node \(\mathbf {x}_\mathrm{l}\). The number of interpolants and coefficients is, therefore, equal to the number of interpolation nodes N. The shape functions \(\varPhi _\mathrm{l}\) are linearly independent as long as the interpolation nodes have distinct coordinates, so the implicit function is uniquely defined by the coefficients \(u_\mathrm{l}\). In this paper, the moving least squares are used as interpolation functions and are described below.
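As an illustration of this discretization, the following Python sketch generates the regular interpolation nodes; the function name and the assumption that the domain starts at the origin are ours and not part of the original method.

```python
import numpy as np

def regular_nodes(Lx, Ly, Nx, Ny):
    """Regularly spaced interpolation nodes covering a rectangular domain of
    size Lx x Ly; only the cell corner points are kept, the grid is not stored.
    The domain is assumed to start at the origin."""
    xs = np.linspace(0.0, Lx, Nx)                        # Nx nodes along x
    ys = np.linspace(0.0, Ly, Ny)                        # Ny nodes along y
    return np.array([(x, y) for y in ys for x in xs])    # N = Nx * Ny nodes

# Reference case of Sect. 2.3: (50 x 50) nodes on a (50 m x 25 m) domain.
nodes = regular_nodes(50.0, 25.0, 50, 50)
```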

2.1.2 Interpolation Functions: The Moving Least Squares

Weight functions The weight functions \(w_\mathrm{l}\) are key to defining the moving least squares functions \(\varPhi _\mathrm{l}\). They are continuous functions centered on the interpolation nodes \(\mathbf {x}_\mathrm{l} (x_\mathrm{l}, y_\mathrm{l})\) and defined on a local area \(S_\mathrm{l}\) called the support. In the presented method, they are defined with a rectangular support as

$$\begin{aligned} w_\mathrm{l}(\mathbf {x}) = w(q_{\mathrm{l}_x}) \ w(q_{\mathrm{l}_y}) = w\left( \frac{|x - x_\mathrm{l}|}{\rho _{\mathrm{l}_x}}\right) \ w\left( \frac{|y - y_\mathrm{l}|}{\rho _{\mathrm{l}_y}}\right) . \end{aligned}$$
(2)

The dilatation parameters \({{\rho }}_\mathrm{l} (\rho _{\mathrm{l}_x}, \rho _{\mathrm{l}_y})\) control the size of the supports \(S_\mathrm{l}\) and the normalized distances \(q_\mathrm{l} (q_{\mathrm{l}_x}, q_{\mathrm{l}_y})\). Here, a global, dimensionless dilatation parameter \(\rho \) is used to define constant dilatation parameters scaled on the nodal spacing in each axis as

$$\begin{aligned} \left\{ \begin{array}{l} \rho _{\mathrm{l}_x} = \rho \, \frac{L_x (\varOmega )}{N_x - 1} \\ \rho _{\mathrm{l}_y} = \rho \, \frac{L_y (\varOmega )}{N_y - 1} \end{array} \right. , \quad l = 1, \ldots , N, \end{aligned}$$
(3)

where \(L_x(\varOmega )\) and \(L_y(\varOmega )\) are the lengths of the domain \(\varOmega \) and \(N_x\) and \(N_y\) are the number of interpolation nodes for each axis.

A wide range of weight functions have been defined in the literature (Fries and Matthias 2004). In this paper, fourth-order splines are used; they are polynomial and twice continuously differentiable (\(C^2\)), defined as

$$\begin{aligned} w(q_\mathrm{l}) = \left\{ \begin{array}{ll} 1 - 6q_\mathrm{l}^2 + 8q_\mathrm{l}^3 - 3q_\mathrm{l}^4 & \quad q_\mathrm{l} \le 1 \\ 0 & \quad q_\mathrm{l} > 1 \end{array} \right. . \end{aligned}$$
(4)
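As a minimal sketch of Eqs. (2)-(4), the following code (helper names are ours) evaluates the per-axis dilatation parameters and the tensor-product fourth-order spline weight; it assumes the rectangular domain and regular node spacing of the previous sketch.

```python
import numpy as np

def dilatation(Lx, Ly, Nx, Ny, rho):
    """Constant dilatation parameters of Eq. (3), scaled on the nodal spacing."""
    return rho * Lx / (Nx - 1), rho * Ly / (Ny - 1)

def spline4(q):
    """Fourth-order spline weight of Eq. (4); C^2 and zero for q > 1."""
    q = np.abs(q)
    return np.where(q <= 1.0, 1.0 - 6.0*q**2 + 8.0*q**3 - 3.0*q**4, 0.0)

def weight(x, x_l, rho):
    """Tensor-product weight w_l(x) of Eq. (2) on a rectangular support.
    x, x_l: positions (x, y); rho: (rho_x, rho_y)."""
    qx = abs(x[0] - x_l[0]) / rho[0]
    qy = abs(x[1] - x_l[1]) / rho[1]
    return spline4(qx) * spline4(qy)
```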

Moving least squares (MLS) functions The MLS functions (Lancaster and Salkauskas 1981; McLain 1976) are locally defined meshless functions. They are constructed by approximating the implicit function u as a polynomial function of degree d with spatially varying coefficients. The V space from Eq. (1) is redefined as

$$\begin{aligned} V = \left\{ u(\mathbf {x}) = \sum _{\mathrm{l}=1}^{N} \varPhi _\mathrm{l}(\mathbf {x}) \ u_\mathrm{l} = \sum _{j=1}^{m} p_j(\mathbf {x}) \ a_j(\mathbf {x}) = {\mathbf {p}}^T (\mathbf {x}) \cdot {\mathbf {a}}(\mathbf {x}) \ | \ \mathbf {x} \in \varOmega \right\} , \end{aligned}$$
(5)

with m the number of monomials \(p_j\) and coefficients \(a_j\).

When performing a local approximation around \(\mathbf {x}\), in a least squares sense, the MLS shape functions \(\varPhi _\mathrm{l}\) are defined as

$$\begin{aligned} \varPhi _\mathrm{l}(\mathbf {x}) = w_\mathrm{l}(\mathbf {x}) {\mathbf {p}}^T(\mathbf {x}) \cdot [{\mathbf {A}}(\mathbf {x})]^{-1} \cdot {\mathbf {p}}(\mathbf {x}_\mathrm{l}) = {\varvec{\Gamma }}(\mathbf {x}) \cdot {\mathbf {B}}_\mathrm{l}(\mathbf {x}), \end{aligned}$$
(6)

where

$$\begin{aligned} {\mathbf {A}}(\mathbf {x}) = \sum _{j = 1}^{N} w_j(\mathbf {x}) {\mathbf {p}}(\mathbf {x}_j) \cdot {\mathbf {p}}^T(\mathbf {x}_j), \end{aligned}$$
(7)

and with \({\mathbf {A}} \cdot {\varvec{\Gamma }} = {\mathbf {p}}\) and \({\mathbf {B}}_\mathrm{l} = w_\mathrm{l}(\mathbf {x}) {\mathbf {p}}(\mathbf {x}_\mathrm{l})\). The matrix \({\mathbf {A}}\) is called the moment matrix. Appendix A gives the details of the construction of MLS functions, and Appendix B gives a geometrical illustration of a one-dimensional MLS interpolation. Some plots of these functions in one and two dimensions can be found in Nguyen et al. (2008) and Liu and Gu (2005). Other useful details on their construction can be found in Fries and Matthias (2004), and Nguyen et al. (2008), and useful empirical values for their parameters can be found in Liu and Gu (2005).
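The following sketch evaluates all the MLS shape functions at a point following Eqs. (6)-(7) with a linear basis (d = 1, m = 3). It reuses weight from the sketch above, the optional keep mask anticipates the visibility criterion of Sect. 2.1.3, and it assumes the point is covered by enough node supports for \({\mathbf {A}}\) to be invertible; the function names are ours.

```python
import numpy as np

def basis(z):
    """Linear polynomial basis p(x) = [1, x, y] (d = 1, m = 3)."""
    return np.array([1.0, z[0], z[1]])

def mls_shape_values(x, nodes, rho, keep=None):
    """Values of all MLS shape functions Phi_l(x) at a point x, Eq. (6).
    nodes: (N, 2) interpolation nodes; rho: (rho_x, rho_y) dilatation
    parameters; keep: optional boolean mask of the nodes visible from x."""
    N = len(nodes)
    w = np.array([weight(x, nodes[l], rho) for l in range(N)])
    if keep is not None:
        w = np.where(keep, w, 0.0)
    A = np.zeros((3, 3))                          # moment matrix, Eq. (7)
    for l in np.flatnonzero(w):
        pl = basis(nodes[l])
        A += w[l] * np.outer(pl, pl)
    gamma = np.linalg.solve(A, basis(x))          # A . Gamma = p(x)
    return np.array([w[l] * (gamma @ basis(nodes[l])) for l in range(N)])
```

At any covered position, the returned values sum to one, which is the partition of unity property discussed next.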

The MLS functions form a partition of unity (PU), which means that a function u defined as in Eq. (1) with MLS functions can exactly fit a constant field. This is a necessary property in the presented approach; without a PU, the solution would bend abnormally between the interpolation nodes. For instance, the compactly supported radial basis functions do not form a PU. They are thus not adapted to the method without further treatments [e.g., the radial polynomial interpolation method, Liu and Gu (2005)].

Moving least squares (MLS) derivatives Solving a modeling problem requires an appropriate degree of differentiability of the implicit function and, therefore, of its linearly independent interpolation functions. The MLS functions have the same degree of continuity and differentiability as the chosen weight functions. With fourth-order splines (Eq. 4), they are \(C^2\).

The exact equations of MLS derivatives are complex and can be found in Fries and Matthias (2004). We use the approximation forms given in Belytschko et al. (1996) and Liu and Gu (2005) where the first order derivative is

$$\begin{aligned} \frac{\partial }{\partial i} \varPhi _\mathrm{l}(\mathbf {x}) = \frac{\partial }{\partial i} {\varvec{\Gamma }}^T \cdot {\mathbf {B}}_\mathrm{l} + {\varvec{\Gamma }}^T \cdot \frac{\partial }{\partial i} {\mathbf {B}}_{\mathrm{l}}, \end{aligned}$$
(8)

with \(\frac{\partial }{\partial i} {\varvec{\Gamma }}\) the solution of

$$\begin{aligned} {\mathbf {A}} \cdot \frac{\partial }{\partial i} {\varvec{\Gamma }} = \frac{\partial }{\partial i} {\mathbf {p}} - \frac{\partial }{\partial i} {\mathbf {A}} \cdot {\varvec{\Gamma }}, \end{aligned}$$
(9)

and where the second order derivative is written as

$$\begin{aligned} \frac{\partial ^2}{\partial ij} \varPhi _\mathrm{l}(\mathbf {x}) = \frac{\partial ^2}{\partial ij} {\varvec{\Gamma }}^T \cdot {\mathbf {B}}_\mathrm{l} + \frac{\partial }{\partial i} {\varvec{\Gamma }}^T \cdot \frac{\partial }{\partial j} {\mathbf {B}}_\mathrm{l} + \frac{\partial }{\partial j} {\varvec{\Gamma }}^T \cdot \frac{\partial }{\partial i} {\mathbf {B}}_\mathrm{l} + {\varvec{\Gamma }}^T \cdot \frac{\partial ^2}{\partial ij} {\mathbf {B}}_\mathrm{l}, \end{aligned}$$
(10)

with \(\frac{\partial ^2}{\partial ij} {\varvec{\Gamma }}\) the solution of

$$\begin{aligned} {\mathbf {A}} \cdot \frac{\partial ^2}{\partial ij} {\varvec{\Gamma }} = \frac{\partial ^2}{\partial ij} {\mathbf {p}} - \left( \frac{\partial }{\partial i} {\mathbf {A}} \cdot \frac{\partial }{\partial j} {\varvec{\Gamma }} + \frac{\partial }{\partial j} {\mathbf {A}} \cdot \frac{\partial }{\partial i} {\varvec{\Gamma }} + \frac{\partial ^2}{\partial ij} {\mathbf {A}} \cdot {\varvec{\Gamma }}\right) , \end{aligned}$$
(11)

where i and j represent the x or the y axis. With these equations, only the matrix \({\mathbf {A}}\) needs to be inverted to obtain all the derivatives, so computing them requires no additional matrix inversion beyond the one already needed to construct the shape functions.
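As a sketch of Eqs. (8)-(9), the code below evaluates the x-derivative of the shape functions, reusing basis and spline4 from the sketches above (nodes is the array produced by the first sketch); the y-derivative and the second derivatives of Eqs. (10)-(11) follow the same pattern and reuse the same matrix \({\mathbf {A}}\). The helper names are ours.

```python
import numpy as np

def spline4_dq(q):
    """Derivative of the fourth-order spline weight with respect to q."""
    q = np.abs(q)
    return np.where(q <= 1.0, -12.0*q + 24.0*q**2 - 12.0*q**3, 0.0)

def mls_shape_dx(x, nodes, rho):
    """First x-derivative of all MLS shape functions at x (Eqs. 8-9)."""
    dxs, dys = x[0] - nodes[:, 0], x[1] - nodes[:, 1]
    qx, qy = np.abs(dxs) / rho[0], np.abs(dys) / rho[1]
    w = spline4(qx) * spline4(qy)
    dw = spline4_dq(qx) * np.sign(dxs) / rho[0] * spline4(qy)   # d w_l / d x
    A, dA = np.zeros((3, 3)), np.zeros((3, 3))
    for l in range(len(nodes)):
        pl = basis(nodes[l])
        A += w[l] * np.outer(pl, pl)
        dA += dw[l] * np.outer(pl, pl)
    gamma = np.linalg.solve(A, basis(x))                        # A . Gamma = p
    dgamma = np.linalg.solve(A, np.array([0.0, 1.0, 0.0]) - dA @ gamma)  # Eq. (9)
    return np.array([dgamma @ (w[l] * basis(nodes[l]))
                     + gamma @ (dw[l] * basis(nodes[l]))
                     for l in range(len(nodes))])               # Eq. (8)
```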

2.1.3 Visibility Criterion: A Modification of Interpolation Functions Handling Discontinuities

Discontinuities represent one of the main challenges in structural modeling. The discrete smooth interpolation creates meshes conformal to the discontinuities, and the potential field method defines fault zones with polynomials and jump functions, but both approaches become challenging when dealing with complex fault networks and may require heavy user interactions. Stratigraphic unconformities (discontinuities in the sedimentary record) may be addressed by several implicit functions (Calcagno et al. 2008; Chilès et al. 2004), which also requires properly editing the sequences beforehand. Therefore, both types of discontinuities are here addressed as input lines, and they are treated with the visibility criterion (Belytschko et al. 1994).

The implicit function u is defined by Eq. (1) as a sum of weighted continuous functions. The visibility criterion introduces discontinuities in the interpolation functions by truncating their local supports (Belytschko et al. 1994). The principle is as follows: each interpolation node emits light and the discontinuities are opaque to this light. For any point \(\mathbf {x}\), if a discontinuity happens to intersect a ray of light coming from a node \(\mathbf {x}_\mathrm{l}\), then \(\mathbf {x}_\mathrm{l}\) is not considered as a neighbor to \(\mathbf {x}\).

Discontinuous jumps in the implicit function are thus introduced by local intersection tests between segments and discontinuity objects. This approach has the potential to drastically reduce the user interactions to handle structural discontinuities compared to other modeling methods. There are two limitations of the visibility criterion: (I) it performs badly at discontinuity tips, and (II) it completely isolates fault blocks one from another. To solve the tip issue (I), Organ et al. (1996) introduce the diffraction and the transparency criteria. Such criteria can also be employed in the proposed framework, but the tip issue is generally negligible with a sufficient number of interpolation nodes, which is why the visibility criterion is used. Also, if necessary, these criteria may be adapted to solve the isolation issue (II) as shown in Appendix C. Illustrations of the effects of the different optic criteria on the shapes of the local supports \(S_\mathrm{l}\) can be found in Fries and Matthias (2004).
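A minimal sketch of the visibility criterion follows: a proper segment-segment intersection test (touching endpoints and collinear overlaps are ignored for brevity) filters out the nodes hidden from a point by a discontinuity. The function names and the representation of faults as lists of segments are our assumptions.

```python
import numpy as np

def segments_cross(p1, p2, q1, q2):
    """True if segments [p1, p2] and [q1, q2] properly intersect (2D orientation test)."""
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    d1, d2 = cross(q1, q2, p1), cross(q1, q2, p2)
    d3, d4 = cross(p1, p2, q1), cross(p1, p2, q2)
    return d1 * d2 < 0.0 and d3 * d4 < 0.0

def visible_nodes(x, nodes, faults):
    """Visibility criterion: node x_l is kept as a neighbor of x only if the
    segment [x, x_l] crosses no discontinuity segment.
    faults: iterable of ((x1, y1), (x2, y2)) discontinuity segments."""
    keep = np.ones(len(nodes), dtype=bool)
    for l, xl in enumerate(nodes):
        if any(segments_cross(x, xl, a, b) for a, b in faults):
            keep[l] = False
    return keep
```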

2.2 Solving the Structural Modeling Problem

Implicit structural modeling consists of finding the unknown values \(u_\mathrm{l}\) (Eq. 1) that construct an implicit function representing the domain of study while honoring the input data. For this, the proposed method performs a spatial regression of \(N_\mathrm{D}\) data points (i.e., values \(\alpha _k\) at positions \(\mathbf {x}_k\), \(k = {1, \dots , N_\mathrm{D}}\)) with the bending energy penalization (or thin plate energy) (Dubrule 1984; Wahba 1990) by minimizing

$$\begin{aligned} J(u) = \frac{1}{2} \int _\varOmega \lambda _\epsilon ^2 \left( \left( \partial _{xx}^2 u\right) ^2 + \left( \partial _{yy}^2 u\right) ^2 + 2\left( \partial _{xy}^2 u\right) ^2 \right) \hbox {d}\varOmega + \frac{1}{2}\sum _{k=1}^{N_\mathrm{D}} \lambda _k^2 \ (u(\mathbf {x}_k) - \alpha _k)^2, \end{aligned}$$
(12)

with \(\lambda _\epsilon \) the bending penalization weight and \(\lambda _k\) the weight at a data point \(\mathbf {x}_k\). The use of other types of information and the choice of data values \(\alpha _k\) are further discussed in Appendix D.

To solve this problem, the domain \(\varOmega \) is regularly divided into N subdomains \(\varOmega _\mathrm{l}\) (i.e., \(\varOmega \subset \cup _\mathrm{l} \varOmega _\mathrm{l}\), \(l = 1, \ldots , N\), Sect. 2.1.1). For simplicity, a regular sampling is used so that all the subdomains \(\varOmega _\mathrm{l}\) have the same volume \(\nu \) (i.e., the volume of a cell of the regular sampling) and are centered on the nodes \(\mathbf {x}_\mathrm{l}\). The integration term is approximated as constant in each subdomain \(\varOmega _\mathrm{l}\) and evaluated at the center node \(\mathbf {x}_\mathrm{l}\) (i.e., Gauss quadrature with one point).

The least squares system corresponding to the minimization of Eq. (12) is

$$\begin{aligned} \begin{bmatrix} \lambda _\epsilon \sqrt{\nu }\,\partial _{xx}^2 \varPhi _1^1 & \cdots & \lambda _\epsilon \sqrt{\nu }\,\partial _{xx}^2 \varPhi _N^1\\ \lambda _\epsilon \sqrt{\nu }\,\partial _{yy}^2 \varPhi _1^1 & \cdots & \lambda _\epsilon \sqrt{\nu }\,\partial _{yy}^2 \varPhi _N^1\\ \lambda _\epsilon \sqrt{2\nu }\,\partial _{xy}^2 \varPhi _1^1 & \cdots & \lambda _\epsilon \sqrt{2\nu }\,\partial _{xy}^2 \varPhi _N^1\\ \vdots & & \vdots \\ \lambda _\epsilon \sqrt{\nu }\,\partial _{xx}^2 \varPhi _1^N & \cdots & \lambda _\epsilon \sqrt{\nu }\,\partial _{xx}^2 \varPhi _N^N\\ \lambda _\epsilon \sqrt{\nu }\,\partial _{yy}^2 \varPhi _1^N & \cdots & \lambda _\epsilon \sqrt{\nu }\,\partial _{yy}^2 \varPhi _N^N\\ \lambda _\epsilon \sqrt{2\nu }\,\partial _{xy}^2 \varPhi _1^N & \cdots & \lambda _\epsilon \sqrt{2\nu }\,\partial _{xy}^2 \varPhi _N^N\\ \lambda _1\varPhi _1^1 & \cdots & \lambda _1\varPhi _N^1\\ \vdots & & \vdots \\ \lambda _{N_\mathrm{D}}\varPhi _{1}^{N_\mathrm{D}} & \cdots & \lambda _{N_\mathrm{D}}\varPhi _{N}^{N_\mathrm{D}} \end{bmatrix} \cdot \begin{bmatrix} u_{1}\\ \vdots \\ u_{N} \end{bmatrix} = \begin{bmatrix} 0\\ 0\\ 0\\ \vdots \\ 0\\ 0\\ 0\\ \lambda _1\alpha _1\\ \vdots \\ \lambda _{N_\mathrm{D}}\alpha _{N_\mathrm{D}} \end{bmatrix}, \end{aligned}$$
(13)

where \(\varPhi _\mathrm{l}^k = \varPhi _\mathrm{l}(\mathbf {x}_k)\) are given by Eq. (6) and \(\partial _{ij}^2 \varPhi _\mathrm{l}^k = \frac{\partial ^2}{\partial _{ij}} \varPhi _\mathrm{l}(\mathbf {x}_k)\) are given by Eq. (10). After solving this system, the obtained coefficients \(u_\mathrm{l}\) are used to evaluate the implicit function. For visualization, the domain \(\varOmega \) is discretized into a grid with \(N_V\) points. After evaluating the implicit function on each visualization point of the grid, the horizon lines are extracted bilinearly on each grid element.
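The following sketch assembles and solves System (13) with scipy. The callables phi, phi_dxx, phi_dyy and phi_dxy are hypothetical stand-ins returning, for a given position, the length-N rows of shape function values (Eq. 6) and second derivatives (Eq. 10), for instance built from the MLS sketches above combined with the visibility criterion.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

def solve_nodal_coefficients(nodes, data_xy, data_val, nu, lam_eps, lam_k,
                             phi, phi_dxx, phi_dyy, phi_dxy):
    """Assemble the least squares system of Eq. (13) and solve for the nodal
    coefficients u_l (phi* are hypothetical callables, see text above)."""
    rows, rhs = [], []
    # Three bending-energy rows per interpolation node, right-hand side 0
    for xl in nodes:
        rows += [lam_eps * np.sqrt(nu) * phi_dxx(xl),
                 lam_eps * np.sqrt(nu) * phi_dyy(xl),
                 lam_eps * np.sqrt(2.0 * nu) * phi_dxy(xl)]
        rhs += [0.0, 0.0, 0.0]
    # One data row per data point, right-hand side lambda_k * alpha_k
    for xk, ak in zip(data_xy, data_val):
        rows.append(lam_k * phi(xk))
        rhs.append(lam_k * ak)
    A = sp.csr_matrix(np.vstack(rows))    # sparse: few non-zero entries per row
    return lsqr(A, np.asarray(rhs))[0]    # least squares solution u_1, ..., u_N
```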

2.3 Reference Example of the Presented Method

Figure 1 is a summary of the presented workflow. It is applied to a synthetic cross section of an eroded, faulted and folded domain in two dimensions. Figure 1a shows the reference input data and illustrates the numerical supports of the interpolation. Figure 1b shows the output implicit function with extracted horizons. Equations (1) and (12) are also recalled as they are key to the method.

All the parameters used to create Fig. 1b and their values are given in Table 1. The interpolation nodes used in the computation are actually more numerous than illustrated in Fig. 1a and were generated by using \((50 \times 50)\) nodes in the x and y axes. The spacing between the nodes in the y axis is, therefore, smaller than in the x axis, which reflects the anisotropy of the studied structures. The support \(S_B\) is an example of a support affected by the visibility criterion. The number of visualization points is purposely larger than the number of nodes and data points [i.e., a grid of \((100 \times 100)\) is used] to observe the behavior of the implicit function close to the discontinuities and away from the data points. This is also why the banded color template is used, giving an idea of what iso-values other than the expected horizons would look like if extracted.

The implicit function of Fig. 1b is used as the reference model for sensitivity analysis. The parameters and their values are discussed and tested separately. The values given in Table 1 are systematically used as default values for all parameters; only the tested parameter values are changed in each sensitivity test. All the models are run on a laptop with an Intel Core i7-4940 3 GHz processor, 32 GB of RAM, and 64-bit Windows 7 Enterprise.

Fig. 1

Schematic representation of the proposed workflow. a An input cross section densely and regularly interpreted with data points and fault and unconformity segments in two dimensions. An example of sampling and two local supports \(S_A\) and \(S_B\) are illustrated. b Computed implicit function u obtained with the proposed method using the parameter values of Table 1. Between (a) and (b): the two main equations of the method, the implicit function definition and the problem to solve

Table 1 Parameters of the proposed method and their default values

3 Sensitivity to Data Quality

3.1 Model Distance and Data Distance

In this section, the method is tested on typical issues with geological data. It focuses on the way data points constrain the modeling problem, considering the impact of their availability, quality, and reliability on the results. Two types of distances are suggested to compare the results: the distance to the reference model \(D_\mathrm{model}\) and the distance of a model to data points \(D_\mathrm{data}\). The distance \(D_\mathrm{model}\) is only applicable in a synthetic example, whereas the distance \(D_\mathrm{data}\) is applicable in real settings where no reference model is available.

In \(D_\mathrm{model}\), the tested values are the implicit function values evaluated at the visualization point positions \(u(\mathbf {x}_{\mathrm{visu}_\mathrm{l}})\). The reference values are the reference model's implicit function values (Fig. 1b) evaluated at the corresponding visualization points: \(u_{\mathrm{ref}_\mathrm{l}} = u_\mathrm{ref}(\mathbf {x}_{\mathrm{visu}_\mathrm{l}})\). The distance \(D_\mathrm{model}\) is thus evaluated as

$$\begin{aligned} D_\mathrm{model} = \frac{1}{N_V} \sum _{\mathrm{l} = 1}^{N_V} \frac{|u(\mathbf {x}_{\mathrm{visu}_\mathrm{l}}) - u_\mathrm{ref}(\mathbf {x}_{\mathrm{visu}_\mathrm{l}})|}{||\overline{{\mathbf {g}}}||_\mathrm{ref}}. \end{aligned}$$
(14)

with \(||\overline{{\mathbf {g}}}||_\mathrm{ref}\) the average gradient norm of the reference model.

In \(D_\mathrm{data}\), the tested values are the implicit function values evaluated at the data point positions \(u(\mathbf {x}_{\mathrm{data}_\mathrm{l}})\) of the currently tested model [e.g., sparse, noisy, depending on the application]. The reference values are the expected data values: \(u_{\mathrm{ref}_\mathrm{l}} = \alpha _\mathrm{l}\) (Sect. 2.2 and Appendix D). The distance \(D_\mathrm{data}\) is thus evaluated as

$$\begin{aligned} D_\mathrm{data} = \frac{1}{N_\mathrm{D}} \sum _{\mathrm{l} = 1}^{N_\mathrm{D}} \frac{|u(\mathbf {x}_{\mathrm{data}_\mathrm{l}}) - \alpha _\mathrm{l}|}{||\overline{{\mathbf {g}}}||_\mathrm{ref}}. \end{aligned}$$
(15)

As it is normalized by the reference gradient norm \(||\overline{{\mathbf {g}}}||_\mathrm{ref}\), the evaluated errors are actual distances in meters comparable to the domain’s dimensions [i.e., \((50 \,\hbox {m} \,\times 25 \,\hbox {m})\)], and they are independent of the nodal spacing and the number of evaluated points. They are also independent of the implicit function’s trend, although this is an approximation as local variations of the gradient’s norm in the tested model may overestimate or underestimate the error. As an example, a \(D_\mathrm{model}\) of 1 m indicates that a value evaluated at a visualization point exists, on average, at 1 m in the reference model; a \(D_\mathrm{data}\) of 1 m indicates that the average distance between each data point and the corresponding iso-line is equal to 1 m.
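A direct transcription of Eqs. (14)-(15), assuming the implicit function values have already been evaluated at the visualization and data point positions:

```python
import numpy as np

def d_model(u_visu, u_ref_visu, grad_norm_ref):
    """Eq. (14): mean absolute difference between the tested and reference
    implicit functions at the N_V visualization points, normalized by the
    average gradient norm of the reference model (result in meters)."""
    return np.mean(np.abs(u_visu - u_ref_visu)) / grad_norm_ref

def d_data(u_at_data, alpha, grad_norm_ref):
    """Eq. (15): mean absolute misfit between the tested model evaluated at
    the N_D data positions and the expected data values, same normalization."""
    return np.mean(np.abs(u_at_data - alpha)) / grad_norm_ref
```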

Fig. 2

Sensitivity of the proposed method to irregular data points. a Distances to the reference model and the decimated data with a varying random decimation in the data points. b A resulting implicit function obtained with a decimation value of \(98\%\) [i.e., 14 data points]. The white circles indicate where geological structures were lost as compared to Fig. 1b

3.2 Data Sparsity

Depending on the types of field samples [e.g., seismic and wells] and the quality of the rock exposure, the interpreted input points for structural modeling may be more or less clustered and sparse. Figure 2a shows the accuracy of the method for different degrees of random decimation in the reference data points (Fig. 1a).

As the decimation is performed randomly, a single simulation per decimation percentage is not enough to understand the dependency of the method on the degree of data sparsity. In this article, when a random parameter is involved, 100 simulations are computed for each given set of parameter values. To ease interpretation, only the average distances of the 100 simulations are represented in the graphs. The minimum, maximum, percentiles and standard deviations were also computed and are given in ONLINE RESOURCE 1, but are not represented for readability reasons.

In Fig. 2a, it can be observed that the fewer the data points, the larger the distance to the reference model \(D_\mathrm{model}\). In contrast, \(D_\mathrm{data}\) is the same for all data decimation values. The method thus fits the decimated data but fails to recover the features of the reference model. The drastic change in the distances for a decimation above \(98\%\) [i.e., using 29 data points on average] comes from the emergence of models with fewer than three data points in one or more fault blocks, which creates unstable results. Figure 2b shows a result with a data decimation value of \(98\%\), which gave, in this random case, 14 input data points. In the circled areas of missing data, the folds are smoothed, but the remainder of the model is well reconstructed. The distances \(D_\mathrm{data}\) and \(D_\mathrm{model}\) of this model are reported in Fig. 2a.

These results show how the proposed method behaves with irregularly and sparsely sampled data. The structures are well represented if the data points sample the non-redundant parts of the geometry. Otherwise, the solution is smoothed where data are missing. Consequently, the proposed method performs best when data points sample high curvature areas and areas of thickness variation.

3.3 Noisy Data

The quality of the field measurements, the processing errors, and interpretation errors can lead to noise in the data. To test the proposed method on this aspect, perturbed data points are created by adding different levels of noise to the reference data (Fig. 1a). The intensity of noise indicates the maximum displacement a point can have during the perturbation, as a radial Euclidean distance in meters around the point. The displacement of each point is sampled from a uniform distribution between zero and this maximum displacement value. Data points having crossed a fault between their initial position and their perturbed position are deleted to avoid stratigraphic inconsistencies.
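A sketch of this perturbation, assuming a uniformly drawn displacement length and a uniformly drawn direction (the direction sampling is not specified in the text); points whose displacement crosses a fault would then be removed, e.g., with segments_cross from the visibility sketch.

```python
import numpy as np

def perturb_points(points, max_disp, rng=None):
    """Perturb data points by a radial displacement whose length is drawn
    uniformly in [0, max_disp] meters and whose direction is uniform."""
    rng = rng or np.random.default_rng(0)
    r = rng.uniform(0.0, max_disp, size=len(points))
    theta = rng.uniform(0.0, 2.0 * np.pi, size=len(points))
    return points + np.column_stack((r * np.cos(theta), r * np.sin(theta)))
```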

The noise in the data is handled by the smoothing ability of the bending energy penalization. Figure 3 shows the resulting models for three different values of the smoothness parameter \(\lambda _\epsilon \) (Sect. 2.2). It is difficult to have a priori knowledge of a proper \(\lambda _\epsilon \) value to use. In addition, the number of equations also has an influence on the results. The volume terms \(\nu \) in Eq. (13), obtained during the discretization stage, normalize the influence of the number of smoothness equations, but an equivalent principle should also be applied to the \(N_\mathrm{D}\) data equations. The relation between the smoothness and \(\lambda _\epsilon \) is thus independent of N, but dependent on \(N_\mathrm{D}\).

Fig. 3

Resulting implicit function when using the same set of perturbed data as input (intensity of noise is 1 m), but with different energy weights \(\lambda _\epsilon \). a\(\lambda _\epsilon = 1\). b\(\lambda _\epsilon = 30\). c\(\lambda _\epsilon = 400\)

Figure 4 shows how the error evolves when the intensity of noise and the smoothing level change. For an intensity of noise fixed to 1 m (Fig. 4a), the best \(D_\mathrm{model}\) value is obtained for a \(\lambda _\epsilon \) around 30 (illustrated by Fig. 3b). Below this range, the noisy data points are better represented [i.e., \(D_\mathrm{data}\) decreases], which drives the results away from the reference model [i.e., \(D_\mathrm{model}\) increases, illustrated by Fig. 3a]. Above this range, the structures start to be smoothed, which deteriorates the fitting [i.e., \(D_\mathrm{data}\) and \(D_\mathrm{model}\) slowly increase, illustrated by Fig. 3c]. For a \(\lambda _\epsilon \) fixed to 30 (Fig. 4b), both \(D_\mathrm{data}\) and \(D_\mathrm{model}\) increase together with the intensity of noise. As the noise increases, the data points represent the geological structures less and less. Although the misfit trend depends on the data distribution, satisfactory results are generally obtained with a \(\lambda _\epsilon \) between 1 and 100 for most datasets with a \(\lambda _k\) between 1 and 10.

Fig. 4

Sensitivity of the proposed method to the noise and the energy weight \(\lambda _\epsilon \). a The intensity of noise is fixed to 1 m, and \(\lambda _\epsilon \) varies. b The weight \(\lambda _\epsilon \) is fixed to 30 and the intensity of noise varies

3.4 Data Reliability

Data points may originate from different types of sources, hence having a varying reliability [e.g., a well datum is generally considered as more reliable than a seismic pick]. The data weight \(\lambda _k\) (Sect. 2.2) can be set individually for each data point. As stated in Sect. 3.3, the higher the value of the penalization weight \(\lambda _\epsilon \), the smoother the results. The same principle applies for the data weight \(\lambda _k\): the higher its value, the better the fit to the corresponding data point. Thus, data reliability can be expressed by varying each data weight \(\lambda _k\). Figure 5 shows how the relative fit to data can be controlled with \(\lambda _k\) values.

Fig. 5

Example of the influence of data constraints on the implicit function when changing their weights. (Filled square) Hard data: \(\lambda _k^\mathrm{hard} = 10\), (filled circle) Soft data: \(\lambda _k^\mathrm{soft} = 1\)

Unfortunately, the weights \(\lambda _k\) have no physical meaning and cannot be associated simply with data geometric errors. In Mallet (2002), the problem is simplified by only distinguishing two types of data: (i) the hard data that must be honored (usually borehole data), and (ii) the soft data that should be honored as much as possible under the smoothness criterion (usually seismic data).

In the proposed method, an average weight \(\lambda _k^\mathrm{soft}\) is given to all soft data, and a greater weight \(\lambda _k^\mathrm{hard}\) is given to all hard data. Deciding the value of \(\lambda _k^\mathrm{hard}\), compared to the two weights \(\lambda _k^\mathrm{soft}\) and \(\lambda _\epsilon \), is not trivial: if taken too small, hard data may not be honored, and if taken too large, System (13) may become badly conditioned. Also, hard data can be honored with a negligible error, but not exactly.

Mallet (2002) avoids these issues by adding a node in the mesh at each hard data position and fixing its nodal coefficient \(u_\mathrm{l}\) at the hard data value. This operation may decrease the mesh quality.

Fig. 6

A resulting implicit function using sparse and heterogeneous data with noise in the soft data (intensity of noise: 1 m) and a few hard and gradient data points. (Filled square) Hard data: \(\lambda _k^\mathrm{hard} = 10\), (filled circle) Soft data: \(\lambda _k^\mathrm{soft} = 1\), Gradient data: \(\lambda _k^\mathrm{grad} = 10\), \(\lambda _\epsilon = 30\). \(D_\mathrm{data} = 8.3 \times 10^{-3}\,\hbox {m}\) and \(D_\mathrm{model} = 2.9 \times 10^{-2}\,\hbox {m}\)

With MLS functions, due to the local least squares approximation (Appendix A), imposing a nodal coefficient \(u_\mathrm{l}\) is not equivalent to imposing the implicit function value at this interpolation node position \(\mathbf {x}_\mathrm{l}\). MLS functions are said to lack the Kronecker delta property

$$\begin{aligned} \varPhi _\mathrm{l}(\mathbf {x}_i) \ne \delta (\mathbf {x}_\mathrm{l} - \mathbf {x}_i) = \left\{ \begin{array}{ll} 1 &{} \quad \text {if} \,l = i\\ 0 &{} \quad \text {if} \, l \ne i \end{array}\right. \quad \Leftrightarrow \quad u(\mathbf {x}_\mathrm{l}) \ne u_\mathrm{l}. \end{aligned}$$
(16)

Adding nodes at hard data positions is thus ineffective for enforcing the corresponding constraint. A possible solution would be to use Lagrange multipliers to automatically determine appropriate weights for hard data, but it would change the problem into a saddle-point one, thus modifying its complexity (Brezzi 1974). Though strongly dependent on the case study, a hard data weight \(\lambda _k^\mathrm{hard}\) ten times greater than the soft data weight \(\lambda _k^\mathrm{soft}\) is sufficient in practice and avoids a badly conditioned system (Fig. 5).

Figure 6 summarizes the presented data constraints and their influence on the implicit function. The soft data were obtained using a decimation of \(98\%\) and an intensity of noise of 1 m, generating thirteen noisy points. The hard data were obtained using a decimation of \(99\%\), generating five points positioned as in the reference data. Additionally, five gradient data vectors (constraint presented in Appendix D) were extracted from the reference model's implicit function to control the structures in areas with missing data. With a total of 23 constraints, this model is closer to the reference model than those of Fig. 2a generated with a decimation of \(60\%\) or more of the reference data [i.e., fewer than about 395 data points].

Fig. 7

Resulting implicit function with different polynomial orders d for the MLS shape functions and adapted dilatation values \(\rho \). a\(d = 0\) and \(\rho = 1.99\). b\(d = 2\) and \(\rho = 2.99\)

4 Discussions and Perspectives

4.1 Complexity and Stability of the Moving Least Squares Functions

4.1.1 Polynomial Order of the MLS Functions

The polynomial order d defines, together with the dimension of space (i.e., two here), the number of monomials m in the polynomial basis. This number m sets the dimensions of the moment matrix \({\mathbf {A}}\) [i.e., \(\hbox {dim}({\mathbf {A}}) = (m \times m)\)]. It thus has an influence on the complexity of the method and must be chosen small enough to avoid unnecessary computational costs. In practice, Fig. 7a shows that MLS functions with an order of 0 are not enough to reproduce a geological model: the solution tends to be perpendicular to the discontinuities and the domain's borders. MLS functions with an order of 2 give results similar to an order of 1 (Fig. 7b), with a higher computational complexity. The order d is, therefore, fixed to 1 in the presented method. Although this observation is purely empirical, first-order MLS functions seem able to reproduce complex geological structures.

4.1.2 Local Supports and Domain Coverage

An MLS shape function \(\varPhi _\mathrm{l}(\mathbf {x})\) is only defined within the support \(S_\mathrm{l}\) of its weight function \(w_\mathrm{l}\) (Eq. 6). By definition, \(S_\mathrm{l}\) and the support of \(\varPhi _\mathrm{l}(\mathbf {x})\) are the same. The node \(\mathbf {x}_\mathrm{l}\) has an influence on the points existing within the support \(S_\mathrm{l}\). If a point \(\mathbf {x}\) is influenced by a node \(\mathbf {x}_\mathrm{l}\), then \(\mathbf {x}_\mathrm{l}\) is said to be a neighbor of \(\mathbf {x}\).

This restricted influence of the nodes \(\mathbf {x}_\mathrm{l}\) in space represents the main advantage of the MLS functions as each constraint involves a small number of neighbors [i.e., System (13) is sparse]. Unfortunately, it may also lead to singularities. Let the union of all supports \(S_\mathrm{l}\) be denoted as the cover. When the entire domain \(\varOmega \) is included in the cover, it is said to be complete. If the cover is not complete, then the implicit function \(u(\mathbf {x})\) is undefined in the uncovered areas. In addition, the number of linearly independent neighboring nodes required around a position \(\mathbf {x}\) should be at least equal to the number of monomials m in the MLS polynomial basis to have a non-singular matrix \({\mathbf {A}}\) (Appendix A). Therefore, the distribution of the nodes \(\mathbf {x}_\mathrm{l}\) and the size of the supports \(S_\mathrm{l}\), controlled by the dilatation parameters \(\rho \), must be defined carefully to avoid singularities. This situation is analogous to ordinary and universal kriging when the neighborhood size is too small.

4.1.3 Theoretical Relationship Between the Dilatation Parameter and the Number of Neighbors

The cover problem (Sect. 4.1.2) is tackled with rectangular supports (Eq. 2) and by making the dilatation parameter \(\rho \) proportional to the regular spacing of interpolation nodes. Figure 8a illustrates the relationship between \(\rho \) and the resulting support of an MLS function centered on a given interpolation node \(\mathbf {x}_\mathrm{l}\). If close to a discontinuity with the visibility criterion (Sect. 2.1.3) or close to a border, the support may cover fewer neighboring nodes than illustrated. The dilatation parameter \(\rho \) thus has a direct impact on the maximum number of neighboring interpolation nodes \(n_\mathrm{node}^\mathrm{max}\) around a given node \(\mathbf {x}_\mathrm{l}\), following

$$\begin{aligned} n_\mathrm{node}^\mathrm{max} = (2\lfloor \rho \rfloor + 1)^2, \end{aligned}$$
(17)

with \(\lfloor .\rfloor \) the integer part operator.

As \(\rho \) is constant for all nodes, if an MLS support centered on a node \(\mathbf {x}_\mathrm{l}\) covers a point \(\mathbf {x}\), then an imaginary MLS support centered on \(\mathbf {x}\) covers \(\mathbf {x}_\mathrm{l}\). Figure 8b illustrates the relationship between \(\rho \) and the support of an MLS function centered on a data point \(\mathbf {x}\), defining the neighboring nodes influencing this data. Following the comments on discontinuities and borders, the relationship between \(\rho \) and the maximum number of neighboring interpolation nodes \(n_\mathrm{data}^\mathrm{max}\) around a given data point \(\mathbf {x}\) is

$$\begin{aligned} n_\mathrm{data}^\mathrm{max} = (2\lfloor \rho + 0.5\rfloor )^2 . \end{aligned}$$
(18)

Equation (18) only holds if the data point is not located exactly on a node position nor aligned with nodes along the x or y axis. The maximum number of neighboring nodes of a data point is therefore between \(n_\mathrm{data}^\mathrm{max}\) and \(n_\mathrm{node}^\mathrm{max}\) depending on its location.

The numbers \(n_\mathrm{node}^\mathrm{max}\) and \(n_\mathrm{data}^\mathrm{max}\) must be considered when defining \(\rho \), as they give an idea of the sparsity of System (13). In addition, both must be at least equal to the number of monomials m, which is given by the dimension of space and the polynomial order d (Sect. 4.1.2).
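Equations (17) and (18) translate directly into code; the example values below correspond to the case of \(\rho \) between 1 and 1.5 discussed in Sect. 4.1.4.

```python
import math

def n_node_max(rho):
    """Maximum number of neighboring nodes around a node, Eq. (17)."""
    return (2 * math.floor(rho) + 1) ** 2

def n_data_max(rho):
    """Maximum number of neighboring nodes around a generic data point, Eq. (18)."""
    return (2 * math.floor(rho + 0.5)) ** 2

# With rho = 1.25 and a linear basis (m = 3):
# n_node_max(1.25) == 9 and n_data_max(1.25) == 4, both greater than m.
```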

Fig. 8

Illustration of the relationship between the dilatation parameter \(\rho \), proportional to the regular spacing, and the supports of the MLS functions. a The supports for three different \(\rho \) values centered on a node \(\mathbf {x}_j\). b The imaginary supports for the same three \(\rho \) values centered on a data or visualization point \(\mathbf {x}\)

Fig. 9

Influence of the support size \(\rho \) on the method when \(\rho \) is proportional to the regular spacing and with a polynomial order d of 1

4.1.4 Practical Influence of the Support Size on the Method

Even if the theory defines a minimum value for the dilatation parameter \(\rho \) compared to the MLS parameter m (Sect. 4.1.3), this minimum value is not necessarily reliable when discontinuities are present. This is illustrated in Fig. 9 where the influence of \(\rho \) on the method is given for a polynomial order of 1 [i.e., \(m=3\) in two dimensions].

If the moment matrix \({\mathbf {A}}\) is singular at a position \(\mathbf {x}\), the implicit function u cannot be evaluated. In this case, the concerned point [e.g., data or visualization point] is excluded from the computation of distances (Sect. 3.1) and flagged as undefined. Such anomalies are not explicitly represented in Fig. 9 for visibility reasons, but their numbers per model can be found in ONLINE RESOURCE 1.

Some undefined data points (up to 14) are found in models with a \(\rho \) value between 1 and 1.5 even though these \(\rho \) values are theoretically large enough to invert the moment matrix \({\mathbf {A}}\) (i.e., \(n_\mathrm{data}^\mathrm{max} = 4 > 3\), \(n_\mathrm{node}^\mathrm{max} = 9 > 3\)). This is due to the visibility criterion (Sect. 2.1.3), which reduces the number of neighbors by cutting the supports near the discontinuities. In this case, the number \(n_\mathrm{data}^\mathrm{max}\) is too close to m and the number of actual neighbors is very likely to drop below m if a data point is close to a discontinuity. No other undefined points are found in the other models of Fig. 9.

For \(\rho \) values greater than 1.5, \(D_\mathrm{data}\) and \(D_\mathrm{model}\) present an error of a few millimeters. Such differences can be considered negligible when considering that the model’s folds and faults are several meters long. The method thus converges for a fixed number of nodes and an increasing support size. The perfect fit to the reference model (i.e., \(D_\mathrm{model} = 0\) around a \(\rho \) value of 1.99) is caused by the exact equivalence with the reference parameters (Table 1, default values).

The computational time t increases stepwise with \(\rho \). The computational cost of the method is, therefore, dependent on \(n_\mathrm{node}^\mathrm{max}\) as each unit of \(\rho \) defines a different number of neighboring nodes (Eq. 17). It is also dependent on \(n_\mathrm{data}^\mathrm{max}\) (Eq. 18). It thus seems unnecessary to define \(\rho \) greater than 2 as the same results are obtained with a slower computation. Also, the fit to data is slightly better with a \(\rho \) around 2 than with smaller values.

In conclusion, MLS functions are stable even with an increasing number of neighbors. The dilatation parameter \(\rho \) should be taken as close as possible to 2 when using a polynomial order of 1 to avoid unnecessary computational cost while obtaining similar results. It is fixed to 1.99 in the presented method (Table 1) to avoid dealing with neighbors exactly on the edge of the supports (i.e., neighbors with no influence). When using a greater polynomial order, the \(\rho \) value must be increased accordingly, which is why a \(\rho \) equal to 2.99 was used for a polynomial order d equal to 2 in Fig. 7b.

4.2 Regular or Irregular Sampling

4.2.1 Comparison Test with Random Sampling

Distributing the interpolation nodes regularly (Sect. 2.1.1) is not a requirement of the proposed method, but has several advantages. In Fig. 10, the method is tested with a varying number of interpolation nodes, which are distributed either randomly or regularly in the domain of study. The randomly generated nodes follow a uniform distribution on the x and y axes, respectively. In this case, the dilatation parameters \(\rho _{\mathrm{l}_x}\) and \(\rho _{\mathrm{l}_y}\) cannot be specified relative to the interpolation node spacing [e.g., using \(\rho \) as in Eq. (3)]. Therefore, they are fixed for all simulations, regardless of the number of nodes N and the sampling technique. In this application, \(\rho _{\mathrm{l}_x}\) is fixed to 3.5 m, and \(\rho _{\mathrm{l}_y}\) is fixed to 1.75 m, so that the supports \(S_\mathrm{l}\) of the interpolation functions cover \(\approx 0.5\%\) of the domain \(\varOmega \).

The model distance \(D_\mathrm{model}\) and data distance \(D_\mathrm{data}\) (Sect. 3.1) are represented respectively by \(D_\mathrm{model}^\mathrm{reg}\) and \(D_\mathrm{data}^\mathrm{reg}\) for regular sampling, and \(D_\mathrm{model}^\mathrm{rand}\) and \(D_\mathrm{data}^\mathrm{rand}\) for random sampling. The method's computation time is also represented for both techniques as \(t^\mathrm{reg}\) and \(t^\mathrm{rand}\). All the simulation results, together with basic statistics on the simulations, are given in ONLINE RESOURCE 1.

Figure 10 emphasizes several characteristics of the method: (i) the results are dependent on the interpolation node sampling; (ii) both sampling techniques converge to the reference model when the number of interpolation nodes increases; (iii) regular sampling yields models closer to the reference model and the data set than the average random sampling for all N; and (iv) the computational efficiencies of the two techniques are equivalent. As observed in the full results (ONLINE RESOURCE 1), both techniques can generate undefined points for small numbers of nodes N (up to 900; small compared to the used dilatation parameters), but these anomalies are more frequent with random sampling than with regular sampling.

This study shows that, for a given set of dilatation parameters \(\rho _{\mathrm{l}_x}\) and \(\rho _{\mathrm{l}_y}\), both sampling techniques yield similar results as long as a minimum number of interpolation nodes N is used. The main difference is that the number of nodes N and the dilatation parameters \(\rho _{\mathrm{l}_x}\) and \(\rho _{\mathrm{l}_y}\) can be theoretically correlated to avoid singularities with regular sampling (Sect. 4.1.3), which is not the case with random sampling. In practice, this correlation can also avoid unnecessarily large supports of interpolation and thus drastically reduce the computation time with regular sampling (Sect. 4.1.4). As a reference, it takes 0.9 s to generate a model with 10,000 nodes and a dilatation parameter \(\rho \) of 1.99 scaled on the nodal spacing (Eq. 3); the evaluated distances are \(D_\mathrm{data} = 5.14 \times 10^{-4}\,\hbox {m}\) and \(D_\mathrm{model} = 1.56 \times 10^{-3}\,\hbox {m}\).

Fig. 10

Influence of the number of interpolation nodes N on the presented method with regular and random sampling. The distances and the computation time are evaluated for both random and regular sampling separately

4.2.2 Possible Optimizations on Regular Sampling

Regular sampling could be accelerated by optimization techniques not implemented in this paper. Once an adapted dilatation parameter value \(\rho \) has been chosen for a given sampling resolution (Sect. 4.1.4), several repetitive calculations can be avoided. For instance, all the nodes far enough from the borders and the discontinuities (i.e., not affected by the visibility criterion, Sect. 2.1.3) have the same pattern of neighbors (Fig. 8a). The MLS second derivatives evaluated at these nodes are therefore the same. It is possible to evaluate these derivatives once, store the results, and use them for all nodes with the same pattern. The nodes close to the borders, but far from the discontinuities also follow patterns simple enough to be stored. This principle can even be extended for all possible neighboring configurations, but the number of tests to find the right pattern may then become computationally demanding.

Another possible improvement is to approximate the evaluation of the MLS functions on data and visualization points. When far from borders and discontinuities, the disposition of the neighboring nodes around a point varies continuously (Fig. 8b). This variation is restricted to the containing cell (i.e., the square area between four sampling nodes). Each MLS function \(\varPhi _\mathrm{l}\) could thus be approximated by studying its evolution as a function of the position \(\mathbf {x}\) within a cell. The cells are implicit (i.e., not stored) as the sampling is regular. The previous comments on nodes close to the borders also apply to this suggestion.

Finally, the presented method is adapted for parallelization, as the equations written in System (13) are independent from one another. When considering a node or a data point, the set of neighbors is defined with the chosen support size and the proximity to discontinuities; the MLS functions or their second derivatives are then evaluated; and the corresponding equation can be written in the system. Each of these steps depends only on the interpolation node or data point of the concerned equation. In addition, and contrary to mesh-based methods, handling the discontinuities with the visibility criterion (Sect. 2.1.3) does not require any preprocessing on the sampling, but only intersection tests between segments (and triangles in three dimensions). Therefore, each equation in System (13) can be written in parallel.

4.3 Complex Geometries of Structural Discontinuities

4.3.1 Lack of Neighboring Nodes

Although the visibility criterion (Sect. 2.1.3) is criticized for stability reasons (Belytschko et al. 1996), it shows satisfactory results in the presented application. The main issue is the modification of the set of neighboring nodes. Cutting the supports decreases the number of neighbors, which can produce areas where the estimation is impossible or singular (Sect. 4.1.4).

Figure 11a shows singularities and undefined values at the intersection of two faults. This is due to a lack of neighbors on visualization points, making the moment matrix \({\mathbf {A}}(\mathbf {x})\) singular in the concerned area (Sect. 4.1.2). In Fig. 11b, in addition to singularities, the generated implicit function bends abnormally away from the intersection. The unevenly distributed nodes have also deteriorated the evaluation of the second derivatives, which has impacted the solution coefficients \(u_\mathrm{l}\) attached to the concerned nodes \(\mathbf {x}_\mathrm{l}\).

Those results were generated with the default parameters (Table 1) and smaller numbers of interpolation nodes N. The described issues are thus related to the resolution of the sampling, but also to the discontinuities, their geometries and interactions. As undefined values are not acceptable in structural modeling, a solution is to use a finer resolution for the sampling, or a greater value for the dilatation parameter. The computing times given in Figs. 9 and 10 show that both solutions are possible but costly.

Fig. 11

Limits to the visibility criterion and a proposed correction. a Case of singularities and undefined values at visualization point positions when neighboring interpolation nodes are missing. b Case of badly evaluated nodal coefficients \(u_\mathrm{l}\) when neighboring interpolation nodes are unevenly distributed. c, d Corrections to the limits observed in (a, b) by randomly generating new neighbor nodes

4.3.2 Changing the Polynomial Order

An alternative strategy to address the lack of neighbors could be to locally reduce the polynomial order d: if the number of interpolation neighbors of a point is smaller than the chosen number m, d can be decreased accordingly.

Unfortunately, this solution is not applicable as it does not solve situations with no neighbors (Fig. 11a) or with unevenly distributed neighbors (Fig. 11b). Also, decreasing d to the zeroth order, if necessary, is not adapted to a structural modeling application (Sect. 4.1.1, Fig. 7).

4.3.3 Generation of New Nodes

In Sect. 4.2.1, the proposed method is shown to converge to the same model whether the interpolation nodes are distributed randomly or regularly, provided a sufficient number of interpolation nodes is used. This means that, for a position \(\mathbf {x}\), the evaluation of the MLS functions and their derivatives should be approximately the same if there are enough neighbors, even if randomly distributed. The lack of neighboring nodes can, therefore, be solved by randomly generating new interpolation nodes locally.

If a position \(\mathbf {x}\) (interpolation node, data point or visualization point) is close to a discontinuity, the following procedure can be employed:

  1. Count the number of neighboring nodes \(n_{\mathbf {x}}\).

  2. Compare this number to a reference number \(n_\mathrm{ref}\) for which a shape function is considered stable.

  3. Add neighbors randomly until \(n_{\mathbf {x}} = n_\mathrm{ref}\) within the support.

Figure 11c, d show how this technique solves the two problems exposed in Sect. 4.3.1, respectively, with a reference number \(n_\mathrm{ref}\) fixed to 4. Nodes generated on the other side of the faults are deleted during the generation as they are not considered neighbors (i.e., not within the support). The procedure could be improved by node placement strategies, such as a repulsion factor to generate nodes evenly around a position \(\mathbf {x}\). This could increase the chances of obtaining a stable interpolation in those areas with a small \(n_\mathrm{ref}\), which is not guaranteed in the present approach. Adding interpolation nodes changes the problem's dimensions and its density (Sect. 2.2), but this technique does not drastically change the computational efficiency of the algorithm as the modifications are local. A sketch of this procedure is given below.
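The sketch reuses segments_cross from the visibility sketch; the uniform rejection sampling inside the support and the function name are our assumptions (no repulsion factor is implemented here).

```python
import numpy as np

def complete_neighborhood(x, rho, faults, n_current, n_ref=4, rng=None):
    """Draw new interpolation nodes uniformly inside the support of x until
    n_ref visible neighbors are available; candidates hidden from x by a
    discontinuity are rejected."""
    rng = rng or np.random.default_rng(0)
    new_nodes, n = [], n_current
    while n < n_ref:
        cand = np.asarray(x) + rng.uniform(-1.0, 1.0, size=2) * np.asarray(rho)
        if not any(segments_cross(x, cand, a, b) for a, b in faults):
            new_nodes.append(cand)
            n += 1
    return np.array(new_nodes)
```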

4.4 Bending Energy and Structural Modeling

Structural modeling algorithms are supposed to create models that represent geological structures. As a smoothing energy, the bending energy has been shown to be robust enough to deal with sparse, irregular and noisy data (Sect. 3). This ability is key for complex applications.

Unfortunately, smoothing methods perform badly on poorly sampled anisotropic features, periodic features, and thickness variations. A common strategy is to add artificial data for these issues: orientation and gradient data are typically used to control a fold geometry away from the data (Hillier et al. 2014; Laurent 2016) or to include a known periodicity in fold and foliation structures (Grose et al. 2017; Laurent et al. 2016). Similar constraints could also be used to impose some thickness variations. Such approaches often require expertise and manual interactions.

Other approaches aim to incorporate the structural anisotropy, calibrated from the available data, into the smoothing regularization itself. This is generally included in PFM with the experimental variogram, modifying the covariance and its range (Aug 2004). Gonçalves et al. (2017) further reduce the workload by using the maximization of the log-likelihood to automatically determine such parameters. However, the global interpolation of PFM limits the application of such techniques when assessing local anisotropy. In volume contouring, Martin and Boisvert (2017) infer the local anisotropy of the targeted geobodies by iteratively adapting partitions of the domain and local interpolations per partition.

The presented formalism introduces the concept of continuous energies in a local approach well adapted to the structural modeling application. While some versions of PFM intrinsically minimize the bending energy by using Green's functions as interpolants, it is explicitly minimized in the least squares sense in the proposed approach. It is then closer to DSI, although the regularization term is posed as a continuous, and not discrete, operator. With the proposed formalism, different energies and their ability to solve specific geological features can be tested by changing the regularization term in Eq. (12) and discretizing it accordingly in System (13). It is also possible to mix several energies and even to tune the weight \(\lambda _\epsilon \) for each derivative term separately and spatially.

As an example, Fig. 12 shows how the algorithm behaves when using the Dirichlet energy (Courant 1950) as

$$\begin{aligned} J_\mathrm{Dirichlet}(u) = \frac{1}{2}\int _{\varOmega } \lambda _\epsilon ^2 ||\nabla u||^2 \hbox {d}\varOmega . \end{aligned}$$
(19)

The obtained implicit function tends to be constant where data are missing and perpendicular to the discontinuities. Therefore, the Dirichlet energy is not adapted to the structural modeling problem. The best continuous energy fitting geological structures is yet to be found, and the proposed framework may be used to test new ideas.
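To illustrate how the regularization can be swapped in System (13), the sketch below builds the rows corresponding to the Dirichlet energy of Eq. (19) instead of the bending-energy rows; phi_dx and phi_dy are hypothetical callables returning the length-N rows of first derivatives of the shape functions (Eq. 8).

```python
import numpy as np

def dirichlet_rows(nodes, nu, lam_eps, phi_dx, phi_dy):
    """Regularization rows for the Dirichlet energy of Eq. (19): two
    first-derivative rows per node replace the three bending-energy rows
    of Eq. (13); the right-hand side entries remain zero."""
    rows = []
    for xl in nodes:
        rows.append(lam_eps * np.sqrt(nu) * phi_dx(xl))
        rows.append(lam_eps * np.sqrt(nu) * phi_dy(xl))
    return rows
```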

Fig. 12

Resulting implicit function using the Dirichlet energy instead of the bending energy

5 Conclusions

In this paper, an alternative to the existing implicit structural modeling methods is proposed. It is a locally defined method that handles the structural discontinuities with meshless concepts. The implicit function is defined as a weighted sum of moving least squares shape functions (Lancaster and Salkauskas 1981; McLain 1976). The supports of these interpolation functions are centered on regularly distributed nodes and cut by the structural discontinuities with the visibility criterion. Therefore, no mesh needs to be stored and the system to solve is sparse.

The proposed method consists of a spatial regression of data points penalized by a physical energy. It thus introduces the explicit use of a continuous energy to solve the modeling problem. In this paper, the bending energy (Dubrule 1984; Wahba 1990), which has the ability to filter noise, is suggested to extrapolate between data gaps and to handle clustered and sparse data by smoothing the generated structures. Based on this formalism, it is possible to test other energies and to mix them to better relate to geology. In addition, the related continuous equations can be discretized in many other ways than the one presented here, centering the modeling problem on the choice of the regularization and not on the discretization itself. Although only the typical smoothness assumption on structures is adopted here, the introduced concept may have the potential to enable inherent control of complex geological cases.

The sensitivity tests on data quality and the method's parameters are presented in two dimensions. This provides insight into the relations between the results and the parameters that also applies to other models. Only two-dimensional results are shown for simplicity, but the suggested formalism is applicable to higher dimensions. Figure 13 shows a result on a synthetic three-dimensional model.

Fig. 13

Resulting implicit function on a synthetic model in three dimensions using 8815 data points with two faults, one erosional unconformity, folds, and thickness variations. The discontinuity objects are the transparent triangulated surfaces. The implicit function was computed with \((40\times 40\times 30)\) nodes in 25 s