1 Introduction

The intrinsic indeterminacy of the inverse gravimetric problem is well known, and the whole set of possible internal mass distributions compatible with a given external gravity potential has been fully characterized on purely mathematical grounds (e.g. Parker 1975; Sampietro and Sansò 2012). However, in order to obtain realistic solutions, some constraints have to be added to the solution of the inverse gravimetric problem. For instance, the solution can be derived from the “experience” of an operator assisted by fast forward algorithms (Parker 1973; Caratori Tontini et al. 2009; Gordon et al. 2012) and from generic geological information by means of trial and error procedures.

Another possible approach is to add strong constraints in terms of mass contrast, leading to a search for the geometry of discontinuity surfaces (Barbosa et al. 1997, 1999; Fedi and Rapolla 1999; Fedi 2006). This approach is commonly called the non-linear inverse problem, due to the non-linearity of the functional relating gravity observations to the geometrical parameters of the sources. By contrast, in the so-called linear inversion (Last and Kubik 1983; Guillen and Menichetti 1984; Barbosa and Silva 1994) there is a linear relation, in terms of the Newtonian integral, between the mass density and the functional of the external gravity potential, which is usually discretized as a summation over volume elements (voxels). In the linear problem the relation between data and unknowns is unique when the number of voxels is conveniently taken smaller than the number of observations. However, this relation is highly unstable because, as the dimension of the voxels decreases, we approach the continuous setting, where non-uniqueness is large, as recalled above. Therefore the solution is usually obtained by imposing proper constraints. This can be done either under deterministic models (Medeiros and Silva 1996) or stochastic ones (Tarantola and Valette 1982; Tarantola 2002). In any case this approach reduces to the optimization of some non-linear, often quadratic, functional of the gravity observations and the unknown mass distribution. This optimization can be performed by Markov Chain Monte Carlo methods, including simulated annealing (Nagihara and Hall 2001; Roy et al. 2005), as is well known in the literature. Naturally the relation between sources and observations, i.e. the forward model, can be conveniently computed using a Fourier approach that greatly reduces the computational time.

This paper follows the above line of reasoning, but also tries to incorporate the interactive approach mentioned at the beginning, by modelling the geological information in a Bayesian framework as a prior probability. This idea is already present in the geophysical literature, even coupling gravimetric and magnetic observations (e.g. Bosch 1999, 2004; Bosch and McGaughey 2001; Mosegaard and Tarantola 2002; Bosch et al. 2006; Guillen et al. 2008). In particular we propose here an approach similar to the one presented in Guillen et al. (2008), in which a field of discrete variables (namely geological units) is introduced as an additional unknown, with some prior information. As will be explained in the following, the main differences with respect to Guillen et al. (2008) lie in the way the prior information is formalized and in the algorithm used to find the solution of the inverse problem. Note that this work represents only a preliminary study, mainly focused on the mathematical formalization of the problem, and that the improvement of the method is still a matter of investigation. Since we want to estimate the MAP (Maximum A Posteriori) of our posterior distribution, we face an optimization problem in which part of the variables are discrete. The proposed solution resorts to a Gibbs sampler combined with simulated annealing (Smith and Roberts 1993; Sansò et al. 2011), as can be widely found in the literature; in particular, the application of the method to image analysis, starting with the seminal paper by Geman and Geman (1984), is worth mentioning.

One remark, however, can already be put forward in this introduction: while image analysis deals only with “local” observations, i.e. observations that depend solely on the pixel to be updated in the Gibbs sampler, in our case a variation of density at any point will instead affect all the observable gravity anomalies, wherever they are.

2 Problem Formalization

Similarly to Guillen et al. (2008), the inversion algorithm is developed assuming that some geological information is available in the studied region. In detail, we suppose that we know a list of all the possible geological units present in the area and their approximate geometrical distribution (e.g. from geological sections). We also suppose that we know, for each geological unit, the most probable density and its variability (e.g. from the literature). However, while in Guillen et al. (2008) only the boundaries of the geological units can be modified, in some cases merging separate portions of features or removing isolated ones, in the proposed method the formalization of the prior probability allows a more general solution to the problem, e.g. the possibility to generate new features.

In the following we formalize these assumptions in a Bayesian scheme: we start from the Bayes theorem in the usual form (Bayes 1984; Box and Tiao 2011):

$$\displaystyle{ P\left (\mathbf{x}\vert \mathbf{y}\right ) \propto \mathcal{L}\left (\mathbf{y}\vert \mathbf{x}\right )P\left (\mathbf{x}\right ) }$$
(1)

where y is a vector of observable quantities, while x is a vector of body parameters. The investigated volume is split into voxels \(V_{i}\), with index \(i = 1, 2, \ldots, N\); each voxel will carry two parameters \(\left (\rho _{i},L_{i}\right )\), where \(\rho_{i}\) is the voxel mass density and \(L_{i}\) is a “label” attributing to \(V_{i}\) the presence of a certain geological unit chosen from the a-priori archive (e.g. water, sediment, salt, rock of a given type, etc.). So \(\rho_{i}\) is a continuous variable and \(L_{i}\) a discrete one, taking values among the M integers denoting the various materials.

Crucial is the way in which the prior probability \(P\left (\mathbf{x}\right )\) is supplied, namely the shape of the distribution \(P\left (\mathbf{x}\right ) = P\left (L_{1},\rho _{1};L_{2},\rho _{2};\ldots;L_{N},\rho _{N}\right )\). We assume that:

$$\displaystyle\begin{array}{rcl} P\left (\mathbf{x}\right )& =& \prod _{i=1}^{N}P\left (\rho _{ i}\vert L_{i}\right ) \cdot P\left (\mathbf{L}\right ) = \\ & =& \prod _{i=1}^{N}P\left (\rho _{ i}\vert L_{i}\right ) \cdot P\left (L_{1},L_{2},\ldots,L_{N}\right ){}\end{array}$$
(2)

meaning that, once a label \(L_{i} = \ell\) has been chosen for \(V_{i}\), the corresponding density will follow the law \(P\left (\rho _{i}\vert L_{i} =\ell\right )\), which in our case is a normal distribution:

$$\displaystyle{ P\left (\rho _{i}\vert L_{i} =\ell\right ) \sim \mathcal{N}\left (\overline{\rho }_{\ell},\sigma _{\ell}^{2}\right ) }$$
(3)

with the mean \(\overline{\rho }_{\ell}\) and the variance \(\sigma _{\ell}^{2}\) taken from the geological literature. In this respect, a comprehensive set of rock properties can be found for instance in Christensen and Mooney (1995). As for the prior \(P\left (\mathbf{L}\right ) \equiv P\left (L_{1},L_{2},\ldots,L_{N}\right )\), we assume a Gibbs distribution (Azencott 1988):

$$\displaystyle{ P\left (\mathbf{L}\right ) \propto e^{-\mathcal{E}\left (\mathbf{L}\right )} }$$
(4)

where the energy \(\mathcal{E}\left (\mathbf{L}\right )\) depends both on the values \(\ell_{i}^{o}\) of \(L_{i}\) provided by the geological model, and on cliques (Geman and Geman 1984) of order two expressing the fact that the value of \(L_{i}\) is more likely to be equal to the labels of the nearest neighbouring voxels, according to the following rules:

$$\displaystyle{ P\left (L_{i} =\ell \vert \mathbf{L}_{\Delta i}\right ) \propto e^{-\gamma s^{2}\left (L_{ i},\ell_{i}^{o}\right )-\lambda \sum \limits _{ j\in \Delta _{i}}q^{2}\left (L_{ i},L_{j}\right )} }$$
(5)

where γ, λ are parameters to be empirically tuned,

$$\displaystyle\begin{array}{rcl} s^{2}\left (L_{ i},\ell_{i}^{o}\right ) = s_{ i}^{2} = \left \{\begin{array}{@{}l@{\quad }l@{}} 0\quad &\mbox{ if }L_{i} =\ell_{ i}^{o} \\ \alpha _{i} \quad &\mbox{ if }L_{i}\neq \ell_{i}^{o} \end{array} \right.& &{}\end{array}$$
(6)
$$\displaystyle\begin{array}{rcl} q^{2}\left (L_{ i},L_{j}\right ) = q_{ij}^{2} = \left \{\begin{array}{@{}l@{\quad }l@{}} a_{i} \quad &\mbox{ if }L_{i} = L_{j} \\ a_{ij}\quad &\mbox{ if }L_{i}\neq L_{j} \end{array} \right.& &{}\end{array}$$
(7)

with \(V _{j} \in \Delta _{i}\), where \(\Delta _{i}\) is the neighbourhood of the voxel \(V_{i}\) defined by the cliques of order two, as mentioned above.

Note that, given the geological model, it is possible to create a table of proximity of the geological units and then, by tuning \(\alpha_{i}\), \(a_{i}\) and \(a_{ij}\), to create a hierarchy of the most probable values for \(L_{i}\). For example, supposing we have three units, \(\ell= \{1,2,3\}\), and a proximity table as the one presented in Fig. 1, this translates into the following definition:

$$\displaystyle\begin{array}{rcl} s_{i}^{2}& =& \left \{\begin{array}{@{}l@{\quad }l@{}} 0\quad &\mbox{ if }L_{i} =\ell_{ i}^{o} \\ \alpha \quad &\mbox{ if }L_{i}\mbox{ is a geological neighbour of }\ell_{i}^{o} \\ \beta \quad &\mbox{ if }L_{i}\mbox{ is not a geological neighbour of }\ell_{i}^{o}\\ \quad \end{array} \right.{}\end{array}$$
(8)
$$\displaystyle\begin{array}{rcl} q_{ij}^{2} = \left \{\begin{array}{@{}l@{\quad }l@{}} a\quad &\mbox{ if }L_{i} = L_{j} \\ b\quad &\mbox{ if }L_{i}\mbox{ is a geological neighbour of }L_{j} \\ c\quad &\mbox{ if }L_{i}\mbox{ is not a geological neighbour of }L_{j}\\ \quad \end{array} \right.& &{}\end{array}$$
(9)

with β > α > 0 and c > b > a.
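As an illustration, the rules (8) and (9) together with a proximity table like the one of Fig. 1 could be coded as in the following minimal sketch; the names and the penalty values are purely illustrative assumptions, not part of the original method.

```python
# Hypothetical encoding of a proximity table like that of Fig. 1:
# unit 1 may border unit 2, but not unit 3 (further pairs would be
# added as the table dictates; values are illustrative only).
NEIGHBOURS = {(1, 2), (2, 1)}

ALPHA, BETA = 1.0, 10.0   # s^2 penalties, with beta > alpha > 0, cf. (8)
A, B, C = 0.0, 1.0, 10.0  # q^2 penalties, with c > b > a, cf. (9)

def s2(label, label_geo):
    """Rule (8): penalty for deviating from the geological model label."""
    if label == label_geo:
        return 0.0
    return ALPHA if (label, label_geo) in NEIGHBOURS else BETA

def q2(label_i, label_j):
    """Rule (9): penalty for label disagreement between neighbouring voxels."""
    if label_i == label_j:
        return A
    return B if (label_i, label_j) in NEIGHBOURS else C
```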

Fig. 1 Example of a proximity table. The geological unit 1 can be close to unit 2, but not to unit 3

Summarizing, the geological information enters the solution by providing the set of possible geological units (i.e. the possible labels) with their mean density and its variability, the neighbourhood relationships between the different geological units, and the most probable label \(\ell_{i}^{o}\) of each voxel. All these data can be derived from geological studies of the area (e.g. geological sections or maps) or through geophysical techniques.

Two remarks are in order: the first is that L, with prior \(P\left (\mathbf{L}\right )\), is indeed a Markov random field (MRF), see Rozanov (1982). The second is that the final result of our optimization will depend on the chosen values of all the constants, which have to be tuned for the specific example.

As always for an MRF, the characteristics, namely the conditional distributions (5), determine a joint distribution \(P\left (\mathbf{L}\right )\) such that:

$$\displaystyle{ \log P\left (\mathbf{L}\right ) \propto -\dfrac{1} {2}\gamma \sum _{i=1}^{N}s^{2}\left (L_{ i},\ell_{i}^{o}\right ) -\dfrac{1} {2}\lambda \sum _{i=1}^{N}\sum _{ j\in \Delta _{i}}q^{2}\left (L_{ i},L_{j}\right ). }$$
(10)

The logarithm of the posterior distribution (1) can then be written as:

$$\displaystyle\begin{array}{rcl} \log P\left (\mathbf{x}\vert \mathbf{y}\right )& =& \log P\left (\boldsymbol{\rho },\mathbf{L}\vert \boldsymbol{\Delta g}^{o}\right ) \propto \\ & &\propto -\frac{1} {2}\left (\boldsymbol{\Delta g}^{o} -\text{A}\boldsymbol{\rho }\right )^{T}\text{C}_{ \Delta g}^{-1}\left (\boldsymbol{\Delta g}^{o} -\text{A}\boldsymbol{\rho }\right ) + \\ & & -\frac{1} {2}\left (\boldsymbol{\rho }-\boldsymbol{\overline{\rho }}\right )^{T}\text{C}_{\rho }^{-1}\left (\boldsymbol{\rho }-\boldsymbol{\overline{\rho }}\right ) -\frac{1} {2}\gamma \sum _{i=1}^{N}s^{2}\left (L_{ i},\ell_{i}^{o}\right ) + \\ & & -\frac{1} {2}\lambda \sum _{i=1}^{N}\sum _{ j\in \Delta _{i}}q^{2}\left (L_{ i},L_{j}\right ) {}\end{array}$$
(11)

where we recall that \(\boldsymbol{\Delta g}^{o}\) is the vector of observed gravity anomalies, \(\mathrm{C}_{\Delta g}\) its noise covariance matrix, A the forward modelling operator from densities to gravity anomalies, \(\boldsymbol{\rho }\) and \(\boldsymbol{\overline{\rho }}\) the vectors of components \(\rho_{i}\) and \(\overline{\rho }_{i} = \overline{\rho }\left (\ell_{i}\right )\), \(\mathrm{C}_{\rho}\) the corresponding covariance matrix, and \(s^{2}\left (L_{i},\ell_{i}^{o}\right )\), \(q^{2}\left (L_{i},L_{j}\right )\) given by (6) and (7). This is the target function we want to maximize with respect to \(\rho_{i}\) and \(L_{i}\).
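For concreteness, the target (11) could be evaluated as in the following sketch, where the forward operator, the covariance matrices and the s² and q² functions of the previous paragraphs are assumed to be available; all names are hypothetical.

```python
import numpy as np

def log_posterior(rho, labels, dg_obs, A, Cdg_inv, Crho_inv, rho_bar,
                  labels_geo, neighbourhoods, gamma, lam, s2, q2):
    """Sketch of the log-posterior (11), up to an additive constant.

    rho, labels    : (N,) voxel densities and labels
    dg_obs         : (m,) observed gravity anomalies
    A              : (m, N) forward operator from densities to anomalies
    rho_bar        : (N,) prior mean densities implied by the labels
    neighbourhoods : list of index lists, one Delta_i per voxel
    """
    r = dg_obs - A @ rho                 # gravity residuals
    d = rho - rho_bar                    # deviations from the prior means
    lp = -0.5 * (r @ Cdg_inv @ r) - 0.5 * (d @ Crho_inv @ d)
    lp -= 0.5 * gamma * sum(s2(labels[i], labels_geo[i])
                            for i in range(len(labels)))
    lp -= 0.5 * lam * sum(q2(labels[i], labels[j])
                          for i in range(len(labels))
                          for j in neighbourhoods[i])
    return lp
```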

The maximization of (11), due to the fact that some variables are discrete, is never an easy task, as we know from other important problems in geodesy, e.g. GNSS initial phase ambiguity fixing (De Lacy et al. 2002). The idea, borrowed from image analysis, is to apply a Gibbs sampler combined with simulated annealing (Casella and Robert 1999). In order to apply it to both the variables \(\left (\rho _{i},L_{i}\right )\), which are functions of the voxel \(V_{i}\), we have simplified the problem by considering \(\rho_{i}\) as a discrete variable too. In practice we have substituted the normal distribution (3) with a discrete distribution on K values, e.g. on five values taken at the mean \(\overline{\rho }_{\ell}\) and at \(\overline{\rho }_{\ell} \pm \sigma _{\ell}\), \(\overline{\rho }_{\ell} \pm 2\sigma _{\ell}\) respectively. To each value the proper probability is assigned, according to the normal law. Once this is done, the Gibbs sampler is applied by drawing one couple \(\left (\rho _{i},L_{i}\right )\) at a time, holding all the other values fixed and following a simple updating routine. The probabilities of the sampling are computed from (11), letting \(\rho_{i}\) run over its K values and \(L_{i}\) run over \(1, 2, \ldots, M\); in this way we have a table of K × M nodes with their probabilities.
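As a minimal sketch, the discretization of the normal density prior (3) onto K = 5 support points might look as follows (the function name is a hypothetical choice of ours):

```python
import numpy as np

def discretize_density(rho_bar, sigma, K=5):
    """Replace the normal prior (3) by K weighted support points,
    e.g. rho_bar and rho_bar +/- sigma, +/- 2*sigma for K = 5."""
    offsets = np.arange(K) - K // 2        # e.g. -2, -1, 0, 1, 2
    values = rho_bar + offsets * sigma     # the K candidate densities
    weights = np.exp(-0.5 * offsets**2)    # normal law evaluated at the nodes
    return values, weights / weights.sum()
```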

In practice, the probability of x is modulated by introducing a “temperature” parameter T:

$$\displaystyle{ P_{T}\left (\mathbf{x}\right ) \propto e^{ \frac{1} {T}\log P\left (\mathbf{x}\vert \mathbf{y}\right )} }$$
(12)

and T is slowly reduced at each step (e.g. by 5% of its value). In this way, starting from a very large T, we obtain a sequence of samples converging in probability to the point \(\overline{\mathbf{x}}\) where the maximum of \(\log P\left (\mathbf{x}\vert \mathbf{y}\right )\) is achieved (Azencott 1988).
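The overall scheme might be sketched as follows; this is only an illustrative implementation of the annealed Gibbs sweep, where local_log_posterior is a hypothetical function returning, for voxel i, the log-posterior (11) of all K × M candidate couples with the other voxels held fixed.

```python
import numpy as np

def annealed_gibbs(x0, local_log_posterior, n_iter, T0=100.0, cooling=0.95):
    """Sketch of the Gibbs sampler with simulated annealing, cf. (12).

    x0 : (N,) initial state indices, one per voxel, each indexing a
         (rho_i, L_i) couple in the K x M table of candidates.
    """
    rng = np.random.default_rng(0)
    x, T = x0.copy(), T0
    for _ in range(n_iter):
        for i in range(len(x)):                   # one full sweep over the voxels
            logp = local_log_posterior(i, x) / T  # temperature modulation (12)
            p = np.exp(logp - logp.max())         # numerically stable softmax
            x[i] = rng.choice(len(p), p=p / p.sum())
        T *= cooling                              # e.g. 5% reduction per step
    return x
```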

3 Numerical Experiment

In order to assess the effectiveness of the presented Bayesian approach, which is able to take into account also qualitative geological information, two simple experiments are carried out. They consist of recovering the mass density distribution of 3D synthetic models from their gravitational field. The density of each voxel is assumed to be equal to the mean density of the associated geological unit; moreover, the model is assumed constant along one planar direction, i.e. all the vertical cross sections in this direction are equal. From this reference model the two inputs of the inversion algorithm, i.e. the gravitational signal and the approximate geological model, are simulated. In particular, the latter is obtained by slightly modifying the labels of the reference model. The inversion algorithm is then applied and the result is compared with the reference model in a closed-loop test.

In this work we present two numerical examples: the first simulates the recovery of a bathymetry, while the second consists of recovering the shape of a salt dome.

In the bathymetry model only two geological units are considered, water and bedrock, defined by \(\rho_{w} = 1{,}000\) kg m\(^{-3}\), \(\sigma_{w} = 5\) kg m\(^{-3}\) and \(\rho_{b} = 2{,}900\) kg m\(^{-3}\), \(\sigma_{b} = 50\) kg m\(^{-3}\) respectively. The investigated area is a square of 30 km side and has a depth of 5 km. A vertical cross section of the synthetic model, displayed in terms of “labels”, is represented in Fig. 2a. The volume is modelled by means of 1,200 rectangular prisms, each of dimensions 1.5 km (x) × 5.0 km (y) × 0.5 km (z), and the gravitational observations are simulated by means of the Nagy equations (see Nagy 1966) in a noiseless scenario. In particular, the observations are generated on a regular grid at an altitude of 250 m and with a spatial resolution of 1 km, thus simulating the result of an aerogravimetric flight. As explained above, the geological model is simulated by slightly modifying the reference model, as shown in Fig. 2b. The two parameters λ and γ are empirically set to the values 0.833 and 0.733 respectively, and the labels are randomly initialized from a uniform distribution (i.e. drawn with an infinite temperature in the simulated annealing). The solution is obtained in about 5,000 iterations and about 4 h on a common personal computer. A vertical cross section of the resulting model is depicted in Fig. 2c, showing how the error in the geological model is properly corrected: 86% of the wrong labels are corrected and the error on density has a standard deviation of 216 kg m\(^{-3}\).
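For reference, the vertical attraction of a single rectangular prism can be computed with the Nagy (1966) closed form; the sketch below is one possible implementation (variable names and sign conventions are ours, and edge cases such as computation points on prism faces are ignored).

```python
import numpy as np

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def prism_gz(rho, x, y, z):
    """Vertical attraction (m s^-2) of a right rectangular prism of density
    rho (kg m^-3); x, y, z are the (min, max) edge coordinates (m) of the
    prism relative to the computation point, z positive downwards."""
    gz = 0.0
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            for k, zk in enumerate(z):
                r = np.sqrt(xi**2 + yj**2 + zk**2)
                term = (xi * np.log(yj + r) + yj * np.log(xi + r)
                        - zk * np.arctan2(xi * yj, zk * r))
                gz += (-1) ** (i + j + k + 1) * term  # corner signs
    return G * rho * gz  # multiply by 1e5 to convert to mGal
```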

Fig. 2 Vertical cross sections representing the geological units (“labels”) of the bathymetry test. (a) reference model; (b) geological model; (c) solution

In the salt dome experiment three geological units are considered: salt dome (\(\rho_{dome} = 2{,}000\) kg m\(^{-3}\), \(\sigma_{dome} = 50\) kg m\(^{-3}\)), salt (\(\rho_{salt} = 2{,}700\) kg m\(^{-3}\), \(\sigma_{salt} = 50\) kg m\(^{-3}\)) and sediments (\(\rho_{sed} = 3{,}000\) kg m\(^{-3}\), \(\sigma_{sed} = 50\) kg m\(^{-3}\)). The volume is modelled by means of 2,400 voxels, each of size 0.4 km (x) × 0.1 km (y) × 0.3 km (z). The investigated area has a planar size of 3 km × 2 km and a depth of 6 km. The geological units of a vertical cross section of the synthetic model are shown in Fig. 3a. The gravitational signal is simulated using point masses in a white-noise scenario (noise standard deviation \(\sigma _{\Delta g} = 1\) mGal). The simulated geological model is shown in Fig. 3b. In Fig. 4 three examples of the prior distribution are depicted, showing its dependence on the functions \(s^2\) and \(q^2\) defined in (8) and (9). These sample distributions are obtained by counting the occurrences of each geological unit for each voxel and then computing the corresponding relative frequencies over 2,000 samples. From these three examples it can be noticed that the \(s^2\) function controls the “certainty” of the geological unit of each voxel (the closer the numerical values of the parameters α and β in (8), the less informative the prior), while \(q^2\) is related to the “certainty” of the geological unit boundaries (the closer the numerical values of a, b and c in (9), the less reliable the boundaries). As for λ and γ, they are constants that in practice control the relative weight in the prior (2) between the density information and the geometrical one.
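The point-mass forward model used here can be sketched as a design matrix A mapping voxel densities to vertical anomalies, in the spirit of (11); this is a simplified illustration under our own naming and sign conventions.

```python
import numpy as np

G = 6.674e-11  # m^3 kg^-1 s^-2

def point_mass_design(obs_xyz, voxel_xyz, voxel_volume):
    """Sketch of the forward operator A of (11), with each voxel collapsed
    to a point mass at its centre, so that Delta g = A @ rho.

    obs_xyz   : (m, 3) observation coordinates, z positive upwards
    voxel_xyz : (N, 3) voxel centre coordinates
    """
    d = obs_xyz[:, None, :] - voxel_xyz[None, :, :]  # (m, N, 3), obs minus voxel
    r = np.linalg.norm(d, axis=-1)                   # (m, N) distances
    # vertical (downward-positive) attraction per unit density
    return G * voxel_volume * d[..., 2] / r**3
```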

Fig. 3 Vertical cross sections representing the geological units (“labels”) of the inputs to the salt dome test. (a) reference model; (b) geological model

Fig. 4 Vertical cross sections representing the relative frequency of each geological unit (“labels”) obtained from 2,000 realizations of the prior distribution. Each row is computed assuming different values of the prior parameters. (a) γ = 0.6, λ = 0.03, \(s^{2} = \left \{0,1,10\right \}\forall \,i\) and \(q^{2} = \left \{0,1,10\right \}\forall \,i,j\); (b) γ = 0.6, λ = 0.03, \(s^{2} = \left \{0,0.5,5\right \}\forall \,i\) and \(q^{2} = \left \{0,0.5,5\right \}\forall \,i,j\); (c) γ = 0.6, λ = 0.03, \(s^{2} = \left \{0,0.01,0.1\right \}\forall \,i\) and \(q^{2} = \left \{0,0.5,2\right \}\forall \,i,j\)

The solution is computed by fixing the prior parameters λ = 0.03, γ = 0.6, α = 1, β = 10, a = 0, b = 1 and c = 10 (see Fig. 4a); it is obtained in about 200 iterations and 2 h, and is shown in Fig. 5. In this case the algorithm is able to recover about 60% of the wrong voxels, and the error on density has a standard deviation of 244 kg m\(^{-3}\). It can be seen from the salt dome experiment that the algorithm properly recovers the shallowest part of the investigated volume, while the deepest part still presents incorrect features. This is probably due to the fact that the functions \(s^2\) and \(q^2\) are defined in the same way for the whole region, while a dependence at least on the vertical coordinate should be included.

Fig. 5 Vertical cross sections representing the solution of the salt dome test. (a) geological units; (b) density

4 Conclusions and Future Work

In the present paper a Bayesian approach to invert gravity data with the support of a given geological model has been studied. The method works properly, at least in the preliminary test scenarios performed. The two main limiting factors are the choice of all the parameters entering the formulation of the a-priori probability and the computational time.

In this respect, in order to limit the impact of user decisions on the solution, it would be useful to implement a semi-automatic determination of the optimal numerical values of the \(s^2\) and \(q^2\) functions and of the λ and γ parameters. These parameters in fact modulate how close the final solution is to the geological model and to the gravity observations.

The order of magnitude of these parameters, as seen from the numerical experiments, is strongly linked to the extent of the investigated volume, to the total number of voxels and to the “certainty” of the geological model. A further foreseen improvement is to consider possible dependences of \(s^2\) and \(q^2\) on the voxel position, thus allowing the prior to be more informative where the geological model is considered more reliable (e.g. in the presence of borehole logging).

Last but not least, the algorithm needs to be numerically optimized in order to increase the model resolution. This step will imply a significant growth of the total number of variables, thus increasing the total computational burden.