1 Introduction

The main objective of reservoir geophysics is to predict reservoir properties from geophysical data to build an initial model for fluid flow simulations and production predictions. Generally, there are two main mathematical problems of interest in reservoir geophysics: the prediction of elastic properties from seismic data (seismic inversion) and the prediction of rock and fluid properties from elastic attributes (petrophysical or rock physics inversion). Both mathematical problems can be seen as inverse problems, because the objective is the assessment of the model variables from the data, assuming that the physical relations between the variables to be predicted and the measured data can be modeled. Indeed, the physics that links rock and fluid properties to geophysical measurements is generally known for conventional reservoirs (Aki and Richards 1980; Sheriff and Geldart 1995; Avseth et al. 2005; Mavko et al. 2009; Dvorkin et al. 2014). A common method in seismic inversion is amplitude versus offset (AVO) inversion, a pre-stack seismic inversion technique for the prediction of elastic properties, such as velocities and density, in the reservoir. In AVO inversion, the forward model is based on the assumption that the reflections at the subsurface interfaces in the reservoir depend on the elastic properties of the porous rocks in the reservoir layers. Petrophysical inversion aims to predict rock and fluid properties, such as porosity and saturation, from the set of elastic attributes obtained from seismic inversion. The forward model is based on rock physics relations calibrated using well log data or core measurements.

The solution of these inverse problems is still challenging due to the uncertainty in the measurements and the non-uniqueness of the solution. Several methods have been proposed in geophysics. These methodologies can be deterministic methods such as optimization-based methods (Aster et al. 2011; Sen and Stoffa 2013), or probabilistic approaches such as Bayesian inversion (Tarantola 2005). Most of these approaches have been first applied to the seismic inversion problem and then extended to the petrophysical inversion (Doyen 2007). Because of the non-uniqueness of the solution and the uncertainty in the data, the solution of the inverse problem should not be limited to a single set of predicted values, but it should be represented by a probability density function (pdf) to quantify the model uncertainty. Therefore, a probabilistic setting appears to be the most suitable approach for the above-mentioned geophysical inverse problems.

Bayesian inversion methods are commonly used in geophysics for solving inverse problems and predicting unknown model variables from measured data in the subsurface (Scales and Tenorio 2001; Ulrych et al. 2001; Tarantola 2005). The traditional Bayesian framework for the prediction and uncertainty quantification of elastic properties from seismic data was introduced in Buland and Omre (2003) and later adopted for litho-fluid prediction from seismic data. Buland and Omre (2003) proposed a Bayesian approach to AVO seismic inversion for the prediction of seismic velocity and density, using a linearized model based on the convolution of the seismic wavelet and the AVO linearized approximation. Larsen et al. (2006), Gunning and Glinsky (2007), Ulvmoen and Omre (2010), Grana and Della Rossa (2010), Rimstad and Omre (2010), and Buland et al. (2012) introduced rock physics models in a Bayesian inversion setting to predict the rock and fluid properties in the reservoir conditioned by the geophysical measurements and to assess the uncertainty associated with the predictions. Furthermore, Buland and Kolbjørnsen (2012) extended the Bayesian approach to the inversion of electromagnetic data for resistivity prediction. Bayesian inverse methods have also been combined with geostatistical algorithms for generating multiple realizations from the posterior distributions: Asli et al. (2000) and Gloaguen et al. (2005) proposed inversion methods based on co-Kriging and cosimulations for gravity and borehole radar velocity; Gloaguen et al. (2004) proposed a Bayesian tomographic inversion using geostatistical simulations; and Hansen et al. (2006) proposed an inverse method that combines sequential simulations and linear Gaussian inversion for seismic applications.

Two common assumptions in Bayesian inversion are the Gaussian prior distribution of the model, and the linearity of the physical relation that links the model to the data (i.e., the likelihood function). These two assumptions are not necessary for the solution of the inverse problem, but they allow the analytical evaluation of the solution of the Bayesian inverse problem. Otherwise, Markov chain Monte Carlo (McMC) methods can be used to sample from the prior and accept or reject the proposed model according to the likelihood of observing the measured data from the proposed model (Mosegaard and Tarantola 1995; Sen and Stoffa 1996). Stochastic approaches have also been presented in Doyen (1988) for porosity estimation and Doyen and Den Boer (1996) for elastic property estimation. Recently, Connolly and Hughes (2016) presented a stochastic inversion based on pseudo-wells. However, straightforward stochastic simulations of inverse problem solutions involving seismic data have high computational costs as a result of the dimensionality of the problem and the spatial coupling in the likelihood.

Several physical models used in geophysics, such as seismic convolution or rock physics relations, are linear or can be linearized (Aki and Richards 1980; Mavko et al. 2009). However, many properties in the subsurface, such as elastic attributes, porosity, or permeability, are generally not Gaussian and instead show multimodal behavior due to the different rock and fluid properties of the different facies (Grana and Della Rossa 2010; Dubreuil-Boisclair et al. 2012; Sauvageau et al. 2014; Amaliksen 2014). For example, porosity in a mixture of sand and shale is generally bimodal. Gaussian mixture models are linear combinations of Gaussian components that can be used to describe the multimodal behavior of the model and the data (Hasselblad 1966). Sung (2004) introduces Gaussian mixture distributions in multivariate nonlinear regression modeling, while Hastie and Tibshirani (1996) propose a mixture discriminant analysis as an extension of linear discriminant analysis by using Gaussian mixtures and the expectation–maximization algorithm. The multimodal behavior of elastic and petrophysical properties is due to the presence of different rock types and fluids in the subsurface, which can be mathematically represented by a latent categorical variable.

In this work, a Bayesian inversion method for geophysical inverse problems is proposed, under the assumptions that the prior distribution is a spatial Gaussian mixture model and the likelihood model is linear with additive Gaussian errors (i.e., Gaussian linear likelihood). The main advantage of the inversion method is the ability to solve mixed discrete–continuous problems. In particular, the aim of this work is to jointly predict elastic or petrophysical properties (continuous properties) and the litho-fluid classification (discrete property), by combining Gaussian mixture random fields and hidden Markov models (HMMs). Markov models have been previously used to model geological layering (Krumbein and Dacey 1969; Elfeki and Dekking 2001). Eidsvik et al. (2004) propose the use of HMMs (Cappe et al. 2005; Frühwirth-Schnatter 2006; Dymarski 2011) for well log inversion into geological attributes. Lindberg and Grana (2015) propose to estimate the hidden Markov model parameters using the expectation–maximization algorithm (Baum et al. 1970; Dempster et al. 1977). In this work, the prior distribution of the latent categorical variable is assumed to follow a first-order hidden Markov model, whereas the prior distribution of the continuous variable is a Gaussian mixture model. The result of the Bayesian inversion is the posterior probability distribution of the continuous properties, which can be analytically computed according to the proposed formulation. Furthermore, the posterior probability distribution of the discrete variable is also assessed through stochastic simulations. Two case studies are presented: the first case is a seismic model where the operator is a convolutional model that links velocities to seismic amplitudes, and the second case is a rock physics model where the operator is a linear relation that links porosity and clay content to seismic attributes.

2 Methods

2.1 Geophysical Inverse Problem

This work focuses on the inversion of seismic data for the prediction of elastic properties, such as velocities or impedances, or petrophysical properties, such as porosity and saturation, along a discretized vertical subsurface profile. The prediction of reservoir properties (elastic or petrophysical) represented in the \(n_r\)-vector \({{\varvec{r}}}\) from seismic data (seismic amplitudes) represented in the \(n_d\)-vector \({{\varvec{d}}}\) is a mathematical inverse problem in the form

$$\begin{aligned} {{\varvec{d}}}={{\varvec{f}}}({{\varvec{r}}}) + {{\varvec{e}}}_d, \end{aligned}$$
(1)

where \({{\varvec{f}}}\) is the geophysical forward model and \({{\varvec{e}}}_d\) is the \(n_d\)-vector containing the centered error associated with the data. Generally, many geophysical models can be linearized and the inverse problem in Eq. (1) can be rewritten as a linear inverse problem

$$\begin{aligned} {{\varvec{d}}}={{\varvec{H}}}{{\varvec{r}}} + {{\varvec{e}}}_d, \end{aligned}$$
(2)

where \({{\varvec{H}}}\) is the \((n_d \times n_r)\)-matrix associated with the linear operator obtained by linearizing the operator \({{\varvec{f}}}\).

Two main geophysical inverse problems in reservoir characterization are: (i) seismic inversion where the objective is to predict elastic properties from seismic data and (ii) petrophysical inversion where the aim is to predict rock and fluid properties from the model of elastic attributes obtained from seismic inversion. If a linearization of the forward physical model is available, then both seismic and petrophysical inversion can be written in the form of Eq. (2). In seismic inversion, the property of interest \({{\varvec{r}}}\) is the set of elastic properties (for example, P-impedance), the data \({{\varvec{d}}}\) are the set of seismograms, and the seismic forward model can be linearized through a convolutional model. Linearized seismic forward modeling generally provides satisfactory approximations for small angles (lower than the critical angle). If a linearized model is not applicable, full waveform inversion algorithms (Pratt 1999; Herrmann et al. 2009; Virieux and Operto 2009; Zhu et al. 2016) should be used. In petrophysical inversion, the property of interest \({{\varvec{r}}}\) is the set of rock and fluid properties (for example, porosity), the data \({{\varvec{d}}}\) are the set of elastic properties obtained from seismic inversion, and the forward model is a linearization of the rock physics model. Rock physics models are in general nonlinear, but can be locally linearized using Taylor’s series approximations (Grana 2016); however, the fluid effect in homogeneous fluid mixtures and the porosity effect in unconsolidated rocks can introduce nonlinear effects in the model and linearized approximations might be inaccurate.

The distribution of the subsurface variables to be predicted is generally multimodal due to the presence of different rock types in the subsurface, the so-called lithological facies. Therefore, the variable \({{\varvec{r}}}\) in Eq. (2) depends on a latent categorical variable, the facies, represented in the \(n_r\)-vector \({\varvec{\kappa }}\). The seismic operator is invariant with respect to the facies, whereas the rock physics operator might be facies dependent. For example, one could use inclusion models in carbonate and granular media models in sandstone. The following two sections describe the forward physical model for the two inverse problems.

2.1.1 Seismic Inversion

Due to the dispersion of the seismic waves, the seismic signal recorded at the time t corresponding to a given depth location x can be computed as a convolution of the reflection coefficients \(c(t,\theta )\) and a wavelet \(w(t,\theta )\), where \(\theta \) is the incident angle. The reflection coefficients \(c(t,\theta )\) depend on the relative contrast of the elastic impedances i(t). The exact expression for the reflection coefficients is given by nonlinear Zoeppritz equations (Sheriff and Geldart 1995), but several linear approximations in terms of impedance are available (Aki and Richards 1980). Because of the presence of observation errors, the geophysical data \(s(t,\theta ) \) at a time t are then given by

$$\begin{aligned} s(t,\theta ) = \int w(u,\theta ) c(t-u,\theta ) \mathrm{d}u + e_s(t), \end{aligned}$$
(3)

where \(e_s(t)\) is the observation error at time t.

In the following, for simplicity, only one incident angle \(\theta =0\) is considered. The forward model is in time domain T, and the property of interest is acoustic impedance i(t). Therefore, the parameter to be predicted is \(v(t)=\log i(t)\). By discretizing the variables s(t) and v(t), for example, every 1 ms, the \(n_s\)-vector \({{\varvec{s}}}\) and the \(n_v\)-vector \({{\varvec{v}}}\) are obtained.

The prediction of acoustic impedance from seismic data is then an inverse problem, commonly known as acoustic inversion. In this work, a linearized approximation of the Zoeppritz equations based on Aki–Richards formulation valid for vertical weak contrasts is used (Aki and Richards 1980). If \({{\varvec{s}}}\) represents the seismic data and \({{\varvec{v}}}\) represents the logarithm of acoustic impedance, then the inverse problem in Eq. (2) can be written as

$$\begin{aligned} {{\varvec{s}}}={{\varvec{F}}} \, {{\varvec{v}}} + {{\varvec{e}}}_s, \end{aligned}$$
(4)

where \({{\varvec{F}}}\) is the (\(n_s \times n_v\))-matrix associated with the linear operator including the convolution and the Aki–Richards linearized approximation and \({{\varvec{e}}}_s\) is the \(n_s\)-vector containing a random centered observation error. The matrix \({{\varvec{F}}}\) can be written as the product of three matrices \({{\varvec{F}}}={\varvec{WAD}}\), where \({{\varvec{W}}}\) is the convolution matrix, \({{\varvec{A}}}\) is the matrix containing Aki–Richards reflection coefficients, and \({{\varvec{D}}}\) is a first-order differential matrix (Buland and Omre 2003; Rimstad and Omre 2013). Therefore, the seismic inverse problem can be written as

$$\begin{aligned} {{\varvec{s}}} = {\varvec{WAD}} \, {{\varvec{v}}} + {{\varvec{e}}}_s. \end{aligned}$$
(5)

In the seismic inversion problem [Eq. (5)], generally, \(n_s<n_v\).
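The factorization \({{\varvec{F}}}={\varvec{WAD}}\) can be illustrated with a minimal sketch for the zero-offset case. The Ricker wavelet, its 30 Hz peak frequency, the two-layer impedance profile, and the function names `ricker` and `forward_operator` are assumptions made for illustration only and are not taken from the case studies. At normal incidence the weak-contrast reflection coefficient is half the difference of log impedance, so \({{\varvec{A}}}\) reduces to a scaled identity here. In this construction there is one reflection coefficient per interface, which gives \(n_s=n_v-1<n_v\).

```python
import numpy as np

def ricker(freq_hz, dt_s, half_len):
    """Ricker wavelet with 2*half_len + 1 samples (an assumed wavelet shape)."""
    t = np.arange(-half_len, half_len + 1) * dt_s
    a = (np.pi * freq_hz * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def forward_operator(n_v, wavelet):
    """Build F = W A D for zero-offset data; F has shape (n_v - 1, n_v)."""
    n_c = n_v - 1  # one reflection coefficient per interface
    idx = np.arange(n_c)
    # D: first-order difference, (D v)[j] = v[j + 1] - v[j]
    D = np.zeros((n_c, n_v))
    D[idx, idx] = -1.0
    D[idx, idx + 1] = 1.0
    # A: at normal incidence the coefficient is 0.5 * d(log impedance)
    A = 0.5 * np.eye(n_c)
    # W: centered convolution as a matrix, W[i, j] = wavelet[i - j + half]
    half = len(wavelet) // 2
    k = idx[:, None] - idx[None, :] + half
    valid = (k >= 0) & (k < len(wavelet))
    W = np.where(valid, wavelet[np.clip(k, 0, len(wavelet) - 1)], 0.0)
    return W @ A @ D

# Hypothetical two-layer profile sampled every 1 ms
wavelet = ricker(freq_hz=30.0, dt_s=0.001, half_len=20)
v = np.log(np.where(np.arange(100) < 50, 6000.0, 7500.0))  # log impedance
F = forward_operator(len(v), wavelet)
s = F @ v  # noise-free synthetic seismogram
```

Writing the convolution as an explicit matrix is wasteful for long traces, but it keeps the operator in the form required by Eq. (2), which is what the analytical posterior expressions rely on.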

2.1.2 Petrophysical Inversion

Rock and fluid properties affect the velocity of the seismic waves propagating through a porous rock, and consequently its seismic response. If the variable of interest m(t) is a set of rock and fluid properties along a vertical profile in the subsurface, then the elastic response \(v(t)=g(m(t))\) is a function of m(t). Generally, the function g is referred to as the rock physics model. Examples of rock physics models are Raymer’s equation, Dvorkin’s cemented sand model, Dvorkin’s soft sand model, Kuster–Toksoz model, etc. (Mavko et al. 2009). These models allow computing the compressional and shear velocity (or impedance) of the seismic waves when the porosity, lithology, and fluid content of the porous rock are known. In the following, the property of interest is porosity \(m(t)=\phi (t)\) and the data are acoustic impedance \(v(t)=i(t)\). In general, acoustic impedance is not measured directly, but estimated from seismic data through seismic inversion (Sect. 2.1.1). By discretizing the variables v(t) and m(t), the \(n_v\)-vector \({{\varvec{v}}}\) and the \(n_m\)-vector \({{\varvec{m}}}\) are obtained.

The prediction of porosity from acoustic impedance is an inverse problem, generally called rock physics inversion or petrophysical inversion. If \({{\varvec{v}}}\) represents the acoustic impedance and \({{\varvec{m}}}\) represents the porosity, then

$$\begin{aligned} {{\varvec{v}}}= {{\varvec{G}}}{{\varvec{m}}} + {{\varvec{e}}}_v, \end{aligned}$$
(6)

where \({{\varvec{G}}}\) is the matrix associated with the linearized rock physics model and \({{\varvec{e}}}_v\) is the \(n_v\)-vector containing the centered error associated with the data. In general, in the rock physics model [Eq. (6)], \(n_v=n_m\).
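As a hedged illustration of how the linear form in Eq. (6) can arise from a nonlinear rock physics model, assume that g is differentiable and that a reference value \(m_0\) is available, for example a facies mean. The symbols \(m_0\) and J are introduced here for illustration only. A first-order Taylor expansion gives

$$\begin{aligned} g(m) \approx g(m_0) + J(m_0)\,(m - m_0), \qquad J(m_0) = \left. \frac{\partial g}{\partial m} \right| _{m=m_0}, \end{aligned}$$

so that the shifted data \(v - g(m_0) + J(m_0)\,m_0\) are approximately linear in m. Under this assumption, the matrix \({{\varvec{G}}}\) collects the values \(J(m_0)\) along the profile, and the approximation degrades as m moves away from \(m_0\).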

If a linearization of the forward model (seismic or rock physics) is applicable, the two inverse problems in Eqs. (5) and (6) can be seen as different examples of the general inverse problem in Eq. (2). The variables \({{\varvec{v}}}\) and \({{\varvec{m}}}\) are assumed to depend on the latent categorical variable \({\varvec{\kappa }}\), and the linear operators \({{\varvec{F}}}\) and \({{\varvec{G}}}\) are invariant with respect to \({\varvec{\kappa }}\). Graphical representations of the seismic and petrophysical inversion models are shown in Fig. 1a, b, respectively. The next section describes the mathematical formulation of the solution of this inverse problem in a Bayesian setting.

Fig. 1 Schematic graphical representation of the geophysical inverse problems under study: a seismic inversion assuming a convolutional model; b petrophysical inversion assuming a pointwise linear rock physics model

2.2 Bayesian Gaussian Mixture Inversion

In the following, the focus is on the generic inverse problem in \({{\varvec{d}}}= {{\varvec{H}}}{{\varvec{r}}} + {{\varvec{e}}}_d\) [Eq. (2)], where \({{\varvec{r}}}\) is the variable to be assessed, \({{\varvec{d}}}\) is the data, and \({{\varvec{H}}}\) is the linear operator.

The problem is solved in a probabilistic setting. The probability density function (pdf) of a continuous random n-vector \({{\varvec{y}}}\) is \(p({{\varvec{y}}})\), and the same notation is used for the probability mass function (pmf) for categorical variables. The objective is to assess the probability of the model \({{\varvec{r}}}\) given the data \({{\varvec{d}}}\). The assessment of \(p({{\varvec{r}}} | {{\varvec{d}}})\) is done in a Bayesian framework

$$\begin{aligned} p({{\varvec{r}}} | {{\varvec{d}}}) = \frac{p({{\varvec{d}}} | {{\varvec{r}}})p({{\varvec{r}}})}{p({{\varvec{d}}})} = \frac{p({{\varvec{d}}} | {{\varvec{r}}})p({{\varvec{r}}})}{\int p({{\varvec{d}}} | {{\varvec{r}}})p({{\varvec{r}}})d{{\varvec{r}}}} = \mathrm {const} \times p({{\varvec{d}}} | {{\varvec{r}}})p({{\varvec{r}}}), \end{aligned}$$
(7)

where \(p({{\varvec{r}}})\) is the prior distribution, \(p({{\varvec{d}}} | {{\varvec{r}}})\) is the likelihood function, and \(p({{\varvec{d}}})\) is a normalizing constant to ensure that the posterior distribution \(p({{\varvec{r}}} | {{\varvec{d}}}) \) is a valid probability density function.

2.2.1 Likelihood Model

In this work, the focus is on linear operators and additive Gaussian errors, that is, Gaussian linear likelihoods. The probability density function of an n-dimensional Gaussian variable \({{\varvec{y}}}\) is denoted as \(\phi _{n} ({{\varvec{y}}}; {\varvec{\mu }}, {\varvec{\Sigma }})\) with mean \({\varvec{\mu }}\) and covariance matrix \({\varvec{\Sigma }}\). Therefore, the likelihood model is

$$\begin{aligned} p({{\varvec{d}}} | {{\varvec{r}}})=\phi _{n_d} ({{\varvec{d}}}; {{\varvec{H}}} {{\varvec{r}}}, {\varvec{\Sigma }}_e), \end{aligned}$$
(8)

where \({\varvec{\Sigma }}_e\) is the covariance matrix of the centered Gaussian error \({{\varvec{e}}}_d\). Seismic convolution and linearized rock physics models belong to this category.
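For reference, the likelihood in Eq. (8) can be evaluated numerically as in the following sketch. The function name `gaussian_linear_loglik` and the toy operator, model, and error covariance are hypothetical. The logarithm is returned because the density itself underflows quickly as \(n_d\) grows.

```python
import numpy as np

def gaussian_linear_loglik(d, H, r, Sigma_e):
    """Log of phi_{n_d}(d; H r, Sigma_e), the likelihood in Eq. (8)."""
    resid = d - H @ r
    _, logdet = np.linalg.slogdet(Sigma_e)
    quad = resid @ np.linalg.solve(Sigma_e, resid)
    return -0.5 * (d.shape[0] * np.log(2.0 * np.pi) + logdet + quad)

# Hypothetical toy values
H = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
r = np.array([0.5, -0.2])
Sigma_e = 0.01 * np.eye(3)
d_exact = H @ r
loglik_exact = gaussian_linear_loglik(d_exact, H, r, Sigma_e)
loglik_shifted = gaussian_linear_loglik(d_exact + 0.1, H, r, Sigma_e)
```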

2.2.2 Prior Model

The focus of this work is on Gaussian mixture prior models. A Gaussian mixture distribution can be written as

$$\begin{aligned} p({{\varvec{r}}}) = \sum _{{\varvec{\kappa }} \in \Omega _\kappa ^{n_r}} p({{\varvec{r}}}|{\varvec{\kappa }}) p({\varvec{\kappa }}), \end{aligned}$$
(9)

with

$$\begin{aligned} p({{\varvec{r}}}|{\varvec{\kappa }}) = \phi _{n_r} \left( {{\varvec{r}}}; {\varvec{\mu }}_{r|\kappa }, {\varvec{\Sigma }}_{r|\kappa }\right) , \end{aligned}$$
(10)

where \({\varvec{\mu }}_{r|\kappa }\) and \({\varvec{\Sigma }}_{r|\kappa }\) are the conditional mean \(n_r\)-vector and the conditional covariance (\(n_r \times n_r\))-matrix, respectively, and the discrete latent \(n_r\)-vector \({\varvec{\kappa }} \) contains the facies labels \(\kappa _t \in \Omega _\kappa = \{ 1, \dots , K\}\) along the discretized vertical profile.

The prior vector \({\varvec{\mu }}_{r|\kappa }\) has means dependent on the facies \({\varvec{\kappa }}\). The prior covariance (\(n_r \times n_r\))-matrix is decomposed as

$$\begin{aligned} {\varvec{\Sigma }}_{r|\kappa } = {\varvec{\Sigma }}_{r|\kappa }^{\sigma 1/2} {\varvec{\Sigma }}_{r}^{o} {\varvec{\Sigma }}_{r|\kappa }^{\sigma 1/2}, \end{aligned}$$
(11)

where the diagonal (\(n_r \times n_r\)) matrix \({\varvec{\Sigma }}_{r|\kappa }^{\sigma } \) contains the facies-dependent variances on the diagonal, whereas the spatial correlation (\(n_r \times n_r\)) matrix \({\varvec{\Sigma }}_{r}^{o}\) has elements \([{\varvec{\Sigma }}_{r}^{o}]_{t, t+\Delta }=\rho _r(\Delta )\), where \(\rho _r(\Delta )\) is a spatial correlation function. This prior model defines a discretized mixture Gaussian random field with spatial correlation function \(\rho _r(\cdot )\) and facies-dependent means and variances.
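The decomposition in Eq. (11) can be sketched as follows for a scalar property. The exponential form of \(\rho _r\), the range value, the standard deviations, and the function name `prior_covariance` are illustrative assumptions, and facies are labeled from 0 rather than 1 to match array indexing.

```python
import numpy as np

def prior_covariance(kappa, sigma_by_facies, corr_range):
    """Sigma_{r|kappa} = S^{1/2} R S^{1/2} with facies-dependent variances in S."""
    kappa = np.asarray(kappa)
    n_r = kappa.shape[0]
    # Standard deviation at each sample, selected by its facies label
    sd = np.asarray(sigma_by_facies, dtype=float)[kappa]
    # Spatial correlation matrix with elements rho_r(|t - t'|)
    lags = np.abs(np.subtract.outer(np.arange(n_r), np.arange(n_r)))
    corr = np.exp(-lags / corr_range)  # assumed exponential correlation
    return sd[:, None] * corr * sd[None, :]

# Hypothetical profile with two facies
kappa = np.array([0, 0, 1, 1, 1])
Sigma = prior_covariance(kappa, sigma_by_facies=[0.1, 0.3], corr_range=3.0)
```

Because the correlation matrix is shared by all facies, samples on opposite sides of a facies boundary remain correlated; only the means and variances change across the boundary.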

The model for the facies label \({\varvec{\kappa }}\) is assumed to be a stationary first-order Markov chain

$$\begin{aligned} p({\varvec{\kappa }})=p(\kappa _1) \prod _{t=2}^{n_r} p(\kappa _t|\kappa _{t-1}), \end{aligned}$$
(12)

with stationary transition (\(K \times K\)) matrix \({\mathbb {P}}_{\kappa }={\mathbb {P}}_{\kappa _t}=[p(\kappa _t|\kappa _{t-1})]_{\kappa _t,\kappa _{t-1} \in \Omega _{\kappa }^2}\) and \(p(\kappa _1)=p_s(\kappa _1)\) being the stationary distribution of \({\mathbb {P}}_{\kappa }\) defined by \(p_s={\mathbb {P}}_{\kappa }^T p_s\). Therefore, by combining the definitions above, the stationary prior model for \({{\varvec{r}}}\) is

$$\begin{aligned} p({{\varvec{r}}}) = \sum _{{\varvec{\kappa }} \in \Omega _\kappa ^{n_r}} \phi _{n_r} \left( {{\varvec{r}}}; {\varvec{\mu }}_{r|\kappa }, {\varvec{\Sigma }}_{r|\kappa }^{\sigma 1/2} {\varvec{\Sigma }}_{r}^{o} {\varvec{\Sigma }}_{r|\kappa }^{\sigma 1/2}\right) \times p_s(\kappa _1) \prod _{t=2}^{n_r} p(\kappa _t|\kappa _{t-1}). \end{aligned}$$
(13)

The mixture prior model in Eq. (13) is multimodal with \(K^{n_r}\) modes.
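A minimal sketch of the facies prior in Eq. (12) is given below. It assumes a row-stochastic transition matrix, with rows indexed by \(\kappa _{t-1}\) and columns by \(\kappa _t\), and facies labeled from 0. The transition probabilities and the function names are hypothetical.

```python
import numpy as np

def stationary_distribution(P):
    """Solve p_s = P^T p_s subject to sum(p_s) = 1 for a row-stochastic P."""
    K = P.shape[0]
    A = np.vstack([P.T - np.eye(K), np.ones((1, K))])
    b = np.concatenate([np.zeros(K), [1.0]])
    p_s, *_ = np.linalg.lstsq(A, b, rcond=None)
    p_s = np.clip(p_s, 0.0, None)  # guard against round-off below zero
    return p_s / p_s.sum()

def simulate_facies(P, n_r, rng):
    """Draw kappa_1 from p_s, then each kappa_t from row kappa_{t-1} of P."""
    K = P.shape[0]
    kappa = np.empty(n_r, dtype=int)
    kappa[0] = rng.choice(K, p=stationary_distribution(P))
    for t in range(1, n_r):
        kappa[t] = rng.choice(K, p=P[kappa[t - 1]])
    return kappa

# Hypothetical two-facies transition matrix
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
p_s = stationary_distribution(P)
kappa = simulate_facies(P, n_r=61, rng=np.random.default_rng(0))
```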

2.2.3 Posterior Distribution

For a mixture prior model [Eq. (9)], the posterior model is

$$\begin{aligned} p({{\varvec{r}}} | {{\varvec{d}}})= & {} p({{\varvec{d}}} | {{\varvec{r}}}) p({{\varvec{r}}}) \left[ p({{\varvec{d}}})\right] ^{-1} \nonumber \\= & {} \sum _{{\varvec{\kappa }} \in \Omega _\kappa ^{n_r}} p({{\varvec{d}}} | {{\varvec{r}}}) p({{\varvec{r}}} | {\varvec{\kappa }}) p({\varvec{\kappa }}) \left[ p({{\varvec{d}}})\right] ^{-1} \nonumber \\= & {} \sum _{{\varvec{\kappa }} \in \Omega _\kappa ^{n_r}} p({{\varvec{d}}} | {{\varvec{r}}}, {\varvec{\kappa }}) p({{\varvec{r}}} | {\varvec{\kappa }}) p({\varvec{\kappa }}) \left[ p({{\varvec{d}}},{\varvec{\kappa }})\right] ^{-1} p({{\varvec{d}}},{\varvec{\kappa }}) \left[ p({{\varvec{d}}})\right] ^{-1} \nonumber \\= & {} \sum _{{\varvec{\kappa }} \in \Omega _\kappa ^{n_r}} p({{\varvec{r}}} | {{\varvec{d}}}, {\varvec{\kappa }}) p({\varvec{\kappa }}|{{\varvec{d}}}), \end{aligned}$$
(14)

where the last expression is a mixture of conditional distributions \(p({{\varvec{r}}} | {{\varvec{d}}}, {\varvec{\kappa }})\) and conditional weights \(p({\varvec{\kappa }}|{{\varvec{d}}})\). This result can be extended to general mixture models: If the prior is a mixture distribution of some basis pdfs, then the posterior distribution is a mixture distribution of the posterior pdfs. In this work, the focus is on Gaussian mixture priors [Eq. (13)] and Gaussian linear likelihood functions [Eq. (8)]; hence, the posterior is also a Gaussian mixture model since the conditional basis pdfs are Gaussian.

The posterior distribution \(p({{\varvec{r}}}|{{\varvec{d}}})\) is then

$$\begin{aligned} p({{\varvec{r}}}|{{\varvec{d}}}) = \sum _{{\varvec{\kappa }} \in \Omega _\kappa ^{n_r}} \phi _{n_r} \left( {{\varvec{r}}}; {\varvec{\mu }}_{r|d,\kappa }, {\varvec{\Sigma }}_{r|d,\kappa } \right) p({\varvec{\kappa }}|{{\varvec{d}}}), \end{aligned}$$
(15)

with conditional mean and conditional covariance matrix

$$\begin{aligned} {\varvec{\mu }}_{r|d,\kappa }= & {} {\varvec{\mu }}_{r|\kappa } + {\varvec{\Sigma }}_{r|\kappa } {{\varvec{H}}}^T [{{\varvec{H}}} {\varvec{\Sigma }}_{r|\kappa } {{\varvec{H}}}^T + {\varvec{\Sigma }}_e]^{-1} ({\varvec{d}}- {\varvec{H}}{\varvec{\mu }}_{r|\kappa }), \end{aligned}$$
(16)
$$\begin{aligned} {\varvec{\Sigma }}_{r|d,\kappa }= & {} {\varvec{\Sigma }}_{r|\kappa } - {\varvec{\Sigma }}_{r|\kappa } {{\varvec{H}}}^T [{{\varvec{H}}} {\varvec{\Sigma }}_{r|\kappa } {{\varvec{H}}}^T + {\varvec{\Sigma }}_e]^{-1} {{\varvec{H}}} {\varvec{\Sigma }}_{r|\kappa }, \end{aligned}$$
(17)

respectively.
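For a fixed facies configuration \({\varvec{\kappa }}\), Eqs. (16) and (17) can be evaluated as in the sketch below. The function name `gaussian_posterior` and the scalar test values are hypothetical. A linear solve is used instead of an explicit matrix inverse, which is a numerical choice made here and not something prescribed by the formulation.

```python
import numpy as np

def gaussian_posterior(d, H, mu_prior, Sigma_prior, Sigma_e):
    """Posterior mean and covariance of r given d and kappa, Eqs. (16)-(17)."""
    S = H @ Sigma_prior @ H.T + Sigma_e  # covariance of d given kappa
    # gain = Sigma_prior H^T S^{-1}, using the symmetry of S and Sigma_prior
    gain = np.linalg.solve(S, H @ Sigma_prior).T
    mu_post = mu_prior + gain @ (d - H @ mu_prior)
    Sigma_post = Sigma_prior - gain @ H @ Sigma_prior
    return mu_post, Sigma_post

# Hypothetical scalar check: prior N(0, 1), unit error variance, datum d = 2
mu_post, Sigma_post = gaussian_posterior(
    d=np.array([2.0]), H=np.array([[1.0]]),
    mu_prior=np.array([0.0]), Sigma_prior=np.array([[1.0]]),
    Sigma_e=np.array([[1.0]]))
```

In the scalar check, the prior and the datum carry equal weight, so the posterior mean lies halfway between them and the posterior variance is half the prior variance.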

The posterior distribution for the facies profile is

$$\begin{aligned} p({\varvec{\kappa }}|{{\varvec{d}}}) = \mathrm {const} \times p({{\varvec{d}}}|{\varvec{\kappa }}) p({\varvec{\kappa }}), \end{aligned}$$
(18)

where the normalizing constant is prohibitively expensive to compute because it requires a sum over all \(K^{n_r}\) facies configurations \({\varvec{\kappa }} \in \Omega _\kappa ^{n_r}\). However, a reliable approximation of the posterior distribution, parameterized by an integer k, is available

$$\begin{aligned} p_k^*({\varvec{\kappa }}|{{\varvec{d}}}) = \mathrm {const} \times p_k^*({{\varvec{d}}}|{\varvec{\kappa }}) p_k({\varvec{\kappa }}), \end{aligned}$$
(19)

with

$$\begin{aligned} p_k^*({{\varvec{d}}}|{\varvec{\kappa }})= & {} \prod _{t = k}^{n_r} \left[ p_k^*({{\varvec{d}}}|{\varvec{\kappa }}_t^k) \right] ^{1/k} + c \end{aligned}$$
(20)
$$\begin{aligned} p_k({\varvec{\kappa }})= & {} p({\varvec{\kappa }}_1^k) \prod _{t=k+1}^{n_r} p({\varvec{\kappa }}_t^k |{\varvec{\kappa }}_{t-1}^k ), \end{aligned}$$
(21)

where \({\varvec{\kappa }}_t^k =(\kappa _{t-k+1}, \dots , \kappa _t) \) for \(t=k,\dots , n_r\) and c represents the edge correction term.

The \(p_k^*({{\varvec{d}}}|{\varvec{\kappa }})\) term is expressed in a k-order factorial form based on the approximate projection likelihoods \(p_k^*({{\varvec{d}}}|{\varvec{\kappa }}_t^k)\) for \(t=k,\dots , n_r\), plus some fully defined edge correction terms (Fjeldstad 2015). Each projection likelihood can be interpreted as the effect of the facies sequence \({\varvec{\kappa }}_t^k\) on the observation vector \({{\varvec{d}}}\), and the exponent \(1/k\) compensates for each datum \(d_t\) being used k times. The \(p_k({\varvec{\kappa }})\) term contains the reformulation of the first-order Markov chain prior model \(p({\varvec{\kappa }})\) as a k-order Markov chain, which can be done exactly. Since the likelihood is in a k-order factorial form and the prior is a k-order Markov chain, the resulting posterior distribution \(p_k^*({\varvec{\kappa }}|{{\varvec{d}}})\) is a non-stationary k-order Markov chain (Fjeldstad 2015). The initial probability and the \((n_r-k+1)\) transition matrices that fully define \(p_k^*({\varvec{\kappa }}|{{\varvec{d}}})\) can be calculated exactly by the efficient recursive forward–backward algorithm (Baum et al. 1970). Once \(p_k^*({\varvec{\kappa }}|{{\varvec{d}}})\) is calculated, realizations from the approximate posterior can be generated very efficiently.

The quality of the approximation \(p_k^*({\varvec{\kappa }}|{{\varvec{d}}})\) for \(p({\varvec{\kappa }}|{{\varvec{d}}})\) tends to improve for increasing k, while the computational cost of the recursive algorithm increases as \((n_r-k+1)K^{k+1}\). Previous studies (Fjeldstad 2015) show that \(k=3\) usually provides a good trade-off between approximation quality and computational cost. In order to generate realizations from the correct posterior distribution \(p({\varvec{\kappa }}|{{\varvec{d}}})\), the approximate posterior \(p_k^*({\varvec{\kappa }}|{{\varvec{d}}})\) may be used as the proposal distribution in an independent-proposal Markov chain Monte Carlo Metropolis–Hastings (McMC-MH) algorithm. Previous studies (Fjeldstad 2015) show that reasonable acceptance rates can be obtained. In the current work, two case studies are presented, one with \(k=3\) and acceptance rate 0.16, and the other with \(k=3\) and acceptance rate 0.05. These cases are typically based on \(10^6\) proposals and \(10^5\) accepted realizations from the correct posterior distribution \(p({\varvec{\kappa }}|{{\varvec{d}}})\), which then entails \(10^5\) realizations from \(p({{\varvec{r}}}|{{\varvec{d}}})\).
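The independent-proposal Metropolis–Hastings step can be sketched generically as follows. This is not the implementation used for the case studies: the three-state target and the uniform proposal are toy stand-ins, whereas in the method above the states are entire facies profiles, the target is \(p({\varvec{\kappa }}|{{\varvec{d}}})\) known up to a constant, and the proposal is \(p_k^*({\varvec{\kappa }}|{{\varvec{d}}})\). All function and variable names are hypothetical.

```python
import numpy as np

def independence_mh(log_target, log_proposal, sample_proposal, n_iter, rng):
    """Independent-proposal Metropolis-Hastings.

    Accepts a candidate y over the current state x with probability
    min(1, w(y) / w(x)), where w = target / proposal (unnormalized is fine).
    """
    current = sample_proposal(rng)
    log_w_cur = log_target(current) - log_proposal(current)
    chain, n_accept = [], 0
    for _ in range(n_iter):
        cand = sample_proposal(rng)
        log_w_cand = log_target(cand) - log_proposal(cand)
        if np.log(1.0 - rng.uniform()) < log_w_cand - log_w_cur:
            current, log_w_cur = cand, log_w_cand
            n_accept += 1
        chain.append(current)
    return chain, n_accept / n_iter

# Toy stand-in: three-state target sampled with a uniform proposal
probs = np.array([0.6, 0.3, 0.1])
log_target = lambda x: np.log(probs[x])
log_uniform = lambda x: -np.log(3.0)
draw_uniform = lambda rng: int(rng.integers(3))
chain, rate = independence_mh(log_target, log_uniform, draw_uniform,
                              20000, np.random.default_rng(0))
```

The acceptance rate is governed by how closely the proposal matches the target: if the two coincide, every candidate is accepted, which is why a better approximation \(p_k^*\) translates into fewer wasted proposals.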

3 Application

The application of the inversion methodology is demonstrated on two different case studies. The first dataset is from an onshore field in Texas. The reservoir interval is characterized by thin tight-sandstone layers filled by gas. In this example, an elastic inversion of the real seismic trace collocated at the well location is presented. The results of the inversion are the posterior distributions of elastic properties, P- and S-impedance, and facies. The predicted properties are compared to the actual sonic and density logs to validate the results. The second dataset is from an offshore oil reservoir in the North Sea. The geological environment is a clastic reservoir with a sequence of hydrocarbon sand, shale, and silt layers. In this example, a rock physics inversion based on a multilinear rock physics model is presented. The results of the application are the posterior distributions of petrophysical properties, porosity and clay volume, and facies. For both datasets, a set of measured geophysical logs and computed properties, including sonic and petrophysical data, is available at the well location.

3.1 Seismic Inversion

The first case study is a deep clastic gas reservoir in Texas. A set of three facies has been identified in the reservoir, namely sand, tight sand, and shale. In this case, the proposed methodology is applied to real seismograms collocated at the well location. The objective of this study is to predict P- and S-impedance and the facies, given a set of two seismograms (near and far angles). The depth of the well log has been converted into time, and the inversion has been performed in the time domain.

The vertical profile with the reference lithology facies classification (LFC) is shown in Fig. 2 together with P- and S-impedance logs, and the set of seismic observations (signal-to-noise ratio 2.5). The time interval under consideration is approximately 52 ms which entails \(n_r=61\). The main reservoir layer has a thickness corresponding to a time interval of approximately 10 ms. The reservoir layer has an average porosity of 0.18 and a low percentage of gas. Another potential reservoir layer with low clay content can be identified in the bottom part of the interval of study; however, at the well location this layer consists of a tight sandstone with low porosity and the well measurements do not show hydrocarbon presence. The histograms of the well observations of the elastic properties, P- and S-impedance, are shown in Fig. 3. The reservoir facies has intermediate properties compared to shale and tight-sand layers.

Fig. 2 Well log data and computed properties. From left to right: reference facies classification (shale in black, tight sand in light brown, and sand in brown), P-impedance, S-impedance, and two seismograms (near and far angles)

Fig. 3 Histograms of the elastic properties at the well location sorted by facies: P-impedance (top) and S-impedance (bottom). The solid gray lines represent the smoothed gross histograms

Table 1 Model parameters (logarithm of P- and S-impedance) for the seismic inversion case study

The prior transition matrix \({\mathbb {P}}_{\kappa }\) for the latent categorical variable (i.e., the lithological facies) is estimated from the well logs by counting the transitions, where the diagonal elements of \({\mathbb {P}}_{\kappa }\) are related to the expected thickness of each layer (Table 1). The conditional densities \(p\left( r_t|\kappa _t\right) \) are also estimated from the well logs (Table 1). An exponential correlation function is estimated from the measured elastic logs (Table 1).
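The counting estimate of the transition matrix, and the link between its diagonal elements and the layer thickness, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the integer facies coding, and the short facies log are hypothetical, and the transitions are counted in the order in which the samples are stored. The thickness relation uses the standard result that, for a first-order Markov chain, the number of consecutive samples spent in facies \(i\) is geometrically distributed with mean \(1/(1-p_{ii})\).

```python
def estimate_transition_matrix(facies, n_facies):
    """Row-normalized counts of transitions between consecutive samples."""
    counts = [[0] * n_facies for _ in range(n_facies)]
    for current, following in zip(facies[:-1], facies[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A facies never observed as a starting state gets a uniform row.
        matrix.append([c / total if total else 1.0 / n_facies for c in row])
    return matrix


def expected_thickness(matrix):
    """Mean run length of each facies, in samples: 1 / (1 - p_ii)."""
    return [
        float("inf") if row[i] >= 1.0 else 1.0 / (1.0 - row[i])
        for i, row in enumerate(matrix)
    ]


# Hypothetical facies log: 0 = shale, 1 = tight sand, 2 = sand
facies_log = [0, 0, 0, 2, 2, 0, 0, 1, 1, 1, 0]
P = estimate_transition_matrix(facies_log, 3)
thickness = expected_thickness(P)
```

Multiplying the mean run length by the sampling interval gives the expected layer thickness in time or depth units.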

In Fig. 4, the inversion results for the discrete facies are shown: the reference classification, realizations from the posterior model, the marginal probabilities, and the marginal maximum a posteriori (MMAP) predictor for the posterior model. The main reservoir layer is reliably identified by the inversion results, although some misclassifications occur in the tight-sandstone layer. Note that the set of realizations reflects the uncertainty of the prediction.
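As an illustration of how the marginal probabilities and the MMAP predictor relate to the posterior realizations, the following sketch assumes that the marginals are approximated by the facies frequencies across a set of realizations. The function names and the toy realizations are hypothetical, and the actual implementation may compute the marginals differently.

```python
def marginal_probabilities(realizations, n_facies):
    """Per-sample facies frequencies across a set of realizations."""
    n_real = len(realizations)
    n_samples = len(realizations[0])
    probs = []
    for t in range(n_samples):
        counts = [0] * n_facies
        for realization in realizations:
            counts[realization[t]] += 1
        probs.append([c / n_real for c in counts])
    return probs


def mmap_predictor(probs):
    """Most probable facies at each sample (ties go to the lowest index)."""
    return [max(range(len(p)), key=p.__getitem__) for p in probs]


# Hypothetical realizations: 0 = shale, 1 = tight sand, 2 = sand
realizations = [
    [0, 0, 2, 2],
    [0, 2, 2, 0],
    [0, 0, 2, 2],
    [1, 0, 2, 2],
]
probs = marginal_probabilities(realizations, 3)
mmap = mmap_predictor(probs)
```

Because the MMAP predictor maximizes each marginal separately, the resulting profile is not necessarily a likely realization of the joint posterior model.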

In Fig. 5, the inversion results for the elastic properties are shown. From left to right, the plots show: the actual P-impedance log (solid black line) with a set of conditional realizations (gray lines); the MMAP predictor for P-impedance (dashed black line) with the 80% prediction interval and actual P-impedance log (solid black line); the actual S-impedance log (solid black line) with a set of conditional realizations (gray lines); and the MMAP predictor for S-impedance (dashed black line) with the 80% prediction interval and actual S-impedance log (solid black line). The bottom plots show the marginal smoothed histograms of P- and S-impedance. Overall, the proposed approach reliably predicts the elastic properties and captures the variability in the logs. The conditional realizations reproduce the abrupt changes in the variables because of the multimodal prior pdf. The marginal densities appear highly skewed and multimodal. The predictions closely reproduce the variables with realistic prediction intervals. The results are summarized in Table 2, and the coverage ratios are close to the nominal value of 0.80. The uncertainty at the borders of the interval is probably due to the limited bandwidth and coverage of the seismic data.
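One way to obtain an 80% prediction interval from a set of conditional realizations is to take the empirical 10th and 90th percentiles at each sample. The sketch below is an assumption about how such intervals can be computed, not a description of the authors' implementation. The function names and the toy realizations are hypothetical, and the quantiles use simple linear interpolation between order statistics.

```python
def quantile(sorted_values, q):
    """Empirical quantile by linear interpolation between order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def prediction_interval(realizations, level=0.80):
    """Per-sample central interval containing `level` of the realizations."""
    tail = (1.0 - level) / 2.0
    lower, upper = [], []
    for t in range(len(realizations[0])):
        values = sorted(r[t] for r in realizations)
        lower.append(quantile(values, tail))
        upper.append(quantile(values, 1.0 - tail))
    return lower, upper


# Hypothetical realizations: 11 profiles with 2 samples each
realizations = [[float(k), 10.0] for k in range(11)]
lower, upper = prediction_interval(realizations)
```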

Fig. 4

Inversion results for the lithology facies. From left to right: reference facies classification (shale in black, tight sand in light brown, and sand in brown), subset of conditional realizations, marginal probabilities, and MMAP predictor for the posterior model

Fig. 5

Inversion results for the continuous properties (P- and S-impedance) for the Gaussian mixture prior pdf. Refer to the text for further details

The results of the proposed inversion methodology are compared to the traditional Bayesian Gaussian approach defined in Buland and Omre (2003) (Fig. 6). The plots in Fig. 6 show the conditional realizations, the marginal smoothed histograms, and the posterior distributions (MMAP predictions and 80% prediction intervals) of P- and S-impedance. The layout of the plots for the Gaussian case (Fig. 6) is similar to the Gaussian mixture case (Fig. 5). In the Gaussian case, the conditional realizations are smoothed toward the global mean values of P- and S-impedance, resulting in unimodal marginal histograms. Furthermore, the Gaussian case shows predictions that are generally regressed toward the global mean values. The comparison of the root-mean-square error (RMSE) of the predictions and coverage ratio of the 0.80 prediction interval is summarized in Table 2. The proposed method provides superior predictions and prediction intervals compared to the traditional Gaussian approach with 30% reduction in the RMSE.
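The two summary statistics used in this comparison can be sketched as follows, assuming the usual definitions: the RMSE is the square root of the mean squared difference between prediction and log, and the coverage ratio is the fraction of log samples that fall inside the prediction interval. The function names and the numerical values are hypothetical and are not taken from Table 2.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between a prediction and the reference log."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def coverage_ratio(actual, lower, upper):
    """Fraction of reference samples inside the prediction interval."""
    inside = [lo <= a <= up for a, lo, up in zip(actual, lower, upper)]
    return sum(inside) / len(inside)


# Hypothetical values for four samples
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
lower = [0.5, 1.5, 3.5, 3.0]
upper = [1.5, 2.5, 4.5, 5.0]

error = rmse(predicted, actual)
coverage = coverage_ratio(actual, lower, upper)
```

For a well-calibrated 0.80 prediction interval, the coverage ratio is expected to be close to 0.80; values well below or above indicate intervals that are too narrow or too wide.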

The computational demand for the proposed approach is larger than for the traditional one, since the former is based on McMC simulations, whereas the latter has a closed-form analytical solution. However, the computational demand is reasonable on a standard laptop computer for the presented case study.

Table 2 Results for the seismic inversion case study

Fig. 6

Inversion results for the continuous properties (P- and S-impedance) for the Gaussian prior pdf. Refer to the text for further details

3.2 Petrophysical Inversion

The second case study is a clastic oil reservoir in the North Sea. The interval under consideration is approximately 65 m thick, which entails \(n_r=424\). A set of three facies is identified in the reservoir, namely sand, silt, and shale, where silt is the facies with intermediate petrophysical properties. The objective of this case study is to jointly assess the facies, porosity, and clay volume given P- and S-impedance. In Fig. 7, the reference lithological facies classification is shown together with the porosity and clay volume processed logs and the measured P-impedance and S-impedance logs. The main reservoir layer is located at approximately 2065 m, the average porosity is 0.26, and it is saturated with oil with a small amount of irreducible water saturation of approximately 0.10. The reservoir thickness is approximately 20 m, and the interval is embedded in two shaley layers with effective porosity close to 0. The lower part of the interval of interest consists of a sequence of thin layers of sand, silt, and shale. The histograms of the petrophysical properties of interest (i.e., porosity and clay volume) are shown in Fig. 8. Both porosity and clay volume distributions show a multimodal behavior; therefore, Gaussian prior models are not suitable to describe the property distributions.

The transition matrix and the Gaussian densities are estimated from well logs (Table 3). A modified logit transformation is applied to transform porosity and clay volume logs, bounded between 0 and 1, to variables with support in \({\mathbb {R}}\). An exponential correlation function is estimated to represent the spatial continuity of porosity and clay volume (Table 3).
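The exact form of the modified logit transformation is not specified in this section. The sketch below therefore shows one plausible variant, a standard logit with the input clipped away from 0 and 1 so that the transformed value stays finite. The clipping margin `EPS` and the function names are assumptions introduced for illustration.

```python
import math

EPS = 1e-3  # hypothetical clipping margin, not taken from the paper


def logit(x, eps=EPS):
    """Map a fraction in [0, 1] to the real line, clipping at eps and 1 - eps."""
    x = min(max(x, eps), 1.0 - eps)
    return math.log(x / (1.0 - x))


def inverse_logit(z):
    """Map a real value back to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


porosity = 0.26
transformed = logit(porosity)
recovered = inverse_logit(transformed)
```

With this variant, values at or beyond the clipping margin are not recovered exactly by the back-transformation, which is the price paid for a finite transformed variable.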

Fig. 7

Well log data and computed properties. From left to right: reference facies classification (shale in black, silt in light brown, and sand in dark brown), porosity, clay volume, P-impedance, and S-impedance

Fig. 8

Histograms of the petrophysical properties at the well location sorted by facies: porosity (top) and clay volume (bottom). The solid gray lines represent the smoothed gross histograms

Table 3 Model parameters (porosity and clay volume) for the petrophysical inversion case study (after a modified \(\mathrm {logit}\)-transformation)

Fig. 9

Rock physics crossplots: P-impedance versus porosity (left) and S-impedance versus porosity (right). Top plots are color-coded by clay volume; bottom plots are color-coded by facies

In order to build the likelihood function, a rock physics model is first estimated at the well location. An empirical linear rock physics model is chosen for this study and fitted to the well logs. The crossplots showing the linear models are given in Fig. 9. More complex models, such as stiff sand and cemented sand models (Mavko et al. 2009), could be used, as long as the model can be reliably linearized. In this application, a stationary likelihood model \({\mathbf {G}}\) independent of the facies \(\varvec{\kappa }\) is chosen. The proposed inversion methodology is applied to the set of elastic well logs, P- and S-impedance, to assess the posterior distribution of porosity and clay volume.
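A linear rock physics model of this kind can be fitted by ordinary least squares. The sketch below assumes a model of the form impedance = a + b × porosity + c × clay volume and solves the normal equations directly. The functional form, the function names, and the synthetic data are assumptions for illustration, and the coefficients are not those of the case study.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular.
    """
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def fit_linear_model(porosity, clay, impedance):
    """Least-squares coefficients [a, b, c] of a + b*porosity + c*clay."""
    rows = [[1.0, p, c] for p, c in zip(porosity, clay)]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, impedance)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic, noise-free logs generated from hypothetical coefficients
porosity = [0.05, 0.10, 0.20, 0.25, 0.30, 0.15]
clay = [0.60, 0.20, 0.40, 0.10, 0.30, 0.50]
impedance = [9.0 - 10.0 * p - 2.0 * c for p, c in zip(porosity, clay)]

coefficients = fit_linear_model(porosity, clay, impedance)
```

In practice, the residuals of such a fit would also be used to estimate the error variance of the likelihood, and the fit would be applied to the transformed variables if the model is defined in the transformed domain.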

In Fig. 10, the inversion results for the discrete facies are shown: the reference facies classification, a subset of conditional realizations from the posterior model, the marginal probabilities of the facies, and the MMAP predictor for the posterior model. The facies misclassifications are probably due to the noise and lower resolution of sonic logs compared to the petrophysical property logs. The slight under-prediction of the proportion of silt is expected since it is the facies with intermediate petrophysical properties and it overlaps with the other two facies in the prior histograms (Fig. 8). The set of realizations reflects the prediction uncertainty.

Fig. 10

Inversion results for the lithology facies. From left to right: reference facies classification (shale in black, silt in light brown, and sand in dark brown), subset of conditional realizations, marginal probabilities, and MMAP predictor for the posterior model

Fig. 11

Inversion results for the continuous properties (porosity and clay volume) for the Gaussian mixture prior pdf. Refer to the text for further details

Fig. 12

Inversion results for the continuous properties (porosity and clay volume) for the Gaussian prior pdf. Refer to the text for further details

In Fig. 11, the assessment of the posterior distribution of porosity and clay volume is shown. From left to right, the plots show: the actual porosity log (solid black line) with a set of conditional realizations (gray lines); the MMAP predictor for porosity (dashed black line) with the 80% prediction interval and actual porosity log (solid black line); the actual clay volume log (solid black line) with a set of conditional realizations (gray lines); and the MMAP predictor for clay volume (dashed black line) with the 80% prediction interval and actual clay volume log (solid black line). The bottom plots show the marginal smoothed histograms of porosity and clay volume. In general, the results are satisfactory and the MMAP predictors provide an accurate prediction of the properties of interest. The conditional realizations show abrupt changes, causing multimodal marginal densities. In thin-layer sequences, the facies prediction may be slightly time-shifted, which causes large prediction errors. Therefore, the prediction intervals are relatively wide.

As in the previous case, the inversion results are compared to a traditional Gaussian approach (Fig. 12). The plots in Fig. 12 show the conditional realizations, the marginal smoothed histograms, and the posterior distributions (MMAP predictions and 80% prediction intervals) of porosity and clay volume. The layout of the plots for the Gaussian case (Fig. 12) is similar to the Gaussian mixture case (Fig. 11). In the Gaussian case, a considerable regression toward the global mean values can be observed, resulting in the loss of the multimodality of the posterior distribution. The comparison of the RMSE of the predictions and coverage ratio of the 0.80 prediction interval is summarized in Table 4, and it shows an improvement in the RMSE and coverage ratios for the Gaussian mixture case for both porosity and clay volume. The porosity and clay volume predictions in the main reservoir (2065–2085 m) obtained through the Gaussian mixture inversion (Fig. 11) are closer to the actual observations than the results of the Gaussian case (Fig. 12). Furthermore, the proposed method provides an additional result, namely the facies classification, which is not available in the Gaussian case. The computational demand of the Gaussian mixture case remains limited.

4 Discussion

The main advantage of the proposed method is the use of Gaussian mixture distributions, which allows jointly predicting categorical and continuous properties. Compared to previous works based on Gaussian mixture models, a Markov model for the categorical variable is introduced to model the spatial continuity of the categorical property.

Compared to the approach of Buland and Omre (2003), which does not account for an explicit dependence on the facies, the proposed inversion provides overall more accurate results. Indeed, the Gaussian assumption (Buland and Omre 2003) provides MMAP predictions that are regressed toward the global mean values and underrepresents the distinct mode changes. The coverage ratios are closer to the correct prediction interval specifications for the proposed methodology than for the Gaussian approach. Compared to the petrophysical inversion approach described in Grana and Della Rossa (2010), the use of a Markov model for the facies classification provides more realistic spatial transitions of the categorical property. Indeed, in the prior transition matrix, one can enforce geological constraints, such as a high likelihood of observing shale on top of sand and a low likelihood of observing silt on top of sand, or zero probability of observing water on top of oil and oil on top of gas in litho-fluid studies (Larsen et al. 2006).
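The effect of zero entries in the prior transition matrix can be illustrated with a small simulation. The sketch below is hypothetical throughout: the four litho-fluid classes, the probabilities, and the convention that the chain runs downward (row = class at the current sample, column = class at the next deeper sample) are assumptions made for the example. The zero probability of water directly above gas is an additional assumption not stated in the text.

```python
import random

# Hypothetical litho-fluid classes
SHALE, GAS_SAND, OIL_SAND, WATER_SAND = range(4)

# Assumed convention: row = class at the current sample, column = class at the
# next deeper sample. All probabilities are illustrative only.
transition = [
    [0.85, 0.05, 0.05, 0.05],  # shale can overlie any class
    [0.05, 0.85, 0.05, 0.05],  # gas can overlie any class
    [0.05, 0.00, 0.85, 0.10],  # oil is never directly above gas
    [0.10, 0.00, 0.00, 0.90],  # water is never directly above oil (or gas, assumed)
]
FORBIDDEN = {
    (OIL_SAND, GAS_SAND),
    (WATER_SAND, OIL_SAND),
    (WATER_SAND, GAS_SAND),
}


def simulate_chain(matrix, n_samples, start, rng):
    """Draw one realization of a first-order Markov chain."""
    chain = [start]
    for _ in range(n_samples - 1):
        weights = matrix[chain[-1]]
        chain.append(rng.choices(range(len(weights)), weights=weights)[0])
    return chain


rng = random.Random(0)  # fixed seed for reproducibility
chain = simulate_chain(transition, 200, SHALE, rng)
observed = set(zip(chain[:-1], chain[1:]))
```

Because a transition with zero prior probability is never drawn, every realization honors the fluid ordering, whereas a model without spatial coupling between adjacent samples cannot guarantee it.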

The proposed method has a higher computational demand than the traditional Gaussian approach, but the cost remains acceptable and the algorithm is easy to parallelize. The extension of the Markov chain approach for facies prediction to two- and three-dimensional models has been previously introduced in Ulvmoen and Omre (2010), Rimstad and Omre (2010), and Rimstad et al. (2012). For the proposed Gaussian mixture case, the extension in the spatial domain appears to be achievable with an acceptable computational demand. Unlike stochastic optimization methods, the proposed McMC approach does not propose realizations by sampling from the prior distribution; instead, realizations are sampled conditioned on the seismic data. In stochastic optimization approaches such as the method proposed by Connolly and Hughes (2016) or the approximate Bayesian computation (ABC) methods as in Sadegh and Vrugt (2013), the likelihood function is approximated by statistical sampling. Simulated outcomes from such optimization methods are often dependent on the prior model and hence tend to underestimate the posterior uncertainty. The proposed method explicitly evaluates the likelihood function and allows for an accurate assessment of the uncertainty.

In this work, the same rock physics model is assumed for all the facies; however, in many practical applications, the rock physics model should be facies dependent (Grana and Della Rossa 2010). The assumption that the rock physics model is linear or almost linear is generally verified in many applications, with the exception of the fluid effect in homogeneous saturation mixtures and the porosity effect in unconsolidated sandstones (Dvorkin et al. 2014), but the linear relations between petrophysical properties and elastic properties are not necessarily the same for each facies. The proposed approach can be extended to account for facies-dependent rock physics models. Nonlinear models can be used in the likelihood function in the McMC approach, but the computational cost of the likelihood evaluation increases compared to the linearized case.

5 Conclusions

In this paper, the formulation of the solution of the Bayesian inverse problem is presented under the assumptions of a Gaussian linear likelihood function and a mixture Gaussian Markov random field for the prior model. The proposed inversion approach allows for the joint assessment of the posterior distribution of the continuous model variables and the latent categorical variable. This approach allows sampling from the posterior distribution with relatively limited computational costs. The method was successfully applied to two case studies of seismic and rock physics inversion. Compared to the inversion under Gaussian assumptions for the prior model, the use of Gaussian mixture models improves the description of the multimodal behavior of the model parameters.

Table 4 Results for the petrophysical inversion case study