1 Introduction

With the growth of the human population and the modernization of civilization, the demand for fossil fuel energy has been increasing. Meeting this demand requires maximizing oil production from oil reservoirs, which is a challenging task. An oil reservoir is usually modeled by partial differential equations (PDEs), where the geological model is on the order of \(10^{6}\)–\(10^{9}\) grid cells. The geological model is developed by geologists and then used by reservoir engineers. In this paper we focus on the production optimization of oil reservoirs with emphasis on water flooding, which is typically handled by reservoir engineers. The injected water aims to sweep the remaining oil efficiently. With current state-of-the-art computing power, reservoir simulation models are usually reduced to between \(10^{4}\) and \(10^{6}\) grid blocks. This process is known as upscaling since it creates a coarse model from the geological model. Upscaling is done by reservoir engineers based on geophysical interpretation. It involves heuristics and can be time-consuming in terms of both computation and human labour (see e.g., Aarnes et al. (2007)).

Model order reduction techniques can be used to facilitate the upscaling process. Such techniques have been used in reservoir simulation research since the early 2000s, see e.g., Markovinovic et al. (2002) and Markovinovic et al. (2002). The work of Heijn et al. (2004) compared methods for reduced-order modeling that treat oil reservoirs as both linear and nonlinear models. The methods originate from systems and control theory. Balanced truncation, subspace identification, and proper orthogonal decomposition (POD) methods were compared. Based on the examples considered in Heijn et al. (2004), the conclusion was that the POD method gave the best approximation of the oil reservoir dynamics. In the follow-up works of Doren et al. (2006) and Markovinovic and Jansen (2006), the POD method was used for gradient-based production optimization. The adjoint equations were derived using reduced-order models of the state equations. The POD method generates reduced-order models with global basis functions. Another approach was presented in Krogstad et al. (2011), where a multiscale method was applied to compute local basis functions. In that work an optimization problem using the real geometry of an oil reservoir was solved in 15 minutes, compared to the hours or even days normally required. More recently, a combination of multiscale and POD methods was presented in Krogstad (2011), yielding local POD basis functions. The selection of local basis functions in a multiscale method is done by considering physical aspects such as fault locations and flux boundaries. This is more intuitive to reservoir engineers since the reservoir model is divided into coarsened segments, each of which has its own local basis function.

The use of the trajectory-piecewise-linearization (TPWL) method, which models the oil reservoir as a linear time-varying (LTV) system along selected operating points, was proposed in Cardoso and Durlofsky (2010). The same authors also proposed the use of missing point estimation (MPE)-POD (Astrid et al. (2008)) in Cardoso et al. (2009) and further used the TPWL-based model order reduction for production optimization in Cardoso and Durlofsky (2010). Two optimization methods were presented in Cardoso and Durlofsky (2010): a gradient-based method and a generalized pattern search method. None of the production optimization papers mentioned above discuss the state or nonlinear output constraint problem. In more recent work, the use of approximate dynamic programming combined with the POD method was proposed in Wen et al. (2011). This work used the penalty method to handle the state constraint problem.

In other areas, the POD method has been used for constraint handling in low-fidelity model optimization. Among these approaches, the trust-region POD (TRPOD) method, originally proposed in Fahl (2000) for unconstrained optimization problems, was further developed for constrained optimization. The idea is that the POD method gives a good approximation of the high-fidelity model only within a limited (or “trusted”) region around the operating points at which the POD basis functions were computed. During the course of optimization the decision variables change, and therefore the POD basis functions need to be updated using the new decision variables. Without this update, the POD basis functions represent the old decision variables, are no longer valid, and give a poor approximation of the high-fidelity model. The constrained TRPOD, which means optimization using reduced-order models in the presence of (equality/inequality) constraints, was initiated in Alexandrov et al. (2001). The authors developed penalty, augmented Lagrangian, and SQP-like methods. A similar approach was used in Robinson (2007), where POD, space mapping methods, and their combination were proposed for constructing the reduced-order models. Furthermore, the use of the filter method for nonlinear constraint handling in low-fidelity model optimization along with TRPOD was presented in Agarwal (2010), Agarwal and Biegler (2011).

In this work, we follow the TRPOD method and handle the state constraints with the Lagrangian barrier method, which is a continuation of our work in Suwartadi et al. (2010). To the best of our knowledge, the TRPOD method has not been applied to the reservoir simulation problem. Hence, the contribution of this work is to apply the TRPOD method to the production optimization of oil reservoirs. Furthermore, we consider nonlinear inequality constraints. Our method is a gradient-based optimization method which uses the POD method for computing basis functions for the state and adjoint equations. Since we have implemented the adjoint method in the high-fidelity model, and to avoid the difficulty of re-implementing the adjoint-based gradient in the reduced-order model, we take snapshots of the adjoint equations as well. Thus, the reduced-order models in this work consist of reduced-order state and adjoint equations. Our approach differs from the work of Doren et al. (2006) and Markovinovic and Jansen (2006), where the reduced-order model for the adjoint equations was derived from the forward reduced-order model.

It should be noted that there are many variants of the POD method in addition to the ones mentioned above. A combined POD and discrete empirical interpolation method (DEIM), where DEIM is a variant of EIM (Barrault et al. (2004)), was recently proposed in Chaturantabut and Sorensen (2010). That work pointed out that the POD method alone is well suited only to approximating linear or bilinear terms of equations. As shown in an example in Chaturantabut and Sorensen (2011), for nonlinear systems, the POD method in conjunction with DEIM gives a considerable CPU time speedup compared to the POD method alone. Since oil reservoir models contain highly nonlinear terms, in this work we also compare the POD and POD-DEIM methods. The application of DEIM to optimization problems involving oil reservoir models is another contribution of this work.

The outline of this paper is the following. In Sect. 2 we describe the oil reservoir model which consists of pressure and saturation equations representing the state variables. We refer to these state equations as forward equations. In this section we also derive the adjoint equations and the reduced-order models. The production optimization problem is explained in Sect. 3, which basically is an economic optimization problem. In Sect. 4, we present the algorithms for the TRPOD method and the Lagrangian barrier method for nonlinear constraint handling. The algorithms use the TRPOD method in the inner iteration and the Lagrangian barrier in the outer iteration. This means the TRPOD method is used within the Lagrangian barrier iteration. Case examples that use 2D and 3D oil reservoirs are presented and the results are discussed in Sect. 5. Finally, based on the case example results we conclude this paper in Sect. 6.

In this paper we use standard linear algebra notation for describing mathematical equations. The superscript T denotes the vector or matrix transpose. Matrices and vectors are written in bold letters while scalars are set in ordinary type.

2 Oil reservoir model

Water flooding is the most common secondary recovery technique for oil reservoirs. During early stages of oil reservoir production, the pressure in the reservoir is high enough to support production alone. However, water is often injected to provide additional pressure support in the reservoir and thereby increase recovery.

We assume the reservoir is above the bubble point so that the oil component is in liquid form only. Furthermore, we assume that the process is isothermal, that the liquids are incompressible and immiscible (water and oil cannot be mixed), and that there is no capillary pressure between oil and water, no gravity effect, and no flow across the reservoir boundary.

2.1 Forward model

The oil reservoir is governed by the continuity equation, which expresses conservation of mass. The model exposition here follows Aarnes et al. (2007). The state equations consist of pressure and saturation equations. Let \(\varOmega \) be a porous medium domain with boundary \(\partial \varOmega \). The pressure equation is given by

$${\mathbf{v}}=-{\mathbf {K}}\lambda _t (s)\nabla p, \quad \nabla \cdot {\mathbf{v}} = q \quad {\mathrm{in}}\quad \varOmega ,$$
(1)

where \({\mathbf{v}}\) is the Darcy velocity, \({\mathbf {K}}\) is the permeability tensor, \(p\) is the pressure, \(s\) is the water saturation, and \(q\) is the volumetric source/sink term. Finally \(\lambda _t\) is the total mobility, which in this setting is the sum of the water and oil mobility functions,

$$\lambda _{t}\left( s\right) =\lambda _{w}\left( s\right) +\lambda _{o} \left( s\right) =\frac{k_{rw}\left( s\right) }{\mu _{w}}+ \frac{k_{ro}\left( s\right) }{\mu _{o}}.$$
(2)

Here, \(k_{rw},k_{ro}\) and \(\mu _w,\mu _o\) are the water and oil relative permeabilities and viscosities, respectively. Assuming no-flow boundaries means that the normal component of the Darcy velocity across boundaries is zero.

The saturation equation is given by

$$\phi \frac{\partial s}{\partial t}+\nabla \cdot \left( f_w(s){\mathbf{v}} \right) =q_{w} \quad {\mathrm{in}}\quad \varOmega ,$$
(3)

where \(\phi \) is the porosity and \(q_{w}\) is the volumetric water source. Finally, \(f_w\) is the water fractional flow function \(f_w(s) = \lambda _w(s)/\lambda _t(s)\), which is also known as water cut. The nonlinear behavior of the above equations is mainly dictated by the shape of the relative permeability functions, which in this paper are taken to be quadratic. The relative permeability data are obtained from laboratory experiments using small portions of rocks which do not generally represent the rock properties of the whole reservoir. Hence, uncertainties are unavoidable.
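As a small illustration of Eq. (2) and the fractional flow function, the sketch below evaluates \(\lambda _t\) and \(f_w\) for quadratic relative permeabilities. It assumes a normalized saturation with zero residual saturations, \(k_{rw}=s^{2}\) and \(k_{ro}=(1-s)^{2}\), and the viscosity values are purely illustrative; none of these specific choices are stated in the text.

```python
def rel_perm_water(s):
    """Quadratic water relative permeability, k_rw(s) = s^2 (assumed form)."""
    return s * s


def rel_perm_oil(s):
    """Quadratic oil relative permeability, k_ro(s) = (1 - s)^2 (assumed form)."""
    return (1.0 - s) ** 2


def total_mobility(s, mu_w, mu_o):
    """Total mobility lambda_t = k_rw / mu_w + k_ro / mu_o, as in Eq. (2)."""
    return rel_perm_water(s) / mu_w + rel_perm_oil(s) / mu_o


def fractional_flow(s, mu_w, mu_o):
    """Water fractional flow f_w = lambda_w / lambda_t."""
    return (rel_perm_water(s) / mu_w) / total_mobility(s, mu_w, mu_o)


# Illustrative viscosities (hypothetical values, not taken from the paper).
MU_W, MU_O = 1.0, 5.0
samples = [fractional_flow(0.1 * k, MU_W, MU_O) for k in range(11)]
```

With these assumptions \(f_w\) rises monotonically from 0 to 1 in an S-shaped curve, which is the source of the nonlinearity mentioned above.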

Equations (1) and (3), which are elliptic and parabolic PDEs respectively, are solved numerically. Hence, we need to discretize the equations. We discretize the domain \(\varOmega \) into a set of polyhedral grid blocks \(\left\{ E_{i}\right\} \), where a grid block \(E\) contains faces \(e_{k}\), \(k=1,\ldots ,n_{E}\). Let \({\mathbf {v}}_{E}=\left( v_{e_{1}},v_{e_{2}},\ldots ,v_{e_{n_{E}}}\right) \) be the outward pointing flux vectors corresponding to the faces of \(E\), \(p_{E}\) the pressure at the grid block center, and \({\varvec{\pi }}_{E}\) the pressures at the grid faces. Then, the discretized pressure equation for a single grid-block is

$$\begin{aligned} {\mathbf {v}}_{E}&=\lambda \left( {s}_{E}\right) {\mathbf {T}}_{E}\left( p_{E}-{\varvec{\pi }}_{E} \right) , \\ \sum _{i=1}^{n_{E}}v_{e_{i}}|e_{i}|&=q_{E}, \end{aligned}$$
(4)

where \({\mathbf {T}}_{E}\) is the transmissibility matrix, and \(q_{E}\) is the source/sink term in block \(E\). Here, we discretize according to the two-point flux-approximation (TPFA) (see e.g., Aziz and Settari (1979)), which will result in diagonal transmissibility matrices.

Since we assume no-flow boundaries, the only boundary conditions are located at the wells. As in (1), the source/sink terms represent injector/producer wells. The wells are modeled by the Peaceman equation (Peaceman (1983)) as follows

$$q_{E}^{w}=-\lambda \left( s_{E}\right) WI_{E}^{w}\left( p_{E}-p_{E}^{w}\right) .$$
(5)

Here \(q_{E}^{w}\) is the flow rate from well \(w\) into grid block \(E\) and \(p_{E}^{w}\) is the wellbore pressure (assumed to be constant since we neglect gravity and wellbore flow effects). Finally, \(WI_{E}^{w}\) is the Peaceman well-index for the grid block \(E\).

The discretized pressure Eq. (4) and the well Eq. (5) can be combined into the following linear system

$$\left( \begin{array}{ccccc} {\mathbf {B}}^{n}\left( {\mathbf {s}}^{n-1}\right) &{} {\mathbf {0}} &{} {\mathbf {C}} &{} {\mathbf {D}} &{} {\mathbf {0}}\\ {\mathbf {0}} &{} {\mathbf {B}}_{w}^{n}\left( {\mathbf {s}}^{n-1}\right) &{} {\mathbf {C}}_{w} &{} {\mathbf {0}} &{} {\mathbf {D}}_{w,N}\\ {\mathbf {C}}^{T} &{} {\mathbf {C}}_{w}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}}\\ {\mathbf {D}}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}}\\ {\mathbf {0}} &{} {\mathbf {D}}_{w,N}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}} \end{array}\right) \left( \begin{array}{c} {\mathbf {v}}^{n}\\ -{\mathbf {q}}_{w}^{n}\\ -{\mathbf {p}}^{n}\\ {\varvec{\pi }}^{n}\\ {\mathbf {p}}_{w,N}^{n} \end{array}\right) =\left( \begin{array}{c} {\mathbf {0}}\\ -{\mathbf {D}}_{w,D}{\mathbf {p}}_{w,D}^{n}\left( {\mathbf {u}}^{n}\right) \\ {\mathbf {0}}\\ {\mathbf {0}}\\ -{\mathbf {q}}_{tot,N}^{n}\left( {\mathbf {u}}^{n}\right) \end{array}\right) .$$
(6)

Here, the first and second rows of the block matrix represent Darcy’s law as in (1) and (5) for all grid blocks. The third row corresponds to mass conservation for all grid blocks and the last two rows refer to continuity of fluxes for all grid block faces. The solution vector of the block-matrix equation above,

$$\left[ \begin{array}{ccccc} {\mathbf {v}}^{n}&-{\mathbf {q}}_{w}^{n}&-{\mathbf {p}}^{n}&{\varvec{\pi }}^{n}&{\mathbf {p}}_{w,N}^{n}\end{array}\right] ^{T} $$

contains the fluxes, the well rates, the grid-block pressures, the face pressures, and the wellbore pressures, respectively. The matrices \({\mathbf {B}}\), \({\mathbf {B}}_{w}\), \({\mathbf {C}}\), and \({\mathbf {C}}_{w}\) are block diagonal with each block corresponding to a grid block. Similarly, each column of \({\mathbf {D}}\) and \({\mathbf {D}}_{w,N}\) corresponds to a unique face. Superscript \(n\) represents the time step and \({\mathbf {u}}^{n}\) is the control input at time step \(n\), which could be either bottom-hole pressure (BHP): \({\mathbf {u}}^{n}={\mathbf {p}}_{w,N}^{n}\) or well rate: \({\mathbf {u}}^{n}={\mathbf {v}}^{n}\). The block-matrix Eq. (6) is solved for time step \(n\) using the default linear solver in MATLAB, which is a direct sparse method (see Davis (2006)). We note that when TPFA is used, the pressure Eq. (6) can be reduced to a system with the cell pressures as the only unknowns, whereas the current implementation uses a mixed formulation where fluxes and cell pressures are solved for simultaneously.
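The remark that TPFA allows a reduction to cell-pressure unknowns can be illustrated with a minimal one-dimensional sketch. It is not the mixed formulation (6) used in the paper. It assumes unit cell geometry, harmonic averaging of the per-cell conductivity \(K\lambda \) across each face, and two BHP-controlled wells coupled through the well model (5); the function name `solve_pressure_1d` and all numerical values are hypothetical.

```python
import numpy as np


def solve_pressure_1d(perm, mobility, well_index, p_inj, p_prod):
    """Cell-pressure TPFA solve on a 1D row of cells with a well at each end."""
    perm = np.asarray(perm, dtype=float)
    mobility = np.asarray(mobility, dtype=float)
    n = perm.size
    cond = perm * mobility  # per-cell conductivity K * lambda
    # Face transmissibility: harmonic average of neighbouring conductivities.
    trans = 2.0 * cond[:-1] * cond[1:] / (cond[:-1] + cond[1:])

    A = np.zeros((n, n))
    rhs = np.zeros(n)
    for i, t in enumerate(trans):
        # Flux from cell i to cell i+1 is t * (p_i - p_{i+1}).
        A[i, i] += t
        A[i + 1, i + 1] += t
        A[i, i + 1] -= t
        A[i + 1, i] -= t
    # Wells enter as q = -lambda * WI * (p_cell - p_well), cf. Eq. (5).
    for cell, p_well in ((0, p_inj), (n - 1, p_prod)):
        w = mobility[cell] * well_index
        A[cell, cell] += w
        rhs[cell] += w * p_well

    p = np.linalg.solve(A, rhs)
    q_inj = -mobility[0] * well_index * (p[0] - p_inj)
    q_prod = -mobility[-1] * well_index * (p[-1] - p_prod)
    return p, q_inj, q_prod


p, q_inj, q_prod = solve_pressure_1d(
    perm=[1.0, 0.5, 2.0, 1.0, 0.1, 1.0],
    mobility=[1.0, 0.8, 0.6, 0.4, 0.3, 0.2],
    well_index=10.0,
    p_inj=300.0,
    p_prod=100.0,
)
```

Because the fluids are incompressible, the injected and produced rates in this sketch balance, which gives a quick consistency check on the assembled system.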

We discretize the saturation Eq. (3) using a standard upstream weighted implicit finite volume method to form

$${\mathbf {s}}^{n}={\mathbf {s}}^{n-1}+\triangle t^{n}\,{\mathbf {D}}_{PV}^{-1}\left( {\mathbf {A}}\left( {\mathbf {v}}^{n}\right) \,f_w\left( {\mathbf {s}}^{n}\right) + {\mathbf {q}}\left( {\mathbf {v}}^{n}\right) _{+}\right) .$$
(7)

Here, \(\triangle t^{n}\) is the time step size and \({\mathbf {D}}_{PV}\) is the diagonal matrix containing the grid block pore volumes. The matrix \({\mathbf {A}}\left( {\mathbf {v}}^{n}\right) \) is the sparse flux matrix based on the upstream weighted discretization scheme, and \({\mathbf {q}}\left( {\mathbf {v}}^{n}\right) _{+}\) is the vector of positive sources (in this setting, water injection rates). We note that the matrix \({\mathbf {A}}\) and vector \({\mathbf {q}}\) are linear functions of \({\mathbf {v}}^{n}\). The discretized saturation Eq. (7) is solved implicitly for the current time step \(n\) using a Newton-Raphson method.
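The implicit update (7) can be sketched on a one-dimensional row of cells with a constant left-to-right flux. In that setting the upstream-weighted system is lower triangular, so each cell reduces to a scalar Newton-Raphson solve. This is a simplified illustration, not the solver used in the paper: the quadratic relative permeabilities with zero residual saturation, the viscosities, the uniform pore volume, and the function names are all assumptions.

```python
def frac_flow(s, mu_w=1.0, mu_o=5.0):
    """Water fractional flow for quadratic relative permeabilities (assumed)."""
    a = s * s / mu_w
    b = (1.0 - s) ** 2 / mu_o
    return a / (a + b)


def frac_flow_derivative(s, mu_w=1.0, mu_o=5.0):
    """Derivative of frac_flow with respect to s (quotient rule)."""
    a = s * s / mu_w
    b = (1.0 - s) ** 2 / mu_o
    da = 2.0 * s / mu_w
    db = -2.0 * (1.0 - s) / mu_o
    return (da * b - a * db) / (a + b) ** 2


def implicit_saturation_step(s_old, rate, pore_volume, dt, tol=1e-12, max_iter=50):
    """One implicit upstream-weighted step, swept cell by cell along the flow."""
    s_new = []
    inflow = rate  # pure water is injected into the first cell
    for s_prev in s_old:
        s = s_prev
        for _ in range(max_iter):
            # Residual of s - s_prev - dt/pv * (water in - water out) = 0.
            residual = s - s_prev - dt / pore_volume * (inflow - rate * frac_flow(s))
            if abs(residual) < tol:
                break
            slope = 1.0 + dt / pore_volume * rate * frac_flow_derivative(s)
            s = min(1.0, max(0.0, s - residual / slope))  # keep s in [0, 1]
        s_new.append(s)
        inflow = rate * frac_flow(s)  # water flux leaving this cell enters the next
    return s_new


saturation = [0.0] * 10
for _ in range(20):
    previous = saturation
    saturation = implicit_saturation_step(previous, rate=1.0, pore_volume=1.0, dt=0.1)
```

The scheme is conservative by construction: over one step, the change in stored water equals the injected water minus the water leaving the last cell.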

As can be seen, Eqs. (6) and (7) are coupled. The solution strategy is to first solve the discretized pressure Eq. (6) using the initial water saturation values, and then solve the discretized saturation Eq. (7). This procedure is repeated forward in time until the final time is reached. This kind of solution strategy is known as a sequential-splitting method (Aarnes et al. (2007)). The model used in this work is implemented as described in Lie et al. (2011). For convenience, we write the discrete state Eqs. (6) and (7) in an implicit form \({\mathbf {F}}\left( \widetilde{{\mathbf {x}}}, \widetilde{{\mathbf {u}}}\right) =0\) as

$$\begin{aligned} {\mathbf {F}}(\widetilde{{\mathbf {x}}},\widetilde{{\mathbf {u}}})&= \left( \begin{array}{c} {\mathbf {F}}^{1}\left( {\mathbf {p}}^{1},{\mathbf {s}}^{0},{\mathbf {s}}^{1}, {\mathbf {u}}^{1}\right) \\ \vdots \\ {\mathbf {F}}^{N}\left( {\mathbf {p}}^{N},{\mathbf {s}}^{N-1},{\mathbf {s}}^{N}, {\mathbf {u}}^{N}\right) \end{array} \right) =0 \\ {\mathbf {x}}^{nT}&= ({\mathbf {p}}^{nT},{\mathbf {s}}^{nT}),\quad n=1,...,N, \\ \widetilde{{\mathbf {x}}}^{T}&= ({\mathbf {x}}^{1T},...,{\mathbf {x}}^{NT}), \\ \widetilde{{\mathbf {u}}}^{T}&= ({\mathbf {u}}^{1T},...,{\mathbf {u}}^{NT}). \end{aligned}$$
(8)

The state vectors and control input vectors are stacked for all time instances \(n=1,\ldots ,N\). The dimensions of \(\widetilde{{\mathbf {x}}}\) and \(\widetilde{{\mathbf {u}}}\) depend on the number of grid blocks and time steps.

2.2 Adjoint equations

Let \({\mathcal {J}}\left( \widetilde{{\mathbf {x}}},\widetilde{{\mathbf {u}}}\right) =\sum _{n=1}^{N}{\mathcal {J}}^{n}\left( {\mathbf {x}}^{n},{\mathbf {u}}^{n}\right) \) be an objective function, and denote by \(\nabla _{\widetilde{{\mathbf {u}}}}{\mathcal {J}}\) its gradient with respect to the control input \(\widetilde{{\mathbf {u}}}\). A detailed description of the objective function \({\mathcal {J}}\left( \widetilde{{\mathbf {x}}},\widetilde{{\mathbf {u}}}\right) \) is given in Sect. 3. We then construct an augmented objective function or Lagrangian functional

$$\begin{aligned} {\mathcal {L}}\left( \widetilde{{\mathbf {x}}},\widetilde{{\mathbf {u}}}, {\varvec{\lambda }}\right)&= {\mathcal {J}}\left( \widetilde{{\mathbf {x}}}, \widetilde{{\mathbf {u}}}\right) +{\varvec{\lambda }}^{T}{\mathbf {F}} \left( \widetilde{{\mathbf {x}}},\widetilde{{\mathbf {u}}}\right) \\&= \sum _{n=1}^{N}\left( {\mathcal {J}}^{n}\left( {\mathbf {x}}^{n}, {\mathbf {u}}^{n}\right) +{\varvec{\lambda }}^{nT}{\mathbf {F}} \left( {\mathbf {x}}^{n},{\mathbf {x}}^{n-1},{\mathbf {u}}^{n}\right) \right) , \end{aligned}$$
(9)

for \(n=1,\ldots ,N\), where

$$\begin{aligned} {\varvec{\lambda }}^{nT}{\mathbf {F}}=\;&{\varvec{\lambda }}_{v}^{nT}\left( {\mathbf {B}}^{n}{\mathbf {v}}^{n}-{\mathbf {C}} {\mathbf {p}}^{n}+{\mathbf {D}}{\varvec{\pi }}^{n}\right) \\&+ {\varvec{\lambda }}_{q_{w}}^{nT} \left( -{\mathbf {B}}_{w}^{n}{\mathbf {q}}_{w}^{n}-{\mathbf {C}}_{w} {\mathbf {p}}^{n}+{\mathbf {D}}_{w,N}{\mathbf {p}}_{w,N}^{n} \left( {\mathbf {u}}^{n}\right) \right) \\&+ {\varvec{\lambda }}_{p}^{nT}\left( {\mathbf {C}}^{T} {\mathbf {v}}^{n}-{\mathbf {C}}_{w}^{T}{\mathbf {q}}_{w}^{n}\right) \\&+ {\varvec{\lambda }}_{\pi }^{nT}{\mathbf {D}}^{T}{\mathbf {v}}^{n}\\&+ {\varvec{\lambda }}_{p_{w,N}}^{nT}\left( -{\mathbf {D}}_{w,N}^{T} {\mathbf {q}}_{w}^{n}+{\mathbf {q}}_{tot,N}^{n}\left( {\mathbf {u}}^{n}\right) \right) \\&+ {\varvec{\lambda }}_{s}^{nT}\left( {\mathbf {s}}^{n} -{\mathbf {s}}^{n-1}-\triangle t^{n}{\mathbf {i}}^{n}\right) . \end{aligned}$$

Here \({\mathbf {i}}^{n}={\mathbf {D}}_{PV}^{-1}\left( {\mathbf {A}}\left( {\mathbf {v}}^{n}\right) f_{w}\left( {\mathbf {s}}^{n}\right) +{\mathbf {q}}\left( {\mathbf {v}}^{n}\right) _{+}\right) \). By choosing \({\varvec{\lambda }}\) that makes \(\nabla _{{\tilde{\mathbf{x}}}}{\mathcal {L}}={\mathbf {0}}\), we arrive at the adjoint equations

$$\begin{aligned} \left( \frac{\partial F\left( {\mathbf {x}}^{n},{\mathbf {x}}^{n-1}, {\mathbf {u}}^{n}\right) }{\partial {\mathbf {x}}^{n}}\right) ^{T} {\varvec{\lambda }}^{n}+\left( \frac{\partial F\left( {\mathbf {x}}^{n+1}, {\mathbf {x}}^{n},{\mathbf {u}}^{n+1}\right) }{\partial {\mathbf {x}}^{n}}\right) ^{T} {\varvec{\lambda }}^{n+1}=-\left( \frac{\partial {\mathcal {J}}^{n} \left( {\mathbf {x}}^{n},{\mathbf {u}}^{n}\right) }{\partial {\mathbf {x}}^{n}}\right) ^{T}, \end{aligned}$$
(10)

for \(n=N,\ldots ,1\). The details of (10) are

$$\left( \begin{array}{ccccc} {\mathbf {B}}^{n}\left( {\mathbf {s}}^{n-1}\right) &{} {\mathbf {0}} &{} {\mathbf {C}} &{} {\mathbf {D}} &{} {\mathbf {0}}\\ {\mathbf {0}} &{} {\mathbf {B}}_{w}^{n}\left( {\mathbf {s}}^{n-1}\right) &{} {\mathbf {C}}_{w} &{} {\mathbf {0}} &{} {\mathbf {D}}_{w,N}\\ {\mathbf {C}}^{T} &{} {\mathbf {C}}_{w}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}}\\ {\mathbf {D}}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}}\\ {\mathbf {0}} &{} {\mathbf {D}}_{w,N}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}} \end{array}\right) \left( \begin{array}{c} {\varvec{\lambda }}_{v}^{n}\\ {\varvec{\lambda }}_{q_{w}}^{n}\\ {\varvec{\lambda }}_{p}^{n}\\ {\varvec{\lambda }}_{\pi }^{n}\\ {\varvec{\lambda }}_{p_{w,N}}^{n} \end{array}\right) =\left( \begin{array}{c} \left( \frac{\partial {\mathbf {i}}^{n}}{\partial {\mathbf {v}}^{n}}\right) ^{T} {\varvec{\lambda }}_{s}^{n}-\left( \frac{\partial {\mathcal {J}}}{\partial {\mathbf {v}}^{n}}\right) ^{T}\\ \left( \frac{\partial {\mathcal {J}}^{n}}{\partial {\mathbf {q}}_{w}^{n}}\right) ^{T}\\ {\mathbf {0}}\\ {\mathbf {0}}\\ {\mathbf {0}} \end{array}\right) ,$$
(11)

for the corresponding pressure equation and the following for the saturation

$$\begin{aligned} \left( {\mathbf {I}}-\triangle t\left( \frac{\partial {\mathbf {i}}^{n}}{\partial {\mathbf {s}}^{n}}\right) ^{T} \right) {\varvec{\lambda }}_{s}^{n}= \quad& {\varvec{\lambda }}_{s}^{n+1}-\left( \frac{\partial {\mathcal {J}}^{n}}{\partial {\mathbf {s}}^{n}}\right) ^{T} \\&\,- \left( \frac{\partial }{\partial {\mathbf {s}}^{n}}\left( {\mathbf {B}}^{n+1} {\mathbf {v}}^{n+1}\right) \right) ^{T}{\varvec{\lambda }}_{v}^{n+1} \\\,&\,+ \left( \frac{\partial }{\partial {\mathbf {s}}^{n}}\left( {\mathbf {B}}_{w}^{n+1} {\mathbf {q}}_{w}^{n+1}\right) \right) ^{T}{\varvec{\lambda }}_{q_{w}}^{n+1}. \end{aligned}$$
(12)

Using the fact that at the final time \({\varvec{\lambda }}_{\alpha }^{N}={\mathbf {0}}\) for \(\alpha =\left\{ v,q_{w},p,\pi ,p_{w,N},s\right\} \), we are able to compute the Lagrange multipliers for each time step backward in time. It should be noted that (11) and (12) are linear equations and they are solved using the direct sparse method as well. Finally, using the obtained Lagrange multiplier values, the gradient with respect to \(\widetilde{\mathbf {u}}\) is

$$\nabla _{\widetilde{\mathbf {u}}}{\mathcal {L}}^{n}= \frac{\partial {\mathcal {J}}^{n}\left( {\mathbf {x}}^{n}, {\mathbf {u}}^{n}\right) }{\partial {\mathbf {u}}^{n}}+{\varvec{\lambda }}^{nT} \frac{\partial F\left( {\mathbf {x}}^{n},{\mathbf {x}}^{n-1},{\mathbf {u}}^{n}\right) }{\partial {\mathbf {u}}^{n}}. $$
(13)
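The structure of (10) and (13), a backward sweep for the multipliers followed by a gradient assembly, can be illustrated on a scalar toy problem. The sketch below is not the reservoir adjoint. It assumes a hypothetical state equation \(F^{n}=x^{n}-x^{n-1}-\triangle t\left( u^{n}-(x^{n-1})^{2}\right) =0\), a quadratic tracking objective, and a zero multiplier beyond the final step so that the \(n+1\) term in (10) drops out at \(n=N\).

```python
def simulate(u, x0=0.0, dt=0.1):
    """Forward sweep of the toy state equation; x[n] is the state at step n."""
    x = [x0]
    for u_n in u:
        x.append(x[-1] + dt * (u_n - x[-1] ** 2))
    return x


def objective(x, u, target=1.0, weight=0.01):
    """Sum over steps of J^n = 0.5*(x^n - target)^2 + 0.5*weight*(u^n)^2."""
    return sum(
        0.5 * (x[n + 1] - target) ** 2 + 0.5 * weight * u[n] ** 2
        for n in range(len(u))
    )


def adjoint_gradient(u, x0=0.0, dt=0.1, target=1.0, weight=0.01):
    """Gradient of the objective with respect to u via a backward adjoint sweep."""
    x = simulate(u, x0, dt)
    n_steps = len(u)
    lam_next = 0.0  # multiplier beyond the final step (assumed zero)
    grad = [0.0] * n_steps
    for n in range(n_steps, 0, -1):
        dJ_dx = x[n] - target
        # Eq. (10) with dF^n/dx^n = 1 and dF^{n+1}/dx^n = -(1 - 2*dt*x^n).
        lam = -dJ_dx + (1.0 - 2.0 * dt * x[n]) * lam_next
        # Eq. (13) with dJ^n/du^n = weight*u^n and dF^n/du^n = -dt.
        grad[n - 1] = weight * u[n - 1] - dt * lam
        lam_next = lam
    return grad


controls = [0.5, 1.0, 1.5, 1.0, 0.5]
gradient = adjoint_gradient(controls)
```

A finite-difference comparison on such a toy problem is a common way to validate an adjoint implementation before moving to the full model; the cost of the adjoint sweep is one backward pass regardless of the number of controls.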

2.3 Reduced-order models

2.3.1 POD method

In order to build a reduced-order model based on the POD method, we need to take snapshots of the solution of the high-fidelity model described in (6) and (7). Let \({\mathbf {x}}^{T}=\left[ \begin{array}{cc} {\mathbf {p}}&{\mathbf {s}}\end{array}\right] =\left[ \begin{array}{cc} {\mathbf {x}}_{p}^{T}&{\mathbf {x}}_{s}^{T}\end{array}\right] \in {\mathbb {R}}^{n_{x}}\) be a snapshot of the solution of the forward equations, with \(n_{x}\) the dimension of the solution, which is the number of grid blocks. Given a set of snapshots \(\left\{ {\mathbf {x}}_{1},\ldots ,{\mathbf {x}}_{\varXi }\right\} \in {\mathbb {R}}^{n_{x}\times \varXi }\), the snapshot matrices are

$${\mathbf {x}}_{p}=\left[ \begin{array}{cccc} {\mathbf {x}}_{p_{1}}^{1} &{} {\mathbf {x}}_{p_{1}}^{2} &{} \ldots &{} {\mathbf {x}}_{p_{1}}^{\varXi }\\ {\mathbf {x}}_{p_{2}}^{1} &{} {\mathbf {x}}_{p_{2}}^{2} &{} \ldots &{} {\mathbf {x}}_{p_{2}}^{\varXi }\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ {\mathbf {x}}_{p_{n_{x}}}^{1} &{} {\mathbf {x}}_{p_{n_{x}}}^{2} &{} \ldots &{} {\mathbf {x}}_{p_{n_{x}}}^{\varXi } \end{array}\right] _{n_{x}\times \varXi },\;{\mathbf {x}}_{s}=\left[ \begin{array}{cccc} {\mathbf {x}}_{s_{1}}^{1} &{} {\mathbf {x}}_{s_{1}}^{2} &{} \ldots &{} {\mathbf {x}}_{s_{1}}^{\varXi }\\ {\mathbf {x}}_{s_{2}}^{1} &{} {\mathbf {x}}_{s_{2}}^{2} &{} \ldots &{} {\mathbf {x}}_{s_{2}}^{\varXi }\\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ {\mathbf {x}}_{s_{n_{x}}}^{1} &{} {\mathbf {x}}_{s_{n_{x}}}^{2} &{} \ldots &{} {\mathbf {x}}_{s_{n_{x}}}^{\varXi } \end{array}\right] _{n_{x}\times \varXi }$$
(14)

It should be noted that no additional computational time is needed to build the snapshot matrices since they are merely solutions of the forward state equations. Let \({\mathcal {V}}={\mathrm{span}}\left\{ {\mathbf {x}}_{1},\ldots ,{\mathbf {x}}_{\varXi }\right\} \). The POD basis is the solution of an optimization problem for finding orthonormal vectors \(\left\{ {\psi }_{i}\right\} _{i=1}^{\ell }\), where \(\ell \le {\mathrm{dim}}\,{\mathcal {V}}\). The optimization formulation is

$$\begin{aligned} \underset{\left\{ {\psi }_{i}\right\} _{i=1}^{\ell }}{\mathrm{min}}{\mathcal {J}}\left( {\psi }_{1}, \ldots ,{\psi }_{\ell }\right) :=&\sum _{j=1}^{\varXi }\left\| {\mathbf {x}}_{j}-\sum _{i=1}^{\ell }\left( {\mathbf {x}}_{j}^{T} {\psi }_{i}\right) {\psi }_{i}\right\| _{2}^{2}\\ {\mathrm{subject\, to}}\,&{\psi }_{i}^{T}{\psi }_{j}={\delta }_{ij}={\left\{ \begin{array}{ll} 1 &{} {\mathrm{if}}\, i=j\\ 0 &{} {\mathrm{otherwise}} \end{array}\right. }. \end{aligned}$$
(15)

We define a Lagrangian functional

$$ {\mathcal {L}}\left( {\psi }_{1},\ldots ,{\psi }_{\ell },{\lambda }_{11}, \ldots ,{\lambda }_{\ell \ell }\right) ={\mathcal {J}}\left( {\psi }_{1}, \ldots ,{\psi }_{\ell }\right) +\sum _{i,j=1}^{\ell }{\lambda }_{ij} \left( {\psi }_{i}^{T}{\psi }_{j}-{\delta }_{ij}\right) .$$
(16)

The necessary optimality conditions, \(\frac{\partial {\mathcal {L}}}{\partial {\psi }_{i}}=0\) and \(\frac{\partial {\mathcal {L}}}{\partial {\lambda }_{ij}}=0\), give us an eigenvalue problem

$$\sum _{j=1}^{\varXi }{\mathbf {x}}_{j}\left( {\mathbf {x}}_{j}^{T}{\psi }_{i}\right) ={\lambda }_{ii}{\psi }_{i},\;{\mathrm{for}}\,i=1,\ldots ,\ell $$
(17)

or, by setting \({\lambda }_{i}={\lambda }_{ii}\) and \({\mathbf {X}}=\left[ {\mathbf {x}}_{1},\ldots ,{\mathbf {x}}_{\varXi }\right] \in {\mathbb {R}}^{n_{x}\times \varXi }\), the problem reads

$$ {\mathbf {X}}{\mathbf {X}}^{T}{\psi }_{i}={\lambda }_{i}{\psi }_{i}, \;{\mathrm{for}}\, i=1,\ldots ,\ell .$$
(18)

To compute the solution of (18), we decompose the matrix \({\mathbf {X}}\) using the singular value decomposition (SVD), that is,

$$ {\mathbf {X}}={\mathbf {U}}{\varvec{\varSigma }}{\mathbf {V}}^{T},$$
(19)

where \({\mathbf {U}}=\left[ {\mathbf {u}}_{1},\ldots ,{\mathbf {u}}_{n_{x}}\right] \in {\mathbb {R}}^{n_{x}\times n_{x}}\) and \({\mathbf {V}}=\left[ {\mathbf {v}}_{1},\ldots ,{\mathbf {v}}_{\varXi }\right] \in {\mathbb {R}}^{\varXi \times \varXi }\) are orthogonal matrices, and \({\varvec{\varSigma }}\in {\mathbb {R}}^{n_{x}\times \varXi }\) is the diagonal matrix with diagonal entries arranged in decreasing order, that is, \(\sigma _{1}\ge \sigma _{2}\ge \ldots \ge \sigma _{\varXi }\ge 0.\) In other words,

$${\mathbf {U}}^{T}{\mathbf {X}}{\mathbf {V}}={\varvec{\varSigma }}. $$
(20)

Moreover, it follows for \(1\le i\le \varXi \) that

$${\mathbf {X}}{\mathbf {v}}_{i}=\sigma _{i}{\mathbf {u}}_{i},\; {\mathbf {X}}^{T}{\mathbf {u}}_{i}=\sigma _{i}{\mathbf {v}}_{i},\; {\mathbf {X}}{\mathbf {X}}^{T}{\mathbf {u}}_{i}=\sigma _{i}^{2} {\mathbf {u}}_{i}. $$
(21)

The solution of problem (18) is a POD basis \({\psi }_{i}={\mathbf {u}}_{i}\) and \(\lambda _{i}=\sigma _{i}^{2}>0\) for \(i=1,\ldots ,\ell \le d={\mathrm{dim}}\,{\mathcal {V}}\). The minimized objective function (15) is then

$$ {\mathcal {J}}\left( {\psi }_{1},\ldots ,{\psi }_{\ell }\right) :=\sum _{j=1}^{\varXi }\left\| {\mathbf {x}}_{j}-\sum _{i=1}^{\ell }\left( {\mathbf {x}}_{j}^{T}{\psi }_{i}\right) {\psi }_{i}\right\| _{2}^{2}=\sum _{i=\ell +1}^{d}\lambda _{i}. $$
(22)

To determine \(\ell \), the singular values are truncated according to the following ‘energy’ criterion

$$ E=\frac{\sum _{i=1}^{\ell }\sigma _{i}}{\sum _{i=1}^{\varXi }\sigma _{i}}<\alpha ,$$
(23)

where typically \(0.9\le \alpha <1\). This choice of truncation is rather heuristic (Volkwein (2003)). We follow what is commonly used in the literature. Other works use the sum of squared singular values instead, see e.g., Doren et al. (2006), Markovinovic and Jansen (2006).
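The construction in (18)–(23) can be sketched with NumPy as follows. The function name `pod_basis` and the synthetic snapshot data are hypothetical. The sketch reads the truncation criterion as keeping the smallest \(\ell \) whose captured fraction of the singular value sum reaches \(\alpha \), which is one common interpretation and an assumption here.

```python
import numpy as np


def pod_basis(snapshots, alpha=0.95):
    """POD basis of an (n_x, Xi) snapshot matrix via the thin SVD.

    Returns the truncated basis, all singular values, and the basis size ell.
    """
    U, sigma, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = np.cumsum(sigma) / np.sum(sigma)  # captured fraction, cf. Eq. (23)
    # Smallest ell with captured fraction >= alpha (assumed reading of Eq. (23)).
    ell = min(int(np.searchsorted(energy, alpha)) + 1, sigma.size)
    return U[:, :ell], sigma, ell


# Synthetic snapshots: three spatial modes with rapidly decaying amplitudes.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 200)
modes = np.stack(
    [np.sin(np.pi * grid), np.sin(2 * np.pi * grid), np.sin(3 * np.pi * grid)],
    axis=1,
)
coeffs = rng.normal(size=(3, 40)) * np.array([[10.0], [1.0], [0.1]])
X = modes @ coeffs

basis, sigma, ell = pod_basis(X, alpha=0.95)
residual = X - basis @ (basis.T @ X)  # projection error of the snapshots
```

The squared Frobenius norm of `residual` equals the sum of the discarded \(\sigma _{i}^{2}\), which is the statement of Eq. (22) with \(\lambda _{i}=\sigma _{i}^{2}\).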

The POD method is applied to the state and adjoint equations. Let \(\ell _{p}\), \(\ell _{s}\) and \(n_{p}\), \(n_{s}\) be the dimensions of the pressure and saturation equations in the reduced-order and high-fidelity models, respectively, where \(\ell _{p}\ll n_{p}\) and \(\ell _{s}\ll n_{s}\). Then the transformation from the reduced-order to the high-fidelity model is

$$\begin{aligned} {\mathbf {x}}_{p}&= {\mathbf {V}}_{p}\hat{\mathbf {x}}_{p}+\overline{\mathbf {x}}_{p}, \\ {\mathbf {x}}_{s}&= {\mathbf {V}}_{s}\hat{\mathbf {x}}_{s}+\overline{\mathbf {x}}_{s}. \end{aligned}$$
(24)

The high-fidelity model is represented by \({\mathbf {x}}_{p}\in {\mathbb {R}}^{n_{p}}\), \({\mathbf {x}}_{s}\in {\mathbb {R}}^{n_{s}}\) and their respective averages over the snapshots \({\mathbf {\overline{x}}}_{p}\in {\mathbb {R}}^{n_{p}}\) and \({\mathbf {\overline{x}}}_{s}\in {\mathbb {R}}^{n_{s}}\), i.e.,

$$ \overline{\mathbf {x}}=\frac{1}{\varXi }\sum _{i=1}^{\varXi }{\mathbf {x}}_{i}. $$
(25)

In the reduced-order space, \(\hat{\mathbf {x}}_{p}\in {\mathbb {R}}^{\ell _{p}}\) and \(\hat{\mathbf {x}}_{s}\in {\mathbb {R}}^{\ell _{s}}\), the forward equations now become

$$\begin{aligned}&{\mathbf {V}}_{p}^{T}\left( \begin{array}{ccccc} {\mathbf {B}}^{n}\left( {\mathbf {s}}^{n-1}\right) &{} {\mathbf {0}} &{} {\mathbf {C}} &{} {\mathbf {D}} &{} {\mathbf {0}}\\ {\mathbf {0}} &{} {\mathbf {B}}_{w}^{n}\left( {\mathbf {s}}^{n-1}\right) &{} {\mathbf {C}}_{w} &{} {\mathbf {0}} &{} {\mathbf {D}}_{w,N}\\ {\mathbf {C}}^{T} &{} {\mathbf {C}}_{w}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}}\\ {\mathbf {D}}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}}\\ {\mathbf {0}} &{} {\mathbf {D}}_{w,N}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}} \end{array}\right) {\mathbf {V}}_{p}\hat{\mathbf {x}}_{p}^{n} \\&\quad ={\mathbf {V}}_{p}^{T}\left( \left( \begin{array}{c} {\mathbf {0}}\\ -{\mathbf {D}}_{w,D}{\mathbf {p}}_{w,D}^{n}\left( {\mathbf {u}}^{n}\right) \\ {\mathbf {0}}\\ {\mathbf {0}}\\ -{\mathbf {q}}_{tot,N}^{n}\left( {\mathbf {u}}^{n}\right) \end{array}\right) -\left( \begin{array}{ccccc} {\mathbf {B}}^{n}\left( {\mathbf {s}}^{n-1}\right) &{} {\mathbf {0}} &{} {\mathbf {C}} &{} {\mathbf {D}} &{} {\mathbf {0}}\\ {\mathbf {0}} &{} {\mathbf {B}}_{w}^{n}\left( {\mathbf {s}}^{n-1}\right) &{} {\mathbf {C}}_{w} &{} {\mathbf {0}} &{} {\mathbf {D}}_{w,N}\\ {\mathbf {C}}^{T} &{} {\mathbf {C}}_{w}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}}\\ {\mathbf {D}}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}}\\ {\mathbf {0}} &{} {\mathbf {D}}_{w,N}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}} \end{array}\right) \bar{\mathbf {x}}_{p}\right) ,\end{aligned}$$
(26)
$$ \hat{\mathbf {s}}^{n}=\hat{\mathbf {s}}^{n-1}+\triangle t\,{\mathbf {V}}_{s}^{T}{\mathbf {D}}_{PV}^{-1}\left( {\mathbf {A}}\left( {\mathbf {v}}^{n}\right) \, f_{w}\left( {\mathbf {V}}_{s}\hat{\mathbf {s}}^{n}+\bar{\mathbf {s}}\right) +{\mathbf {q}}\left( {\mathbf {v}}^{n}\right) _{+}\right) . $$
(27)

Similarly, we take snapshots of the solutions of the adjoint Eqs. (11) and (12) and obtain reduced-order adjoint equations. The reduced-order adjoint pressure and saturation equations are, respectively,

$$\begin{aligned}&{\mathbf {V}}_{ap}^{T}\left( \begin{array}{ccccc} {\mathbf {B}}^{n}\left( {\mathbf {s}}^{n-1}\right) &{} {\mathbf {0}} &{} {\mathbf {C}} &{} {\mathbf {D}} &{} {\mathbf {0}}\\ {\mathbf {0}} &{} {\mathbf {B}}_{w}^{n}\left( {\mathbf {s}}^{n-1}\right) &{} {\mathbf {C}}_{w} &{} {\mathbf {0}} &{} {\mathbf {D}}_{w,N}\\ {\mathbf {C}}^{T} &{} {\mathbf {C}}_{w}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}}\\ {\mathbf {D}}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}}\\ {\mathbf {0}} &{} {\mathbf {D}}_{w,N}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}} \end{array}\right) {\mathbf {V}}_{ap}\left( \begin{array}{c} \hat{\varvec{\lambda }}_{v}^{n}\\ \hat{\varvec{\lambda }}_{q_{w}}^{n}\\ \hat{\varvec{\lambda }}_{p}^{n}\\ \hat{\varvec{\lambda }}_{\pi }^{n}\\ \hat{\varvec{\lambda }}_{p_{w,N}}^{n} \end{array}\right) \\&\quad ={\mathbf {V}}_{ap}^{T}\left\{ \left( \begin{array}{c} \left( \frac{\partial {\mathbf {i}}^{n}}{\partial {\mathbf {v}}^{n}}\right) ^{T}{\varvec{\lambda }}_{s}^{n}- \left( \frac{\partial {\mathcal {J}}}{\partial {\mathbf {v}}^{n}}\right) ^{T}\\ \left( \frac{\partial {\mathcal {J}}^{n}}{\partial {\mathbf {q}}_{w}^{n}}\right) ^{T}\\ {\mathbf {0}}\\ {\mathbf {0}}\\ {\mathbf {0}} \end{array}\right) -\left( \begin{array}{ccccc} {\mathbf {B}}^{n}\left( {\mathbf {s}}^{n-1}\right) &{} {\mathbf {0}} &{} {\mathbf {C}} &{} {\mathbf {D}} &{} {\mathbf {0}}\\ {\mathbf {0}} &{} {\mathbf {B}}_{w}^{n}\left( {\mathbf {s}}^{n-1}\right) &{} {\mathbf {C}}_{w} &{} {\mathbf {0}} &{} {\mathbf {D}}_{w,N}\\ {\mathbf {C}}^{T} &{} {\mathbf {C}}_{w}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}}\\ {\mathbf {D}}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}}\\ {\mathbf {0}} &{} {\mathbf {D}}_{w,N}^{T} &{} {\mathbf {0}} &{} {\mathbf {0}} &{} {\mathbf {0}} \end{array}\right) \left( \begin{array}{c} {\varvec{\bar{\lambda }}}_{v}\\ {\varvec{\bar{\lambda }}}_{q_{w}}\\ 
{\varvec{\bar{\lambda }}}_{p}\\ {\varvec{\bar{\lambda }}}_{\pi }\\ {\varvec{\bar{\lambda }}}_{p_{w,N}} \end{array}\right) \right\} ,\end{aligned}$$
(28)
$$\begin{aligned}{\mathbf {V}}_{as}^{T}\left( {\mathbf {I}}-\triangle t\left( \frac{\partial {\mathbf {i}}^{n}}{\partial {\mathbf {s}}^{n}}\right) ^{T}\right) {\mathbf {V}}_{as}\hat{\varvec{\lambda }}_{s}^{n} =&\, {\mathbf {V}}_{as}^{T}\left\{ {\varvec{\lambda }}_{s}^{n+1}-\left( \frac{\partial {\mathcal {J}}^{n}}{\partial {\mathbf {s}}^{n}}\right) ^{T}-\left( \frac{\partial }{\partial {\mathbf {s}}^{n}}\left( {\mathbf {B}}^{n+1}{\mathbf {v}}^{n+1}\right) \right) ^{T}{\varvec{\lambda }}_{v}^{n+1}\right\} \\& + {\mathbf {V}}_{as}^{T}\left\{ \left( \frac{\partial }{\partial {\mathbf {s}}^{n}}\left( {\mathbf {B}}_{w}^{n+1}{\mathbf {q}}_{w}^{n+1}\right) \right) ^{T}{\varvec{\lambda }}_{q_{w}}^{n+1}\right\} \\& - {\mathbf {V}}_{as}^{T}\left\{ \left( {\mathbf {I}}-\triangle t\left( \frac{\partial {\mathbf {i}}^{n}}{\partial {\mathbf {s}}^{n}}\right) ^{T}\right) \bar{\varvec{\lambda }}_{s}\right\} , \end{aligned}$$
(29)

where \({\mathbf {V}}_{ap}\) and \({\mathbf {V}}_{as}\) are the basis functions for the adjoint pressure and saturation equations, respectively, and \(\bar{\varvec{\lambda }}\) is the snapshot average of the Lagrange multipliers, i.e., of the adjoint solutions. As seen in the reduced-order Eqs. (26), (27), (28) and (29), the high-fidelity equations must be reconstructed in order to solve the reduced-order equations. After solving the reduced-order equations, we reconstruct the high-fidelity solution through the transformation (24). This inevitably adds overhead to the computational time. Nonetheless, the CPU time is still lower than that of the high-fidelity model run.
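The structure of (26) can be checked with a short derivation. As a sketch, write the high-fidelity pressure system at step \(n\) compactly as \({\mathbf {M}}^{n}{\mathbf {x}}_{p}^{n}={\mathbf {b}}^{n}\), where \({\mathbf {M}}^{n}\) and \({\mathbf {b}}^{n}\) are shorthand introduced here for the block matrix and the right-hand side vector appearing in (26), and assume that the transformation (24) has the mean-shifted form \({\mathbf {x}}_{p}^{n}\approx {\mathbf {V}}_{p}\hat{\mathbf {x}}_{p}^{n}+\bar{\mathbf {x}}_{p}\), which is consistent with the argument of \(f_{w}\) in (27). Substituting and projecting onto the columns of \({\mathbf {V}}_{p}\) gives

$$\begin{aligned} {\mathbf {M}}^{n}\left( {\mathbf {V}}_{p}\hat{\mathbf {x}}_{p}^{n}+\bar{\mathbf {x}}_{p}\right)&\approx {\mathbf {b}}^{n}\\ \Longrightarrow \quad {\mathbf {V}}_{p}^{T}{\mathbf {M}}^{n}{\mathbf {V}}_{p}\,\hat{\mathbf {x}}_{p}^{n}&={\mathbf {V}}_{p}^{T}\left( {\mathbf {b}}^{n}-{\mathbf {M}}^{n}\bar{\mathbf {x}}_{p}\right) , \end{aligned}$$

which has the form of (26): a system of size \(\ell _{p}\times \ell _{p}\) whose assembly still requires the \(n_{p}\times n_{p}\) matrix \({\mathbf {M}}^{n}\).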

2.3.2 POD-DEIM

Let us consider the water saturation Eq. (7) in the following form

$$ {\mathbf {s}}^{n+1}={\mathbf {s}}^{n}+\triangle t^{n}\,{\mathbf {D}}_{PV}^{-1}\left( {\mathbf {A}}\left( {\mathbf {v}}^{n}\right) \, f_{w}\left( {\mathbf {s}}^{n+1}\right) +{\mathbf {q}}\left( {\mathbf {v}}^{n}\right) _{+}\right) . $$
(30)

This equation is solved implicitly for time step \(n+1\) using the Newton-Raphson method, that is, we seek the root of

$$0\equiv G\left( {\mathbf {s}}^{n+1}\right) ={\mathbf {s}}^{n+1}-{\mathbf {s}}^{n}-\triangle t^{n}\,{\mathbf {D}}_{PV}^{-1}\left( {\mathbf {A}}\left( {\mathbf {v}}^{n}\right) \, f_{w}\left( {\mathbf {s}}^{n+1}\right) +{\mathbf {q}}\left( {\mathbf {v}}^{n}\right) _{+}\right) .$$
(31)

Given an initial guess \(\tilde{\mathbf {s}}\), a first-order Taylor expansion approximates the equation above by

$$ 0 = G\left( {{\mathbf{s}}^{{n + 1}} } \right) \approx G\left( {\widetilde{{\mathbf{s}}}} \right) + G^{\prime}\left( {\widetilde{{\mathbf{s}}}} \right)\left( {{\mathbf{s}}^{{n + 1}} - \widetilde{{\mathbf{s}}}} \right), $$
(32)

where the solution \({\mathbf {s}}^{n+1}\) is obtained through \({\mathbf {s}}^{n+1}=\tilde{\mathbf {s}}+d\tilde{\mathbf {s}}\). The update \(d\tilde{\mathbf {s}}={\mathbf {s}}^{n+1}-\tilde{\mathbf {s}}\) satisfies the linear equation \(-G'\left( \tilde{\mathbf {s}}\right) d\tilde{\mathbf {s}}=G\left( \tilde{\mathbf {s}}\right) \). The Jacobian \(G'\left( \tilde{\mathbf {s}}\right) \) is

$$G'\left( \tilde{\mathbf {s}}\right) ={\mathbf {I}}-\triangle t^{n}\,{\mathbf {D}}_{PV}^{-1}{\mathbf {A}}\left( {\mathbf {v}}^{n}\right) \, f_{w}^{'}\left( \tilde{\mathbf {s}}\right) .$$

Note that the terms \(f_{w}^{'}\left( \tilde{\mathbf {s}}\right) \) and \(f_{w}\left( {\mathbf {s}}^{n+1}\right) \) are evaluated componentwise, which means that they are evaluated at each grid block.
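The Newton-Raphson update can be illustrated on a single grid block. The following is a minimal sketch, not the implementation used in this work: the water cut built from quadratic relative permeabilities, the scalar coefficient `a` standing in for one entry of \({\mathbf {A}}\left( {\mathbf {v}}^{n}\right) \), and all function and parameter names are assumptions made for illustration.

```python
def fractional_flow(s, mobility_ratio=1.0):
    """Hypothetical water cut f_w(s) from quadratic relative permeabilities."""
    return s**2 / (s**2 + mobility_ratio * (1.0 - s) ** 2)


def fractional_flow_derivative(s, mobility_ratio=1.0):
    """Analytical derivative f_w'(s) of the water cut above."""
    denom = s**2 + mobility_ratio * (1.0 - s) ** 2
    return 2.0 * mobility_ratio * s * (1.0 - s) / denom**2


def newton_saturation_step(s_old, dt, pore_volume, a, q_in, tol=1e-12, max_iter=50):
    """Solve G(s) = s - s_old - dt/pv * (a*f_w(s) + q_in) = 0 for one cell."""
    s = s_old  # initial guess: saturation from the previous time step
    for _ in range(max_iter):
        g = s - s_old - dt / pore_volume * (a * fractional_flow(s) + q_in)
        if abs(g) < tol:
            return s
        # Scalar Jacobian G'(s) = 1 - dt/pv * a * f_w'(s)
        g_prime = 1.0 - dt / pore_volume * a * fractional_flow_derivative(s)
        s += -g / g_prime  # update ds obtained from -G'(s) ds = G(s)
    raise RuntimeError("Newton-Raphson did not converge")


# Illustrative numbers only: outflow coefficient a < 0, injection source q_in > 0.
s_new = newton_saturation_step(s_old=0.2, dt=0.1, pore_volume=1.0, a=-1.0, q_in=1.0)
residual = s_new - 0.2 - 0.1 * (-fractional_flow(s_new) + 1.0)
```

In the full model the same loop runs on vectors, and the scalar division becomes a linear solve with the Jacobian matrix.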

The reduced-order water saturation equation can be written as follows

$$\hat{\mathbf {s}}^{n+1}=\hat{\mathbf {s}}^{n}+\Delta t^{n}\,{\mathbf {D}}_{PV}^{-1}\left( \underset{\ell _{s}\times n_{s}}{\underbrace{\mathbf {V}_{s}^{T}}}\underset{n_{s}\times n_{s}}{\underbrace{\mathbf {A}\left( {\mathbf {v}}^{n}\right) }}\underset{n_{s}\times 1}{\underbrace{f_{w}\left( {\mathbf {V}}_{s}\hat{\mathbf {s}}^{n+1}+\bar{\mathbf {s}}\right) }}+{\mathbf {V}}_{s}^{T}q\left( {\mathbf {v}}^{n}\right) _{+}\right) .$$
(33)

As seen above, we still need to evaluate the nonlinear term, which in this case is the water cut \(f_{w}\left( {\mathbf {s}}^{n+1}\right) \), in the high-fidelity dimension \(n_{s}\). Similarly, when solving (32) in the reduced-order space, we evaluate the Jacobian in the high-fidelity dimension. This is undesirable, because the cost of each Newton iteration then still scales with \(n_{s}\). To mitigate this, we construct another reduced-order model for the water cut term. This is where POD-DEIM Chaturantabut and Sorensen (2010) comes into play. The method projects the nonlinear term onto a lower-dimensional subspace, such that

$${\mathbf {f}}\left( \tau \right) \simeq {\varvec{\varPhi }}c \left( \tau \right) +\bar{\mathbf {f}}$$
(34)

where \({\varvec{\varPhi }}=\left[ \varPhi _{1},\ldots ,\varPhi _{m}\right] \in {\mathbb {R}}^{n_{x}\times m}\), \(c\left( \tau \right) \) is the corresponding coefficient vector, and \(\bar{\mathbf {f}}\) is the average value of the nonlinear term over the snapshots.

The vector \(c\left( \tau \right) \) is determined by selecting \(m\) appropriate rows from the overdetermined system \({\mathbf {f}}\left( \tau \right) \simeq {\varvec{\varPhi }}c\left( \tau \right) +\bar{\mathbf {f}}\). The selection is done by a matrix \({\mathbf {P}}=\left[ {\mathbf {e}}_{\wp _{1}},\ldots ,{\mathbf {e}}_{\wp _{m}}\right] \in {\mathbb {R}}^{n_{x}\times m}\), where \({\mathbf {e}}_{j}\) is the \(j\)-th column of the identity matrix. If \({\mathbf {P}}^{T}{\varvec{\varPhi }}\) is invertible, the coefficient vector \(c\left( \tau \right) \) can be determined from

$${\mathbf {P}}^{T}{\mathbf {f}}\left( \tau \right) ={\mathbf {P}}^{T}{\varvec{\varPhi }}c\left( \tau \right) +{\mathbf {P}}^{T}\overline{\mathbf {f}},$$
(35)

which, after rearrangement, gives

$$c\left( \tau \right) =\left( {\mathbf {P}}^{T}{\varvec{\varPhi }}\right) ^{-1} {\mathbf {P}}^{T}\left( {\mathbf {f}}\left( \tau \right) -\overline{\mathbf {f}}\right) .$$
(36)

Finally, the high-fidelity nonlinear approximation of (34) is

$${\mathbf {f}}\left( \tau \right) \simeq {\varvec{\varPhi }}c\left( \tau \right) +\overline{\mathbf {f}}\,=\left\{ \underset{n_{x}\times m}{\underbrace{\varvec{\varPhi }\left( {\mathbf {P}}^{T}{\varvec{\varPhi }} \right) ^{-1}}}\underset{m\times 1}{\underbrace{\mathbf {P}^{T}\left( {\mathbf {f}}\left( \tau \right) -\overline{\mathbf {f}}\right) }}\right\} +\overline{\mathbf {f}}.$$
(37)

From this equation, we now need to construct the matrices \({\varvec{\varPhi }}\) and \({\mathbf {P}}\). \({\varvec{\varPhi }}\) is taken as the POD basis of the water cut \(f_{w}\), while the matrix \({\mathbf {P}}\) (the interpolation indices) is determined by Algorithm 1. The max in the algorithm refers to the MATLAB function max. Therefore, \(\left[ \begin{array}{cc} \rho&\wp _{\ell }\end{array}\right] ={\mathrm{max}}\left\{ \left| r\right| \right\} \) means \(\rho =\left| r_{\wp _{\ell }}\right| ={\mathrm{max_{i=1,\ldots ,n}}}\left\{ \left| r_{i}\right| \right\} \).

figure a
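The index selection and the interpolation (36) and (37) can be sketched in a few lines. Algorithm 1 is not reproduced in this text, so the sketch below assumes it follows the greedy procedure published by Chaturantabut and Sorensen (2010): each new index is the position of the largest absolute entry of the residual of the next basis vector. The helper names, the zero-based indices, and the small dense solver are illustrative choices, and \({\mathbf {P}}^{T}{\varvec{\varPhi }}\) is assumed invertible as stated above.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def deim_indices(basis):
    """Greedy interpolation indices; `basis` is a list of m column vectors."""
    first = basis[0]
    indices = [max(range(len(first)), key=lambda i: abs(first[i]))]
    for l in range(1, len(basis)):
        u = basis[l]
        # Solve (P^T U) c = P^T u_l, with U holding the first l basis vectors.
        ptu = [[basis[j][p] for j in range(l)] for p in indices]
        c = solve_linear(ptu, [u[p] for p in indices])
        # Residual r = u_l - U c vanishes at the indices already selected.
        r = [u[i] - sum(basis[j][i] * c[j] for j in range(l)) for i in range(len(u))]
        indices.append(max(range(len(r)), key=lambda i: abs(r[i])))
    return indices


def deim_approximation(basis, indices, f_at_points, f_mean):
    """Evaluate (37) using the nonlinear term only at the interpolation points."""
    m = len(basis)
    ptphi = [[basis[j][p] for j in range(m)] for p in indices]
    rhs = [f_at_points[k] - f_mean[p] for k, p in enumerate(indices)]
    c = solve_linear(ptphi, rhs)  # coefficient vector from (36)
    return [f_mean[i] + sum(basis[j][i] * c[j] for j in range(m))
            for i in range(len(f_mean))]


# Tiny made-up example: n_x = 4 grid blocks, m = 2 basis vectors.
phi = [[0.5, 1.0, 0.2, 0.1], [1.0, 1.0, 0.0, 0.3]]
points = deim_indices(phi)
f_bar = [0.1, 0.1, 0.1, 0.1]
f_full = [0.1, 1.1, 0.5, 0.0]  # equals f_bar + 2*phi[0] - phi[1]
f_deim = deim_approximation(phi, points, [f_full[p] for p in points], f_bar)
```

Because `f_full` lies in the span of the basis shifted by the mean, the interpolation reproduces it up to rounding error; for a general nonlinear term the result is only an approximation.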

We apply POD-DEIM only to the forward saturation equation, since this is the only equation that contains the nonlinear water cut term. The reduced-order water saturation equation then becomes

$$\hat{\mathbf {s}}^{n+1}=\hat{\mathbf {s}}^{n}+\triangle t^{n}{\mathbf {D}}_{PV}^{-1}\left( {\varvec{\varUpsilon }}+ {\mathbf {V}}_{s}^{T}{\mathbf {q}}\left( {\mathbf {v}}^{n}\right) _{+}\right) ,$$
(38)

where \({\varvec{\varUpsilon }}=\underset{\ell _{s}\times m}{\underbrace{{\mathbf {V}}_{s}^{T}{\mathbf {A}}\left( {\mathbf {v}}^{n}\right) {\varvec{\varPhi }}\left( {\mathbf {P}}^{T}{\varvec{\varPhi }}\right) ^{-1}}} \underset{m\times 1}{\,\underbrace{\mathbf {P}^{T}\left( f_{w} \left( {\mathbf {V}}_{s}\hat{\mathbf {s}}^{n+1}+\overline{\mathbf {s}}\right) -\overline{f}_{w}\right) }}\). Here, by using the interpolation matrix \({\mathbf {P}}\), we evaluate the nonlinear water cut term \(f_{w}\) only at the \(m\) interpolation points rather than at all \(n_{s}\) grid blocks of the high-fidelity space.

One may notice that the pressure Eq. (6) also depends nonlinearly on the saturation, through the term \({\mathbf {B}}^{n}\left( {\mathbf {s}}^{n-1}\right) \), which involves the water saturation from the previous time step. Because this saturation is already known at step \(n\), (6) is linear in its unknowns, and since the POD method has proved to work well for linear terms, we do not apply POD-DEIM to the pressure equation. Nevertheless, this opens an opportunity for future investigation. For later use, we write the reduced-order counterpart of (8) in the implicit form \({\mathbf {F}}\left( \tilde{\mathbf {x}}_{r},\tilde{\mathbf {u}}\right) =0\), where

$$\begin{aligned} {\mathbf {F}}\left( \tilde{\mathbf {x}}_{r},\tilde{\mathbf {u}}\right)&= \left( \begin{array}{c} {\mathbf {F}}^{1}\left( \hat{\mathbf {p}}^{1},\hat{\mathbf {s}}^{0},\hat{\mathbf {s}}^{1},{\mathbf {u}}^{1}\right) \\ \vdots \\ {\mathbf {F}}^{N}\left( \hat{\mathbf {p}}^{N},\hat{\mathbf {s}}^{N-1},\hat{\mathbf {s}}^{N},{\mathbf {u}}^{N}\right) \end{array}\right) \\ \hat{\mathbf {x}}^{nT}&= \left( \hat{\mathbf {p}}^{nT},\hat{\mathbf {s}}^{nT}\right) ,\quad n=1,\ldots ,N, \\ \tilde{\mathbf {x}}_{r}^{T}&= \left( \begin{array}{ccc} \hat{\mathbf {x}}^{1T},&\ldots ,&\hat{\mathbf {x}}^{NT}\end{array}\right) , \\ \tilde{\mathbf {u}}^{T}&= \left( \begin{array}{ccc} {\mathbf {u}}^{1T},&\ldots ,&{\mathbf {u}}^{NT}\end{array}\right) . \end{aligned}$$
(39)

\(\tilde{\mathbf {x}}_{r}\) is the stacked reduced-order state vector from the solution of the forward equations.

3 Production optimization problem

By injecting water into reservoirs, cumulative oil production may increase. This is cast as the following optimization problem using a reduced-order model

$$\begin{aligned} (\hat{\mathcal {P}}):\quad&\underset{\tilde{\mathbf {u}}\in {\mathbb {R}}^{n_{\tilde{\mathbf {u}}}}}{\mathrm{max}}\;{\mathcal {J}}\left( \tilde{\mathbf {x}}_{r},\tilde{\mathbf {u}}\right) \\ {\mathrm{subject\, to:}}\quad&{\mathbf {F}}\left( \tilde{\mathbf {x}}_{r},\tilde{\mathbf {u}}\right) =0,\\&g\left( {\mathbf {u}}^{n}\right) \ge 0,\quad \forall n=1,\ldots ,N,\\&h\left( \hat{\mathbf {x}}^{n},{\mathbf {u}}^{n}\right) \ge 0,\quad \forall n=1,\ldots ,N,\\&{\mathbf {x}}^{0}\,{\mathrm{is\, given}}. \end{aligned}$$

The functions \({\mathcal {J}}\left( \tilde{\mathbf {x}}_{r},\tilde{\mathbf {u}}\right) \), \(g({\mathbf {u}}^{n})\), and \(h\left( {\hat{\mathbf{x}}}^{n},{\mathbf {u}}^{n}\right) \) are assumed to be \({\mathcal {C^{\mathrm{1}}}}\). The control input and the state constraints are represented by \(g:{\mathbb {R}}^{n_{\tilde{u}}}\rightarrow {\mathbb {R}}^{n_{g}}\) and \(h:{\mathbb {R}}^{n_{x}\times n_{\tilde{u}}}\rightarrow {\mathbb {R}}^{n_{h}}\), respectively. The objective function is given by \({\mathcal {J}}:{\mathbb {R}}^{n_{\tilde{x}_{r}}\times n_{\tilde{u}}}\rightarrow {\mathbb {R}}\), and the state equations are posed as implicit constraints. Because the state variables are determined by the control inputs through the state equations, we can perform the optimization in the control input space of \(\widetilde{\mathbf {u}}\) instead of in the space of \(\left( \tilde{\mathbf {x}}_{r},\tilde{\mathbf {u}}\right) \). To this end, we write the objective as \({\mathcal {J}}\left( \tilde{\mathbf {u}}\right) \), which is shorthand for \({\mathcal {J}}\left( \tilde{\mathbf {x}}_{r}\left( \tilde{\mathbf {u}}\right) ,\tilde{\mathbf {u}}\right) \). In this work, we use the recovery factor (RF) as the objective function

$${\mathcal {J}}\left( \tilde{\mathbf {u}}\right) =\frac{\sum _{i=1}^{N_{gb}} {\mathbf {D}}_{PV_{i}}\left( 1-s_{i}^{N}\left( \tilde{\mathbf {u}}\right) \right) }{V_{gb}} \times 100,$$
(40)

where \({\mathbf {D}}_{PV_{i}}\) is the \(i\)-th diagonal element of the pore volume matrix, \(s_{i}^{N}\left( \tilde{\mathbf {u}}\right) \) is the water saturation at grid block \(i\) at the final time step \(N\), \(\tilde{\mathbf {u}}\) is the control input, in this case the well rates, \(N_{gb}\) is the number of grid blocks in the reservoir, and \(V_{gb}\) is the total pore volume of the reservoir. In other words, the recovery factor represents the percentage of oil that can be produced from the reservoir. Water flooding normally gives a recovery factor between 20 and 40 %, meaning that 20–40 % of the oil is extracted from the reservoir.

The optimization problem in high-fidelity is described as

$$\begin{aligned} ({\mathcal {P}}):\quad&\underset{\tilde{\mathbf {u}}\in {\mathbb {R}}^{n_{\tilde{\mathbf {u}}}}}{\mathrm{max}}\;{\mathcal {J}}\left( \tilde{\mathbf {x}},\tilde{\mathbf {u}}\right) \\ {\mathrm{subject\, to:}}\quad&{\mathbf {F}}\left( \tilde{\mathbf {x}},\tilde{\mathbf {u}}\right) =0,\\&g\left( {\mathbf {u}}^{n}\right) \ge 0,\quad \forall n=1,\ldots ,N,\\&h\left( {\mathbf {x}}^{n},{\mathbf {u}}^{n}\right) \ge 0,\quad \forall n=1,\ldots ,N,\\&{\mathbf {x}}^{0}\,{\mathrm{is\, given}}. \end{aligned}$$

Let \({\mathcal {U}}_{opt}\) and \(\hat{\mathcal {U}}_{opt}\) be the solutions of the optimization problem in high-fidelity \({\mathcal {P}}\) and reduced-space \(\hat {\mathcal {P}}\), respectively. In Hinze and Volkwein (2005), the error between \({\mathcal {U}}_{opt}\) and \(\hat{\mathcal {U}}_{opt}\) is estimated as

$$\left\| {\mathcal {U}}_{opt}-\hat{\mathcal {U}}_{opt}\right\| \le c_{p}\left\{ \left\| {\mathbf {x}}^{0}-{\varvec{\varPhi }}\hat{\mathbf {x}}^{0}\right\| +\left\| \tilde{\mathbf {x}}-{\varvec{\varPhi }}\tilde{\mathbf {x}}_{r}\right\| +\left\| {\varvec{\lambda }}_{\tilde{\mathbf {x}}}\left( \tilde{\mathbf {x}}\left( \tilde{\mathbf {u}}\right) \right) -{\varvec{\varPhi }}\left( {\varvec{\lambda }}_{\tilde{\varvec{x}}}\left( \tilde{\mathbf {x}}\left( \tilde{\mathbf {u}}\right) \right) \right) \right\| +\sqrt{\sum _{i=\ell +1}^{d}{\varvec{\lambda }}_{i}}\right\} .$$
(41)

Here \(c_{p}\) is a positive constant, \({\varvec{\varPhi }}\) contains the basis functions (eigenvectors) obtained from the POD method, \({\varvec{\lambda }}_{\tilde{\mathbf {x}}}\) is the Lagrange multiplier in the adjoint Eqs. (11) and (12), and the \({\varvec{\lambda }}_{i}\) in the last term are the eigenvalues discarded by the POD truncation, as in (22).

The second term of the error estimate (41), \(\left\| \tilde{\mathbf {x}}-{\varvec{\varPhi }}\tilde{{\mathbf {x}}}_{r}\right\| \), can be reduced by taking snapshots of the state equations. Similarly, the third term, \(\left\| {\varvec{\lambda }}_{\tilde{{\mathbf {x}}}}\left( \tilde{{\mathbf {x}}}\left( \tilde{{\mathbf {u}}}\right) \right) -{\varvec{\varPhi }}\left( {\varvec{\lambda }}_{\tilde{\varvec{x}}}\left( \tilde{{\mathbf {x}}}\left( \tilde{{\mathbf {u}}}\right) \right) \right) \right\| \), can be reduced by taking snapshots of the adjoint equations. We follow this approach, which is explained in the next section.

4 Solution method

In this section we explain how to use a reduced-order model to solve the optimization problem \(\hat {\mathcal {P}}\).

4.1 Trust-region POD

The optimization is performed using a reduced-order model, which is known as surrogate optimization. The principle of surrogate optimization is depicted in Fig. 1. Many simulation runs are needed during the course of the optimization, so the purpose of the reduced-order model is to reduce the simulation runtime. As mentioned in the introduction, there are alternative ways to construct reduced-order models; here we use POD and DEIM as described in the previous section.

Fig. 1 Optimization in reduced space (surrogate optimization). Optimization is performed using reduced-order models (ROMs), and the ROMs are updated according to the trust-region rule. This figure is modified after Alexandrov et al. (2001)

To maintain the quality of the reduced-order models, we apply a trust-region framework. The trust-region framework is used as the globalization strategy in gradient-based optimization Conn et al. (2000). In a trust-region globalization strategy, a quadratic model is used to approximate the objective function, whereas in the surrogate optimization trust-region framework, which is called the trust-region POD (TRPOD) method, one builds a POD-based reduced-order model instead. In principle, the method enlarges the trust region when a good approximation is obtained, shrinks it when the approximation quality is poor, and keeps it unchanged when the approximation quality is the same as at the previous iteration. The quality of the approximation is measured by checking the value of the objective function in the high-fidelity model. The method terminates when a stopping criterion is met. The details of this method are given in Algorithm 2, with some remarks below.

During the \(k\)-th iteration the TRPOD method solves the following subproblem

$$\begin{aligned} \underset{\varvec{\delta }\in {\mathbb {R}}^{n_{{\tilde{\mathbf{u}}}}}}{\mathrm{max}}&{\mathcal {J}}_{b,k}^{\mathcal {R}}\left( \tilde{{\mathbf {u}}}+{\varvec{\delta }}\right) \\ {\mathrm{s.t.}}\,&{\mathbf {F}}_{k}\left( \tilde{{\mathbf {x}}}_{r},\tilde{{\mathbf {u}}}+{\varvec{\delta }}\right)=0 \\&g_{k}\left( {\mathbf {u}}^{n}+{\varvec{\delta }}\right)\ge 0,\quad\forall n=1,\ldots ,N \\&h_{k}\left( {\mathbf {\hat{x}}}^{n},{\mathbf {u}}^{n}+{\varvec{\delta }}\right)\ge 0,\quad\forall n=1,\ldots ,N \\&\left\| {\varvec{\delta }}\right\| _{\infty }\le \triangle _{k}. \end{aligned}$$
(42)

The optimization variables in the subproblem are the steps \({\varvec{{\delta }}}\), whose length, measured in the infinity norm, is bounded by the trust-region radius \(\triangle _{k}\). The implicit form of the reservoir dynamics in the subproblem is

$${\mathbf {F}}_{k}\left( \tilde{\mathbf {x}}_{r},\tilde{\mathbf {u}}+{\varvec{\delta }}\right) =\left( \begin{array}{c} {\mathbf {F}}^{1}\left( \hat{\mathbf {p}}^{1},\hat{\mathbf {s}}^{0},\hat{\mathbf {s}}^{1},{\mathbf {u}}_{k}^{1}+{\varvec{\delta }}_{k}^{1}\right) \\ \vdots \\ {\mathbf {F}}^{N}\left( \hat{\mathbf {p}}^{N},\hat{\mathbf {s}}^{N-1},\hat{\mathbf {s}}^{N},{\mathbf {u}}_{k}^{N}+{\varvec{\delta }}_{k}^{N}\right) \end{array}\right) .$$
(43)

\({\mathcal {J}}_{b,k}^{\mathcal {R}}\) is the modified objective function using the Lagrangian barrier method (44), explained in the next subsection, evaluated using the (forward) reduced-order model.

figure b

Remark 1

The subproblem (42) is not the standard quadratic model approximation as in the trust-region globalization strategy. Instead, it is an approximation of the high-fidelity model. In Fahl (2000), an algorithm based on the Cauchy condition is used to solve the subproblem. In this work, we use the KNITRO optimization package Byrd et al. (2006) for finding optimal steps \({\varvec{{\delta }}}_{k}\).

Remark 2

The stopping criteria applied in high-fidelity optimization are usually the absolute change in the objective function or the constraint violation, as described in Algorithm 3. In the TRPOD method, the stopping criterion is based on the trust-region radius: if the trust-region radius is less than the minimum trust-region radius \(\triangle _{min}\), the optimization is terminated. Since the objective function of the surrogate model can be lower (in the maximization case) than in the previous iteration, which may yield a negative value of \(\rho _{k}\), we take the absolute value of \(\rho _{k}\). Moreover, the value of \(\rho _{k}\) can be infinite due to constraint violation; in that case we reduce the trust-region radius.

Remark 3

The bound constraint in the high-fidelity optimization, \(g\left( {\mathbf {u}}^{n}\right) \), is adjusted in the reduced-space optimization because of the infinity-norm constraint on the steps \({\varvec{{\delta }}}_{k}\). The optimization on the surrogate model may be stopped if the bound constraint of the high-fidelity optimization is violated.
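One way to read this adjustment is as the intersection of two boxes: the original bounds on the controls, shifted by the current iterate, and the trust-region box of half-width \(\triangle _{k}\). The sketch below illustrates that reading only; the function name, the argument layout, and the numbers are hypothetical, and the actual adjustment used in this work may differ.

```python
def step_bounds(u, lower, upper, radius):
    """Bounds on delta so that lower <= u + delta <= upper and |delta_i| <= radius."""
    delta_min = [max(lb - ui, -radius) for ui, lb in zip(u, lower)]
    delta_max = [min(ub - ui, radius) for ui, ub in zip(u, upper)]
    return delta_min, delta_max


# Illustrative controls scaled to [0, 1] with trust-region radius 0.25.
delta_min, delta_max = step_bounds(
    u=[0.0, 0.5, 0.9], lower=[0.0, 0.0, 0.0], upper=[1.0, 1.0, 1.0], radius=0.25
)
```

For a control that already sits on its lower bound, the admissible step cannot be negative, while a control close to its upper bound can only move up by the remaining distance.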

Remark 4

The trust-region parameters in the TRPOD method are chosen as follows

$$\eta _{1}=0.02,\,\eta _{2}=0.5,\,\eta _{3}=1,\,\gamma _{1}=0.25, \,\gamma _{2}=0.5,\,\gamma _{3}=1.5.$$

The small value of \(\eta _{1}=0.02\) means that we accept even a small improvement in the objective function value.
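To make the role of these parameters concrete, the following sketch shows one possible radius update. Algorithm 2 is not reproduced in this text, so the mapping from thresholds to actions (reject and shrink by \(\gamma _{1}\) below \(\eta _{1}\), accept and shrink by \(\gamma _{2}\) below \(\eta _{2}\), keep the radius below \(\eta _{3}\), enlarge by \(\gamma _{3}\) otherwise) is an assumption, as are the function names. Only the parameter values, the use of the absolute value of \(\rho _{k}\), and the shrinking of the radius for an infinite \(\rho _{k}\) come from the remarks above.

```python
import math

ETA_1, ETA_2, ETA_3 = 0.02, 0.5, 1.0        # thresholds from Remark 4
GAMMA_1, GAMMA_2, GAMMA_3 = 0.25, 0.5, 1.5  # radius factors from Remark 4


def agreement_ratio(j_hf_new, j_hf_old, j_rom_new, j_rom_old):
    """|actual change / predicted change|; infinite if no change is predicted."""
    predicted = j_rom_new - j_rom_old
    if predicted == 0.0:
        return math.inf
    return abs((j_hf_new - j_hf_old) / predicted)


def update_radius(rho, radius):
    """Return (accept_step, new_radius) under the ASSUMED mapping described above."""
    if not math.isfinite(rho) or rho < ETA_1:
        return False, GAMMA_1 * radius  # poor or undefined agreement: reject, shrink
    if rho < ETA_2:
        return True, GAMMA_2 * radius   # moderate agreement: accept, shrink
    if rho < ETA_3:
        return True, radius             # good agreement: accept, keep the radius
    return True, GAMMA_3 * radius       # very good agreement: accept, enlarge


rho = agreement_ratio(j_hf_new=10.5, j_hf_old=10.0, j_rom_new=11.0, j_rom_old=10.0)
accepted, new_radius = update_radius(rho, radius=1.0)
```

The outer loop would stop once the returned radius drops below \(\triangle _{min}\), as described in Remark 2.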

Remark 5

A discussion of the convergence and the convergence rate of the algorithm can be found in Fahl (2000) and Agarwal (2010).

To speed up the convergence of the optimization, we employ the BFGS method with the first-order gradient obtained from the adjoint method. Alternatively, one may use the SR1 update to approximate the Hessian matrix, which is quite common in trust-region schemes.

4.2 Lagrangian barrier methods

Since we also handle state constraints, in this work we employ the Lagrangian barrier method (a simplified version of Conn et al. 1997), which requires a Lagrangian barrier function

$${\mathcal {J}}_{b}\left( \widetilde{\mathbf {u}},{\varvec{\lambda }}, \mu \right) ={\mathcal {J}}\left( \widetilde{\mathbf {u}}\right) + \mu \sum _{i=1}^{n_{h}}\lambda _{i}{\mathrm{log}}\left( h_{i} \left( {\hat{\mathbf {x}}}^{n},{\mathbf {u}}^{n}\right) \right) . $$
(44)

Here \(\mu \) is the barrier parameter and \(\lambda _{i}\) are the componentwise Lagrange multiplier estimates, which are updated during the course of the optimization. The Lagrangian barrier method is described in Algorithm 3. The TRPOD method is used in step 1 of the Lagrangian barrier method. The method terminates on either the objective function criterion (the most likely case) or the constraint violation criterion. We refer to Suwartadi et al. (2012) for further discussion of the algorithm and its use in production optimization.
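Evaluating (44) is straightforward once the objective and the constraint values are available. The sketch below is illustrative only: the function name and the inputs are hypothetical, and returning minus infinity for a violated constraint is an assumption made here for a maximization problem, not a rule taken from Algorithm 3.

```python
import math


def lagrangian_barrier(objective, multipliers, constraints, mu):
    """Evaluate J_b = J + mu * sum_i lambda_i * log(h_i) as in (44)."""
    if any(h <= 0.0 for h in constraints):
        # Assumption: an infeasible point is treated as infinitely bad.
        return -math.inf
    return objective + mu * sum(
        lam * math.log(h) for lam, h in zip(multipliers, constraints)
    )


# Made-up numbers: J = 10, two state constraints, barrier parameter mu = 0.1.
j_b = lagrangian_barrier(10.0, [1.0, 2.0], [math.e, 1.0], mu=0.1)
```

The logarithm is negative for \(0<h_{i}<1\), so the barrier term lowers the objective as a constraint approaches its bound.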

figure c

5 Case examples

In this section, we present four case examples. The first case compares POD and DEIM for building reduced-order models in terms of CPU time and accuracy. The second case demonstrates how the TRPOD method works in an optimization without nonlinear output constraints. The last two cases show how the TRPOD and Lagrangian barrier methods handle nonlinear output constraints in surrogate optimization. Simulations for these case examples were run on a 64-bit Linux machine with an Intel(R) Xeon(R) CPU @ 3.00GHz. All SVD computations in the case examples are done with the svd function in MATLAB. We use the economy-size option (svd(*, 'econ')) and choose the eigenvectors corresponding to the largest singular values.

5.1 Case 1

The reservoir model in this case is taken from layer 10 of the SPE 10th comparative study Christie and Blunt (2001). The grid consists of \(60\times 220\) grid blocks, where the dimension of a grid block is \(10\,{\mathrm{ft}}\times 20\,{\mathrm{ft}}\times 2\,{\mathrm{ft}}\). The connate water saturation and residual oil saturation are zero. The porosity, for simplicity, is set homogeneously to 0.3, while the permeability is heterogeneous as depicted in Fig. 2. The oil to water mobility ratio is set to 0.2 and the initial water saturation is zero. The well configuration is a 5-spot pattern with an injector in the middle and four producers at the corners. The simulation is run for 1,200 days and the control inputs are the well rates. We divide the control horizon into 40 intervals, which means we can change the well rates every 30 days. The number of control variables is \(40\times 5=200\). The initial injection rate is set to 0.5 \(\frac{PVI}{1,200\, days}\). Moreover, the snapshots of the forward and adjoint equations are taken from the 40 control intervals.

Fig. 2 The logarithm of the permeability field in millidarcy (mD), well locations, and relative permeability curves. The well locations follow the 5-spot pattern in which four producers (Prd1, Prd2, Prd3, Prd4) are placed in the corners and one injector (Inj1) in the middle

In this case example, we compare reduced-order models obtained from the POD and DEIM methods. The reduced-order models are constructed from snapshots of the forward and adjoint equations. For the POD method, the snapshots for the forward equations comprise the pressure and water saturation solutions at the 40 control steps. For the DEIM method, we additionally need water cut snapshots representing the nonlinear term, which are also taken from the 40 control steps. Since the adjoint equations contain no nonlinear term, we apply the POD method to them, and the snapshots are the solutions of the adjoint equations. We explain the reduced-order models for both types of equations in the following subsections.

5.1.1 Forward equations

The runtime of the high-fidelity model is reported in Table 1. To build reduced-order models for the forward equations, we choose an energy level for the truncation. We vary this energy level in order to identify a suitable dimension for the reduced-order models. Furthermore, we define the error of the reduced-order model by

$${\mathcal {E}}{:=}\frac{\left\| {\mathbf {s}}^{N}-\hat{\mathbf {s}}^{N}\right\| _{2}}{\left\| {\mathbf {s}}^{N}\right\| _{2}}, $$
(45)

where \({\mathbf {s}}^{N}\) is the water saturation at the final time step \({N}\), and \(\hat{\mathbf {s}}^{N}\) is the reduced-order water saturation at the final time step.
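Both quantities used in this subsection are simple to compute. The sketch below assumes that the energy level is the fraction of the sum of the POD eigenvalues captured by the leading basis functions (for an SVD of the snapshot matrix, the eigenvalues would be the squared singular values); the precise definition is the one given with (22), which is not repeated here. The function names and the numbers are illustrative.

```python
import math


def basis_size_for_energy(eigenvalues, level):
    """Smallest number of leading eigenvalues holding at least `level` of the total."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= level:
            return count
    return len(ordered)


def relative_error(reference, approximation):
    """Relative 2-norm error between two vectors, as in (45)."""
    diff = math.sqrt(sum((r - a) ** 2 for r, a in zip(reference, approximation)))
    return diff / math.sqrt(sum(r**2 for r in reference))


# Made-up eigenvalue spectrum and saturation vectors.
n_basis_90 = basis_size_for_energy([8.0, 1.5, 0.4, 0.1], level=0.90)
n_basis_97 = basis_size_for_energy([8.0, 1.5, 0.4, 0.1], level=0.97)
err = relative_error([3.0, 4.0], [3.0, 3.0])
```

A rapidly decaying spectrum gives a small basis for a given energy level, which is why the singular value plots are inspected before choosing the truncation.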

Table 1 CPU time measured in seconds for forward equations using the high-fidelity model

Figures 3, 4, and 5 depict the singular values of the state variables (pressure and saturation) and of the nonlinear term (water cut).

Fig. 3 Singular values of pressure snapshots

Fig. 4 Singular values of water saturation snapshots

Fig. 5 Singular values of water cut snapshots

We then run simulations with varying energy truncation levels; the results are displayed in Table 2. In general, the POD method gives a significant speedup for the pressure equation compared with the high-fidelity model runtime reported in Table 1. However, only a slight CPU time reduction is obtained for the saturation equation. DEIM, on the other hand, gives a larger speedup for the saturation equation. Both the approximation errors and the CPU time speedups decrease as the number of basis functions increases.

Table 2 Comparison of POD and DEIM for varying energy truncation

Fig. 6 Comparison of water saturation at the final time for the high-fidelity model and the reduced-order models (POD and DEIM)

To show the quality of the POD and DEIM reduced-order models, Fig. 6 displays the water saturation at the final time using an energy truncation of 90 % for the pressure, 90 % for the water saturation, and 90 % for the water cut.

5.1.2 Adjoint equations

Here, we continue by varying the energy truncation of the adjoint equations. The runtime of the high-fidelity adjoint equations is reported in Table 3. Furthermore, we plot the singular values of the snapshots of the corresponding adjoint pressure and saturation equations in Figs. 7 and 8.

Table 3 CPU time measured in seconds for adjoint equations using the high-fidelity model

Fig. 7 Singular values of corresponding adjoint pressure equation snapshots

Fig. 8 Singular values of corresponding adjoint water saturation equation snapshots

Similarly, we ran simulations with varying energy truncation levels; the results are reported in Table 4. We define the error of the adjoint gradient in the reduced-order model by

$${\mathcal {E}}_{\mathrm{grad}}\,:=\frac{\left\| {\mathbf {grad}}-\hat{\mathbf {grad}}\right\| _{2}}{\left\| {\mathbf {grad}}\right\| _{2}}, $$
(46)

which compares the gradient in high-fidelity (\({\mathbf {grad}}\)) with that in reduced-order (\(\hat{\mathbf {grad}}\)).

Both the POD and DEIM methods, shown in Table 4, give a runtime speedup compared with the high-fidelity adjoint equations reported in Table 3. We run both POD and DEIM for the adjoint equations since the adjoint equations need the forward reduced-order models. Furthermore, the CPU time for the corresponding adjoint saturation equation is comparable to the high-fidelity runtime. This is because of the sparsity of the linear adjoint saturation equation: in the high-fidelity model it is solved using a sparse linear solver, whereas in a reduced-order model we lose the sparsity structure. One may get a better speedup for the adjoint saturation equation if the reservoir model has a larger number of grid blocks.

Table 4 Adjoint POD and DEIM for varying energy truncation

We also present the quality of the gradient approximation in the reduced-order models in Fig. 9, where the truncation is 90 % for both the corresponding pressure and saturation.

Fig. 9 Comparison of adjoint-gradient in high-fidelity and reduced-order models (POD and DEIM)

5.1.3 Effect of perturbations

To assess the robustness of the basis functions, we first build a reduced-order model using DEIM with 90 % energy truncation for pressure, saturation, and water cut. We then perturb the producer well rates by about 5, 10, and 20 % relative to the initial well rates used when the basis functions were constructed. As seen in Figs. 10 and 11, the reduced-order model approximates the high-fidelity model well for the 5 and 10 % perturbations. However, it is not accurate enough for the 20 % perturbation, as depicted in Fig. 12. The relative saturation error is the water saturation difference between the high-fidelity model and the reduced-order model divided by the saturation in the high-fidelity model.

Fig. 10 5 % variation of producer well rates, with a relative error in the saturation approximation of 0.021

Fig. 11 10 % variation of producer well rates, with a relative error in the saturation approximation of 0.029

Fig. 12 20 % variation of producer well rates, with a relative error in the saturation approximation of 0.052

5.2 Case 2

In this case we set up a surrogate optimization without any output constraints. The goal is to show the performance of the TRPOD method compared to optimization using a high-fidelity model. Since there are no nonlinear output constraints, the constraints appear only on the controls, that is, bound constraints and an equality constraint due to the incompressible flow (the total injector rate must equal the total producer rate). The objective function in this case is the net present value (NPV) with oil price \(80\frac{\$}{m^{3}}\), water separation cost \(19\frac{\$}{m^{3}}\), and water injection cost \(1\frac{\$}{m^{3}}\). Note that there is no augmented objective function in this case. Furthermore, we continue using the reservoir setting described in case 1, with an initial injection rate of 0.4 PVI over 1,200 simulation days (40 control intervals). The control inputs are the well rates at the producer and injector wells. The reduced-order model is built using DEIM because it requires less CPU time than POD.
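To make the price setting concrete, the sketch below evaluates an NPV from the three prices quoted above. It is only an illustration: the undiscounted form, the use of field-total rates in m^3/day, the equal interval length, and all names are assumptions of this sketch, not the exact NPV definition used in the paper.

```python
# Prices taken from the text (all in $/m^3)
OIL_PRICE = 80.0
WATER_SEPARATION_COST = 19.0
WATER_INJECTION_COST = 1.0


def npv(oil_rates, water_rates, injection_rates, dt_days):
    """Undiscounted NPV over equal-length control intervals.

    Rates are field totals in m^3/day, one value per control interval.
    The undiscounted form and the units are assumptions of this sketch.
    """
    value = 0.0
    for q_o, q_w, q_inj in zip(oil_rates, water_rates, injection_rates):
        cash_flow = (OIL_PRICE * q_o
                     - WATER_SEPARATION_COST * q_w
                     - WATER_INJECTION_COST * q_inj)
        value += dt_days * cash_flow
    return value


# Hypothetical two-interval example (30 days each), not data from the paper
example_npv = npv([10.0, 8.0], [0.0, 2.0], [10.0, 10.0], dt_days=30.0)

# Water cut at which a produced cubic metre no longer pays for its own
# separation cost (injection cost ignored): 80 * (1 - fw) = 19 * fw
break_even_water_cut = OIL_PRICE / (OIL_PRICE + WATER_SEPARATION_COST)
```

Under this simplified model the break-even water cut is 80/99, roughly 0.81, which is close to the water cut value of 0.80 that the text relates to the price setting.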

To fully capture the dynamics of the reservoir, we extend the simulation to 1,800 days, ensuring that the water cut reaches a value of 0.80 at all producer wells (based on the price setting). This extended simulation is performed when building the initial basis functions but not during the basis function updates within the TRPOD strategy. We use an energy level of 99 % for the forward pressure equation, 95 % for the saturation equation, and 95 % for the water cut. For the adjoint equations we use a 95 % energy level for the corresponding pressure and saturation equations. Using these energy truncations, the initial forward reduced-order model consists of 12, 16, and 15 basis functions for the pressure, saturation, and water cut, respectively. The interpolation points are shown in Fig. 13. The adjoint reduced-order model has 13 and 8 basis functions for the corresponding pressure and saturation equations, respectively.
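The link between an energy level and the number of retained basis functions can be sketched as follows. The sketch assumes the common POD convention in which the energy captured by the first r modes is the sum of their squared singular values divided by the sum over all modes; the function name and the singular values are hypothetical and not taken from the paper.

```python
def pod_basis_size(singular_values, energy_level):
    """Smallest r whose leading modes capture at least `energy_level` of the energy.

    Energy is measured with squared singular values (assumed convention).
    `singular_values` must be sorted in decreasing order.
    """
    if not 0.0 < energy_level <= 1.0:
        raise ValueError("energy_level must lie in (0, 1]")
    energies = [s * s for s in singular_values]
    total = sum(energies)
    cumulative = 0.0
    for r, energy in enumerate(energies, start=1):
        cumulative += energy
        if cumulative / total >= energy_level:
            return r
    return len(energies)


# Hypothetical singular values of a snapshot matrix
sigma = [10.0, 5.0, 2.0, 1.0, 0.5]
size_90 = pod_basis_size(sigma, 0.90)
size_99 = pod_basis_size(sigma, 0.99)
```

Raising the energy level can only keep or increase the basis size, which is why the 99 % truncations used for some equations produce larger reduced-order models than the 90 % or 95 % truncations.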

Fig. 13 Interpolation points (represented with D) for the nonlinear water cut term are located at grid blocks: 12076, 4285, 13031, 3445, 12622, 5495, 314, 10308, 5129, 282, 12437, 3746, 378, 7106, and 11869
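Interpolation points of this kind are produced by the DEIM point selection, one point per basis function of the nonlinear term (15 here). A minimal sketch of the standard greedy selection is given below. Whether the authors' implementation follows it in every detail is an assumption; the function name is hypothetical, NumPy is assumed to be available, and the returned indices are 0-based, unlike grid block numbering.

```python
import numpy as np


def deim_points(U):
    """Greedy DEIM interpolation indices for the columns of U (shape n x m).

    The columns of U are assumed to be linearly independent basis vectors
    of the nonlinear term, ordered by decreasing importance.
    """
    n, m = U.shape
    # First point: largest entry (in magnitude) of the first basis vector
    idx = [int(np.argmax(np.abs(U[:, 0])))]
    for l in range(1, m):
        # Interpolate the next basis vector at the points chosen so far
        c = np.linalg.solve(U[idx, :l], U[idx, l])
        # The residual vanishes at the chosen points; pick where it is largest
        residual = U[:, l] - U[:, :l] @ c
        idx.append(int(np.argmax(np.abs(residual))))
    return idx


# Toy basis whose columns are unit vectors e_2, e_0, e_4 (hypothetical)
points = deim_points(np.eye(5)[:, [2, 0, 4]])
```

For the toy basis each vector has a single nonzero entry, so the selected points are simply the positions of those entries.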

We then run the surrogate optimization. To evaluate it, we also run the optimization using the high-fidelity model. The stopping criteria are an absolute gradient tolerance of \(10^{-8}\) and an absolute step length tolerance of \(10^{-8}\). These stopping criteria apply to both the reduced-order and high-fidelity model optimizations. The surrogate optimization is run with the initial trust-region radius, \(\triangle _{0}\), and the maximum trust-region radius, \(\triangle _{max}\), both set to \(0.03\times \frac{V_{gb}}{1,200\,{\mathrm{days}}}\). This maximum trust-region radius represents the bound within which the reduced-order model is robust to control input perturbations. Moreover, the minimum trust-region radius, \(\triangle _{min}\), is \(10^{-3}\times \frac{V_{gb}}{1,200\,{\mathrm{days}}}\). The minimum trust-region ratio \(\rho _{min}\) is set to 0.001. The bound constraints on the control input are set between 0 and \(0.6\times \frac{V_{gb}}{1,200\,{\mathrm{days}}}\).

The evolution of the objective function is depicted in Fig. 14. Note that the number of iterations in the high-fidelity optimization represents the number of inner iterations, while the number of iterations in the surrogate optimization denotes the number of outer iterations of the TRPOD method. Table 5 describes the runtime and the obtained objective function values. The details of the surrogate optimization are described in Table 6, where \(\hat{\mathcal {J}}\) is the objective function in the reduced-order model and \({\mathcal {J}}\) is the objective function evaluated in the high-fidelity model. The surrogate optimization terminates due to the minimum trust-region radius.

Fig. 14 Evolution of the objective functions using the initial injection of 0.4 \(\frac{PVI}{1,200\, days}\)

Table 5 Comparison of optimization in high-fidelity and reduced-space using initial injection of 0.4 \(\frac{PVI}{1,200\, days}\)
Table 6 Iteration in surrogate optimization using initial injection of 0.4 \(\frac{PVI}{1,200\, days}\)

In Table 6, the trust-region ratio \(\rho \) determines when the basis functions must be updated as well as when the trust-region radius is enlarged or shrunk. The trust-region parameter settings are described in Sect. 4. If the trust-region ratio \(\rho \) is less than 0.02, the basis functions are not updated and the trust-region radius is not enlarged. The trust-region radius is enlarged only if the trust-region ratio is larger than 1; otherwise, it is kept or reduced. Hence, starting from the fifth iteration in Table 6, the trust-region radius is shrunk. In the seventh iteration, the trust-region radius is kept the same as in the previous iteration, and it is decreased afterwards. Consequently, the optimization terminates due to the minimum trust-region radius criterion. Fig. 15 compares the optimization solutions of the high-fidelity and surrogate optimizations. Here, injector rates are shown with a positive sign and producer rates with a negative sign.
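The radius logic described in this paragraph can be sketched as a single update step. The thresholds 0.02 and 1 come from the text. The rule that separates keeping the radius from reducing it, the shrink and growth factors, and all names are assumptions made for illustration only; the actual settings are those of Sect. 4.

```python
def trust_region_step(rho, radius, radius_min, radius_max,
                      update_threshold=0.02, enlarge_threshold=1.0,
                      keep_threshold=0.5, shrink=0.5, grow=2.0):
    """Return (update_basis, new_radius, terminate) for one outer iteration.

    update_threshold and enlarge_threshold follow the text; keep_threshold,
    shrink and grow are hypothetical values chosen for this sketch.
    """
    # Basis functions are rebuilt only when the ratio is large enough
    update_basis = rho >= update_threshold
    if rho > enlarge_threshold:
        new_radius = min(grow * radius, radius_max)  # never exceed the maximum
    elif rho >= keep_threshold:
        new_radius = radius                          # keep the current radius
    else:
        new_radius = shrink * radius                 # poor agreement: shrink
    # Stop once the radius falls below the minimum trust-region radius
    terminate = new_radius < radius_min
    return update_basis, new_radius, terminate


# Radii in units of V_gb / 1,200 days, as in this case
good = trust_region_step(1.2, 0.03, 1e-3, 0.03)
poor = trust_region_step(0.01, 0.03, 1e-3, 0.03)
last = trust_region_step(0.01, 0.0015, 1e-3, 0.03)
```

Because the initial radius equals the maximum radius in this case, a ratio larger than 1 leaves the radius unchanged, and repeated poor ratios shrink it until the minimum radius criterion stops the optimization.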

Fig. 15 Optimization solutions in high-fidelity and reduced-space models and water saturation at final time

Next, we run another optimization with an initial injection rate of 0.5 \(\frac{PVI}{1,200\, days}\) using the same parameter values (including the initial trust-region radius). This initial rate is closer to the optimization solution of the high-fidelity model. The objective function evolutions are shown in Fig. 16. The details of the optimization in the reduced-space model can be seen in Table 8, and the CPU time speedup is described in Table 7. The optimization with the high-fidelity model terminates due to the step length tolerance, while the surrogate optimization stops because it hits the upper bound constraint, that is, \(0.6\times \frac{V_{gb}}{1,200\,{\mathrm{days}}}\).

Fig. 16 Evolution of the objective functions using the initial injection of 0.5 \(\frac{PVI}{1,200\, days}\)

Table 7 Comparison of optimization in high-fidelity and reduced-space using initial injection of 0.5 \(\frac{PVI}{1,200\, days}\)
Table 8 Iteration in surrogate optimization using initial injection of 0.5 \(\frac{PVI}{1,200\, days}\)
Fig. 17 Optimization solutions in high-fidelity and reduced-space models and water saturation at final time

It turns out that the surrogate optimization reaches a different local maximum than the optimization with the high-fidelity model. However, its runtime is still considerably lower. The high-fidelity optimization seems to converge to the same local maximum, but at the cost of a higher CPU time. The speedup factor in this case can be up to 20 times.

5.3 Case 3

This case is a continuation of the previous case and uses the same reservoir and well setting. However, it includes output constraints and a more accurate reduced-order model, with the energy truncation for saturation and water cut increased to 99 %. The objective function is now the recovery factor (RF), described in (40). In this case we constrain the water fractional flow (water cut), which is a function of water saturation, at the producer wells. We limit the water cut at the producer wells at the final time to \(f_{w,max}\), which is set to \(0.80\). To this end, the augmented objective function is

$${\mathcal {J}}_{b}\left( \tilde{\mathbf {u}},{\varvec{\lambda }}, \mu \right) ={\mathcal {J}}\left( \tilde{\mathbf {u}}\right) +\mu \sum _{i=1}^{4}\lambda _{i}{\mathrm{log}}\left( f_{w,max} -f_{w,prod_{i}}^{N}\right) . $$
(47)
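A minimal sketch of how (47) could be evaluated is shown below. The function and argument names are hypothetical, and returning negative infinity when a water cut reaches or exceeds \(f_{w,max}\) is an assumption of this sketch (the logarithm is undefined there); it is not a statement about the authors' implementation.

```python
import math


def barrier_objective(J, water_cuts, lambdas, mu, fw_max=0.80):
    """Evaluate J + mu * sum_i lambda_i * log(fw_max - fw_i), as in (47).

    J          : objective value (recovery factor in this case)
    water_cuts : final-time water cut at each producer well
    lambdas    : multiplier estimates, one per producer well
    mu         : barrier parameter
    """
    barrier = 0.0
    for lam, fw in zip(lambdas, water_cuts):
        slack = fw_max - fw
        if slack <= 0.0:
            # Assumption: treat a violated constraint as an infinitely bad value
            return -math.inf
        barrier += lam * math.log(slack)
    return J + mu * barrier


# Hypothetical values for four producers (not data from the paper)
feasible_value = barrier_objective(0.5, [0.7, 0.7, 0.7, 0.7], [1.0] * 4, mu=1e-3)
violated_value = barrier_objective(0.5, [0.7, 0.85, 0.7, 0.7], [1.0] * 4, mu=1e-3)
```

Since the objective is maximized, the logarithmic term lowers the augmented objective as any producer's water cut approaches the limit, which keeps the iterates strictly inside the feasible region.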

The parameter settings in this case are: \({\varvec{\lambda }}_{0}=\left[ \begin{array}{cccc} 1&1&1&1\end{array}\right] ^{T}\), \(\tau =0.1\), \(\mu _{0}=10^{4}\), \(\omega _{0}=10^{-6}\), absolute maximum water cut tolerance \(\eta _{*}=10^{-6}\), and absolute objective function change tolerance \(\epsilon _{*}=10^{-4}\) percent of recovery factor. We choose an active set algorithm in KNITRO to handle the control input constraints \(g\left( \tilde{\mathbf {u}}\right) \). The initial trust-region radius, \(\triangle _{0}\), and the maximum trust-region radius, \(\triangle _{max}\), are both set to \(0.01\times \frac{V_{gb}}{1,200\,{\mathrm{days}}}\), the minimum trust-region radius, \(\triangle _{min}\), is \(10^{-4}\times \frac{V_{gb}}{1,200\,{\mathrm{days}}}\), and the minimum trust-region ratio \(\rho _{min}\) is 0.001. The bound constraints on the control input are set between 0 and \(0.8\times \frac{V_{gb}}{1,200\,{\mathrm{days}}}\). We run two optimizations with different initial injector settings.

5.3.1 Initial injection rate 0.5 \(\frac{PVI}{1,200\, days}\)

We start the optimization with an initial injector rate of 0.5 \(\frac{PVI}{1,200\, days}\). The optimization with the high-fidelity model stops due to the objective function change criterion. Table 9 describes the results, and the constraint violations are shown in Fig. 18. The comparison of the objective function evolution is displayed in Fig. 19. An infinite objective function value in the reduced-order space indicates that the output constraint is violated.

Table 9 Optimization results. The water injected is measured in pore volume injected (PVI)
Fig. 18 The state constraints satisfaction, i.e., the water-cut at final time step

Fig. 19 Comparison of the objective function evolution in high-fidelity and reduced-space. The objective value from reduced-space in the figure is the objective function evaluation using the high-fidelity model given the solution of surrogate optimization

The surrogate optimization terminates because of the minimum trust-region radius. POD-DEIM in this case gives a speedup of more than 1.3 times. The output constraint is active only for \(Prd1\) in the reduced-order model, while for the other production wells the constraints are very close to being active.

Fig. 20 Optimization solutions in high-fidelity and reduced-space models and water saturation at final time

Fig. 20 shows the optimization solutions obtained with the high-fidelity and reduced-order models. The producer rates are shown with a negative sign, while the injector rate is shown with a positive sign. Again, the surrogate optimization converges to a local maximum. The figure also shows the water saturation at the final time step. As seen, the water saturation differs slightly in the area around \(Prd4\).

5.3.2 Initial injection rate 0.4 \(\frac{PVI}{1,200\, days}\)

We continue the optimization using the reduced-order model with a different initial solution. Here, we set the initial injection rate to 0.4 \(\frac{PVI}{1,200\, days}\). The results are shown in Table 10 and the constraint satisfaction in Fig. 21. As before, the optimization terminates due to the minimum trust-region radius, and the constraint at \(Prd1\) is active, while the constraints at the other production wells are almost active. The evolution of the objective function is shown in Fig. 22, and the control input solution in Fig. 23.

Table 10 Optimization results. The water injected is measured in pore volume injected (PVI)
Fig. 21 The state constraints satisfaction, i.e., the water-cut at final time step

Fig. 22 Comparison of the objective function evolution in high-fidelity and reduced-space. The objective value from reduced-space in the figure is the objective function evaluation using the high-fidelity model given the solution of surrogate optimization

Fig. 23 Optimization solutions in high-fidelity and reduced-space models and water saturation at final time

5.4 Case 4

This case uses a simplified model originating from the Norne comparative study of Rwechungura et al. (2010). The reservoir is depicted in Fig. 24 and there are 6 wells. The initial water saturation and pressure at each grid block are set to 0.2 and 40 bar, respectively. The mobility ratio between water and oil is 1 to 5. The end points of connate water saturation and oil saturation are both set to 0.2. The relative permeability curves are displayed in Fig. 24. The simulation is run for 500 days, divided into 50 control intervals. With 6 wells and 50 control intervals, the controls consist of 300 variables in total. The controls in this case are well rates. Similar to the previous cases, we deal with an equality constraint due to incompressible flow, that is, the total injection rate must equal the total production rate. In addition, we impose a bound constraint on the injectors, whose rates must stay below 2 \(\frac{PVI}{500\, days}\). We consider two initial water injection settings, 0.25 and 0.30 of the total pore volume, both using constant rates. The number of snapshots for building the reduced-order models equals the number of control intervals, that is, 50 snapshots.
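The structure of the control vector and its constraints can be illustrated with a short feasibility check. This is a sketch only: the function name, the nonnegativity requirement on the rates, the tolerance, and the constant rates in the example are assumptions, and the rates are expressed in an arbitrary consistent unit.

```python
def controls_are_feasible(injector_rates, producer_rates,
                          injector_upper_bound, tol=1e-9):
    """Check bound and rate-balance constraints in every control interval.

    injector_rates[k] and producer_rates[k] are lists with the well rates
    of control interval k (nonnegative magnitudes, assumed convention).
    """
    for inj, prod in zip(injector_rates, producer_rates):
        if any(q < 0.0 for q in inj + prod):
            return False
        if any(q > injector_upper_bound for q in inj):
            return False
        # Incompressible flow: total injection must equal total production
        if abs(sum(inj) - sum(prod)) > tol:
            return False
    return True


N_INTERVALS = 50  # control intervals, from the text
N_INJECTORS = 2   # C-1H and C-2H
N_PRODUCERS = 4   # E-1H, K-1H, B-2H, K-2H

# Hypothetical constant-rate initial controls
injector_rates = [[1.0] * N_INJECTORS for _ in range(N_INTERVALS)]
producer_rates = [[0.5] * N_PRODUCERS for _ in range(N_INTERVALS)]
n_controls = sum(len(i) + len(p) for i, p in zip(injector_rates, producer_rates))
feasible = controls_are_feasible(injector_rates, producer_rates,
                                 injector_upper_bound=2.0)
```

The count of control variables, 6 wells times 50 intervals, reproduces the 300 variables stated in the text.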

In this case we constrain the total water production at the final control interval to \(5\times 10^{-3}\) of the pore volume of the reservoir and define this constraint as \(Q_{w,max}\). Hence, in this case the augmented objective function is

$$\begin{aligned} {\mathcal {J}}_{b}\left( \tilde{\mathbf {u}},\lambda ,\mu \right) ={\mathcal {J}}\left( \tilde{\mathbf {u}}\right) +\mu \lambda \, {\mathrm{log}}\left( Q_{w,max}-\sum _{i=1}^{n_{h}=4}Q_{w,i}^{N}\right) . \end{aligned}$$
(48)

Here the output constraint is a scalar, namely the total water production summed over the producer wells at the final control interval.

Fig. 24 Norne field, a 3D reservoir, with six wells: four producers (E-1H, K-1H, B-2H, K-2H) and two injectors (C-1H and C-2H). The permeability field is plotted in millidarcy (mD). The right-hand figure shows the relative permeabilities

The reduced-order models are constructed using 90 % energy truncation for the pressure and saturation equations, and 90 % for the nonlinear water cut term. This results in 4, 7, and 7 basis functions for the pressure, saturation, and water cut, respectively. For the adjoint equations, we use a 95 % energy level for both the corresponding pressure and saturation equations, so that the reduced-order adjoint equations have dimension 16 and 4, respectively. The CPU time comparison of the full model and the reduced-order models is described in Table 11. The POD-DEIM results are consistently faster than standard POD. A significant speedup is obtained for the forward equations, but for the adjoint saturation equation the result is again similar to the first case: because the high-fidelity adjoint equation is sparse while the reduced-order matrix is dense, the speedup of the adjoint saturation equation is not that significant. We therefore use the POD-DEIM reduced-order model for the surrogate optimization.

The parameter settings in this case are: \(\lambda =1\), \(\tau =0.1\), \(\mu =10^{7}\), \(\omega _{0}=10^{-3}\), and absolute total water production tolerance \(\eta _{*}=10^{-4}\). We choose an active set algorithm in KNITRO to handle the control input constraints \(g\left( \tilde{\mathbf {u}}\right) \). The initial trust-region radius, \(\triangle _{0}\), and the maximum trust-region radius, \(\triangle _{max}\), are both \(0.1\times \frac{V_{gb}}{500\,{\mathrm{days}}}\), and the minimum trust-region radius, \(\triangle _{min}\), is \(0.001\times \frac{V_{gb}}{500\,{\mathrm{days}}}\).

Table 11 Comparison of CPU time of forward and adjoint equations

5.4.1 Initial injection rate 0.25 \(\frac{PVI}{500\, days}\)

We first run the optimization with an initial injection rate of 0.25 \(\frac{PVI}{500\, days}\). The results are summarized in Table 12, the constraint satisfaction is shown in Fig. 25, and the optimized control inputs are compared in Fig. 26. The optimization terminates due to the minimum trust-region radius.

Table 12 Optimization results with initial injection rate 0.25 \(\frac{PVI}{500\, days}\). The water injected is measured in pore volume injected (PVI)
Fig. 25 The output constraints satisfaction, i.e., the total volume of water production at the final time step

Fig. 26 Optimization solutions in high-fidelity and reduced-space models

5.4.2 Initial injection rate 0.3 \(\frac{PVI}{500\, days}\)

We then run the optimization with an initial injection rate of 0.3 \(\frac{PVI}{500\, days}\). The results are described in Table 13 and Figs. 27 and 28. The optimization in this case also terminates due to the minimum trust-region radius.

Table 13 Optimization results with initial injection rate 0.3 \(\frac{PVI}{500\, days}\). The water injected is measured in pore volume injected (PVI)
Fig. 27 The output constraints satisfaction, i.e., the total volume of water production at the final time step

Fig. 28 Optimization solutions in high-fidelity and reduced-space models

5.5 Discussion

We have presented four case examples. The first case example shows the quality of a reduced-order model for varying energy truncation. It turns out that the POD method is more CPU intensive than the POD-DEIM simulation run. Reducing the energy truncation reduces CPU time at the expense of accuracy. The case example uses a 2D reservoir consisting of 13,200 grid blocks. The POD method results in a significant speedup for the pressure equation. However, for the saturation equation the reduced-order model is only marginally faster than the high-fidelity model. In the same case example it is shown that DEIM can slightly improve the CPU time of the saturation equation. Furthermore, in the reduced-order adjoint saturation equation we lose the sparsity property. Consequently, the sparse linear solver, which is efficient for the high-fidelity adjoint equation, gives little advantage for the dense reduced-order system.

In the second case, we demonstrate the performance of the TRPOD method. Based on the first case, we continue the optimization only with the DEIM reduced-order model because it gives more speedup than POD. Using two different initial controls (injector rates), the TRPOD method converges to local maxima with comparable objective function (NPV) values. This implies that the choices of initial control and initial trust-region radius are important considerations. The speedups in this case are significant, between 5 and 20 times for the two different initial controls.

In the third case, we introduce output constraints in the surrogate optimization. The output constraints are the water cuts at the producer wells, which makes this a multidimensional output constraint problem. The TRPOD method combined with the Lagrangian barrier method does not reach the same solution as the high-fidelity optimization. We have tried two different initial controls. In both, the water cut constraint is active at one of the producer wells and almost active at the other wells. The surrogate optimization in this case is only slightly faster than the high-fidelity optimization, with speedups of around 27 and 18 %.

In the fourth case, we apply TRPOD and the Lagrangian barrier method to constrain the total water production. This case is a one-dimensional output constraint problem. The results show that the proposed method allows the optimization in reduced space to converge to a better local maximum than that of the high-fidelity optimization. Moreover, the constraint is active and the speedup factor is up to four times.

The third and the fourth case examples show the performance of TRPOD for nonlinear constraint handling, which is the focus of this work. In these cases the speedup is up to 400 % for the 3D oil reservoir but only 27 % for the 2D oil reservoir.

Apart from the results above, both the TRPOD and Lagrangian barrier methods rely on a number of parameter settings. In the Lagrangian barrier method we need to supply suitable values for the initial \(\mu \) and its stopping criteria. The TRPOD method needs more parameters, namely the minimum trust-region radius \(\triangle _{min}\), its initial value \(\triangle _{0}\), and its maximum value \(\triangle _{max}\). These parameter values have an important impact on the optimization results. To find good values for them, it would be interesting to use a derivative-free optimization method, see e.g. Audet and Orban (2006), rather than a heuristic approach.

Another point worth noting is that gradient-based optimization is sensitive to the initial guess. We have therefore run optimizations with different initial controls. The results consistently show that the surrogate optimization has a lower CPU time while honoring the nonlinear output constraints.

6 Conclusion

The use of the TRPOD method in two case examples has been presented in this paper. Two kinds of model order reduction techniques, the POD and POD-DEIM methods, have been presented. Because of the nonlinear nature of oil reservoirs, particularly the water saturation equation, the POD method may result in only a slight speedup in terms of CPU runtime. To get more CPU time speedup, we use POD-DEIM, which is consistently faster for the forward saturation equation. In the 2D reservoir example, the sparse linear solver is efficient at solving the linear equations of the adjoint systems. Hence, the corresponding adjoint saturation equation in the reduced-order model cannot be much faster. The surrogate optimization using the POD-DEIM reduced-order model has been shown to give a considerable speedup and may also give a comparable objective function value. In addition, the result of a surrogate optimization can be used as an initial guess for an optimization algorithm using a high-fidelity model.

The Lagrangian barrier method is sensitive to the choice of algorithm parameters, such as the barrier multiplier. In addition, the TRPOD method also requires suitable choices of parameter values. The choice of these parameters will affect the optimization solution.

The state equations in this work are solved using a sequential method, that is, implicit-pressure and implicit-saturation solvers. The POD-DEIM is then applied to the implicit water saturation equation. In commercial reservoir simulators, a fully-implicit method, that is, an implicit solution for both pressure and saturation, is commonly used. We foresee that the use of POD-DEIM in fully-implicit reservoir simulators may give even better speedups than observed in this work.

In this work the choice of POD basis functions is paramount. Because a POD basis is valid only within a limited operating range, the POD basis functions are updated with a trust-region strategy during the course of the optimization. However, there exist many variants of POD methods that extend the operating range of the POD model. For example, the extrapolation strategy proposed in Burkardt et al. (2006) may give a better approximation. Furthermore, the TRPOD method depends heavily on some important parameters, namely the initial and maximum trust-region radii. Methods such as adaptive TRPOD, Sachs (2009), can be applied to partially overcome the dependency on these parameters.