1 Introduction

In this paper, we discuss the minimization of operation costs for natural gas transport networks based on a model hierarchy (see [11, 21]), which ranges from detailed models based on instationary partial differential equations with temperature dependence to highly simplified algebraic equations. The detailed models are necessary to achieve a good understanding of the system state, but in many practical optimization applications, only the stationary algebraic equations (or even further simplifications like piecewise linearizations as in [15, 16, 32]) are used in order to reduce the high computational effort of evaluating the state of the system with the more sophisticated models. However, it is then unclear how well the true state is approximated by these simplified models, and error bounds are typically not available in this context (see the chapter [22] in [23] for a more detailed discussion of this issue). Recently, in [37], a detailed error and perturbation analysis has been developed for several components in the model hierarchy, and it has been shown how the more detailed model components can be used to estimate the error obtained in the simplified models.

Here, we use these error estimates from the model hierarchy together with classical error-estimate-based grid adaptation techniques for the space discretization within an optimization method to control the error adaptively by switching to more detailed models or finer discretizations where necessary. Moreover, our adaptive method also allows us to locally switch back to coarser models or coarser discretizations if they are sufficiently accurate with respect to the local flow situation. Our new approach can, in general, be used for the entire model hierarchy by also using space-time grid adaptation. However, to keep things simple and to illustrate the functionality of the new adaptive approach, we will use three stationary isothermal models from the hierarchy in [11].

Using adaptive discretization methods to achieve a trade-off between computational efficiency and accuracy in optimization and optimal control problems is an important research topic, in particular in the context of real-time optimal control of constrained dynamical systems (see, e.g., [3, 8, 9, 29]) or in the context of optimal control of problems constrained by partial differential equations (see, e.g., [1, 24,25,26]). We extend these ideas and combine adaptive grid refinement and model selection in a model hierarchy in the context of nonlinear optimization problems. We also theoretically analyze the new algorithm. First promising numerical results for such an approach were presented in [34, 35].

The paper is structured as follows. The models used in this paper are described in Section 2 together with a simple first-order Euler method for the space discretization. In Section 3, we introduce model and discretization error estimators, which are used in Section 4 to derive an adaptive model and discretization control algorithm for the nonlinear optimization of gas transport networks that, in the end, delivers solutions that satisfy prescribed error tolerances. Numerical results are presented in Section 5 and the paper concludes in Section 6.

2 Problem Description, Modeling Hierarchy, and Discretizations

In this section, we introduce the problem of operation cost minimization for natural gas transport networks. We present our overall model of a gas transport network involving continuous nonlinear models describing a stationary flow for all the considered network elements. Since the majority of the elements are pipes, our focus lies on the precise and physically accurate modeling of these pipes. The typical models for the pipe flow are nonlinear instationary partial differential equations (PDEs) on a graph and their appropriate space-time discretizations. To address the fact that the behavior of the flow and the accuracy of the model may vary significantly in different regions of the network, we discuss a small part of the complete model hierarchy of instationary models (see [11]), where the lower level models in the hierarchy are simplifications of the higher level models. Which model is most appropriate to obtain a computationally tractable, adequately accurate, and finite-dimensional approximation depends on the task that needs to be performed with the model.

Our modeling approach is based on the following physical assumptions. First, we only consider a stationary gas flow, i.e., we neglect all time effects of gas dynamics, so that we have ordinary differential equations (ODEs) in space instead of systems of PDEs on a graph. Second, we assume an isothermal regime, i.e., we neglect all effects arising from changes in the gas temperature.

These assumptions are chosen carefully so that we still obtain physically meaningful solutions and so that we are still able to derive and analyze an adaptive model and discretization control algorithm, without unnecessarily overloading the models with all technical details of the application that might distract us from the main mathematical ideas.

2.1 The Network

We model the gas transport network by a directed and connected graph G = (V, A). The node set is made up of entry nodes V+, where gas is supplied, of exit nodes V−, where gas is discharged, and of inner nodes V0, i.e., we have V = V+ ∪ V− ∪ V0. The set of arcs in our models comprises pipes Api and compressor machines Acm, i.e., we have A = Api ∪ Acm.

Real-world gas transport networks contain many other element types like (control) valves, short cuts, or resistors. For detailed information on modeling these devices, see [14] in general or [34, 35] for a focus on nonlinear programming (NLP) type models. However, we restrict ourselves to models with pipes and compressors in order to streamline the presentation of our basic ideas and methods, and to show in a prototypical way that our approach of space discretization and model adaptivity leads to major accuracy and efficiency improvements.

As basic quantities we introduce gas pressure variables pu at all nodes u ∈ V and mass flow variables qa at all arcs a ∈ A of the network. Both types of variables are bounded due to technical restrictions, i.e.,

$$\begin{array}{@{}rcl@{}} p_{u} & \in& [\smash{\underline{p}}_{u}, \overline{p}_{u}] \quad \text{for all }u \in V, \end{array} $$
(1a)
$$\begin{array}{@{}rcl@{}} q_{a} & \in& [\smash{\underline{q}}_{a}, \overline{q}_{a}] \quad\text{for all }a \in A. \end{array} $$
(1b)

All other required quantities are introduced where they are used first.

2.2 Nodes

In stationary gas network models, the nodes u ∈ V are modeled by a mass balance equation, i.e., we have the constraint

$$ \sum\limits_{a \in \delta^{\text{in}}(u)} q_{a} - \sum\limits_{a \in \delta^{\text{out}}(u)} q_{a} = q_{u} \quad\text{for all }u \in V, $$
(2)

where for ingoing arcs we use the notation

$$\delta^{\text{in}}(u) := \{a \in A: \text{ there exists } w \in V \text{ with } a = (w, u)\} $$

and for outgoing arcs

$$\delta^{\text{out}}(u) := \{a \in A: \text{ there exists } w \in V \text{ with } a = (u, w)\}. $$

Moreover, qu models the supplied or discharged mass flow at the corresponding node, i.e., we have

$$q_{u} \left\{\begin{array}{ll} \geq 0 & \text{ for all } u \in V_{-},\\ \leq 0 & \text{ for all } u \in V_{+},\\ = 0 & \text{ for all } u \in V_{0}. \end{array}\right. $$
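
To make the incidence sets δin(u), δout(u) and this sign convention concrete, the following small Python snippet evaluates the mass balance residuals (2) on a toy three-node network. It is purely illustrative; all names are chosen for this example and are not part of the model.

```python
def mass_balance_residuals(arcs, q_arc, q_node):
    """Residual of the mass balance (2) at every node:
    inflow minus outflow minus the supplied/discharged flow q_u."""
    nodes = {v for arc in arcs for v in arc}
    residual = {u: -q_node.get(u, 0.0) for u in nodes}
    for (u, w) in arcs:                # arc a = (u, w) leaves u and enters w
        residual[u] -= q_arc[(u, w)]   # a belongs to delta_out(u)
        residual[w] += q_arc[(u, w)]   # a belongs to delta_in(w)
    return residual

# Tiny illustrative network: entry -> inner -> exit with 50 kg/s throughput.
arcs = [("entry", "inner"), ("inner", "exit")]
q_arc = {("entry", "inner"): 50.0, ("inner", "exit"): 50.0}
q_node = {"entry": -50.0, "inner": 0.0, "exit": 50.0}
print(mass_balance_residuals(arcs, q_arc, q_node))  # all residuals are 0.0
```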

2.3 Pipes

Isothermal gas flow through cylindrical pipes is described by the Euler equations for compressible fluids,

$$\begin{array}{@{}rcl@{}} \frac{\partial\rho}{\partial t} + \frac{1}{A}\frac{\partial q}{\partial x}&=& 0, \end{array} $$
(3a)
$$\begin{array}{@{}rcl@{}} \frac{1}{A}\frac{\partial q}{\partial t} + \frac{\partial p}{\partial x} + \frac{1}{A}\frac{\partial(qv)}{\partial x} &=& - \lambda(q)\frac{|v|v}{2D}\rho -g\rho h^{\prime}, \end{array} $$
(3b)

see, e.g., [13, 27] for a detailed discussion. Here and in what follows, ρ is the gas density, v is its velocity, λ = λ(q) is the friction term, A denotes the cross-sectional area of the pipe, h′ is its slope, and D is the diameter of the pipe. Furthermore, g is the acceleration due to gravity, t is the temporal coordinate, and x ∈ [0, L] is the spatial coordinate with L being the length of the pipe. Equation (3a) is called the continuity equation and (3b) the momentum equation. Since we only consider the stationary case, all partial derivatives with respect to time vanish and we obtain the simplified stationary model

$$\begin{array}{@{}rcl@{}} \frac{1}{A}\frac{\partial q}{\partial x} &=& 0, \end{array} $$
(4a)
$$\begin{array}{@{}rcl@{}} \frac{\partial p}{\partial x} + \frac{1}{A}\frac{\partial(qv)}{\partial x}&=& -\lambda(q) \frac{|v|v}{2D}\rho - g\rho h^{\prime}. \end{array} $$
(4b)

Thus, the continuity equation in its stationary variant simply states that the mass flow along the pipe is constant, i.e., q(x) ≡ q = const for all x ∈ [0, L].

To simplify the stationary momentum equation (4b), we consider two more model equations. First, the equation of state

$$p = \rho c^{2} \quad \text{ with } \quad c = \sqrt{R_{\text{s}} Tz}, $$

where c is the speed of sound, Rs is the specific gas constant, and z is the compressibility factor. The second model is the relation of gas mass flow, density, and velocity given by

$$q = A \rho v. $$
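
For later use, these two relations can be combined to express density and velocity in terms of the pressure and the (constant) mass flow:

$$\rho = \frac{p}{c^{2}}, \qquad v = \frac{q}{A \rho} = \frac{q c^{2}}{A p}. $$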

Substituting both these models into (4b), we obtain

$$ \frac{\partial p}{\partial x} - \frac{q^{2}}{A^{2}}\frac{c^{2}}{p^{2}} \frac{\partial p}{\partial x} = -\lambda(q) \frac{c^{2}}{2 D A^{2}} \frac{|q| q}{p} - \frac{g h^{\prime} p}{c^{2}}, $$
(M1)

i.e., the stationary momentum equation written in terms of the gas pressure p = p(x), x ∈ [0, L], and the mass flow q.

A simplified version of the latter equation can be obtained by ignoring the ram pressure term

$$\frac{1}{A}\frac{\partial(qv)}{\partial x}, $$

in (4b), i.e., the total pressure exerted on the gas by the pipe wall, or, equivalently, the term

$$ -\frac{q^{2}}{A^{2}}\frac{c^{2}}{p^{2}} \frac{\partial p}{\partial x} $$
(5)

in (M1). For a discussion of this simplification step, see [38]. Neglecting the ram pressure term (5) yields

$$ \frac{\partial p}{\partial x} = -\lambda(q) \frac{c^{2}}{2 D A^{2}} \frac{|q| q}{p} - \frac{g h^{\prime} p}{c^{2}}. $$
(M2)

Finally, one may also neglect gravitational forces, i.e., set the term \(gh^{\prime } p / c^{2}\) to 0 and obtain

$$ \frac{\partial p}{\partial x} = -\lambda(q) \frac{c^{2}}{2 D A^{2}} \frac{|q| q}{p}. $$
(M3)

Analytical solutions for the models (M1)–(M3) are only rarely known (see, e.g., [18, 19, 34]). Thus, in order to obtain finite-dimensional nonlinear optimization models, we discretize these differential equations in space. Applying, e.g., the implicit Euler method, we obtain

$$ \frac{p_{k} - p_{k-1}}{h} - \frac{q^{2}}{A^{2}}\frac{c^{2}}{p_{k}^{2}} \frac{p_{k} - p_{k-1}}{h} = -\lambda(q) \frac{c^{2}}{2 D A^{2}} \frac{|q| q}{p_{k}} - \frac{g h^{\prime} p_{k}}{c^{2}}, \quad k = 1, \dots, n, $$
(D1)
$$ \frac{p_{k} - p_{k-1}}{h} = -\lambda(q) \frac{c^{2}}{2 D A^{2}} \frac{|q| q}{p_{k}} - \frac{g h^{\prime} p_{k}}{c^{2}}, \quad k = 1, \dots, n, $$
(D2)
$$ \frac{p_{k} - p_{k-1}}{h} = -\lambda(q) \frac{c^{2}}{2 D A^{2}} \frac{|q| q}{p_{k}}, \quad k = 1, \dots, n, $$
(D3)

where pk = p(xk) and \({\Gamma } = \{x_{0}, x_{1},\dots , x_{n}\}\) is an equidistant spatial discretization of the pipe with constant stepsize h = xk − xk−1 and x0 = 0, xn = L. Of course, one could also apply a higher-order Runge–Kutta method, which would allow a larger stepsize and would thus reduce the computational cost.
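
To illustrate this discretization step, the following Python sketch integrates the stationary pipe models with the implicit Euler method and solves the implicit equation for pk with Newton's method in every space step (cf. the remarks in Section 4.4). It is only a minimal sketch: the function name, its signature, and the parameter handling are our own illustrative choices and not taken from the implementation used for the computations in this paper.

```python
import numpy as np

def implicit_euler_pipe(p0, q, L, n, c, D, A, lam, g=9.81, hprime=0.0,
                        level=1, tol=1e-10, maxit=50):
    """Integrate the stationary pipe equation with the implicit Euler method
    on an equidistant grid with n intervals of length h = L/n.

    level = 1 keeps the ram pressure term (model (M1)),
    level = 2 drops it (model (M2)),
    level = 3 additionally drops the gravity term (model (M3))."""
    h = L / n
    p = np.empty(n + 1)
    p[0] = p0
    for k in range(1, n + 1):
        pk = p[k - 1]                       # Newton start value
        for _ in range(maxit):
            # right-hand side of the momentum equation, evaluated at p_k
            rhs = -lam * c**2 * abs(q) * q / (2.0 * D * A**2 * pk)
            drhs = lam * c**2 * abs(q) * q / (2.0 * D * A**2 * pk**2)
            if level <= 2:                  # keep the gravity term
                rhs -= g * hprime * pk / c**2
                drhs -= g * hprime / c**2
            coeff, dcoeff = 1.0, 0.0
            if level == 1:                  # keep the ram pressure term
                coeff -= q**2 * c**2 / (A**2 * pk**2)
                dcoeff = 2.0 * q**2 * c**2 / (A**2 * pk**3)
            # residual of the implicit Euler step and its derivative w.r.t. p_k
            F = coeff * (pk - p[k - 1]) / h - rhs
            dF = dcoeff * (pk - p[k - 1]) / h + coeff / h - drhs
            step = F / dF
            pk -= step
            if abs(step) < tol:
                break
        p[k] = pk
    return p
```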

These discretizations extend the model hierarchy (M1)–(M3) for the Euler equations by infinitely many models that are parameterized by the discretization stepsize h applied in (D1)–(D3). In summary, we obtain the pipe model hierarchy of stationary Euler equations depicted in Fig. 1.

Fig. 1 Pipe model hierarchy based on the Euler equations. The space-continuous models are positioned in the left column and their space-discretized counterparts in the right column

2.4 Compressors

Compressor machines a = (u, w) ∈ Acm increase the inflow gas pressure to a higher outflow pressure, i.e., they can be described in a simplified way by

$$ p_{w} = p_{u} + {\Delta}_{a}, \quad {\Delta}_{a} \in [0, \bar{\Delta}_{a}] \quad \text{ for all } a \in A_{\text{cm}}. $$
(6)

Moreover, for simplicity, we assume that we are given cost coefficients ωa ≥ 0 for every compressor a ∈ Acm, which convert the pressure increase into compression costs. Of course, this is an extremely coarse approximation of a compressor machine. An alternative would be to use a simple input-output surrogate model obtained from a realization or system identification of an input-output transfer function (see, e.g., [5]). However, our focus is on an accurate modeling of the gas flow in pipes and on deriving an adaptive model and discretization control algorithm. Model (6) allows for setting up a reasonable objective function for our NLPs and is thus appropriate in this work. For more details, see [31, 34, 35] or [14].

2.4.1 The Optimization Problem

We will use the adaptive model and discretization control algorithm in the context of the following nonlinear ODE-constrained optimization problem

$$\begin{array}{@{}rcl@{}} &&\min \quad \sum\limits_{a \in A_{\text{cm}}} \omega_{a} {\Delta}_{a} \end{array} $$
(7a)
$$\begin{array}{@{}rcl@{}} &&\text{s.t.} \quad ~~\text{variable bounds~(1)}, \end{array} $$
(7b)
$$\begin{array}{@{}rcl@{}} && ~~~~~~~~~~~\text{mass balance~(2)}, \end{array} $$
(7c)
$$\begin{array}{@{}rcl@{}} && ~~~~~~~~~~~\text{compressor model~(6) for all } a \in A_{\text{cm}}, \end{array} $$
(7d)
$$\begin{array}{@{}rcl@{}} && ~~~~~~~~~~~\text{pipe model }(\mathrm{M}_{1}) \text{ for all } a \in A_{\text{pi}}, \end{array} $$
(7e)

where our objective function models the cost for the compressor activity that is constrained by an infinite-dimensional description of the gas flow in pipes. Problem (2.4.1) is a classical nonlinear optimal control problem. A typical approach to solve such problems in practice is the first-discretize-then-optimize paradigm (see, e.g., [2]). In this setting, one replaces the ODE constraints by finite sets of nonlinear constraints that arise, e.g., from implicit Euler discretizations like (D1) for (M1). Moreover, practical experience suggests that for the evaluation of the constraints, it is often not required to apply the most accurate model like (D1) with a small stepsize for every pipe in the network. Instead, in many situations, it is sufficient to use simplified models like (D2) and (D3) with a coarse grid, which then typically yields fast execution times for the evaluation of the constraint functions.

To this end, we define discretized problem variants of Problem (2.4.1) by specifying the model level ℓa ∈ {1, 2, 3} for every arc a ∈ Api (i.e., the discretized model (D1), (D2), or (D3), respectively) together with a stepsize ha. This yields the family of finite-dimensional NLPs

$$\begin{array}{@{}rcl@{}} &&\min \quad \sum\limits_{a \in A_{\text{cm}}} \omega_{a} {\Delta}_{a} \end{array} $$
(8a)
$$\begin{array}{@{}rcl@{}} &&\text{s.t.} \quad~~ \text{variable bounds (1)}, \end{array} $$
(8b)
$$\begin{array}{@{}rcl@{}} && ~~~~~~~~~~~\text{mass balance (2)}, \end{array} $$
(8c)
$$\begin{array}{@{}rcl@{}} && ~~~~~~~~~~~\text{compressor model (6) for all } a \in A_{\text{cm}}, \end{array} $$
(8d)
$$\begin{array}{@{}rcl@{}} && ~~~~~~~~~~~\text{pipe model } (\text{D}_{\ell_{a}}) \text{ with stepsize } h_{a} \text{ for all } a \in A_{\text{pi}}. \end{array} $$
(8e)

Note that the constraints (7b)–(7d) in the infinite-dimensional problem are exactly the same as constraints (8b)–(8d) in the family of discretized problems.
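
To illustrate the first-discretize-then-optimize setting, the following sketch assembles the algebraic equality constraints that a single pipe contributes to an NLP of the family (8a) when the simplest discretized model (D3) with n grid intervals is used. The function and variable names are illustrative only.

```python
def pipe_constraints_d3(p, q, L, n, c, D, A, lam):
    """Equality-constraint residuals that one pipe contributes to the NLP
    when the discretized model (D3) is used: the n implicit Euler equations
    become algebraic constraints over the NLP variables p[0..n] and q."""
    h = L / n
    return [
        (p[k] - p[k - 1]) / h + lam * c**2 * abs(q) * q / (2.0 * D * A**2 * p[k])
        for k in range(1, n + 1)
    ]
```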

3 Error Estimators

In this section, we introduce a first-order estimate for the error between the most detailed infinite-dimensional and an arbitrary space-discretized model. This error estimator is obtained as the sum of a discretization and a model error estimator. Since we consider the stationary case, mass flows in pipes are constant in the spatial dimension. This is why we base our error estimators on the differences of the pressures p(x) for different models and discretizations.

Suppose that for a given pipe a ∈ Api, the model level ℓa ∈ {1, 2, 3} with discretization stepsize ha is currently used for the computations. The overall solution of the optimization problem for the entire network, also including pressure increases in compressors etc., is denoted by y and contains the discretized pressure distributions of the separate pipes a ∈ Api, which we denote by \(p^{\ell _{a}}(x_{k};h_{a})\) with discretization grid \({\Gamma }_{1} = \{x_{k}\}_{k = 0}^{L_{a}/h_{a}}\) obtained by using the stepsize ha. We now compute an estimate for the error between the solution of the currently used model \((\text {D}_{\ell _{a}})\) and the solution of the reference model (M1). Let the solution of model (M1) for pipe a ∈ Api be denoted by \(\hat {p}(x)\) with x ∈ [0, La].

Furthermore, let the solutions of Model (D1) with discretization grids \({\Gamma }_{2} = \{x_{s}\}_{s = 0}^{L_{a}/(2h_{a})}\) and \({\Gamma }_{3} = \{x_{r}\}_{r = 0}^{L_{a}/(4h_{a})}\) using stepsizes 2ha and 4ha, be denoted by p1(xs;2ha) and p1(xr;4ha), respectively. Due to the larger stepsize, the computation of these two solutions is in general less expensive than computing a solution of Model \((\text {D}_{\ell _{a}})\) on the grid Γ1. Since the discretization grid Γ3 is the coarsest grid and all computed pressure profiles can be evaluated on this grid, Γ3 is called the evaluation grid. This grid is used in the definitions of the following error estimators. The considered discretization grids and the evaluation grid are depicted in Fig. 2.

Fig. 2 Overview of the three considered discretization grids Γ1, Γ2, and Γ3 with gridpoints xk, xs, and xr and stepsizes ha, 2ha, and 4ha, respectively. The vertical lines represent the evaluation grid Γ3 for the error estimators in (9) and (10)

For a pipe a ∈ Api, let the discretization error estimator be defined by

$$ \eta_{\text{d},a}(y) := \| p^{1}(x_{r};2h_{a}) - p^{1}(x_{r};4h_{a})\|_{\infty} $$
(9)

and let the model error estimator be defined by

$$ \eta_{\text{m},a}(y) := \|p^{1}(x_{r};2h_{a}) - p^{\ell_{a}}(x_{r};h_{a})\|_{\infty}. $$
(10)

Here,

$$p^{\ell_{a}}(x_{r};h_{a}) = [p^{\ell_{a}}(x_{0};h_{a}), \dots, p^{\ell_{a}}(x_{n};h_{a})]^{\top}, \quad n = L_{a}/(4h_{a}), $$

denotes the solution of Model \((\text {D}_{\ell _{a}})\) computed with stepsize ha that is evaluated at the gridpoints xr, i.e., on the grid Γ3. If ℓa = 1, i.e., if the considered solution already corresponds to the most accurate model, then we set the model error to zero, i.e., ηm,a(y) = 0. Furthermore, let the overall error estimator ηa(y) for a pipe a ∈ Api be defined to be a first-order upper bound for the maximum error between the solutions of models (M1) and \((\text {D}_{\ell _{a}})\) at the gridpoints xr with stepsize 4ha. Thus, we have

$$\begin{array}{@{}rcl@{}} && \|\hat{p}(x_{r}) - p^{\ell_{a}}(x_{r};h_{a})\|_{\infty}\\ \leq && \|\hat{p}(x_{r}) - p^{1}(x_{r};2h_{a})\|_{\infty} + \|p^{1}(x_{r};2h_{a}) - p^{\ell_{a}}(x_{r};h_{a})\|_{\infty} \end{array} $$
(11)
$$\begin{array}{@{}rcl@{}} \doteq && \eta_{\text{d},a}(y) + \eta_{\text{m},a}(y) =: \eta_{a}(y), \end{array} $$

where \(\doteq \) denotes a first-order approximation in ha (see [36, p. 420]), and we use that the implicit Euler method has convergence order 1. The error estimator ηa(y) is the absolute counterpart of the componentwise relative error estimator given in [37]. An overview of the considered models in this section together with the considered stepsizes is depicted in Fig. 3.

Fig. 3 Overview of the models and stepsizes used for the computation of the overall error estimator ηa(y) between models (M1) and \((\text {D}_{\ell _{a}})\) in (11). Here, for a pipe a, ηd,a(y) is the discretization error estimator and ηm,a(y) is the model error estimator

We close this section with a remark on the computation of the discretization error estimator in (9). A straightforward way is to solve Model (D1) once with stepsize 2ha and once again with stepsize 4ha for every a ∈ Api. Another possibility would be to use an embedded Runge–Kutta method (see, e.g., [20]), which in general saves computational cost due to the reduced number of function evaluations.
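
A possible realization of the two estimators, reusing the hypothetical implicit_euler_pipe routine sketched in Section 2.3, could look as follows. The argument list is an assumption made for this sketch, and the number of grid intervals La/ha is assumed to be a multiple of 4 so that all three grids are nested.

```python
import numpy as np

def pipe_error_estimators(p0, q, L, h, level, c, D, A, lam, hprime=0.0):
    """Evaluate the discretization estimator (9) and the model estimator (10)
    for a single pipe on the evaluation grid Gamma_3 (stepsize 4h)."""
    n = int(round(L / h))                     # number of intervals on Gamma_1
    # reference model (D1) on the grids with stepsizes 2h and 4h
    p1_2h = implicit_euler_pipe(p0, q, L, n // 2, c, D, A, lam,
                                hprime=hprime, level=1)
    p1_4h = implicit_euler_pipe(p0, q, L, n // 4, c, D, A, lam,
                                hprime=hprime, level=1)
    # current model (D_l) on Gamma_1 with stepsize h
    pl_h = implicit_euler_pipe(p0, q, L, n, c, D, A, lam,
                               hprime=hprime, level=level)
    # restrict all profiles to the evaluation grid Gamma_3 before comparing
    eta_d = np.max(np.abs(p1_2h[::2] - p1_4h))                             # (9)
    eta_m = 0.0 if level == 1 else np.max(np.abs(p1_2h[::2] - pl_h[::4]))  # (10)
    return eta_d, eta_m
```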

4 The Grid and Model Adaptation Algorithm

In this section, we present and analyze an algorithm that adaptively switches between the model levels in the hierarchy of Fig. 1 and adapts discretization stepsizes in order to find a convenient trade-off between physical accuracy and computational costs. To this end, the algorithm iteratively solves NLPs and initial value problems (IVPs). Solutions of the latter are used to evaluate the error estimators discussed in the last section and to decide on the model levels and the discretization stepsizes for the next NLP.

Consider a single NLP of the sequence of NLPs that are solved during the algorithm and assume that pipe a ∈ Api is modeled using model \((\text {D}_{\ell _{a}})\) and stepsize ha. Let the solution of this NLP be denoted by y. According to the last section, the overall model and discretization error estimator for this pipe is given by ηa(y) as defined in (11). Thus, it is given by the error estimator between the solutions of the most accurate model (M1) and the current model \((\text {D}_{\ell _{a}})\).

The overall goal of our method is to compute a solution of a member of the family of discretized problems (8a) for which it is guaranteed that this solution has an estimated average error per pipe, with respect to the reference model (M1), that is less than an a priori given tolerance ε > 0. This leads us to the following definition:

Definition 1

(ε-feasibility) Let ε > 0 be given. We say that a solution y of problem (8a) with discretized models \((\text {D}_{\ell _{a}})\), ℓa ∈ {1, 2, 3}, and stepsizes ha for the pipes a ∈ Api is ε-feasible with respect to the reference problem (2.4.1) if

$$\frac{1}{|A_{\text{pi}}|} \sum\limits_{a \in A_{\text{pi}}} \eta_{a}(y) \leq \varepsilon. $$

The remainder of this section is organized as follows. Section 4.1 introduces rules about how the model levels and discretization stepsizes are modified. The strategies for marking pipes for model or grid adaptation are explained in Section 4.2. The adaptive model and discretization control algorithm is introduced in Section 4.3, together with a theorem on the finite termination of the algorithm. Finally, some remarks regarding the adaptive control algorithm are given in Section 4.4.

4.1 Model and Discretization Adaptation Rules

Before we present and discuss the overall adaptive model control algorithm, we have to

  1. describe the mechanism of switching pipe model levels up or down as well as that of refining and coarsening the discretization grids, and

  2. discuss our marking strategy that determines the arcs on which the model or grid should be adapted.
We start with the first issue and follow the standard PDE grid adaptation technique (see, e.g., [6, 7, 12] or [4]). The general strategy is as follows. We switch up one level in the model hierarchy if this yields an error reduction that is larger than ε; otherwise, we switch up to the most accurate discretized model (D1). Hence, for pipe a ∈ Api, we have the rule

$$ \ell^{\text{new}}_{a} = \left\{\begin{array}{ll} \ell_{a} - 1 & \quad \text{if } \eta_{\text{m},a}(y; \ell_{a}) - \eta_{\text{m},a}(y;\ell_{a} - 1) > \varepsilon, \\ 1 & \quad \text{otherwise}, \end{array}\right. $$
(12)

for switching up levels in the model hierarchy. We apply this rule because it is possible that the effects of neglecting the ram pressure term (which is the difference between model levels ℓ = 1 and ℓ = 2) and neglecting gravitational forces for non-horizontal pipes (which is the difference between model levels ℓ = 2 and ℓ = 3) balance each other out in the computation of the pressure profile of model (D3). In this case, switching from model (D3) to (D2) would increase the model error, which is why we switch from (D3) to (D1) directly.

A discretization grid refinement or coarsening with a factor γ > 1 is defined by taking the new stepsize as

$$ h_{a}^{\text{new}} := \left\{\begin{array}{ll} h_{a}/\gamma & \quad \text{for a grid refinement},\\ \gamma h_{a} & \quad \text{for a grid coarsening}. \end{array}\right. $$
(13)

For a discretization scheme of order β, it is well-known that a first-order approximation for the discretization error in x ∈ [0, La] is given by \(e_{\text {d},a}(x) \doteq c(x)h_{a}^{\beta }\), where c(x) is independent of ha (see, e.g., [36]). From this, it follows that the new discretization error after a grid refinement or coarsening can be written as

$$e_{\text{d},a}^{\text{new}}(x) \doteq (h_{a}^{\text{new}}/h_{a})^{\beta} e_{\text{d},a}(x). $$

Since the implicit Euler method has convergence order β = 1, with \(h_{a}^{\text {new}}\) in (13) and γ = 2, for the new discretization error estimator after a grid refinement or coarsening, it holds that

$$ \eta_{\text{d},a}^{\text{new}}(y) \doteq \left\{\begin{array}{ll} \eta_{\text{d},a}(y)/2& \quad \text{for a grid refinement},\\ 2 \eta_{\text{d},a}(y) & \quad \text{for a grid coarsening}. \end{array}\right. $$
(14)
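
In code, the model switching rule (12) and the stepsize rule (13) with γ = 2 could be realized as follows (a minimal sketch with illustrative names; eta_m_up denotes the model error estimator evaluated for level ℓa − 1):

```python
def switch_up_level(level, eta_m_current, eta_m_up, eps):
    """Rule (12): go up one model level only if this reduces the model error
    estimator by more than eps; otherwise jump to the most accurate model."""
    return level - 1 if eta_m_current - eta_m_up > eps else 1

def adapt_stepsize(h, refine, gamma=2.0):
    """Rule (13): divide the stepsize by gamma for a refinement and
    multiply it by gamma for a coarsening (gamma = 2 throughout the paper)."""
    return h / gamma if refine else gamma * h
```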

4.2 Marking Strategies

We now describe our marking strategies, i.e., how we choose which pipes should be switched up or down in their model level and which pipes should get a refined or coarsened grid. Given marking strategy parameters Θd, Θm ∈ [0, 1], we compute subsets \(\mathcal {R}, \mathcal {U} \subseteq A_{\text {pi}}\) such that they are the minimal subsets of arcs that satisfy

$$ {\Theta}_{\text{d}} \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d}, a}(y) \leq \sum\limits_{a \in \mathcal{R}} \eta_{\text{d}, a}(y) $$
(15)

and

$$ {\Theta}_{\text{m}} \sum\limits_{a \in A_{\text{pi}}^{> \varepsilon}} \left( \eta_{\text{m},a}(y;\ell_{a}) - \eta_{\text{m},a}(y;\ell_{a}^{\text{new}})\right) \leq \sum\limits_{a \in \mathcal{U}} \left( \eta_{\text{m},a}(y;\ell_{a}) - \eta_{\text{m},a}(y;\ell_{a}^{\text{new}})\right) $$
(16)

with

$$A_{\text{pi}}^{> \varepsilon} := \{a \in A_{\text{pi}}: \eta_{\text{m},a}(y;\ell_{a}) - \eta_{\text{m},a}(y;\ell_{a}^{\text{new}}) > \varepsilon\}, $$

where \(\ell _{a}^{\text {new}}\) is given in (12). Analogously, given marking strategy parameters Φd, Φm ∈ [0, 1] and τ ≥ 1, we compute \(\mathcal {C}, \mathcal {D} \subseteq A_{\text {pi}}\) such that they are the maximal subsets of arcs that satisfy

$$ {\Phi}_{\text{d}} \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d}, a}(y) \geq \sum\limits_{a \in \mathcal{C}} \eta_{\text{d}, a}(y) $$
(17)

and

$$ {\Phi}_{\text{m}} \sum\limits_{a \in A_{\text{pi}}^{< \varepsilon}(\tau)} \left( \eta_{\text{m},a}(y;\ell_{a}^{\text{new}}) - \eta_{\text{m},a}(y;\ell_{a})\right) \geq \sum\limits_{a \in \mathcal{D}} \left( \eta_{\text{m},a}(y;\ell_{a}^{\text{new}}) - \eta_{\text{m},a}(y;\ell_{a})\right) $$
(18)

with

$$A_{\text{pi}}^{< \varepsilon}(\tau) := \{a \in A_{\text{pi}}: \eta_{\text{m},a}(y;\ell_{a}^{\text{new}}) - \eta_{\text{m},a}(y;\ell_{a}) \leq \tau \varepsilon\}. $$

In (18), \(\ell _{a}^{\text {new}}\) is always set to \(\min \{\ell _{a} + 1, 3\}\). For every arc \(a \in \mathcal {R}\) (\(a \in \mathcal {C}\)), we refine (coarsen) the discretization grid by halving (doubling) the stepsize, i.e., we set γ = 2 in (13). We note that these marking strategies are very similar to the greedy strategies on a network described in [10], where those pipes are marked for a spatial, temporal, or model refinement which yield the largest error reduction.
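
The minimal and maximal subsets in (15)–(18) can be computed greedily by sorting the arcs according to their estimator values. The following sketch assumes that the relevant quantities (the discretization error estimators for (15) and (17), or the estimated error reductions restricted to the sets A_pi^{>ε} and A_pi^{<ε}(τ) for (16) and (18)) are given as a dictionary keyed by the arcs; the function names are illustrative.

```python
def mark_minimal_subset(estimators, theta):
    """Smallest set of arcs whose summed estimator values reach the
    fraction theta of the total, cf. (15) and (16)."""
    total = sum(estimators.values())
    marked, accumulated = set(), 0.0
    for arc, eta in sorted(estimators.items(), key=lambda item: -item[1]):
        if accumulated >= theta * total:
            break
        marked.add(arc)
        accumulated += eta
    return marked

def mark_maximal_subset(estimators, phi):
    """Largest set of arcs whose summed estimator values stay below the
    fraction phi of the total, cf. (17) and (18)."""
    total = sum(estimators.values())
    marked, accumulated = set(), 0.0
    for arc, eta in sorted(estimators.items(), key=lambda item: item[1]):
        if accumulated + eta > phi * total:
            break
        marked.add(arc)
        accumulated += eta
    return marked
```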

4.3 The Algorithm

With these preliminaries, we can now state the overall adaptive model and discretization control algorithm for finding an ε-feasible solution of the reference problem (2.4.1). The formal listing is given in Algorithm 1.

Algorithm 1 The adaptive model and discretization control algorithm

The algorithm makes use of the safeguard parameter μ ∈ ℕ. This parameter ensures that the algorithm performs grid coarsenings and model-level down-switches only after applying μ rounds of grid refinements and model-level up-switches. It prevents alternately switching model levels up and down as well as alternately refining and coarsening the discretization grid. We note that this technique is similar to the use of hysteresis parameters (see, e.g., [28]). By employing this safeguard, we can prove that Algorithm 1 terminates after a finite number of iterations with an ε-feasible point of the reference model (M1).
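
The following Python sketch outlines the structure of the adaptive loop. It is not a reproduction of the exact listing of Algorithm 1 (in particular, its line numbering differs), and it simplifies rule (12) by moving pipes that are marked for a model switch directly to the most accurate level. Here, solve_nlp and estimators are user-supplied callbacks, pipes is assumed to map each pipe to an object with mutable attributes h and level, and the marking routines are those from the sketch in Section 4.2.

```python
def adaptive_control(pipes, solve_nlp, estimators, eps,
                     theta_d=0.7, theta_m=0.7, phi_d=0.3, phi_m=0.3,
                     mu=4, max_outer=100):
    """Structural sketch of the adaptive model and discretization control loop:
    mu rounds of refinement / up-switching are followed by one coarsening /
    down-switching step, until the solution is eps-feasible."""
    y = solve_nlp(pipes)
    eta_d, eta_m = estimators(pipes, y)
    for _ in range(max_outer):
        if sum(eta_d[a] + eta_m[a] for a in pipes) / len(pipes) <= eps:
            return y                                   # eps-feasible, Definition 1
        for _ in range(mu):                            # refine and switch up
            for a in mark_minimal_subset(eta_d, theta_d):
                pipes[a].h /= 2.0
            for a in mark_minimal_subset(eta_m, theta_m):
                pipes[a].level = 1                     # simplified rule (12)
            y = solve_nlp(pipes)
            eta_d, eta_m = estimators(pipes, y)
        for a in mark_maximal_subset(eta_d, phi_d):    # coarsen marked grids
            pipes[a].h *= 2.0
        for a in mark_maximal_subset(eta_m, phi_m):    # switch marked pipes down
            pipes[a].level = min(pipes[a].level + 1, 3)
        y = solve_nlp(pipes)
        eta_d, eta_m = estimators(pipes, y)
    return y
```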

To improve readability, we split the proof of our main theorem into two parts. The first lemma states finite termination at an ε-feasible point if only discretization grid refinements and coarsenings are applied, whereas the second lemma considers the case of switching levels in the model hierarchy only, i.e., with a fixed stepsize for every pipe.

Lemma 1

Suppose that the model level ℓa ∈ {1, 2, 3} is fixed for every pipe a ∈ Api. Let the resulting set of model levels be denoted by \(\mathcal {M}\). Suppose further that ηa(y) = ηd,a(y) holds in (11) and that every NLP is solved to local optimality. Consider Algorithm 1 without applying the model switching steps in Lines 10 and 17. Then, the algorithm terminates after a finite number of refinements in Line 11 and coarsenings in Line 18 with an ε-feasible solution with respect to the model level set \(\mathcal {M}\) if there exists a constant C > 0 such that

$$ \frac{1}{2} {\Theta}_{\text{d}}^{k} \mu^{k} > {\Phi}_{\text{d}}^{k} + C $$
(19)

holds for all k.

Proof

We consider the total discretization error

$$\eta_{\text{d}}(y) = \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d}, a}(y) $$

and show that for every iteration k the difference between the decrease obtained in the inner for-loop and the increase obtained due to the coarsenings applied in Line 18 is positive and uniformly bounded away from zero. In what follows, we only consider a single iteration and drop its index k for better readability.

First, we consider one refinement step in Line 11. Let \(\eta _{\text {d},a}^{j-1}\) denote the discretization error before the j th inner iteration and let \(\eta _{\text {d},a}^{j}\) denote the discretization error after the j th inner iteration. With this, we have

$$\begin{array}{@{}rcl@{}} && \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d},a}^{j-1} - \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d},a}^{j}\\ =&&\sum\limits_{a \in \mathcal{R}_{j}} \eta_{\text{d},a}^{j-1} + \sum\limits_{a \in A_{\text{pi}} \setminus \mathcal{R}_{j}} \eta_{\text{d},a}^{j-1} - \sum\limits_{a \in \mathcal{R}_{j}} \eta_{\text{d},a}^{j} - \sum\limits_{a \in A_{\text{pi}} \setminus \mathcal{R}_{j}} \eta_{\text{d},a}^{j}\\ = && \sum\limits_{a \in \mathcal{R}_{j}} \eta_{\text{d},a}^{j-1} - \sum\limits_{a \in \mathcal{R}_{j}} \eta_{\text{d},a}^{j}\\ = && \sum\limits_{a \in \mathcal{R}_{j}} \frac{1}{2} \eta_{\text{d},a}^{j-1} \end{array} $$

for every \(j = 1, \dots , \mu \). For the last equality, we have used that the implicit Euler method has convergence order 1, which (for small stepsizes ha) implies \(\eta _{\text {d},a}^{j} = \frac {1}{2} \eta _{\text {d},a}^{j-1}\) when we take the new stepsize as half the current stepsize (see (14)). Summing up over all μ inner iterations, we obtain a telescopic sum and finally get an error decrease of

$$\sum\limits_{j = 1}^{\mu} \left( \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d},a}^{j-1} - \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d},a}^{j} \right) = \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d},a}^{0} - \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d},a}^{\mu} = \frac{1}{2} \sum\limits_{j = 1}^{\mu} \sum\limits_{a \in \mathcal{R}_{j}} \eta_{\text{d},a}^{j-1}. $$

We now consider the coarsening step. For this, let \(\eta _{\text {d},a}^{\mu }\) denote the discretization error before and \(\eta _{\text {d},a}^{\mu + 1}\) the discretization error after the coarsening step in Line 18. Using similar ideas as above, we obtain

$$\begin{array}{@{}rcl@{}} && \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d},a}^{\mu + 1} - \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d},a}^{\mu} \\ &= & \sum\limits_{a \in A_{\text{pi}} \setminus \mathcal{C}} \eta_{\text{d},a}^{\mu + 1} + \sum\limits_{a \in \mathcal{C}} \eta_{\text{d},a}^{\mu + 1} - \sum\limits_{a \in A_{\text{pi}} \setminus \mathcal{C}} \eta_{\text{d},a}^{\mu} - \sum\limits_{a \in \mathcal{C}} \eta_{\text{d},a}^{\mu}\\ &= & \sum\limits_{a \in \mathcal{C}} \eta_{\text{d},a}^{\mu + 1} - \sum\limits_{a \in \mathcal{C}} \eta_{\text{d},a}^{\mu} \\ &= & 2 \sum\limits_{a \in \mathcal{C}} \eta_{\text{d},a}^{\mu} - \sum\limits_{a \in \mathcal{C}} \eta_{\text{d},a}^{\mu}\\ &= & \sum\limits_{a \in \mathcal{C}} \eta_{\text{d},a}^{\mu}. \end{array} $$

Thus, we are finished if we prove that

$$\frac{1}{2} \sum\limits_{j = 1}^{\mu} \sum\limits_{a \in \mathcal{R}_{j}} \eta_{\text{d},a}^{j-1} - \sum\limits_{a \in \mathcal{C}} \eta_{\text{d},a}^{\mu} $$

is positive and uniformly bounded away from zero. Using

$$\eta_{\text{d},a}^{j-1} \geq \eta_{\text{d},a}^{\mu} \quad\text{for all } j = 1, \dots, \mu, $$

(15), (17), and (19), we obtain

$$\begin{array}{@{}rcl@{}} && \frac{1}{2} \sum\limits_{j = 1}^{\mu} \sum\limits_{a \in \mathcal{R}_{j}} \eta_{\text{d}, a}^{j-1} \geq \frac{1}{2} {\Theta}_{\text{d}} \sum\limits_{j = 1}^{\mu} \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d}, a}^{j-1} \geq \frac{1}{2} {\Theta}_{\text{d}} \sum\limits_{j = 1}^{\mu} \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d}, a}^{\mu}\\ &= & \frac{1}{2} {\Theta}_{\text{d}} \mu \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d}, a}^{\mu} > ({\Phi}_{\text{d}} + C) \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d}, a}^{\mu} > \sum\limits_{a \in \mathcal{C}} \eta_{\text{d}, a}^{\mu} + C \left|A_{\text{pi}}\right| \varepsilon, \end{array} $$

which completes the proof. □

Next, we prove an analogous lemma for the case that we fix the stepsize of every arc a ∈ Api and only allow for model switching.

Lemma 2

Suppose that the discretization stepsize ha is fixed for every pipe a ∈ Api. Suppose further that ηa(y) = ηm,a(y) holds in (11) and that every NLP is solved to local optimality. Consider Algorithm 1 without applying the discretization refinements in Line 11 and the coarsenings in Line 18. Then, Algorithm 1 terminates after a finite number of model switches in Lines 10 and 17 with an ε-feasible solution with respect to the stepsizes ha, a ∈ Api, if there exists a constant C > 0 such that

$$ {\Theta}_{\mathrm{m}}^{k} \mu^{k} > \tau^{k} {\Phi}_{\mathrm{m}}^{k} |A_{\text{pi}}| + C $$
(20)

holds for all k.

Proof

We consider the total model error

$$\eta_{\text{m}}(y) = \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{m}, a}(y) $$

and show that the difference between the decrease obtained in the inner loop and the increase obtained due to switching model levels down in Line 17 is positive and uniformly bounded away from zero for every iteration k. We again consider only a single iteration and drop the corresponding index.

First, we consider a single step of switching up the model level in Line 10. Let \(\eta _{\text {m},a}^{j-1}\) denote the model error before the j th inner iteration and \(\eta _{\text {m},a}^{j}\) the model error after the j th inner iteration. We then have

$$\begin{array}{@{}rcl@{}} && \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{m},a}^{j-1} - \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{m},a}^{j} \\ &= & \sum\limits_{a \in \mathcal{U}_{j}} \eta_{\text{m},a}^{j-1} + \sum\limits_{a \in A_{\text{pi}} \setminus \mathcal{U}_{j}} \eta_{\text{m},a}^{j-1} - \sum\limits_{a \in \mathcal{U}_{j}} \eta_{\text{m},a}^{j} - \sum\limits_{a \in A_{\text{pi}} \setminus \mathcal{U}_{j}} \eta_{\text{m},a}^{j} \\ &= & \sum\limits_{a \in \mathcal{U}_{j}} \eta_{\text{m},a}^{j-1} - \sum\limits_{a \in \mathcal{U}_{j}} \eta_{\text{m},a}^{j} \end{array} $$

for every \(j = 1, \dots , \mu \). Summing up over all j yields the overall model error decrease after μ for-loop iterations of

$$\sum\limits_{j = 1}^{\mu} \left( \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{m},a}^{j-1} - \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{m},a}^{j}\right) = \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{m},a}^{0} - \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{m},a}^{\mu} = \sum\limits_{j = 1}^{\mu} \sum\limits_{a \in \mathcal{U}_{j}} (\eta_{\text{m},a}^{j-1} - \eta_{\text{m},a}^{j}). $$

We now consider the step of switching down the model level in Line 17. Let \(\eta _{\text {m},a}^{\mu }\) denote the model error before and \(\eta _{\text {m},a}^{\mu + 1}\) the model error after this step. It holds that

$$\begin{array}{@{}rcl@{}} && \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{m},a}^{\mu+ 1} - \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{m},a}^{\mu} \\ &= & \sum\limits_{a \in \mathcal{D}} \eta_{\text{m},a}^{\mu+ 1} + \sum\limits_{a \in A_{\text{pi}} \setminus \mathcal{D}} \eta_{\text{m},a}^{\mu+ 1} - \sum\limits_{a \in \mathcal{D}} \eta_{\text{m},a}^{\mu} - \sum\limits_{a \in A_{\text{pi}} \setminus \mathcal{D}} \eta_{\text{m},a}^{\mu} \\ &= & \sum\limits_{a \in \mathcal{D}} \left( \eta_{\text{m},a}^{\mu+ 1} - \eta_{\text{m},a}^{\mu}\right). \end{array} $$

Thus, the proof is finished if we show that

$$\sum\limits_{j = 1}^{\mu} \sum\limits_{a \in \mathcal{U}_{j}} \left( \eta_{\text{m},a}^{j-1} - \eta_{\text{m},a}^{j}\right) - \sum\limits_{a \in \mathcal{D}} \left( \eta_{\text{m},a}^{\mu+ 1} - \eta_{\text{m},a}^{\mu}\right) $$

is positive and uniformly bounded away from zero. With similar ideas as in the proof of Lemma 1 and using (16), (18), and (20), we obtain

$$\begin{array}{@{}rcl@{}} && \sum\limits_{j = 1}^{\mu} \sum\limits_{a \in \mathcal{U}_{j}} \left( \eta_{\text{m},a}^{j-1} - \eta_{\text{m},a}^{j}\right) \geq {\Theta}_{\text{m}} \sum\limits_{j = 1}^{\mu} \sum\limits_{a \in A_{\text{pi}}^{>\varepsilon}} \left( \eta_{\text{m},a}^{j-1} - \eta_{\text{m},a}^{j}\right) > {\Theta}_{\text{m}} \sum\limits_{j = 1}^{\mu} \sum\limits_{a \in A_{\text{pi}}^{>\varepsilon}} \varepsilon\\ &= & {\Theta}_{\text{m}} \mu \left|A_{\text{pi}}^{>\varepsilon}\right| \varepsilon \geq {\Theta}_{\text{m}} \mu \varepsilon > \tau {\Phi}_{\text{m}} \left|A_{\text{pi}}\right| \varepsilon + C \varepsilon \geq {\Phi}_{\text{m}} \sum\limits_{a \in A_{\text{pi}}^{<\varepsilon}(\tau)} \varepsilon \tau + C \varepsilon\\ &\geq & {\Phi}_{\text{m}} \sum\limits_{a \in A_{\text{pi}}^{<\varepsilon}(\tau)} \left( \eta_{\text{m},a}^{\mu+ 1} - \eta_{\text{m},a}^{\mu}\right) + C \varepsilon \geq \sum\limits_{a \in \mathcal{D}} \left( \eta_{\text{m},a}^{\mu+ 1} - \eta_{\text{m},a}^{\mu}\right) + C \varepsilon, \end{array} $$

where we used that \(|A_{\text {pi}}^{>\varepsilon }| \geq 1\). This completes the proof. □

Let \(\eta _{\text {m},a}^{\text {new}}(y)\) denote the new model error estimator after a grid refinement or coarsening. In order to prove our main theorem we need to assume that, for every pipe a ∈ Api, the change in the model error estimator after a grid refinement or coarsening can be neglected as compared to ηm,a(y), i.e., \(|\eta _{\text {m},a}(y) - \eta _{\text {m},a}^{\text {new}}(y)| \ll \eta _{\text {m},a}(y)\), such that we may write \(\eta _{\text {m},a}^{\text {new}}(y) = \eta _{\text {m},a}(y)\). A sufficient condition for this assumption to hold is given by ηd,a(y) ≪ ηm,a(y) for every a ∈ Api. This condition also implies that ηm,a(y) is a first-order approximation of the exact model error em,a(y) and is thus reliable for small stepsizes ha.

Lemma 3

Let the discretization and model error estimators ηd,a(y) and ηm,a(y) as defined in (9) and (10) be given for every a ∈ Api. Let further em,a(y) be the exact error between models (M1) and (M\(_{\ell _{a}}\)) and let \(\eta _{\text {m},a}^{\text {new}}(y)\) be the new model error estimator after a grid refinement or coarsening. Then, the implications

  1. \(\eta _{\text {d},a}(y) \ll \eta _{\text {m},a}(y) \Longrightarrow \eta _{\text {m},a}(y) \doteq e_{\text {m},a}(y)\),

  2. \(\eta _{\text {d},a}(y) \ll \eta _{\text {m},a}(y) \Longrightarrow \eta _{\text {m},a}^{\text {new}}(y) = \eta _{\text {m},a}(y)\)

hold for every a ∈ Api.

Proof

Let pipe a ∈ Api be arbitrary. To improve readability, in the following we drop the dependencies of the exact errors and the error estimators on a and y. Without loss of generality, we consider only one arbitrary spatial gridpoint xk.

Let us first introduce some notation. The exact model error is given by \(e_{\text {m}}(x_{k}) = \hat {p}(x_{k}) - p^{\text {M}_{\ell _{a}}}(x_{k})\) for the current model level ℓa, the exact discretization error for model (D1) is given by \(e_{\text {d}}^{1}(x_{k}) = \hat {p}(x_{k}) - p^{1}(x_{k};2h_{a})\), and the exact discretization error for model (\(\text {D}_{\ell _{a}}\)) is denoted by \(e_{\text {d}}^{\ell _{a}}(x_{k}) = p^{\text {M}_{\ell _{a}}}(x_{k}) - p^{\ell _{a}}(x_{k};2h_{a})\). Furthermore, the model error estimator is given by \(\eta _{\text {m}}(x_{k}) = p^{1}(x_{k};2h_{a}) - p^{\ell _{a}}(x_{k};2h_{a})\) (see (10)), and we define the discretization error estimators \(\eta _{\text {d}}^{1}(x_{k}) := p^{1}(x_{k};2h_{a}) - p^{1}(x_{k};4h_{a})\) and \(\eta _{\text {d}}^{\ell _{a}}(x_{k}):= p^{\ell _{a}}(x_{k};2h_{a}) - p^{\ell _{a}}(x_{k};4h_{a})\) as in (9). Then, we have \(\eta _{\text {d}}^{1}(x_{k}) \doteq e_{\text {d}}^{1}(x_{k})\) and \(\eta _{\text {d}}^{\ell _{a}}(x_{k}) \doteq e_{\text {d}}^{\ell _{a}}(x_{k})\) (see [36, p. 420]). Further, it holds that

$$ |\eta_{\text{d}}^{1}(x_{k})| \ll |\eta_{\text{m}}(x_{k})| ~\Longleftrightarrow~ |\eta_{\text{d}}^{\ell_{a}}(x_{k})| \ll |\eta_{\text{m}}(x_{k})|, $$
(21)

because \(\eta _{\text {d}}^{1}(x_{k})\) and \(\eta _{\text {d}}^{\ell _{a}}(x_{k})\) use the same stepsizes 2ha and 4ha to compute the discrete pressure distributions.

We now prove the first implication. Using the previously defined notation, it holds that

$$\begin{array}{@{}rcl@{}} e_{\text{m}}(x_{k}) & =& \hat{p}(x_{k}) - p^{\text{M}_{\ell_{a}}}(x_{k}) \\ & =& e_{\text{d}}^{1}(x_{k}) + p^{1}(x_{k};2h_{a}) - e_{\text{d}}^{\ell_{a}}(x_{k}) - p^{\ell_{a}}(x_{k};2h_{a}) \\ & \doteq& \eta_{\text{d}}^{1}(x_{k}) + p^{1}(x_{k};2h_{a}) - \eta_{\text{d}}^{\ell_{a}}(x_{k}) - p^{\ell_{a}}(x_{k};2h_{a}) \\ & =& \eta_{\text{d}}^{1}(x_{k}) - \eta_{\text{d}}^{\ell_{a}}(x_{k}) + \eta_{\text{m}}(x_{k}). \end{array} $$

Thus, if \(|\eta _{\text {d}}^{1}(x_{k})|\) and \(|\eta _{\text {d}}^{\ell _{a}}(x_{k})|\) may be neglected as compared to \(|\eta _{\text {m}}(x_{k})|\), then we have \(e_{\text {m}}(x_{k}) \doteq \eta _{\text {m}}(x_{k})\), i.e.,

$$\left|\eta_{\text{d}}^{1}(x_{k})\right| \ll \left|\eta_{\text{m}}(x_{k})\right| \wedge \left|\eta_{\text{d}}^{\ell_{a}}(x_{k})\right| \ll \left|\eta_{\text{m}}(x_{k})\right| ~\Longrightarrow ~ e_{\text{m}}(x_{k}) \doteq \eta_{\text{m}}(x_{k}). $$

Considering also the equivalence relation (21), it follows that

$$\left|\eta_{\text{d}}^{1}(x_{k})\right| \ll \left|\eta_{\text{m}}(x_{k})\right| ~\Longrightarrow ~ e_{\text{m}}(x_{k}) \doteq \eta_{\text{m}}(x_{k}), $$

from which the first implication follows directly.

Finally, we prove the second implication. We show that this implication holds for the case that \(\eta _{\text {m}}^{\text {new}}(x_{k})\) is the new model error estimator after a grid coarsening. The case for a grid refinement can be shown analogously. It holds that

$$\begin{array}{@{}rcl@{}} \eta_{\text{m}}^{\text{new}}(x_{k}) & =& p^{1}(x_{k};4h_{a}) - p^{\ell_{a}}(x_{k};4h_{a}) \\ & =& - \eta_{\text{d}}^{1}(x_{k}) + p^{1}(x_{k};2h_{a}) + \eta_{\text{d}}^{\ell_{a}}(x_{k}) - p^{\ell_{a}}(x_{k};2h_{a}) \\ & =& - \eta_{\text{d}}^{1}(x_{k}) + \eta_{\text{d}}^{\ell_{a}}(x_{k}) + \eta_{\text{m}}(x_{k}). \end{array} $$

This yields

$$ \left|\eta_{\text{d}}^{1}(x_{k})\right| \ll \left|\eta_{\text{m}}(x_{k})\right| \wedge |\eta_{\text{d}}^{\ell_{a}}(x_{k})| \ll \left|\eta_{\text{m}}(x_{k})\right| ~\Longrightarrow~ \eta_{\text{m}}^{\text{new}}(x_{k}) = \eta_{\text{m}}(x_{k}). $$
(22)

Again, considering (21) and (22) results in

$$\left|\eta_{\text{d}}^{1}(x_{k})\right| \ll \left|\eta_{\text{m}}(x_{k})\right| ~\Longrightarrow ~\eta_{\text{m}}^{\text{new}}(x_{k}) = \eta_{\text{m}}(x_{k}), $$

from which the second implication follows immediately. □

With the three preceding lemmas at hand, we are now ready to state and prove our main theorem about finite termination of Algorithm 1.

Theorem 1

(Finite termination) Suppose that ηd,a ≪ ηm,a for every a ∈ Api and that every NLP is solved to local optimality. Then, Algorithm 1 terminates after a finite number of refinements, coarsenings, and model switches in Lines 10, 11, 17, and 18 with an ε-feasible solution with respect to the reference problem (2.4.1) if there exist constants C1, C2 > 0 such that

$$\frac{1}{2} {\Theta}_{\text{d}}^{k} \mu^{k} > {\Phi}_{\text{d}}^{k} + C_{1}, \quad {\Theta}_{\text{m}}^{k} \mu^{k} > \tau^{k} {\Phi}_{\text{m}}^{k} |A_{\text{pi}}| + C_{2} $$

hold for all k.

Proof

We consider the total error \({\sum }_{a \in A_{\text {pi}}} \eta _{a}\) and show that the difference between the decrease obtained in the inner loop and the increase obtained due to switching down the model level and coarsening the grid is positive and uniformly bounded away from zero for every iteration k. Again, we consider only a single iteration and drop the corresponding index. We first consider Lines 10 and 11 for fixed j. It holds that

$$\begin{array}{@{}rcl@{}} && \sum\limits_{a \in A_{\text{pi}}} \eta_{a}^{j-1} - \sum\limits_{a \in A_{\text{pi}}} {\eta_{a}^{j}}\\ &= & \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{m},a}^{j-1} + \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d},a}^{j-1} - \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{m},a}^{j} - \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d},a}^{j}\\ &= & \sum\limits_{a \in A_{\text{pi}} \setminus (\mathcal{U}_{j} \cup \mathcal{R}_{j})} \eta_{\text{m},a}^{j-1} - \sum\limits_{a \in A_{\text{pi}} \setminus (\mathcal{U}_{j} \cup \mathcal{R}_{j})} \eta_{\text{m},a}^{j} + \sum\limits_{a \in \mathcal{U}_{j} \setminus \mathcal{R}_{j}} \eta_{\text{m},a}^{j-1} - \sum\limits_{a \in \mathcal{U}_{j} \setminus \mathcal{R}_{j}} \eta_{\text{m},a}^{j}\\ && + \sum\limits_{a \in \mathcal{R}_{j} \setminus \mathcal{U}_{j}} \eta_{\text{m},a}^{j-1} - \sum\limits_{a \in \mathcal{R}_{j} \setminus \mathcal{U}_{j}} \eta_{\text{m},a}^{j} + \sum\limits_{a \in \mathcal{R}_{j} \cap \mathcal{U}_{j}} \eta_{\text{m},a}^{j-1} - \sum\limits_{a \in \mathcal{R}_{j} \cap \mathcal{U}_{j}} \eta_{\text{m},a}^{j} \\ && + \sum\limits_{a \in A_{\text{pi}} \setminus (\mathcal{U}_{j} \cup \mathcal{R}_{j})} \eta_{\text{d},a}^{j-1} - \sum\limits_{a \in A_{\text{pi}} \setminus (\mathcal{U}_{j} \cup \mathcal{R}_{j})} \eta_{\text{d},a}^{j} + \sum\limits_{a \in \mathcal{U}_{j} \setminus \mathcal{R}_{j}} \eta_{\text{d},a}^{j-1} - \sum\limits_{a \in \mathcal{U}_{j} \setminus \mathcal{R}_{j}} \eta_{\text{d},a}^{j}\\ && + \sum\limits_{a \in \mathcal{R}_{j} \setminus \mathcal{U}_{j}} \eta_{\text{d},a}^{j-1} - \sum\limits_{a \in \mathcal{R}_{j} \setminus \mathcal{U}_{j}} \eta_{\text{d},a}^{j} + \sum\limits_{a \in \mathcal{R}_{j} \cap \mathcal{U}_{j}} \eta_{\text{d},a}^{j-1} - \sum\limits_{a \in \mathcal{R}_{j} \cap \mathcal{U}_{j}} \eta_{\text{d},a}^{j} \\ &= & \sum\limits_{a \in \mathcal{U}_{j}} \eta_{\text{m},a}^{j-1} - \sum\limits_{a \in \mathcal{U}_{j}} \eta_{\text{m},a}^{j} + \sum\limits_{a \in \mathcal{R}_{j}} \eta_{\text{d},a}^{j-1} - \sum\limits_{a \in \mathcal{R}_{j}} \eta_{\text{d},a}^{j}\\ &= & \sum\limits_{a \in \mathcal{U}_{j}} \left( \eta_{\text{m},a}^{j-1} - \eta_{\text{m},a}^{j}\right) + \frac{1}{2} \sum\limits_{a \in \mathcal{R}_{j}} \eta_{\text{d},a}^{j-1}, \end{array} $$

where we use that \(\eta _{\text {m},a}^{j} = \eta _{\text {m},a}^{j-1}\) for every \(a \in \mathcal {R}_{j} \setminus \mathcal {U}_{j}\) since \(\eta _{\text {d},a}^{j-1} \ll \eta _{\text {m},a}^{j-1}\) for every a ∈ Api (see Lemma 3). Moreover, the discretization error estimator ηd,a does not change after switching up the model level.

Again, summing up over all \(j = 1, \dots , \mu \) yields the overall error decrease after μ for-loop iterations of

$$\begin{array}{@{}rcl@{}} && \sum\limits_{j = 1}^{\mu} \left( \sum\limits_{a \in A_{\text{pi}}} \eta_{a}^{j-1} - \sum\limits_{a \in A_{\text{pi}}} {\eta_{a}^{j}}\right) = \sum\limits_{a \in A_{\text{pi}}} {\eta_{a}^{0}} - \sum\limits_{a \in A_{\text{pi}}} \eta_{a}^{\mu} \\ &= & \sum\limits_{j = 1}^{\mu} \left( \sum\limits_{a \in \mathcal{U}_{j}} \left( \eta_{\text{m},a}^{j-1} - \eta_{\text{m},a}^{j}\right) + \frac{1}{2} \sum\limits_{a \in \mathcal{R}_{j}} \eta_{\text{d},a}^{j-1}\right). \end{array} $$

With similar arguments as before for Lines 10 and 11, we consider Lines 17 and 18 and obtain

$$\begin{array}{@{}rcl@{}} && \sum\limits_{a \in A_{\text{pi}}} \eta_{a}^{\mu + 1} - \sum\limits_{a \in A_{\text{pi}}} \eta_{a}^{\mu}\\ &= & \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d},a}^{\mu + 1} + \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{m},a}^{\mu + 1} - \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d},a}^{\mu} - \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{m},a}^{\mu}\\ &= & \sum\limits_{a \in \mathcal{C}} \eta_{\text{d},a}^{\mu + 1} - \sum\limits_{a \in \mathcal{C}} \eta_{\text{d},a}^{\mu} + \sum\limits_{a \in \mathcal{D}} \eta_{\text{m},a}^{\mu + 1} - \sum\limits_{a \in \mathcal{D}} \eta_{\text{m},a}^{\mu}\\ &= & \sum\limits_{a \in \mathcal{C}} \eta_{\text{d},a}^{\mu} + \sum\limits_{a \in \mathcal{D}} \left( \eta_{\text{m},a}^{\mu + 1} - \eta_{\text{m},a}^{\mu}\right). \end{array} $$

Finally, it remains to prove that

$$\sum\limits_{j = 1}^{\mu} \left( \sum\limits_{a \in \mathcal{U}_{j}} \left( \eta_{\text{m},a}^{j-1} - \eta_{\text{m},a}^{j}\right) + \frac{1}{2} \sum\limits_{a \in \mathcal{R}_{j}} \eta_{\text{d},a}^{j-1}\right) - \sum\limits_{a \in \mathcal{C}} \eta_{\text{d},a}^{\mu} - \sum\limits_{a \in \mathcal{D}} \left( \eta_{\text{m},a}^{\mu + 1} - \eta_{\text{m},a}^{\mu}\right) $$

is positive and uniformly bounded away from zero. Using the proofs of Lemmas 1 and 2, we have

$$\begin{array}{@{}rcl@{}} && \sum\limits_{j = 1}^{\mu} \sum\limits_{a \in \mathcal{U}_{j}} \left( \eta_{\text{m},a}^{j-1} - \eta_{\text{m},a}^{j}\right) + \frac{1}{2} \sum\limits_{j = 1}^{\mu} \sum\limits_{a \in \mathcal{R}_{j}} \eta_{\text{d},a}^{j-1}\\ &> & {\Theta}_{\text{m}} \mu \varepsilon + \frac{1}{2} \mu {\Theta}_{\text{d}} \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d}, a}^{\mu} \\ & > & \tau {\Phi}_{\text{m}} \left|A_{\text{pi}}\right| \varepsilon + C_{2} \varepsilon + ({\Phi}_{\text{d}} + C_{1}) \sum\limits_{a \in A_{\text{pi}}} \eta_{\text{d}, a}^{\mu} \\ &> & {\sum}_{a \in \mathcal{D}} \left( \eta_{\text{m},a}^{\mu+ 1} - \eta_{\text{m},a}^{\mu}\right) + C_{2} \varepsilon + \sum\limits_{a \in \mathcal{C}} \eta_{\text{d}, a}^{\mu} + C_{1} \left|A_{\text{pi}}\right| \varepsilon, \end{array} $$

which completes the proof. □

4.4 Remarks

Before we close this section, we discuss some details and extensions regarding Algorithm 1. First, we give an overview of the main computations that are performed in the algorithm. In Lines 2 and 12, the NLP (8a) is solved using the current model level ℓa and the current stepsize ha for every pipe a ∈ Api. Most types of NLP algorithms are iterative methods. That is, the computational costs of the algorithms depend on the number of iterations required to converge to a (local) optimal solution and the costs per iteration. The latter mainly consist of the solution of a linear system (e.g., suitable forms of the KKT system for interior-point or active-set methods) for computing the search direction. The size of this linear system typically is \(\mathcal {O}(n+m)\), where n is the number of variables and m is the number of constraints of the NLP. Both n and m are directly controlled by the stepsizes ha that we use in our NLP models. The model level ℓa mainly determines the sparsity/density of the system matrices of the linear systems and the overall nonlinearity of the NLP, which typically influences the number of required iterations.

In Lines 3 and 13, the overall error estimator ηa(y) is computed for every pipe a ∈ Api. Thus, for all pipes, the solution of model (D1) is computed with both stepsizes 2ha and 4ha and the solution of model (\(\text {D}_{\ell _{a}}\)) is computed with stepsize ha. These solutions are obtained by solving the initial value problems consisting of the ordinary differential equations (M1) and (\(\text {M}_{\ell _{a}}\)) together with the initial value p(x0), which is contained in the optimal solution y of Problem (8a). Continuing with the example of the implicit Euler method that we use as numerical integration scheme throughout this paper, the initial value problems can be solved (i) by considering the implicit equations in (D1) and (\(\text {D}_{\ell _{a}}\)) and using, e.g., the Newton method to solve for pk in every space integration step, or (ii) by using an existing software code and setting the order of the numerical integration scheme to one.

The subset \(\mathcal {R}\) in Line 9 can be determined efficiently, since ηd,a(y) has already been computed in Line 3 or 13 for every a ∈ Api. For subset \(\mathcal {U}\) in Line 9 and in (16), the error estimator ηm,a(y) has also already been computed in Line 3 or 13 for every a ∈ Api. Moreover, \(\ell _{a}^{\text {new}}\) in (12) has to be computed in order to determine \(\mathcal {U}\). For this, we compute ηm,a(y;ℓa − 1) if and only if ℓa = 3. In the case ℓa = 2 we have \(\eta _{\text {m},a}(y;\ell _{a}-1) = 0\) and for ℓa = 1 we have \(\eta _{\text {m},a}(y;\ell _{a}^{\text {new}}) = \eta _{\text {m},a}(y;\ell _{a}) = 0\). Subset \(\mathcal {C}\) in Line 16 can also be computed efficiently, since ηd,a(y) has already been computed in Line 3 or 13 for every a ∈ Api. For subset \(\mathcal {D}\) in Line 16 and in (18), the error estimator ηm,a(y) has been computed already in Line 3 or 13 for every a ∈ Api. If ℓa ∈ {1, 2}, then \(\eta _{\text {m},a}(y;\ell _{a}+ 1)\) has to be computed for every a ∈ Api in order to determine \(\mathcal {D}\).

We note that the optimal solution y of Problem (8a) contains, among other data, the model level ℓa, the stepsize ha, and the pressure \(p^{\ell _{a}}(x_{0})\) at the beginning of the pipe for every \(a \in A_{\text {pi}}\). Using ℓa, ha, and \(p^{\ell _{a}}(x_{0})\), the discretization and model error estimators for pipe \(a \in A_{\text {pi}}\) can be computed without information from other pipes. Hence, the error estimators, e.g., in Line 13, can be computed in parallel.

Up to now, we have discussed two types of errors: modeling and discretization errors. Both are handled by Algorithm 1 and we have shown that the algorithm terminates with a combined model and discretization error that satisfies a user-specified error tolerance ε > 0. What we have ignored so far is that the NLPs are also solved by a numerical method that introduces numerical errors as well. However, it is easy to integrate the control of this additional error source into Algorithm 1. Let εopt > 0 be the optimality tolerance that we hand over to the optimization solver and suppose that the solver always satisfies this tolerance. Furthermore, let the tolerance ε considered so far now be denoted by εdm. Using the triangle inequality, we easily see that the total error (that is, the aggregated modeling, discretization, and optimization error) is bounded by εopt + εdm. Hence, in order to satisfy an overall error tolerance ε > 0, we have to ensure that εopt + εdm ≤ ε holds, which can be formally introduced in Algorithm 1 by replacing ε with εopt + εdm.

Finally, note that this additional error source directly suggests itself for adaptive treatment as well. In the early iterations of Algorithm 1, it is not important that εopt is small. That is, the optimization is allowed to produce coarser approximate local solutions. However, in the course of the algorithm, one can observe the achieved modeling and discretization error and can adaptively tighten the optimization tolerance. Since this strategy allows the optimization method to produce coarse approximate solutions in the beginning, it can be expected that this leads to a speed-up in the overall running times of Algorithm 1.

The choice of the error tolerance ε that has to be provided in Algorithm 1 will depend on the user requirements; however, one should be aware that, due to the round-off errors committed during every single step of the procedure and due to possible ill-conditioning of the linear systems solved by the NLP solver, none of the three errors (the discretization error, the modeling error, and the NLP error) can be required to be extremely small. Since the backward error and the associated condition number of the linear systems can be estimated during the procedure (see [17]) and since the error estimates for the discretization method are at hand, it is just the modeling error that is not known a priori. To estimate this latter error (of the finest model) usually requires a comparison with experimental data. If such data are available during a real-world process, then it is possible to adjust the required tolerance ε in a feedback loop using a standard PI controller (see, e.g., [30]), i.e., if measured data show that the finest model has a given accuracy, then ε should not be chosen smaller than this accuracy.

Finally, we want to stress that the described adaptive error control algorithm can be used with any number of model levels in the hierarchy, with any higher order discretization scheme, and with any number of grid refinement levels.

5 Computational Results

In this section, we present numerical results obtained by the adaptive error control algorithm. To this end, we compare the efficiency of the method with an approach that directly solves an NLP that satisfies the same error tolerance and that is obtained without using adaptivity. Before we discuss the results in detail, we briefly mention the computational setup and the gas transport network instances that we solve.

We implemented the adaptive error control Algorithm 1 in Python 2.7.13 and used the scipy 0.14.0 module for solving the initial value problems. All nonlinear optimization models have been implemented using the C++ framework LaMaTTO++ for modeling and solving mixed-integer nonlinear optimization problems on networks. The computations have been done on a six-core AMD Opteron 2435 processor with 2.2 GHz and 64 GB of main memory. The NLPs have been solved using Ipopt 3.12 (see [39, 40]).

For our computational study, we choose publicly available GasLib instances (see [33]). This has the advantage that, if desired, all numerical results can be reproduced on the same data. In what follows, we consider the networks GasLib-40 and GasLib-135, since these are the largest networks in the GasLib that only contain pipes and compressor stations as arc types. Detailed statistics are given in Table 1.

Table 1 Statistics for the instances

Next, we describe the parameterization of Algorithm 1. We initialize every pipe \(a \in A_{\text{pi}}\) with the coarsest model level \(\ell_a = 3\) and with the coarsest possible discretization grid. In order to yield a well-defined algorithm, the number of discretization grid intervals has to be a multiple of 4 (see Fig. 2). Thus, we initially set \(h_a = L_a/4\) and ensure in Step 18 of Algorithm 1 that we never obtain a coarser grid size than the initial one. The overall tolerance is set to \(\varepsilon = 10^{-4}\) bar. Moreover, we set \(\Theta_{\text{d}} = \Theta_{\text{m}} = 0.7\), \(\Phi_{\text{d}} = \Phi_{\text{m}} = 0.3\), \(\tau = 1.1\), and \(\mu = 4\). Here, we refrain from updating these parameters from iteration to iteration, which would be possible in general. Note that our parameter choice violates the second inequality of Theorem 1. This could be fixed by simply increasing the hysteresis parameter μ. However, we refrain from using a larger μ in order to give the adaptive algorithm more chances to also switch down in the model hierarchy or to coarsen discretization grids. Our numerical experiments show that the violation of the second inequality of Theorem 1 does not harm convergence in practice but leads to slightly faster computations.
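
For illustration, this parameterization can be collected in a single configuration object; the key names below are placeholders and do not stem from the actual implementation.

    params = {
        "initial_model_level": 3,          # coarsest model for every pipe
        "initial_intervals": 4,            # h_a = L_a / 4
        "eps": 1e-4,                       # overall tolerance in bar
        "theta_d": 0.7, "theta_m": 0.7,    # refinement / switch-up marking
        "phi_d": 0.3, "phi_m": 0.3,        # coarsening / switch-down marking
        "tau": 1.1,                        # see Theorem 1
        "mu": 4,                           # hysteresis parameter
    }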

The same rationale holds for the relation between model and discretization error as assumed in Theorem 1 (see also Lemma 3). To be fully compliant with the theory, the initial discretization grids need to be much finer. Again, coarser initial discretization grids do not harm convergence in our numerical experiments but yield much faster computations.

We now turn to the discussion of the numerical results. Both instances are solved within 8 iterations. Thus, together with the initially solved NLP, 9 NLPs have to be solved for each of the two instances.

Using the adaptive control algorithm, it takes 3.82 s to solve the GasLib-40 instance and 7.50 s to solve the GasLib-135 instance. For the GasLib-40 network, the final NLP contains 2026 variables and 1988 constraints, whereas for the GasLib-135 network it contains 3405 variables and 3271 constraints.

Most interesting is the speed-up that we obtain by using the adaptive control algorithm. To quantify it, we compare the solution times given above with the solution times for an NLP that satisfies the same error tolerances but that is obtained without using model level and discretization grid adaptivity. This NLP contains 40 034 variables and 39 996 constraints for the GasLib-40 instance and 144 757 variables as well as 144 623 constraints for the GasLib-135 instance. Compared to the final NLPs that have to be solved within the adaptive algorithm, the NLPs obtained without using adaptivity are quite large scale. This directly translates into longer solution times: the GasLib-40 instance requires 53.11 s and the GasLib-135 instance requires 122.42 s. Thus, we obtain speed-up factors of 13.89 and 16.33, respectively.

Figure 4 illustrates the adaptivity of the algorithm by plotting how many pipe grids are refined (\(|\mathcal {R}|\)) and how many pipe models are switched up in the hierarchy (\(|\mathcal {U}|\)). It can be clearly seen that increasing the accuracy is only needed for a small fraction of the pipes. For the GasLib-40 network, we never refine grids for more than 9 pipes, whereas we never refine grids for more than 21 pipes for the GasLib-135 network. Thus, for the larger network, we never refine grids for more than 15% of all pipes.

Fig. 4: Number of pipes with refined grid (y-axis; \(|\mathcal{R}|\)) and number of pipes where the model is switched up in the model hierarchy (y-axis; \(|\mathcal{U}|\)) over the course of the iterations (x-axis). Left: GasLib-40, right: GasLib-135

For both networks, Lines 17 and 18 are only reached once. For the smaller network, only 1 pipe grid is coarsened, whereas 3 pipe grids are coarsened for the larger network. Moreover, the algorithm never switches down in the model hierarchy. Consequently, the NLPs grow from iteration to iteration, which yields increasing running times for the NLP solver, as depicted in Fig. 5. Nevertheless, the subsequent NLPs can still be solved quite fast. There are two main reasons for this. First, the size of the NLPs only increases moderately due to the adaptive control strategy. Second, the overall algorithm allows for warm-starting: when solving a single NLP, we always use the previous NLP's solution to set up the initial iterate.
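
As an illustration of the warm start (assumed here, not taken from the LaMaTTO++ implementation), the pressure profile of a refined pipe from the previous solution can be mapped onto the new grid by interpolation, while unchanged pipes keep their values exactly:

    import numpy as np

    def warm_start_profile(prev_values, prev_grid, new_grid):
        """Linearly interpolate the previous per-pipe profile onto the refined
        grid to initialize the next NLP's iterate."""
        return np.interp(new_grid, prev_grid, prev_values)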

Fig. 5: Aggregated run times (y-axis; in s) required for solving the nonlinear optimization problems (NLPs) and the initial value problems (IVPs) for the computation of the error estimates. Left: GasLib-40, right: GasLib-135

Lastly, we consider the decrease in the respective errors. In Fig. 6, the discretization, model, and total errors are plotted over the course of the iterations. Both profiles show the expected decrease in the errors.

Fig. 6: Discretization, model, and total error estimates (y-axis) over the course of the iterations (x-axis). Left: GasLib-40, right: GasLib-135

6 Conclusion

We have considered the problem of operation cost minimization for gas transport networks. In this context, we have focused on stationary and isothermal models and developed an adaptive model and discretization error control algorithm for nonlinear optimization that uses a hierarchy of continuous and finite-dimensional models. Out of this hierarchy, the new method adaptively chooses different models in order to finally achieve an optimal solution that satisfies a prescribed combined model and discretization error tolerance. The algorithm is shown to be convergent and its performance is illustrated by several numerical results.

The results pave the way for future work in the context of model switching and discretization grid adaptation for nonlinear optimal control. On the one hand, the approach should be extended to non-isothermal and instationary models of gas transport, in particular in a port-Hamiltonian formulation. On the other hand, it would be interesting to extend the new technique to mixed-integer nonlinear optimal control.