
1 Introduction

In 2004, the first author of this chapter was awarded the SIAM Von Kármán Prize for his various contributions to Computational Fluid Dynamics, in particular the direct numerical simulation of particulate flow. He was consequently asked by SIAM to contribute an article to SIAM Review, related to the Von Kármán lecture he gave at the 2004 SIAM meeting in Portland, Oregon. Since operator-splitting played a crucial role in the results presented during his Portland lecture, he decided to write, jointly with several collaborators (including the second author), a review article on operator-splitting methods, illustrated by several selected applications. One of the main motivations for that review article was that, to the best of our knowledge at the time, the last comprehensive publication on the subject was [121], a book-size article (266 pages) published in 1990 in Volume I of the Handbook of Numerical Analysis. Our article was rejected, on the grounds that it was untimely. What is ironic is that the very day (in August 2005) we received the rejection e-mail message, we were meeting with computational scientists at Los Alamos National Laboratory (LANL), who told us that one of their main priorities was further investigating the various properties of operator-splitting methods, since these methods were (and still are) applied at LANL to solve a large variety of challenging, mostly multi-physics, problems. Another event emphasizing the importance of operator-splitting methods was the December 2005 conference at Rice University in Houston, commemorating “50 Years of Alternating-Direction Methods” and honoring J. Douglas, D. Peaceman and H. Rachford, the inventors of those particular operator-splitting methods bearing their name.
Actually, it was striking to observe during this conference that, at the time, most members of the Partial Differential Equations and Optimization communities were unaware that most alternating-direction methods for initial value problems are closely related to primal-dual algorithms such as ADMM (the Alternating Direction Method of Multipliers). In order to create a bridge between these two communities, we updated the failed SIAM Review paper and submitted it elsewhere, leading to [73] (clearly, a publication in an SIAM journal would have had more impact worldwide). Our goal in this chapter is to present a (kind of) updated variant of [73], less CFD (resp., more ADMM) oriented. It will contain in particular applications to Imaging, a topic barely mentioned in reference [73]. The content of this chapter is as follows:

In Section 2, we will discuss the numerical solution of initial value problems by operator-splitting time-discretization schemes such as Peaceman-Rachford’s, Douglas-Rachford’s, Lie’s, Strang’s, Marchuk-Yanenko’s, and by the fractional θ-scheme, a three-stage variation, introduced in [67] and [68], of Peaceman-Rachford’s scheme. We will conclude this section with some remarks on the parallelization of operator-splitting schemes.

Section 3 will be dedicated to augmented Lagrangian and ADMM algorithms. We will show in particular that some augmented Lagrangian and ADMM algorithms are nothing but disguised operator-splitting methods (justifying thus the ADMM terminology).

Following [73], we will discuss in Section 4 the operator-splitting based direct numerical simulation of particulate flow, in the particular case of mixtures of incompressible viscous fluids and rigid solid particles.

In Section 5, we will discuss the application of operator-splitting methods to the solution of two problems from Physics, namely the Gross-Pitaevskii equation, a nonlinear Schrödinger equation modeling Bose-Einstein condensates, and the Zakharov system, a model for the propagation of Langmuir waves in ionized plasma.

Next, in Section 6, we will discuss applications of augmented Lagrangian and ADMM algorithms to the solution of problems from Imaging, a highly popular topic nowadays (actually, the renewed interest in ADMM type algorithms that we observe currently can be largely explained by their application to Image Processing; see [156, 170]).

Finally, in Section 7, we will return to various issues that we left behind in the preceding sections of this chapter: these include augmentation parameter selection, an analysis of the asymptotic behavior of the Peaceman-Rachford and Douglas-Rachford schemes, and various comments concerning high order accurate operator-splitting schemes. Also, owing to the fact that one of the success stories of operator-splitting methods has been the numerical solution of the Navier-Stokes equations modeling viscous flow, we will conclude this section (and the chapter) by providing a (non-exhaustive) list of related references.

In addition to all the other chapters of this volume, material related to operator-splitting, augmented Lagrangian and ADMM algorithms can be found in [72] (see also the references therein). More references will be given in the following sections.

2 Operator-Splitting Schemes for the Time Discretization of Initial Value Problems

2.1 Generalities

Let us consider the following autonomous initial value problem:

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \quad & \dfrac{d\phi } {dt} + A(\phi ) = 0\ \mathrm{on}\ (0,T)\ (\mathrm{with}\ 0 < T \leq +\infty ), \phantom{1} \\ \quad &\phi (0) =\phi _{0}.\end{array} \right. }$$
(2.1)

Operator A maps the vector space V into itself and we suppose that ϕ 0 ∈ V. We suppose also that A has a nontrivial decomposition such as

$$\displaystyle{ A =\sum _{ j=1}^{J}A_{ j}, }$$
(2.2)

with J ≥ 2 (by nontrivial we mean that the operators A j are individually simpler than A).

A question which arises naturally is clearly:

Can we take advantage of decomposition (2.2) for the solution of (2.1)?

It has been known for many years (see for example [36]) that the answer to the above question is definitely yes.

Many schemes have been designed to take advantage of the decomposition (2.2) when solving (2.1); several of them will be briefly discussed in the following paragraphs.

2.2 Time-Discretization of (2.1) by Lie’s Scheme

Let △t( > 0) be a time-discretization step (for simplicity, we suppose △t fixed); we denote n△t by t n. With ϕ n denoting an approximation of ϕ(t n), Lie’s scheme reads as follows (for its derivation see, e.g., [70] (Chapter 6) and Chapter 1, Section 2, of this book):

$$\displaystyle{ \phi ^{0} =\phi _{ 0}; }$$
(2.3)

then, for n ≥ 0, ϕ n → ϕ n+1 via

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \quad &\dfrac{d\phi _{j}} {dt} + A_{j}(\phi _{j}) = 0\ \mbox{ on}\ (t^{n},t^{n+1}), \\ \quad &\phi _{j}(t^{n}) =\phi ^{n+(j-1)/J};\phi ^{n+j/J} =\phi _{j}(t^{n+1}),\end{array} \right. }$$
(2.4)

for j = 1, …, J.

If (2.1) is taking place in a finite dimensional space and if the operators A j are smooth enough, then ∥ ϕ(t n) −ϕ n ∥  = O(△t), function ϕ being the solution of (2.1).
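To make the O(△t) claim concrete, here is a minimal Python sketch of ours (not from the text): Lie’s scheme applied to the scalar model problem dϕ∕dt + aϕ + bϕ² = 0, split as A 1(ϕ) = aϕ and A 2(ϕ) = bϕ² (the values a, b, ϕ 0, T are arbitrary test choices). Both sub-problems are solved in closed form, so the error measured below is pure splitting error.

```python
# Lie splitting (2.3)-(2.4) on dphi/dt + a*phi + b*phi**2 = 0,
# with A1(phi) = a*phi and A2(phi) = b*phi**2; both sub-flows exact.
import math

a, b, phi0, T = 1.0, 0.5, 1.0, 1.0

def sub1(phi, dt):            # exact flow of dphi/dt + a*phi = 0
    return phi * math.exp(-a * dt)

def sub2(phi, dt):            # exact flow of dphi/dt + b*phi**2 = 0
    return phi / (1.0 + b * phi * dt)

def lie(n_steps):
    dt, phi = T / n_steps, phi0
    for _ in range(n_steps):
        phi = sub2(sub1(phi, dt), dt)   # A1 sub-step, then A2 sub-step
    return phi

def exact(t):                 # closed-form solution (Bernoulli equation)
    e = math.exp(-a * t)
    return a * phi0 * e / (a + b * phi0 * (1.0 - e))

e1 = abs(lie(100) - exact(T))
e2 = abs(lie(200) - exact(T))
print(e1, e2)   # halving dt roughly halves the error: O(dt) accuracy
```

The two sub-flows do not commute here (A 2 is nonlinear), which is why the first-order splitting error is visible.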

Remark 1.

The above scheme applies also for multivalued operators (such as the subdifferentials of proper lower semi-continuous convex functionals), but in such a case first order accuracy is not guaranteed anymore. A related application will be given in Section 2.7.

Remark 2.

The above scheme is easy to generalize to non-autonomous problems by observing that

$$\displaystyle{\left \{\begin{array}{l} \dfrac{d\phi } {dt} + A(\phi,t) = 0, \\ \phi (0) =\phi _{0}\end{array} \right. \Leftrightarrow \left \{\begin{array}{l} \dfrac{d\phi } {dt} + A(\phi,\theta ) = 0, \\ \dfrac{d\theta } {dt} - 1 = 0, \\ \phi (0) =\phi _{0},\theta (0) = 0. \end{array} \right.}$$

Remark 3.

Scheme (2.3)–(2.4) is semi-constructive in the sense that we still have to solve the initial value sub-problems in (2.4) for each j. Suppose that we discretize these sub-problems using just one step of the backward Euler scheme. The resulting scheme reads as follows:

$$\displaystyle{ \phi ^{0} =\phi _{ 0}; }$$
(2.5)

then, for n ≥ 0, ϕ n → ϕ n+1 via the solution of

$$\displaystyle{ \dfrac{\phi ^{n+j/J} -\phi ^{n+(j-1)/J}} {\bigtriangleup t} + A_{j}(\phi ^{n+j/J}) = 0, }$$
(2.6)

for j = 1, …, J.

Scheme (2.5)–(2.6) is known as the Marchuk-Yanenko scheme (see, e.g., [121] and [70] (Chapter 6) for more details). Several chapters of this volume make use of the Marchuk-Yanenko scheme.
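A minimal sketch of ours, on the same kind of scalar test problem as before (dϕ∕dt + aϕ + bϕ² = 0, with A 1(ϕ) = aϕ, A 2(ϕ) = bϕ², and arbitrary test values), shows scheme (2.5)–(2.6) in action; each backward Euler sub-step is solved exactly (a division for A 1, a quadratic for A 2):

```python
# Marchuk-Yanenko scheme (2.5)-(2.6): one backward Euler step per
# sub-operator, on dphi/dt + a*phi + b*phi**2 = 0.
import math

a, b, phi0, T = 1.0, 0.5, 1.0, 1.0

def marchuk_yanenko(n_steps):
    dt, phi = T / n_steps, phi0
    for _ in range(n_steps):
        # implicit A1 step: (y - phi)/dt + a*y = 0
        phi = phi / (1.0 + a * dt)
        # implicit A2 step: (y - phi)/dt + b*y**2 = 0,
        # i.e. the positive root of b*dt*y**2 + y - phi = 0
        phi = (-1.0 + math.sqrt(1.0 + 4.0 * b * dt * phi)) / (2.0 * b * dt)
    return phi

def exact(t):                 # closed-form solution (Bernoulli equation)
    e = math.exp(-a * t)
    return a * phi0 * e / (a + b * phi0 * (1.0 - e))

e1 = abs(marchuk_yanenko(100) - exact(T))
e2 = abs(marchuk_yanenko(200) - exact(T))
print(e1, e2)   # error roughly halves with dt: O(dt), as expected
```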

2.3 Time-Discretization of (2.1) by Strang’s Symmetrized Scheme

In order to improve the accuracy of Lie’s scheme, G. Strang suggested a symmetrized variant of scheme (2.3)–(2.4) (ref. [153]). When applied to non-autonomous problems, in the case where J = 2, we obtain (with t n+1∕2 = (n + 1∕2)△t):

$$\displaystyle{ \phi ^{0} =\phi _{ 0}; }$$
(2.7)

then, for n ≥ 0, \(\phi ^{n} \rightarrow \phi ^{n+1/2} \rightarrow \widehat{\phi }^{n+1/2} \rightarrow \phi ^{n+1}\) via

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \quad &\dfrac{d\phi _{1}} {dt} + A_{1}(\phi _{1},t) = 0\ \mbox{ on}\ (t^{n},t^{n+1/2}), \\ \quad &\phi _{1}(t^{n}) =\phi ^{n};\phi ^{n+1/2} =\phi _{1}(t^{n+1/2}),\end{array} \right. }$$
(2.8)
$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \quad &\dfrac{d\phi _{2}} {dt} + A_{2}(\phi _{2},t^{n+1/2}) = 0\ \mbox{ on}\ (0,\bigtriangleup t), \\ \quad &\phi _{2}(0) =\phi ^{n+1/2};\widehat{\phi }^{n+1/2} =\phi _{2}(\bigtriangleup t),\end{array} \right. }$$
(2.9)
$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \quad &\dfrac{d\phi _{1}} {dt} + A_{1}(\phi _{1},t) = 0\ \mbox{ on}\ (t^{n+1/2},t^{n+1}), \\ \quad &\phi _{1}(t^{n+1/2}) = \widehat{\phi }^{n+1/2};\phi ^{n+1} =\phi _{1}(t^{n+1}).\end{array} \right. }$$
(2.10)

If (2.1) is taking place in a finite dimensional space and if operators A 1 and A 2 are smooth enough, then ∥ ϕ(t n) −ϕ n ∥  = O( | △t | 2), function ϕ being the solution of (2.1).
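The second-order behavior can be checked on the same scalar test problem as in Section 2.2 (our own sketch, with arbitrary test values; the problem is autonomous, so the time arguments in (2.8)–(2.10) play no role). Each sub-flow is solved exactly; per time step the scheme is a half step with A 1, a full step with A 2, then another half step with A 1:

```python
# Strang symmetrized splitting (2.7)-(2.10) on
#   dphi/dt + a*phi + b*phi**2 = 0,
# with exact sub-flows for A1(phi)=a*phi and A2(phi)=b*phi**2.
import math

a, b, phi0, T = 1.0, 0.5, 1.0, 1.0

sub1 = lambda phi, dt: phi * math.exp(-a * dt)      # flow of A1
sub2 = lambda phi, dt: phi / (1.0 + b * phi * dt)   # flow of A2

def strang(n_steps):
    dt, phi = T / n_steps, phi0
    for _ in range(n_steps):
        phi = sub1(phi, dt / 2)      # half step with A1
        phi = sub2(phi, dt)          # full step with A2
        phi = sub1(phi, dt / 2)      # half step with A1
    return phi

def exact(t):                 # closed-form solution (Bernoulli equation)
    e = math.exp(-a * t)
    return a * phi0 * e / (a + b * phi0 * (1.0 - e))

e1 = abs(strang(50) - exact(T))
e2 = abs(strang(100) - exact(T))
print(e1, e2)   # halving dt divides the error by ~4: O(|dt|**2)
```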

Remark 4.

In order to preserve the second order accuracy of scheme (2.7)–(2.10) (assuming it holds) we have to solve the initial value problems in (2.8), (2.9) and (2.10) by schemes which are themselves (at least) second order accurate; these schemes are highly dependent on the properties of A 1 and A 2. The sub-problems (2.8), (2.9) and (2.10) are all particular cases of

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \quad & \dfrac{d\phi } {dt} + B(\phi,t) = 0\ \mbox{ on}\ (t_{0},t_{f}), \\ \quad &\phi (t_{0}) =\phi _{0}.\end{array} \right. }$$
(2.11)

Suppose now that B is a (positively) monotone operator; following [70] (Chapter 6), we advocate using for the numerical integration of (2.11) the second order implicit Runge-Kutta scheme below:

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \phi ^{0} =\phi _{0}; \quad \\ \mbox{ for $q = 0,\ldots,Q - 1$, $\phi ^{q} \rightarrow \phi ^{q+\theta } \rightarrow \phi ^{q+1-\theta }\rightarrow \phi ^{q+1}$ via} \quad \\ \left \{\begin{array}{@{}l@{\quad }l@{}} \quad &\dfrac{\phi ^{q+\theta } -\phi ^{q}} {\theta \tau } + B(\phi ^{q+\theta },t^{q+\theta }) = 0, \\ \quad &\phi ^{q+1-\theta } = \dfrac{1-\theta } {\theta } \phi ^{q+\theta } + \dfrac{2\theta - 1} {\theta } \phi ^{q}, \\ \quad &\dfrac{\phi ^{q+1} -\phi ^{q+1-\theta }} {\theta \tau } + B(\phi ^{q+1},t^{q+1}) = 0, \end{array} \right.\quad \end{array} \right. }$$
(2.12)

where in (2.12):

  • Q( ≥ 1) is an integer and \(\tau = \dfrac{t_{f} - t_{0}} {Q}\).

  • ϕ q+α is an approximation of ϕ(t q+α), with t q+α = t 0 + (q +α)τ.

  • \(\theta = 1 - \dfrac{1} {\sqrt{2}}\).

It is shown in [70] (Chapter 2) that the implicit Runge-Kutta scheme (2.12) is stiff A-stable and “nearly” third-order accurate. It has been used, in particular, in [70] and [162] for the numerical simulation of incompressible viscous flow.
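As a quick sanity check (our own sketch, with arbitrary test values), scheme (2.12) can be exercised on the scalar monotone problem dϕ∕dt + aϕ = 0, for which each implicit stage reduces to a division; the observed error ratio is consistent with second order accuracy:

```python
# Implicit Runge-Kutta scheme (2.12) with B(phi, t) = a*phi and
# theta = 1 - 1/sqrt(2); each implicit stage is a scalar division.
import math

a, phi0, t0, tf = 1.0, 1.0, 0.0, 1.0
theta = 1.0 - 1.0 / math.sqrt(2.0)

def rk_theta(Q):
    tau, phi = (tf - t0) / Q, phi0
    for _ in range(Q):
        s1 = phi / (1.0 + a * theta * tau)                       # first stage
        s2 = ((1 - theta) / theta) * s1 \
             + ((2 * theta - 1) / theta) * phi                   # middle stage
        phi = s2 / (1.0 + a * theta * tau)                       # last stage
    return phi

exact = phi0 * math.exp(-a * (tf - t0))
e1 = abs(rk_theta(50) - exact)
e2 = abs(rk_theta(100) - exact)
print(e1, e2)   # halving tau divides the error by ~4: second order
```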

Remark 5.

The main (if not the only) drawback of Strang’s symmetrized scheme (2.7)–(2.10) concerns its ability to capture the steady state solutions of (2.1) (when T = +∞), assuming that such solutions do exist. Indeed, the splitting error associated with scheme (2.7)–(2.10) prevents using large values of △t when integrating (2.1) from t = 0 to t = +∞; if the sequence {ϕ n} n ≥ 0 converges to a limit, this limit is not, in general, a steady state solution of (2.1), albeit being close to one for small values of △t (a similar comment applies also to the sequences {ϕ n+1∕2} n ≥ 0 and \(\{\widehat{\phi }^{n+1/2}\}_{n\geq 0}\)). A simple way to partly overcome this difficulty is to use variable time-discretization steps: for example, in (2.8), (2.9) and (2.10), one can replace △t by τ n (the sequence {τ n } n ≥ 0 verifying τ n  > 0, \(\lim _{n\rightarrow \infty }\tau _{n} = 0\) and \(\sum _{n=0}^{\infty }\tau _{ n} = +\infty \)), and then define t n+1 and t n+1∕2 by t n+1 = t n +τ n \(\forall n \geq 0\), t 0 = 0, and t n+1∕2 = t n +τ n∕2, respectively. A more sophisticated way to fix the asymptotic behavior of scheme (2.7)–(2.10) is to proceed as in the chapter by McNamara and Strang in this book (Chapter 3).

Remark 6.

More comments on scheme (2.7)–(2.10) can be found in, e.g., [70] (Chapter 6), [72] (Chapter 3) and various chapters of this volume, Chapter 3 in particular. Among these comments, the generalization of the above scheme to those situations where J ≥ 3 in (2.2) has been discussed. Conceptually, the case J ≥ 3 is no more complicated than J = 2. Focusing on J = 3, we can return (in a nonunique way) to the case J = 2 by observing that

$$\displaystyle\begin{array}{rcl} A& =& A_{1} + A_{2} + A_{3} = A_{1} + (A_{2} + A_{3}) = (A_{1} + A_{2}) + A_{3} \\ & =& (A_{1} + \frac{1} {2}A_{2}) + (\frac{1} {2}A_{2} + A_{3}). {}\end{array}$$
(2.13)

The first (resp., second and third) arrangement in (2.13) leads to 5 (resp., 7 and 9) initial value sub-problems per time step. Scheme (2.7)–(2.10), combined with the first arrangement in (2.13), has been applied in [81] to the computation of the periodic solution of a nonlinear integro-differential equation from Electrical Engineering.

2.4 Time-Discretization of (2.1) by Peaceman-Rachford’s Alternating Direction Method

Another candidate for the numerical solution of the initial value problem (2.1), or of its non-autonomous variant

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \quad & \dfrac{d\phi } {dt} + A(\phi,t) = 0\ \mathrm{on}\ (0,T), \\ \quad &\phi (0) =\phi _{0}.\end{array} \right. }$$
(2.14)

is provided, if J = 2 in (2.2), by the Peaceman-Rachford scheme (introduced in [139]). The idea behind the Peaceman-Rachford scheme is quite simple: with notation as in Sections 2.1, 2.2 and 2.3, one divides the time interval [t n, t n+1] into two sub-intervals of length △t∕2 using the mid-point t n+1∕2. Then, assuming that the approximate solution ϕ n is known at t n, one computes first ϕ n+1∕2 using over [t n, t n+1∕2] a scheme of the backward Euler type with respect to A 1 and of the forward Euler type with respect to A 2; one proceeds similarly over [t n+1∕2, t n+1], switching the roles of A 1 and A 2. The following scheme, due to Peaceman and Rachford (see [139]), realizes precisely this program when applied to the solution of the initial value problem (2.14):

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \phi ^{0} =\phi _{0}; \quad \\ \mbox{ for $n \geq 0$, $\phi ^{n} \rightarrow \phi ^{n+1/2} \rightarrow \phi ^{n+1}$ via the solution of} \quad \\ \dfrac{\phi ^{n+1/2} -\phi ^{n}} {\bigtriangleup t/2} + A_{1}(\phi ^{n+1/2},t^{n+1/2}) + A_{2}(\phi ^{n},t^{n}) = 0, \quad \\ \dfrac{\phi ^{n+1} -\phi ^{n+1/2}} {\bigtriangleup t/2} + A_{1}(\phi ^{n+1/2},t^{n+1/2}) + A_{2}(\phi ^{n+1},t^{n+1}) = 0.\quad \end{array} \right. }$$
(2.15)

The convergence of the Peaceman-Rachford scheme (2.15) has been proved in [118] and [84] under quite general monotonicity assumptions concerning the operators A 1 and A 2 (see also [64, 65] and [110]); indeed, A 1 and/or A 2 can be nonlinear, unbounded and even multi-valued. In general, scheme (2.15) is first order accurate at best; however, if the operators A 1 and A 2 are linear, time independent, and commute, then scheme (2.15) is second order accurate (that is, ∥ ϕ n −ϕ(t n) ∥  = O( | △t | 2), ϕ being the solution of problem (2.1)). Further properties of scheme (2.15) can be found in, e.g., [121, 70] (Chapter 2) and [72] (Chapter 3), including its stability and its asymptotic behavior if T = +∞; concerning this last issue, sensible advice is to use another scheme to compute steady state solutions, scheme (2.15) not being stiff A-stable.
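The second-order accuracy in the linear commuting case is easy to observe numerically. Here is a sketch of ours on the scalar problem dϕ∕dt + (a + b)ϕ = 0 with A 1(ϕ) = aϕ, A 2(ϕ) = bϕ (which trivially commute; a, b are arbitrary positive test values), each half step of (2.15) being an explicit formula:

```python
# Peaceman-Rachford scheme (2.15) on dphi/dt + (a + b)*phi = 0,
# A1(phi) = a*phi, A2(phi) = b*phi (linear, commuting).
import math

a, b, phi0, T = 1.0, 2.0, 1.0, 1.0

def peaceman_rachford(n_steps):
    dt, phi = T / n_steps, phi0
    h = dt / 2.0
    for _ in range(n_steps):
        phi = phi * (1.0 - b * h) / (1.0 + a * h)  # A1 implicit, A2 explicit
        phi = phi * (1.0 - a * h) / (1.0 + b * h)  # A2 implicit, A1 explicit
    return phi

exact = phi0 * math.exp(-(a + b) * T)
e1 = abs(peaceman_rachford(50) - exact)
e2 = abs(peaceman_rachford(100) - exact)
print(e1, e2)   # halving dt divides the error by ~4: second order
```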

Remark 7.

Scheme (2.15) belongs to the alternating direction method family. The reason for this terminology is well known: one of the very first applications of scheme (2.15) was the numerical solution of the heat equation

$$\displaystyle{ \dfrac{\partial \phi } {\partial t} - \dfrac{\partial ^{2}\phi } {\partial x^{2}} - \dfrac{\partial ^{2}\phi } {\partial y^{2}} = f, }$$

completed by initial and boundary conditions. After finite difference discretization, the roles of A 1 and A 2 were played by the square matrices approximating the operators \(- \dfrac{\partial ^{2}} {\partial x^{2}}\) and \(- \dfrac{\partial ^{2}} {\partial y^{2}}\), respectively, explaining the terminology.

Remark 8.

We observe that operators A 1 and A 2 play essentially symmetrical roles in scheme (2.15).

Remark 9.

For those fairly common situations where operator A 2 is uni-valued, but operator A 1 is “nasty” (discontinuous and/or multi-valued, etc.), we should use the following equivalent formulation of the Peaceman-Rachford scheme (2.15):

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \phi ^{0} =\phi _{0}; \quad \\ \mbox{ for $n \geq 0$, $\phi ^{n} \rightarrow \phi ^{n+1/2} \rightarrow \phi ^{n+1}$ via the solution of}\quad \\ \dfrac{\phi ^{n+1/2} -\phi ^{n}} {\bigtriangleup t/2} + A_{1}(\phi ^{n+1/2},t^{n+1/2}) + A_{2}(\phi ^{n},t^{n}) = 0, \quad \\ \dfrac{\phi ^{n+1} - 2\phi ^{n+1/2} +\phi ^{n}} {\bigtriangleup t/2} + A_{2}(\phi ^{n+1},t^{n+1}) = A_{2}(\phi ^{n},t^{n}). \quad \end{array} \right. }$$
(2.16)

2.5 Time-Discretization of (2.1) by Douglas-Rachford’s Alternating Direction Method

We assume that J = 2 in (2.2).

The Douglas-Rachford scheme (introduced in [57]) is a variant of the Peaceman-Rachford scheme (2.15); when applied to the numerical solution of the initial value problem (2.14) (the non-autonomous generalization of (2.1)), it takes the following form:

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \phi ^{0} =\phi _{0}; \quad \\ \mbox{ for $n \geq 0$, $\phi ^{n} \rightarrow \widehat{\phi }^{n+1} \rightarrow \phi ^{n+1}$ via the solution of}\quad \\ \dfrac{\widehat{\phi }^{n+1} -\phi ^{n}} {\bigtriangleup t} + A_{1}(\widehat{\phi }^{n+1},t^{n+1}) + A_{2}(\phi ^{n},t^{n}) = 0, \quad \\ \dfrac{\phi ^{n+1} -\phi ^{n}} {\bigtriangleup t} + A_{1}(\widehat{\phi }^{n+1},t^{n+1}) + A_{2}(\phi ^{n+1},t^{n+1}) = 0. \quad \end{array} \right. }$$
(2.17)

The Douglas-Rachford scheme (2.17) has clearly a predictor-corrector flavor.

The convergence of the Douglas-Rachford scheme (2.17) has been proved in [118] and [84] under quite general monotonicity assumptions concerning the operators A 1 and A 2 (see also [64, 65] and [110]); indeed, A 1 and/or A 2 can be nonlinear, unbounded, and even multi-valued. In general, scheme (2.17) is first order accurate at best (even if the operators A 1 and A 2 are linear, time independent and commute, assumptions implying second order accuracy for the Peaceman-Rachford scheme). Further properties of scheme (2.17) can be found in, e.g., [121, 70] (Chapter 2) and [72] (Chapter 3), including its stability and its asymptotic behavior if T = +∞. Concerning this last issue, sensible advice is to use another scheme to compute steady state solutions, scheme (2.17) not being stiff A-stable, a property it shares with the Peaceman-Rachford scheme (2.15).
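The contrast with the Peaceman-Rachford scheme can be seen on the same scalar linear commuting example (our own sketch, arbitrary test values a, b): scheme (2.17) remains first order accurate even though A 1 and A 2 commute.

```python
# Douglas-Rachford scheme (2.17) on dphi/dt + (a + b)*phi = 0,
# A1(phi) = a*phi, A2(phi) = b*phi (linear, commuting).
import math

a, b, phi0, T = 1.0, 2.0, 1.0, 1.0

def douglas_rachford(n_steps):
    dt, phi = T / n_steps, phi0
    for _ in range(n_steps):
        hat = phi * (1.0 - b * dt) / (1.0 + a * dt)   # predictor step
        phi = (phi - dt * a * hat) / (1.0 + b * dt)   # corrector step
    return phi

exact = phi0 * math.exp(-(a + b) * T)
e1 = abs(douglas_rachford(100) - exact)
e2 = abs(douglas_rachford(200) - exact)
print(e1, e2)   # halving dt roughly halves the error: first order only
```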

Remark 10.

Unlike in the Peaceman-Rachford scheme (2.15), the roles played by operators A 1 and A 2 are non-symmetrical in scheme (2.17); numerical experiments confirm that fact: for example, for the same △t the speed of convergence to a steady state solution may depend on the choice one makes for A 1 and A 2. As a rule of thumb, we advocate taking for A 2 the operator with the best continuity and monotonicity properties (see, for example, [62] (Chapter 3), [63] (Chapter 3) and [74] (Chapter 3) for more details).

Remark 11.

Unlike scheme (2.15), scheme (2.17) is easy to generalize to operator decompositions involving more than two operators. Consider thus the numerical integration of (2.14) when J ≥ 3 in (2.2). Following J. Douglas in [54] and [55] we generalize scheme (2.17) by

$$\displaystyle{ \phi ^{0} =\phi _{ 0}; }$$
(2.18)

then for n ≥ 0, ϕ n being known, compute ϕ n+1∕J, …, ϕ n+j∕J, …, ϕ n+1 via the solution of

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \dfrac{\phi ^{n+1/J} -\phi ^{n}} {\bigtriangleup t} + \dfrac{1} {J - 1}A_{1}(\phi ^{n+1/J},t^{n+1}) + \dfrac{J - 2} {J - 1}A_{1}(\phi ^{n},t^{n})\quad \\ \quad \qquad \qquad +\sum _{ i=2}^{J}A_{ i}(\phi ^{n},t^{n}) = 0, \quad \end{array} \right. }$$
(2.19.1)
$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \dfrac{\phi ^{n+j/J} -\phi ^{n}} {\bigtriangleup t} +\sum _{ i=1}^{j-1}\left [ \dfrac{1} {J - 1}A_{i}(\phi ^{n+i/J},t^{n+1}) + \dfrac{J - 2} {J - 1}A_{i}(\phi ^{n},t^{n})\right ]\quad \\ \quad \qquad \qquad + \dfrac{1} {J - 1}A_{j}(\phi ^{n+j/J},t^{n+1}) + \dfrac{J - 2} {J - 1}A_{j}(\phi ^{n},t^{n}) \quad \\ \quad \qquad \qquad +\sum _{ i=j+1}^{J}A_{ i}(\phi ^{n},t^{n}) = 0, \quad \end{array} \right. }$$
(2.19.j)
$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \dfrac{\phi ^{n+1} -\phi ^{n}} {\bigtriangleup t} +\sum _{ i=1}^{J-1}\left [ \dfrac{1} {J - 1}A_{i}(\phi ^{n+i/J},t^{n+1}) + \dfrac{J - 2} {J - 1}A_{i}(\phi ^{n},t^{n})\right ]\quad \\ \quad \qquad \qquad + \dfrac{1} {J - 1}A_{J}(\phi ^{n+1},t^{n+1}) + \dfrac{J - 2} {J - 1}A_{J}(\phi ^{n},t^{n}) = 0, \quad \end{array} \right. }$$
(2.19.J)

Above, ϕ n+i∕J and ϕ n+j∕J denote approximate solutions at steps i and j of the computational process; they do not denote approximations of ϕ(t n+i∕J) and ϕ(t n+j∕J) (unless i = j = J).

Remark 12.

This is the Douglas-Rachford analog of Remark 9: for those situations where A 1 is a “bad” operator (in the sense of Remark 9), we should use (assuming that A 2 is uni-valued) the following equivalent formulation of the Douglas-Rachford scheme (2.17):

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \phi ^{0} =\phi _{0}; \quad \\ \mbox{ for $n \geq 0$, $\phi ^{n} \rightarrow \widehat{\phi }^{n+1} \rightarrow \phi ^{n+1}$ via the solution of}\quad \\ \dfrac{\widehat{\phi }^{n+1} -\phi ^{n}} {\bigtriangleup t} + A_{1}(\widehat{\phi }^{n+1},t^{n+1}) + A_{2}(\phi ^{n},t^{n}) = 0, \quad \\ \dfrac{\phi ^{n+1} -\widehat{\phi }^{n+1}} {\bigtriangleup t} + A_{2}(\phi ^{n+1},t^{n+1}) = A_{2}(\phi ^{n},t^{n}). \quad \end{array} \right. }$$
(2.20)

Remark 13.

To those wondering how to choose between the Peaceman-Rachford and Douglas-Rachford schemes, we will say that, on the basis of many numerical experiments, the second scheme seems more robust and faster in those situations where one of the operators is non-smooth (multi-valued or singular, for example), particularly if one is interested in capturing steady state solutions. Actually, a better piece of advice could be: consider using the fractional θ-scheme to be discussed in Section 2.6, below. Indeed, we have encountered situations where this θ-scheme outperforms both the Peaceman-Rachford and Douglas-Rachford schemes, for steady state computations in particular; such an example is provided by the anisotropic Eikonal equation, a nonlinear hyperbolic problem to be briefly discussed in Section 2.7. We will return to the Peaceman-Rachford vs Douglas-Rachford issue in Section 7.

2.6 Time-Discretization of (2.1) by a Fractional θ-Scheme

This scheme (introduced in [67, 68] for the solution of the Navier-Stokes equations) is a variant of the Peaceman-Rachford scheme (2.15). Let θ belong to the open interval (0, 1∕2) (in practice, θ ∈ [1∕4, 1∕3]); the fractional θ -scheme, applied to the solution of the initial value problem (2.14) (the non-autonomous generalization of (2.1)), reads as follows if A = A 1 + A 2:

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \phi ^{0} =\phi _{0}; \quad \\ \mbox{ for $n \geq 0$, $\phi ^{n} \rightarrow \phi ^{n+\theta } \rightarrow \phi ^{n+1-\theta }\rightarrow \phi ^{n+1}$ via the solution of}\quad \\ \dfrac{\phi ^{n+\theta } -\phi ^{n}} {\theta \bigtriangleup t} + A_{1}(\phi ^{n+\theta },t^{n+\theta }) + A_{2}(\phi ^{n},t^{n}) = 0, \quad \\ \dfrac{\phi ^{n+1-\theta }-\phi ^{n+\theta }} {(1 - 2\theta )\bigtriangleup t} + A_{1}(\phi ^{n+\theta },t^{n+\theta }) + A_{2}(\phi ^{n+1-\theta },t^{n+1-\theta }) = 0, \quad \\ \dfrac{\phi ^{n+1} -\phi ^{n+1-\theta }} {\theta \bigtriangleup t} + A_{1}(\phi ^{n+1},t^{n+1}) + A_{2}(\phi ^{n+1-\theta },t^{n+1-\theta }) = 0. \quad \end{array} \right. }$$
(2.21)

Remark 14.

One should avoid confusion between scheme (2.21) and the following solution method for the initial value problem (2.14) (with 0 ≤ θ ≤ 1)

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \phi ^{0} =\phi _{0}; \quad \\ \mbox{ for $n \geq 0$, $\phi ^{n} \rightarrow \phi ^{n+1}$ via the solution of} \quad \\ \dfrac{\phi ^{n+1} -\phi ^{n}} {\bigtriangleup t} +\theta A(\phi ^{n+1},t^{n+1}) + (1-\theta )A(\phi ^{n},t^{n}) = 0,\quad \end{array} \right. }$$
(2.22)

which is also known as a θ-scheme. We observe that if θ = 1 (resp., θ = 0, θ = 1/2) scheme (2.22) reduces to backward Euler’s scheme (resp., forward Euler’s scheme, a Crank-Nicolson type scheme). Another “interesting” value is θ = 2/3 (for reasons detailed in, e.g., [70] (Chapter 2) and [72] (Chapter 3)). By the way, it is to avoid confusion between schemes (2.21) and (2.22) that some practitioners (S. Turek, in particular) call the first one a fractional θ-scheme. □ 
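The accuracy of scheme (2.22) as a function of θ is easy to verify (our own sketch, on the scalar problem dϕ∕dt + aϕ = 0 with arbitrary test values); θ = 1 gives first order, θ = 1/2 second order:

```python
# The theta-scheme (2.22) (no splitting) on dphi/dt + a*phi = 0:
# one implicit/explicit blend per step.
import math

a, phi0, T = 1.0, 1.0, 1.0

def theta_scheme(theta, n_steps):
    dt, phi = T / n_steps, phi0
    for _ in range(n_steps):
        # (phi_new - phi)/dt + theta*a*phi_new + (1-theta)*a*phi = 0
        phi = phi * (1.0 - (1.0 - theta) * a * dt) / (1.0 + theta * a * dt)
    return phi

exact = phi0 * math.exp(-a * T)
ratios = {}
for th in (1.0, 0.5):          # backward Euler; Crank-Nicolson
    e1 = abs(theta_scheme(th, 50) - exact)
    e2 = abs(theta_scheme(th, 100) - exact)
    ratios[th] = e1 / e2
print(ratios)   # ~2 for theta = 1 (first order), ~4 for theta = 1/2
```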

The stability and convergence properties of scheme (2.21) have been discussed in [70] (Chapter 2) and [72] (Chapter 3) for very simple finite dimensional situations where A 1 and A 2 are both positive multiples of the same symmetric positive definite matrix. Numerical experiments have shown that the good properties enjoyed by scheme (2.21) in those simple linear situations, in particular its stiff A-stability for θ well chosen, still hold for more complicated problems, such as the numerical simulation of unsteady incompressible viscous flow modeled by the Navier-Stokes equations (as shown in, e.g., [23, 41, 69] and [70]).
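A sketch of ours illustrates scheme (2.21) in precisely this simple setting: A 1(ϕ) = aϕ and A 2(ϕ) = bϕ with a, b > 0 (scalar positive multiples of the identity; the values are arbitrary test choices), with θ = 1 − 1∕√2. Each of the three stages is an explicit formula:

```python
# Fractional theta-scheme (2.21) on dphi/dt + (a + b)*phi = 0,
# A1(phi) = a*phi, A2(phi) = b*phi, theta = 1 - 1/sqrt(2).
import math

a, b, phi0, T = 1.0, 2.0, 1.0, 1.0
theta = 1.0 - 1.0 / math.sqrt(2.0)

def fractional_theta(n_steps):
    dt, phi = T / n_steps, phi0
    s, m = theta * dt, (1.0 - 2.0 * theta) * dt
    for _ in range(n_steps):
        phi = phi * (1.0 - b * s) / (1.0 + a * s)  # A1 implicit, A2 explicit
        phi = phi * (1.0 - a * m) / (1.0 + b * m)  # A2 implicit, A1 explicit
        phi = phi * (1.0 - b * s) / (1.0 + a * s)  # A1 implicit, A2 explicit
    return phi

exact = phi0 * math.exp(-(a + b) * T)
e1 = abs(fractional_theta(50) - exact)
e2 = abs(fractional_theta(100) - exact)
print(e1, e2)   # ratio ~4 for this theta: second-order behavior
```

For this scalar linear case one can check by Taylor expansion that the choice θ = 1 − 1∕√2 makes the second-order error term vanish, consistent with the good properties reported above.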

Remark 15.

We observe that operators A 1 and A 2 play non-symmetrical roles in scheme (2.21). Since, at each time step, one has to solve two problems (resp., one problem) associated with operator A 1 (resp., A 2), a natural choice is to take for A 1 the operator leading to the sub-problems which are the easiest to solve (that is, whose solution is the least time consuming). Less naive criteria may be used to choose A 1 and A 2, such as the regularity (or lack of regularity) of these operators.

Remark 16.

If one takes A 1 = A and A 2 = 0 in (2.21), the above scheme reduces to the Runge-Kutta scheme (2.12), with A replacing B.

Remark 17.

The fractional θ-scheme (2.21) is a symmetrized scheme. From that point of view, it has some analogies with Strang’s symmetrized scheme (2.7)–(2.10), discussed in Section 2.3.

Remark 18.

This is the fractional θ-scheme analog of Remarks 9 and 12. For those situations where A 1 is a “bad” operator (in the sense of Remark 9), we advocate using the following equivalent formulation of the θ-scheme (2.21):

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \phi ^{0} =\phi _{0}; \quad \\ \mbox{ for $n \geq 0$, $\phi ^{n} \rightarrow \phi ^{n+\theta } \rightarrow \phi ^{n+1-\theta }\rightarrow \phi ^{n+1}$ via the solution of} \quad \\ \dfrac{\phi ^{n+\theta } -\phi ^{n}} {\theta \bigtriangleup t} + A_{1}(\phi ^{n+\theta },t^{n+\theta }) + A_{2}(\phi ^{n},t^{n}) = 0, \quad \\ \dfrac{\theta \phi ^{n+1-\theta }-(1-\theta )\phi ^{n+\theta }+(1-2\theta )\phi ^{n}} {\theta (1 - 2\theta )\bigtriangleup t} +A_{2}(\phi ^{n+1-\theta },t^{n+1-\theta }) = A_{2}(\phi ^{n},t^{n}),\quad \\ \dfrac{\phi ^{n+1} -\phi ^{n+1-\theta }} {\theta \bigtriangleup t} + A_{1}(\phi ^{n+1},t^{n+1}) + A_{2}(\phi ^{n+1-\theta },t^{n+1-\theta }) = 0. \quad \end{array} \right. }$$
(2.23)

2.7 Two Applications: Smallest Eigenvalue Computation and Solution of an Anisotropic Eikonal Equation

2.7.1 Synopsis

It is not an exaggeration to say that applications of operator-splitting methods are everywhere, new ones occurring “almost” every day; indeed, some well-known methods and algorithms are disguised operator-splitting schemes as we will show in Section 2.7.2, concerning the computation of the smallest eigenvalue of a real symmetric matrix. In Section 2.7.3, we will apply the fractional θ-scheme (2.21) to the solution of an Eikonal equation modeling wave propagation in anisotropic media. More applications will be discussed in Sections 4 and 5.

2.7.2 Application to Some Eigenvalue Computation

Suppose that A is a real d × d symmetric matrix. Ordering the eigenvalues of A as follows: λ 1 ≤ λ 2 ≤ … ≤ λ d , our goal is to compute λ 1. We have (with obvious notation)

$$\displaystyle{ \lambda _{1} =\min _{\mathbf{v}\in S}\mathbf{v}^{t}\mathbf{A}\mathbf{v},\ \ with\ S =\{ \mathbf{v}\vert \mathbf{v} \in I\!\!R^{d},\|\mathbf{v}\| = 1\}, }$$
(2.24)

the norm in (2.24) being the canonical Euclidean one. The constrained minimization problem in (2.24) is equivalent to

$$\displaystyle{ \min _{\mathbf{v}\in I\!\!R^{d}}\left [\frac{1} {2}\mathbf{v}^{t}\mathbf{A}\mathbf{v} + I_{ S}(\mathbf{v})\right ], }$$
(2.25)

where, in (2.25), the functional I S : I​​R d → I​​R ∪{ +∞} is defined as follows

$$\displaystyle{I_{S}(\mathbf{v}) = \left \{\begin{array}{rl} 0\ &if\ \mathbf{v} \in S,\\ + \infty &otherwise, \end{array} \right.}$$

implying that I S is the indicator functional of the sphere S. Suppose that u is a solution of (2.25) (that is, a minimizer of the functional in (2.25)); we then have

$$\displaystyle{ \mathbf{A}\mathbf{u} + \partial I_{S}(\mathbf{u}) \ni \mathbf{0}, }$$
(2.26)

∂ I S (u) in (2.26) being a (kind of) generalized gradient of functional I S at u (∂ I S is a multivalued operator). Next, we associate with the (necessary) optimality system (2.26) the following initial value problem (flow in the Dynamical System terminology):

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \quad &\dfrac{d\mathbf{u}} {dt} + \mathbf{A}\mathbf{u} + \partial I_{S}(\mathbf{u}) \ni \mathbf{0}\ \mbox{ in}\ (0,+\infty ), \\ \quad &\mathbf{u}(0) = \mathbf{u}_{0}.\end{array} \right. }$$
(2.27)

If one applies the Marchuk-Yanenko scheme (2.5)–(2.6) to the solution of problem (2.27), one obtains (with τ = △t):

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \mathbf{u}^{0} = \mathbf{u}_{0}, \quad \\ \mbox{ for $n \geq 0$, $\mathbf{u}^{n} \rightarrow \mathbf{u}^{n+1/2} \rightarrow \mathbf{u}^{n+1}$ via the solution of}\quad \\ \dfrac{\mathbf{u}^{n+1/2} -\mathbf{u}^{n}} {\tau } + \mathbf{A}\mathbf{u}^{n+1/2} = \mathbf{0}, \quad \\ \dfrac{\mathbf{u}^{n+1} -\mathbf{u}^{n+1/2}} {\tau } + \partial I_{S}(\mathbf{u}^{n+1}) \ni \mathbf{0}. \quad \end{array} \right. }$$
(2.28)

The first finite difference equation in (2.28) implies

$$\displaystyle{ \mathbf{u}^{n+1/2} = (\mathbf{I} +\tau \mathbf{A})^{-1}\mathbf{u}^{n}. }$$
(2.29)

On the other hand, the second finite difference equation in (2.28) can be interpreted as a necessary optimality condition for the following minimization problem

$$\displaystyle{ \min _{\mathbf{v}\in S}\left [\frac{1} {2}\|\mathbf{v}\|^{2} -\mathbf{v}^{t}\mathbf{u}^{n+1/2}\right ]. }$$
(2.30)

Since ∥ v ∥  = 1 over S, the solution of problem (2.30) is given by

$$\displaystyle{ \mathbf{u}^{n+1} = \dfrac{\mathbf{u}^{n+1/2}} {\|\mathbf{u}^{n+1/2}\|}. }$$
(2.31)

It follows from (2.29) and (2.31) that algorithm (2.28) is nothing but the inverse power method with shift, a well-known algorithm from Numerical Linear Algebra. Indeed, if

$$\displaystyle{ 0 <\tau < \dfrac{1} {\max (0_{+},-\lambda _{1})}, }$$

and if the projection of u 0 on the vector space spanned by the eigenvectors of A associated with λ 1 is different from 0, we can easily prove that the sequence {u n} n ≥ 0 converges to an eigenvector of A associated with λ 1 and also that

$$\displaystyle{ \lim _{n\rightarrow +\infty }(\mathbf{u}^{n})^{t}\mathbf{A}\mathbf{u}^{n} =\lambda _{ 1}. }$$

Clearly, numerical analysts have not been waiting for operator-splitting to compute matrix eigenvalues and eigenvectors; on the other hand, operator-splitting has provided efficient algorithms for the solution of complicated problems from Differential Geometry, Mechanics, Physics, Physico-Chemistry, Finance, etc., including some nonlinear eigenvalue problems, as shown in, e.g., [72] (Chapter 7).
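To see scheme (2.28) in action, the NumPy sketch below (the matrix, time step, and iteration count are our own arbitrary choices) implements the two fractional steps (2.29) and (2.31) and recovers the smallest eigenvalue through the Rayleigh quotient; since λ 1 > 0 here, any τ > 0 satisfies the stability condition above.

```python
import numpy as np

def smallest_eigenpair(A, u0, tau, n_iter=500):
    """Scheme (2.28): an implicit half-step with A (2.29), then projection
    onto the unit sphere (2.31) -- i.e., inverse power iteration applied
    to I + tau*A."""
    M = np.eye(A.shape[0]) + tau * A         # I + tau*A
    u = u0 / np.linalg.norm(u0)
    for _ in range(n_iter):
        u_half = np.linalg.solve(M, u)       # (2.29)
        u = u_half / np.linalg.norm(u_half)  # (2.31): projection onto S
    return u, u @ A @ u                      # Rayleigh quotient -> lambda_1

A = np.diag([1.0, 3.0, 5.0])   # arbitrary test matrix; lambda_1 = 1
u, lam = smallest_eigenpair(A, np.array([1.0, 1.0, 1.0]), tau=0.5)
print(lam)   # approx 1.0, the smallest eigenvalue of A
```
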

2.7.3 Application to the Solution of an Anisotropic Eikonal Equation from Acoustics

The next application of operator-splitting, that we are going to (briefly) consider in this chapter, was brought to our attention recently (December 2014) by our colleagues S. Leung and J. Qian. It concerns the numerical solution of the following nonlinear hyperbolic partial differential equation

$$\displaystyle{ \vert \nabla \tau \vert -\dfrac{\vert 1 -\mathbf{V} \cdot \nabla \tau \vert } {c} = 0\ \ in\ \varOmega, }$$
(2.32)

encountered in Acoustics and known as the anisotropic Eikonal equation. In (2.32), we have (see [40] for more details):

  • Ω ⊂ I​​R d, with d ≥ 2.

  • τ(x) is the time of first arrival of the wave front at x ∈ Ω.

  • c > 0 is the wave propagation speed in the medium filling Ω, assuming that this medium is at rest (the so-called background medium).

  • Assuming that the ambient medium is moving, V denotes its velocity; we assume that V ∈ (L ∞(Ω)) d.

Fast-sweeping methods have been developed for the efficient numerical solution of the classical Eikonal equation

$$\displaystyle{ \vert \nabla \tau \vert = \dfrac{1} {c}\ \ in\ \varOmega, }$$
(2.33)

(see, e.g., [104] and [181]); these methods automatically provide viscosity solutions in the sense of Crandall and Lions (see [38] for this notion). Unfortunately, as shown in [40], the fast-sweeping methods developed for the solution of (2.33) cannot handle (2.32) unless one modifies them significantly, as done in [40]. Actually, there exists a simpler-to-implement alternative to the method developed in [40]: it relies on the operator-splitting methods discussed in Sections 2.3, 2.4, 2.5 and 2.6, and takes advantage of the fact that the fast-sweeping methods developed for the solution of (2.33) can be easily modified to handle equations such as

$$\displaystyle{ \alpha \tau -\beta \nabla ^{2}\tau + \vert \nabla \tau \vert = f }$$
(2.34)

and

$$\displaystyle{ \alpha \tau -\beta \nabla ^{2}\tau -\dfrac{\vert 1 -\mathbf{V} \cdot \nabla \tau \vert } {c} = f, }$$
(2.35)

with α > 0 and β ≥ 0. Therefore, in order to solve problem (2.32), we associate with it the following initial value problem:

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} (I -\epsilon \nabla ^{2}) \dfrac{\partial \tau } {\partial t} + \vert \nabla \tau \vert -\dfrac{\vert 1 -\mathbf{V} \cdot \nabla \tau \vert } {c} = 0\ \ in\ \varOmega \times (0,+\infty ),\quad \\ \tau (0) =\tau _{0}, \quad \end{array} \right. }$$
(2.36)

whose steady state solutions are also solutions of (2.32). In (2.36), ε is a non-negative parameter (a regularizing one if ε > 0) and τ(t) denotes the function t → τ(x, t). Actually, additional conditions are required to have solution uniqueness, typical ones being τ specified on a subset of \(\overline{\varOmega }(=\varOmega \cup \partial \varOmega )\), possibly reduced to just one point (a point source for the wave). A typical choice for τ 0 is the corresponding solution of problem (2.33).

The results reported in [75] show that, with θ = 1∕3, the fractional θ-scheme discussed in Section 2.6 outperforms the Strang’s, Peaceman-Rachford’s, and Douglas-Rachford’s schemes when applied to the computation of the steady state solutions of (2.36). The resulting algorithm reads as follows:

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \tau ^{0} =\tau _{0}; \quad \\ \mbox{ for $n \geq 0$, $\tau ^{n} \rightarrow \tau ^{n+\theta } \rightarrow \tau ^{n+1-\theta }\rightarrow \tau ^{n+1}$ via the solution of}\quad \\ (I -\epsilon \nabla ^{2})\dfrac{\tau ^{n+\theta } -\tau ^{n}} {\theta \bigtriangleup t} + \vert \nabla \tau ^{n+\theta }\vert -\dfrac{\vert 1 -\mathbf{V} \cdot \nabla \tau ^{n}\vert } {c} = 0, \quad \\ (I -\epsilon \nabla ^{2})\dfrac{\tau ^{n+1-\theta }-\tau ^{n+\theta }} {(1 - 2\theta )\bigtriangleup t} + \vert \nabla \tau ^{n+\theta }\vert -\dfrac{\vert 1 -\mathbf{V} \cdot \nabla \tau ^{n+1-\theta }\vert } {c} = 0, \quad \\ (I -\epsilon \nabla ^{2})\dfrac{\tau ^{n+1} -\tau ^{n+1-\theta }} {\theta \bigtriangleup t} + \vert \nabla \tau ^{n+1}\vert -\dfrac{\vert 1 -\mathbf{V} \cdot \nabla \tau ^{n+1-\theta }\vert } {c} = 0.\quad \end{array} \right. }$$
(2.37)

The three problems in (2.37) being particular cases of (2.34) and (2.35), their finite difference analogues can be solved by fast-sweeping algorithms. Physical considerations suggest that △t has to be of the order of the space discretization step h. Actually, the numerical results reported in [75] show that, unlike the other schemes discussed in Sections 2.2 to 2.5, scheme (2.37), with θ = 1∕3, has very good convergence properties, even for large values of the ratio \(\dfrac{\bigtriangleup t} {h}\) (100, typically). If ε = 0 (resp., ε = h 2), these numerical experiments suggest that the number of iterations (time steps) necessary to achieve convergence to a steady state solution varies (roughly) like h −1∕2 (resp., h −1∕3), for two- and three-dimensional test problems (see [75] for further results and more details). Clearly, preconditioning does pay off here (a well-known fact, in general).
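The fast-sweeping idea invoked above is particularly transparent in one space dimension. The sketch below (uniform grid, constant speed c, point source at the left end, all our own illustrative choices) solves the classical Eikonal equation (2.33) by alternating left-to-right and right-to-left Gauss-Seidel sweeps with an upwind (causal) update:

```python
import numpy as np

def fast_sweep_1d(n=101, c=1.0, length=1.0, sweeps=3):
    """Fast sweeping for |tau'| = 1/c on [0, length], point source tau(0) = 0:
    alternating Gauss-Seidel sweeps, each taking the causal (upwind) minimum."""
    h = length / (n - 1)
    tau = np.full(n, np.inf)
    tau[0] = 0.0                          # source condition
    for _ in range(sweeps):
        for i in range(1, n):             # left-to-right sweep
            tau[i] = min(tau[i], tau[i - 1] + h / c)
        for i in range(n - 2, -1, -1):    # right-to-left sweep
            tau[i] = min(tau[i], tau[i + 1] + h / c)
    return tau

tau = fast_sweep_1d()
print(tau[-1])   # approx 1.0: the exact viscosity solution is tau(x) = x/c
```

In higher dimensions the update at each grid point solves a local quadratic, but the sweeping structure is the same; handling (2.34)–(2.35) amounts to modifying that local update.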

Remark 19.

Some readers may wonder why the authors of [75] gave the role of A 1 (resp., A 2) to the operator τ → | ∇τ | (resp., \(\tau \rightarrow -\dfrac{1} {c}\vert 1 -\mathbf{V} \cdot \nabla \tau \vert \)), and not the other way around. The main reason behind that choice was preliminary numerical experiments showing that, for the same values of α and β, problem (2.34) is cheaper to solve than problem (2.35).

2.8 Time-Discretization of (2.1) by a Parallel Splitting Scheme

The splitting schemes presented so far have a sequential nature, i.e. the sub-problems associated with the decomposed operators are solved in a sequential manner. Actually, it is also possible to solve the sub-problems in parallel, as shown just below, using the following variant of Marchuk-Yanenko’s scheme:

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \phi ^{0 } =\phi _{0}; \quad \\ \mbox{ for $n \geq 0$, we obtain $\phi ^{n+1}$ from $\phi ^{n}$ by solving first }\quad \\ \dfrac{\phi ^{n+j/2J} -\phi ^{n}} {J\bigtriangleup t} + A_{j}(\phi ^{n+j/2J},t^{n+1}) = 0,\ \mbox{ for}\ j = 1,\ldots,J, \quad \\ \mbox{ $\phi ^{n+1}$ being then obtained by averaging as follows}\quad \\ \phi ^{n+1} = \dfrac{1} {J}\sum _{j=1}^{J}\phi ^{n+j/2J}. \quad \end{array} \right. }$$
(2.38)

Scheme (2.38) is nothing but Algorithm 5.1 in [119]. Under suitable conditions, it has been proved in the above reference that scheme (2.38) is first order accurate, that is ∥ ϕ n − ϕ(t n) ∥  = O(△t). A parallelizable algorithm with second order accuracy is also presented in [119]. The main advantage of the above schemes is that the sub-problems can be solved in parallel. Clearly, this parallel splitting idea can be used for computing the steady state solutions of (2.1). As observed in [155], the sub-problems (or at least some of them) can also be solved in parallel if the corresponding operator A j has the right decomposition properties.
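For the scalar model problem dϕ∕dt + (A 1 + A 2)ϕ = 0, with A 1 and A 2 positive constants (our own toy choice), scheme (2.38) takes a few lines, and its first-order accuracy can be checked against the exact solution:

```python
import numpy as np

def parallel_split_step(phi, a, dt):
    """One step of scheme (2.38) for d(phi)/dt + sum_j a_j*phi = 0:
    J independent implicit sub-steps, then averaging."""
    J = len(a)
    subs = [phi / (1.0 + J * dt * aj) for aj in a]  # each solvable in parallel
    return sum(subs) / J

a = [1.0, 2.0]                     # the split operators A_1, A_2 (scalars here)
phi0, T = 1.0, 1.0
exact = phi0 * np.exp(-sum(a) * T)
errs = []
for n_steps in (100, 200):
    dt, phi = T / n_steps, phi0
    for _ in range(n_steps):
        phi = parallel_split_step(phi, a, dt)
    errs.append(abs(phi - exact))
print(errs)   # halving dt roughly halves the error: first-order accuracy
```
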

3 Augmented Lagrangian Algorithms and Alternating Direction Methods of Multipliers

3.1 Introduction

It is our opinion that a review chapter like this one has to include some material about augmented Lagrangian algorithms, including of course their relationships with alternating direction methods. On the other hand, since augmented Lagrangian algorithms and alternating direction methods of multipliers, together with their latest developments, are discussed in detail in other chapters of this book, we will not say much about these methods in this section. However, we will give enough information so that the reader may follow Section 6 (dedicated to Image Processing) without spending too much time consulting the other chapters (or other references).

In Section 3.2 we will introduce several augmented Lagrangian algorithms, and show in Section 3.3 how these algorithms relate to the alternating direction methods discussed in Sections 2.4 (Peaceman-Rachford’s) and 2.5 (Douglas-Rachford’s).

This section is largely inspired by Chapter 4 of [72].

3.2 Decomposition-Coordination Methods by Augmented Lagrangians

3.2.1 Abstract Problem Formulation. Some Examples

A large number of problems in Mathematics, Physics, Engineering, Economics, Data Processing, Imaging, etc. can be formulated as

$$\displaystyle{ u = \mbox{ arg}\min _{\mathit{v}\in V }[F(B\mathit{v}) + G(\mathit{v})], }$$
(2.39)

where: (i) V and H are Banach spaces. (ii) \(B \in \mathcal{L}(V,H)\). (iii) F: H → I​​R ∪{ +∞} and G: V → I​​R ∪{ +∞} are proper, lower semi-continuous, and convex functionals verifying dom(F ∘ B) ∩ dom(G) ≠ ∅, implying that problem (2.39) may have solutions.

Example 1.

This first example concerns the following variational problem:

$$\displaystyle{ u = \mbox{ arg}\min _{\mathit{v}\in H_{0}^{1}(\varOmega )}\left [ \dfrac{\mu } {2}\int _{\varOmega }\vert \nabla \mathit{v}\vert ^{2}\,dx +\tau _{ y}\int _{\varOmega }\vert \nabla \mathit{v}\vert \,dx -\varpi \int _{\varOmega }\mathit{v}\,dx\right ], }$$
(2.40)

where: (i) Ω is a bounded domain (that is a bounded open connected subset) of I​​R 2; we denote by Γ the boundary of Ω. (ii) dx = dx 1 dx 2. (iii) μ and τ y are two positive constants. (iv) \(\vert \nabla \mathit{v}\vert ^{2} = \left \vert \dfrac{\partial \mathit{v}} {\partial x_{1}}\right \vert ^{2} + \left \vert \dfrac{\partial \mathit{v}} {\partial x_{2}}\right \vert ^{2}\) (v) The space H 0 1(Ω) (a Sobolev space) is defined by

$$\displaystyle{ H_{0}^{1}(\varOmega ) =\{ \mathit{v}\vert \mathit{v} \in L^{2}(\varOmega ),\partial \mathit{v}/\partial x_{ i} \in L^{2}(\varOmega ),\forall i = 1,2,\mathit{v}\vert _{\varGamma } = 0\}, }$$
(2.41)

the two derivatives in (2.41) being in the sense of distributions (see, e.g., [148, 157] for this notion). Since Ω is bounded, H 0 1(Ω) is a Hilbert space for the inner product \(\{\mathit{v},w\} \rightarrow \int _{\varOmega }\nabla \mathit{v} \cdot \nabla w\,dx\), and the associated norm. Problem (2.40) is a well-known problem from non-Newtonian fluid mechanics; it models the flow of an incompressible visco-plastic fluid (of the Bingham type) in an infinitely long cylinder of cross-section Ω, ϖ being the pressure drop per unit length and u the flow axial velocity. In (2.40), μ denotes the fluid viscosity and τ y its plasticity yield (see, e.g., [59] and [83], and the references therein, for further information on visco-plastic fluid flows). It follows from, e.g., [66] and [72], that the variational problem (2.40) has a unique solution.

Problem (2.40) is a particular case of (2.39) with V = H 0 1(Ω), H = (L 2(Ω))2, B = ∇, \(F(\mathbf{q}) =\tau _{y}\int _{\varOmega }\vert \mathbf{q}\vert \,dx\), and \(G(\mathit{v}) = \dfrac{\mu } {2}\int _{\varOmega }\vert \nabla \mathit{v}\vert ^{2}\,dx -\varpi \int _{\varOmega }\mathit{v}\,dx\); other decompositions are possible.

Close variants of problem (2.40) are encountered in imaging, as shown in Section 6 (and other chapters of this volume).

Example 2.

It concerns the following variant of problem (2.40):

$$\displaystyle{ u = \mbox{ arg}\min _{\mathit{v}\in K}\left [ \dfrac{\mu } {2}\int _{\varOmega }\vert \nabla \mathit{v}\vert ^{2}\,dx - C\int _{\varOmega }\mathit{v}\,dx\right ], }$$
(2.42)

where Ω is a bounded domain of I​​R 2, μ is a positive constant and

$$\displaystyle{ K =\{ \mathit{v}\vert \mathit{v} \in H_{0}^{1}(\varOmega ),\vert \nabla \mathit{v}\vert \leq 1\ \mbox{ a.e. in}\ \varOmega \}. }$$

It is a classical result (see, e.g., [59]) that (2.42) models, in an appropriate system of mechanical units, the torsion of an infinitely long cylinder of cross-section Ω, made of an elastic-plastic material, C being the torsion angle per unit length and u a stress potential. It follows from, e.g., [66] and [72], that the variational problem (2.42) has a unique solution.

Problem (2.42) is a particular case of problem (2.39) with V = H 0 1(Ω), H = (L 2(Ω))2, B = ∇, \(G(\mathit{v}) = \dfrac{\mu } {2}\int _{\varOmega }\vert \nabla \mathit{v}\vert ^{2}\,dx - C\int _{\varOmega }\mathit{v}\,dx\), and F(q) = I K (q), I K (⋅ ) being the indicator functional of the closed convex nonempty subset K of H defined by

$$\displaystyle{ \mathit{K} =\{ \mathbf{q}\vert \mathbf{q} \in H,\vert \mathbf{q}\vert \leq 1\ \mbox{ a.e. in}\ \varOmega \}. }$$

Other decompositions are possible.

Remark 20.

We recall that, from the definition of indicator functionals, we have

$$\displaystyle{ I_{\mathit{K}}(\mathbf{q}) = \left \{\begin{array}{@{}l@{\quad }l@{}} 0\ \mbox{ if}\ \mathbf{q} \in \mathit{K}\,, \quad \\ +\infty \ \mbox{ otherwise},\quad \end{array} \right. }$$

implying, from the properties of K , that I K : H → I​​R ∪{ +∞} is convex, proper and lower semi-continuous. □ 

Numerical methods for the solution of problem (2.42) can be found in, e.g., [66] and [76].

3.2.2 Primal-Dual Methods for the Solution of Problem (2.39): ADMM Algorithms

In order to solve problem (2.39), we are going to use a strategy introduced in [77] and [78] (to the best of our knowledge). The starting point is the obvious equivalence between (2.39) and the following linearly constrained optimization problem:

$$\displaystyle{ \{u,Bu\} = \mbox{ arg}\min _{\{\mathit{v},q\}\in W}j(\mathit{v},q), }$$
(2.43)

where

$$\displaystyle{ j(\mathit{v},q) = F(q) + G(\mathit{v}), }$$

and

$$\displaystyle{ W =\{\{ \mathit{v},q\}\vert \mathit{v} \in V,q \in H,B\mathit{v} - q = 0\}. }$$

From now on, we will assume that V and H are (real) Hilbert spaces, the H-norm being denoted by | ⋅ | and the associated inner-product by (⋅ , ⋅ ). The next step is quite natural: we associate with the minimization problem (2.43) a Lagrangian functional \(\mathcal{L}\) defined by

$$\displaystyle{ \mathcal{L}(\mathit{v},q;\mu ) = j(\mathit{v},q) + (\mu,B\mathit{v} - q), }$$

and an augmented Lagrangian functional \(\mathcal{L}_{r}\) defined (with r > 0) by

$$\displaystyle{ \mathcal{L}_{r}(\mathit{v},q;\mu ) = \mathcal{L}(\mathit{v},q;\mu ) + \dfrac{r} {2}\vert B\mathit{v} - q\vert ^{2}. }$$
(2.44)

One can easily prove that the functionals \(\mathcal{L}\) and \(\mathcal{L}_{r}\) share the same saddle-points over (V × H) × H, and also that, if {{u, p}, λ} is such a saddle-point, then u is a solution of problem (2.39) and p = Bu. A classical algorithm to compute saddle-points is the so-called Uzawa algorithm, popularized by [3] (a book dedicated to the study of Economics equilibria), and further discussed in, e.g., [76]. Applying a close variant of the Uzawa algorithm to the computation of the saddle-points of \(\mathcal{L}_{r}\) over (V × H) × H, we obtain

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \lambda ^{0}\ \mbox{ is given in}\ H; \quad \\ \mbox{ for $n \geq 0$},\ \lambda ^{n} \rightarrow \{ u^{n},p^{n}\} \rightarrow \lambda ^{n+1}\ \mbox{ via}\quad \\ \{u^{n},p^{n}\} = \mbox{ arg}\min _{\{\mathit{ v},q\}\in V \times H}\mathcal{L}_{r}(\mathit{v},q;\lambda ^{n}), \quad \\ \lambda ^{n+1} =\lambda ^{n} +\rho (Bu^{n} - p^{n}), \quad \end{array} \right. }$$
(2.45)

an algorithm called ALG1 by some practitioners, following a terminology introduced in [78] (an alternative name could have been augmented Lagrangian Uzawa algorithm, which summarizes quite well what algorithm (2.45) is all about).

Concerning the convergence of ALG1 it has been proved in, e.g., [62, 63, 66] and [74] (see also [78]), that if:

  1. (i)

    \(\mathcal{L}\) has a saddle-point {{u, p}, λ} over (V × H) × H.

  2. (ii)

    B is an injection and R(B) is closed in H.

  3. (iii)

    \(\lim _{\vert q\vert \rightarrow +\infty }\dfrac{F(q)} {\vert q\vert } = +\infty \).

  4. (iv)

    F = F 0 + F 1 with F 0 and F 1 proper, lower semi-continuous and convex, with F 0 Gateaux-differentiable, and uniformly convex on the bounded sets of H

(the above properties imply that problem (2.39) has a unique solution), then we have, \(\forall \ r > 0\) and if

$$\displaystyle{ 0 <\rho < 2r, }$$

the following convergence result

$$\displaystyle{ \lim _{n\rightarrow +\infty }\{u^{n},p^{n}\} =\{ u,Bu\}\ \mbox{ in}\ V \times H, }$$
(2.46)

where u is the solution of problem (2.39); moreover, the convergence result (2.46) holds \(\forall \ \lambda ^{0} \in H\). The convergence of the multiplier sequence {λ n} n ≥ 0 is no better than weak in general, implying that the criterion used to stop ALG1 has to be chosen carefully. Of course, in finite dimension, the properties of B, F and G implying convergence are less demanding than in infinite dimension; for example, the existence of a solution to problem (2.39) is sufficient to imply the existence of a saddle-point.

The main difficulty with the Uzawa algorithm (2.45) is clearly the solution of the minimization problem it contains. An obvious way to solve this problem is to use a relaxation method (as advocated in [77, 78]). Suppose that, as in the two above references (which show that, for the nonlinear elliptic problem discussed there, the number of relaxation iterations quickly reduces to two), we limit the number of relaxation iterations to one when solving the minimization problem in (2.45): we then obtain the following primal-dual algorithm (called ALG2 by some practitioners):

$$\displaystyle{ \{u^{-1},\lambda ^{0}\}\ \mbox{ is given in}\ V \times H; }$$
(2.47)

for n ≥ 0, {u n−1, λ n} → p n → u n → λ n+1 via

$$\displaystyle{ p^{n} = \mbox{ arg}\min _{ q\in H}\mathcal{L}_{r}(u^{n-1},q;\lambda ^{n}), }$$
(2.48)
$$\displaystyle{ u^{n} = \mbox{ arg}\min _{\mathit{ v}\in V }\mathcal{L}_{r}(\mathit{v},p^{n};\lambda ^{n}), }$$
(2.49)
$$\displaystyle{ \lambda ^{n+1} =\lambda ^{n} +\rho (Bu^{n} - p^{n}). }$$
(2.50)

Assuming that

$$\displaystyle{ 0 <\rho < \dfrac{1 + \sqrt{5}} {2} r, }$$

with the other assumptions implying the convergence of ALG1 still holding, we have

$$\displaystyle{ \lim _{n\rightarrow +\infty }\{u^{n},p^{n}\} =\{ u,Bu\}\ \mbox{ in}\ V \times H, }$$

where u is the solution of problem (2.39). Convergence proofs can be found in [62, 63, 66] and [74].
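To make ALG2 concrete, the sketch below applies (2.47)–(2.50), with ρ = r, to a small discrete analogue of (2.39): G(v) = ½ | v − f | 2, F(q) = γ ∥ q ∥ 1, and B a forward-difference matrix (the sizes and parameter values are our own illustrative choices). Step (2.48) then reduces to a soft-thresholding (shrinkage) and step (2.49) to a linear solve:

```python
import numpy as np

def alg2(f, gamma, r, n_iter=300):
    """ALG2 / ADMM (rho = r) for min_v 0.5*|v - f|^2 + gamma*|B v|_1,
    with B the forward-difference operator (1D total variation)."""
    m = len(f)
    B = np.eye(m - 1, m, k=1) - np.eye(m - 1, m)   # (B v)_j = v_{j+1} - v_j
    u = f.copy()
    lam = np.zeros(m - 1)
    for _ in range(n_iter):
        # (2.48): minimization over q -> soft-thresholding of B u + lam/r
        z = B @ u + lam / r
        p = np.sign(z) * np.maximum(np.abs(z) - gamma / r, 0.0)
        # (2.49): minimization over v -> (I + r B^T B) u = f + B^T (r p - lam)
        u = np.linalg.solve(np.eye(m) + r * B.T @ B, f + B.T @ (r * p - lam))
        # (2.50): multiplier update with rho = r
        lam = lam + r * (B @ u - p)
    return u

f = np.array([0.0, 0.1, -0.1, 5.0, 5.2, 4.9])   # noisy piecewise-constant data
u = alg2(f, gamma=0.5, r=1.0)
print(np.round(u, 2))   # two near-constant plateaus separated by a jump
```

Note that the minimization over q in (2.48) is explicit precisely because introducing the auxiliary variable q decouples the non-smooth functional F from the operator B.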

A simple variant (called ALG3) of algorithm (2.47)–(2.50) is obtained by updating the multiplier a first time immediately after (2.48); we obtain then

$$\displaystyle{ \{u^{-1},\lambda ^{0}\}\ \mbox{ is given in}\ V \times H, }$$
(2.51)

for n ≥ 0, {u n−1, λ n} → p n → λ n+1∕2 → u n → λ n+1 via

$$\displaystyle{ p^{n} = \mbox{ arg}\min _{ q\in H}\mathcal{L}_{r}(u^{n-1},q;\lambda ^{n}), }$$
(2.52)
$$\displaystyle{ \lambda ^{n+1/2} =\lambda ^{n} +\rho (Bu^{n-1} - p^{n}), }$$
(2.53)
$$\displaystyle{ u^{n} = \mbox{ arg}\min _{\mathit{ v}\in V }\mathcal{L}_{r}(\mathit{v},p^{n};\lambda ^{n+1/2}), }$$
(2.54)
$$\displaystyle{ \lambda ^{n+1} =\lambda ^{n+1/2} +\rho (Bu^{n} - p^{n}). }$$
(2.55)

Most practitioners prefer ALG2 to ALG3, the main reason being that ALG2 is more robust than ALG3, in general.

Remark 21.

If one takes ρ = r in (2.47)–(2.50) and (2.51)–(2.55), the algorithms we obtain belong to the Alternating Direction Methods of Multipliers (ADMM) family (a terminology we will justify in Section 3.3). The convergence of ADMM related algorithms is rather well established in the convex case (see, for example, [18, 61, 95]; see also the references therein and other chapters of this book, the one by M. Burger, A. Sawatzky & G. Steidl in particular). On the other hand, one is still lacking a general theory for the convergence of algorithms such as ALG1, ALG2, and ALG3 when applied to the solution of non-convex variational problems. Nevertheless, the above algorithms have been successfully applied to the solution of non-convex problems as shown, for example, in [42, 72] (Chapter 4), [74], and other chapters of this book, Chapters 7 and 8, in particular.

Remark 22.

An important issue with the above primal-dual algorithms is how to vary r and ρ dynamically in order to improve the speed of convergence of these algorithms. This issue has been addressed in, e.g., [18, 34, 45, 46] (see also the references therein).

Remark 23.

An overlooked ([34] being a notable exception) property of primal-dual algorithms such as ALG1, ALG2 and ALG3 is that they may still be constructive in those not so uncommon situations where, in (2.39), one has dom(F ∘ B) ∩ dom(G) = ∅, implying that problem (2.39) has no solutions, strictly speaking. On the basis of the numerical results reported in [42] (see also [72] (Chapter 4) and Chapter 8 of this volume), we conjecture that if the parameters ρ and r are properly chosen, the sequence {{u n, p n}} n ≥ 0 converges to a pair {u, p} minimizing the functional

$$\displaystyle{ \{\mathit{v},q\} \rightarrow G(\mathit{v}) + F(q) }$$

over the set

$$\displaystyle{ \{\{\mathit{v},q\}\vert \{\mathit{v},q\} \in \mbox{ dom}(G) \times \mbox{ dom}(F),\vert B\mathit{v} - q\vert =\min _{\{w,\varpi \}\in \mbox{ dom}(G)\times \mbox{ dom}(F)}\vert Bw -\varpi \vert \}, }$$

while the sequence {λ n} n ≥ 0 diverges arithmetically (that is, | λ n | → +∞ like n multiplied by a positive constant, that is, slowly). If the above convergence/divergence result holds true (which seems to be the case for the non-convex problem discussed in [42]), it implies that the above primal-dual algorithms solve problem (2.39) in a least-squares sense, a most remarkable property indeed, testifying to the robustness of these algorithms. The above results look natural, but the optimization experts we consulted had trouble giving us a precise reference (or a proof).

Remark 24.

We have encountered situations (in incompressible finite elasticity in particular; see, e.g., [74] for details) where a safe way to proceed with the above primal-dual algorithms is as follows: employ ALG1 with a well-balanced (that is, neither too loose nor too tight) stopping criterion for the relaxation algorithm used to solve the minimization problem in (2.45); it has been observed quite often that the number of relaxation iterations necessary to compute {u n, p n} from λ n quickly goes down to one or two (an observation at the origin of ALG2), implying that, starting with ALG1, the algorithm switches automatically to ALG2. It is not uncommon for this implementation of ALG1 to produce an algorithm faster (CPU-wise) than ALG2 and ALG3 when solving “hard” problems. □ 

Further information on the convergence of Lagrange multiplier based iterative methods can be found in other chapters of this volume, and in, e.g., [45, 60, 62, 63], [66, 74] and [100] (see also the many references therein).

3.3 On the Relationship Between Alternating Direction Methods and ALG2, ALG3

As reported in [71] and [72] (Chapter 4) some previously unknown relationships between alternating direction methods and augmented Lagrangian algorithms were identified in 1975 by T.F. Chan and the first author of this chapter, while investigating the numerical solution of some simple nonlinear elliptic problems by various iterative methods (see [30] for details). Indeed, let us consider the particular case of problem (2.39) where V = H, B = I, and F and G are both differentiable over V; then, assuming that ρ = r, ALG2 (that is algorithm (2.47)–(2.50)) takes the following form:

$$\displaystyle{ \{u^{-1},\lambda ^{0}\}\ \mbox{ is given in}\ V \times H; }$$
(2.56)

for n ≥ 0, {u n−1, λ n} → p n → u n → λ n+1 via

$$\displaystyle{ r(p^{n} - u^{n-1}) + DF(p^{n}) -\lambda ^{n} = 0, }$$
(2.57)
$$\displaystyle{ r(u^{n} - p^{n}) + DG(u^{n}) +\lambda ^{n} = 0, }$$
(2.58)
$$\displaystyle{ \lambda ^{n+1} =\lambda ^{n} + r(u^{n} - p^{n}), }$$
(2.59)

where DF (resp., DG) denotes the differential of F (resp., G). By elimination of λ n and λ n+1 in (2.57)–(2.59), we obtain

$$\displaystyle{ r(p^{n} - u^{n-1}) + DF(p^{n}) + DG(u^{n-1}) = 0, }$$
$$\displaystyle{ r(u^{n} - u^{n-1}) + DF(p^{n}) + DG(u^{n}) = 0, }$$

which imply in turn (after replacing n − 1 by n):

$$\displaystyle{ r(p^{n+1} - u^{n}) + DF(p^{n+1}) + DG(u^{n}) = 0, }$$
(2.60)
$$\displaystyle{ r(u^{n+1} - u^{n}) + DF(p^{n+1}) + DG(u^{n+1}) = 0. }$$
(2.61)

Comparing with (2.17) shows that, in this particular case, ALG2 is a disguised form of the Douglas-Rachford scheme discussed in Section 2.5, with r = 1∕△t and DF (resp., DG) playing the role of A 1 (resp., A 2). A similar interpretation holds for ALG3: indeed, if we assume again that V = H, B = I, and F and G are differentiable, then, if ρ = r, algorithm (2.51)–(2.55) reduces to the Peaceman-Rachford scheme (2.15) discussed in Section 2.4. The above equivalence results can be generalized to situations where F and/or G are not differentiable.
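This equivalence is easy to verify numerically. In the sketch below (a toy choice of ours), F and G are scalar quadratics, so both (2.57)–(2.59) and (2.60)–(2.61) can be advanced in closed form; the two u-sequences then coincide to machine precision, provided λ 0 is initialized consistently (λ 0 = −DG(u −1)):

```python
# Toy instance of (2.39) with V = H = R, B = I:
#   F(v) = 0.5*a*v**2       -> DF(v) = a*v
#   G(v) = 0.5*b*(v - c)**2 -> DG(v) = b*(v - c)
a, b, c, r = 2.0, 3.0, 1.0, 1.5   # arbitrary illustrative values

def alg2_step(u, lam):
    """One sweep of ALG2 in the form (2.57)-(2.59), with rho = r."""
    p = (r * u + lam) / (r + a)               # (2.57) solved for p^n
    u_new = (r * p + b * c - lam) / (r + b)   # (2.58) solved for u^n
    lam_new = lam + r * (u_new - p)           # (2.59)
    return u_new, lam_new

def dr_step(u):
    """One sweep of the Douglas-Rachford form (2.60)-(2.61)."""
    p = (r * u - b * (u - c)) / (r + a)        # (2.60) solved for p^{n+1}
    u_new = (r * u - a * p + b * c) / (r + b)  # (2.61) solved for u^{n+1}
    return u_new

u_alg, lam = 0.0, -b * (0.0 - c)   # consistent start: lambda^0 = -DG(u^{-1})
u_dr = 0.0
for _ in range(20):
    u_alg, lam = alg2_step(u_alg, lam)
    u_dr = dr_step(u_dr)
print(u_alg, u_dr)   # identical trajectories; both approach b*c/(a+b) = 0.6
```
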

The reasons for which ALG2 and ALG3 are called Alternating Direction Methods of Multipliers (ADMM) by many practitioners should be clear now. For further information and details on these primal-dual equivalences, see the discussion by M. Yan and W. Yin in Chapter 5 of this book.

4 Operator-Splitting Methods for the Direct Numerical Simulation of Particulate Flow

4.1 Generalities. Problem Formulation

It is the (necessarily biased) opinion of the authors of this chapter that the direct numerical simulation of particulate flow has been one of the success stories of operator-splitting methods, thus justifying a dedicated section in this chapter, despite the fact that this story has been told in several publications (see, e.g., [70] (Chapters 8 & 9), [73] and [79], and the references therein). For simplicity, we will discuss only the one-particle case (however, the results of numerical experiments involving more than one particle will be presented).

Let Ω be a bounded, connected, and open region of I​​R d (d = 2 or 3 in applications); the boundary of Ω is denoted by Γ. We suppose that Ω contains:

  1. (i)

    A Newtonian incompressible viscous fluid of density ρ f and viscosity μ f ; ρ f and μ f are both positive constants.

  2. (ii)

    A rigid body B of boundary ∂ B, mass M, center of mass G, and inertia I at the center of mass (see Figure 2.1, for additional details).

    Fig. 2.1
    figure 1

    Visualization of the rigid body and of a part of the flow region

The fluid occupies the region \(\varOmega \setminus \overline{B}\) and we suppose that distance (∂ B(0), Γ) > 0. From now on, x = { x i } i = 1 d will denote the generic point of I​​R d, d x = dx 1 … d x d , while ϕ(t) will denote the function x → ϕ(x, t). Assuming that the only external force is gravity, the fluid flow-rigid body motion coupling is modeled by

$$\displaystyle\begin{array}{rcl} & & \rho _{f}\left (\dfrac{\partial \mathbf{u}} {\partial t} + (\mathbf{u} \cdot \boldsymbol{\nabla })\mathbf{u}\right ) -\mu _{f}\boldsymbol{\nabla }^{2}\mathbf{u} + \boldsymbol{\nabla }p =\rho _{ f}\mathbf{g}\ \ in\ \varOmega \setminus \overline{B(t)},\ \mbox{ a.e.}\ t \in (0,T),{}\end{array}$$
(2.62)
$$\displaystyle\begin{array}{rcl} & & \boldsymbol{\nabla }\cdot \mathbf{u}(t) = 0\ in\ \varOmega \setminus \overline{B(t)},\ \mbox{ a.e.}\ t \in (0,T),{}\end{array}$$
(2.63)
$$\displaystyle\begin{array}{rcl} & & \mathbf{u}(t) = \mathbf{u}_{\varGamma }(t)\ on\ \varGamma,\ \mbox{ a.e.}\ t \in (0,T),\ with\ \int _{\varGamma }\mathbf{u}_{\varGamma }(t) \cdot \mathbf{n}\,d\varGamma = 0,{}\end{array}$$
(2.64)
$$\displaystyle\begin{array}{rcl} & & \mathbf{u}(0) = \mathbf{u}_{0}\ in\ \varOmega \setminus \overline{B(0)}\ with\ \boldsymbol{\nabla }\cdot \mathbf{u}_{0} = 0,{}\end{array}$$
(2.65)

and

$$\displaystyle\begin{array}{rcl} & & \dfrac{dG} {dt} = \mathbf{V},{}\end{array}$$
(2.66)
$$\displaystyle\begin{array}{rcl} & & \mathbf{\mathit{M}}\ \dfrac{d\mathbf{V}} {dt} = \mathbf{\mathit{M}}\mathbf{g} + \mathbf{R}_{H},{}\end{array}$$
(2.67)
$$\displaystyle\begin{array}{rcl} & & \dfrac{d(\mathbf{I}\boldsymbol{\omega })} {dt} = \mathbf{T}_{H},{}\end{array}$$
(2.68)
$$\displaystyle\begin{array}{rcl} & & G(0) = G_{0},\mathbf{V}(0) = \mathbf{V}_{0},\boldsymbol{\omega }(0) = \boldsymbol{\omega }_{0},B(0) = B_{0}.{}\end{array}$$
(2.69)

In relations (2.62)–(2.69):

  • Vector u = { u i } i = 1 d is the fluid flow velocity and p is the pressure.

  • u 0 and u Γ are given vector-valued functions.

  • V is the velocity of the center of mass of body B, while \(\boldsymbol{\omega }\) is the angular velocity.

  • R H and T H denote, respectively, the resultant and the torque of the hydrodynamical forces, namely the forces that the fluid exerts on B; we have, actually,

$$\displaystyle{ \mathbf{R}_{H} =\int _{\partial B}\boldsymbol{\sigma }\mathbf{n}\,d\gamma \ \ and\ \ \mathbf{T}_{H} =\int _{\partial B}\stackrel{\mathop{\longrightarrow}\limits_{}^{\phantom{G\ \mathbf{x}}}}{G\mathbf{x}} \times \boldsymbol{\sigma }\mathbf{n}\,d\gamma. }$$
(2.70)

In (2.70) the stress-tensor \(\boldsymbol{\sigma }\) is defined by \(\boldsymbol{\sigma } = 2\mu _{f}\mathbf{D}(\mathbf{u}) - p\mathbf{I}_{d}\), with \(\mathbf{D}(\mathbf{v}) = \frac{1} {2}(\boldsymbol{\nabla }\mathbf{v} + (\boldsymbol{\nabla }\mathbf{v})^{t})\), while n is a unit normal vector at ∂ B and I d is the identity tensor.

Concerning the compatibility conditions on ∂ B, we have: (i) the forces exerted by the fluid on the solid body balance those exerted by the solid body on the fluid; and we shall assume that (ii) the no-slip boundary condition holds, namely

$$\displaystyle{ \mathbf{u}(\mathbf{x},t) = \mathbf{V}(t) + \boldsymbol{\omega }(t) \times \stackrel{\mathop{\longrightarrow}\limits_{}^{\phantom{G(t)\ \mathbf{x}}}}{G(t)\mathbf{x}},\ \forall \mathbf{x} \in \partial B(t). }$$
(2.71)

Remark 25.

System (2.62)–(2.65) (resp., (2.66)–(2.69)) is of the incompressible Navier-Stokes (resp., Euler-Newton) type. Also, the above model can be generalized to multiple-particles situations and/or non-Newtonian incompressible viscous fluids. □ 

The (local in time) existence of weak solutions for problems such as (2.62)–(2.69) has been proved in [52], assuming that, at t = 0, the particles do not touch Γ and each other (see also [87] and [145]). Concerning the numerical solution of (2.62)–(2.69), (2.71) several approaches are encountered in the literature, among them: (i) The Arbitrary Lagrange-Euler (ALE) methods; these methods, which rely on moving meshes, are discussed in, e.g., [98, 103] and [127]. (ii) The fictitious boundary method discussed in, e.g., [165], and (iii) the non-boundary fitted fictitious domain methods discussed in, e.g., [70, 79] and [140, 141] (and in Section 4.2, hereafter). Among other things, the methods in (ii) and (iii) have in common that the meshes used for the flow computations do not have to match the boundary of the particles.

Remark 26.

Even if theory suggests that collisions may never take place in finite time (if we assume that the particles have smooth shapes and that the flow is still modeled by the Navier-Stokes equations as long as the particles do not touch each other, or Γ), near collisions take place, and after discretization particles may collide. These phenomena can be handled by introducing (as done in, e.g., [70] (Chapter 8) and [79]) well-chosen short range repulsion potentials reminiscent of those encountered in Molecular Dynamics, or by using Kuhn-Tucker multipliers to authorize particle motions with contact but no overlapping (as done in, e.g., [128] and [129]). More information on the numerical treatment of particles in flow can be found in, e.g., [152] (and the references therein), and of course in Google.
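A minimal sketch of such a short-range repulsion force, for two spherical particles, is given below; the quadratic activation used here is one commonly used form, but the exact scaling and exponent vary between references, so the formula should be treated as illustrative:

```python
import numpy as np

def repulsion_force(Gi, Gj, Ri, Rj, rho, eps):
    """Short-range repulsion between two spherical particles with centers
    Gi, Gj and radii Ri, Rj: zero unless the center distance drops below
    Ri + Rj + rho (rho = activation range, eps = stiffness scale).
    Illustrative form; exact scalings differ between references."""
    d = np.linalg.norm(Gi - Gj)
    overlap = Ri + Rj + rho - d
    if overlap <= 0.0:
        return np.zeros_like(Gi)          # particles well separated: no force
    return (overlap ** 2 / eps) * (Gi - Gj) / d  # pushes particle i away from j

Gi, Gj = np.array([0.0, 0.0]), np.array([0.22, 0.0])
print(repulsion_force(Gi, Gj, Ri=0.1, Rj=0.1, rho=0.05, eps=1e-4))
# strong push on particle i in the -x direction, away from particle j
```

In practice eps is taken small enough that particles barely overlap, and an analogous force (with its own activation range) keeps particles away from the wall Γ.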

4.2 A Fictitious Domain Formulation

Considering the fluid-rigid body mixture as a unique (heterogeneous) medium, we are going to derive a fictitious domain based variational formulation to model its motion. The principle of this derivation is quite simple: it relies on the following steps (see, e.g., [70] and [79] for more details), where in Step a we denote by S: T the Frobenius inner product of the tensors S and T, that is (with obvious notation) \(\mathbf{S: T} =\sum _{1\leq i,j\leq d}s_{ij}t_{ij}\):

Step a. Start from the following global weak formulation (of the virtual power type):

$$\displaystyle\begin{array}{rcl} \left \{\begin{array}{@{}l@{\quad }l@{}} \quad &\rho _{f}\int _{\varOmega \setminus \overline{B(t)}}\left [\dfrac{\partial \mathbf{u}} {\partial t} + (\mathbf{u} \cdot \boldsymbol{\nabla })\mathbf{u}\right ] \cdot \mathbf{v}\,d\mathbf{x} + 2\mu _{f}\int _{\varOmega \setminus \overline{B(t)}}\mathbf{D}(\mathbf{u}): \mathbf{D}(\mathbf{v})\,d\mathbf{x} \\ \quad &-\int _{\varOmega \setminus \overline{B(t)}}p\boldsymbol{\nabla }\cdot \mathbf{v}\,d\mathbf{x} + \mathbf{\mathit{M}}\dfrac{d\mathbf{V}} {dt} \cdot \mathbf{Y} + \dfrac{d(\mathbf{I}\boldsymbol{\omega })} {dt} \cdot \boldsymbol{\theta } \\ \quad &=\rho _{f}\int _{\varOmega \setminus \overline{B(t)}}\mathbf{g} \cdot \mathbf{v}\,d\mathbf{x} + \mathbf{\mathit{M}}\mathbf{g} \cdot \mathbf{Y}, \\ \quad &\forall \{\mathbf{v},\mathbf{Y},\boldsymbol{\theta }\} \in (H^{1}(\varOmega \setminus \overline{B(t)}))^{d} \times I\!\!R^{d} \times \boldsymbol{\varTheta }\ \ and\ \ verifying\ \ \\ \quad &\mathbf{v} = 0\ on\ \varGamma,\ \mathbf{v}(\mathbf{x}) = \mathbf{Y} + \boldsymbol{\theta } \times \stackrel{\mathop{\longrightarrow}\limits_{}^{\phantom{G(t)\ \mathbf{x}}}}{G(t)\mathbf{x}},\forall \mathbf{x} \in \partial B(t),\ t \in (0,T), \\ \quad &with\ \boldsymbol{\varTheta } = I\!\!R^{3}\ if\ d = 3,\ \boldsymbol{\varTheta } =\{ (0,0,\theta )\ \vert \ \theta \in I\!\!R\}\ if\ d = 2,\end{array} \right.& &{}\end{array}$$
(2.72)
$$\displaystyle\begin{array}{rcl} & & \int _{\varOmega \setminus \overline{B(t)}}q\boldsymbol{\nabla }\cdot \mathbf{u}(t)\,d\mathbf{x} = 0,\forall q \in L^{2}(\varOmega \setminus \overline{B(t)}),\ t \in (0,T),{}\end{array}$$
(2.73)
$$\displaystyle\begin{array}{rcl} & & \mathbf{u}(t) = \mathbf{u}_{\varGamma }(t)\ on\ \varGamma,\ t \in (0,T),{}\end{array}$$
(2.74)
$$\displaystyle\begin{array}{rcl} & & \mathbf{u}(\mathbf{x},t) = \mathbf{V}(t) + \boldsymbol{\omega }(t) \times \stackrel{\mathop{\longrightarrow}\limits_{}^{\phantom{G(t)\ \mathbf{x}}}}{G(t)\mathbf{x}},\forall \mathbf{x} \in \partial B(t),\ t \in (0,T),{}\end{array}$$
(2.75)
$$\displaystyle\begin{array}{rcl} & & \dfrac{dG} {dt} = \mathbf{V},{}\end{array}$$
(2.76)
$$\displaystyle\begin{array}{rcl} & & \mathbf{u}(\mathbf{x},0) = \mathbf{u}_{0}(\mathbf{x}),\forall \mathbf{x} \in \varOmega \setminus \overline{B(0)},{}\end{array}$$
(2.77)
$$\displaystyle\begin{array}{rcl} & & G(0) = G_{0},\ \mathbf{V}(0) = \mathbf{V}_{0},\ \boldsymbol{\omega }(0) = \boldsymbol{\omega }_{0},\ B(0) = B_{0}.{}\end{array}$$
(2.78)

Step b. Fill B with the surrounding fluid and impose a rigid body motion to the fluid inside B.

Step c. Modify the global weak formulation (2.72)–(2.78) accordingly, taking advantage of the fact that if v is a rigid body motion velocity field, then \(\boldsymbol{\nabla }\cdot \mathbf{v} = 0\) and D(v) = 0.

Step d. Use a Lagrange multiplier defined over B to force the rigid body motion inside B.

Assuming that B is made of a homogeneous material of density ρ s , the above program leads to:

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \quad &\rho _{f}\int _{\varOmega }\left [\dfrac{\partial \mathbf{u}} {\partial t} + (\mathbf{u} \cdot \boldsymbol{\nabla })\mathbf{u}\right ] \cdot \mathbf{v}\,d\mathbf{x} + 2\mu _{f}\int _{\varOmega }\mathbf{D}(\mathbf{u}): \mathbf{D}(\mathbf{v})\,d\mathbf{x} -\int _{\varOmega }p\boldsymbol{\nabla }\cdot \mathbf{v}\,d\mathbf{x} \\ \quad &+(1 -\rho _{f}/\rho _{s})\left [\mathbf{\mathit{M}}\dfrac{d\mathbf{V}} {dt} \cdot \mathbf{Y} + \dfrac{d(\mathbf{I}\boldsymbol{\omega })} {dt} \cdot \boldsymbol{\theta }\right ]+ < \boldsymbol{\lambda },\mathbf{v} -\mathbf{Y} -\boldsymbol{\theta }\times \stackrel{\mathop{\longrightarrow}\limits_{}^{\phantom{G(t)\ \mathbf{x}}}}{G(t)\mathbf{x}} > _{B(t)} \\ \quad &=\rho _{f}\int _{\varOmega }\mathbf{g} \cdot \mathbf{v}\,d\mathbf{x} + (1 -\rho _{f}/\rho _{s})\mathbf{\mathit{M}}\mathbf{g} \cdot \mathbf{Y},\ \forall \{\mathbf{v},\mathbf{Y},\boldsymbol{\theta }\} \in (H^{1}(\varOmega ))^{d} \times I\!\!R^{d} \times \boldsymbol{\varTheta }, \\ \quad &t \in (0,T),\ \ with\ \boldsymbol{\varTheta } = I\!\!R^{3}\ if\ d = 3,\ \boldsymbol{\varTheta } =\{ (0,0,\theta )\ \vert \ \theta \in I\!\!R\}\ if\ d = 2, \end{array} \right. }$$
(2.79)
$$\displaystyle\begin{array}{rcl} \int _{\varOmega }q\boldsymbol{\nabla }\cdot \mathbf{u}(t)\,d\mathbf{x} = 0,\forall q \in L^{2}(\varOmega ),\ t \in (0,T),& &{}\end{array}$$
(2.80)
$$\displaystyle\begin{array}{rcl} \mathbf{u}(t) = \mathbf{u}_{\varGamma }(t)\ on\ \varGamma,\ t \in (0,T),& &{}\end{array}$$
(2.81)
$$\displaystyle\begin{array}{rcl} \left \{\begin{array}{@{}l@{\quad }l@{}} < \boldsymbol{\mu },\mathbf{u}(\mathbf{x},t) -\mathbf{V}(t) -\boldsymbol{\omega }(t) \times \stackrel{\mathop{\longrightarrow}\limits_{}^{\phantom{G(t)\ \mathbf{x}}}}{G(t)\mathbf{x}} > _{B(t)} = 0,\quad \\ \forall \boldsymbol{\mu } \in \boldsymbol{\varLambda }(t)\ (= (H^{1}(B(t)))^{d}),\ t \in (0,T), \quad \end{array} \right.& &{}\end{array}$$
(2.82)
$$\displaystyle\begin{array}{rcl} \dfrac{dG} {dt} = \mathbf{V},& &{}\end{array}$$
(2.83)
$$\displaystyle\begin{array}{rcl} \left \{\begin{array}{@{}l@{\quad }l@{}} G(0) = G_{0},\ \mathbf{V}(0) = \mathbf{V}_{0},\ \boldsymbol{\omega }(0) = \boldsymbol{\omega }_{0},\ B(0) = B_{0}, \quad \\ \mathbf{u}(\mathbf{x},0) = \mathbf{u}_{0}(\mathbf{x}),\forall \mathbf{x} \in \varOmega \setminus \bar{B}_{0},\ \ \mathbf{u}(x,0) = \mathbf{V}_{0} + \boldsymbol{\omega }_{0} \times \stackrel{\mathop{\longrightarrow}\limits_{}^{\phantom{G\ \mathbf{x}}}}{G_{0}\mathbf{x}},\ \forall \mathbf{x} \in \bar{B}_{0}.\quad \end{array} \right.& &{}\end{array}$$
(2.84)

From a theoretical point of view, a natural choice for < ⋅ , ⋅  >  B(t) is provided by, e.g.,

$$\displaystyle{ < \boldsymbol{\mu },\mathbf{v} > _{B(t)} =\int _{B(t)}[\boldsymbol{\mu } \cdot \mathbf{v} + l^{2}\mathbf{D}(\boldsymbol{\mu }): \mathbf{D}(\mathbf{v})]\,d\mathbf{x}; }$$
(2.85)

in (2.85), l is a characteristic length, the diameter of B, for example. In practice, following [70] (Chapter 8) and [79], one makes things much simpler by approximating \(\boldsymbol{\varLambda }(t)\) by

$$\displaystyle{ \boldsymbol{\varLambda }_{h}(t) =\{ \boldsymbol{\mu }\ \vert \ \boldsymbol{\mu } =\sum _{ j=1}^{N(t)}\boldsymbol{\mu }_{ j}\delta (\mathbf{x} -\mathbf{x}_{j}),\ with\ \boldsymbol{\mu }_{j} \in I\!\!R^{d},\ \forall j = 1,\ldots,N(t)\}, }$$
(2.86)

and the pairing in (2.85) by

$$\displaystyle{ <\mu,\mathbf{v} > _{(B(t),h)} =\sum _{ j=1}^{N(t)}\boldsymbol{\mu }_{ j} \cdot \mathbf{v}(\mathbf{x}_{j}). }$$
(2.87)

In (2.86), (2.87), x → δ(x − x_j) is the Dirac measure at x_j, and the set \(\{\mathbf{x}_{j}\}_{j=1}^{N(t)}\) is the union of two subsets, namely: (i) the set of the points of the velocity grid contained in B(t) whose distance to ∂B(t) is ≥ ch, h being a space discretization step and c a constant ≈ 1; (ii) a set of control points located on ∂B(t) and forming a mesh whose step size is of the order of h. It is clear that, with the above approach, one forces the rigid body motion inside the particle by collocation.
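For a two-dimensional circular particle, the construction of the collocation set just described can be sketched as follows (a minimal Python illustration under our own naming; the function and its interface are not taken from [70] or [79]):

```python
import numpy as np

def collocation_points(center, radius, h, c=1.0):
    """Build the multiplier collocation set for a 2-D disk B:
    (i) velocity-grid points inside B at distance >= c*h from the
    boundary; (ii) control points on the boundary circle with arc
    spacing of order h.  Illustrative sketch, not library code."""
    cx, cy = center
    # (i) velocity-grid points inside B, away from the boundary
    imin, imax = int((cx - radius) / h), int((cx + radius) / h) + 1
    jmin, jmax = int((cy - radius) / h), int((cy + radius) / h) + 1
    interior = []
    for i in range(imin, imax + 1):
        for j in range(jmin, jmax + 1):
            x, y = i * h, j * h
            if np.hypot(x - cx, y - cy) <= radius - c * h:
                interior.append((x, y))
    # (ii) control points on the boundary, spacing ~ h
    n_bdry = max(4, int(np.ceil(2 * np.pi * radius / h)))
    theta = 2 * np.pi * np.arange(n_bdry) / n_bdry
    boundary = list(zip(cx + radius * np.cos(theta),
                        cy + radius * np.sin(theta)))
    return np.array(interior), np.array(boundary)
```

Since B(t) moves with the particle, the set has to be rebuilt (or rigidly translated and rotated) at each time step.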

A variant of the above fictitious domain approach is discussed in [140] and [141]; after an appropriate elimination, it does not make use of Lagrange multipliers to force the rigid body motion of the particles, but uses instead projections on velocity subspaces where the rigid body motion velocity property is verified over the particles (see [140] and [141] for details).

4.3 Solving Problem (2.79)–(2.84) by Operator-Splitting

We do not consider collisions; after (formal) elimination of p and \(\boldsymbol{\lambda }\), problem (2.79)–(2.84) reduces to an initial value problem of the following form

$$\displaystyle\begin{array}{rcl} \dfrac{d\mathbf{X}} {dt} +\sum _{ j=1}^{J}A_{ j}(\mathbf{X},t) = \mathbf{0}\ on\ (0,T),\ \ \mathbf{X}(0) = \mathbf{X}_{0},& &{}\end{array}$$
(2.88)

where \(\mathbf{X} =\{ \mathbf{u},\mathbf{V},\boldsymbol{\omega },G\}\) (or \(\{\mathbf{u},\mathbf{V},\mathbf{I}\boldsymbol{\omega },G\}\)). A typical situation is the one where, with J = 4, operator A 1 is associated with incompressibility, A 2 with advection, A 3 with diffusion, and A 4 with the fictitious domain treatment of the rigid body motion; other decompositions are possible, as shown in, e.g., [70] (Chapter 8) and [79] (both references include a collision operator). Lie's scheme (2.3), (2.4) applies “beautifully” to the solution of problem (2.79)–(2.84). The resulting method is quite modular, implying that different space and time approximations can be used to treat the various sub-problems encountered at each time step; the only constraint is that two successive steps have to communicate (by projection, in general) to provide the initial condition required by each initial value sub-problem.
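The modular structure just described can be sketched as follows (an illustrative Python skeleton with a hypothetical sub-solver interface): each sub-solver advances its own fractional problem over the time step, starting from the state left by the previous one. For commuting linear operators the composition is exact; in general, Lie's scheme is first-order accurate in time.

```python
import numpy as np

def lie_split_step(X, t, dt, sub_solvers):
    """One step of Lie's scheme (2.3), (2.4) for
    dX/dt + sum_j A_j(X, t) = 0: apply the J sub-solvers
    sequentially, each advancing its own fractional problem
    over (t, t + dt) from the state produced by the previous
    one (hypothetical solver API, for illustration only)."""
    for solve_j in sub_solvers:
        X = solve_j(X, t, dt)
    return X
```

As a sanity check, splitting dX/dt + (a + b)X = 0 into its two scalar sub-problems reproduces the exact solution, since the two operators commute.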

4.4 Numerical Experiments

4.4.1 Generalities

The methodology we described (briefly) in the above paragraphs has been validated by numerous experiments (see, in particular, [70] (Chapters 8 & 9), [73, 79], [97, 137] and the related publications reported in http://www.math.uh.edu/~pan/). In this chapter, we will consider two test problems (borrowed from [73] (Section 3.4)): the first involves three particles, while the second concerns a channel flow with 300 particles. The fictitious domain/operator-splitting approach has made the solution of these problems (almost) routine nowadays. All the flow computations have been done using the Bercovier-Pironneau finite element approximation; namely (see [70] (Chapters 5, 8 and 9) for details), we used a globally continuous piecewise affine approximation of the velocity (resp., the pressure) associated with a triangulation (in 2-D) or tetrahedral partition (in 3-D) \(\mathcal{T}_{h}\) (resp., \(\mathcal{T}_{2h}\)) of Ω, h being a space discretization step. The pressure mesh is thus twice as coarse as the velocity one. The calculations have been done using uniform partitions \(\mathcal{T}_{h}\) and \(\mathcal{T}_{2h}\).

4.4.2 First Test Problem: Settling of Three Balls in a Vertical Narrow Tube

Our goal in this subsection is to discuss the interaction of three identical balls settling in a narrow tube of rectangular cross-section, containing an incompressible Newtonian viscous fluid. Theoretically, the tube should be infinitely long, but for practical reasons we consider the settling of the balls in a cylinder of length 6 whose cross-section is the rectangle Ω = (0, 1.5) × (0, 0.25); this cylinder moves with the balls in such a way that the center of the lower ball stays in the horizontal symmetry plane (a possible, but less satisfying, alternative would be to impose periodicity in the vertical direction). At time t = 0, we suppose that the truncated cylinder coincides with the “box” (0, 1.5) × (0, 0.25) × (0, 6), the centers of the balls being on the vertical axis of the cylinder, at the points of coordinates x_1 = 0.75, x_2 = 0.125, and x_3 = 1, 1.3 and 1.6, respectively. The parameters for this test case are ρ_s = 1.1, ρ_f = 1, μ_f = 1, the diameter of the balls being d = 0.2. The mesh size used to compute the velocity field (resp., the pressure) is h_v = h = 1/96 (resp., h_p = 2h = 1/48), while we took 1/1000 for the time-discretization step; the initial velocity of the flow is 0, and the three balls are released from rest. The velocity on the cylinder wall is 0. On the time interval [0, 15], the drafting, kissing and tumbling phenomenon (a terminology introduced by D.D. Joseph) has been observed several times before a stable quasi-horizontal configuration takes place, as shown in Figures 2.2, 2.3 and 2.4. The averaged vertical velocity of the balls is 2.4653 on the time interval [13, 15], while the averaged particle Reynolds number is 49.304 on the same time interval, clear evidence that inertia has to be taken into account.

Fig. 2.2
figure 2

Projections on the x 1 x 3-plane of the trajectories of the mass centers of the three particles

Fig. 2.3
figure 3

Relative positions of the three balls at t = 0, 0.4, 0.6, 0.65, 0.7, 0.8, 0.9, 1, 1.5, 2, 6, 6.25, 6.4, 6.6, 6.7, 8, 9, 10, 12, and 15 (from left to right and from top to bottom)

Fig. 2.4
figure 4

Visualization of the flow and of the particles at t = 1.1, 6.6, and 15.

4.4.3 Motion of 300 Neutrally Buoyant Disks in a Two-Dimensional Horizontal Channel

Since this second test problem involves 300 particles and a solid volume/fluid volume ratio of the order of 0.38, collisions (or near-collisions) have to be accounted for in the simulations; to do so, we have used the methods discussed in [70] (Chapter 8) and [79]. Another peculiarity of this test problem is that ρ_s = ρ_f for all the particles (a neutrally buoyant situation). Neutrally buoyant models are actually more delicate to handle than those of the general case, since 1 − ρ_f/ρ_s = 0 in (2.79); this difficulty can, however, be overcome, as shown in [136]. For this test problem, we have: (a) Ω = (0, 42) × (0, 12). (b) Ω contains the mixture of a Newtonian incompressible viscous fluid of density ρ_f = 1 and viscosity μ_f = 1 with 300 identical rigid solid disks of density ρ_s = 1 and diameter 0.9. (c) At t = 0, fluid and particles are at rest, the particle centers being located at the points of a regular lattice. (d) The mixture is put into motion by a uniform pressure drop of 10/9 per unit length (without the particles, the corresponding steady flow would have been of the Poiseuille type, with 20 as maximal flow speed). (e) The boundary conditions are u(x_1, x_2, t) = 0 for 0 ≤ x_1 ≤ 42, x_2 = 0 or 12, and 0 ≤ t ≤ 400 (no-slip boundary condition on the horizontal parts of the boundary), and u(0, x_2, t) = u(42, x_2, t) for 0 < x_2 < 12, 0 ≤ t ≤ 400 (space-periodicity in the Ox_1 direction). (f) h_v = h = 1/10, h_p = 2h = 1/5, the time-discretization step being 1/1000.

The particle distribution at t = 100, 107.8, 114, 200, and 400 has been visualized in Figure 2.5. These figures show that, initially, horizontal particle layers slide over one another; then, after some critical time, a chaotic flow-motion develops within very few time units, the highest particle concentration being along the channel axis (a careful inspection of the results shows that the transition to chaos takes place just after t = 107.8). The maximal speed at t = 400 is 7.9, implying that the corresponding particle Reynolds number is very close to 7.1. In Figure 2.6 we show the averaged solid fraction as a function of x_2, the averaging space-time set being {{x_1, t} | 0 ≤ x_1 ≤ 42, 380 ≤ t ≤ 400}; the particle aggregation along the horizontal symmetry axis of the channel appears very clearly from this figure, since the solid fraction is close to 0.58 at x_2 = 6 while the global solid fraction is 0.38 (vertical line in the figure). Finally, we have visualized in Figure 2.7 the x_1-averaged horizontal component of the mixture velocity at t = 400, as a function of x_2. The dashed line corresponds to the horizontal velocity distribution of the steady flow of the same fluid, without particles in the channel, for the same pressure drop; the corresponding velocity profile is (of course) of the Poiseuille type, and the comparison shows that the mixture behaves like a viscous fluid whose viscosity is approximately 2.5 times larger than μ_f. Actually, a closer inspection (see [136] for details) shows that the mixture behaves like a non-Newtonian incompressible viscous fluid of the power-law type, with exponent s = 1.7093 (s = 2 corresponding to a Newtonian fluid and s = 1 to a perfectly plastic material). Figures 2.5, 2.6, and 2.7 also show that, as is well known, some order may be found in chaos.

Fig. 2.5
figure 5

Positions of the 300 particles at t = 100, 107.8, 114, 200, and 400 (from top to bottom).

Fig. 2.6
figure 6

Averaged solid fraction distribution.

Fig. 2.7
figure 7

Horizontal velocity distribution at t = 400.

For more details and further results and comments on pressure driven neutrally buoyant particulate flows in two-dimensional channels (including simulations with much larger numbers of particles, the largest one being 1,200) see [70] (Chapter 9) and [136].

5 Operator-Splitting Methods for the Numerical Solution of Nonlinear Problems from Condensate and Plasma Physics

5.1 Introduction

Operator-splitting methods have been quite successful at solving problems in Computational Physics, besides those from Computational Mechanics (CFD in particular). Among these successful applications, let us mention those involving nonlinear Schrödinger equations, as shown, for example, by [9, 10, 44] and [102]. On the basis of some very inspiring articles (see, e.g., [9, 10] and [102]) he wrote on the above topic, the editors asked their colleague Peter Markowich to contribute a related chapter to this book; unfortunately, Professor Markowich, being busy elsewhere, had to decline. Considering the importance of nonlinear Schrödinger related problems, it was decided to (briefly) discuss in this chapter the solution of some of them by operator-splitting methods (see also Chapter 18 on the propagation of laser pulses along optical fibers). In Section 5.2, we will discuss the operator-splitting solution of the celebrated Gross-Pitaevskii equation for Bose-Einstein condensates; then, in Section 5.3, we will discuss the solution of the Zakharov system modeling the propagation of Langmuir waves in ionized plasma.

5.2 On the Solution of the Gross-Pitaevskii Equation

A Bose-Einstein condensate (BEC) is a state of matter of a dilute gas of bosons cooled to temperatures very close to absolute zero. Under such conditions, a large fraction of the bosons occupies the lowest quantum state, at which point macroscopic quantum phenomena become apparent. The existence of Bose-Einstein condensates was predicted in the mid-1920s by S.N. Bose and A. Einstein. If dilute enough, the time evolution of a BEC is described by the following Gross-Pitaevskii equation (definitely of the nonlinear Schrödinger type, and given here in dimensionless form, following [9]):

$$\displaystyle{ i\epsilon \dfrac{\partial \boldsymbol{\psi }} {\partial t} = -\dfrac{\epsilon ^{2}} {2}\nabla ^{2}\boldsymbol{\psi } + V _{ d}(x)\boldsymbol{\psi } + K_{d}\vert \boldsymbol{\psi }\vert ^{2}\boldsymbol{\psi }\ \ \mbox{ in}\ \varOmega \times (0,T), }$$
(2.89)

where, in (2.89), \(\boldsymbol{\psi }\) is a complex-valued function of x and t, \(i = \sqrt{-1}\), Ω is an open connected subset of \(I\!\!R^{d}\) (with d = 1, 2 or 3), the real-valued function V_d denotes an external potential, and the real-valued parameter K_d is representative of the particle interactions. Equation (2.89) has motivated a very large literature, from both the physical and mathematical points of view; let us mention, among many others, [1, 9, 125] and [126] (see also the many references therein). To solve equation (2.89) numerically, we need to complete it with boundary and initial conditions: from now on, we will assume that

$$\displaystyle{ \boldsymbol{\psi }(x,0) = \boldsymbol{\psi }_{0}(x),\ x \in \varOmega, }$$
(2.90)

and (denoting by Γ the boundary of Ω)

$$\displaystyle{ \boldsymbol{\psi } = 0\ \mbox{ on}\ \varGamma \times (0,T). }$$
(2.91)

The boundary conditions in (2.91) have been chosen for their simplicity, and also to provide an alternative to the periodic boundary conditions considered in [9]. An important (and very easy to prove) property of the solution of the initial boundary value problem (2.89)–(2.91) reads as:

$$\displaystyle{ \dfrac{d} {dt}\int _{\varOmega }\vert \boldsymbol{\psi }(x,t)\vert ^{2}\,dx = 0\ \mbox{ on}\ (0,T], }$$

implying that

$$\displaystyle{ \int _{\varOmega }\vert \boldsymbol{\psi }(x,t)\vert ^{2}\,dx =\int _{\varOmega }\vert \boldsymbol{\psi }_{ 0}(x)\vert ^{2}\,dx\ \mbox{ on}\ [0,T]. }$$
(2.92)

As done before, we denote by \(\boldsymbol{\psi }(t)\) the function \(x \rightarrow \boldsymbol{\psi }(x,t)\). Let Δt (> 0) be a time discretization step and denote (n + α)Δt by t^{n+α}; applying to problem (2.89)–(2.91) Strang's symmetrized scheme (2.7)–(2.10) of Section 2.3, we obtain:

$$\displaystyle\begin{array}{rcl} & & \boldsymbol{\psi }^{0} = \boldsymbol{\psi }_{ 0}; \\ & & \mbox{ for}\ n \geq 0,\boldsymbol{\psi }^{n} \rightarrow \boldsymbol{\psi }^{n+1/2} \rightarrow \widehat{\boldsymbol{\psi }}^{n+1/2} \rightarrow \boldsymbol{\psi }^{n+1}\ \mbox{ as follows}{}\end{array}$$
(2.93)
$$\displaystyle\begin{array}{rcl} & & \left \{\begin{array}{@{}l@{\quad }l@{}} i \dfrac{\partial \boldsymbol{\psi }} {\partial t} + \dfrac{\epsilon } {2}\nabla ^{2}\boldsymbol{\psi } = 0\ \ \mbox{ in}\ \varOmega \times (t^{n},t^{n+1/2}),\quad \\ \boldsymbol{\psi } = 0\ \ \mbox{ on}\ \varGamma \times (t^{n},t^{n+1/2}), \quad \\ \boldsymbol{\psi }(t^{n}) = \boldsymbol{\psi }^{n};\ \boldsymbol{\psi }^{n+1/2} = \boldsymbol{\psi }(t^{n+1/2}), \quad \end{array} \right.{}\end{array}$$
(2.94)
$$\displaystyle\begin{array}{rcl} & & \left \{\begin{array}{@{}l@{\quad }l@{}} i\epsilon \dfrac{\partial \boldsymbol{\psi }} {\partial t} = V _{d}(x)\boldsymbol{\psi } + K_{d}\vert \boldsymbol{\psi }\vert ^{2}\boldsymbol{\psi }\ \ \mbox{ in}\ \varOmega \times (0,\bigtriangleup t),\quad \\ \boldsymbol{\psi } = 0\ \ \mbox{ on}\ \varGamma \times (0,\bigtriangleup t), \quad \\ \boldsymbol{\psi }(0) = \boldsymbol{\psi }^{n+1/2};\ \widehat{\boldsymbol{\psi }}^{n+1/2} = \boldsymbol{\psi }(\bigtriangleup t), \quad \end{array} \right.{}\end{array}$$
(2.95)
$$\displaystyle\begin{array}{rcl} \left \{\begin{array}{@{}l@{\quad }l@{}} i \dfrac{\partial \boldsymbol{\psi }} {\partial t} + \dfrac{\epsilon } {2}\nabla ^{2}\boldsymbol{\psi } = 0\ \ \mbox{ in}\ \varOmega \times (t^{n+1/2},t^{n+1}),\quad \\ \boldsymbol{\psi } = 0\ \ \mbox{ on}\ \varGamma \times (t^{n+1/2},t^{n+1}), \quad \\ \boldsymbol{\psi }(t^{n+1/2}) = \widehat{\boldsymbol{\psi }}^{n+1/2};\ \boldsymbol{\psi }^{n+1} = \boldsymbol{\psi }(t^{n+1}). \quad \end{array} \right.& &{}\end{array}$$
(2.96)

On the solution of (2.95): Let us denote by ψ 1 (resp., ψ 2) the real (resp., imaginary) part of \(\boldsymbol{\psi }\); from (2.95), we have

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \epsilon \dfrac{\partial \psi _{1}} {\partial t} = V _{d}(x)\psi _{2} + K_{d}\vert \boldsymbol{\psi }\vert ^{2}\psi _{2}\ \ \mbox{ in}\ \varOmega \times (0,\bigtriangleup t), \quad \\ \epsilon \dfrac{\partial \psi _{2}} {\partial t} = -V _{d}(x)\psi _{1} - K_{d}\vert \boldsymbol{\psi }\vert ^{2}\psi _{1}\ \ \mbox{ in}\ \varOmega \times (0,\bigtriangleup t),\quad \\ \quad \end{array} \right. }$$
(2.97)

Multiplying by ψ 1 (resp., ψ 2) the 1st (resp., the 2nd) equation in (2.97), we obtain by addition

$$\displaystyle{ \dfrac{\partial } {\partial t}\vert \boldsymbol{\psi }(x,t)\vert ^{2} = 0\ \mbox{ on}(0,\bigtriangleup t),\ \mbox{ a.e.}\ x \in \varOmega, }$$

which implies in turn that

$$\displaystyle{ \vert \boldsymbol{\psi }(x,t)\vert = \vert \boldsymbol{\psi }(x,0)\vert = \vert \boldsymbol{\psi }^{n+1/2}\vert \ \mbox{ on}(0,\bigtriangleup t),\ \mbox{ a.e.}\ x \in \varOmega. }$$
(2.98)

It follows then from (2.95) and (2.98) that

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} i\epsilon \dfrac{\partial \boldsymbol{\psi }} {\partial t} = V _{d}(x)\boldsymbol{\psi } + K_{d}\vert \boldsymbol{\psi }^{n+1/2}\vert ^{2}\boldsymbol{\psi }\ \ \mbox{ in}\ \varOmega \times (0,\bigtriangleup t),\quad \\ \boldsymbol{\psi } = 0\ \ \mbox{ on}\ \varGamma \times (0,\bigtriangleup t), \quad \\ \boldsymbol{\psi }(0) = \boldsymbol{\psi }^{n+1/2};\ \widehat{\boldsymbol{\psi }}^{n+1/2} = \boldsymbol{\psi }(\bigtriangleup t), \quad \end{array} \right. }$$

which implies for \(\widehat{\boldsymbol{\psi }}^{n+1/2}\) the following closed-form solution

$$\displaystyle{ \widehat{\boldsymbol{\psi }}^{n+1/2} = e^{-i\frac{\bigtriangleup t} {\epsilon } \left (V _{d}+K_{d}\vert \boldsymbol{\psi }^{n+1/2}\vert ^{2}\right ) }\boldsymbol{\psi }^{n+1/2}. }$$
(2.99)
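The nonlinear sub-step thus reduces to a pointwise phase rotation. A short sketch (under our own naming) of (2.99), together with a check of the modulus conservation property (2.98):

```python
import numpy as np

def nonlinear_step(psi, V, K, eps, dt):
    """Closed-form solution (2.99) of sub-problem (2.95): each value
    of psi is rotated in the complex plane by the phase
    -(dt/eps) * (V + K |psi|^2); the modulus is unchanged, in
    agreement with (2.98)."""
    return np.exp(-1j * (dt / eps) * (V + K * np.abs(psi) ** 2)) * psi
```

Since only the phase changes, the discrete L² norm is conserved exactly by this sub-step, mirroring the invariance (2.92) of the continuous problem.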

On the solution of (2.94) and (2.96): The initial boundary value problems in (2.94) and (2.96) are particular cases of

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} i \dfrac{\partial \boldsymbol{\phi }} {\partial t} + \dfrac{\epsilon } {2}\nabla ^{2}\boldsymbol{\phi } = 0\ \ \mbox{ in}\ \varOmega \times (t_{0},t_{f}),\quad \\ \boldsymbol{\phi } = 0\ \ \mbox{ on}\ \varGamma \times (t_{0},t_{f}), \quad \\ \boldsymbol{\phi }(t_{0}) = \boldsymbol{\phi }_{0}. \quad \end{array} \right. }$$
(2.100)

The above linear Schrödinger problem is a very classical one. Its solution is obviously given by

$$\displaystyle{ \boldsymbol{\phi }(t) = e^{i \frac{\epsilon }{2} (t-t_{0})\nabla ^{2} }\boldsymbol{\phi }_{0},\ \forall t \in [t_{0},t_{f}]. }$$
(2.101)

Suppose that Ω = (0, a) × (0, b) × (0, c), with 0 < a < +∞, 0 < b < +∞, and 0 < c < +∞; since the eigenvalues, and related eigenfunctions, of the negative Laplace operator −∇², associated with homogeneous Dirichlet boundary conditions, are known explicitly and given, for p, q and r positive integers, by

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \lambda _{pqr} =\pi ^{2}\left (\dfrac{p^{2}} {a^{2}} + \dfrac{q^{2}} {b^{2}} + \dfrac{r^{2}} {c^{2}} \right ), \quad \\ w_{pqr}(x_{1},x_{2},x_{3}) = 2\sqrt{ \dfrac{2} {abc}}\sin \left (p\pi \dfrac{x_{1}} {a} \right )\sin \left (q\pi \dfrac{x_{2}} {b} \right )\sin \left (r\pi \dfrac{x_{3}} {c} \right )\quad \end{array} \right. }$$
(2.102)

(we have then \(\int _{\varOmega }\vert w_{pqr}(x)\vert ^{2}\,dx = 1\)), it follows from (2.101) that

$$\displaystyle{ \boldsymbol{\phi }(x,t) =\sum _{1\leq p,q,r<+\infty } \boldsymbol{\phi } _{pqr}^{0}e^{-i \frac{\epsilon }{2} \lambda _{pqr}(t-t_{0})}w_{ pqr}(x),\ \mbox{ with}\ \boldsymbol{\phi }_{pqr}^{0} =\int _{\varOmega }w_{ pqr}(y)\boldsymbol{\phi }_{0}(y)\,dy. }$$
(2.103)

In practice, one takes 1 ≤ p ≤ P, 1 ≤ q ≤ Q, 1 ≤ r ≤ R, and uses the Fast Fourier Transform (FFT) to compute the coefficients \(\boldsymbol{\phi }_{pqr}^{0}\) and then \(\boldsymbol{\phi }(x,t)\).
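In one space dimension, with Ω = (0, a), the representation (2.103) becomes a sine series, and the linear sub-steps (2.94), (2.96) can be carried out spectrally. The sketch below (ours) uses a dense sine-transform matrix for clarity; in practice the fast transform alluded to above is used instead:

```python
import numpy as np

def linear_step(psi, a, eps, dt):
    """1-D analogue of (2.103): solve i dpsi/dt + (eps/2) psi_xx = 0
    on (0, a), with psi(0) = psi(a) = 0 and psi sampled at the
    interior nodes x_k = k a / N, k = 1, ..., N - 1.  Expand psi on
    the sine eigenfunctions and rotate each coefficient by
    exp(-i (eps/2) lambda_p dt)."""
    N = psi.size + 1
    k = np.arange(1, N)                      # interior node indices
    p = np.arange(1, N)                      # retained sine modes
    S = np.sin(np.pi * np.outer(p, k) / N)   # (symmetric) sine matrix
    coef = (2.0 / N) * (S @ psi)             # discrete sine coefficients
    lam = (np.pi * p / a) ** 2               # eigenvalues lambda_p
    return S @ (np.exp(-1j * (eps / 2) * lam * dt) * coef)
```

Composed with the pointwise rotation of the nonlinear sub-step according to (2.93)–(2.96), this yields the complete Strang scheme in one dimension.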

For those more general situations where the solutions of the following linear eigenvalue problem

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \{w,\lambda \}\in H_{0}^{1}(\varOmega ) \times I\!\!R,\ \int _{\varOmega }\vert w(x)\vert ^{2}\,dx = 1,\ \lambda > 0,\quad \\ \int _{\varOmega }\nabla w \cdot \nabla \mathit{v}\,dx =\lambda \int _{\varOmega }w\mathit{v}\,dx,\ \forall \mathit{v} \in H_{0}^{1}(\varOmega ), \quad \end{array} \right. }$$
(2.104)

are not known explicitly, one still has several options to solve (2.100), an obvious one being:

Approximate (2.104) by

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \{w,\lambda \}\in V _{h} \times I\!\!R,\ \int _{\varOmega }\vert w(x)\vert ^{2}\,dx = 1,\ \lambda > 0,\quad \\ \int _{\varOmega }\nabla w \cdot \nabla \mathit{v}\,dx =\lambda \int _{\varOmega }w\mathit{v}\,dx,\ \forall \mathit{v} \in V _{h}, \quad \end{array} \right. }$$
(2.105)

where V_h is a finite-dimensional subspace of \(H_{0}^{1}(\varOmega )\). Then, as in, e.g., [17, 82], use an eigensolver (like the one discussed in [113]) to compute the first Q (≤ N = dim V_h) eigenpair solutions of (2.105), such that (with obvious notation) \(\int _{\varOmega }w_{p}w_{q}\,dx = 0\), \(\forall p,q,\ 1 \leq p,q \leq Q\), p ≠ q, and denote by V_Q the finite-dimensional space spanned by the basis \(\{w_{q}\}_{q=1}^{Q}\). Next, proceeding as in the continuous case, we approximate the solution of problem (2.100) by \(\boldsymbol{\phi }_{Q}\), defined by

$$\displaystyle{ \boldsymbol{\phi }_{Q}(x,t) =\sum _{ q=1}^{Q}\boldsymbol{\phi }_{ q}^{0}e^{-i \frac{\epsilon }{2} \lambda _{q}(t-t_{0})}w_{ q}(x),\ \mbox{ with}\ \boldsymbol{\phi }_{q}^{0} =\int _{\varOmega }w_{ q}(y)\boldsymbol{\phi }_{0}(y)\,dy. }$$
(2.106)

For the space V_h in (2.105), we can use the finite element approximations of \(H_{0}^{1}(\varOmega )\) discussed, for example, in [37, 66] (Appendix 1) and [72] (Chapter 1) (see also the references therein).
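The program (2.105)–(2.106) is easy to illustrate in one space dimension if, for brevity, one replaces the finite element space V_h by the standard second-order finite-difference discretization of the Dirichlet Laplacian on (0, a) (an illustrative substitution of ours, not the discretization used in the references):

```python
import numpy as np

def eigen_propagate(phi0, a, eps, t, Q=None):
    """Discrete analogue of (2.105)-(2.106): compute the first Q
    eigenpairs of the (negative) 1-D finite-difference Laplacian
    with homogeneous Dirichlet conditions, project phi0 on them,
    and rotate each coefficient by exp(-i (eps/2) lambda_q t)."""
    N = phi0.size + 1
    h = a / N
    main = 2.0 * np.ones(N - 1) / h ** 2
    off = -np.ones(N - 2) / h ** 2
    L = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    lam, W = np.linalg.eigh(L)               # ascending eigenvalues
    if Q is None:
        Q = N - 1                            # keep all discrete modes
    lam, W = lam[:Q], W[:, :Q]
    coef = W.T @ phi0                        # projection on the modes
    return W @ (np.exp(-1j * (eps / 2) * lam * t) * coef)
```

Since the retained modes are orthonormal and each coefficient is rotated by a unit-modulus factor, the discrete analogue of the conservation property (2.92) holds exactly when all modes are kept.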

Another approach, less obvious but still natural, is to observe that if \(\boldsymbol{\phi }\), the unique solution of (2.100), is smooth enough, then it is also the unique solution of

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \dfrac{\partial ^{2}\boldsymbol{\phi }} {\partial t^{2}} + \dfrac{\epsilon ^{2}} {4}\nabla ^{4}\boldsymbol{\phi } = 0\ \ \mbox{ in}\ \varOmega \times (t_{0},t_{f}), \quad \\ \boldsymbol{\phi } = 0\ \mbox{ and}\ \nabla ^{2}\boldsymbol{\phi } = 0\ \ \mbox{ on}\ \varGamma \times (t_{0},t_{f}),\quad \\ \boldsymbol{\phi }(t_{0}) = \boldsymbol{\phi }_{0},\ \dfrac{\partial \boldsymbol{\phi }} {\partial t}(t_{0}) = i \dfrac{\epsilon } {2}\nabla ^{2}\boldsymbol{\phi }_{0}(= \boldsymbol{\phi }_{1}),\quad \end{array} \right. }$$
(2.107)

a well-known model in elasto-dynamics (vibrations of simply supported plates). Given a positive integer Q, we define a time discretization step τ by \(\tau = \dfrac{t_{f} - t_{0}} {Q}\). The initial-boundary value problem (2.107) is clearly equivalent to

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \dfrac{\partial \boldsymbol{\phi }} {\partial t} = \mathbf{v}\ \ \mbox{ in}\ \varOmega \times (t_{0},t_{f}), \quad \\ \dfrac{\partial \mathbf{v}} {\partial t} + \dfrac{\epsilon ^{2}} {4}\nabla ^{4}\boldsymbol{\phi } = 0\ \ \mbox{ in}\ \varOmega \times (t_{0},t_{f}), \quad \\ \boldsymbol{\phi } = 0\ \mbox{ and}\ \nabla ^{2}\boldsymbol{\phi } = 0\ \ \mbox{ on}\ \varGamma \times (t_{0},t_{f}),\quad \\ \boldsymbol{\phi }(t_{0}) = \boldsymbol{\phi }_{0},\ \mathbf{v}(t_{0}) = i \dfrac{\epsilon } {2}\nabla ^{2}\boldsymbol{\phi }_{0}(= \mathbf{v}_{0}).\quad \end{array} \right. }$$
(2.108)

A time-discretization scheme for (2.107) (via (2.108)), combining good accuracy, stability, and energy conservation properties (see, e.g., [14]), reads as follows (with \(\{\boldsymbol{\phi }^{q},\mathbf{v}^{q}\}\) an approximation of \(\{\boldsymbol{\phi },\mathbf{v}\}\) at t_q = t_0 + qτ):

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \boldsymbol{\phi }^{0} = \boldsymbol{\phi }_{0},\ \mathbf{v}^{0} = \mathbf{v}_{0}; \quad \\ \mbox{ for $q = 0$,}\cdots \,,Q - 1,\ \mbox{ compute}\ \{\boldsymbol{\phi }^{q+1},\mathbf{v}^{q+1}\}\ \mbox{ from}\ \{\boldsymbol{\phi }^{q},\mathbf{v}^{q}\}\ \mbox{ via the solution of} \quad \\ \left \{\begin{array}{@{}l@{\quad }l@{}} \dfrac{\boldsymbol{\phi }^{q+1} -\boldsymbol{\phi }^{q}} {\tau } = \dfrac{1} {2}(\mathbf{v}^{q+1} + \mathbf{v}^{q}), \quad \\ \dfrac{\mathbf{v}^{q+1} -\mathbf{v}^{q}} {\tau } + \dfrac{\epsilon ^{2}} {8}\nabla ^{4}(\boldsymbol{\phi }^{q+1} + \boldsymbol{\phi }^{q}) = 0\ \ \mbox{ in}\ \varOmega,\quad \\ \boldsymbol{\phi }^{q+1} = 0\ \mbox{ and}\ \nabla ^{2}\boldsymbol{\phi }^{q+1} = 0\ \ \mbox{ on}\ \varGamma. \quad \end{array} \right.\quad \end{array} \right. }$$
(2.109)

By elimination of v^{q+1}, it follows from (2.109) that \(\boldsymbol{\phi }^{q+1}\) is the solution of

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \boldsymbol{\phi }^{q+1} + \dfrac{(\tau \epsilon )^{2}} {8} \nabla ^{4}\boldsymbol{\phi }^{q+1} = \boldsymbol{\phi }^{q} +\tau \mathbf{v}^{q} -\dfrac{(\tau \epsilon )^{2}} {8} \nabla ^{4}\boldsymbol{\phi }^{q}\ \ \mbox{ in}\ \varOmega,\quad \\ \boldsymbol{\phi }^{q+1} = 0\ \mbox{ and}\ \nabla ^{2}\boldsymbol{\phi }^{q+1} = 0\ \ \mbox{ on}\ \varGamma, \quad \end{array} \right. }$$
(2.110)

a bi-harmonic problem which is well posed in \(H_{0}^{1}(\varOmega ) \cap H^{2}(\varOmega )\). Next, one easily obtains v^{q+1} from

$$\displaystyle{ \mathbf{v}^{q+1} = \dfrac{2} {\tau } (\boldsymbol{\phi }^{q+1} -\boldsymbol{\phi }^{q}) -\mathbf{v}^{q}. }$$

For the solution of the bi-harmonic problem (2.110) we advocate those mixed finite element approximations and conjugate gradient algorithms used in various chapters of [72] (see also the references therein).
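To fix ideas, here is a minimal one-dimensional sketch of one step of scheme (2.109), written mode-by-mode in a sine basis on (0, π); these modes satisfy the boundary conditions φ = 0 and ∇²φ = 0, and ∇⁴ acts as multiplication by k⁴. This is our own illustrative analogue (not code from [72]); the elimination of v^{q+1} is carried out directly from the two equations of (2.109):

```python
import numpy as np

def step_mode(phi, v, k, tau, eps):
    """One step of the Crank-Nicolson scheme (2.109) for a single sine mode
    sin(k x) on (0, pi).  nabla^4 reduces to multiplication by k^4, so
    eliminating v^{q+1} from the two equations of (2.109) gives a scalar
    equation for phi^{q+1}."""
    a = (tau * eps) ** 2 * k ** 4 / 16.0
    phi_new = ((1.0 - a) * phi + tau * v) / (1.0 + a)
    v_new = 2.0 / tau * (phi_new - phi) - v   # recover v^{q+1} from the phi-equation
    return phi_new, v_new

# For a single mode, (2.108) is a harmonic oscillator with omega = (eps/2) k^2,
# and the scheme conserves the energy E = v^2 + omega^2 phi^2 exactly.
k, eps, tau = 3.0, 0.5, 1e-2
omega = 0.5 * eps * k ** 2
phi, v = 1.0, 0.2
E0 = v ** 2 + omega ** 2 * phi ** 2
for _ in range(1000):
    phi, v = step_mode(phi, v, k, tau, eps)
print(abs(v ** 2 + omega ** 2 * phi ** 2 - E0))  # conserved up to round-off
```

The exact conservation of the quadratic invariant illustrates the energy conservation property of the scheme mentioned above.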

5.3 On the Solution of Zakharov Systems

In 1972, V.E. Zakharov introduced a mathematical model describing the propagation of Langmuir waves in an ionized plasma (ref. [179]). This model reads as follows (after rescaling):

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} i\dfrac{\partial u} {\partial t} + \nabla ^{2}u = un, \quad \\ \dfrac{\partial ^{2}n} {\partial t^{2}} -\nabla ^{2}n + \nabla ^{2}(\vert u\vert ^{2}) = 0,\quad \end{array} \right. }$$
(2.111)

where the complex-valued function u is associated with a highly oscillating electric field, while the real-valued function n denotes the fluctuation of the plasma-ion density from its equilibrium state. In this section, following [102], we will apply the symmetrized Strang operator-splitting scheme (previously discussed in Section 2.3 of this chapter) to the following generalization of the above equations:

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} i\dfrac{\partial u} {\partial t} + \nabla ^{2}u + 2\lambda \vert u\vert ^{2}u + 2un = 0,\quad \\ \dfrac{1} {c^{2}} \dfrac{\partial ^{2}n} {\partial t^{2}} -\nabla ^{2}n +\mu \nabla ^{2}(\vert u\vert ^{2}) = 0, \quad \end{array} \right. }$$
(2.112)

where λ and μ are real numbers and c( > 0) is the wave propagation speed. Following again [102], we will assume, for simplicity, that the physical phenomenon modeled by (2.112) takes place on the bounded interval (0, L), with u, n, \(\partial u/\partial x\) and \(\partial n/\partial x\) space-periodic, during the time interval [0, T]. Thus, (2.112), completed by initial conditions, reduces to:

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} i\dfrac{\partial u} {\partial t} + \dfrac{\partial ^{2}u} {\partial x^{2}} + 2\lambda \vert u\vert ^{2}u + 2un = 0\ \ \mbox{ in}\ (0,L) \times (0,T),\quad \\ \dfrac{1} {c^{2}} \dfrac{\partial ^{2}n} {\partial t^{2}} -\dfrac{\partial ^{2}n} {\partial x^{2}} +\mu \dfrac{\partial ^{2}} {\partial x^{2}}(\vert u\vert ^{2}) = 0\ \ \mbox{ in}\ (0,L) \times (0,T),\quad \\ u(0,t) = u(L,t),\ \dfrac{\partial u} {\partial x}(0,t) = \dfrac{\partial u} {\partial x}(L,t)\ \ \mbox{ on}\ (0,T), \quad \\ n(0,t) = n(L,t),\ \dfrac{\partial n} {\partial x}(0,t) = \dfrac{\partial n} {\partial x}(L,t)\ \ \mbox{ on}\ (0,T), \quad \\ u(0) = u_{0},\ n(0) = n_{0},\ \dfrac{\partial n} {\partial t} (0) = n_{1}. \quad \end{array} \right. }$$
(2.113)

As done previously in this chapter, we denote by ϕ(t) the function x → ϕ(x, t).

Remark 27.

Although considered by some to be too simple from a physical point of view, space-periodic boundary conditions are common in plasma physics. They have been used, for example, in [131], a most celebrated article dedicated to the mathematical analysis of the behavior of plasma entropy (see also [163], which recounts a discussion that C. Villani had with E. Lieb concerning precisely the use of space-periodic boundary conditions in plasma physics). □ 

Given the rich structure of the Zakharov system (2.113), it is not surprising that a variety of operator-splitting schemes can be applied to its numerical solution; several of these schemes are described in [102] (see also the references therein concerning splitting schemes not described in [102]). A first step toward applying operator-splitting schemes to the time-discretization of problem (2.113) is to introduce the function \(p = \dfrac{\partial n} {\partial t}\) and to rewrite (2.113) as:

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} i\dfrac{\partial u} {\partial t} + \dfrac{\partial ^{2}u} {\partial x^{2}} + 2\lambda \vert u\vert ^{2}u + 2un = 0\ \ \mbox{ in}\ (0,L) \times (0,T), \quad \\ \dfrac{\partial n} {\partial t} - p = 0\ \ \mbox{ in}\ (0,L) \times (0,T), \quad \\ \dfrac{1} {c^{2}} \dfrac{\partial p} {\partial t} -\dfrac{\partial ^{2}n} {\partial x^{2}} +\mu \dfrac{\partial ^{2}} {\partial x^{2}}(\vert u\vert ^{2}) = 0\ \ \mbox{ in}\ (0,L) \times (0,T), \quad \\ u(0,t) = u(L,t),\ \dfrac{\partial u} {\partial x}(0,t) = \dfrac{\partial u} {\partial x}(L,t)\ \ \mbox{ on}\ (0,T), \quad \\ n(0,t) = n(L,t),\ \dfrac{\partial n} {\partial x}(0,t) = \dfrac{\partial n} {\partial x}(L,t),\ p(0,t) = p(L,t)\ \ \mbox{ on}\ (0,T),\quad \\ u(0) = u_{0},\ n(0) = n_{0},\ p(0) = n_{1}. \quad \end{array} \right. }$$
(2.114)

Applying Strang's symmetrized scheme to the time-discretization of problem (2.114), one obtains (among other possible schemes, and with \(t^{q+\alpha } = (q+\alpha )\bigtriangleup t\)):

$$\displaystyle{ \{u^{0},n^{0},p^{0}\} =\{ u_{ 0},n_{0},n_{1}\}. }$$
(2.115)

For q ≥ 0, \(\{u^{q},n^{q},p^{q}\} \rightarrow \{ u^{q+1/2},n^{q+1/2},p^{q+1/2}\} \rightarrow \{\hat{u}^{q+1/2},\hat{n}^{q+1/2},\hat{p}^{q+1/2}\} \rightarrow \{ u^{q+1},n^{q+1},p^{q+1}\}\) via

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} i\dfrac{\partial u} {\partial t} + \dfrac{\partial ^{2}u} {\partial x^{2}} = 0\ \ \mbox{ in}\ (0,L) \times (t^{q},t^{q+1/2}), \quad \\ \dfrac{\partial n} {\partial t} -\dfrac{p} {2} = 0\ \ \mbox{ in}\ (0,L) \times (t^{q},t^{q+1/2}), \quad \\ \dfrac{1} {c^{2}} \dfrac{\partial p} {\partial t} -\dfrac{\partial ^{2}n} {\partial x^{2}} = 0\ \ \mbox{ in}\ (0,L) \times (t^{q},t^{q+1/2}), \quad \\ u(0,t) = u(L,t),\ \dfrac{\partial u} {\partial x}(0,t) = \dfrac{\partial u} {\partial x}(L,t)\ \ \mbox{ on}\ (t^{q},t^{q+1/2}), \quad \\ n(0,t) = n(L,t),\ \dfrac{\partial n} {\partial x}(0,t) = \dfrac{\partial n} {\partial x}(L,t),\ p(0,t) = p(L,t)\ \ \mbox{ on}\ (t^{q},t^{q+1/2}),\quad \\ u(t^{q}) = u^{q},\ n(t^{q}) = n^{q},\ p(t^{q}) = p^{q}; \quad \\ u^{q+1/2} = u(t^{q+1/2}),n^{q+1/2} = n(t^{q+1/2}),p^{q+1/2} = p(t^{q+1/2}), \quad \end{array} \right. }$$
(2.116)
$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} i\dfrac{\partial u} {\partial t} + 2\lambda \vert u\vert ^{2}u + 2un = 0\ \ \mbox{ in}\ (0,L) \times (0,\bigtriangleup t), \quad \\ \dfrac{\partial n} {\partial t} -\dfrac{p} {2} = 0\ \ \mbox{ in}\ (0,L) \times (0,\bigtriangleup t), \quad \\ \dfrac{1} {c^{2}} \dfrac{\partial p} {\partial t} +\mu \dfrac{\partial ^{2}} {\partial x^{2}}(\vert u\vert ^{2}) = 0\ \ \mbox{ in}\ (0,L) \times (0,\bigtriangleup t), \quad \\ u(0,t) = u(L,t),\ \dfrac{\partial u} {\partial x}(0,t) = \dfrac{\partial u} {\partial x}(L,t)\ \ \mbox{ on}\ (0,\bigtriangleup t), \quad \\ n(0,t) = n(L,t),\ \dfrac{\partial n} {\partial x}(0,t) = \dfrac{\partial n} {\partial x}(L,t),\ p(0,t) = p(L,t)\ \ \mbox{ on}\ (0,\bigtriangleup t),\quad \\ u(0) = u^{q+1/2},\ n(0) = n^{q+1/2},\ p(0) = p^{q+1/2}; \quad \\ \hat{u}^{q+1/2} = u(\bigtriangleup t),\hat{n}^{q+1/2} = n(\bigtriangleup t),\hat{p}^{q+1/2} = p(\bigtriangleup t), \quad \end{array} \right. }$$
(2.117)
$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} i\dfrac{\partial u} {\partial t} + \dfrac{\partial ^{2}u} {\partial x^{2}} = 0\ \ \mbox{ in}\ (0,L) \times (t^{q+1/2},t^{q+1}), \quad \\ \dfrac{\partial n} {\partial t} -\dfrac{p} {2} = 0\ \ \mbox{ in}\ (0,L) \times (t^{q+1/2},t^{q+1}), \quad \\ \dfrac{1} {c^{2}} \dfrac{\partial p} {\partial t} -\dfrac{\partial ^{2}n} {\partial x^{2}} = 0\ \ \mbox{ in}\ (0,L) \times (t^{q+1/2},t^{q+1}), \quad \\ u(0,t) = u(L,t),\ \dfrac{\partial u} {\partial x}(0,t) = \dfrac{\partial u} {\partial x}(L,t)\ \ \mbox{ on}\ (t^{q+1/2},t^{q+1}), \quad \\ n(0,t) = n(L,t),\ \dfrac{\partial n} {\partial x}(0,t) = \dfrac{\partial n} {\partial x}(L,t),\ p(0,t) = p(L,t)\ \ \mbox{ on}\ (t^{q+1/2},t^{q+1}),\quad \\ u(t^{q+1/2}) = \hat{u}^{q+1/2},\ n(t^{q+1/2}) = \hat{n}^{q+1/2},\ p(t^{q+1/2}) = \hat{p}^{q+1/2}; \quad \\ u^{q+1} = u(t^{q+1}),n^{q+1} = n(t^{q+1}),p^{q+1} = p(t^{q+1}). \quad \end{array} \right. }$$
(2.118)

Scheme (2.115)–(2.118) is clearly equivalent to

$$\displaystyle{ \{u^{0},n^{0},p^{0}\} =\{ u_{ 0},n_{0},n_{1}\}. }$$
(2.119)

For q ≥ 0, \(\{u^{q},n^{q},p^{q}\} \rightarrow \{ u^{q+1/2},n^{q+1/2},p^{q+1/2}\} \rightarrow \{\hat{u}^{q+1/2},\hat{n}^{q+1/2},\hat{p}^{q+1/2}\} \rightarrow \{ u^{q+1},n^{q+1},p^{q+1}\}\) via

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} i\dfrac{\partial u} {\partial t} + \dfrac{\partial ^{2}u} {\partial x^{2}} = 0\ \ \mbox{ in}\ (0,L) \times (t^{q},t^{q+1/2}), \quad \\ \dfrac{2} {c^{2}} \dfrac{\partial ^{2}n} {\partial t^{2}} -\dfrac{\partial ^{2}n} {\partial x^{2}} = 0\ \ \mbox{ in}\ (0,L) \times (t^{q},t^{q+1/2}), \quad \\ u(0,t) = u(L,t),\ \dfrac{\partial u} {\partial x}(0,t) = \dfrac{\partial u} {\partial x}(L,t)\ \ \mbox{ on}\ (t^{q},t^{q+1/2}), \quad \\ n(0,t) = n(L,t),\ \dfrac{\partial n} {\partial x}(0,t) = \dfrac{\partial n} {\partial x}(L,t)\ \ \mbox{ on}\ (t^{q},t^{q+1/2}), \quad \\ u(t^{q}) = u^{q},\ n(t^{q}) = n^{q},\ \dfrac{\partial n} {\partial t} (t^{q}) = p^{q}/2; \quad \\ u^{q+1/2} = u(t^{q+1/2}),n^{q+1/2} = n(t^{q+1/2}),p^{q+1/2} = 2\dfrac{\partial n} {\partial t} (t^{q+1/2}),\quad \end{array} \right. }$$
(2.120)
$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} i\dfrac{\partial u} {\partial t} + 2\lambda \vert u\vert ^{2}u + 2un = 0\ \ \mbox{ in}\ (0,L) \times (0,\bigtriangleup t), \quad \\ \dfrac{2} {c^{2}} \dfrac{\partial ^{2}n} {\partial t^{2}} +\mu \dfrac{\partial ^{2}} {\partial x^{2}}(\vert u\vert ^{2}) = 0\ \ \mbox{ in}\ (0,L) \times (0,\bigtriangleup t), \quad \\ u(0,t) = u(L,t),\ \dfrac{\partial u} {\partial x}(0,t) = \dfrac{\partial u} {\partial x}(L,t)\ \ \mbox{ on}\ (0,\bigtriangleup t), \quad \\ n(0,t) = n(L,t),\ \dfrac{\partial n} {\partial x}(0,t) = \dfrac{\partial n} {\partial x}(L,t)\ \ \mbox{ on}\ (0,\bigtriangleup t), \quad \\ u(0) = u^{q+1/2},\ n(0) = n^{q+1/2},\ \dfrac{\partial n} {\partial t} (0) = p^{q+1/2}/2; \quad \\ \hat{u}^{q+1/2} = u(\bigtriangleup t),\hat{n}^{q+1/2} = n(\bigtriangleup t),\hat{p}^{q+1/2} = 2\dfrac{\partial n} {\partial t} (\bigtriangleup t),\quad \end{array} \right. }$$
(2.121)
$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} i\dfrac{\partial u} {\partial t} + \dfrac{\partial ^{2}u} {\partial x^{2}} = 0\ \ \mbox{ in}\ (0,L) \times (t^{q+1/2},t^{q+1}), \quad \\ \dfrac{2} {c^{2}} \dfrac{\partial ^{2}n} {\partial t^{2}} -\dfrac{\partial ^{2}n} {\partial x^{2}} = 0\ \ \mbox{ in}\ (0,L) \times (t^{q+1/2},t^{q+1}), \quad \\ u(0,t) = u(L,t),\ \dfrac{\partial u} {\partial x}(0,t) = \dfrac{\partial u} {\partial x}(L,t)\ \ \mbox{ on}\ (t^{q+1/2},t^{q+1}), \quad \\ n(0,t) = n(L,t),\ \dfrac{\partial n} {\partial x}(0,t) = \dfrac{\partial n} {\partial x}(L,t)\ \ \mbox{ on}\ (t^{q+1/2},t^{q+1}), \quad \\ u(t^{q+1/2}) = \hat{u}^{q+1/2},\ n(t^{q+1/2}) = \hat{n}^{q+1/2},\ \dfrac{\partial n} {\partial t} (t^{q+1/2}) = \hat{p}^{q+1/2}/2;\quad \\ u^{q+1} = u(t^{q+1}),n^{q+1} = n(t^{q+1}),p^{q+1} = 2\dfrac{\partial n} {\partial t} (t^{q+1}). \quad \end{array} \right. }$$
(2.122)

The linear Schrödinger and wave equations in (2.120) and (2.122) are uncoupled, implying that they can be solved by a variety of classical spectral or finite difference methods taking advantage of the space-periodic boundary conditions. On the other hand, the nonlinear system (2.121) can be solved pointwise: indeed, since the coefficient 2(λ | u | 2 + n) multiplying u in the first equation of (2.121) is real-valued (λ and n being real), it follows from the first and fifth equations in (2.121) that
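As an illustration, the linear Schrödinger substep appearing in (2.120) and (2.122) can be integrated exactly in Fourier space; below is a minimal sketch (our own, not taken from [102]), assuming a uniform grid on (0, L):

```python
import numpy as np

def schrodinger_substep(u, dt, L):
    """Exact integration of i u_t + u_xx = 0 over a time dt, with
    space-periodic boundary conditions on (0, L): each Fourier mode
    exp(i k x) is simply multiplied by exp(-i k^2 dt)."""
    k = 2.0 * np.pi * np.fft.fftfreq(u.size, d=L / u.size)  # angular wavenumbers
    return np.fft.ifft(np.exp(-1j * k ** 2 * dt) * np.fft.fft(u))

# sanity check: for u0(x) = exp(i x) on (0, 2*pi), the exact solution at
# time t is exp(i x) * exp(-i t)
L = 2.0 * np.pi
x = np.linspace(0.0, L, 64, endpoint=False)
u1 = schrodinger_substep(np.exp(1j * x), 0.1, L)
print(np.max(np.abs(u1 - np.exp(1j * x) * np.exp(-1j * 0.1))))  # ~ 1e-16
```

The wave-equation substep can be treated by the same Fourier technique, each mode reducing to a scalar second-order ODE.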

$$\displaystyle{ \vert u(x,t)\vert = \vert u^{q+1/2}(x)\vert,\ \forall t \in [0,\bigtriangleup t],\ x \in [0,L]. }$$
(2.123)

It then follows from (2.121) and (2.123) that the function n in (2.121) is also the solution of the following linear problem

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \dfrac{\partial ^{2}n} {\partial t^{2}} = -\dfrac{\mu } {2}c^{2} \dfrac{\partial ^{2}} {\partial x^{2}}(\vert u^{q+1/2}\vert ^{2})\ \ \mbox{ in}\ (0,L) \times (0,\bigtriangleup t),\quad \\ n(0,t) = n(L,t)\ \ \mbox{ on}\ (0,\bigtriangleup t), \quad \\ n(0) = n^{q+1/2},\ \dfrac{\partial n} {\partial t} (0) = p^{q+1/2}/2. \quad \end{array} \right. }$$
(2.124)

The closed form solution of (2.124) is given by

$$\displaystyle{ n(x,t) = n^{q+1/2}(x) + \dfrac{1} {2}p^{q+1/2}(x)t- \dfrac{\mu } {4}c^{2} \dfrac{\partial ^{2}} {\partial x^{2}}(\vert u^{q+1/2}\vert ^{2})t^{2}\ \ \mbox{ on}\ (0,L) \times (0,\bigtriangleup t), }$$
(2.125)

implying, in particular, that

$$\displaystyle{ \hat{n}^{q+1/2} = n^{q+1/2} + \dfrac{\bigtriangleup t} {2} p^{q+1/2} - \dfrac{\mu } {4}(c\bigtriangleup t)^{2} \dfrac{\partial ^{2}} {\partial x^{2}}(\vert u^{q+1/2}\vert ^{2}). }$$
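The closed form (2.125) is easily verified symbolically; in the sketch below (ours), the names n_q, p_q, w stand for n^{q+1/2}, p^{q+1/2} and | u^{q+1/2} | 2, assumed smooth:

```python
import sympy as sp

x, t = sp.symbols('x t', real=True)
c, mu = sp.symbols('c mu', positive=True)
n_q = sp.Function('n_q')(x)   # stands for n^{q+1/2}
p_q = sp.Function('p_q')(x)   # stands for p^{q+1/2}
w = sp.Function('w')(x)       # stands for |u^{q+1/2}|^2

# candidate solution (2.125)
n = n_q + p_q * t / 2 - mu * c**2 * sp.diff(w, x, 2) * t**2 / 4

# the ODE in (2.124) reads n_tt = -(mu/2) c^2 w_xx; check the residual
# and the two initial conditions n(0) = n^{q+1/2}, n_t(0) = p^{q+1/2}/2
assert sp.simplify(sp.diff(n, t, 2) + mu / 2 * c**2 * sp.diff(w, x, 2)) == 0
assert n.subs(t, 0) == n_q
assert sp.simplify(sp.diff(n, t).subs(t, 0) - p_q / 2) == 0
```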

Finally, to obtain the u-component of the solution of system (2.121), we observe that (n being known from (2.125)) it is the unique solution of the following non-autonomous linear initial value problem

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} i\dfrac{\partial u} {\partial t} + 2(\lambda \vert u^{q+1/2}\vert ^{2} + n)u = 0\ \ \mbox{ in}\ (0,L) \times (0,\bigtriangleup t),\quad \\ u(0,t) = u(L,t)\ \ \mbox{ on}\ (0,\bigtriangleup t), \quad \\ u(0) = u^{q+1/2}, \quad \end{array} \right. }$$
(2.126)

a particular case of

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} i \dfrac{\partial \phi } {\partial t} + 2(\lambda \vert \boldsymbol{\psi }\vert ^{2} + v)\phi = 0\ \ \mbox{ in}\ (0,L) \times (t_{0},t_{f}),\quad \\ \phi (0,t) =\phi (L,t)\ \ \mbox{ on}\ (t_{0},t_{f}), \quad \\ \phi (t_{0}) =\phi _{0}, \quad \end{array} \right. }$$
(2.127)

\(\boldsymbol{\psi }\) (resp., v) being a given complex-valued (resp., real-valued) function of x (resp., of {x, t}). With M ≥ 1 an integer, let us define a time-discretization step τ by \(\tau = \dfrac{t_{f} - t_{0}} {M}\), and set \(t^{m} = t_{0} + m\tau\). To solve (2.127) we advocate the following time-discretization scheme of the Crank-Nicolson type:

$$\displaystyle{ \phi ^{0} =\phi _{ 0}. }$$
(2.128)

For m = 0, ⋯ , M − 1, ϕ m → ϕ m+1 via the solution of

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} i\dfrac{\phi ^{m+1}-\phi ^{m}} {\tau } +\left [\lambda \vert \boldsymbol{\psi }\vert ^{2}+ \dfrac{v(t^{m+1}) + v(t^{m})} {2} \right ] (\phi ^{m+1}+\phi ^{m}) = 0\ \ \mbox{ in}\ (0,L),\quad \\ \phi ^{m+1}(0) =\phi ^{m+1}(L). \quad \end{array} \right. }$$
(2.129)

Problem (2.129) can be solved point-wise (in practice, at the grid-points of a one-dimensional finite difference "grid"). Scheme (2.128)–(2.129) is second-order accurate and modulus preserving (that is, it verifies | ϕ m+1 |  =  | ϕ m | , \(\forall m = 0,\ldots,M - 1\) ). At each point of [0, L], ϕ m+1(x) is obtained via the solution of a 2 × 2 real linear system (for those who prefer to use real arithmetic).
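Since the coefficient multiplying (ϕ m+1 + ϕ m) in (2.129) is real, the point-wise update can also be written as multiplication by a unit-modulus complex number, which makes the modulus preservation explicit. A sketch (using complex arithmetic instead of the 2 × 2 real system):

```python
import numpy as np

def cn_step(phi, psi, v_avg, lam, tau):
    """One step of scheme (2.129), solved in closed form at each grid point.
    With c = lam*|psi|^2 + (v(t^{m+1}) + v(t^m))/2 real, solving
    i*(phi_new - phi)/tau + c*(phi_new + phi) = 0 for phi_new gives
    phi_new = (i - tau*c)/(i + tau*c) * phi; the multiplier has modulus 1
    because c is real, hence |phi_new| = |phi|."""
    c = lam * np.abs(psi) ** 2 + v_avg
    return (1j - tau * c) / (1j + tau * c) * phi

rng = np.random.default_rng(0)
phi = rng.standard_normal(16) + 1j * rng.standard_normal(16)
psi = rng.standard_normal(16) + 1j * rng.standard_normal(16)
v_avg = rng.standard_normal(16)
out = cn_step(phi, psi, v_avg, lam=0.7, tau=1e-2)
print(np.max(np.abs(np.abs(out) - np.abs(phi))))  # ~ 1e-16
```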

Remark 28.

In [102], one advocates using, instead of n, the function n −μ | u | 2. The numerical results reported in the above publication clearly show that operator-splitting provides efficient methods for the numerical solution of the Zakharov system (2.112).

6 Applications of Augmented Lagrangian and ADMM Algorithms to the Solution of Problems from Imaging

6.1 Variational Models for Image Processing

6.1.1 Generalities

Usually, image processing refers to the processing and analysis of digital images. Variational models have become an essential part of image processing; such models rely on the minimization of a well-chosen energy functional, the minimization problem typically reading as

$$\displaystyle{ u = \mbox{ arg min}_{\mathit{v}\in V }\left [E_{fitting}(\mathit{v}) + E_{regularizing}(\mathit{v})\right ]. }$$
(2.130)

As shown above, the energy functional has two parts, namely a fitting part and a regularizing one. In the following, we will present various variational image processing models and show that the operator-splitting and ADMM methodology provides efficient methods for the numerical solution of the related minimization problems. We will start our discussion with the well-known Rudin-Osher-Fatemi (ROF) model, and then follow with the presentation of some higher order models. Before going into more details, some remarks are in order:

Remark 29.

Most of the models we are going to consider below are not yet fully understood from a mathematical point of view, two of the main issues being, in (2.130), the choice of the space V and the weak-continuity properties of the energy functional. This will not prevent us from using these continuous models: the simplicity of their formalism facilitates the derivation of algorithms whose discrete analogues have provable convergence properties.

Remark 30.

For image processing problems, the computational domain is always a rectangle, the image pixels providing a natural mesh for space discretization. This particularity makes, in general, the finite difference discretization of problem (2.130) and the implementation of iterative solution algorithms easy. The methodology we are going to discuss is not restricted to rectangular domains; however, for domains with curved boundaries, finite difference discretization may become complicated near the boundary. An elegant way to overcome this difficulty is to employ finite element approximations, as done in, e.g., [133].

Remark 31.

A very detailed analysis of ADMM algorithms for the solution of image processing problems can be found in the chapter of this book by M. Burger, A. Sawatzky & G. Steidl (Chapter 10).

6.1.2 Total Variation and the ROF Model

One of the most popular variational models for image processing was proposed by Rudin, Osher, and Fatemi in their seminal work [144] (the ROF model). In [144], a denoised image is obtained by minimizing the following energy functional

$$\displaystyle{ E(\mathit{v}) = \dfrac{1} {2}\int _{\varOmega }\vert f -\mathit{v}\vert ^{2}\,dx +\eta \int _{\varOmega }\vert \nabla \mathit{v}\vert \,dx, }$$
(2.131)

where: dx = dx 1 dx 2; f: Ω → I​​R is a given noisy image defined on Ω; \(\int _{\varOmega }\vert \nabla \mathit{v}\vert \,dx\) stands for the total variation of the trial function v (see [157] and [169] for a definition of the notion of total variation); and η > 0 is a tuning parameter controlling how much noise is removed. The remarkable feature of the ROF model lies in its effectiveness at preserving object edges while removing noise. In fact, the total variation regularizer has been widely employed to accomplish other image processing tasks such as deblurring, segmentation, and registration.
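On the discrete (pixel) level, the ROF energy (2.131) may be evaluated, for instance, with forward differences and the isotropic discrete total variation; a sketch under these common (but not unique) discretization choices:

```python
import numpy as np

def rof_energy(v, f, eta):
    """Discrete analogue of the ROF energy (2.131) on a pixel grid with unit
    spacing: 0.5*||f - v||_2^2 + eta*TV(v), the TV term built from forward
    differences (the last row/column is replicated, giving zero difference
    there)."""
    d1 = np.diff(v, axis=0, append=v[-1:, :])   # forward difference in x1
    d2 = np.diff(v, axis=1, append=v[:, -1:])   # forward difference in x2
    return 0.5 * np.sum((f - v) ** 2) + eta * np.sum(np.sqrt(d1 ** 2 + d2 ** 2))
```

For a constant image v = f, both terms vanish; Section 6.2 is concerned with algorithms minimizing such discrete energies.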

In order to incorporate more geometrical information into the regularizer, a number of higher order regularization models have been proposed and used for image processing and computer vision problems. Indeed, the ROF model has several unfavorable features. The main drawback is the staircase effect: the resulting cleaned image may exhibit piecewise constant blocks even though the desired image is smooth. Other undesirable properties include corner smearing and loss of image contrast. To remedy these drawbacks, a very rich list of results exists in the literature; see [2, 31, 120, 182, 185]. Despite the effectiveness of these models in removing the staircase effect, minimizing the corresponding functionals is often challenging. Note that if the functional E contains second-order derivatives of v, the related Euler-Lagrange equation is a fourth-order linear or nonlinear partial differential equation.

6.1.3 Regularization Using TV 2

In [120], Lysaker et al. directly incorporated second order derivative information into the image denoising process by proposing to minimize the following energy functional

$$\displaystyle{ E(\mathit{v}) = \dfrac{1} {2}\int _{\varOmega }\vert f -\mathit{v}\vert ^{2}\,dx +\eta \int _{\varOmega }\sqrt{(\mathit{v} _{ x_{1}x_{1}})^{2} + 2(\mathit{v}_{x_{ 1}x_{2}})^{2} + (\mathit{v}_{x_{ 2}x_{2}})^{2}}\,dx }$$
(2.132)

This higher order energy functional is much simpler than the Elastica regularizer that we shall introduce later. Numerically, this regularizer performs rather well regarding noise suppression and edge preservation. Quite a number of related models exist in the literature; see [20, 24, 25, 26, 28, 39, 53, 58, 89, 99, 101, 134, 138, 147, 171, 146, 180]. The well-posedness of the variational problem associated with the energy functional in (2.132), and of its gradient flow equation, has been studied in [88, 130]. High order models, such as the one associated with the energy in (2.132), have been discussed in, e.g., [15, 24, 32, 149, 176].

6.1.4 Regularization Using the Euler’s Elastica Energy

In order to ‘clean’ a given function f: Ω → I​​R, the Euler’s Elastica model relies on the minimization of the following energy functional

$$\displaystyle{ E(\mathit{v}) = \dfrac{1} {2}\int _{\varOmega }\vert f -\mathit{v}\vert ^{2}\,dx +\int _{\varOmega }\left [a + b\left \vert \nabla \cdot \dfrac{\nabla \mathit{v}} {\vert \nabla \mathit{v}\vert }\right \vert ^{2}\right ]\vert \nabla \mathit{v}\vert \,dx. }$$
(2.133)

In (2.133), a and b are non-negative constants with a + b > 0. These two constants have to be chosen properly, depending on the application under consideration. The image processing model associated with the above energy functional comes from the Euler's Elastica energy for curves (see [31, 124] for the derivation of this energy): indeed, for a given curve Γ ⊂ I​​R 2 with curvature κ, the Euler's Elastica energy is defined (with obvious notation) by \(\int _{\varGamma }(a + b\kappa ^{2})\,ds\). For a function v, the curvature of the level curve Γ c : = { x | v(x) = c} is \(\kappa = \nabla \cdot \dfrac{\nabla \mathit{v}} {\vert \nabla \mathit{v}\vert }\) (if ∇v ≠ 0). Thus, the Euler's Elastica energy for the level curve Γ c is given by

$$\displaystyle{ l(c) =\int _{\varGamma _{c}}\left [a + b\left \vert \nabla \cdot \dfrac{\nabla \mathit{v}} {\vert \nabla \mathit{v}\vert }\right \vert ^{2}\right ]\,ds. }$$

Summing up (integrating) the Euler’s Elastica energy over all the level curves Γ c , it follows from the co-area formula (see [168]) that the total Euler’s Elastica energy is given by

$$\displaystyle{ \int _{-\infty }^{\infty }l(c)\,dc =\int _{ -\infty }^{\infty }\int _{ \varGamma _{c}}\left [a + b\left \vert \nabla \cdot \dfrac{\nabla \mathit{v}} {\vert \nabla \mathit{v}\vert }\right \vert ^{2}\right ]\,dsdc =\int _{\varOmega }\left [a + b\left \vert \nabla \cdot \dfrac{\nabla \mathit{v}} {\vert \nabla \mathit{v}\vert }\right \vert ^{2}\right ]\vert \nabla \mathit{v}\vert \,dx. }$$

6.1.5 Regularization Using the Image Graph Mean Curvature

In [182], the authors proposed a variational image processing model making use of the mean curvature of the graph of the function f, that is, of the surface {x, y, z = f(x, y)}, to remove noise. More specifically, the model considered in [182] employs the L 1 norm of the mean curvature of the above graph as a regularizer, the associated energy functional being defined by

$$\displaystyle{ E(\mathit{v}) = \dfrac{1} {2}\int _{\varOmega }\vert f -\mathit{v}\vert ^{2}\,dx +\eta \int _{\varOmega }\left \vert \nabla \cdot \dfrac{\nabla \mathit{v}} {\sqrt{1 + \vert \nabla \mathit{v} \vert ^{2}}}\right \vert \,dx. }$$
(2.134)

Above, η( > 0) is a tuning parameter and the term \(\nabla \cdot \dfrac{\nabla \mathit{v}} {\sqrt{1 + \vert \nabla \mathit{v} \vert ^{2}}}\) is the mean curvature of the surface ϕ(x, y, z) = 0 with ϕ(x, y, z) = v(x, y) − z. Clearly, the model tries to fit the given noisy image surface {x, y, z = f(x, y)} with a surface {x, y, z = u(x, y)}, u being a minimizer of the L 1-mean curvature energy functional (2.134). This idea goes back to much earlier publications, [108] for example. The model can remove noise while keeping object edges, and it also avoids the staircase effect. More importantly, as discussed in [185], the model is also capable of preserving image contrasts as well as object corners.
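A discrete evaluation of the energy (2.134) can be sketched with centered differences (one of several possible stencils; boundary handling varies in the literature):

```python
import numpy as np

def mean_curvature_energy(v, f, eta):
    """Discrete analogue of the L^1 mean curvature energy (2.134) on a unit
    pixel grid: the regularizer is |div( grad v / sqrt(1 + |grad v|^2) )|
    summed over the pixels, with centered differences via np.gradient."""
    v1, v2 = np.gradient(v)                       # partial derivatives of v
    denom = np.sqrt(1.0 + v1 ** 2 + v2 ** 2)
    curv = np.gradient(v1 / denom, axis=0) + np.gradient(v2 / denom, axis=1)
    return 0.5 * np.sum((f - v) ** 2) + eta * np.sum(np.abs(curv))

# a plane z = a*x1 + b*x2 has a flat graph (zero mean curvature), so with
# f = v the discrete energy vanishes
i, j = np.meshgrid(np.arange(8), np.arange(8), indexing='ij')
v = 0.5 * i + 0.25 * j
print(mean_curvature_energy(v, v, eta=1.0))  # -> 0.0
```

Unlike the TV term of the ROF model, this regularizer vanishes on affine images, which is consistent with the absence of staircasing noted above.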

6.1.6 Interface Problems: Chan-Vese Segmentation Model, Labeling Techniques, Min-Cut, and Continuous Max-Flow

In image processing, computer vision, etc., one encounters operations more complicated than denoising, segmentation being one of them. These applications require mathematical models more complicated (in some sense) than those considered in Sections 6.1.2 to 6.1.5, one of them being the Chan-Vese model introduced in [33]. Actually (as is obvious from [33]), the snake and active contour model (ref. [106]) and the Mumford-Shah model (ref. [132]) can be viewed as ancestors of the Chan-Vese model. Using the notation of [33], the Chan-Vese segmentation model relies on the minimization of the following energy functional:

$$\displaystyle\begin{array}{rcl} E_{CV }(\phi,d_{1},d_{2})& =& \lambda _{1}\int _{\varOmega }\vert f - d_{1}\vert ^{2}H(\phi )\,dx +\lambda _{ 2}\int _{\varOmega }\vert f - d_{2}\vert ^{2}[1 - H(\phi )]\,dx \\ & & +\mu \int _{\varOmega }\vert \nabla H(\phi )\vert \,dx +\nu \int _{\varOmega }H(\phi )\,dx, {}\end{array}$$
(2.135)

where in (2.135): (i) ϕ is a level set function whose zero level set represents the segmentation boundary. (ii) H(⋅ ) is the Heaviside function. (iii) d 1 and d 2 are two real numbers. (iv) λ 1, λ 2 and μ (resp., ν) are positive (resp., non-negative) tuning parameters (in many applications, one takes λ 1 = λ 2 = 1). The Euler-Lagrange equation associated with the minimization of the functional in (2.135) has been derived in [33]. In the above reference, the associated gradient flow (after an appropriate finite difference space discretization) has been time-discretized by an explicit scheme to compute the solution of the above minimization problem. Operator-splitting and ADMM can be used to develop algorithms with much faster convergence properties than the above explicit schemes; we will return to this issue in Section 6.2. Let us denote H(ϕ) by v; there is clearly equivalence between minimizing the functional defined by (2.135) and

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \inf _{\{\mathit{v},d_{1},d_{2}\}\in V \times I\!\!R\times I\!\!R}\ [\lambda _{1}\int _{\varOmega }\vert f - d_{1}\vert ^{2}\mathit{v}\,dx +\lambda _{ 2}\int _{\varOmega }\vert f - d_{2}\vert ^{2}[1 -\mathit{v}]\,dx\quad \\ \qquad \qquad \qquad \qquad +\mu \int _{\varOmega }\vert \nabla \mathit{v}\vert \,dx +\nu \int _{\varOmega }\mathit{v}\,dx], \quad \end{array} \right. }$$
(2.136)

where V = { v | v ∈ L ∞(Ω), v(x) ∈ { 0, 1}, a.e. in Ω, ∇v ∈ L 1(Ω)}. The model associated with (2.136) was proposed in [117] and referred to as a binary level set based model. More generally, we can consider the minimization, over the above set V, of energy functionals such as E potts defined by

$$\displaystyle{ E_{potts}(\mathit{v}) =\int _{\varOmega }f_{1}\mathit{v}\,dx +\int _{\varOmega }f_{2}[1 -\mathit{v}]\,dx +\int _{\varOmega }g\vert \nabla \mathit{v}\vert \,dx, }$$
(2.137)

where f 1 and f 2 are given functions indicating the likelihood that a point belongs to phase 0 or to phase 1, and where g is a non-negative function, possibly constant; if d 1 and d 2 are fixed in (2.136), the Chan-Vese model becomes a particular case of the model associated with the functional E potts defined by (2.137). It was recently observed (see [173, 175]) that minimizing E potts over the above V is a (kind of) continuous min-cut problem, itself equivalent (by duality) to a max-flow problem. Indeed, let us consider the following continuous max-flow problem

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \sup _{q_{s},q_{t},\mathbf{v}}\int _{\varOmega }q_{s}\,dx\ \ \mbox{ subject to} \quad \\ q_{s} \leq f_{1},q_{t} \leq f_{2},\vert \mathbf{v}\vert \leq g, \quad \\ \nabla \cdot \mathbf{v} = q_{s} - q_{t}\ \mbox{ in}\ \varOmega,\mathbf{v} \cdot \mathbf{n} = 0\ \mbox{ on}\ \varGamma (= \partial \varOmega ),\quad \end{array} \right. }$$
(2.138)

where in (2.138): (i) v = { v 1, v 2} and \(\vert \mathbf{v}\vert = \sqrt{\mathit{v} _{1 }^{2 } + \mathit{v} _{2 }^{2}}\), v being the flow inside Ω. (ii) n is the unit outward normal vector at Γ. (iii) q s (resp., q t ) represents a flow from a source (resp., to a sink). (iv) f 1 and f 2 are as in (2.137). We can also define | v | by | v | : =  | v 1 | + | v 2 | ; if we do so, the discretized max-flow problem can be solved by traditional graph cut methods. It follows from [175] that a dual of the max-flow problem (2.138) reads as:

$$\displaystyle{ \inf _{\mu \in \varLambda }\left [\int _{\varOmega }f_{1}(1-\mu )\,dx +\int _{\varOmega }f_{2}\mu \,dx +\int _{\varOmega }g\vert \nabla \mu \vert \,dx\right ], }$$
(2.139)

where Λ = {μ | μ ∈ L ∞(Ω), 0 ≤ μ(x) ≤ 1, a.e. in Ω} ∩ W 1, 1(Ω). We have thus recovered the functional E potts from (2.137) and shown a link between the Chan-Vese model and the max-flow problem. The dual problem (2.139) is known as a (continuous) min-cut problem. Actually, Chan, Esedoglu and Nikolova have shown in [29] that (2.139) is equivalent to minimizing over V = { v | v ∈ L ∞(Ω), v(x) ∈ { 0, 1},  a.e. in Ω, ∇v ∈ L 1(Ω)} the functional E potts defined by (2.137), a most remarkable result indeed, since problem (2.139) is a convex variational problem whose discrete variants can be solved by ADMM type algorithms (see [5, 6, 7, 8, 114, 173, 174, 175, 178] for more details and generalizations).
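On the discrete level, the correspondence between (2.137) and (2.139) amounts to the change of variable μ = 1 − v (the labels of the two phases are swapped, while the TV term is unchanged). A small numerical sanity check of this identity (our own sketch, using the anisotropic TV for simplicity and a constant g):

```python
import numpy as np

def tv_aniso(w):
    """Anisotropic discrete total variation (the |v_1| + |v_2| choice)."""
    return np.sum(np.abs(np.diff(w, axis=0))) + np.sum(np.abs(np.diff(w, axis=1)))

def e_potts(v, f1, f2, g):
    """Discrete analogue of E_potts in (2.137), with constant g."""
    return np.sum(f1 * v) + np.sum(f2 * (1 - v)) + g * tv_aniso(v)

def min_cut(mu, f1, f2, g):
    """Discrete analogue of the min-cut objective (2.139), same TV and g."""
    return np.sum(f1 * (1 - mu)) + np.sum(f2 * mu) + g * tv_aniso(mu)

rng = np.random.default_rng(1)
v = (rng.random((16, 16)) > 0.5).astype(float)           # a binary labeling
f1, f2 = rng.random((16, 16)), rng.random((16, 16))
print(np.isclose(e_potts(v, f1, f2, 2.0), min_cut(1 - v, f1, f2, 2.0)))  # True
```

The identity holds because TV(1 − v) = TV(v) and the fitting terms exchange roles under μ = 1 − v.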

Remark 32.

In (2.136), (2.138) and (2.139), we used inf (resp., sup) instead of min (resp., max) on purpose, since we have no guarantee that the minimizing sequences of the functionals under consideration converge weakly in the space or set where the minimization takes place.

Remark 33.

Suppose that in (2.138) we replace the constraint | v | ≤ g by | v 1 | ≤ g 1 and | v 2 | ≤ g 2, everything else being the same; then, the dual problem of the associated variant of (2.138) reads (with Λ as in (2.139)) as

$$\displaystyle{ \inf _{\mu \in \varLambda }\left [\int _{\varOmega }f_{1}(1-\mu )\,dx +\int _{\varOmega }f_{2}\mu \,dx +\int _{\varOmega }\left (g_{1}\left \vert \dfrac{\partial \mu } {\partial x_{1}}\right \vert + g_{2}\left \vert \dfrac{\partial \mu } {\partial x_{2}}\right \vert \right )\,dx\right ], }$$

clearly a close variant of (2.139). Similarly, if we replace in (2.138) the constraint | v | ≤ g by | v 1 | + | v 2 | ≤ g, we obtain (as expected) the following dual problem

$$\displaystyle{ \inf _{\mu \in \varLambda }\left [\int _{\varOmega }f_{1}(1-\mu )\,dx +\int _{\varOmega }f_{2}\mu \,dx +\int _{\varOmega }g\ \sup \left (\left \vert \dfrac{\partial \mu } {\partial x_{1}}\right \vert,\left \vert \dfrac{\partial \mu } {\partial x_{2}}\right \vert \right )\,dx\right ], }$$

the set Λ being as above.

6.1.7 Segmentation Models with Higher Order Regularization

As could have been expected, first order segmentation models have limitations (discussed in [132]). To give an example, let us consider the situation depicted in Figure 2.8(a), where some parts of the four letters have been erased: albeit one can easily recognize the four letters, first order segmentation models such as Chan-Vese's may capture the existing boundaries instead of restoring the missing ones, as illustrated in Figure 2.8(b). In inpainting problems (see [31, 124]), missing image information is also recovered, but within given regions assigned in advance. In contrast, one would like to have a segmentation model that can interpolate the missing boundaries automatically, without specifying the region of interest. To this end, one may employ the Euler's Elastica functional as a regularization term in the Chan-Vese model (2.135), replacing the weighted TV term. Doing so, we obtain the following energy functional (we assume ν = 0 here):

$$\displaystyle\begin{array}{rcl} E_{CV E}(\phi,d_{1},d_{2})& =& \lambda _{1}\int _{\varOmega }\vert f - d_{1}\vert ^{2}H(\phi )\,dx +\lambda _{ 2}\int _{\varOmega }\vert f - d_{2}\vert ^{2}[1 - H(\phi )]\,dx \\ & & +\int _{\varOmega }\left [a + b\left (\nabla \cdot \dfrac{\nabla \phi } {\vert \nabla \phi \vert }\right )^{2}\right ]\vert \nabla H(\phi )\vert \,dx {}\end{array}$$
(2.140)

where λ 1, λ 2, a and b are positive parameters. If ϕ is the signed distance level set function, it can be proved that the last term in (2.140) equals the Euler's Elastica energy of the segmentation curve. This regularization was originally proposed and used in the celebrated paper on segmentation with depth by Nitzberg, Mumford, and Shiota (ref. [135]). It has also been used in [31] (resp., [183, 184]) for the solution of the inpainting (resp., illusory contour) problem. In [146], linear programming was used to minimize (after space discretization) curvature dependent functionals, the functional defined by (2.140) being one of those considered in that article.

Fig. 2.8

Broken letters “UCLA” and their connected segmentation.

Remark 34.

Observe that, since (formally at least, but this can be justified using a well-chosen regularization of the Heaviside function, such as \(\xi \rightarrow \frac{1} {2}\left [1 + \frac{\xi } {\sqrt{\epsilon ^{2 } +\xi ^{2}}} \right ]\)) \(\dfrac{\nabla \phi } {\vert \nabla \phi \vert } = \dfrac{\nabla H(\phi )} {\vert \nabla H(\phi )\vert }\), only the sign H(ϕ) of the function ϕ is needed when solving the segmentation problem via the functional in (2.140). This property suggests, as done in [117], using a binary level set representation via the introduction of the function v = H(ϕ). Such a change of function was also used in [29] for finding the global minimizer associated with the Chan-Vese model. More general binary level set representations with global minimization techniques have been developed (see, e.g., [7, 173, 174, 175, 177]) using the relationships existing between graph cuts, binary labeling and continuous max-flow problems. Since \(\nabla \cdot \dfrac{\nabla \phi } {\vert \nabla \phi \vert } = \nabla \cdot \dfrac{\nabla H(\phi )} {\vert \nabla H(\phi )\vert }\), one can rewrite the functional in (2.140) as

$$\displaystyle\begin{array}{rcl} E_{CV E}(\mathit{v},d_{1},d_{2})& =& \lambda _{1}\int _{\varOmega }\vert f - d_{1}\vert ^{2}\mathit{v}\,dx +\lambda _{ 2}\int _{\varOmega }\vert f - d_{2}\vert ^{2}[1 -\mathit{v}]\,dx \\ & & +\int _{\varOmega }\left [a + b\left (\nabla \cdot \dfrac{\nabla \mathit{v}} {\vert \nabla \mathit{v}\vert }\right )^{2}\right ]\vert \nabla \mathit{v}\vert \,dx {}\end{array}$$
(2.141)

with the values taken by v being either 0 or 1. Strictly speaking, the mean curvature of the graph makes sense for “smooth” functions only; to fix this issue, one relaxes the above binary restriction by replacing it by 0 ≤ v ≤ 1, a less constraining condition indeed.

6.2 Fast Numerical Algorithms for Variational Image Processing Models Based on Operator-Splitting and Augmented Lagrangian Methods (ALM)

In this section, we will present fast numerical algorithms, based on operator-splitting and ALM, for the numerical treatment of variational image processing models.

6.2.1 Parallel Splitting Schemes for the ROF Model

The first model that we are going to consider is the ROF model discussed in Section 6.1.2. The formal Euler-Lagrange equation associated with the minimization of the (strictly convex) functional in (2.131) reads as

$$\displaystyle{ -\eta \nabla \cdot \dfrac{\nabla u} {\vert \nabla u\vert } + u = f\ \mbox{ in}\ \varOmega,\ \dfrac{\nabla u} {\vert \nabla u\vert }\cdot \mathbf{n} = 0\ \mbox{ on}\ \partial \varOmega, }$$
(2.142)

with n the outward unit vector normal at ∂ Ω. In order to solve the nonlinear non-smooth elliptic equation (2.142) we associate with it an initial value problem and look for steady-state solutions. We consider thus

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \dfrac{\partial u} {\partial t} -\eta \nabla \cdot \dfrac{\nabla u} {\vert \nabla u\vert } + u = f\ \mbox{ in}\ \varOmega \times (0,+\infty ),\quad \\ \dfrac{\nabla u} {\vert \nabla u\vert }\cdot \mathbf{n} = 0\ \mbox{ on}\ \partial \varOmega \times (0,+\infty ), \quad \\ u(0) = u_{0}, \quad \end{array} \right. }$$
(2.143)

an obvious choice for u 0 in (2.143) being u 0 = f. Actually to overcome the difficulty associated with the non-smoothness of the elliptic operator in (2.142) and (2.143), we consider the following regularized variant of (2.143):

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \dfrac{\partial u} {\partial t} -\eta \nabla \cdot \dfrac{\nabla u} {\sqrt{\vert \nabla u\vert ^{2 } +\epsilon ^{2}}} + u = f\ \mbox{ in}\ \varOmega \times (0,+\infty ),\quad \\ \dfrac{\partial u} {\partial n} = 0\ \mbox{ on}\ \partial \varOmega \times (0,+\infty ), \quad \\ u(0) = u_{0}, \quad \end{array} \right. }$$
(2.144)

with ε a small positive number. The simplest time-stepping scheme we can think of to capture the steady state solution of (2.144) is clearly the forward Euler scheme. Let △t ( > 0) be a time-discretization step; applied to the solution of (2.144), the forward Euler scheme produces the following algorithm:

$$\displaystyle{ u^{0} = u_{ 0}. }$$
(2.145)

For n ≥ 0, u n → u n+1 via

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \dfrac{u^{n+1} - u^{n}} {\bigtriangleup t} -\eta \nabla \cdot \dfrac{\nabla u^{n}} {\sqrt{\vert \nabla u^{n } \vert ^{2 } +\epsilon ^{2}}} + u^{n} = f\ \mbox{ in}\ \varOmega,\quad \\ \dfrac{\partial u^{n+1}} {\partial n} = 0\ \mbox{ on}\ \partial \varOmega. \quad \end{array} \right. }$$
(2.146)
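For illustration, a minimal NumPy sketch of one forward Euler sweep (2.146) on a uniform grid (with grid spacing h = 1, the Neumann condition imposed by edge replication, and zero flux outside the domain in the divergence); the function name, step counts and parameter values below are our own illustrative assumptions, not from the text:

```python
import numpy as np

def tv_flow_forward_euler(f, eta=0.1, eps=1e-2, dt=1e-3, n_steps=200):
    """Explicit (forward Euler) time stepping for the regularized ROF
    flow (2.144), finite differences on a uniform grid (h = 1)."""
    u = f.astype(float).copy()
    for _ in range(n_steps):
        # forward differences; replicated edges give zero normal derivative
        up = np.pad(u, 1, mode="edge")
        ux = up[1:-1, 2:] - up[1:-1, 1:-1]
        uy = up[2:, 1:-1] - up[1:-1, 1:-1]
        mag = np.sqrt(ux**2 + uy**2 + eps**2)
        px, py = ux / mag, uy / mag
        # backward divergence (adjoint of the forward gradient),
        # with zero flux outside the domain
        pxp = np.pad(px, 1, mode="constant")
        pyp = np.pad(py, 1, mode="constant")
        div = (pxp[1:-1, 1:-1] - pxp[1:-1, :-2]) \
            + (pyp[1:-1, 1:-1] - pyp[:-2, 1:-1])
        u = u + dt * (eta * div - u + f)
    return u
```

As the text notes next, the explicit nature of this scheme forces small values of dt; the splitting scheme below removes that restriction.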

In practice, scheme (2.145)–(2.146) is applied to a discrete variant of (2.144) obtained by finite difference or finite element space discretization. Scheme (2.145)–(2.146) being explicit, and the condition number of the operator in (2.146) being rather large, its conditional stability requires small time steps, leading to a slow convergence to a steady state solution. Suppose that Ω is the rectangle (0, a) × (0, b); in order to improve the speed of convergence to a steady state solution, we are going to apply to the solution of (2.144) the parallelizable operator-splitting scheme discussed in Section 2.8, taking advantage of the following decomposition of the operator in (2.144):

$$\displaystyle{ -\eta \nabla \cdot \dfrac{\nabla u} {\sqrt{\vert \nabla u\vert ^{2 } +\epsilon ^{2}}} + u - f = A_{1}(u) + A_{2}(u), }$$
(2.147)

with

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} A_{1}(u) = -\eta \dfrac{\partial } {\partial x_{1}}\left ( \dfrac{ \dfrac{\partial u} {\partial x_{1}}} {\sqrt{\vert \nabla u\vert ^{2 } +\epsilon ^{2}}}\right ) + \dfrac{1} {2}(u - f),\quad \\ A_{2}(u) = -\eta \dfrac{\partial } {\partial x_{2}}\left ( \dfrac{ \dfrac{\partial u} {\partial x_{2}}} {\sqrt{\vert \nabla u\vert ^{2 } +\epsilon ^{2}}}\right ) + \dfrac{1} {2}(u - f).\quad \end{array} \right. }$$
(2.148)

Combining the scheme we mentioned just above with a semi-explicit time discretization of the nonlinear terms we obtain

$$\displaystyle{ u^{0} = u_{ 0}. }$$
(2.149)

For n ≥ 0, u n → { u n+1∕4, u n+2∕4} → u n+1 via

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \quad &\dfrac{u^{n+1/4} - u^{n}} {2\bigtriangleup t} -\eta \dfrac{\partial } {\partial x_{1}}\left ( \dfrac{\dfrac{\partial u^{n+1/4}} {\partial x_{1}} } {\sqrt{\vert \nabla u^{n } \vert ^{2 } +\epsilon ^{2}}}\right ) + \dfrac{u^{n+1/4}} {2} = \dfrac{f} {2} \ \mbox{ in}\ \varOmega, \\ \quad &\dfrac{\partial u^{n+1/4}} {\partial x_{1}} (0,x_{2}) = \dfrac{\partial u^{n+1/4}} {\partial x_{1}} (a,x_{2}) = 0\ \forall x_{2} \in (0,b), \end{array} \right. }$$
(150.1)
$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \quad &\dfrac{u^{n+2/4} - u^{n}} {2\bigtriangleup t} -\eta \dfrac{\partial } {\partial x_{2}}\left ( \dfrac{\dfrac{\partial u^{n+2/4}} {\partial x_{2}} } {\sqrt{\vert \nabla u^{n } \vert ^{2 } +\epsilon ^{2}}}\right ) + \dfrac{u^{n+2/4}} {2} = \dfrac{f} {2} \ \mbox{ in}\ \varOmega, \\ \quad &\dfrac{\partial u^{n+2/4}} {\partial x_{2}} (x_{1},0) = \dfrac{\partial u^{n+2/4}} {\partial x_{2}} (x_{1},b) = 0\ \forall x_{1} \in (0,a), \end{array} \right. }$$
(150.2)
$$\displaystyle{ u^{n+1} = \dfrac{1} {2}(u^{n+1/4} + u^{n+2/4}). }$$
(2.151)

Scheme (2.149)–(2.151) can accommodate large time steps, implying a fast convergence to steady state solutions. It also preserves the symmetry of the images. Moreover, since in most applications Ω is a rectangle with the image pixels uniformly distributed on it, it makes sense to use a finite difference discretization on a uniform Cartesian grid to approximate (150.1) and (150.2). For Dirichlet or Neumann boundary conditions, the finite difference discretization of (150.1) and (150.2) produces two families of uncoupled, easily solvable tri-diagonal linear systems (the good parallelization properties of the above scheme are quite obvious). The above operator-splitting scheme can be generalized to the numerical treatment of other variational models (such as the Chan-Vese model, and models involving derivatives of order higher than one, as shown in, e.g., [89]). A closely related scheme is discussed in [167].
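One step of scheme (2.149)–(2.151) can be sketched as follows, assuming a uniform grid with h = 1 and using the classical Thomas algorithm for the tri-diagonal systems; the function names and parameter values are our own illustrative choices:

```python
import numpy as np

def thomas(sub, diag, sup, rhs):
    """Thomas algorithm for a tridiagonal system: sub/sup of length
    n-1, diag and rhs of length n."""
    n = len(rhs)
    d, r = diag.astype(float).copy(), rhs.astype(float).copy()
    for i in range(1, n):
        w = sub[i - 1] / d[i - 1]
        d[i] -= w * sup[i - 1]
        r[i] -= w * r[i - 1]
    x = np.empty(n)
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - sup[i] * x[i + 1]) / d[i]
    return x

def sweep(u, f, coef, eta, dt):
    """Semi-explicit fractional step (150.1): one independent
    tridiagonal solve per grid row, homogeneous Neumann row ends."""
    n = u.shape[1]
    alpha = 1.0 / (2 * dt) + 0.5
    out = np.empty_like(u, dtype=float)
    for j in range(u.shape[0]):
        cf = eta * 0.5 * (coef[j, :-1] + coef[j, 1:])  # face-centered
        diag = np.full(n, alpha)
        diag[:-1] += cf
        diag[1:] += cf
        out[j] = thomas(-cf, diag, -cf, f[j] / 2 + u[j] / (2 * dt))
    return out

def split_step(u, f, eta=0.1, eps=1e-2, dt=5.0):
    """One step of (2.149)-(2.151): two one-dimensional solves
    (parallelizable), then the averaging (2.151)."""
    gx = np.gradient(u, axis=1)
    gy = np.gradient(u, axis=0)
    coef = 1.0 / np.sqrt(gx**2 + gy**2 + eps**2)  # frozen at level n
    u1 = sweep(u, f, coef, eta, dt)               # x1-direction, (150.1)
    u2 = sweep(u.T, f.T, coef.T, eta, dt).T       # x2-direction, (150.2)
    return 0.5 * (u1 + u2)
```

Note that the two sweeps are independent of each other, which is what makes the scheme naturally parallel.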

6.2.2 A Split-Bregman Method and Related ADMM Algorithm for the ROF Model

In ref. [86], T. Goldstein and S. Osher proposed and tested a fast converging iterative method for the ROF model: this algorithm, of the split-Bregman type, is certainly one of the fastest numerical methods for the ROF model. It was quickly realized (see [156, 170, 172]) that the Bregman algorithm discussed in [86] is equivalent to an ADMM one. Here, we will explain the ideas in an informal way, using the continuous model, whose formalism is much simpler. As stated in Remark 29, to make our discussion more rigorous mathematically, the functional spaces for which the continuous model makes sense have to be specified (here, they are of the bounded variation type). This difficulty is one of the reasons explaining why some authors (as in [170]) consider discrete models directly.

Let us denote ∇u by p; then, it is easy to see that (from (2.131)) the ROF model is equivalent to the following linearly constrained minimization problem:

$$\displaystyle{ \{u,\mathbf{p}\} = \mbox{ arg min}_{\mathop{\nabla \mathit{v}-\mathbf{q}=0}\limits^{\{\mathit{v},\mathbf{q}\}}}\left [\eta \int _{\varOmega }\vert \mathbf{q}\vert \,dx + \dfrac{1} {2}\int _{\varOmega }\vert \mathit{v} - f\vert ^{2}\,dx\right ]. }$$
(2.152)

Clearly, problem (2.152) belongs to the family of variational problems discussed in Section 3.2, the associated augmented Lagrangian being defined (with r > 0) by (see, e.g., [72] (Chapter 4)):

$$\displaystyle\begin{array}{rcl} \mathcal{L}_{rof}(\mathit{v},\mathbf{q};\boldsymbol{\mu })& =& \eta \int _{\varOmega }\vert \mathbf{q}\vert \,dx + \dfrac{1} {2}\int _{\varOmega }\vert \mathit{v} - f\vert ^{2}\,dx \\ & & +\dfrac{r} {2}\int _{\varOmega }\vert \nabla \mathit{v} -\mathbf{q}\vert ^{2}\,dx +\int _{\varOmega }\boldsymbol{\mu } \cdot (\nabla \mathit{v} -\mathbf{q})\,dx.{}\end{array}$$
(2.153)

Above, \(u:\varOmega \rightarrow \mathbb{R}\) denotes the restored image we are looking for, p = ∇u, and \(\boldsymbol{\mu }\) is a Lagrange multiplier. Due to the strict convexity of the second term, the discrete analogues of the minimization problem (2.152) have a unique solution. Applying algorithm ALG2 of Section 3.2.2 to the solution of (2.152), we obtain the following

Algorithm 6.1: An augmented Lagrangian method for the ROF model

0. Initialization: \(\boldsymbol{\lambda }^{0} = \mathbf{0}\), u 0 = f.

For k = 0, 1, ⋯ , until convergence:

1. Compute p k+1 from

$$\displaystyle{ \mathbf{p}^{k+1} = \mbox{ arg min}_{\mathbf{ q}\in (L^{2}(\varOmega ))^{2}}\mathcal{L}_{rof}(u^{k},\mathbf{q};\boldsymbol{\lambda }^{k}). }$$
(2.154)

2. Compute u k+1 from

$$\displaystyle{ u^{k+1} = \mbox{ arg min}_{\mathit{ v}\in H^{1}(\varOmega )}\mathcal{L}_{rof}(\mathit{v},\mathbf{p}^{k+1};\boldsymbol{\lambda }^{k}). }$$
(2.155)

3. Update \(\boldsymbol{\lambda }^{k}\) by

$$\displaystyle{ \boldsymbol{\lambda }^{k+1} = \boldsymbol{\lambda }^{k} + r(\nabla u^{k+1} -\mathbf{p}^{k+1}). }$$
(2.156)

It was observed in [156, 170] that this augmented Lagrangian algorithm is equivalent to the split-Bregman algorithm discussed in [86]. This equivalence is also explained in [172] for compressive sensing models. The minimization sub-problems (2.154) have closed form solutions which can be computed point-wise; solving them is thus quite easy. The minimization sub-problems (2.155) (in fact their discrete analogues) reduce to discrete well-posed linear Neumann problems; the associated matrix being symmetric, positive definite and sparse, these discrete elliptic problems can be solved by a large variety of direct and iterative methods (among them: sparse Cholesky, multi-level, Gauss-Seidel, conjugate gradient, FFT, etc.; see [170, 172] for more details). The convergence of algorithm (2.154)–(2.156) is discussed in [170].
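To make the point-wise structure of (2.154) concrete: completing the square in (2.153) shows that the minimizer is obtained point-wise by soft shrinkage of w = ∇u^k + λ^k∕r with threshold η∕r. A minimal NumPy sketch (the function name is ours):

```python
import numpy as np

def shrink(w, t):
    """Pointwise minimizer of t*|q| + (1/2)*|q - w|^2 over vectors q:
    soft shrinkage of the vector field w (component axis first, e.g.
    shape (2, m, n)) by the threshold t."""
    norm = np.sqrt((w ** 2).sum(axis=0, keepdims=True))
    scale = np.maximum(norm - t, 0.0) / np.maximum(norm, 1e-12)
    return scale * w
```

In Algorithm 6.1, one would call this with w = ∇u^k + λ^k∕r and t = η∕r, so that each ADMM iteration costs one shrinkage plus one sparse linear solve for (2.155).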

Remark 35.

As described above, Algorithm 6.1 is largely formal since it operates in the space W = [H 1(Ω) × (L 2(Ω))2] × (L 2(Ω))2, although the solution u of problem (2.131) may not have enough regularity to belong to H 1(Ω). However, Algorithm 6.1 makes sense for the discrete analogues of problem (2.131) and space W obtained by finite difference or finite element approximation; for finite element approximations in particular, the formalisms of Algorithm 6.1 and of its discrete counterparts are nearly identical. The above observation applies to most of the ADMM algorithms described below (see Remark 36, for example).

Remark 36.

As shown in, e.g., [109] (for image denoising applications), Algorithm 6.1 is easy to modify in order to handle those situations where the functional Ω  | ∇v | dx is replaced by \(\frac{1} {s}\int _{\varOmega }\vert \nabla \mathbf{v}\vert ^{s}\,dx\) with 0 < s < 1, or by other non-convex functionals of | ∇v | ; once discretized, these modifications of Algorithm 6.1 perform very well as shown in [109].

Remark 37.

It is easy to extend algorithm (2.154)–(2.156) to the solution of the min-cut problem (2.139), since the additional constraint encountered in this last problem, namely 0 ≤ μ(x) ≤ 1, a.e. in Ω, is (relatively) easy to treat; actually, this extension has been done in [22] (see also [4, 27], and Section 6.2.3, below, for a number of related new approaches). As shown in [170] (page 320), and [142, 143], it is also easy to extend algorithm (2.154)–(2.156) to those situations where one uses vector-TV regularization in order to process vector-valued data.

6.2.3 An Augmented Lagrangian Method for the Continuous Min-Cut and Max-Flow Problems

The continuous max-flow problems (2.138) and (2.139) are dual to each other in the sense that if the function λ is solution of (2.139), it is a Lagrange multiplier for the flow conservation equation in (2.138). We can solve both problems simultaneously using a primal-dual method à la ALG2 relying on the following augmented Lagrangian functional

$$\displaystyle{ \mathcal{L}_{c}(q_{s},q_{t},\mathbf{v};\mu ) = -\int _{\varOmega }q_{s}\,dx -\int _{\varOmega }\mu (\nabla \cdot \mathbf{v} - q_{s} + q_{t})\,dx + \dfrac{r} {2}\int _{\varOmega }(\nabla \cdot \mathbf{v} - q_{s} + q_{t})^{2}\,dx, }$$
(2.157)

where in (2.157): r > 0, and q s , q t and v verify, a.e. in Ω, q s  ≤ f 1, q t  ≤ f 2, | v | ≤ g; here \(\vert \mathbf{v}\vert = \sqrt{\mathit{v} _{1 }^{2 } + \mathit{v} _{2 }^{2}}\), \(\forall \mathbf{v} =\{ \mathit{v}_{1},\mathit{v}_{2}\}\). Applying ALG2 to the computation of the saddle-points of \(\mathcal{L}_{c}\) over the set (Q 1 × Q 2 × K) × L 2(Ω), where Q 1 = { q | q ∈ L 2(Ω), q ≤ f 1}, Q 2 = { q | q ∈ L 2(Ω), q ≤ f 2}, and K = { v | v ∈ (L 2(Ω))2, ∇⋅ v ∈ L 2(Ω), v ⋅ n = 0 on Γ,   | v | ≤ g}, we obtain

Algorithm 6.2: An augmented Lagrangian method for the continuous max-flow problem

0. Initialization: λ 0 = 0, p s 0 = f 1, p t 0 = f 2.

For k = 0, 1, ⋯ , until convergence:

1. Compute u k+1 from

$$\displaystyle{ \mathbf{u}^{k+1} = \mbox{ arg min}_{\mathbf{ v}\in K}\mathcal{L}_{c}(p_{s}^{k},p_{ t}^{k},\mathbf{v};\lambda ^{k}). }$$
(2.158)

2. Compute {p s k+1, p t k+1} from

$$\displaystyle{ \{p_{s}^{k+1},p_{ t}^{k+1}\} = \mbox{ arg min}_{\{ q_{s},q_{t}\}\in Q_{1}\times Q_{2}}\mathcal{L}_{c}(q_{s},q_{t},\mathbf{u}^{k+1};\lambda ^{k}). }$$
(2.159)

3. Update λ k by

$$\displaystyle{ \lambda ^{k+1} =\lambda ^{k} - r(\nabla \cdot \mathbf{u}^{k+1} - p_{ s}^{k+1} + p_{ t}^{k+1}). }$$
(2.160)

We observe that (2.159) has a closed form solution (and that p s k+1 and p t k+1 can be computed point-wise, independently of each other). The sub-problem (2.158) is a simple variant of the dual of the ROF problem (that is, of the unconstrained minimization of the functional in (2.131)). We just need to solve this problem approximately; indeed, in our implementations we used just a few steps of a descent algorithm, followed by a projection onto the convex set {v | v ∈ (L 2(Ω))2, | v | ≤ g} (see [173, 175] for more details on the solution of these sub-problems). The discrete variant of algorithm (2.158)–(2.160) that we implemented (via a finite difference discretization) proved to be very robust with respect to the initialization and to the value of the augmentation parameter r; it is also computationally very efficient.
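The projection onto {v | v ∈ (L 2(Ω))2, | v | ≤ g} mentioned above is itself point-wise; a sketch (the function name is ours; g may be a scalar or an array of the spatial shape):

```python
import numpy as np

def project_ball(v, g):
    """Pointwise projection of a vector field v = (v1, v2) onto the
    set {|v|_2 <= g}: rescale every vector whose norm exceeds g."""
    norm = np.sqrt(v[0] ** 2 + v[1] ** 2)
    factor = np.minimum(1.0, g / np.maximum(norm, 1e-12))
    return v * factor
```

Each descent step on (2.158) is followed by one such projection, which keeps the iterates feasible for the flow capacity constraint.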

Remark 38.

As written, algorithm (2.158)–(2.160) is applicable only to the solution of two-phase flow problems. There are several ways to generalize this algorithm to models involving more than two phases, as shown in, e.g., [5, 6, 7, 8, 173, 177]. Also, we would like to emphasize the fact that the discrete analogue of algorithm (2.158)–(2.160) we implemented has good convergence properties no matter which of the following two norms we used for the flow constraint in (2.138) (see Remark 33 for the dual formulation associated with (2.162)):

$$\displaystyle{ \vert \mathbf{v}\vert _{2} = \sqrt{\mathit{v} _{1 }^{2 } + \mathit{v} _{2 }^{2}} }$$
(2.161)

or

$$\displaystyle{ \vert \mathbf{v}\vert _{1} = \vert \mathit{v}_{1}\vert + \vert \mathit{v}_{2}\vert. }$$
(2.162)

If one uses the meshes classically used in digital imaging, traditional graph cut methods (like those discussed in [19]) can be used to solve the discrete min-cut and max-flow problems, provided one uses the norm defined by (2.162) to bound v. On the other hand, the above-mentioned graph cut methods cannot handle the norm defined by (2.161). It is also known that the solutions of the discrete min-cut and max-flow problems suffer from the metrication error if the norm in (2.162) is used. Compared to graph cut methods, ADMM algorithms such as (2.158)–(2.160) can handle both norms without particular difficulty. Moreover, these augmented Lagrangian algorithms are easy to parallelize and to implement on GPUs; they also use much less memory than traditional graph cut methods, which makes them suitable for high dimensional and large size images or data.

6.2.4 A Split-Bregman Method and Related ADMM Algorithm for a Second Order Total Variation Model

Here, we will discuss the application of ALG2 (that is ADMM) to the solution of those image processing problems associated with the functional defined by (2.132) (also known as the TV 2 model). The presentation follows [42, 73, 170], where the main ideas are: (i) transfer the burden of nonlinearity from the Hessian

$$\displaystyle{ \mathbf{D}^{2}u\left (= \left (\begin{array}{cc} \partial ^{2}u/\partial x_{ 1}^{2} & \partial ^{2}u/\partial x_{ 1}\partial x_{2} \\ \partial ^{2}u/\partial x_{1}\partial x_{2} & \partial ^{2}u/\partial x_{2}^{2} \end{array} \right )\right ) }$$

to an additional unknown p, via the relation

$$\displaystyle{ \mathbf{p} = \mathbf{D}^{2}u, }$$
(2.163)

and (ii) use a well-chosen augmented Lagrangian functional, associated with the linear relation (2.163). A similar idea has been (successfully) used in [42] for the augmented Lagrangian solution of the Dirichlet problem for the Monge-Ampère equation det D 2 u = f (see also Chapter 8 of this book).

Back to the TV 2 model (2.132), let us recall that the related minimization problem reads as

$$\displaystyle{ u = \mbox{ arg min}_{\mathit{v}\in V }\left [\dfrac{1} {2}\int _{\varOmega }\vert \mathit{v} - f\vert ^{2}\,dx +\eta \int _{\varOmega }\vert \mathbf{D}^{2}\mathit{v}\vert \,dx\right ], }$$
(2.164)

with V = { v | v ∈ L 2(Ω), D 2 v ∈ (L 1(Ω))2×2} and \(\vert \mathbf{M}\vert = \sqrt{\sum _{1\leq i,j\leq 2 } m_{ij }^{2}}\) denoting the Frobenius norm of the matrix M. Proceeding as in Section 3.2.2, we observe the equivalence between (2.164) and

$$\displaystyle{ \{u,\mathbf{D}^{2}u\} = \mbox{ arg min}_{\{\mathit{ v},\mathbf{q}\}\in W}\left [\dfrac{1} {2}\int _{\varOmega }\vert \mathit{v} - f\vert ^{2}\,dx +\eta \int _{\varOmega }\vert \mathbf{q}\vert \,dx\right ], }$$
(2.165)

where

$$\displaystyle{ W =\{\{ \mathit{v},\mathbf{q}\}\vert \mathit{v} \in V,\mathbf{q} \in (L^{1}(\varOmega ))^{d\times d},\mathbf{D}^{2}\mathit{v} -\mathbf{q} = \mathbf{0}\}, }$$

an observation leading us to introduce the following augmented Lagrangian functional

$$\displaystyle\begin{array}{rcl} \mathcal{L}_{TV 2}(\mathit{v},\mathbf{q};\boldsymbol{\mu })& =& \dfrac{1} {2}\int _{\varOmega }\vert \mathit{v} - f\vert ^{2}\,dx +\eta \int _{\varOmega }\vert \mathbf{q}\vert \,dx \\ & & +\dfrac{r} {2}\int _{\varOmega }\vert \mathbf{D}^{2}\mathit{v} -\mathbf{q}\vert ^{2}\,dx +\int _{\varOmega }\boldsymbol{\mu }: (\mathbf{D}^{2}\mathit{v} -\mathbf{q})\,dx,{}\end{array}$$
(2.166)

where, in (2.166), r > 0, and (with obvious notation) \(\mathbf{S}: \mathbf{T} =\sum _{1\leq i,j\leq 2}s_{ij}t_{ij}\). Applying the methods discussed in Section 3.2.2 to the solution of the minimization problem (2.164), we obtain the following

Algorithm 6.3: An augmented Lagrangian method for the TV 2 model

0. Initialization: \(\boldsymbol{\lambda }^{0} = \mathbf{0}\), u 0 = f.

For k = 0, 1, ⋯ , until convergence:

1. Compute p k+1 from

$$\displaystyle{ \mathbf{p}^{k+1} = \mbox{ arg min}_{\mathbf{ q}\in (L^{2}(\varOmega ))^{2\times 2}}\mathcal{L}_{TV 2}(u^{k},\mathbf{q};\boldsymbol{\lambda }^{k}). }$$
(2.167)

2. Compute u k+1 from

$$\displaystyle{ u^{k+1} = \mbox{ arg min}_{\mathit{ v}\in H^{2}(\varOmega )}\mathcal{L}_{TV 2}(\mathit{v},\mathbf{p}^{k+1};\lambda ^{k}). }$$
(2.168)

3. Update \(\boldsymbol{\lambda }^{k}\) by

$$\displaystyle{ \boldsymbol{\lambda }^{k+1} = \boldsymbol{\lambda }^{k} + r(\mathbf{D}^{2}u^{k+1} -\mathbf{p}^{k+1}). }$$
(2.169)

As with Algorithm 6.1 (that is, (2.154)–(2.156)), the sub-problems (2.167) have closed-form solutions which can be computed point-wise. On the other hand, the sub-problems (2.168) reduce to linear bi-harmonic problems for the elliptic operator \(I + r\nabla ^{4}\); if properly discretized on a uniform grid (typically by finite differences), the discrete analogues of these bi-harmonic problems can be solved by FFT or by iterative methods (see [170] (page 324) for details).
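As an illustration of the FFT option, the sketch below solves the discrete bi-harmonic problem arising from (2.168) by diagonalizing the 5-point Laplacian; periodic boundary conditions are assumed here purely for simplicity (the Neumann case discussed in the text would use discrete cosine transforms instead), and the function name is ours:

```python
import numpy as np

def solve_biharmonic_fft(rhs, r):
    """Solve (I + r * Lap_h^2) u = rhs, Lap_h the 5-point discrete
    Laplacian under periodic boundary conditions, by FFT
    diagonalization."""
    n1, n2 = rhs.shape
    k1 = np.fft.fftfreq(n1) * n1
    k2 = np.fft.fftfreq(n2) * n2
    # eigenvalues of Lap_h: 2*cos(2*pi*k/n) - 2 in each direction
    lam = ((2 * np.cos(2 * np.pi * k1 / n1) - 2)[:, None]
           + (2 * np.cos(2 * np.pi * k2 / n2) - 2)[None, :])
    u_hat = np.fft.fft2(rhs) / (1.0 + r * lam ** 2)
    return np.real(np.fft.ifft2(u_hat))
```

Since 1 + r·λ² ≥ 1 for every mode, the division is always well defined; the whole solve costs two FFTs per iteration.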

Remark 39.

Obviously, Remark 35 applies also to Algorithm 6.3, with H 2(Ω) playing here the role of H 1(Ω) there.

6.2.5 An Augmented Lagrangian Method for the Euler’s Elastica Model

The energy functional defined by (2.133), namely

$$\displaystyle{ E(\mathit{v}) = \dfrac{1} {2}\int _{\varOmega }\vert f -\mathit{v}\vert ^{2}\,dx +\int _{\varOmega }\left [a + b\left \vert \nabla \cdot \dfrac{\nabla \mathit{v}} {\vert \nabla \mathit{v}\vert }\right \vert ^{2}\right ]\vert \nabla \mathit{v}\vert \,dx, }$$

makes no sense on the subset of Ω where ∇v vanishes. Following an approach very common in visco-plasticity (see, e.g., [66, 83]), one makes things more rigorous by defining (following [154]) the energy functional by

$$\displaystyle{ E(\mathit{v},\mathbf{m}) = \dfrac{1} {2}\int _{\varOmega }\vert f -\mathit{v}\vert ^{2}\,dx +\int _{\varOmega }\left [a + b\left \vert \nabla \cdot \mathbf{m}\right \vert ^{2}\right ]\vert \nabla \mathit{v}\vert \,dx }$$
(2.170)

the functions v and m in (2.170) verifying

$$\displaystyle{ \vert \nabla \mathit{v}\vert = \mathbf{m} \cdot \nabla \mathit{v},\ \vert \mathbf{m}\vert \leq 1. }$$
(2.171)

The related minimization problem reads as

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \{u,\mathbf{n}\} = \mbox{ arg min}_{\{\mathit{v},\mathbf{m}\}}E(\mathit{v},\mathbf{m}), \quad \\ \mbox{ with $\{\mathit{v},\mathbf{m}\}$ verifying (2.171).}\quad \end{array} \right. }$$
(2.172)

Introducing the vector-valued function p verifying p = ∇u, we clearly have equivalence between (2.172) and

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \{u,\mathbf{p},\mathbf{n}\} = \mbox{ arg min}_{\{\mathit{v},\mathbf{q},\mathbf{m}\}}\left [\dfrac{1} {2}\int _{\varOmega }\vert f -\mathit{v}\vert ^{2}\,dx +\int _{\varOmega }\left [a + b\left \vert \nabla \cdot \mathbf{m}\right \vert ^{2}\right ]\vert \mathbf{q}\vert \,dx\right ],\quad \\ \mbox{ with $\{\mathit{v},\mathbf{q},\mathbf{m}\}$ verifying $\mathbf{q} = \nabla \mathit{v}$, $\vert \mathbf{q}\vert = \mathbf{m} \cdot \mathbf{q}$, $\vert \mathbf{m}\vert \leq 1$}. \quad \end{array} \right.\quad }$$
(2.173)

Following [154], we associate with the minimization problem (2.173) the following augmented Lagrangian functional

$$\displaystyle\begin{array}{rcl} \mathcal{L}_{elas}(\mathit{v},\mathbf{q},\mathbf{m};\boldsymbol{\mu }_{1},\mu _{2})& =& \dfrac{1} {2}\int _{\varOmega }\vert \mathit{v} - f\vert ^{2}\,dx +\int _{\varOmega }\left [a + b\left \vert \nabla \cdot \mathbf{m}\right \vert ^{2}\right ]\vert \mathbf{q}\vert \,dx \\ & & +\dfrac{r_{1}} {2} \int _{\varOmega }\vert \nabla \mathit{v} -\mathbf{q}\vert ^{2}\,dx + r_{ 2}\int _{\varOmega }(\vert \mathbf{q}\vert -\mathbf{q} \cdot \mathbf{m})\,dx \\ & & +\int _{\varOmega }\boldsymbol{\mu }_{1} \cdot (\nabla \mathit{v} -\mathbf{q})\,dx +\int _{\varOmega }\mu _{2}(\vert \mathbf{q}\vert -\mathbf{q} \cdot \mathbf{m})\,dx,{}\end{array}$$
(2.174)

with r 1 and r 2 both positive. Suppose that in (2.174) the vector-valued function m belongs to M, the closed convex set of (L 2(Ω))2 defined by

$$\displaystyle{ \mathbf{M} =\{ \mathbf{m}\vert \mathbf{m} \in (L^{2}(\varOmega ))^{2},\vert \mathbf{m}(x)\vert \leq 1,\mbox{ a.e. in}\ \varOmega \}; }$$

we then have | q | − q ⋅ m ≥ 0, implying (since | q | − q ⋅ m =  | | q | − q ⋅ m | ) that the variant of ALG2 described just below will force the condition | q | − q ⋅ m = 0 in the sense of L 1(Ω). This variant of ALG2 reads as follows when applied to the solution of problem (2.172) (below, H(Ω; div) = { v | v ∈ (L 2(Ω))2, ∇⋅ v ∈ L 2(Ω)}):

Algorithm 6.4: An augmented Lagrangian method for the Euler’s Elastica model

0. Initialization: \(\boldsymbol{\lambda }_{1}^{0} = \mathbf{0}\), λ 2 0 = 0, u 0 = f, n 0 = 0.

For k = 0, 1, ⋯ , until convergence:

1. Compute p k+1 from

$$\displaystyle{ \mathbf{p}^{k+1} = \mbox{ arg min}_{\mathbf{ q}\in (L^{2}(\varOmega ))^{2}}\mathcal{L}_{elas}(u^{k},\mathbf{q},\mathbf{n}^{k};\boldsymbol{\lambda }_{ 1}^{k},\lambda _{ 2}^{k}). }$$
(2.175)

2. Compute n k+1 from

$$\displaystyle{ \mathbf{n}^{k+1} = \mbox{ arg min}_{ \mathbf{m}\in H(\varOmega;\mbox{ div})\cap \mathbf{M}}\mathcal{L}_{elas}(u^{k},\mathbf{p}^{k+1},\mathbf{m};\boldsymbol{\lambda }_{ 1}^{k},\lambda _{ 2}^{k}). }$$
(2.176)

3. Compute u k+1 from

$$\displaystyle{ u^{k+1} = \mbox{ arg min}_{\mathit{ v}\in H^{1}(\varOmega )}\mathcal{L}_{elas}(\mathit{v},\mathbf{p}^{k+1},\mathbf{n}^{k+1};\boldsymbol{\lambda }_{ 1}^{k},\lambda _{ 2}^{k}). }$$
(2.177)

4. Update \(\{\boldsymbol{\lambda }_{1}^{k},\lambda _{2}^{k}\}\) by

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \boldsymbol{\lambda }_{1}^{k+1} = \boldsymbol{\lambda }_{1}^{k} + r_{1}(\nabla u^{k+1} -\mathbf{p}^{k+1}), \quad \\ \lambda _{2}^{k+1} =\lambda _{ 2}^{k} + r_{2}(\vert \mathbf{p}^{k+1}\vert -\mathbf{p}^{k+1} \cdot \mathbf{n}^{k+1}).\quad \end{array} \right. }$$
(2.178)

Below, we will give some details and comments about the solution of the sub-problems encountered when applying algorithm (2.175)–(2.178); implementation issues will be also addressed. Further information is provided in [154].

  • The minimization sub-problem (2.175) has a unique closed-form solution which can be computed point-wise.

  • The minimization sub-problem (2.176) is equivalent to the following elliptic variational inequality

    $$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \mathbf{n}^{k+1} \in H(\varOmega;\mbox{ div}) \cap \mathbf{M}, \quad \\ b\int _{\varOmega }\vert \mathbf{p}^{k+1}\vert \ \nabla \cdot \mathbf{n}^{k+1}\ \nabla \cdot (\mathbf{m} -\mathbf{n}^{k+1})\,dx \geq \int _{\varOmega }(r_{ 2} +\lambda _{ 2}^{k})\mathbf{p}^{k+1} \cdot (\mathbf{m} -\mathbf{n}^{k+1})\,dx,\quad \\ \forall \mathbf{m} \in H(\varOmega;\mbox{ div}) \cap \mathbf{M}. \quad \end{array} \right. }$$
    (2.179)

    We observe that the bilinear functional in the left-hand side of (2.179) is symmetric and positive semi-definite (indeed, \(\int _{\varOmega }\vert \mathbf{p}^{k+1}\vert (\nabla \cdot \mathbf{m})^{2}\,dx = 0\) if \(\mathbf{m} = \nabla \times \boldsymbol{z}\)). However, the boundedness of M implies that the variational problem (2.176), (2.179) has solutions. For the solution of the discrete analogues of the above problem, we advocate using a few iterations of the relaxation methods with projection discussed in, e.g., [66, 76] (other methods are possible, as shown in [154]).

  • The minimization sub-problem (2.177) has a unique solution characterized by

    $$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} u^{k+1} \in H^{1}(\varOmega ), \quad \\ \int _{\varOmega }u^{k+1}\mathit{v}\,dx + r_{ 1}\int _{\varOmega }\nabla u^{k+1} \cdot \nabla \mathit{v}\,dx =\int _{\varOmega }f\mathit{v}\,dx +\int _{\varOmega }(r_{ 1}\mathbf{p}^{k+1} -\boldsymbol{\lambda }_{ 1}^{k}) \cdot \nabla \mathit{v}\,dx,\quad \\ \forall \mathit{v} \in H^{1}(\varOmega ). \quad \end{array} \right. }$$
    (2.180)

    Actually, (2.180) is nothing but a variational formulation of the following Neumann problem

    $$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} u^{k+1} - r_{1}\nabla ^{2}u^{k+1} = f -\nabla \cdot (r_{1}\mathbf{p}^{k+1} -\boldsymbol{\lambda }_{1}^{k})\ \ \mbox{ in}\ \varOmega,\quad \\ r_{1}\dfrac{\partial u^{k+1}} {\partial \nu } = (r_{1}\mathbf{p}^{k+1} -\boldsymbol{\lambda }_{1}^{k}) \cdot \boldsymbol{\nu }\ \ \mbox{ on}\ \partial \varOmega, \quad \end{array} \right. }$$
    (2.181)

    where, in (2.181), \(\boldsymbol{\nu }\) denotes the outward unit vector normal at the boundary ∂ Ω of Ω. The numerical solution of linear elliptic problems such as (2.181) is routine nowadays; after an appropriate space discretization it can be achieved by a large variety of direct and iterative methods (sparse Cholesky, FFT, relaxation, multilevel, etc.).

  • Since the energy functional associated with the Euler's Elastica is non-convex (see (2.170)), the augmentation parameters r 1 and r 2 have to be chosen large enough to guarantee the convergence of algorithm (2.175)–(2.178). Actually, the tuning of r 1 and r 2 is a delicate issue in itself, and we can expect (as shown, for example, in [133], for a problem involving three augmentation parameters) the optimal values of these parameters to be of different orders of magnitude with respect to the space discretization step h.

  • Another solution method for the Euler’s Elastica is discussed in [21]. It relies on tractable convex relaxation in higher dimension.
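As a rough stand-in for the relaxation-with-projection methods advocated above for (2.176), one can picture a few projected-gradient iterations on the discrete analogue of (2.179); derivatives are approximated with np.gradient (h = 1), the axis conventions, step size and function names are our own assumptions, and this is a sketch rather than the reference method of [66, 76]:

```python
import numpy as np

def project_M(m):
    """Pointwise projection onto M = { m : |m(x)| <= 1 }."""
    norm = np.sqrt(m[0] ** 2 + m[1] ** 2)
    return m / np.maximum(norm, 1.0)

def m_step(m, p, lam2, b, r2, tau=1e-3, iters=5):
    """A few projected-gradient iterations for the functional
    (b/2)*int |p|(div m)^2 dx - int (r2 + lam2) p.m dx, whose
    variational inequality on M is (2.179)."""
    w = b * np.sqrt(p[0] ** 2 + p[1] ** 2)   # weight b*|p^{k+1}|
    for _ in range(iters):
        div_m = np.gradient(m[0], axis=1) + np.gradient(m[1], axis=0)
        s = w * div_m
        # gradient: -grad(w * div m) - (r2 + lam2) * p
        g0 = -np.gradient(s, axis=1) - (r2 + lam2) * p[0]
        g1 = -np.gradient(s, axis=0) - (r2 + lam2) * p[1]
        m = project_M(m - tau * np.stack([g0, g1]))
    return m
```

The projection after every step keeps the iterates in M, which is what guarantees | q | − q ⋅ m ≥ 0 in the multiplier update (2.178).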

Remark 40.

In [154], an alternative method for the solution of the Euler’s Elastica problem (2.172) is also considered. It relies on the equivalence between (2.172) and

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \{u,\mathbf{p},\mathbf{n}_{1},\mathbf{n}_{2}\} = \mbox{ arg min}_{\{\mathit{v},\mathbf{q},\mathbf{m}_{1},\mathbf{m}_{2}\}}\left [\dfrac{1} {2}\int _{\varOmega }\vert f -\mathit{v}\vert ^{2}\,dx +\int _{\varOmega }\left [a + b\left \vert \nabla \cdot \mathbf{m}_{ 1}\right \vert ^{2}\right ]\vert \mathbf{q}\vert \,dx\right ], \quad \\ \mbox{ with $\{\mathit{v},\mathbf{q},\mathbf{m}_{1},\mathbf{m}_{2}\}$ verifying $\mathbf{q} = \nabla \mathit{v}$, $\mathbf{m}_{1} = \mathbf{m}_{2}$, $\vert \mathbf{q}\vert = \mathbf{m}_{2} \cdot \mathbf{q}$, $\vert \mathbf{m}_{2}\vert \leq 1$}.\quad \end{array} \right. }$$
(2.182)

An augmented Lagrangian associated with (2.182) is clearly the one defined by

$$\displaystyle\begin{array}{rcl} & & \mathcal{L}_{elas}(\mathit{v},\mathbf{q},\mathbf{m}_{1},\mathbf{m}_{2};\boldsymbol{\mu }_{1},\mu _{2},\boldsymbol{\mu }_{3}) = \dfrac{1} {2}\int _{\varOmega }\vert \mathit{v} - f\vert ^{2}\,dx +\int _{\varOmega }\left [a + b\left \vert \nabla \cdot \mathbf{m}_{ 1}\right \vert ^{2}\right ]\vert \mathbf{q}\vert \,dx \\ & & +\dfrac{r_{1}} {2} \int _{\varOmega }\vert \nabla \mathit{v} -\mathbf{q}\vert ^{2}\,dx + r_{ 2}\int _{\varOmega }(\vert \mathbf{q}\vert -\mathbf{q} \cdot \mathbf{m}_{2})\,dx + r_{3}\int _{\varOmega }\vert \mathbf{m}_{1} -\mathbf{m}_{2}\vert ^{2}\,dx \\ & & +\int _{\varOmega }\boldsymbol{\mu }_{1} \cdot (\nabla \mathit{v} -\mathbf{q})\,dx +\int _{\varOmega }\mu _{2}(\vert \mathbf{q}\vert -\mathbf{q} \cdot \mathbf{m}_{2})\,dx +\int _{\varOmega }\boldsymbol{\mu }_{3} \cdot (\mathbf{m}_{1} -\mathbf{m}_{2})\,dx, {}\end{array}$$
(2.183)

with r 1, r 2 and r 3 all positive. From (2.183), one can easily derive a variant of algorithm (2.175)–(2.178) for the solution of the minimization problem (2.172); such an algorithm is discussed in [154]. Actually, the above reference also discusses the solution, by a similar methodology, of the variant of problem (2.172) obtained by replacing the fidelity term \(\dfrac{1} {2}\int _{\varOmega }\vert f -\mathit{v}\vert ^{2}\,dx\) by \(\dfrac{1} {s}\int _{\varOmega }\vert f -\mathit{v}\vert ^{s}\,dx\) with s ∈ [1, +∞). Typically, one takes s = 1 (resp., s = 2) for salt-and-pepper noise (resp., Gaussian noise). Further details and generalizations are given in [154].

Remark 41.

As shown in [186], the methodology we employed to solve the minimization problem (2.172) can be easily modified in order to handle the Chan-Vese Elastica model.

6.2.6 An Augmented Lagrangian Method for the L 1-Mean Curvature Model

In this section, we follow closely the presentation used in [185]. The rationale for the L 1-mean curvature model has been given in Section 6.1.5, leading one to consider the following minimization problem

$$\displaystyle{ u = \mbox{ arg min}_{\mathit{v}\in V }\left [\dfrac{1} {2}\int _{\varOmega }\vert \mathit{v} - f\vert ^{2}\,dx +\eta \int _{\varOmega }\left \vert \nabla \cdot \dfrac{\nabla \mathit{v}} {\sqrt{1 + \vert \nabla \mathit{v} \vert ^{2}}}\right \vert \,dx\right ], }$$
(2.184)

where \(\nabla =\{ \partial /\partial x_{i}\}_{i=1}^{2}\). In (2.184), the choice of V is a delicate theoretical issue; indeed, the safest way to proceed would be to take V = H 2(Ω) in (2.184), and to replace min by inf (a (kind of) justification for this approach can be found in [133]). Let us observe (as in [185], where a slightly different notation is used) that

$$\displaystyle{ \nabla \cdot \dfrac{\nabla \mathit{v}} {\sqrt{1 + \vert \nabla \mathit{v} \vert ^{2}}} = \nabla _{3} \cdot \dfrac{\{\nabla \mathit{v},-1\}} {\vert \{\nabla \mathit{v},-1\}\vert }, }$$
(2.185)

where, in (2.185), \(\nabla _{3} =\{ \partial /\partial x_{1},\partial /\partial x_{2},0\}\), and where {∇v, −1} denotes the 3-dimensional vector-valued function \(\{\partial \mathit{v}/\partial x_{1},\partial \mathit{v}/\partial x_{2},-1\}\). In order to simplify (in some sense) the nonlinear structure of the minimization problem (2.184), we associate new unknown functions with its solution u, namely p, n and ψ, verifying

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \mathbf{p} =\{ \nabla u,-1\}, \quad \\ \mathbf{n} = \dfrac{\mathbf{p}} {\vert \mathbf{p}\vert },\ \mbox{ or equivalently here }\ \vert \mathbf{p}\vert -\mathbf{p} \cdot \mathbf{n} = 0,\ \vert \mathbf{n}\vert \leq 1,\quad \\ \psi = \nabla _{3} \cdot \mathbf{n}. \quad \end{array} \right. }$$
(2.186)

From (2.185) and (2.186), there is clearly equivalence between (2.184) and

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \{u,\psi,\mathbf{p},\mathbf{n}\} = \mbox{ arg min}_{\{\mathit{v},\varphi,\mathbf{q},\mathbf{m}\}}\left [\dfrac{1} {2}\int _{\varOmega }\vert \mathit{v} - f\vert ^{2}\,dx +\eta \int _{\varOmega }\vert \varphi \vert \,dx\right ], \quad \\ \mbox{ with $\{\mathit{v},\varphi,\mathbf{q},\mathbf{m}\}$ verifying}\ \mathbf{q} =\{ \nabla \mathit{v},-1\},\vert \mathbf{q}\vert -\mathbf{q} \cdot \mathbf{m} = 0,\vert \mathbf{m}\vert \leq 1,\nabla _{3} \cdot \mathbf{m} =\varphi.\quad \end{array} \right. }$$
(2.187)

In order to solve the minimization problem (2.184), taking advantage of its equivalence with (2.187), we introduce the following augmented Lagrangian functional

$$\displaystyle\begin{array}{rcl} & & \mathcal{L}_{MC}(\mathit{v},\varphi,\mathbf{q},\mathbf{z},\mathbf{m};\mu _{1},\boldsymbol{\mu }_{2},\mu _{3},\boldsymbol{\mu }_{4}) = \dfrac{1} {2}\int _{\varOmega }\vert \mathit{v} - f\vert ^{2}\,dx +\eta \int _{\varOmega }\vert \varphi \vert \,dx \\ & & \phantom{\mathcal{L}_{MC}} + \dfrac{r_{1}} {2} \int _{\varOmega }(\vert \mathbf{q}\vert -\mathbf{q} \cdot \mathbf{z})\,dx +\int _{\varOmega }\mu _{1}(\vert \mathbf{q}\vert -\mathbf{q} \cdot \mathbf{z})\,dx \\ & & \phantom{\mathcal{L}_{MC}} + \dfrac{r_{2}} {2} \int _{\varOmega }\vert \{\nabla \mathit{v},-1\} -\mathbf{q}\vert ^{2}\,dx +\int _{\varOmega }\boldsymbol{\mu }_{ 2} \cdot (\{\nabla \mathit{v},-1\} -\mathbf{q})\,dx \\ & & \phantom{\mathcal{L}_{MC}} + \dfrac{r_{3}} {2} \int _{\varOmega }\left \vert \varphi -\left (\dfrac{\partial m_{1}} {\partial x_{1}} + \dfrac{\partial m_{2}} {\partial x_{2}} \right )\right \vert ^{2}\,dx +\int _{\varOmega }\mu _{ 3}\left (\varphi -\left (\dfrac{\partial m_{1}} {\partial x_{1}} + \dfrac{\partial m_{2}} {\partial x_{2}} \right )\right )\,dx \\ & & \phantom{\mathcal{L}_{MC}} + \dfrac{r_{4}} {2} \int _{\varOmega }\vert \mathbf{z} -\mathbf{m}\vert ^{2}\,dx +\int _{\varOmega }\boldsymbol{\mu }_{ 4} \cdot (\mathbf{z} -\mathbf{m})\,dx. {}\end{array}$$
(2.188)

The additional vector-valued function z has been introduced in order to decouple ∇3 ⋅ m from the nonlinear relations verified by m in (2.187). Following [185], and taking (2.187) and (2.188) into account, we advocate the following algorithm for the solution of problem (2.184):

Algorithm 6.5: An augmented Lagrangian method for the L 1 -mean curvature model

0. Initialization: \(\lambda _{1}^{0} = 0\), \(\boldsymbol{\lambda }_{2}^{0} = \mathbf{0}\), \(\lambda _{3}^{0} = 0\), \(\boldsymbol{\lambda }_{4}^{0} = \mathbf{0}\), \(u^{0} = f\), \(\mathbf{p}^{0} =\{ \nabla u^{0},-1\}\), \(\mathbf{n}^{0} = \mathbf{y}^{0} = \dfrac{\mathbf{p}^{0}} {\vert \mathbf{p}^{0}\vert }\), \(\psi ^{0} = \nabla _{3} \cdot \mathbf{n}^{0}\).

For k = 0, 1, ⋯ , until convergence:

1. Compute u k+1 from

$$\displaystyle{ u^{k+1} = \mbox{ arg min}_{\mathit{ v}\in H^{1}(\varOmega )}\mathcal{L}_{MC}(\mathit{v},\psi ^{k},\mathbf{p}^{k},\mathbf{y}^{k},\mathbf{n}^{k};\lambda _{ 1}^{k},\boldsymbol{\lambda }_{ 2}^{k},\lambda _{ 3}^{k},\boldsymbol{\lambda }_{ 4}^{k}). }$$
(2.189)

2. Compute ψ k+1 from

$$\displaystyle{ \psi ^{k+1} = \mbox{ arg min}_{\varphi \in L^{2}(\varOmega )}\mathcal{L}_{MC}(u^{k+1},\varphi,\mathbf{p}^{k},\mathbf{y}^{k},\mathbf{n}^{k};\lambda _{ 1}^{k},\boldsymbol{\lambda }_{ 2}^{k},\lambda _{ 3}^{k},\boldsymbol{\lambda }_{ 4}^{k}). }$$
(2.190)

3. Compute p k+1 from

$$\displaystyle{ \mathbf{p}^{k+1} = \mbox{ arg min}_{\mathbf{ q}\in (L^{2}(\varOmega ))^{3}}\mathcal{L}_{MC}(u^{k+1},\psi ^{k+1},\mathbf{q},\mathbf{y}^{k},\mathbf{n}^{k};\lambda _{ 1}^{k},\boldsymbol{\lambda }_{ 2}^{k},\lambda _{ 3}^{k},\boldsymbol{\lambda }_{ 4}^{k}). }$$
(2.191)

4. Compute y k+1 from

$$\displaystyle{ \mathbf{y}^{k+1} = \mbox{ arg min}_{\mathbf{ z}\in \mathbf{Z}}\mathcal{L}_{MC}(u^{k+1},\psi ^{k+1},\mathbf{p}^{k+1},\mathbf{z},\mathbf{n}^{k};\lambda _{ 1}^{k},\boldsymbol{\lambda }_{ 2}^{k},\lambda _{ 3}^{k},\boldsymbol{\lambda }_{ 4}^{k}). }$$
(2.192)

5. Compute n k+1 from

$$\displaystyle{ \mathbf{n}^{k+1} = \mbox{ arg min}_{\mathbf{ m}\in \mathbf{M}}\mathcal{L}_{MC}(u^{k+1},\psi ^{k+1},\mathbf{p}^{k+1},\mathbf{y}^{k+1},\mathbf{m};\lambda _{ 1}^{k},\boldsymbol{\lambda }_{ 2}^{k},\lambda _{ 3}^{k},\boldsymbol{\lambda }_{ 4}^{k}). }$$
(2.193)

6. Update \(\{\lambda _{1}^{k},\boldsymbol{\lambda }_{2}^{k},\lambda _{3}^{k},\boldsymbol{\lambda }_{4}^{k}\}\) by

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \lambda _{1}^{k+1} =\lambda _{ 1}^{k} + r_{1}(\vert \mathbf{p}^{k+1}\vert -\mathbf{p}^{k+1} \cdot \mathbf{y}^{k+1}), \quad \\ \boldsymbol{\lambda }_{2}^{k+1} = \boldsymbol{\lambda }_{2}^{k} + r_{2}(\{\nabla u^{k+1},-1\} -\mathbf{p}^{k+1}), \quad \\ \lambda _{3}^{k+1} =\lambda _{ 3}^{k} + r_{3}\left (\psi ^{k+1} -\left (\dfrac{\partial n_{1}^{k+1}} {\partial x_{1}} + \dfrac{\partial n_{2}^{k+1}} {\partial x_{2}} \right )\right ),\quad \\ \boldsymbol{\lambda }_{4}^{k+1} = \boldsymbol{\lambda }_{4}^{k} + r_{4}(\mathbf{y}^{k+1} -\mathbf{n}^{k+1}). \quad \end{array} \right. }$$
(2.194)

In (2.189)–(2.194), the sets Z and M are defined by

$$\displaystyle{ \mathbf{Z} =\{ \mathbf{z}\vert \mathbf{z} \in (L^{2}(\varOmega ))^{3},\vert \mathbf{z}(x)\vert \leq 1,\ \mbox{ a.e. in}\ \varOmega \}, }$$

and

$$\displaystyle{ \mathbf{M} =\{ \mathbf{m}\vert \mathbf{m} \in (L^{2}(\varOmega ))^{3}, \dfrac{\partial m_{1}} {\partial x_{1}} + \dfrac{\partial m_{2}} {\partial x_{2}} \in L^{2}(\varOmega )\}, }$$

respectively.

We observe that the minimization sub-problems (2.190), (2.191), and (2.192) have closed form solutions which can be computed point-wise. On the other hand, the Euler-Lagrange equations of sub-problems (2.189) and (2.193) are well-posed linear elliptic equations with constant coefficients; fast solvers exist for the discrete analogues of these elliptic problems (see [185] for details and for the results of numerical experiments validating the above algorithm). An important issue is the tuning of the augmentation parameters r 1, r 2, r 3, and r 4; the comments we made in Section 6.2.5, concerning the adjustment of r 1 and r 2 in algorithm (2.176)–(2.178), still apply here.
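To make the “closed form, computed point-wise” remark concrete, consider sub-problem (2.190): collecting the terms of (2.188) involving φ, it amounts to minimizing, at each grid point, η|φ| + (r 3∕2)|φ − c|² + μ 3 φ with c = ∂m 1∕∂x 1 + ∂m 2∕∂x 2; completing the square yields the classical soft-thresholding (shrinkage) formula. A minimal NumPy sketch (the grid data below are illustrative, not taken from [185]):

```python
import numpy as np

def shrink(t, tau):
    """Soft-thresholding: the pointwise minimizer of tau*|phi| + 0.5*(phi - t)**2."""
    return np.sign(t) * np.maximum(np.abs(t) - tau, 0.0)

def psi_update(div_m, mu3, eta, r3):
    """Pointwise closed-form solution of sub-problem (2.190):
    argmin_phi  eta*|phi| + (r3/2)*(phi - div_m)**2 + mu3*phi,
    where div_m stands for d(m1)/dx1 + d(m2)/dx2 sampled on the grid."""
    return shrink(div_m - mu3 / r3, eta / r3)

# illustrative data on a tiny 2x2 grid
div_m = np.array([[0.5, -2.0], [0.05, 1.0]])
mu3 = np.zeros_like(div_m)
psi = psi_update(div_m, mu3, eta=0.1, r3=1.0)
```

Completing the square shows the objective equals (r 3∕2)(φ − (c − μ 3∕r 3))² + η|φ| up to a constant, hence the shrink formula; sub-problems (2.191) and (2.192) are handled by similar pointwise computations.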

Remark 42.

Another augmented Lagrangian based solution method for the L 1-mean curvature problem (2.184) is discussed and numerically tested in [133]. The related ADMM algorithm involves only three Lagrange multipliers and three augmentation parameters. Moreover, the various vector-valued functions encountered in the approach discussed in [133] map Ω into I​​R 2 (instead of I​​R 3, as is the case for algorithm (2.189)–(2.194)).

7 Further Comments and Complements

There is much more to say about operator-splitting and ADMM algorithms; fortunately, many of the issues and topics we left aside, or said very little about, are developed in the other chapters of this book. There are, however, some issues on which we would like to briefly comment to conclude this chapter, namely:

  1. (i)

    The convergence of operator-splitting methods and ADMM algorithms, when applied to the solution of problems involving non-monotone operators and/or non-convex functionals.

  2. (ii)

    The choice of the augmentation parameters and their dynamical adjustment when applying ADMM algorithms.

  3. (iii)

    The derivation of operator-splitting schemes of high (or higher) orders of accuracy.

  4. (iv)

    Trying to understand why the Douglas-Rachford scheme is more robust than the Peaceman-Rachford one, using simple model problems to clarify this issue.

  5. (v)

    Very few problems have generated as many operator-splitting based solution methods as the Navier-Stokes equations modeling viscous fluid flows. For this reason, providing the reader with a significant number of related references is a must in a book like this one. These references will conclude this chapter.

Concerning the first issue, to the best of our knowledge, there is no general theory concerning the convergence of operator-splitting methods and ADMM algorithms when the problem under consideration involves at least one non-monotone operator and/or a non-convex functional. Actually, one can find in the literature convergence results for some problems lacking monotonicity and/or convexity, but, most often, the proofs of these results are very specific to the problem under consideration, and therefore not easy to generalize to other situations. However, some recent results obtained by R. Luke [96, 115], W. Yin [166], and their collaborators suggest that a fairly general theory is not out of reach. Still, we think that there will always be situations where one will not be able to prove the convergence of operator-splitting methods and ADMM algorithms. This is not surprising, since these methods and algorithms have been quite successful at solving problems for which the existence of solutions has not been proved.

The second issue, concerning the choice and the dynamical adaptation of the augmentation parameters, is another complicated one, particularly for those non-convex or non-monotone situations involving more than one such parameter. Indeed, numerical experiments have shown that the optimal values of these parameters may differ by several orders of magnitude (as shown in, e.g., [80] and [133]) and, owing to the possible existence of multiple solutions, that bifurcations can take place depending on the values of these parameters (and on the algorithm initialization). However, for particular problems, heuristics have been found that significantly improve the speed of convergence of these ADMM algorithms (see, e.g., [46]).
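One widely used heuristic of this kind is residual balancing, popular in the ADMM literature: increase an augmentation parameter when the primal residual dominates the dual one, and decrease it in the opposite case. A minimal sketch (the threshold and scaling values below are illustrative defaults, not the specific heuristic of [46]):

```python
def adjust_rho(rho, primal_res, dual_res, ratio=10.0, scale=2.0):
    """Residual-balancing update of an ADMM augmentation parameter.
    Increase rho when the primal residual dominates the dual residual,
    decrease it in the opposite case, and leave it unchanged otherwise."""
    if primal_res > ratio * dual_res:
        return rho * scale
    if dual_res > ratio * primal_res:
        return rho / scale
    return rho
```

Such a rule would typically be applied once per ADMM iteration, to each augmentation parameter separately; note that changing rho also rescales the corresponding (scaled) multiplier in most formulations.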

In order to address the high (or higher) orders of accuracy issue (our third issue), we return to Section 2.3 of this chapter (the one dedicated to the Strang symmetrized operator-splitting scheme) and consider the following initial value problem

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \dfrac{dX} {dt} + (A + B)X = 0\ \mbox{ on}\ (0,T),\quad \\ X(0) = X_{0}, \quad \end{array} \right. }$$
(2.195)

where A and B are linear operators independent of t. When applied to the solution of the initial value problem (2.195), the Strang symmetrized scheme (2.7)–(2.10) can be written in the following more compact form

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \quad &X^{0} = X_{ 0}, \\ \quad &X^{n+1} = e^{-A\bigtriangleup t/2}e^{-B\bigtriangleup t}e^{-A\bigtriangleup t/2}X^{n},\ \forall n \geq 0.\end{array} \right. }$$
(2.196)

The relation

$$\displaystyle{ e^{-(A+B)\bigtriangleup t} - e^{-A\bigtriangleup t/2}e^{-B\bigtriangleup t}e^{-A\bigtriangleup t/2} = O(\bigtriangleup t^{3}), }$$

shows that scheme (2.196) is second order accurate (and exact if AB = BA). For those situations requiring an order of accuracy higher than two, several options do exist, the best known being:

  1. (a)

    The 4th order Strang-Richardson scheme discussed in [48, 49, 50], where it is applied (among other problems) to the numerical solution of real-valued or vector-valued reaction-diffusion equations such as

    $$\displaystyle{ \dfrac{\partial \mathbf{u}} {\partial t} -\mathbf{M}\nabla ^{2}\mathbf{u} + \mathbf{F}(\mathbf{u}) = \mathbf{0}, }$$

    where u(x, t) ∈ I​​R d, ∇2 denotes the Laplace operator, M is a d × d symmetric positive definite matrix, and F is a smooth mapping from I​​R d into I​​R d.

  2. (b)

    The exponential operator-splitting schemes. Actually, the Lie and Strang splitting schemes belong to this family of time discretization methods, whose origin (concerning schemes of order higher than two) is not easy to trace back, early significant publications being [150, 151] (see also the references therein, those in [161], and Google Scholar). Arbitrarily high accuracy can be obtained with these methods, the price to pay being their reduced stability (compared to the Strang scheme, for example).
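Before detailing these options, note that the O(△t 3) relation displayed above (i.e., the second-order accuracy of the Strang scheme (2.196)) is easy to verify numerically on a pair of random, non-commuting matrices: halving △t should divide the one-step defect by roughly 2³ = 8. A minimal sketch (the matrices are arbitrary; scipy’s expm provides the matrix exponentials):

```python
import numpy as np
from scipy.linalg import expm

# two arbitrary (hence generically non-commuting) 4x4 matrices
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

def strang_defect(dt):
    """One-step error of the Strang scheme (2.196) against the exact propagator."""
    exact = expm(-(A + B) * dt)
    approx = expm(-A * dt / 2) @ expm(-B * dt) @ expm(-A * dt / 2)
    return np.linalg.norm(exact - approx)

# local error O(dt^3): halving dt divides the defect by about 8
order = np.log2(strang_defect(2e-2) / strang_defect(1e-2))
```

If A and B are replaced by commuting matrices (e.g., B a polynomial in A), the defect drops to round-off level, consistent with the exactness remark above.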

The best way to introduce the Strang-Richardson scheme is to start, one more time, from the simple initial value problem (2.195). Applied to the solution of (2.195), the Strang-Richardson scheme reads as

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \quad &X^{0} = X_{0}, \\ \quad &X^{n+1} = \dfrac{1} {3}\left [4e^{-A\bigtriangleup t/4}e^{-B\bigtriangleup t/2}e^{-A\bigtriangleup t/2}e^{-B\bigtriangleup t/2}e^{-A\bigtriangleup t/4}\right. \\ \quad &\phantom{X^{n+1} = \dfrac{1} {3}}\left.-e^{-A\bigtriangleup t/2}e^{-B\bigtriangleup t}e^{-A\bigtriangleup t/2}\right ]X^{n},\ \forall n \geq 0.\end{array} \right. }$$
(2.197)

A more practical, equivalent formulation of the symmetrized scheme (2.197) can be found in Chapter 6 of [70]; it avoids the use of matrix exponentials and generalizes easily to nonlinear problems (it requires the solution of eight sub-initial value problems per time step). Scheme (2.197) is fourth order accurate, but not as stable as the original Strang scheme (2.196). Also, its application to decompositions involving more than two operators becomes, to say the least, a bit complicated (higher order methods of the same type are discussed in [85]).
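The Richardson structure of (2.197) is easier to see by writing S(τ) = e^{−Aτ∕2} e^{−Bτ} e^{−Aτ∕2}: the five-exponential product in (2.197) is exactly S(△t∕2)S(△t∕2), so one step of the scheme is (4S(△t∕2)² − S(△t))∕3, the classical extrapolation canceling the O(△t 3) term of the Strang expansion. A quick numerical order check on random matrices (our own sketch):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
A = 0.5 * rng.standard_normal((4, 4))
B = 0.5 * rng.standard_normal((4, 4))

def S(dt):
    """One Strang step (2.196) for dX/dt + (A + B)X = 0."""
    return expm(-A * dt / 2) @ expm(-B * dt) @ expm(-A * dt / 2)

def richardson_step(dt):
    """One step of the Strang-Richardson scheme (2.197),
    written as the Richardson extrapolation (4*S(dt/2)^2 - S(dt)) / 3."""
    half = S(dt / 2)
    return (4 * half @ half - S(dt)) / 3

def defect(dt):
    return np.linalg.norm(expm(-(A + B) * dt) - richardson_step(dt))

# local error O(dt^5): halving dt divides the defect by about 2**5 = 32
order = np.log2(defect(2e-1) / defect(1e-1))
```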

In a similar fashion, we consider again the initial value problem (2.195) to introduce exponential splitting methods. Applied to the solution of (2.195) the typical exponential operator-splitting scheme reads as follows:

$$\displaystyle{ \left \{\begin{array}{@{}l@{\quad }l@{}} \quad &X^{0} = X_{ 0}, \\ \quad &X^{n+1} = \left (\prod \limits _{j=1}^{J}e^{-b_{j}B\bigtriangleup t}e^{-a_{j}A\bigtriangleup t}\right )X^{n},\ \forall n \geq 0,\end{array} \right. }$$
(2.198)

where a j , b j  ∈ I​​R, for 1 ≤ j ≤ J. The Strang symmetrized scheme (2.196) is a particular case of (2.198) (corresponding to J = 2, b 1 = 0, a 1 = 1∕2, b 2 = 1, a 2 = 1∕2). By an appropriate choice of J and of the coefficients a j and b j , scheme (2.198) can be made of order higher than two (as shown in, e.g., [16]), the price to pay being that some of the coefficients a j , b j are negative, making the scheme inappropriate for those situations where some of the operators are dissipative. On the other hand, these higher order schemes produce spectacular results when applied to reversible systems, such as those associated with some linear and nonlinear Schrödinger operators, as shown in, e.g., [51, 161]. Their generalization to those (fairly common) situations involving more than two operators is rather complicated, although theoretically doable.
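As an illustration of the negative-coefficient phenomenon, the classical fourth-order “triple-jump” composition (one instance of the family (2.198); our own sketch, not a construction taken from [16]) concatenates Strang steps of sizes w 1△t, w 0△t, w 1△t with w 1 = 1∕(2 − 2^{1∕3}) and w 0 = 1 − 2w 1 < 0; the backward substep w 0△t is precisely what rules such schemes out when one of the operators is dissipative:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
A = 0.5 * rng.standard_normal((4, 4))
B = 0.5 * rng.standard_normal((4, 4))

def S(dt):
    """One Strang step for dX/dt + (A + B)X = 0."""
    return expm(-A * dt / 2) @ expm(-B * dt) @ expm(-A * dt / 2)

w1 = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
w0 = 1.0 - 2.0 * w1        # negative: a backward-in-time substep

def triple_jump(dt):
    """Fourth-order composition S(w1*dt) S(w0*dt) S(w1*dt)."""
    return S(w1 * dt) @ S(w0 * dt) @ S(w1 * dt)

def defect(dt):
    return np.linalg.norm(expm(-(A + B) * dt) - triple_jump(dt))

# local error O(dt^5), as for the Strang-Richardson scheme
order = np.log2(defect(2e-1) / defect(1e-1))
```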

Concerning the fourth issue, the Peaceman-Rachford and Douglas-Rachford schemes have been briefly discussed in Sections 2.4 and 2.5, respectively. In order to have a better idea of their accuracy and stability properties, we will consider the particular situation where, in problem (2.14), ϕ 0 ∈ IR d, T = +∞, and where A 1 (resp., A 2) is given by A 1 = α A (resp., A 2 = β A), A being a real symmetric positive definite d × d matrix, and α, β verifying 0 ≤ α, β ≤ 1 and α + β = 1. The exact solution of the associated problem (2.14) reads as

$$\displaystyle{ \phi (t) = e^{-At}\phi _{ 0},\ \forall t \geq 0, }$$

which implies (by projection on an orthonormal basis of eigenvectors of matrix A, and with obvious notation)

$$\displaystyle{ \phi _{i}(t) = e^{-\lambda _{i}t}\phi _{ 0i},\ \forall t \geq 0,\ \forall i = 1,\ldots,d, }$$
(2.199)

where \(0 <\lambda _{1} \leq \cdots \leq \lambda _{i} \leq \cdots \leq \lambda _{d}\), the λ i ’s being the eigenvalues of matrix A. Applying the Peaceman-Rachford scheme (2.15) to the particular problem (2.14) defined above, we obtain the following discrete analogue of (2.199):

$$\displaystyle{ \phi _{i}^{n} = \left (R_{ 1}(\lambda _{i}\bigtriangleup t)\right )^{n}\phi _{ 0i},\ \forall n \geq 0,\ \forall i = 1,\ldots,d, }$$
(2.200)

R 1 being the rational function defined by

$$\displaystyle{ R_{1}(\xi ) = \dfrac{\left (1 - \dfrac{\alpha } {2}\xi \right )\left (1 - \dfrac{\beta } {2}\xi \right )} {\left (1 + \dfrac{\alpha } {2}\xi \right )\left (1 + \dfrac{\beta } {2}\xi \right )}. }$$
(2.201)

Since | R 1(ξ) |  < 1, \(\forall \xi > 0\), the Peaceman-Rachford scheme (2.15) is unconditionally stable in the particular case considered here. However, the property \(\lim \limits _{\xi \rightarrow +\infty }R_{1}(\xi ) = 1\) shows that the above scheme is not stiff A-stable, making it a poor choice for capturing steady state solutions or for simulating fast transient phenomena. Actually, the stability drawback we just mentioned is not specific to the particular case we are considering, but seems to hold in general for scheme (2.15). Incidentally, the relation

$$\displaystyle{ R_{1}(\xi ) - e^{-\xi } = O(\xi ^{3})\ \mbox{ in the neighborhood of }\ \xi = 0 }$$

implies that in the particular case under consideration (where A 1 and A 2 commute) scheme (2.15) is second order accurate. Applying now the Douglas-Rachford scheme (2.17) to the same particular case of problem (2.14), we obtain

$$\displaystyle{ \phi ^{n+1} = (I +\alpha \bigtriangleup tA)^{-1}(I +\beta \bigtriangleup tA)^{-1}(I +\alpha \beta (\bigtriangleup t)^{2}A^{2})\phi ^{n},\ \forall n \geq 0, }$$

which implies

$$\displaystyle{ \phi ^{n} = (I +\alpha \bigtriangleup tA)^{-n}(I +\beta \bigtriangleup tA)^{-n}(I +\alpha \beta (\bigtriangleup t)^{2}A^{2})^{n}\phi _{ 0},\ \forall n \geq 0. }$$
(2.202)

By projection of (2.202) on an orthonormal basis of I​​R d consisting of eigenvectors of A, we obtain the following variant of (2.200):

$$\displaystyle{ \phi _{i}^{n} = \left (R_{ 2}(\lambda _{i}\bigtriangleup t)\right )^{n}\phi _{ 0i},\ \forall n \geq 0,\ \forall i = 1,\ldots,d, }$$
(2.203)

R 2 being the rational function defined by

$$\displaystyle{ R_{2}(\xi ) = \dfrac{1 +\alpha \beta \xi ^{2}} {(1+\alpha \xi )(1+\beta \xi )}. }$$
(2.204)

Since 0 < R 2(ξ) < 1, \(\forall \xi > 0\), the Douglas-Rachford scheme (2.17) is unconditionally stable in the particular case considered here. However, the property \(\lim \limits _{\xi \rightarrow +\infty }R_{2}(\xi ) = 1\) shows that the above scheme is not stiff A-stable, making it a poor choice for capturing steady state solutions or for simulating fast transient phenomena. Actually, the stability drawback we just mentioned is not specific to the particular case we are considering, but seems to hold in general for scheme (2.17). Concerning the accuracy of scheme (2.17), we observe that, in the neighborhood of ξ = 0, we have

$$\displaystyle{ R_{2}(\xi ) = 1 -\xi +\xi ^{2} + O(\xi ^{3}), }$$

which implies, by comparison with \(e^{-\xi } = 1 -\xi +\dfrac{\xi ^{2}} {2} + O(\xi ^{3})\), that scheme (2.17) is no better than first order accurate in the particular case we are considering. Since this particular case is the most favorable one can think of, one expects the Douglas-Rachford scheme (2.17) to be generically first order accurate, a prediction supported by the results of various numerical experiments. It is worth mentioning that, in order to improve the accuracy of the Douglas-Rachford scheme (2.17), J. Douglas and S. Kim introduced in the late 1990s–early 2000s [56] the following variant of the above scheme:

$$\displaystyle{ \phi ^{0} =\phi _{ 0}. }$$
(2.205)

For n ≥ 0, \(\phi ^{n} \rightarrow \hat{\phi }^{n+1} \rightarrow \phi ^{n+1}\) as follows:

Solve

$$\displaystyle{ \dfrac{\hat{\phi }^{n+1} -\phi ^{n}} {\bigtriangleup t} + A_{1}\left (\dfrac{\hat{\phi }^{n+1} +\phi ^{n}} {2},t^{n+1/2}\right ) + A_{ 2}(\phi ^{n},t^{n}) = 0, }$$
(2.206)

and

$$\displaystyle{ \dfrac{\phi ^{n+1} -\phi ^{n}} {\bigtriangleup t} + A_{1}\left (\dfrac{\hat{\phi }^{n+1} +\phi ^{n}} {2},t^{n+1/2}\right ) + A_{ 2}\left (\dfrac{\phi ^{n+1} +\phi ^{n}} {2},t^{n+1/2}\right ) = 0. }$$
(2.207)

The Douglas-Kim scheme (2.205)–(2.207) is clearly inspired by the Crank-Nicolson scheme. Scheme (2.205)–(2.207) is second order accurate if the operators A 1 and A 2 are sufficiently smooth, the price to pay for this accuracy enhancement being a reduction of stability and robustness compared to the original Douglas-Rachford scheme (2.17).
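On the commuting linear test case used above (A 1 = α A, A 2 = β A, here with a scalar A = λ > 0 for simplicity), one step of (2.206)–(2.207) can be written out explicitly and the second-order accuracy checked numerically; one can also verify that, in this case, the amplification factor collapses to the Peaceman-Rachford one, R 1(λ△t). The following is our own sketch, with illustrative values of α, β, and λ:

```python
import numpy as np

def douglas_kim_step(phi, lam, dt, alpha=0.4, beta=0.6):
    """One step of scheme (2.205)-(2.207) for d(phi)/dt + lam*phi = 0,
    with A1 = alpha*lam and A2 = beta*lam (linear, autonomous)."""
    # predictor (2.206): Crank-Nicolson in A1, explicit in A2
    phi_hat = phi * (1 - alpha * lam * dt / 2 - beta * lam * dt) / (1 + alpha * lam * dt / 2)
    # corrector (2.207): A1 frozen at the predictor average, Crank-Nicolson in A2
    rhs = phi - dt * alpha * lam * (phi_hat + phi) / 2 - dt * beta * lam * phi / 2
    return rhs / (1 + beta * lam * dt / 2)

def global_error(dt, lam=1.0, T=1.0):
    """Error at time T against the exact solution exp(-lam*T)."""
    phi = 1.0
    for _ in range(int(round(T / dt))):
        phi = douglas_kim_step(phi, lam, dt)
    return abs(phi - np.exp(-lam * T))

# halving dt should divide the global error by about 4 (second order)
order = np.log2(global_error(1e-2) / global_error(5e-3))
```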

To those wondering how to choose between the Peaceman-Rachford and Douglas-Rachford schemes, we will say that, on the basis of many numerical experiments, the second scheme seems more robust and faster in those situations where one of the operators is non-smooth (multivalued or singular, for example), particularly if one is interested in capturing steady state solutions. Actually, this behavior is consistent with the fact that the rational function R 1 associated with the Peaceman-Rachford scheme (the one defined by (2.201)) may change sign as ξ varies over (0, +∞), unlike the rational function R 2 defined by (2.204) (the one associated with the Douglas-Rachford scheme), which stays positive on that interval. These sign changes suggest a more oscillatory behavior for the associated scheme if fast transients take place, or if one tries to capture steady state solutions starting far away from them.
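These qualitative differences between R 1 and R 2 are easy to confirm numerically: both factors stay below 1 in modulus and tend to 1 at infinity, R 1 changes sign on (0, +∞) while R 2 does not, and the defects R 1(ξ) − e^{−ξ} = O(ξ 3) versus R 2(ξ) − e^{−ξ} = O(ξ 2) recover the second-order/first-order accuracy gap. A sketch with the illustrative choice α = 0.3, β = 0.7:

```python
import numpy as np

alpha, beta = 0.3, 0.7  # illustrative values with alpha + beta = 1

def R1(xi):
    """Peaceman-Rachford amplification factor (2.201)."""
    return ((1 - alpha * xi / 2) * (1 - beta * xi / 2)
            / ((1 + alpha * xi / 2) * (1 + beta * xi / 2)))

def R2(xi):
    """Douglas-Rachford amplification factor (2.204)."""
    return (1 + alpha * beta * xi ** 2) / ((1 + alpha * xi) * (1 + beta * xi))

xi = np.linspace(1e-3, 1e4, 100000)
stable = np.max(np.abs(R1(xi))) < 1 and np.max(R2(xi)) < 1   # unconditional stability
limits = R1(1e8) > 0.999 and R2(1e8) > 0.999                 # but not stiff A-stable
signs = R1(4.0) < 0 < np.min(R2(xi))                          # R1 oscillates, R2 does not
# accuracy: halving xi divides the defect by ~8 for R1 (order 2), ~4 for R2 (order 1)
ratio1 = abs(R1(2e-2) - np.exp(-2e-2)) / abs(R1(1e-2) - np.exp(-1e-2))
ratio2 = abs(R2(2e-3) - np.exp(-2e-3)) / abs(R2(1e-3) - np.exp(-1e-3))
```

For α ≠ β, R 1 vanishes at ξ = 2∕α and ξ = 2∕β and is negative between these two roots, which is where the oscillatory behavior mentioned above originates.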

As a final comment on ADI methods, we have to mention that one of their main contributors (if not the main one), beyond their founders (J. Douglas, H. Rachford, and D. Peaceman), is definitely E. Wachspress: his wonderful book The ADI Model Problem [164] is an invaluable source of information and references on the Peaceman-Rachford and Douglas-Rachford methods, from both the theoretical and practical points of view.

To conclude, let us observe that the Navier-Stokes equations modeling the flow of viscous fluids have been mentioned quite a few times in this chapter (in Section 4 in particular), and in other chapters of this book. There is no doubt that very few partial differential equation problems have motivated such a large number of operator-splitting based solution methods. Focusing on those publications with which we have some familiarity, let us mention: [11, 12, 13, 23, 35, 43, 47, 70, 72, 73, 90, 91, 92, 93, 94, 105, 107, 111, 112, 116, 122, 123, 158, 159, 160] (see also the references therein, Google Scholar, and Chapters 21, 22 and 23 of this book).