1 Introduction

The obstacle problem is a classic motivating example in the mathematical study of variational inequalities and free boundary problems, with broad applications, e.g., in fluid filtration in porous media, constrained heating, elasto-plasticity, optimal control, or financial mathematics [2, 7, 10, 22]. The classical formulation is motivated by the equilibrium position of a membrane (described by the graph of \(u:\varOmega \subset {\mathbb {R}}^n\rightarrow {\mathbb {R}}\)), with fixed values \(f\) imposed on the boundary \(\partial \varOmega \) and constrained to lie (almost everywhere) above an obstacle function, \(\varphi :\varOmega \rightarrow {\mathbb {R}}\):

$$\begin{aligned} \min _{u\in H^1(\varOmega )} \ \int _\varOmega \sqrt{1+|\nabla u|^2} \qquad \text {s.t.} \quad u \ge \varphi \text { a.e. in } \varOmega \quad \text {and} \quad u=f \text { on } \partial \varOmega . \end{aligned}$$
(1.1)

For small deflections, the above integral (1.1) is typically linearized into the Dirichlet integral:

$$\begin{aligned} \min _{{u\in H^1(\varOmega )}} \ \int _\varOmega |\nabla u|^2 \qquad \text {s.t.} \quad u \ge \varphi \text { a.e. in } \varOmega \quad \text {and} \quad u=f \text { on } \partial \varOmega . \end{aligned}$$
(1.2)

Within the Sobolev space \(H^1(\varOmega )\), the admissible functions for either obstacle problem, (1.1) or (1.2), define the closed convex set \(U = U(\varOmega ,\varphi ,f)\):

$$\begin{aligned} U=\lbrace u\in H^1(\varOmega ),\; u \ge \varphi \text { a.e. in } \varOmega ,\; u\vert _{\partial \varOmega } = f\rbrace \end{aligned}$$
(1.3)

Standard arguments from variational analysis show that a unique minimizer of (1.2) exists for sufficiently regular data \((\varOmega ,\varphi ,f)\); see, for example, [2, 10, 22]. The non-coercivity of the nonlinear problem (1.1) can be handled using the methods in [7].

In practice, numerical methods are required to solve the obstacle problem. Lions and Mercier [16] illustrated a splitting algorithm on the obstacle problem. Finite element methods have been used to solve the obstacle problem as a free boundary problem [5, 11, 15]. Iterative methods based on penalization and projected gradients are also well-established approaches [6, 23]. In [12, 13], the obstacle constraint is incorporated into the objective using a Lagrange multiplier. Multigrid and multilevel methods have been employed to speed up computation: obstacle problems formulated as variational inequalities can be solved by the iterative scheme proposed in [14], where the linear system arising at each step is solved by a multigrid algorithm; [28] discretizes the problem with continuous piecewise linear finite elements, turning it into a quadratic programming problem with inequality constraints, which is then solved by multilevel projection (MP).

In [17], the authors propose a modified level set method for the obstacle problem, recasting it as a nonsmooth minimization problem. Two methods are presented for its solution: a proximal bundle method for the general nonsmooth minimization problem, and a gradient method for regularized versions of the problem.

Most recently, in [26], the authors proposed an efficient numerical scheme for solving the (linearized) obstacle problem (and generalizations), based on replacing the obstacle constraint by an \(L^1\)-like penalty added to the variational problem:

$$\begin{aligned} \min _{u\in H^1(\varOmega )} \ \int _\varOmega |\nabla u|^2+\mu (\varphi -u)_+\qquad \text {s.t.}\quad u\vert _{\partial \varOmega } = f, \end{aligned}$$
(1.4)

where \((\cdot )_+=\max (\cdot , 0)\). It was shown that for sufficiently large but finite \(\mu \), the minimizer of the unconstrained problem (1.4) is also the minimizer of the original, constrained problem (1.2).

The functionals and constraints in (1.1), (1.2), and (1.4) are all closed and convex, which makes the problems accessible to convex optimization techniques. There is a large body of algorithms that apply to such convex optimization problems. Recently, primal-dual splitting algorithms have gained particular attention, most notably in the context of TV and \(L^1\)-type problems in imaging; see, e.g., [8, 9, 29, 30]. Zosso and Bustin [31] have applied this theory to the particular case of the Beltrami functional [24], an interesting regularizer in imaging problems that interpolates between the classical \(H^1\) and TV regularizers and generalizes the surface area functional. Zosso and Osting [32] use the method for the efficient minimization of a minimal surface-based graph partitioning problem.

In this paper, we develop a primal-dual hybrid gradient method to efficiently solve the constrained and \(L^1\)-penalty formulations of the non-linear and linear obstacle problems, (1.1) and (1.2). We note that the original constrained problem is algorithmically related to the \(L^1\)-penalty problem for finite \(\mu \), and strictly identical to it for the particular choice \(\mu =\infty \). Our results achieve state-of-the-art precision in much shorter time: the speedup is 1–2 orders of magnitude with respect to the method in [26], and even larger compared to older methods [16, 17, 27]. This is demonstrated for a variety of one- and two-dimensional problems, including the extension to elasto-plastic problems with double obstacles and a potential term.

It is important to note that our fast algorithm is derived from a primal-dual reformulation of the original convex optimization problems in a disciplined way. The primal-dual hybrid gradients method employed here enjoys straightforward convergence guarantees. Moreover, in the particular case of the linear obstacle problem, the dual variable update is particularly simple, and the dual variable can be eliminated altogether, resulting in a single primal variable update scheme that improves over standard explicit gradient descent in multiple ways. The resulting scheme is fast because previous iterates continue to be taken into account, which admits interesting interpretations in terms of the “momentum method” and as a damped wave equation. We illustrate this behavior on a very simple yet striking toy example. Further advantages of our method are the avoidance of matrix/operator inversion (all update steps are explicit) and the fact that the contact/coincidence set (the free boundary) is handled implicitly; no explicit tracking is necessary.

1.1 Outline

The remainder of the paper is organized as follows: Sect. 2 gives a formulation of the discretized obstacle problem and its different flavors. Section 3 describes the proposed primal-dual algorithms. In Sect. 4, we compare the resulting primal-dual algorithms to some more classical ones. Section 5 presents the results of the primal-dual algorithm on a variety of numerical experiments. We conclude in Sect. 6 with a discussion.

2 Discretized Obstacle Problem

In this section, we introduce a discretized obstacle problem and discuss some of its properties. To this end, we introduce the following notation.

Let X denote the set of Cartesian grid points of the discretization of the D-dimensional domain \(\varOmega \subset {\mathbb {R}}^D\).

2.1 Inner Products and Norms

For functions \(f,g:X\rightarrow {\mathbb {R}}\), we define the \(L^2\) inner product

$$\begin{aligned} \langle f, g\rangle _{L^2(X)} := \sum _{x\in X} f(x)g(x) \end{aligned}$$
(2.1)

and the induced \(L^2\)-norm

$$\begin{aligned} \Vert f\Vert _{L^2(X)} := \langle f,f\rangle _{L^2(X)}^{\frac{1}{2}} = \sqrt{\sum _{x\in X}f(x)^2} \end{aligned}$$
(2.2)

Similarly, for vectors \(v,w:X\rightarrow {\mathbb {R}}^D\), we define the \(L^2\) inner product

$$\begin{aligned} \langle v,w\rangle _{L^2(X)^D} := \sum _{x\in X} \sum _{d=1}^D v_d(x)\cdot w_d(x) \end{aligned}$$
(2.3)

and obtain the induced \(L^2\)-norm

$$\begin{aligned} \Vert v\Vert _{L^2(X)^D} := \langle v,v \rangle _{L^2(X)^D}^{\frac{1}{2}} = \sqrt{\sum _{x\in X} \sum _{d=1}^D v_d(x)^2}, \end{aligned}$$
(2.4)

where \(v_d(x)\) denotes the d-th component of the vector at location x. For a vector \(v:X\rightarrow {\mathbb {R}}^D\), the magnitude \(|v|:X \rightarrow {\mathbb {R}}\) is defined as follows:

$$\begin{aligned} |v|(x) := \sqrt{\sum _{d=1}^D v_d(x)^2},\qquad x\in X. \end{aligned}$$
(2.5)

The magnitude operator, \(|\cdot |:L^2(X)^D\rightarrow L^2(X)\), thus maps vector-valued functions to scalar-valued functions. It is easy to see that \(\Vert v\Vert _{L^2(X)^D} = \Vert |v| \Vert _{L^2(X)}\). Finally, for the obstacle penalty, we need the single-sided \(L^1\)-penalty of a function \(f:X\rightarrow {\mathbb {R}}\):

$$\begin{aligned} \Vert f\Vert _+ := \sum _{x\in X} \max (f(x),0). \end{aligned}$$
(2.6)

A corresponding negative-sided penalty can be obtained directly as

$$\begin{aligned} \Vert f\Vert _- := \Vert -f\Vert _+. \end{aligned}$$
(2.7)

It is easy to verify that \(\Vert f\Vert _+, \Vert f\Vert _-\ge 0\), and \(\Vert f\Vert _+ + \Vert f\Vert _- = \Vert f\Vert _1\).

2.2 Finite Difference Operators

For functions \(f:X\rightarrow {\mathbb {R}}\), we denote by \(\nabla _X:L^2(X)\rightarrow L^2(X)^D\) a suitable discretization of the continuous gradient by finite differences, subject to desired boundary constraints. Using the discrete gradient, for vectors \(v:X\rightarrow {\mathbb {R}}^D\) we define the discretized divergence \(\mathrm{div}_X :L^2(X)^D\rightarrow L^2(X)\), such that the two operators satisfy the usual negative adjoint property:

$$\begin{aligned} \langle \nabla _X f, v\rangle _{L^2(X)^D} = -\langle f,\mathrm{div}_X v \rangle _{L^2(X)},\qquad f:X\rightarrow {\mathbb {R}},\ v:X\rightarrow {\mathbb {R}}^D. \end{aligned}$$
(2.8)

For functions \(f:X\rightarrow {\mathbb {R}}\), the composition of divergence and gradient yields the discrete Laplacian operator \(\varDelta _X:L^2(X)\rightarrow L^2(X)\):

$$\begin{aligned} \varDelta _X f := (\mathrm{div}_X \circ \nabla _X) f. \end{aligned}$$
(2.9)

For example, we consider the standard forward differences stencil for the gradient; then the adjoint relation leads to backward differences in the divergence and the resulting discretized Laplacian corresponds to the commonly used \(2D+1\) point centered differences stencil.
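To make these conventions concrete, the following is a minimal 1D sketch (our illustration, not the authors' code) of a forward-difference gradient, the backward-difference divergence induced by the adjoint relation (2.8), and a numerical check of that relation; the boundary handling shown (zero-padded last gradient entry) is one admissible choice.

```matlab
% Forward differences for grad_X, its negative adjoint as div_X, and the
% induced Laplacian (3-point stencil in the interior). Any discretization
% satisfying (2.8) is admissible; this is one concrete choice.
N = 8;
gradX = @(u) [u(2:end)-u(1:end-1); 0];                  % forward differences, zero-padded
divX  = @(v) [v(1); v(2:end-1)-v(1:end-2); -v(end-1)];  % backward differences = -gradX'
lapX  = @(u) divX(gradX(u));                            % u(i+1)-2u(i)+u(i-1) at interior points

f = randn(N,1);  v = randn(N,1);
mismatch = dot(gradX(f), v) + dot(f, divX(v));          % should vanish, cf. (2.8)
fprintf('adjoint mismatch: %.2e\n', abs(mismatch));
```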

2.3 Surface Area and Dirichlet Energy

Based on the preceding definitions, we define the following two functionals: Let \(S:L^2(X)^D\rightarrow {\mathbb {R}}\) be defined as

$$\begin{aligned} S[s] := \sum _{x\in X} \sqrt{1+|s|(x)^2}. \end{aligned}$$
(2.10)

For a Monge surface, discretized by the graph of \(u:X\rightarrow {\mathbb {R}}\), the discrete surface area is then measured by \(S[\nabla _Xu]\). Similarly, let \(D:L^2(X)^D\rightarrow {\mathbb {R}}\) be defined as

$$\begin{aligned} D[s] := \frac{1}{2} \Vert s\Vert _{L^2(X)^D}^2. \end{aligned}$$
(2.11)

We then call \(D[\nabla _Xu]\) the discrete Dirichlet energy of u.

For slowly varying functions u, i.e., such that \(\forall x\in X:|\nabla _X u|(x) \ll 1\), the Dirichlet energy is an approximation to the surface area,

$$\begin{aligned} S[\nabla _Xu] \approx |X|+D[\nabla _Xu], \end{aligned}$$

where |X| denotes the number of sampling points in X, i.e., the area of the discretized domain \(\varOmega \). Note that both the surface area and the Dirichlet energy functional are proper, closed, and convex.

2.4 Convex Conjugates

We recall the definition of the convex conjugate (a.k.a. Legendre–Fenchel transform) \(f^*\) of a function f:

$$\begin{aligned} f^*(p) = \sup _{s} \ \ \langle s,p\rangle - f(s). \end{aligned}$$
(2.12)

The biconjugate, \(f^{**}:=(f^*)^*\), is the largest closed convex function majorized by f, i.e., \(f^{**}\le f\). As a result, \(f\equiv f^{**}\) if and only if f is closed and convex (Fenchel–Moreau–Rockafellar theorem) [21, Theorem 5].

Using definition (2.12), we can determine the respective convex conjugate functionals, \(D^*,S^*:L^2(X)^D\rightarrow {\mathbb {R}}\cup \{\infty \}\), of the Dirichlet energy (2.11) and the surface area (2.10) as follows:

$$\begin{aligned} D^*[p]&:= \sup _{s\in L^2(X)^D} \ \ \langle s,p\rangle _{L^2(X)^D} - D[s] = \frac{1}{2} \Vert p\Vert _{L^2(X)^D}^2, \end{aligned}$$
(2.13a)

and

$$\begin{aligned} S^*[p]&:= \sup _{s\in L^2(X)^D} \ \ \langle s,p\rangle _{L^2(X)^D} - S[s] \\ \nonumber&= {\left\{ \begin{array}{ll}-\sum \limits _{x\in X}\sqrt{1-|p|(x)^2} &{} \text {if}\ \ \forall x\in X:|p|(x) \le 1,\\ \infty &{} \text {otherwise.}\end{array}\right. } \end{aligned}$$
(2.13b)
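For completeness (the computation is not spelled out above), the expression (2.13b) follows from a pointwise maximization: for fixed \(x\) with \(|p|(x)<1\), the optimal \(s(x)\) is parallel to \(p(x)\), and writing \(t=|s|(x)\ge 0\) one finds

$$\begin{aligned} \sup _{t\ge 0}\ \bigl ( t\,|p|(x)-\sqrt{1+t^2} \bigr ) = \left. \bigl ( t\,|p|(x)-\sqrt{1+t^2} \bigr )\right| _{t=|p|(x)/\sqrt{1-|p|(x)^2}} = -\sqrt{1-|p|(x)^2}, \end{aligned}$$

while for \(|p|(x)>1\) the supremum is \(+\infty \) (let \(t\rightarrow \infty \)); the boundary case \(|p|(x)=1\) gives supremum 0, consistent with (2.13b).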

Thanks to the Fenchel–Moreau theorem we know that the biconjugate functionals, \(S^{**}\) and \(D^{**}\), are identical to the original functionals, thus

$$\begin{aligned} S[\nabla _X u]&= {\mathop {\mathop {\sup }\limits _{p\in L^2(X)^D }}\limits _{|p|(x)\le 1}} \ \ \langle \nabla _X u,p\rangle _{L^2(X)^D} - S^*[p] \end{aligned}$$
(2.14a)
$$\begin{aligned} D[\nabla _X u]&= \sup _{p\in L^2(X)^D} \ \ \langle \nabla _X u,p\rangle _{L^2(X)^D} - D^*[p]. \end{aligned}$$
(2.14b)

2.5 Discretized Obstacle Problem

Having defined the discretized surface area and Dirichlet energy functionals, we next define the discretized optimization problem.

Given a discretization X and appropriate finite difference operators for gradient, divergence, and Laplacian, we declare a subset \(\partial X\subset X\) to be boundary points. Let \(\varphi :X\rightarrow {\mathbb {R}}\) be an obstacle, and \(f:\partial X\rightarrow {\mathbb {R}}\) the prescribed boundary values, compatible with the obstacle, i.e., \(\forall x\in \partial X:\varphi (x) \le f(x)\). Consider further the set \(U_f\) of square integrable functions that satisfy the boundary constraints,

$$\begin{aligned} U_f:=\{u\in L^2(X)\mid u(x) = f(x) \ \forall x\in \partial X \}, \end{aligned}$$
(2.15)

and the subset \(U_{f,\varphi }\) thereof additionally satisfying the obstacle constraint,

$$\begin{aligned} U_{f,\varphi }:=\{u\in L^2(X)\mid u(x) = f(x) \ \forall x\in \partial X \text { and } u(x) \ge \varphi (x) \ \forall x\in X\}. \end{aligned}$$
(2.16)

Both sets are easily seen to be closed and convex. We are interested in solving the following four (primal) problems:

2.5.1 Constrained Obstacle Problem

$$\begin{aligned} \min _{u\in U_{f,\varphi }}\ \ S[\nabla _Xu] \end{aligned}$$
(P1)

2.5.2 Constrained Linearized Obstacle Problem

$$\begin{aligned} \min _{u\in U_{f,\varphi }}\ \ D[\nabla _Xu] \end{aligned}$$
(P2)

2.5.3 \(L^1\)-Penalty Obstacle Problem

$$\begin{aligned} \min _{u\in U_f}\ \ S[\nabla _Xu] + \mu \Vert \varphi -u\Vert _+ \end{aligned}$$
(P3)

2.5.4 Linearized \(L^1\)-Penalty Obstacle Problem

$$\begin{aligned} \min _{u\in U_f}\ \ D[\nabla _Xu] + \mu \Vert \varphi -u\Vert _+ \end{aligned}$$
(P4)

If the admissible sets are non-empty, existence of minimizers follows from standard arguments: the objective functionals are continuous and convex, the admissible sets are closed and convex, and the prescribed boundary values render the objectives coercive on a connected grid. The following lemma shows that the solutions to (P2) and (P4) are unique.

Lemma 1

Provided the graph X is connected, the boundary points \(\partial X \ne \emptyset \), and the admissible sets are non-empty, the solutions to (P2) and (P4) are unique.

Proof

We prove the result for (P4); the result for (P2) is similar.

Let \(u,v \in U_f\) be optimal solutions and \(\alpha \in (0,1)\). By the adjoint relation (2.8), the objective of (P4) can be written as

$$\begin{aligned} J(u) = \frac{1}{2}\langle u, -\varDelta _X u \rangle + \mu \Vert \varphi - u \Vert _+. \end{aligned}$$

Since u and v both minimize the convex functional J over the convex set \(U_f\), every convex combination of u and v is also a minimizer, so that

$$\begin{aligned} J(\alpha u + (1-\alpha ) v) = \alpha J(u) + (1-\alpha ) J(v). \end{aligned}$$

Writing out the definition of J and rearranging, this is equivalent to

$$\begin{aligned}&\ \frac{\alpha (1-\alpha )}{2} \langle -\varDelta _X (u-v) , u-v \rangle \\&\quad = \ \mu \left( \Vert \varphi - \alpha u - (1-\alpha ) v \Vert _+ - \alpha \Vert \varphi - u \Vert _+ - (1-\alpha ) \Vert \varphi - v \Vert _+ \right) \\&\quad \le \ 0, \end{aligned}$$

where the last inequality uses the convexity of \(\Vert \cdot \Vert _+\). On the other hand, \(\langle -\varDelta _X \cdot , \cdot \rangle \) is a positive semi-definite form and, since X is assumed to be connected, its kernel contains only the constant functions. This forces \(u-v = c\, 1\), where \(c \in {\mathbb {R}}\) and \( 1 :X \rightarrow {\mathbb {R}}\) denotes the constant function. Since \(u,v \in U_f\) satisfy the same boundary conditions and \(\partial X \ne \emptyset \), we conclude that \(u=v\), as desired.

In [26, Theorem 3.1] it is shown that for any \(\mu \ge - \varDelta \varphi \) (pointwise), the minimizers of the constrained and \(L^1\)-penalty linearized obstacle problems, the continuum versions of (P2) and (P4), are the same. Below, in Theorem 1, we give the analogous result for the discrete problems, (P2) and (P4). The proof is largely the same, except for one small but interesting point. In the continuum, one has for any \(f \in H^1\) that \(\langle \nabla f_+ , \nabla f_- \rangle = 0\), where \(f_+ = \max (f,0)\), \(f_- = \min (f,0)\), and \(f = f_+ + f_-\). In our discrete setting we have the following result instead.

Lemma 2

For any \(f \in L^2(X)\), \(\langle \nabla _X f_+ , \nabla _X f_- \rangle _{L^2(X)^D} \ge 0\).

Proof

Denoting the adjacent vertices of \(x\in X\) by N(x), we compute

$$\begin{aligned} \langle \nabla _X f_+ , \nabla _X f_- \rangle _{L^2(X)^D}&= - \langle \varDelta _X f_+ , f_- \rangle _{L^2(X)} = - \sum _{x \in X} (\varDelta _X f_+)(x) f_-(x) \\&= - \sum _{x \in X} \Bigl ( \sum _{y \in N(x) } \bigl ( f_+(y) - f_+(x) \bigr ) \Bigr ) f_-(x) \\&= \sum _{x \in X} |f_-(x) | \sum _{y \in N(x)} | f_+(y) | \\&\ge 0, \end{aligned}$$

where the last equality uses that \(f_+(x)f_-(x)=0\) for every \(x\in X\), so that the terms involving \(f_+(x)f_-(x)\) vanish, together with \(f_-\le 0\le f_+\).

Theorem 1

Let u and \(u_\mu \) denote the minimizers of (P2) and (P4), respectively. For any \(\mu \ge \max \{ - \varDelta _X \varphi (x) :x \in X\}\), we have that \(u_\mu = u\).

Proof

Consider \(w := u_\mu + (\varphi - u_\mu )_+\). Abbreviate \(\langle \cdot , \cdot \rangle = \langle \cdot , \cdot \rangle _{L^2(X)^D}\) and \(\Vert \cdot \Vert = \Vert \cdot \Vert _{L^2(X)^D}\). Evaluating the objective of (P4) at the admissible \(w \in U_{f,\varphi }\), we obtain

$$\begin{aligned} D[\nabla _Xw] + \mu \Vert \varphi - w \Vert _+&= D[\nabla _Xw] = D[\nabla _Xu_\mu ] + \langle \nabla _X u_\mu , \nabla _X (\varphi -u_\mu )_+ \rangle \\&\quad + \frac{1}{2} \langle \nabla _X (\varphi -u_\mu )_+, \nabla _X (\varphi -u_\mu )_+ \rangle \\&= D[\nabla _Xu_\mu ] + \langle - \varDelta _X \varphi , (\varphi -u_\mu )_+ \rangle _{L^2(X)} \\&\quad - \frac{1}{2} \Vert \nabla _X (\varphi - u_\mu )_+ \Vert ^2 - \langle \nabla _X(\varphi -u_\mu )_+, \nabla _X (\varphi -u_\mu )_- \rangle . \end{aligned}$$

Now assuming \(\mu \ge \max \{ - \varDelta _X \varphi (x) :x \in X\}\) and using Lemma 2, we obtain

$$\begin{aligned} D[\nabla _Xw] + \mu \Vert \varphi - w \Vert _+ \le D[\nabla _Xu_\mu ] + \mu \Vert \varphi - u_\mu \Vert _+ . \end{aligned}$$

Since \(u_\mu \) is the unique minimizer of \(D[\nabla _X\cdot ] + \mu \Vert \varphi - \cdot \Vert _+ \) over \(U_f\), we conclude that \(w = u_\mu \), which implies that \(u_\mu \ge \varphi \), i.e., \(u_\mu \in U_{f,\varphi }\). Additionally, we have that

$$\begin{aligned} D[\nabla _Xu_\mu ] = D[\nabla _Xu_\mu ] + \mu \Vert \varphi - u_\mu \Vert _+ \le D[\nabla _Xu] + \mu \Vert \varphi - u \Vert _+ = D[\nabla _Xu]. \end{aligned}$$

Since u is the unique minimizer of \(D[\nabla _X\cdot ]\) over \(U_{f,\varphi }\) and \(u_\mu \) is admissible, we conclude that \(u = u_\mu \), as desired.

2.6 Saddle Point Formulation

Thanks to the identity of a closed convex functional with its biconjugate, we can replace the functionals \(S[\nabla _Xu]\) and \(D[\nabla _Xu]\) by their respective biconjugate representations (2.14a) and (2.14b). We thus obtain the four primal-dual saddle point problems, respectively equivalent to the primal problems (P1)–(P4):

2.6.1 Constrained Obstacle Problem

$$\begin{aligned} \min _{u\in U_{f,\varphi }}\ \ \ {\mathop {\mathop {\max }\limits _{p\in L^2(X)^D}}\limits _{|p|(x)\le 1}} \ \langle \nabla _X u,p\rangle _{L^2(X)^D} - S^*[p] \end{aligned}$$
(PD1)

2.6.2 Constrained Linearized Obstacle Problem

$$\begin{aligned} \min _{u\in U_{f,\varphi }} \ \ \max _{p\in L^2(X)^D} \ \ \langle \nabla _X u,p\rangle _{L^2(X)^D} - D^*[p] \end{aligned}$$
(PD2)

2.6.3 \(L^1\)-Penalty Obstacle Problem

$$\begin{aligned} \min _{u\in U_f} \ \ {\mathop {\mathop {\max }\limits _{p\in L^2(X)^D}}\limits _{|p|(x)\le 1}} \ \ \langle \nabla _X u,p\rangle _{L^2(X)^D} - S^*[p] + \mu \Vert \varphi -u\Vert _+ \end{aligned}$$
(PD3)

2.6.4 Linearized \(L^1\)-Penalty Obstacle Problem

$$\begin{aligned} \min _{u\in U_f} \ \ \max _{p\in L^2(X)^D} \ \ \langle \nabla _X u,p\rangle _{L^2(X)^D} - D^*[p] + \mu \Vert \varphi -u\Vert _+ \end{aligned}$$
(PD4)

For discussing algorithms for the preceding problems (PD1)–(PD4), it is convenient to summarize these saddle point problems in the following form:

$$\begin{aligned} \min _{u\in U} \ \ \max _{p\in W^*} \ \ \langle \nabla _X u,p\rangle _{L^2(X)^D} - F^*[p] + G[u], \end{aligned}$$
(2.17)

where we identify the admissible sets, U and \(W^*\), and the functionals, \(F^*\) and G, as follows:

$$\begin{aligned} U&:={\left\{ \begin{array}{ll}U_{f,\varphi }&{} \text {for}\ \ (\hbox {PD1}),(\hbox {PD2})\\ U_{f}&{} \text {for}\ \ (\hbox {PD3}),(\hbox {PD4})\end{array}\right. } \end{aligned}$$
(2.18a)
$$\begin{aligned} W^*&:= {\left\{ \begin{array}{ll}L^2(X)^D \cap \{p\mid |p|(x)\le 1\}&{} \text {for}\ \ (\hbox {PD1}),(\hbox {PD3})\\ L^2(X)^D &{} \text {for}\ \ (\hbox {PD2}),(\hbox {PD4})\end{array}\right. } \end{aligned}$$
(2.18b)
$$\begin{aligned} F^*[p]&:= {\left\{ \begin{array}{ll}S^*[p] &{} \text {for}\ \ (\hbox {PD1}),(\hbox {PD3})\\ D^*[p] &{} \text {for}\ \ (\hbox {PD2}),(\hbox {PD4})\end{array}\right. } \end{aligned}$$
(2.18c)
$$\begin{aligned} G[u]&:={\left\{ \begin{array}{ll}0&{} \text {for}\ \ (\hbox {PD1}),(\hbox {PD2})\\ \mu \Vert \varphi -u\Vert _+&{} \text {for}\ \ (\hbox {PD3}),(\hbox {PD4}).\end{array}\right. } \end{aligned}$$
(2.18d)

3 Algorithms

3.1 Primal-Dual Hybrid Gradients Scheme

To solve the general saddle point problem (2.17) efficiently, we propose an adaptation of the primal-dual hybrid gradients (PDHG) algorithm [8, 29, 30, 32]. For (2.17), the structure of the PDHG algorithm is as follows:

$$\begin{aligned} p^{n+1}&= {{\mathrm{arg\,min}}}_{p\in W^*} \ \ -\langle \nabla _X \bar{u}^n,p\rangle _{L^2(X)^D} + F^*[p] + \frac{1}{2r_1}\Vert p-p^{n}\Vert _{L^2(X)^D}^2 \end{aligned}$$
(3.1a)
$$\begin{aligned} u^{n+1}&= {{\mathrm{arg\,min}}}_{u\in U} \ \ -\langle u,\mathrm{div}_Xp^{n+1}\rangle _{L^2(X)} + G[u] + \frac{1}{2r_2}\Vert u-u^{n}\Vert _{L^2(X)}^2 \end{aligned}$$
(3.1b)
$$\begin{aligned} \bar{u}^{n+1}&=2u^{n+1} -u^{n}. \end{aligned}$$
(3.1c)

The first two steps are the proximal updates of the dual and primal variable, respectively. The third, extragradient step is an overrelaxation of the primal update that counteracts the step-size shortening typical of first-order methods; it provides a prediction of the primal variable that is used in the subsequent dual variable update. Note the appearance of \(\bar{u}\) instead of u in the dual variable update (3.1a).
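In pseudocode form, the iteration (3.1) has the following generic structure (a MATLAB sketch under our own calling conventions; proxFstar and proxG stand for the resolvent steps (3.1a) and (3.1b), which are made explicit for each case in Sects. 3.2 and 3.3):

```matlab
% Generic PDHG loop for the saddle point problem (2.17).
% proxFstar(q, r1): solves (3.1a) given q = p^n + r1*gradX(ubar^n).
% proxG(a, r2):     solves (3.1b) given a = u^n + r2*divX(p^{n+1}).
function u = pdhg(u0, gradX, divX, proxFstar, proxG, r1, r2, nIter)
    u = u0;  ubar = u0;  p = 0*gradX(u0);          % p^0 = 0
    for n = 1:nIter
        p    = proxFstar(p + r1*gradX(ubar), r1);  % dual proximal update (3.1a)
        unew = proxG(u + r2*divX(p), r2);          % primal proximal update (3.1b)
        ubar = 2*unew - u;                         % extragradient step (3.1c)
        u    = unew;
    end
end
```

For the linearized cases, proxFstar reduces to the closed form (3.4) derived below, i.e., @(q,r1) q/(1+r1), while proxG is given by the projection (3.12) or the shrinkage (3.17).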

In general, O(1/n) convergence (where n is the number of iterations) has been shown for fixed step sizes \(r_1,r_2\) satisfying

$$\begin{aligned} r_1r_2\Vert \nabla _X\Vert ^2\le 1, \end{aligned}$$
(3.2)

where \(\Vert \nabla _X\Vert \) is the operator norm (induced norm) of the discretized gradient operator; equivalently, \(\Vert \nabla _X\Vert ^2=\Vert \varDelta _X\Vert \) is the induced norm of the composition \(\varDelta _X = \mathrm{div}_X\circ \nabla _X\). For the standard stencils considered here, \(\Vert \varDelta _X\Vert \le 4D\).

In the following subsections, we show how the respective minimization sub-problems (3.1a) and (3.1b) can be solved efficiently for each specific configuration in (2.18) corresponding to (PD1)–(PD4).

3.2 Dual Variable Update

We first address the update of the dual variable \(p\). This step is unaffected by the choice of constrained or \(L^1\)-penalty formulation, but differs between the non-linear, \(S[\nabla _Xu]\), and linearized, \(D[\nabla _Xu]\), case.

3.2.1 Linear Cases, (PD2) and (PD4)

In the linearized cases (PD2) and (PD4), where \(F^*=D^*\), the minimization problem of the dual variable update (3.1a) is given by

$$\begin{aligned} p^{n+1} = {{\mathrm{arg\,min}}}_{p\in L^2(X)^D} \ \ -\langle \nabla _X\bar{u}^n,p\rangle _{L^2(X)^D} + \frac{1}{2} \Vert p\Vert _{L^2(X)^D}^2 + \frac{1}{2r_1}\Vert p-p^{n}\Vert _{L^2(X)^D}^2. \end{aligned}$$
(3.3)

The minimizer of this quadratic function is immediately found as

$$\begin{aligned} p^{n+1} = \frac{p^n+r_1\nabla _X\bar{u}^n}{1+r_1}. \end{aligned}$$
(3.4)

3.2.2 Non-Linear Cases (PD1) and (PD3)

In the nonlinear cases (PD1) and (PD3), where \(F^*=S^*\), we use the identity \(-\sqrt{1-|p|(x)^2}=\bigl (|p|(x)^2-1\bigr )/\sqrt{1-|p|(x)^2}\) to rewrite the minimization problem of the dual variable update (3.1a) as

$$\begin{aligned} p^{n+1} = {\mathop {\mathop {{{\mathrm{arg\,min}}}}\limits _{p\in L^2(X)^D}}\limits _{|p|(x)\le 1}} -\langle \nabla _X\bar{u}^n,p\rangle _{L^2(X)^D} +\sum _{x\in X}\frac{|p|(x)^2-1}{\sqrt{1-|p|(x)^2}} + \frac{1}{2r_1}\Vert p-p^{n}\Vert _{L^2(X)^D}^2. \end{aligned}$$
(3.5)

We propose to solve this with an iteratively reweighted least squares (IRLS) approach, as follows. We observe that the objective in (3.5) would be quadratic in \(p\) were it not for the square root in the denominator. We therefore fix the current estimate of \(p\) inside the square root to obtain a weighted least squares problem; the weight is then updated, and the process is repeated until convergence:

$$\begin{aligned} \psi ^{0}&= p^{n} \end{aligned}$$
(3.6a)
$$\begin{aligned} \psi ^{t+1}&={\mathop {\mathop {{{\mathrm{arg\,min}}}}\limits _{\psi \in L^2(X)^D }}\limits _{|\psi |(x)\le 1}} \ \ -\langle \nabla _X\bar{u}^n,\psi \rangle _{L^2(X)^D} + \sum _{x\in X}\frac{|\psi |(x)^2-1}{\sqrt{1-|\psi ^t|(x)^2}} + \frac{1}{2r_1}\Vert \psi -p^{n}\Vert _{L^2(X)^D}^2 \end{aligned}$$
(3.6b)
$$\begin{aligned} p^{n+1}&= \lim _{t\rightarrow \infty }\psi ^t \end{aligned}$$
(3.6c)

Our focus is on the solution of the fixed-point minimization step (3.6b). Indeed, it can be shown (see [32, Lemma 4.3]) that for small enough \(r_1\), the solution to (3.6b) is given by

$$\begin{aligned} \psi ^{t+1}(x) = \frac{\varepsilon ^t(x)}{\varepsilon ^t(x)+2r_1}\left[ p^n(x)+r_1(\nabla _X\bar{u}^n)(x)\right] ,\quad x\in X, \end{aligned}$$
(3.7)

where \(\varepsilon ^t(x):=\sqrt{1-|\psi ^t|(x)^2}\). Further, for suitably chosen \(r_1\), the iterative scheme (3.6) converges to the desired fixed point (see [32, Lemma 4.4]).
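A possible implementation of this inner loop is sketched below (our illustration; p_n and grad_ubar are \(|X|\)-by-D arrays holding \(p^n\) and \(\nabla _X\bar{u}^n\), and the inner iteration count and tolerance are our choices; the paper only requires iterating (3.7) to convergence):

```matlab
% IRLS inner iteration (3.6)/(3.7) for the nonlinear dual update.
function p_next = dual_update_nonlinear(p_n, grad_ubar, r1, nInner)
    psi = p_n;                                             % psi^0 = p^n, cf. (3.6a)
    for t = 1:nInner
        eps_t   = sqrt(max(1 - sum(psi.^2, 2), 0));        % eps^t(x), clipped at 0 for safety
        psi_new = (eps_t ./ (eps_t + 2*r1)) .* (p_n + r1*grad_ubar);   % pointwise step (3.7)
        if max(abs(psi_new(:) - psi(:))) < 1e-10           % stop once the fixed point is reached
            psi = psi_new;  break;
        end
        psi = psi_new;
    end
    p_next = psi;
end
```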

3.3 Primal Variable Update

We now consider the update problem (3.1b) of the primal variable. Here, the choice of non-linear versus linearized cost functional is irrelevant; instead, we distinguish the original constraint formulations (PD1) and (PD2) from the relaxed \(L^1\)-penalty formulations (PD3) and (PD4).

3.3.1 Constrained Formulation (PD1) and (PD2)

The constrained primal variable iteration (3.1b) reads

$$\begin{aligned} u^{n+1} = {{\mathrm{arg\,min}}}_{u\in U_{f,\varphi }}\ \ -\langle u,\mathrm{div}_Xp^{n+1}\rangle _{L^2(X)} +\frac{1}{2r_2}\Vert u-u^n\Vert _{L^2(X)}^2. \end{aligned}$$
(3.8)

Based on the projection theorem and variational inequalities [6], and in the tradition of the Arrow–Hurwicz–Uzawa algorithm [1] and the original formulation of the PDHG algorithm [29], this update is computed by seeking a minimizer in the relaxed set \(L^2(X)\) and projecting onto the admissible set \(U_{f,\varphi }\). Since the objective in (3.8) completes to an isotropic quadratic, this projected update in fact solves (3.8) exactly. The iteration can thus be written

$$\begin{aligned} u^{n+1}&= P_{U_{f,\varphi }}\left[ {{\mathrm{arg\,min}}}_{u\in L^2(X)}\ \ -\langle u,\mathrm{div}_Xp^{n+1}\rangle _{L^2(X)} +\frac{1}{2r_2}\Vert u-u^n\Vert _{L^2(X)}^2\right] \\ \nonumber&= P_{U_{f,\varphi }}[u^\star ], \end{aligned}$$
(3.9)

with

$$\begin{aligned} u^\star := u^n + r_2\mathrm{div}_Xp^{n+1} \end{aligned}$$
(3.10)

and where the projection \(P_{U_{f,\varphi }}:L^2(X) \rightarrow {U_{f,\varphi }}\) is simply defined as

$$\begin{aligned} P_{U_{f,\varphi }}[u^\star ] := {{\mathrm{arg\,min}}}_{u\in U_{f,\varphi }}\Vert u-u^\star \Vert _{L^2(X)}. \end{aligned}$$
(3.11)

Therefore, more explicitly for \(x\in X\), the update of the primal variable in the constrained setting (PD1), (PD2), is:

$$\begin{aligned} u^{n+1}(x) := {\left\{ \begin{array}{ll} f(x) &{} \text {if} \ \ x\in \partial X,\\ \varphi (x) &{} \text {if}\ \ x\notin \partial X \text { and } u^\star (x) < \varphi (x),\\ u^\star (x) &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$
(3.12)
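In code, this projection step is a pointwise clamp followed by re-imposing the boundary values; a minimal MATLAB helper (the logical mask isBoundary marking \(\partial X\), with f holding the boundary values there, is our illustrative calling convention):

```matlab
% Constrained primal update (3.12).
function u_next = primal_update_constrained(u_star, phi, f, isBoundary)
    u_next = max(u_star, phi);           % project onto {u >= phi}
    u_next(isBoundary) = f(isBoundary);  % re-impose the prescribed boundary values
end
```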

3.3.2 \(L^1\)-Penalty Formulations (PD3) and (PD4)

The \(L^1\)-penalty method relaxes the admissible set to \(U_f\), i.e., \(L^2(X)\) subject to only the boundary values f on \(\partial X\), and introduces the \(L^1\)-like penalty term \(\Vert \varphi -u\Vert _+\), instead. The resulting instance of (3.1b) can be rewritten as

$$\begin{aligned} u^{n+1} = {{\mathrm{arg\,min}}}_{u\in U_{f}}\ \ \mu \Vert \varphi -u\Vert _++ \frac{1}{2r_2}\Vert u-(u^n+r_2\mathrm{div}_Xp^{n+1})\Vert _{L^2(X)}^2. \end{aligned}$$
(3.13)

The minimizer can be easily found by making a change of variables, \(z=\varphi -u\), leading to

$$\begin{aligned} z^\star&= {{\mathrm{arg\,min}}}_{z}\ \ r_2\mu \Vert z\Vert _++ \frac{1}{2}\Vert \varphi -(u^n+r_2\mathrm{div}_Xp^{n+1})-z\Vert _{L^2(X)}^2\nonumber \\&= {{\mathrm{shrink}}}_+(\varphi -(u^n+r_2\mathrm{div}_Xp^{n+1}), r_2\mu ), \end{aligned}$$
(3.14)

where \({{\mathrm{shrink}}}_+\) denotes single-sided soft-thresholding (shrinkage) as follows:

$$\begin{aligned} {{\mathrm{shrink}}}_+(s,\tau ) := {\left\{ \begin{array}{ll} s&{}\text {if} \ \ s<0,\\ 0 &{} \text {if} \ \ 0\le s \le \tau ,\\ s-\tau &{} \text {if} \ \ s>\tau .\end{array}\right. } \end{aligned}$$
(3.15)

As before, let \(u^\star =u^n+r_2\mathrm{div}_X p^{n+1}\). Re-substituting \(u^{n+1}=\varphi -z^\star \) and enforcing the boundary conditions through the projection \(P_{U_f}\) yields

$$\begin{aligned} u^{n+1} = P_{U_f}[\varphi - {{\mathrm{shrink}}}_+(\varphi -u^\star , r_2\mu ) ]. \end{aligned}$$
(3.16)

This leads to the following explicit primal update for the \(L^1\)-penalty based obstacle problems, (PD3), (PD4):

$$\begin{aligned} u^{n+1}(x) = {\left\{ \begin{array}{ll} f(x) &{} \text {if} \ \ x\in \partial X,\\ u^\star (x)+r_2\mu &{} \text {if}\ \ x\notin \partial X \text { and } u^\star (x) < \varphi (x)-r_2\mu ,\\ \varphi (x) &{} \text {if}\ \ x\notin \partial X \text { and } \varphi (x)-r_2\mu \le u^\star (x) \le \varphi (x),\\ u^\star (x) &{} \text {otherwise}.\end{array}\right. } \end{aligned}$$
(3.17)
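Analogously, the penalized update (3.16)/(3.17) amounts to a one-sided shrinkage followed by the boundary projection; a minimal MATLAB helper under the same illustrative conventions as above:

```matlab
% L1-penalty primal update (3.16)/(3.17).
function u_next = primal_update_penalty(u_star, phi, f, mu, r2, isBoundary)
    shrink_plus = @(s, tau) s.*(s < 0) + (s - tau).*(s > tau);   % one-sided shrinkage (3.15)
    u_next = phi - shrink_plus(phi - u_star, r2*mu);             % cf. (3.16)
    u_next(isBoundary) = f(isBoundary);                          % projection onto U_f
end
```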

Remark 1

Recall that in Theorem 1, we proved that the solutions to the constrained and \(L^1\)-penalized obstacle problems, (PD2) and (PD4), are equivalent for \(\mu \ge \max \{-\varDelta _X \varphi (x) :x \in X\}\); see also [26]. In light of this, it is interesting to compare primal iterations (3.12) and (3.17) for the constrained and \(L^1\)-penalized obstacle problems. These two algorithms only disagree for points \(x \notin \partial X\) with \(u^\star (x) < \varphi (x)-r_2\mu \), i.e., when the tentative membrane update \(u^\star \) violates the obstacle constraint by an amount in excess of \(r_2\mu \). This suggests that not only do the solutions of these two problems agree for sufficiently large \(\mu \), but so do all of the iterates in the proposed primal-dual methods.

3.4 Elimination of the Dual Variable in the Linear Case, (PD2) and (PD4)

In the updates of the primal variable, the dual variable only appears through its divergence, \(\mathrm{div}_X p^{n+1}\); see (3.10) being used in (3.12) and (3.17). However, in the linear case, (PD2) and (PD4), by applying the divergence to (3.4), it is easy to see that

$$\begin{aligned} \mathrm{div}_Xp^{n+1} = \frac{\mathrm{div}_Xp^n+r_1\varDelta _X\bar{u}^n}{1+r_1}. \end{aligned}$$
(3.18)

Assuming \(p^{0}(x)=0\) for all \(x\in X\), this recursion relation can be solved, giving

$$\begin{aligned} \mathrm{div}_Xp^{n+1} = \sum _{\eta =0}^n \frac{r_1\varDelta _X\bar{u}^{n-\eta }}{(1+r_1)^{\eta +1}}. \end{aligned}$$
(3.19)

This allows rewriting the primal variable updates without the dual variable, effectively eliminating the dual variable from the scheme altogether. After elimination, the intermediate primal update (3.10) is now

$$\begin{aligned} u^\star = u^n + r_1r_2\sum _{\eta =0}^n \frac{\varDelta _X\bar{u}^{n-\eta }}{(1+r_1)^{\eta +1}}, \end{aligned}$$
(3.20)

which is then projected/shrunk according to (3.12) or (3.17) as before.
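In an implementation, one does not form the sum (3.19) explicitly: it suffices to carry the scalar field \(q^n := \mathrm{div}_X p^{n}\) and update it by the recursion (3.18). One sweep of the resulting dual-free scheme might look as follows (a sketch; lapX and projectOrShrink are illustrative handles standing for \(\varDelta _X\) and for the projection (3.12) or shrinkage (3.17), respectively):

```matlab
% One sweep of the dual-free scheme for the linear cases (PD2)/(PD4).
function [u, ubar, q] = linear_sweep(u, ubar, q, lapX, projectOrShrink, r1, r2)
    q    = (q + r1*lapX(ubar)) / (1 + r1);  % running divergence of p, recursion (3.18)
    unew = projectOrShrink(u + r2*q);       % intermediate update (3.10)/(3.20), then (3.12) or (3.17)
    ubar = 2*unew - u;                      % extragradient step (3.1c)
    u    = unew;
end
```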

3.5 Complete Algorithms

The complete algorithms for the four different flavors of the obstacle problem (PD1)–(PD4) are summarized in Algorithms 1 and 2. These algorithms have an extremely simple structure and require only low-complexity operations; in particular, no matrix inversions are required. They converge at rate O(1/n) (where n is the number of iterations) for \(r_1,r_2\) satisfying

$$\begin{aligned} r_1r_2\Vert \varDelta _X\Vert \le 1. \end{aligned}$$

Remark 2

Although Algorithms 1 and 2 appear to be specifically engineered to serve a particular purpose, it is important to note that they have been derived in a disciplined manner: by reformulating the original, primal obstacle problem as a saddle point problem and by applying a primal-dual hybrid gradients algorithm with known convergence guarantees.

Algorithm 1 (pseudocode figure)
Algorithm 2 (pseudocode figure)
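As a concrete illustration of this structure, the following is a hedged MATLAB sketch for the constrained linearized problem (PD2) in one dimension. It is our transcription of the updates (3.4), (3.10), (3.12), and (3.1c), not the article's Algorithm 1 or 2 verbatim; the obstacle (\(\varphi _3\) from (5.3)), boundary values, initialization, and stopping rule are illustrative choices.

```matlab
% 1D constrained linearized obstacle problem (PD2), solved by the
% primal-dual iteration (3.4), (3.10), (3.12), (3.1c).
N   = 256;  x = linspace(0,1,N)';
phi = 10*sin(pi*(x+1).^2).^2;        % obstacle phi_3 from (5.3); phi_3(0)=phi_3(1)=0
f   = zeros(N,1);                    % boundary values u(0)=u(1)=0, compatible with phi_3
r1  = 0.01;  r2 = 25;                % r1*r2*||Delta_X|| <= 1, since ||Delta_X|| <= 4 in 1D

gradX = @(u) [u(2:end)-u(1:end-1); 0];                  % forward differences
divX  = @(v) [v(1); v(2:end-1)-v(1:end-2); -v(end-1)];  % negative adjoint of gradX

u = max(phi, 0);  ubar = u;          % feasible initialization (our choice)
p = zeros(N,1);                      % dual variable, p^0 = 0

for n = 1:50000
    p     = (p + r1*gradX(ubar)) / (1 + r1);   % dual update (3.4)
    ustar = u + r2*divX(p);                    % intermediate primal update (3.10)
    unew  = max(ustar, phi);                   % obstacle projection, cf. (3.12)
    unew([1 end]) = f([1 end]);                % re-impose boundary values
    ubar  = 2*unew - u;                        % extragradient step (3.1c)
    done  = max(abs(unew - u)) < 1e-8;         % simple stopping rule (our choice)
    u     = unew;
    if done, break; end
end
plot(x, phi, ':', x, u, '-');  legend('obstacle \phi_3', 'membrane u');
```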

4 Interpretation/Comparisons for Proposed Algorithm

In this section we compare the proposed primal-dual algorithm (summarized in Algorithms 1 and 2) to two time-stepping schemes arising in classical problems.

4.1 Comparison to the Forward Euler Scheme for the Discrete Heat Equation

For comparison, recall that the forward Euler time-stepping scheme for the discrete heat equation can be written

$$\begin{aligned} u^\star&= u^n + r_1r_2 \varDelta _Xu^n \end{aligned}$$
(4.1a)
$$\begin{aligned}&= u^0 + r_1r_2 \sum _{\eta =0}^n \varDelta _Xu^\eta , \end{aligned}$$
(4.1b)

where \(r_1r_2\) is the time step. Thus, after elimination of the dual variable, the primal-dual method for the linear problem seems to perform something akin to an alternating sequence of explicit heat diffusion steps (4.1), projection/thresholding (3.12) or (3.17), and an extragradient step (3.1c). Unfortunately, the explicit heat diffusion (4.1) is well known to be severely limited by its stability criterion and therefore dreadfully slow.

Indeed, the first term, \(\eta =0\), of the summation in (3.20) is an explicit heat update step, subject to the usual constraint on the time step, here \(r_1r_2\le \rho (\varDelta _X)^{-1}\). However, thanks to the sum, previous iterates are taken into account (discounted by a compounding factor of \(\frac{1}{1+r_1}\)), the impact of which is essential for the efficiency and convergence of the scheme. Consider the following two cases:

Away from the solution, subsequent heat updates point in the same direction and the sum lets the history of updates interfere constructively, effectively greatly increasing the time step of the explicit heat update. Assuming \(\varDelta _X\bar{u}^{n-\eta }\approx \varDelta _Xu^n\) is approximately constant for recent iterations (small \(\eta \)), and noting that \(\sum _{\eta =0}^n \frac{1}{(1+r_1)^{\eta +1}} = \frac{1}{r_1}\left( 1-(1+r_1)^{-(n+1)}\right) \approx \frac{1}{r_1}\) for large n, we can approximate (3.20) by

$$\begin{aligned} u^\star \approx u^n + r_2 \varDelta _X u^n. \end{aligned}$$

The proposed scheme is thus heat diffusion with a \(1/r_1\)-fold larger effective time-step, see (4.1). This is more pronounced for \(r_1 \ll 1\), reducing the discounting on previous time steps (i.e., increasing the memory horizon of the summation).

Close to the solution, the membrane tends to oscillate about the optimum and subsequent heat steps have opposing signs. The summation then leads to destructive interference of subsequent updates, damping the oscillations and effectively stabilizing the explicit heat update. In this case, larger values of \(r_1\) reduce the memory horizon and allow the system to transition from constructive to stabilizing, destructive interference, thereby reducing overshooting.

4.2 Comparison to a Time-Stepping Scheme for the Damped Wave Equation

Another interesting analysis of the resulting update scheme, in the linear case and away from obstacle and boundary value projections, emerges when we rewrite it as a three-point scheme in time.

Lemma 3

The proposed update iterations of the primal variable (ignoring projections) satisfy the following three-point equality:

$$\begin{aligned} \left( u^{n+1}-2u^n+u^{n-1}\right) + r_1\left( u^{n+1}-u^n\right) = r_1r_2\left( 2\varDelta _X u^{n}-\varDelta _X u^{n-1}\right) . \end{aligned}$$
(4.2)

Remark 3

The left-hand terms in (4.2) are immediately identified as finite difference approximations, with an artificial time step h, of \(u_{tt}^n\approx \left( u^{n+1}-2u^n+u^{n-1}\right) /h^2\) and \(u_t^n\approx \left( u^{n+1}-u^n\right) /h\), and one clearly recognizes the elements of a damped wave equation:

$$\begin{aligned} u_{tt}^n + \frac{r_1}{h}u_t^n = \frac{r_1r_2}{h^2}\left( 2\varDelta _X u^{n}-\varDelta _X u^{n-1}\right) \end{aligned}$$
(4.3)

Proof

By substituting the extragradient (3.1c) into (3.18), we get:

$$\begin{aligned} (1+r_1)\mathrm{div}_Xp^{n+1} = \mathrm{div}_Xp^n + r_1\left( 2\varDelta _X u^n - \varDelta _X u^{n-1}\right) . \end{aligned}$$

From (3.10) and (3.12)/(3.17), but ignoring shrinkage and projections, we get

$$\begin{aligned} \mathrm{div}_Xp^{n+1}=\frac{u^{n+1}-u^n}{r_2}. \end{aligned}$$

Put together, this yields the primal update

$$\begin{aligned} u^{n+1} = u^n + \frac{u^n-u^{n-1}}{1+r_1} + \frac{r_1r_2}{1+r_1}\left( 2\varDelta _X u^n - \varDelta _X u^{n-1}\right) . \end{aligned}$$
(4.4)

from which (4.2) follows by simple term rearrangements.

Fig. 1

Toy example problem: obstacle-free linearized minimal surface problem (PD2) on 32-point 1D domain discretization, \(f=0\) boundary conditions, and single peak initialization at the center. Evolution of the membrane over iterations n. a Standard Euler explicit heat update (4.1). b Our primal-dual scheme (3.20). Black area indicates error with respect to (trivial) solution

The three-point primal update scheme (4.4) highlights how the proposed scheme performs explicit heat diffusion (based on the forward-projected Laplacian, in the extragradient step), on top of which the most recent update is added. This scheme is related to the “momentum method”, because it mimics the motion of a system u with a certain mass (and therefore inertia) [19]. This type of acceleration is classically used in gradient-descent back-propagation training of neural networks [20], and has been associated with a specific type of conjugate gradients [4]. The idea of maintaining momentum by mixing in an extrapolation of the most recent updates is also the essence of Nesterov’s gradient acceleration scheme [18] and the core of the FISTA algorithm [3]. It is further reminiscent of gradient averaging, e.g., as used in stochastic gradient descent methods.

The evolution of a simple toy problem under both the standard explicit (forward Euler) heat update (4.1) and the proposed primal-dual scheme (3.20) is illustrated in Fig. 1. It is easy to see how the summation in (3.20) causes the peak to decay at a faster pace after the first iterate. As a result, the initial localized state splits into two components that propagate away from the center and decay, illustrating the damped wave nature of the iterations. The wave is reflected at the fixed boundaries, inverting its sign, in contrast to the explicit scheme, which remains non-negative (according to the maximum principle). Evidently, the proposed scheme converges to the (trivial) flat solution \(u=0\) much faster than the explicit heat update.
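The following short MATLAB sketch reproduces the spirit of this experiment (our illustration; grid size, step sizes, and the stopping threshold are assumptions): it runs the obstacle-free 1D problem with zero boundary values and a single central peak, once with the explicit heat update (4.1) and once with the proposed scheme (3.20), implemented via the recursion (3.18), and reports the number of iterations needed to decay below a fixed threshold.

```matlab
% Obstacle-free 1D toy problem: explicit heat flow (4.1) vs. the proposed
% dual-free primal-dual scheme (3.18)/(3.20), both with u = 0 at the boundary.
N = 32;  r1 = 0.01;  r2 = 25;                     % r1*r2*||Delta_X|| <= 1 in 1D
lap = @(u) [u(2)-u(1); u(3:end)-2*u(2:end-1)+u(1:end-2); u(end-1)-u(end)];
u0 = zeros(N,1);  u0(round(N/2)) = 1;             % single peak at the center

u = u0;  kHeat = 0;                               % explicit heat flow, time step r1*r2
while max(abs(u)) > 1e-4
    u = u + r1*r2*lap(u);  u([1 end]) = 0;  kHeat = kHeat + 1;
end

u = u0;  ubar = u0;  q = zeros(N,1);  kPD = 0;    % proposed scheme, running q = div_X p
while max(abs(u)) > 1e-4
    q    = (q + r1*lap(ubar)) / (1 + r1);         % recursion (3.18)
    unew = u + r2*q;  unew([1 end]) = 0;          % (3.20), then boundary projection (3.12)
    ubar = 2*unew - u;  u = unew;  kPD = kPD + 1; % extragradient step (3.1c)
end
fprintf('explicit heat: %d iterations, primal-dual: %d iterations\n', kHeat, kPD);
```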

5 Numerical Experiments

In all of the following problems, we assume X to be a Cartesian sampling of either an interval in \({\mathbb {R}}\) or a rectangular subset of \({\mathbb {R}}^2\). The differential operators are chosen as forward differences for the gradient and, as a result, backward differences for the divergence and the standard centered stencil for the Laplacian.

We implemented the algorithms in MATLAB® and ran our experiments on standard computing equipment (a 2012 laptop with a 2.80 GHz Intel® Core™ i7 and 4 GB RAM).

5.1 One-Dimensional Obstacles

We consider two obstacles on [0, 1] that were previously studied in [23, 26]:

$$\begin{aligned} \varphi _1(x) := {\left\{ \begin{array}{ll} 100x^2 &{} \text {for} \ \ 0\le x\le 0.25\\ 100x(1-x) &{} \text {for} \ \ 0.25\le x\le 0.5\\ \varphi _1(1-x)&{}\text {for}\ \ 0.5\le x \le 1, \end{array}\right. } \end{aligned}$$
(5.1)

and

$$\begin{aligned} \varphi _2(x) := {\left\{ \begin{array}{ll} 10\sin (2\pi x) &{} \text {for} \ \ 0\le x\le 0.25\\ 5\cos (\pi (4x-1))+5 &{} \text {for} \ \ 0.25\le x\le 0.5\\ \varphi _2(1-x)&{}\text {for}\ \ 0.5\le x \le 1. \end{array}\right. } \end{aligned}$$
(5.2)

A third obstacle (from [26]) is given by

$$\begin{aligned} \varphi _3(x) := 10\sin ^2(\pi (x+1)^2),\qquad x\in [0,1]. \end{aligned}$$
(5.3)

All three problems and the obtained solutions are illustrated in Fig. 2. In the 1D case, there is of course no difference between the solutions of the linear and the non-linear versions: the solutions are identical to the obstacle on the contact set, and straight lines away from it.

Fig. 2

1D obstacles \(\varphi _1\) through \(\varphi _3\) (dotted) and respective solutions to the obstacle problem (solid). ab Linear obstacle problem (PD2). \(N=256\), \(\mu =5\), boundary conditions \(u(0)=u(1)=0\). c Non-linear obstacle problem (PD1). \(N=512\), \(\mu =\infty \), \(u(0)=5\) and \(u(1) = 10\). For all three: \(r_1=0.01\), \(r_2=25\), \(\epsilon =10^{-5}\)

5.2 Two-Dimensional Obstacles

We now look at example problems in 2D (from [26]). First, consider the square domain \(\varOmega =[0,1]^2\) and the obstacle \(\varphi _4:\varOmega \rightarrow {\mathbb {R}}\),

$$\begin{aligned} \varphi _4(x,y) := {\left\{ \begin{array}{ll} 5 &{} \text {for}\ \ |x-0.6|+|y-0.6|< 0.04\\ 4.5 &{} \text {for}\ \ (x-0.6)^2+(y-0.25)^2< 0.001\\ 4.5&{} \text {for}\ \ y=0.57 \ \ \text {and} \ \ 0.075< x < 0.13\\ 0&{} \text {otherwise}. \end{array}\right. } \end{aligned}$$
(5.4)

We solve the linear (PD2) and non-linear (PD1) obstacle problem (with 0 boundary values imposed) at different resolutions, as illustrated in Fig. 3. Parameter values are as follows: \(\mu =\infty \), \(r_1=0.01\), \(r_2=12.5\), \(\epsilon =10^{-5}\).

In Table 1 we compare to numerical experiments with previous algorithm implementations, in particular the \(L^1\)-penalty method [26] and the Lions–Mercier splitting scheme [16], as reported in [26]. These comparisons hold only in a loose sense, because the implementation and computing infrastructure are not disclosed in [26]. Nevertheless, our method appears to outperform these by 1–2 orders of magnitude.

Fig. 3

Solution to the linear and nonlinear problems (PD1) and (PD2) for a 2D obstacle, \(\varphi _4(x,y)\) in (5.4). a The obstacles at \(128\times 128\). b The linear obstacle problem solution computed at medium resolution \(128\times 128\). c \(64\times 64\) low-resolution solution of the linear problem. d The low-resolution non-linear obstacle problem solution

Table 1 Running times for the 2D obstacle problem \(\varphi _4\), at different resolutions

Next, we consider the square domain \(\varOmega =[-2,2]^2\), and the radially-symmetric obstacle

$$\begin{aligned} \varphi _5(r) := {\left\{ \begin{array}{ll} \sqrt{1-r^2} &{} \text {for}\ \ r\le 1,\\ -1 &{}\text {otherwise}. \end{array}\right. } \end{aligned}$$
(5.5)

This obstacle has been used in [17, 26], and can be traced back to at least [25], albeit with a typo in its definition. Assuming \(f=0\) on a circular boundary of radius \(R=2\), the linear obstacle problem admits the following radially symmetric analytical solution:

$$\begin{aligned} u_5^*(r) = {\left\{ \begin{array}{ll}\sqrt{1-r^2} &{} \text {for}\ \ r\le r^*,\\ -(r^*)^2\ln (r/R)/\sqrt{1-(r^*)^2} &{}\text {otherwise}, \end{array}\right. } \end{aligned}$$
(5.6)

where \(r^*=0.6979651482\ldots \) satisfies \((r^*)^2(1-\ln (r^*/R))=1\).

Using the values of this analytical solution as prescribed boundary values on the boundaries of the square domain, we solve the linear obstacle problem. We set the parameters as follows: \(\mu = 0.1\), \(r_1=0.008\), \(r_2=15.625\), \(\epsilon =10^{-7}\). Results are shown in Fig. 4. The pointwise error is of the order of \(10^{-5}\) and concentrated at the circular free boundary, mainly as a result of discretization. This type of error is comparable to the results reported in [17] and [26].

Fig. 4

Radially symmetric 2D obstacle problem. a Obstacle \(\varphi _5(x,y)\). bc Low- and medium-resolution solution and circular zero-level set (blue). d Error map of the high-resolution solution \(256\times 256\). The error is concentrated as peaks near the free boundary, mostly due to spatial discretization, and propagating from there into the free membrane parts. (Color figure online)

For this obstacle problem, we can compare to numerical experiments with previous algorithm implementations; see Table 2. In particular, we compare the runtime and number of iterations with the level set gradient method of [17]. Even assuming that the 2004 computing times of [17] may be discounted by Moore’s law (halving every 18 months, about a factor of 160–200 overall), we note that our method clearly outperforms it by 1–2 orders of magnitude.

5.3 Generalization to a Double Obstacle with Forcing

A more general form of the obstacle problem deals with two obstacles, bounding the membrane from below \((\varphi )\) and above \((\psi )\), and the inclusion of a force term v acting vertically on the membrane:

$$\begin{aligned} \min _u\int _\varOmega |\nabla u|^2 - u v \qquad \text {s.t.}\quad \psi \ge u\ge \varphi ,\quad u\vert _{\partial \varOmega } = f. \end{aligned}$$
(5.7)

It is cumbersome but straightforward to extend the proposed algorithms to this case: the double obstacle results in a second shrinkage term (albeit with opposite signs), and, when derived from (3.1b), the force v enters the intermediate primal variable update scaled by the primal step size, \(u^\star = u^n + r_2(\mathrm{div}_Xp^{n+1} + v)\). The exact details are left as an exercise to the reader; one possible pointwise form of the resulting primal update is sketched below.
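A possible pointwise form of the resulting primal update, under the penalty formulation and assuming \(\varphi \le \psi \), is sketched here (our derivation from (3.1b), since the details are left to the reader; the calling conventions are the same illustrative ones as before, and a denotes \(u^n + r_2(\mathrm{div}_Xp^{n+1} + v)\)):

```matlab
% Double-obstacle primal update: two one-sided shrinkage steps (3.15) with
% opposite signs, towards the lower obstacle phi and the upper obstacle psi.
function u_next = primal_update_double(a, phi, psi, f, mu, r2, isBoundary)
    shrink_plus = @(s, tau) s.*(s < 0) + (s - tau).*(s > tau);
    u_next = phi - shrink_plus(phi - a, r2*mu);      % push up towards phi, cf. (3.16)
    u_next = psi + shrink_plus(u_next - psi, r2*mu); % push down towards psi (opposite sign)
    u_next(isBoundary) = f(isBoundary);              % re-impose boundary values
end
```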

Table 2 Running times for the spherical 2D problem \(\varphi _5\), at different resolutions

As an illustration of the performance of this slightly extended algorithm, we consider an elasto-plastic torsion problem described in [27]. Let \(\varOmega = [0,1]^2\) and let X be a Cartesian sampling thereof. The obstacles are constructed as

$$\begin{aligned} \varphi _6(x,y)&:= -\text {dist}(x,\partial \varOmega ),\ \ \text {and} \end{aligned}$$
(5.8a)
$$\begin{aligned} \psi _6(x,y)&:= 0.2. \end{aligned}$$
(5.8b)

The boundary values are \(u=0\) on \(\partial \varOmega \). The force \(v:\varOmega \rightarrow {\mathbb {R}}\) is defined as

$$\begin{aligned} v(x,y) := {\left\{ \begin{array}{ll} 300,&{}\text {if}\ \ (x,y)\in S := \{ (x,y):|x-y|\le 0.1 \ \wedge \ x\le 0.3\},\\ -70e^yg(x), &{} \text {if}\ \ (x,y) \notin S\ \wedge \ x\le 1-y,\\ 15e^yg(x), &{} \text {if}\ \ (x,y) \notin S\ \wedge \ x> 1-y,\\ \end{array}\right. } \end{aligned}$$
(5.9)

where we further have

$$\begin{aligned} g(x) := \left\{ \begin{array}{lll} 6x,&{}\text {if}\ \ 0\le x\le 1/6,\\ 2(1-3x),&{}\text {if}\ \ 1/6< x\le 1/3,\\ 6(x-1/3),&{}\text {if}\ \ 1/3< x\le 1/2,\\ 2(1-3(x-1/3)),&{}\text {if}\ \ 1/2< x\le 2/3,\\ 6(x-2/3),&{}\text {if}\ \ 2/3< x\le 5/6,\\ 2(1-3(x-2/3)),&{}\text {if}\ \ 5/6< x\le 1.\\ \end{array}\right. \end{aligned}$$
(5.10)

On a coarse \(64\times 64\) grid (as in [27]), with \(\mu =0.1\), \(r_1=0.008\), \(r_2=15.625\), and a tolerance of \(\epsilon =10^{-6}\), the algorithm converges in 1104 iterations, requiring only 0.45 s. The algorithm presented in [27] is reported to require 916.8–1275.1 s, which is more than three orders of magnitude slower. The computed minimum-energy membrane and the associated contact set (coincidence set) are shown in Fig. 5 and compare very well with the results in [27].

Fig. 5

Double obstacle with force: elasto-plastic torsion problem. a Computed optimal membrane u. b Computed contact set: free membrane (blue), contact with upper obstacle (red), and contact with lower obstacle (green). Compare with results in [27]. (Color figure online)

6 Conclusions and Outlook

In this paper we have proposed a primal-dual hybrid gradients based method for the efficient solution of the obstacle problem. We consider both the non-linear and linearized versions of the discrete obstacle problem, as well as both the original, constrained formulation and the \(L^1\)-penalty relaxation. All of these problems are convex minimization problems, and we start by reformulating them as primal-dual saddle point problems, based on the Legendre–Fenchel transform (convex conjugate) of the surface area and the Dirichlet energy, respectively. The resulting saddle point problem is solved by the primal-dual hybrid gradients method, which consists of three iterative steps: the dual and primal proximal updates, and an extragradient step (overrelaxation) of the primal variable. The proximal updates can be solved efficiently, particularly so in the linear case. Indeed, the linear case allows eliminating the dual variable altogether, resulting in a scheme consisting of the primal variable update and the extragradient step only. We demonstrate on various 1D and 2D problems that the resulting scheme outperforms current state-of-the-art methods by 1–2 orders of magnitude.

In addition to being efficient, the proposed algorithm also benefits from a highly interesting physical interpretation, as discussed in Sects. 3.4, 4.1, and 4.2. Firstly, the elimination of the dual variable results in a single-variable scheme that is strongly reminiscent of gradient descent, except for the fact that previous iterates remain involved. As a result, a certain “momentum” builds up and accelerates the updates beyond the limits of the usual CFL step-size criteria, see Sect. 4.1. Indeed, away from the solution the previous iterates interfere constructively, resulting in a greatly increased apparent time step, whereas closer to the solution the membrane has a tendency to oscillate and the iterates cancel out, leading to increased stability. Similarly, the scheme can be brought into a form that is highly reminiscent of a damped wave equation (Sect. 4.2). This behavior is particularly well illustrated by the toy example considered in Fig. 1, where the momentum splits an initial central state into two separate parts, which then propagate away laterally and eventually dissipate.

A general advantage of the proposed primal-dual based method for the obstacle problem is the absence of matrix/operator inversions, since the proximal updates require only explicit operations involving, at most, the operators and their adjoints. Also, in analogy to [23, 26] but in contrast to, e.g., [12, 13], the proposed method does not need to track the contact set explicitly; indeed, the method intrinsically handles the interface between the coincidence set and the harmonic membrane as a free boundary. Finally, we note that multigrid and multilevel methods, which have been employed to speed up computation in other approaches such as [28], may be applicable to the proposed method as well and would be expected to further speed up the computations.

We remark that the properties of the proposed algorithm are not specifically engineered, but emerge from a disciplined derivation based on convex optimization; see Remark 2. More applications of this method, including elliptic PDE problems, will be reported in [33].