1 Model Problem

Let Ω be a convex bounded polygonal/polyhedral domain in \(\mathbb {R}^2/\mathbb {R}^3\), \(y_d\in L_2(\Omega )\), β be a positive constant, \(\psi \in H^3(\Omega )\cap W^{2,\infty }(\Omega )\) and ψ > 0 on ∂Ω. The model problem [1] is to find

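$$\displaystyle \begin{aligned} (\bar y,\bar u)=\mathop{\mathrm{argmin}}_{(y,u)\in\mathbb{K}}\,\frac{1}{2}\Big[\|y-y_d\|{}_{L_2(\Omega)}^2+\beta\|u\|{}_{L_2(\Omega)}^2\Big], \end{aligned} $$
(1)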

where \((y,u)\in H^1_0(\Omega )\times L_2(\Omega )\) belongs to \(\mathbb {K}\) if and only if

$$\displaystyle \begin{aligned}\begin{array}{r*{20}l} \int_\Omega\nabla y\cdot\nabla z\,dx&= \int_\Omega uz\,dx&\qquad &\forall\,z\in H^1_0(\Omega), {} \end{array}\end{aligned} $$
(2)
$$\displaystyle \begin{aligned}\begin{array}{r*{20}l} y&\leq\psi &\qquad &\text{a.e. on }\Omega. {} \end{array}\end{aligned} $$
(3)

Throughout this paper we will follow the standard notation for operators, function spaces and norms that can be found for example in [2, 3].

In this model problem y (resp., u) is the state (resp., control) variable, \(y_d\) is the desired state and β is a regularization parameter. Similar linear-quadratic optimization problems also appear as subproblems when general PDE constrained optimization problems are solved by sequential quadratic programming (cf. [4, 5]).

In view of the convexity of Ω, the constraint (2) implies \(y\in H^2(\Omega )\) (cf. [6,7,8]). Therefore we can reformulate (1)–(3) as follows:

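$$\displaystyle \begin{aligned} \bar y=\mathop{\mathrm{argmin}}_{y\in K}\,\frac{1}{2}\Big[\|y-y_d\|{}_{L_2(\Omega)}^2+\beta\|\Delta y\|{}_{L_2(\Omega)}^2\Big], \end{aligned} $$
(4)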

where

$$\displaystyle \begin{aligned} K=\{y\in H^2(\Omega)\cap H^1_0(\Omega): y\leq\psi \;\text{on }\Omega\}. \end{aligned} $$
(5)

Note that K is nonempty because ψ > 0 on ∂Ω. It follows from the classical theory of calculus of variations [9] that (4)–(5) has a unique solution \(\bar y\in K\) characterized by the fourth order variational inequality

$$\displaystyle \begin{aligned} a(\bar y,y-\bar y)\geq \int_\Omega y_d(y-\bar y)dx \qquad \forall\,y\in K, \end{aligned} $$
(6)

where

$$\displaystyle \begin{aligned} a(y,z)=\beta\int_\Omega (\Delta y)(\Delta z)dx+\int_\Omega yz\,dx. \end{aligned} $$
(7)
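
The passage from the minimization problem to (6) can be sketched in a few lines. The sketch assumes that the functional minimized over K equals \(J(y)=\frac {1}{2}a(y,y)-\int _\Omega y_dy\,dx\) up to an additive constant, which is consistent with (7); the symbols J, t and w are introduced here only for illustration. Since K is convex, \(\bar y+tw\in K\) for any y ∈ K, \(w=y-\bar y\) and t ∈ (0, 1], so the minimality of \(\bar y\) and the symmetry of a(⋅, ⋅) give

$$\displaystyle \begin{aligned} 0\leq J(\bar y+tw)-J(\bar y)=t\Big[a(\bar y,w)-\int_\Omega y_dw\,dx\Big]+\frac{t^2}{2}a(w,w). \end{aligned} $$

Dividing by t and letting t tend to 0 yields (6).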

Furthermore, by the Riesz-Schwartz Theorem for nonnegative linear functionals [10, 11], we can rewrite (6) as

$$\displaystyle \begin{aligned} a(\bar y,z)=\int_\Omega y_dz\,dx+\int_\Omega z\,d\mu \qquad \forall\,z\in H^2(\Omega)\cap H^1_0(\Omega), \end{aligned} $$
(8)

where

$$\displaystyle \begin{aligned} \mu\text{ is a nonpositive finite Borel measure} \end{aligned} $$
(9)

that satisfies the complementarity condition

$$\displaystyle \begin{aligned} \int_\Omega (\bar y-\psi)d\mu=0. \end{aligned} $$
(10)

Note that (10) is equivalent to the statement that

$$\displaystyle \begin{aligned} \mu\text{ is supported on }\mathcal{A}, \end{aligned} $$
(11)

where the active set \(\mathcal {A}=\{x\in \Omega :\,\bar y(x)=\psi (x)\}\) satisfies

$$\displaystyle \begin{aligned} \mathcal{A}\subset\subset \Omega \end{aligned} $$
(12)

because ψ > 0 on ∂Ω and \(\bar y=0\) on ∂Ω.

According to the elliptic regularity theory in [6,7,8, 12, 13], we have

$$\displaystyle \begin{aligned} \bar y\in H^3_{loc}(\Omega)\cap W^{2,\infty}_{loc}(\Omega)\cap H^{2+\alpha}(\Omega), \end{aligned} $$
(13)

where α ∈ (0, 1] is determined by the geometry of Ω. It then follows from (8), (11)–(13) and integration by parts that

$$\displaystyle \begin{aligned} \mu\in H^{-1}(\Omega). \end{aligned} $$
(14)

Details for (13) and (14) can be found in [14].

Remark 1

Note that (cf. [6, 15])

$$\displaystyle \begin{aligned} \int_\Omega (\Delta y)(\Delta z)dx=\int_\Omega D^2y:D^2 z\,dx\qquad \forall\,y,z\in H^2(\Omega)\cap H^1_0(\Omega), \end{aligned}$$

where \(D^2y:D^2z\) denotes the Frobenius inner product between the Hessian matrices of y and z. Therefore we can rewrite the bilinear form a(⋅, ⋅) in (7) as

$$\displaystyle \begin{aligned} a(y,z)= \beta\int_\Omega D^2y:D^2z\,dx+\int_\Omega yz\,dx. \end{aligned} $$
(15)
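
The identity in Remark 1 can be checked exactly on a simple example. The following is a minimal sketch, assuming the unit square as Ω and two polynomial test functions that vanish on its boundary; the helper names (`mul`, `add`, `diff`, `integrate`, `second`) and the particular polynomials are illustrative choices, not taken from the references.

```python
from collections import defaultdict
from fractions import Fraction

# A bivariate polynomial is stored as {(i, j): coeff}, meaning sum coeff * x^i * y^j.

def mul(p, q):
    """Product of two polynomials."""
    r = defaultdict(Fraction)
    for (i, j), a in p.items():
        for (k, l), b in q.items():
            r[(i + k, j + l)] += Fraction(a) * b
    return dict(r)

def add(p, q):
    """Sum of two polynomials."""
    r = defaultdict(Fraction, p)
    for key, b in q.items():
        r[key] += b
    return dict(r)

def diff(p, axis):
    """Partial derivative with respect to x (axis=0) or y (axis=1)."""
    r = {}
    for (i, j), a in p.items():
        e = (i, j)[axis]
        if e > 0:
            r[(i - 1, j) if axis == 0 else (i, j - 1)] = a * e
    return r

def second(p, a, b):
    """Second partial derivative along axes a and b."""
    return diff(diff(p, a), b)

def integrate(p):
    """Exact integral over the unit square (0,1)^2."""
    return sum(Fraction(a) / ((i + 1) * (j + 1)) for (i, j), a in p.items())

bx = {(1, 0): 1, (2, 0): -1}   # x(1 - x)
by = {(0, 1): 1, (0, 2): -1}   # y(1 - y)
u = mul(bx, by)                                     # vanishes on the boundary
v = mul(mul(bx, {(1, 0): 1, (0, 0): 2}),            # x(1 - x)(x + 2)
        mul(by, {(0, 1): 1, (0, 0): 1}))            # y(1 - y)(y + 1)

lap_u = add(second(u, 0, 0), second(u, 1, 1))
lap_v = add(second(v, 0, 0), second(v, 1, 1))
lhs = integrate(mul(lap_u, lap_v))                  # integral of (Lap u)(Lap v)

# Frobenius product of the Hessians: sum over all four second derivatives.
rhs = sum(integrate(mul(second(u, a, b), second(v, a, b)))
          for a in (0, 1) for b in (0, 1))

print(lhs == rhs)
```

Exact rational arithmetic is used so that the two sides can be compared without a tolerance. The check covers only one pair of functions on one domain and is not a substitute for the general result cited in Remark 1.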

2 Finite Element Methods

In the absence of the state constraint (3), we have \(K=H^2(\Omega )\cap H^1_0(\Omega )\) and (6) becomes the boundary value problem

$$\displaystyle \begin{aligned} a(\bar y,z)=\int_\Omega y_dz\,dx\qquad \forall\,z\in H^2(\Omega)\cap H^1_0(\Omega). \end{aligned} $$
(16)

Since (16) is essentially a bending problem for simply supported plates, it can be solved by many finite element methods such as (1) conforming methods, (2) classical nonconforming methods, (3) discontinuous Galerkin methods, and (4) mixed methods. For the sake of brevity, below we will consider these methods for \(\Omega \subset \mathbb {R}^2\). But all the results can be extended to three dimensions.

Let \(V_h\) be a finite element space associated with a triangulation \(\mathcal {T}_h\) of Ω. The approximate solution \(\bar y_h\in V_h\) is determined by

$$\displaystyle \begin{aligned} a_h(\bar y_h,z)=\int_\Omega y_dz\,dx\qquad \forall\,z\in V_h, \end{aligned} $$
(17)

where the choice of the bilinear form \(a_h(\cdot ,\cdot )\) depends on the type of finite element method being used.

2.1 Conforming Methods

In this case \(V_h\subset H^2(\Omega )\cap H^1_0(\Omega )\) is a \(C^1\) finite element space and we can take \(a_h(\cdot ,\cdot )\) to be a(⋅, ⋅). This class of methods includes the Bogner-Fox-Schmit element [16], the Argyris elements [17], the macro elements [18,19,20], and generalized finite elements [21,22,23].

2.2 Classical Nonconforming Methods

In this case \(V_h\subset L_2(\Omega )\) consists of finite element functions that are weakly continuous up to first order derivatives across element boundaries, and the bilinear form \(a_h(\cdot ,\cdot )\) is given by

$$\displaystyle \begin{aligned} a_h(y,z)=\beta\sum_{T\in\mathcal{T}_h}\int_T D^2y:D^2z\,dx +\int_\Omega yz\,dx. \end{aligned} $$
(18)

Here we are using the piecewise version of (15), which provides better local control of the nonconforming energy norm \(\|\cdot \|{ }_{a_h}=\sqrt {a_h(\cdot ,\cdot )}\).

This class of methods includes the Adini element [24], the Zienkiewicz element [25], the Morley element [26], the Fraeijs de Veubeke element [27], and the incomplete biquadratic element [28].

2.3 Discontinuous Galerkin Methods

In this case \(V_h\) consists of functions that are totally discontinuous or only discontinuous in the normal derivatives across element boundaries, and stabilization terms are included in the bilinear form \(a_h(\cdot ,\cdot )\). The simplest choice is a Lagrange finite element space \(V_h\subset H^1_0(\Omega )\), resulting in the \(C^0\) interior penalty methods [29,30,31], where the bilinear form \(a_h(\cdot ,\cdot )\) is given by

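$$\displaystyle \begin{aligned} a_h(y,z)&=\beta\Big[\sum_{T\in\mathcal{T}_h}\int_T D^2y:D^2z\,dx +\sum_{e\in\mathcal{E}_h^i}\int_e\Big(\{\!\!\{\partial^2y/\partial n^2\}\!\!\}[\![\partial z/\partial n]\!] +\{\!\!\{\partial^2z/\partial n^2\}\!\!\}[\![\partial y/\partial n]\!]\Big)ds\\ &\qquad +\sigma\sum_{e\in\mathcal{E}_h^i}|e|^{-1}\int_e[\![\partial y/\partial n]\!][\![\partial z/\partial n]\!]\,ds\Big]+\int_\Omega yz\,dx. \end{aligned} $$
(19)

Here \(\mathcal {E}_h^i\) is the set of the interior edges of \(\mathcal {T}_h\), \(\{\!\!\{\partial ^2y/\partial n^2\}\!\!\}\) (resp., \([\![\partial y/\partial n]\!]\)) is the average (resp., jump) of the second (resp., first) normal derivative of y across the edge e, |e| is the length of the edge e, and σ is a (sufficiently large) penalty parameter.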

Other discontinuous Galerkin methods for fourth order problems can be found in [32,33,34].

2.4 Mixed Methods

In this case \(V_h\subset H^1_0(\Omega )\) is a Lagrange finite element space. The approximate solution \(\bar y_h\) is determined by

$$\displaystyle \begin{aligned}\begin{array}{r*{20}l} \int_\Omega \bar y_hz\,dx+\beta\int_\Omega \nabla \bar u_h\cdot\nabla z\,dx& =\int_\Omega y_d z\,dx &\qquad &\forall\,z\in V_h,{} \end{array}\end{aligned} $$
(20)
$$\displaystyle \begin{aligned}\begin{array}{r*{20}l} \int_\Omega\nabla \bar y_h\cdot\nabla v\,dx-\int_\Omega \bar u_h v\,dx&=0 &\qquad &\forall\,v\in V_h.{} \end{array}\end{aligned} $$
(21)

By eliminating \(\bar u_h\) from (20)–(21), we can recast \(\bar y_h\) as the solution of (17) where

$$\displaystyle \begin{aligned} a_h(y,z)=\beta\int_\Omega (\Delta_h y)(\Delta_h z)\,dx+\int_\Omega yz\,dx, \end{aligned} $$
(22)

and the discrete Laplace operator \(\Delta _h:V_h\longrightarrow V_h\) is defined by

$$\displaystyle \begin{aligned} \int_\Omega (\Delta_h y)z\,dx=-\int_\Omega\nabla y\cdot\nabla z\,dx \qquad \forall\,y,z\in V_h.\end{aligned} $$
(23)
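
The elimination of \(\bar u_h\) can be spelled out as follows; this is a short sketch that uses only (20), (21) and (23). Taking \(y=\bar y_h\) in (23) and comparing with (21), we have

$$\displaystyle \begin{aligned} \int_\Omega \bar u_hv\,dx=\int_\Omega\nabla\bar y_h\cdot\nabla v\,dx=-\int_\Omega(\Delta_h\bar y_h)v\,dx \qquad \forall\,v\in V_h, \end{aligned} $$

so that \(\bar u_h=-\Delta _h\bar y_h\) because both functions belong to \(V_h\). Applying (23) once more, now to the pair \((z,\bar u_h)\), we find

$$\displaystyle \begin{aligned} \beta\int_\Omega\nabla\bar u_h\cdot\nabla z\,dx=-\beta\int_\Omega(\Delta_hz)\bar u_h\,dx=\beta\int_\Omega(\Delta_h\bar y_h)(\Delta_hz)\,dx \qquad \forall\,z\in V_h, \end{aligned} $$

and substituting this relation into (20) gives (17) with the bilinear form (22).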

2.5 Finite Element Methods for the Optimal Control Problem

With the finite element methods for (16) in hand, we can now simply discretize the variational inequality (6) as follows: Find \(\bar y_h\in V_h\) such that

$$\displaystyle \begin{aligned} a_h(\bar y_h,y-\bar y_h)\geq \int_\Omega y_d(y-\bar y_h)dx\qquad \forall\,y\in K_h, \end{aligned} $$
(24)

where

$$\displaystyle \begin{aligned} K_h=\{y\in V_h:\,I_h y\leq I_h \psi\;\text{on }\Omega\}, \end{aligned} $$
(25)

and \(I_h\) is the nodal interpolation operator for the conforming \(P_1\) finite element space associated with \(\mathcal {T}_h\). In other words, the constraint (3) is only imposed at the vertices of \(\mathcal {T}_h\).

Remark 2

Conforming, nonconforming, C 0 interior penalty and mixed methods for (6) were investigated in [14, 35,36,37,38,39,40,41].

3 Convergence Analysis

For simplicity, we will only provide details for the case of conforming finite element methods and briefly describe the extensions to other methods at the end of the section.

For conforming finite element methods, we have a h(⋅, ⋅) = a(⋅, ⋅) and the energy norm \(\|\cdot \|{ }_a=\sqrt {a(\cdot ,\cdot )}\) satisfies, by a Poincaré-Friedrichs inequality [42],

$$\displaystyle \begin{aligned} \|v\|{}_a\approx \|v\|{}_{H^2(\Omega)} \qquad \forall\,v\in H^2(\Omega). \end{aligned} $$
(26)

Our goal is to show that

$$\displaystyle \begin{aligned} \|\bar y-\bar y_h\|{}_a\leq Ch^\alpha, \end{aligned} $$
(27)

where α is the index of elliptic regularity that appears in (13).

We assume (cf. [43]) that there exists an operator \(\Pi _h:H^2(\Omega )\cap H^1_0(\Omega )\longrightarrow V_h\) such that

$$\displaystyle \begin{aligned} \Pi_h\zeta=\zeta \quad \text{at the vertices of }\mathcal{T}_h \end{aligned} $$
(28)

and

$$\displaystyle \begin{aligned} \|\zeta-\Pi_h\zeta\|{}_{L_2(\Omega)}+h|\zeta-\Pi_h\zeta|{}_{H^1(\Omega)} +h^2|\zeta-\Pi_h\zeta|{}_{H^2(\Omega)}\leq Ch^{2+\alpha}|\zeta|{}_{H^{2+\alpha}(\Omega)} \end{aligned} $$
(29)

for all \(\zeta \in H^{2+\alpha }(\Omega )\cap H^1_0(\Omega )\), where \(h=\max _{T\in \mathcal {T}_h}\text{diam}\,T\) is the mesh size of the triangulation \(\mathcal {T}_h\). Here and below we use C to denote a generic positive constant independent of h.

In particular (5), (25) and (28) imply

$$\displaystyle \begin{aligned} \Pi_h\text{ maps }K\text{ into }K_h. \end{aligned} $$
(30)

Therefore K h is nonempty and the discrete problem defined by (24)–(25) has a unique solution.
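
The reasoning behind (30) is short and can be sketched as follows. Let y ∈ K; note that y is continuous on \(\bar \Omega \) because \(H^2(\Omega )\) is embedded in \(C(\bar \Omega )\) in two and three dimensions. For any vertex p of \(\mathcal {T}_h\), (28) and (5) give

$$\displaystyle \begin{aligned} (I_h\Pi_hy)(p)=(\Pi_hy)(p)=y(p)\leq\psi(p)=(I_h\psi)(p). \end{aligned} $$

Since the restriction of a \(P_1\) finite element function to an element is a convex combination of its vertex values, it follows that \(I_h\Pi _hy\leq I_h\psi \) on Ω, i.e., \(\Pi _hy\in K_h\) by (25).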

We will also use the following standard properties of the interpolation operator \(I_h\) (cf. [2, 3]):

$$\displaystyle \begin{aligned}\begin{array}{r*{20}l} \|\zeta- I_h\zeta\|{}_{L_\infty(T)}&\leq Ch_T^2|\zeta|{}_{W^{2,\infty}(T)}&\qquad & \forall\,\zeta\in W^{2,\infty}(T),\, T\in \mathcal{T}_h, {} \end{array}\end{aligned} $$
(31)
$$\displaystyle \begin{aligned}\begin{array}{r*{20}l} |\zeta-I_h\zeta|{}_{H^1(T)}&\leq Ch_T|\zeta|{}_{H^2(T)}&\qquad &\forall\,\zeta\in H^2(T),\, T\in \mathcal{T}_h, {} \end{array}\end{aligned} $$
(32)

where h T is the diameter of T.
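
The second order behavior in (31) is easy to observe numerically. The following is a minimal sketch in a one-dimensional setting, which is an assumption made for brevity (the estimates above concern triangulations in two and three dimensions); the function `interp_error`, the test function sin(πx) and the sampling density are illustrative choices.

```python
import math

def interp_error(f, n, samples=20):
    """Sampled maximum of |f - I_h f| on [0, 1], where I_h f is the
    piecewise linear interpolant of f on n cells of equal length h = 1/n."""
    h = 1.0 / n
    err = 0.0
    for k in range(n):
        a = k * h
        fa, fb = f(a), f(a + h)
        for s in range(samples + 1):
            t = s / samples                      # local coordinate in [0, 1]
            x = a + t * h
            err = max(err, abs(f(x) - ((1.0 - t) * fa + t * fb)))
    return err

f = lambda x: math.sin(math.pi * x)

# Errors on successively refined meshes and the observed convergence rates.
errors = {n: interp_error(f, n) for n in (8, 16, 32, 64)}
rates = [math.log2(errors[n] / errors[2 * n]) for n in (8, 16, 32)]

for n in sorted(errors):
    print(n, errors[n])
print(rates)
```

Halving h should reduce the error by a factor close to four, so the observed rates should be close to 2, in agreement with the factor \(h_T^2\) in (31).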

We begin with the estimate

$$\displaystyle \begin{aligned} \|\bar y-\bar y_h\|{}_a^2&=a(\bar y-\bar y_h,\bar y-\bar y_h) \\ &=a(\bar y-\bar y_h,\bar y-\Pi_h\bar y)+a(\bar y,\Pi_h\bar y-\bar y_h)- a(\bar y_h,\Pi_h\bar y-\bar y_h)\\ &\leq C_1\|\bar y-\bar y_h\|{}_ah^\alpha+\Big[a(\bar y,\Pi_h\bar y-\bar y_h) -\int_\Omega y_d(\Pi_h\bar y-\bar y_h)dx\Big] \end{aligned} $$
(33)

that follows from (13), (24), (26), (29), (30) and the Cauchy-Schwarz inequality.

Remark 3

Note that an estimate analogous to (33) also appears in the error analysis for the boundary value problem (16). Indeed the second term on the right-hand side of (33) vanishes in the case of (16) and we would have arrived at the desired estimate \(\|\bar y-\bar y_h\|{ }_a\leq Ch^\alpha \).

The idea now is to show that

$$\displaystyle \begin{aligned} a(\bar y,\Pi_h\bar y-\bar y_h) -\int_\Omega y_d(\Pi_h\bar y-\bar y_h)dx \leq C_2\big[h^{2\alpha}+h^\alpha\|\bar y-\bar y_h\|{}_a\big], \end{aligned} $$
(34)

which together with (33) implies

$$\displaystyle \begin{aligned} \|\bar y-\bar y_h\|{}_a^2\leq C_3h^\alpha\|\bar y-\bar y_h\|{}_a+C_2 h^{2\alpha}. \end{aligned} $$
(35)

The estimate (27) then follows from (35) and the inequality

$$\displaystyle \begin{aligned} ab\leq \frac{\epsilon}{2}a^2+\frac{1}{2\epsilon}b^2 \end{aligned}$$

that holds for any positive 𝜖.
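
For instance, writing \(X=\|\bar y-\bar y_h\|{ }_a\) and applying this inequality to the product of \(X\) and \(C_3h^\alpha \) with 𝜖 = 1, we obtain from (35)

$$\displaystyle \begin{aligned} X^2\leq \frac{1}{2}X^2+\frac{1}{2}C_3^2h^{2\alpha}+C_2h^{2\alpha}, \end{aligned} $$

and hence \(X^2\leq (C_3^2+2C_2)h^{2\alpha }\), which is (27) with \(C=(C_3^2+2C_2)^{1/2}\). The choice 𝜖 = 1 is only one convenient option; any fixed positive 𝜖 that allows the quadratic term to be absorbed would do.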

Let us turn to the derivation of (34). Since \(K_h\subset V_h\subset H^2(\Omega )\cap H^1_0(\Omega )\), we have, according to (8),

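$$\displaystyle \begin{aligned} a(\bar y,\Pi_h\bar y-\bar y_h)-\int_\Omega y_d(\Pi_h\bar y-\bar y_h)dx&=\int_\Omega(\Pi_h\bar y-\bar y_h)\,d\mu\\ &=\int_\Omega(\Pi_h\bar y-\bar y)\,d\mu+\int_\Omega(\bar y-\psi)\,d\mu+\int_\Omega(\psi-I_h\psi)\,d\mu\\ &\qquad+\int_\Omega(I_h\psi-I_h\bar y_h)\,d\mu+\int_\Omega(I_h\bar y_h-\bar y_h)\,d\mu, \end{aligned} $$
(36)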

and, in view of (9), (10) and (25),

$$\displaystyle \begin{aligned} \int_\Omega (\bar y-\psi)d\mu=0 \quad \text{and} \quad \int_\Omega (I_h\psi-I_h\bar y_h)d\mu\leq 0. \end{aligned} $$
(37)

We can estimate the other three integrals on the right-hand side of (36) as follows:

$$\displaystyle \begin{aligned} \int_\Omega (\Pi_h\bar y-\bar y)d\mu\leq \|\mu\|{}_{H^{-1}(\Omega)} \|\Pi_h\bar y-\bar y\|{}_{H^1(\Omega)}\leq Ch^{1+\alpha} \end{aligned} $$
(38)

by (13), (14) and (29);

$$\displaystyle \begin{aligned} \int_\Omega (\psi-I_h\psi)d\mu\leq |\mu(\Omega)|\|\psi-I_h\psi\|{}_{L_\infty(\Omega)} \leq Ch^2 \end{aligned} $$
(39)

by (9) and (31);

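$$\displaystyle \begin{aligned} \int_\Omega(I_h\bar y_h-\bar y_h)\,d\mu&=\int_\Omega\big[I_h(\bar y_h-\bar y)-(\bar y_h-\bar y)\big]d\mu+\int_\Omega(I_h\bar y-\bar y)\,d\mu\\ &\leq\|\mu\|{}_{H^{-1}(\Omega)}\|I_h(\bar y_h-\bar y)-(\bar y_h-\bar y)\|{}_{H^1(\Omega)}+|\mu(\Omega)|\|I_h\bar y-\bar y\|{}_{L_\infty(\mathcal{A})}\\ &\leq C\big(h\|\bar y-\bar y_h\|{}_a+h^2\big) \end{aligned} $$
(40)

by (11)–(14), (26), (31) and (32).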

The estimate (34) follows from (36)–(40) and the fact that α ≤ 1.

The estimate (27) can be extended to the other finite element methods in Sect. 2 provided \(\|\cdot \|{ }_a\) is replaced by \(\|\cdot \|{ }_{a_h}=\sqrt {a_h(\cdot ,\cdot )}\).

For classical nonconforming finite element methods and discontinuous Galerkin methods, the key ingredient for the convergence analysis, in addition to an operator \(\Pi _h:H^2(\Omega )\cap H^1_0(\Omega )\longrightarrow V_h\) that satisfies (28) and (29), is the existence of an enriching operator \(E_h:V_h\longrightarrow H^2(\Omega )\cap H^1_0(\Omega )\) with the following properties:

$$\displaystyle \begin{aligned} &(E_hv)(p)=v(p) \quad \text{for all vertices }p\text{ of }\mathcal{T}_h,{} \end{aligned} $$
(41)
(42)
$$\displaystyle \begin{aligned} &\|\zeta-E_h\Pi_h\zeta\|{}_{H^1(\Omega)}\leq Ch^{1+\alpha}\|\zeta\|{}_{H^{2+\alpha}(\Omega)} \qquad \forall\,\zeta\in H^{2+\alpha}(\Omega)\cap H^1_0(\Omega),{} \end{aligned} $$
(43)
$$\displaystyle \begin{aligned} &|a_h(\Pi_h\zeta,v)-a(\zeta,E_hv)|\leq Ch^\alpha\|\zeta\|{}_{H^{2+\alpha}(\Omega)}\|v\|{}_h {} \end{aligned} $$
(44)

for all \(\zeta \in H^{2+\alpha }(\Omega )\cap H^1_0(\Omega )\) and v ∈ V h.

Property (41) is related to the fact that the discrete constraints are imposed at the vertices of \(\mathcal {T}_h\); property (42) indicates that in some sense \(\|v-E_hv\|{ }_h\) measures the distance between \(V_h\) and \(H^2(\Omega )\cap H^1_0(\Omega )\); property (43) means that \(E_h\Pi _h\) behaves like a quasi-local interpolation operator; property (44) states that \(E_h\) is essentially the adjoint of \(\Pi _h\) with respect to the continuous and discrete bilinear forms. The idea is to use (42) and (44) to reduce the error estimate to the continuous level, and then the error analysis can proceed as in the case of conforming finite element methods by using (41) and (43). Details can be found in [44].

Remark 4

The operator \(E_h\) maps \(V_h\) to a conforming finite element space and its construction is based on averaging. The history of using such enriching operators to handle nonconforming finite element methods is discussed in [45].

In the case of the mixed method where \(V_h\subset H^1_0(\Omega )\) is a Lagrange finite element space, the operator \(E_h:V_h\longrightarrow H^2(\Omega )\cap H^1_0(\Omega )\) is defined by

$$\displaystyle \begin{aligned} \int_\Omega \nabla E_hv\cdot\nabla w\,dx=\int_\Omega \nabla v\cdot\nabla w\,dx \qquad \forall v\in V_h,\,w\in H^1_0(\Omega). \end{aligned} $$
(45)

The properties (42)–(44) remain valid provided Πh is replaced by the Ritz projection operator \(R_h:H^1_0(\Omega )\longrightarrow V_h\) defined by

$$\displaystyle \begin{aligned} \int_\Omega\nabla R_h\zeta\cdot\nabla v\,dx=\int_\Omega\nabla\zeta\cdot\nabla v\,dx \qquad \forall\,v\in V_h. \end{aligned} $$
(46)

In fact (45) and (46) imply ζ − E hR hζ = 0 and property (43) becomes trivial. However the properties (28) and (41) no longer hold, which necessitates the use of the more sophisticated interior error estimates (cf. [46]) in the convergence analysis. Details can be found in [14].

Remark 5

Since the elliptic regularity index α in (13) is determined by the singularity of the Laplace equation near the boundary of Ω, various finite element techniques [47, 48] can be employed to improve the estimate (27) to

$$\displaystyle \begin{aligned} \|\bar y-\bar y_h\|{}_{a_h}\leq Ch. \end{aligned} $$
(47)

One can also compute an approximation \(\bar u_h\) for the optimal control \(\bar u\) from the approximate optimal state \(\bar y_h\) through post-processing procedures [49].

Remark 6

The discrete problems generated by the finite element methods in Sect. 2, which only involve simple box constraints, can be solved efficiently by a primal-dual active set algorithm [50,51,52].
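
To illustrate the structure of a primal-dual active set iteration, here is a minimal sketch for a box-constrained quadratic program of the form: minimize ½ yᵀAy − fᵀy subject to y ≤ ψ componentwise. Everything in it is an assumption made for illustration: the matrix A is a one-dimensional finite-difference analogue of (22) rather than one of the finite element discretizations in Sect. 2, the parameters n and β are chosen so that the toy problem is easy to check by hand, and the names `solve`, `matvec` and `pdas` are hypothetical. The multiplier `lam` plays the role of −μ in (8).

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            m = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= m * M[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def pdas(A, f, psi, c=1.0, max_iter=50):
    """Primal-dual active set iteration for min 1/2 y'Ay - f'y s.t. y <= psi.
    Returns (y, lam, active, converged)."""
    n = len(f)
    y = solve(A, f)                 # start from the unconstrained minimizer
    lam = [0.0] * n
    active = None
    for _ in range(max_iter):
        new_active = [lam[i] + c * (y[i] - psi[i]) > 0 for i in range(n)]
        if new_active == active:    # active set unchanged: KKT system holds
            return y, lam, active, True
        active = new_active
        inactive = [i for i in range(n) if not active[i]]
        # Enforce y = psi on the active set, then solve for the inactive part.
        y = [psi[i] if active[i] else 0.0 for i in range(n)]
        if inactive:
            fixed = matvec(A, y)    # contribution of the active values
            rhs = [f[i] - fixed[i] for i in inactive]
            A_II = [[A[i][j] for j in inactive] for i in inactive]
            for i, val in zip(inactive, solve(A_II, rhs)):
                y[i] = val
        r = matvec(A, y)
        lam = [f[i] - r[i] if active[i] else 0.0 for i in range(n)]
    return y, lam, active, False

# Toy data: A = I + (beta / h^4) T^2 with T = tridiag(-1, 2, -1).
n, beta = 7, 1.0e-5
h = 1.0 / (n + 1)
T = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
T2 = [[sum(T[i][k] * T[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
A = [[(1.0 if i == j else 0.0) + beta / h ** 4 * T2[i][j] for j in range(n)]
     for i in range(n)]
y_d = [math.sin(math.pi * (i + 1) * h) for i in range(n)]   # desired state
psi = [0.5] * n                                             # upper bound

y, lam, active, converged = pdas(A, y_d, psi)
print(converged, active)
```

The iteration stops when the active set no longer changes. At that point y ≤ ψ, the multiplier is nonnegative and the complementarity condition holds, which is the discrete counterpart of (8)–(10). Termination is not guaranteed for arbitrary matrices, which is why the sketch carries an iteration cap and returns a convergence flag.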

4 Concluding Remarks

In this paper finite element methods for elliptic distributed optimal control problems with pointwise state constraints are treated from the perspective of finite element methods for the boundary value problem of simply supported plates.

The discussion in Sect. 2 shows that one can solve elliptic distributed optimal control problems with pointwise state constraints by a straightforward adaptation of many finite element methods for simply supported plates. The convergence analysis in Sect. 3 demonstrates that the gap between the finite element analysis for boundary value problems and the finite element analysis for elliptic optimal control problems is in fact quite narrow. Thus the vast arsenal of finite element techniques developed for elliptic boundary value problems over several decades can be applied to elliptic optimal control problems with only minor modifications.

Note that in the traditional approach to elliptic optimal control problems, the optimal control \(\bar u\) is treated as the primary unknown and the resulting finite element methods in [35, 39] are equivalent to the method defined by (24), where the bilinear form is given by (22). Therefore the approach based on the reformulation (4)–(5) expands the scope of finite element methods for elliptic optimal control problems from a special class of methods (i.e., mixed methods) to all classes of methods. In addition to the finite element methods mentioned in Sect. 2, one can also consider recently developed finite element methods for fourth order problems on polytopal meshes [53,54,55,56,57,58,59,60].

The new approach has been extended to problems with the Neumann boundary condition [61, 62] and to problems with pointwise constraints on both control and state [63]. It has also been extended to problems on nonconvex domains [14, 62, 64].

Below are some open problems related to the finite element methods presented in Sect. 2.

  1.

    It follows from the error estimates (27) and (47) that

    $$\displaystyle \begin{aligned} \|\bar y-\bar y_h\|{}_{H^1(\Omega)}+\|\bar y-\bar y_h\|{}_{L_\infty(\Omega)} \leq Ch^\gamma, \end{aligned} $$
    (48)

    where γ = α (without special treatment) or 1 (with special treatments). For conforming or mixed finite element methods, the estimate (48) is a direct consequence of the fact that the energy norm is equivalent to the \(H^2(\Omega )\) norm and that we have the Sobolev inequality

    $$\displaystyle \begin{aligned} \|\zeta\|{}_{L_\infty(\Omega)}\leq C\|\zeta\|{}_{H^2(\Omega)}. \end{aligned}$$

    For classical nonconforming and discontinuous Galerkin methods, the estimate (48) follows from the Poincaré-Friedrichs inequality and Sobolev inequality for piecewise \(H^2\) functions in [65, 66].

    Compared to \(\|\cdot \|{ }_{H^2(\Omega )}\), the norms \(\|\cdot \|{ }_{H^1(\Omega )}\) and \(\|\cdot \|{ }_{L_\infty (\Omega )}\) are lower order norms and, based on experience with finite element methods for the boundary value problem (16), the convergence in \(\|\cdot \|{ }_{H^1(\Omega )}\) and \(\|\cdot \|{ }_{L_\infty (\Omega )}\) should be of higher order; this is indeed observed in numerical experiments. But a theoretical justification for the observed higher order convergence is missing. In the case of the boundary value problem (16), one can show higher order convergence for lower order norms through a duality argument. However, duality arguments do not work for variational inequalities even in one dimension [67]. New ideas are needed.

  2.

    An interesting phenomenon concerning fourth order variational inequalities is that a posteriori error estimators originally designed for fourth order boundary value problems can be directly applied to fourth order variational inequalities [61, 68]. This is different from the second order case where a posteriori error estimators for boundary value problems are not directly applicable to variational inequalities. This difference is essentially due to the fact that Dirac point measures belong to \(H^{-2}(\Omega )\) but not \(H^{-1}(\Omega )\).

    Optimal convergence of these adaptive finite element methods has been observed in numerical experiments. However, proofs of convergence and optimality are missing.

  3.

    The development of fast solvers for fourth order variational inequalities is an almost completely open area. Some recent work on additive Schwarz preconditioners for the subsystems that appear in the primal-dual active set algorithm can be found in [69, 70]. Much remains to be done.