Keywords

1 Preliminaries on the Continuous Model and Problem Formulation

Consider the 1D wave equation with non-homogeneous boundary conditions:

$$ \begin{cases}y_{tt}(x,t)-y_{xx}(x,t)=0,\quad x\in(0,1),\ t>0,\\ y(0,t)=0,\qquad y(1,t)=v(t),\quad t>0,\\ y(x,0)=y^0(x),\qquad y_t(x,0)=y^1(x),\quad x\in(0,1). \end{cases} $$
(1)

System (1) is said to be exactly controllable in time T≥2 if, for all (y 0,y 1)∈L 2×H −1(0,1), there exists a control function vL 2(0,T) such that the solution of (1) can be driven to rest at time T, i.e. y(x,T)=y t (x,T)=0.

We also introduce the adjoint 1D wave equation with homogeneous boundary conditions:

$$ \begin{cases}u_{tt}(x,t)-u_{xx}(x,t)=0,\quad x\in(0,1),\ t>0,\\ u(0,t)=u(1,t)=0,\quad t>0,\\ u(x,T)=u^0(x),\qquad u_t(x,T)=u^1(x),\quad x\in(0,1). \end{cases} $$
(2)

This system is well known to be well posed in the energy space \(\mathcal{V}:=H_{0}^{1}\times L^{2}(0,1)\) and the energy below is conserved in time:

$$\mathcal{E}\bigl(u^0,u^1\bigr)=\frac{1}{2}\bigl( \bigl\|u(\cdot,t)\bigr\|_{H_0^1}^2+\bigl\| u_t(\cdot,t) \bigr\|_{L^2}^2\bigr)=\frac{1}{2}\bigl(\bigl\|u^0 \bigr\|_{H_0^1}^2+\bigl\|u^1\bigr\|_{L^2}^2 \bigr). $$

The Hilbert Uniqueness Method (HUM) introduced in [8] allows showing that the property above of exact controllability for (1) is equivalent to the boundary observability property of (2). The observability property ensures that the following observability inequality holds for all solutions of (2), provided T≥2:

$$ \mathcal{E}\bigl(u^0,u^1\bigr)\leq C(T)\int _0^T\bigl|u_x(1,t)\bigr|^2\,dt. $$
(3)

The best constant C(T) in (3) is the so-called observability constant. The observability time T has to be larger than the characteristic one, T :=2, needed by any initial data (u 0,u 1) supported in a very narrow neighborhood of x=1 to travel along the characteristic rays parallel to x(t)=xt, touch the boundary x=0 and bounce back to the boundary x=1 along the characteristics parallel to x(t)=x+t.

The HUM control v, the one of minimal L 2(0,T)-norm, for which the solution of (1) fulfills y(x,T)=y t (x,T)=0, has the explicit form

$$ v(t)=\tilde{v}(t):=\tilde{u}_x(1,t), $$
(4)

where \(\tilde{u}(x,t)\) is the solution of (2) corresponding to the minimum \((\tilde{u}^{0},\tilde{u}^{1})\in\mathcal {V}\) of the quadratic functional

$$ \mathcal{J}\bigl(u^0,u^1\bigr)=\frac{1}{2}\int _0^T\bigl|u_x(1,t)\bigr|^2\,dt- \bigl\langle\bigl(y^1,-y^0\bigr),\bigl(u( \cdot,0),u_t(\cdot,0)\bigr)\bigr\rangle_{\mathcal {V}',\mathcal{V}}. $$
(5)

Here, \(\mathcal{V}'=H^{-1}\times L^{2}(0,1)\) and \(\langle \cdot,\cdot\rangle_{\mathcal{V}',\mathcal{V}}\) is the duality product between \(\mathcal{V}\) and \(\mathcal{V}'\).

In this paper, in order to analyze the efficiency of the various models under consideration, we shall run the simulations on a specific example. We consider the particular case of the characteristic control time T=2 and of initial data (y 0,y 1) in (1) given by y 1≡0 and the Heaviside function H as initial position:

$$ y^0(x)=H(x):= \begin{cases}1,&x\in[0,1/2),\\-1,&x\in[1/2,1]. \end{cases} $$
(6)

The initial position, having discontinuities, involves significant high frequency components that will be the source of instabilities for the numerical methods under consideration. This example allows us to highlight the high-frequency pathologies of the numerical approximations of the controlled wave problem (1) and the effects of the filtering techniques we propose. In this particular case, the HUM control can be explicitly computed by Fourier expansions, using the periodicity with time period τ=2 of the solutions (cf. Sect. 3.3 in [3]), and it is given by (see Fig. 1):

$$ \tilde{v}(t)=\tilde{v}_H(t)= \begin{cases}-1/2,&t\in(0,1/2]\cup(1,3/2],\\ 1/2,&t\in(1/2,1]\cup(3/2,2). \end{cases} $$
(7)
Fig. 1
figure 1

The initial position H(x) (left) versus the HUM control \(\tilde{v}_{H}\) (middle) versus the solution y of the control problem (1) (right) (red=1, orange=1/2, green=0, cyan=−1/2, and blue=−1)

The discrete approach to the numerical approximation of this kind of control problems has been intensively studied during the last years, starting from some simple models on uniform meshes like finite differences or linear finite element methods in [7] and, more recently, more complex schemes like the discontinuous Galerkin ones in [10]. The problem consists in analyzing whether the controls of a numerical approximation scheme of (1), obtained in a similar manner, i.e., by minimizing a suitable discrete version of (5), converge to the control v of the wave equation (1) as the mesh size parameter tends to zero. In all these cases, the convergence of the approximation scheme in the classical sense of the numerical analysis does not suffice to guarantee that the sequence of discrete controls converges to the continuous ones, as one could expect. This is due to the fact that there are classes of initial data for the discrete adjoint problem generating high-frequency wave packets propagating at a very low group velocity and that, consequently, cannot be observed from the boundary of the domain during a finite time uniformly with the mesh size parameter. This leads to the divergence of the discrete observability constant as the mesh size tends to zero.

Similar high-frequency pathological phenomena have also been observed for numerical approximation schemes of other models, like the linear Schrödinger equation (cf. [5]), in which one is also interested in the uniformity of the so-called dispersive estimates, which play an important role in the study of the well-posedness of some nonlinear models.

The rest of the paper is organized as follows. In Sect. 2, we summarize some well-known results on the boundary controllability of the classical finite element space semi-discretizations, especially the linear and the quadratic ones, emphasizing the high frequency pathologies and their remedies based on the bi-grid algorithm. In Sect. 3, we present in detail the implementation of the conjugate gradient algorithm giving the numerical HUM controls, together with its two-grid adaptation, and we show some numerical results to illustrate the validity of the theoretical ones. In Sect. 4, we will summarize the conclusions of our paper and some related open problems.

Before starting, let us give some basic notation. All vectors we deal with will be considered as being column vectors and will be denoted by bold capital letters. We will use capital letters for the components of the vectors and for matrices and calligraphic capital letters for the discrete spaces. We denote: by h—the mesh size and it will be the first superscript; by p—the degree of the numerical approximation and it will be the first subscript; by the superscript ∗—the transposition of a matrix; and by the overline symbol—the complex conjugation.

2 Preliminaries on Numerical Controls Using P 1 and P 2 Finite Element Approximations

Let us now introduce the quadratic P 2 finite element approximation method and recall the main existent results, taken essentially from [11]. We consider N∈ℕ, h=1/(N+1), and 0=x 0<x j <x N+1=1 to be the nodes of a uniform grid of the interval [0,1], with x j =jh, 0≤jN+1, constituted by the subintervals I j =(x j ,x j+1), with 0≤jN. On this grid, we also define the midpoints x j+1/2=(j+1/2)h, with 0≤jN. Let us introduce the space \(\mathcal{P}_{p}(a,b)\) of polynomials of order p on the interval (a,b) and the space of piecewise quadratic and continuous functions \(\mathcal{U}^{h}_{2}:=\{u\in H_{0}^{1}(0,1)\mbox{ s.t. }u|_{I_{j}}\in\mathcal {P}_{2}(I_{j}),\ 0\leq j\leq N\}\). The space \(\mathcal{U}^{h}_{2}\) can be written as

$$\mathcal{U}^h_2=\mbox{span}\bigl\{\phi_{2,j}^h,1 \leq j\leq N\bigr\}\oplus\mbox{span}\bigl\{\phi_{2,j+1/2}^h,0 \leq j\leq N\bigr\}, $$

where the two classes of basis functions are represented in Fig. 2 and are explicitly given by

(8)

with Q(x,a,b)=(xa)(xb) and [f]+—the positive part of f.

Fig. 2
figure 2

The basis functions: \(\phi_{2,j}^{h}\) (left), \(\phi _{2,j+1/2}^{h}\) (middle), and \(\phi_{1,j}^{h}\) (right)

We will compare the results obtained when numerically approximating the controls on this basis with the ones obtained by the linear P 1 finite element approximation. In order to do this, in the same uniform grid of size h defined by the nodal points x j , 0≤jN+1, we introduce the space of piecewise linear and continuous functions \(\mathcal{U}^{h}_{1}:=\{u\in H_{0}^{1}(0,1)\mbox{ s.t. }u|_{I_{j}}\in\mathcal {P}_{1}(I_{j}),\ 0\leq j\leq N\}\), which can be written as \(\mathcal{U}^{h}_{1}=\mbox{span}\{\phi_{1,j}^{h},1\leq j\leq N\}\), where \(\phi_{1,j}^{h}(x)=[1-(x-x_{j})/h]^{+}\). The linear/quadratic approximation of the adjoint problem (2) is

$$ \begin{cases}\mbox{Find }u^h_p(\cdot,t)\in\mathcal{U}^h_p\mbox{ s.t. }\frac{d^2}{dt^2}( u^h_p(\cdot,t), \varphi)_{L^2} +(u^h_p(\cdot,t),\varphi)_{H_0^1}=0,\quad\forall\varphi\in \mathcal {U}^h_p,\\ u^h_p(x,T)=u^{h,0}_p(x),\qquad u^h_{p,t}(x,T)=u^{h,1}_p(x),\quad x\in(0,1). \end{cases} $$
(9)

The solution \(u^{h}_{p}(\cdot,t)\in\mathcal{U}^{h}_{p}\) admits the decomposition

$$u^h_p(x,t)=\sum _{j=1}^{pN+p-1}U_{p,j/p}(t)\phi_{p,j/p}^h(x). $$

Consequently, the function \(u^{h}_{p}(\cdot,t)\) can be identified with the vector of its coefficients, \(\mathbf{U}^{h}_{p}(t)=(U_{p,j/p}(t))_{1\leq j\leq pN+p-1}\). Thus, problem (9) can be written as a system of second-order linear ordinary differential equations (ODEs):

$$ M^h_p\mathbf{U}^h_{p,tt}(t)+S^h_p \mathbf{U}^h_p(t)=0,\quad \mathbf{U}^h_p(T)= \mathbf{U}^{h,0}_p,\quad \mathbf{U}^h_{p,t}(T)= \mathbf{U}^{h,1}_p,\ p=1,2, $$
(10)

where \(M^{h}_{1}\) and \(S^{h}_{1}\) are the following N×N tri-diagonal mass and stiffness matrices

and \(M^{h}_{2}\) and \(S^{h}_{2}\) are the following (2N+1)×(2N+1) pentha-diagonal mass and stiffness matrices

and

For p=1,2 (corresponding to the linear/quadratic approximation), let us introduce some notations for the discrete analogues of \(H_{0}^{1}(0,1)\), L 2(0,1), and H −1(0,1),

The elements of \(\mathcal{H}^{h,1}_{p}\) verify the additional requirement F p,0=F p,N+1=0. The inner products defining the discrete spaces \(\mathcal{H}^{h,i}_{p}\), i=1,0,−1, are given by

$$ \bigl(\mathbf{E}^h_p,\mathbf{F}^h_p\bigr)_{\mathcal {H}^{h,i}_p}:=\bigl(\bigl(M^h_p\bigl(S^h_p\bigr)^{-1}\bigr)^{1-i}S^h_p\mathbf {E}^h_p,\mathbf {F}^h_p\bigr)_{p,e},\quad i=1,0,-1, $$
(11)

and the norms are given by \(\|\mathbf{F}^{h}_{p}\|_{\mathcal {H}^{h,i}_{p}}^{2}:=(\mathbf{F}^{h}_{p},\mathbf{F}^{h}_{p})_{\mathcal {H}^{h,i}_{p}}\), for all i=1,0,−1. Here, (⋅,⋅) p,e is the inner product in the Euclidean space ℂpN+p−1, defined by

$$\bigl(\mathbf{F}^h_p,\mathbf{G}^h_p \bigr)_{p,e}:= \sum_{k=1}^{p N+p-1}F_{p,k/p} \overline{G}_{p,k/p}. $$

Set \(\mathcal{V}^{h}_{p}:=\mathcal{H}^{h,1}_{p}\times\mathcal{H}^{h,0}_{p}\) and its dual \(\mathcal{V}^{h,'}_{p}:=\mathcal{H}^{h,-1}_{p}\times \mathcal{H}^{h,0}_{p}\), the duality product \(\langle\cdot,\cdot\rangle_{\mathcal {V}_{p}^{h,'},\mathcal{V}_{p}^{h}}\) between \(\mathcal{V}_{p}^{h,'}\) and \(\mathcal{V}_{p}^{h}\) being defined as

$$\bigl\langle\bigl(\mathbf{F}^h_{p,1},\mathbf{G}^h_{p,1} \bigr),\bigl(\mathbf{F}^h_{p,2},\mathbf{G}^h_{p,2} \bigr)\bigr\rangle_{\mathcal{V}_p^{h,'},\mathcal {V}_p^h}:= \bigl(\mathbf{F}^h_{p,1}, \mathbf{F}^h_{p,2}\bigr)_{\mathcal {H}^{h,0}_p}+\bigl( \mathbf{G}^h_{p,1},\mathbf{G}^h_{p,2} \bigr)_{\mathcal{H}^{h,0}_p}. $$

Problem (10) is well posed in \(\mathcal{V}^{h}_{p}\). The total energy of its solutions defined below is conserved in time:

(12)

In [7] and [11], the following discrete version of the observability inequality (3) for the linear (p=1) and for the quadratic (p=2) approximation was analyzed:

$$ \mathcal{E}^h_p\bigl(\mathbf{U}^{h,0}_p, \mathbf{U}^{h,1}_p\bigr)\leq C^h_p(T) \int _0^T\bigl\|B^h_p \mathbf{U}^h_p(t)\bigr\|^2_{p,e}\,dt, $$
(13)

where \(B^{h}_{p}\) is a (pN+p−1)×(pN+p−1) observability matrix operator. Within this paper we focus on the particular case of boundary observation operators \(B_{p}^{h}\), in the sense that they approximate the normal derivative u x (x,t) of the solution of the continuous adjoint problem (2) at x=1 as h→0. One of the simplest examples of such boundary matrix operators \(B_{h}^{p}\) that will be used throughout this paper is:

$$ B_{p,ij}:= \begin{cases}-\frac{1}{h},&(i,j)=(p N+p-1,p N),\\ 0,&\mbox{otherwise.} \end{cases} $$
(14)

The only non-trivial component of \(B^{h}_{p}\mathbf{U}^{h}_{p}(t)\) is the last one which equals to u h,x (x N+(p−1)/p ,t) and is a first-order approximation of u x (1,t), where u is a solution of (2).

As shown in [7] for p=1 and in [11] for p=2, the observability inequality (13) does not hold uniformly as h→0, meaning that the observability constant \(C^{h}_{p}(T)\) in (13) blows up whatever T>0 is. This is due to the existence of solutions propagating very slowly concentrated on zones of the spectrum where the spectral gap or the group velocity tends to zero as h→0. To be more precise, for η∈[0,π], let us introduce the Fourier symbols

Define \(\lambda_{1}(\eta):=\sqrt{\varLambda_{1}(\eta)}\) and \(\lambda ^{\alpha}_{2}(\eta):=\sqrt{\varLambda^{\alpha}_{2}(\eta)}\), α∈{a,o}. Set \(\varLambda^{k}_{1}:=\varLambda_{1}(k\pi h)\) and \(\varLambda _{2}^{\alpha ,k}:=\varLambda_{2}^{\alpha}(k\pi h)\), \(\alpha\in\{\mathrm{a,o}\}\), and consider the following spectral problem:

$$ S^h_p\boldsymbol{\varphi}^h_p= \varLambda^h_pM^h_p\boldsymbol{\varphi}^h_p. $$
(15)

We take L 2-normalized eigenvectors, i.e., \(\|\boldsymbol{\varphi }^{h}_{p}\| _{\mathcal{H}^{h,0}_{p}}=1\). The eigenvalues are explicitly given by

$$\varLambda_1^{h,k}=\varLambda_1^k/h^2, \qquad \varLambda_2^{h,\alpha,k}=\varLambda_2^{\alpha,k}/h^2, $$

with \(\alpha\in\{\mathrm{a,o}\} \) and 1≤kN. The superscripts \(\mathrm{a,o}\) entering in the notation of the P 2-eigenvalues stand for acoustic/optic, respectively, to distinguish these two main branches of the spectrum. In the quadratic case, p=2, additionally to the 2N modes \(\varLambda_{2}^{h,\alpha,k}\), with 1≤kN and \(\alpha\in\{\mathrm{a,o}\}\), there is also the so-called resonant mode, given by \(\varLambda_{2}^{h,\mathrm{r}}=10/h^{2}\). In Fig. 3, we represent \(\lambda^{h}_{p}:=\sqrt {\varLambda^{h}_{p}}\) for different values of p.

Fig. 3
figure 3

The square roots of the eigenvalues, \(\lambda^{h}_{p}\): the continuous (black), acoustic (red), optic (blue), resonant (green) modes for p=2 and \(\lambda_{1}^{h}\) (magenta)

The solutions of (10) admit the following Fourier representation:

$$\mathbf{U}^h_p(t)=\sum _{\pm}\sum _{(\varLambda _p^h,\boldsymbol{\varphi}^h_p)} \widehat{u}_{p,\pm}\exp\bigl(\pm it\lambda_p^h \bigr)\boldsymbol{\varphi}^h_p, $$

where the second sum is taken over the all possible eigensolutions \((\varLambda_{p}^{h},\boldsymbol{\varphi}^{h}_{p})\) in (15). Here, \(\widehat{u}_{p,\pm}=(\widehat{u}^{0}_{p}\pm\widehat {u}^{1}_{p}/i\lambda^{h}_{p})/2\) and \(\widehat{u}^{i}_{p}\) are the Fourier coefficients of the initial data \(\mathbf{U}^{h,i}_{p}\) defined by \(\widehat{u}^{i}_{p}:=(\mathbf{U}^{h,i}_{p},\mathbf{\varphi }_{p}^{h})_{\mathcal{H}^{h,0}_{p}}\).

Firstly, let us remark that as kh→1, \(\varLambda^{\mathrm {a},k}_{2}\to 10\), \(\varLambda^{k}_{1},\varLambda^{\mathrm{o},k}_{2}\to12\) and as kh→0, \(\varLambda^{\mathrm{o},k}_{2}\to60\). On the other hand, as we can see in Fig. 3, \(\lambda_{1}^{h,k}\) and \(\lambda _{2}^{h,\mathrm {a},k}\) are strictly increasing in k, while \(\lambda_{2}^{h,\mathrm {o},k}\) is strictly decreasing. The group velocities, which are the first-order derivatives of the Fourier symbols \(\lambda^{\cdot }_{p}\) and \(\varLambda^{\cdot}_{p}\), verify

(16)

For all \(\alpha\in\{\mathrm{a,o}\}\) and all 1≤kN, the following spectral identities hold:

$$ \bigl\|\boldsymbol{\varphi}^{h,k}_1\bigr\|_{\mathcal {H}^{h,1}_1}^2= \frac{6\|B^h_1\boldsymbol{\varphi}^{h,k}_1\|^2_{1,e}}{12-\varLambda^k_1} \quad\mbox{and}\quad\bigl\|\boldsymbol{\varphi}^{h,\alpha,k}_2 \bigr\|_{\mathcal{H}^{h,1}_2}^2=\frac {\|B^h_2\boldsymbol{\varphi}^{h,\alpha,k}_2\|^2_{2,e}}{W(\varLambda ^{\alpha,k}_2)}, $$
(17)

where

$$W(\varLambda)=\frac{24(\varLambda-10)^2(\varLambda-12)(\varLambda -60)}{(-19\varLambda^2-120\varLambda+3600)(\varLambda^2+16\varLambda+240)}. $$

Thus, for a finite observability time T, by taking solutions of (10) of the form

$$\mathbf{U}^h_1(t)=\exp\bigl(i(T-t)\lambda_1^{h,N} \bigr)\boldsymbol{\varphi}^{h,N}_1 $$

or

$$\mathbf{U}^h_2(t)=\exp\bigl(i(T-t)\lambda_2^{h,\alpha,k} \bigr)\boldsymbol{\varphi}^{h,\alpha,k}_2, $$

with (α,k)∈{(a,N),(o,N),(o,1)}, we obtain that the observability constant \(C_{p}^{h}(T)\) blows up at least polynomially as h→0. In fact, by adapting the analysis in [10] based on the Stationary Phase Lemma, we can obtain a polynomial blow-up rate at any order. In [13], by arguments based on fine estimates on the family of bi-orthogonals that are expected to be adaptable to the approximations used in this paper, an exponential blow-up rate was proved for the finite difference semi-discretization scheme.

In Fig. 4(a), (e), we represent the solution of the continuous (abbreviated by c) adjoint system (2) with

$$u^0(x)=\exp\bigl(-\gamma(x-1/2)^2/2\bigr)\exp(i x \xi_0)\quad\mbox{and}\quad u^1(x)=-u^0_x(x), \quad\gamma=h^{-0.9}, $$

for which the solution propagates at velocity one (the maximum amplitude for both initial time t=0 and final one t=2 is at x=1/2 after two reflections on the boundary) along the generalized ray

$$x(t)= \begin{cases}2k+1+1/2-t,&t\in(2k+1/2,2k+1+1/2),\\ -(2k+1+1/2)+t,&t\in(2k+1+1/2,2k+2+1/2), \end{cases} \quad k\in\mathbb{Z}. $$

Also no dispersive effect holds since the corresponding group acceleration is identically zero (see Fig. 5, the black curves), despite of the value of the wave number η 0. The presence of the dispersion effects due to the group acceleration is responsible for modifications on size of the support of the solutions as time evolves (cf. [12]), whereas their absence leads to the conservation of the support size.

Fig. 4
figure 4

Propagation along the rays of geometric optics of a Gaussian wave packet concentrated around the wave number ξ 0=η 0/h for h=1/1000

Fig. 5
figure 5

Group velocities (left) versus group accelerations (right): continuous (black), p=1 (blue), p=2 acoustic (red), p=2 optic (dotted red)

In Fig. 4(b), (f), we represent the corresponding solution of the numerical adjoint problem (10) for p=1; for both values of the wave number η 0, the solution propagates at a smaller group velocity than the continuous one since both η 0=9π/10 and η 0=39π/40 belong to the region where η λ 1<1; the dispersive effects are visible for both wave numbers, since the group acceleration \(\partial_{\eta }^{2}\lambda_{1}\) is non-trivial; however, they are more accentuated for η 0=39π/40 than for η 0=9π/10 since \(|\partial_{\eta }^{2}\lambda_{1}(39\pi/40)|>|\partial_{\eta}^{2}\lambda_{1}(9\pi/10)|\), as we can see in Fig. 5, the blue curves.

In Fig. 4(c), (g), we represent the projection on the acoustic mode of the solution to the adjoint problem (10) for p=2. For η 0=9π/10, the velocity of propagation is larger than one (\(\partial_{\eta}\lambda_{2}^{\mathrm {a}}(9\pi/10)>1\)) (at the final time t=2, the maximum amplitude is located at a space position x>1/2, after two reflections on the border); almost no dispersive effect can be observed, since \(\partial _{\eta}^{2}\lambda_{2}^{\mathrm{a}}(\eta)\sim0\), for all η∈(0,9π/10). On the other hand, for η 0=39π/40, the projection on the acoustic branch propagates at velocity \(\partial_{\eta}\lambda _{2}^{\mathrm{a}}(39\pi/10)<1\), so that it reflects only once on the boundary, but more rapidly than the corresponding wave packet for p=1 (which even does not reflect on the boundary) since \(\partial_{\eta }\lambda_{2}^{\mathrm{a}}(39\pi/40)>\partial_{\eta}\lambda_{1}(39\pi /40)\). At the same time, the dispersive effects are much more accentuated for the projection on the acoustic branch than for p=1 since \(|\partial_{\eta}^{2}\lambda_{2}^{\mathrm{a}}(39\pi /40)|>|\partial _{\eta}^{2}\lambda_{1}(39\pi/40)|\), as we can see in Fig. 5, the red curves.

In Fig. 4(d), (h), we represent the projection on the optic mode of the solution to the adjoint problem (10), which propagates in the opposite direction than the physical solution, due to the fact that \(\partial_{\eta}\lambda _{2}^{\mathrm{o}}(\eta)<0\), for all η∈(0,π), while in the continuous case the group velocity is strictly positive (≡1). For η 0=9π/10, the velocity of propagation is larger than the one for the corresponding acoustic projection (i.e., \(|\partial_{\eta }\lambda_{2}^{\mathrm{o}}(9\pi/10)|>\partial_{\eta}\lambda _{2}^{\mathrm {a}}(9\pi/10)\)), reflected in the fact that the maximum amplitude at t=2 is located next to x=0; almost no dispersive effects occur. For η 0=39π/40, the optic projection propagates almost at the same velocity as the acoustic one and almost with the same dispersive effects, the only visible change being the reverse direction (see Fig. 5, the dotted red lines).

Several filtering techniques have been designed to face these high frequency pathologies, all based on taking subclasses of initial data that filter them: the Fourier truncation method (cf. [7]), which simply eliminates all the Fourier components propagating non-uniformly, and the bi-grid algorithm (cf. [4]), rigorously studied in [6, 9] and [14] in the context of the finite differences semi-discretization of the 1D and 2D wave equation and of the Schrödinger equation (cf. [5]), which consists in taking initial data with slow oscillations obtained by linear interpolation from data given on a coarser grid. The interested reader is referred to the survey articles [3] and [15] for a more or less complete presentation of the development of this topic and the state of the art.

Let us describe how the bi-grid filtering acts for the linear and quadratic finite element approximations under consideration. To be more precise, for an odd N, let us define the set of data on the fine grid obtained by linear interpolation from data on a twice coarser grid,

and the set of linear data whose nodal components are given by a bi-grid algorithm,

The following result has been proved for the adjoint problem (10) for p=1 in [9] or [14] and for p=2 in [11]:

Theorem 1

For all T≥2, the observability inequality (13) holds uniformly as h→0 within the class of initial data \((\mathbf{U}^{h,0}_{p},\mathbf{U}^{h,1}_{p})\in(\mathcal {B}^{h}_{p}\times\mathcal{B}^{h}_{p})\cap\mathcal{V}^{h}_{p}\) in the adjoint problem (10).

One of the possible proofs of this result is based on a dyadic decomposition argument like in [6]. For the case p=1, it reduces to showing that the total energy of solutions corresponding to initial data in \((\mathcal{B}^{h}_{1}\times \mathcal{B}^{h}_{1})\cap\mathcal{V}^{h}_{1}\) can be uniformly bounded from above by the energy of their projection on the first half of the spectrum. The second step is to use the uniform observability inequality (13) in the class \(\mathcal{T}^{h}_{1}\times\mathcal{T}^{h}_{1}\) consisting of discrete functions for which the second half of Fourier modes have been truncated; this result can be obtained by the multiplier technique (cf. [7]) or by Ingham-type inequalities (cf. [15]). For the quadratic case p=2, the projection on the first half of the acoustic mode has to be implemented to reduce the proof of Theorem 1 to the observability inequality (13) on the class \(\mathcal{T}^{h}_{2}\times\mathcal{T}^{h}_{2}\) of functions for which the second half (the high frequency one) of the acoustic diagram and the whole optic diagram have been truncated. The fact that for p=2 the bi-grid algorithm in Theorem 1 essentially truncates 3/4 of the spectrum versus only 1/2 for p=1 can be intuitively seen in the fact that \(\mathcal{B}^{h}_{2}\) involves two requirements on its elements versus only one requirement for \(\mathcal {B}^{h}_{1}\). The observability time for these two bi-grid algorithms coincides with the continuous optimal one T =2, since the group velocities η λ 1 and \(\partial_{\eta }\lambda^{\mathrm{a}}_{2}\) are increasing functions on [0,π/2] and then η λ 1(η)≥ η λ 1(0)=1 for all η∈[0,π/2] and similarly for \(\partial _{\eta}\lambda^{\mathrm{a}}_{2}\). Thus, the minimal velocity of propagation involving solutions with data in the class \(\mathcal {T}^{h}_{p}\times\mathcal{T}^{h}_{p}\) for both p=1 and p=2 is equal to one.

In practice, one has to employ fully discrete schemes. In this respect, it is important to note that, using the results of Ervedoza–Zheng–Zuazua in [2] allowing to transfer observability results for time-continuous conservative semigroups on time-discrete conservative schemes, we see that our observability results in Theorem 1 are also valid for any conservative fully discrete finite element approximation, like, for example, the implicit midpoint time-discretization scheme

$$\mathbf{U}^{h,k+1}_p-2\mathbf{U}^{h,k}_p+ \mathbf{U}^{h,k-1}_p+(\delta t)^2 \bigl(M^h\bigr)^{-1}S^h\bigl( \mathbf{U}^{h,k+1}_p+\mathbf{U}^{h,k-1}_p \bigr)/2=0, $$

where δt is the time step and \(\mathbf{U}^{h,k}_{p}\thickapprox \mathbf{U}^{h}_{p}(k\delta t)\). Note, however, that the results in [2] do not yield the optimal observability time, a subject that needs further investigation.

Once the observability problem is well understood, we are in conditions to address the discrete control problem. For a particular solution \(\tilde{\mathbf{U}}^{h}_{p}(t)\) of the adjoint problem (10), let us consider the following non-homogeneous discrete problem

$$ \begin{aligned}[c] &M^h_{p}\mathbf{Y}^h_{p,tt}(t)+S^h_p \mathbf{Y}^h_p(t)=-\bigl(B^h_p \bigr)^*B^h_p\tilde{\mathbf{U}}^h_p(t), \qquad\mathbf{Y}^h_p(0)=\mathbf{Y}^{h,0}_p,\\ & \mathbf{Y}^h_{p,t}(0)=\mathbf{Y}^{h,1}_p. \end{aligned} $$
(18)

Multiplying system (18) by any solution \(\mathbf {U}^{h}_{p}(t)\) of the adjoint problem (10), integrating in time and imposing that at t=T the solution is at rest, i.e.,

$$ \bigl\langle\bigl(\mathbf{Y}^h_{p,t}(T),- \mathbf{Y}^h_p(T)\bigr), \bigl(\mathbf{U}^{h,0}_p, \mathbf{U}^{h,1}_p\bigr)\bigr\rangle_{\mathcal {V}_p^{h,'},\mathcal{V}_p^h}=0,\quad \forall\bigl(\mathbf{U}^{h,0}_p,\mathbf{U}^{h,1}_p \bigr)\in\mathcal{V}_p^h, $$
(19)

we obtain the identity,

$$ \int _0^T\bigl(B_p^h \tilde{\mathbf{U}}^h_p(t),B_p^h \mathbf{U}^h_p(t)\bigr)_{p,e}\,dt= \bigl\langle \bigl(\mathbf{Y}^{h,1}_p,-\mathbf{Y}^{h,0}_p \bigr),\bigl(\mathbf{U}^h_p(0),\mathbf{U}^h_{p,t}(0) \bigr)\bigr\rangle_{\mathcal {V}_p^{h,'},\mathcal{V}_p^h}, $$
(20)

for all \((\mathbf{U}^{h,0}_{p},\mathbf{U}^{h,1}_{p})\in\mathcal {V}_{p}^{h}\). This is the Euler–Lagrange equation corresponding to the quadratic functional, the discrete analogue of \(\mathcal{J}\) in (5):

\(\mathbf{U}^{h}_{p}(t)\) being the solution of the adjoint problem (10) with initial data \((\mathbf{U}^{h,0}_{p},\mathbf{U}^{h,1}_{p})\) and \((\mathbf{Y}^{h,1}_{p},\mathbf{Y}^{h,0}_{p})\in\mathcal{V}_{p}^{h,'}\) the initial data to be controlled in (18). Actually, (18) and (20) are completely equivalent so that, in practice, it is sufficient to prove the existence of a critical point for \(\mathcal{J}^{h}_{p}\) to deduce the existence of a control for (18). The uniform observability inequality (13) within the class of initial data \(\mathcal{B}_{p}^{h}\times\mathcal {B}_{p}^{h}\) guarantees the uniform coercivity of \(\mathcal{J}_{p}^{h}\) and the convergence of the last component \(\tilde{v}^{h}_{p}\) of \(B_{p}^{h} \tilde{\mathbf{U}}^{h}_{p}(t)\), the discrete control, to the optimal control \(\tilde{v}\) for the continuous wave equation given by (4) when the initial data \((\mathbf{Y}^{h,0}_{p},\mathbf {Y}^{h,1}_{p})\) in (18) approximates well the initial data (y 0,y 1) in the continuous problem (1). Here \(\tilde{\mathbf{U}}^{h}_{p}(t)\) is the solution of the discrete adjoint system (10) corresponding to the minimizer \((\tilde{\mathbf{U}}^{h,0}_{p},\tilde{\mathbf{U}}^{h,1}_{p})\in \mathcal{B}_{p}^{h}\times\mathcal{B}_{p}^{h}\) of \(\mathcal{J}_{p}^{h}\).

3 Implementation of the Bi-grid Algorithm and Numerical Results

The aim of this section is to numerically illustrate the three high frequency pathologies for the quadratic approximation of the control problem (18) and the way in which the bi-grid filtering leads to the convergence of the solution of (18) to the continuous one. We will also compare the numerical results obtained for the quadratic case p=2 and for the linear one p=1.

In order to simplify the presentation, we will take as discrete initial data \((\mathbf{Y}^{h,0}_{p},\mathbf{Y}^{h,1}_{p})\) in (18) \(\mathbf{Y}^{h,1}_{p}=0\), \(\mathbf{Y}^{h,0}_{p}\) being an approximation of the Heaviside function H(x) in (6). Firstly, let us define the vectors \(\tilde{\mathbf {H}}^{h}_{p}=(\tilde{H}_{p,j/p})_{1\leq j\leq pN+p-1}\), where \(\tilde{H}_{p,j/p}=(H,\phi^{h}_{p,j/p})_{L^{2}}\), for all 1≤jpN+p−1. The numerical approximation of H(x) we consider is

$$ \mathbf{Y}^{h,0}_p=\mathbf{H}^h_p:= \bigl(M_p^h\bigr)^{-1}\tilde{\mathbf{H}}^h_p. $$
(21)

For all \(\alpha\in\{ \mathrm{a,o} \}\) and all \(\beta\in\{ \mathrm{lo,hi} \}\), (lo/hi standing for low/high-frequency), we also define the projections of \(\mathbf{H}^{h}_{p}\) on some parts of the spectrum as follows:

$$\mathbf{H}^h_{1,\beta}=\sum _{k_{\beta }^-}^{k_{\beta}^+}\bigl(\mathbf{H}^h_1, \boldsymbol{\varphi}^{h,k}_1\bigr)_{\mathcal{H}^{h,0}_1}\mathbf{ \varphi}^{h,k}_1\quad\mbox{and}\quad \mathbf{H}^{h,\alpha }_{2,\beta}= \sum _{k_{\beta }^-}^{k_{\beta}^+}\bigl( \mathbf{H}^h_2,\boldsymbol{\varphi}^{h,\alpha ,k}_2 \bigr)_{\mathcal{H}^{h,0}_2}\boldsymbol{\varphi}^{h,\alpha,k}_2, $$

where \((k^{-}_{\beta},k^{+}_{\beta})=(1,(N-1)/2)\) if β=lo and \((k^{-}_{\beta},k^{+}_{\beta})=((N+1)/2,N)\) if β=hi. More precisely, \(\mathbf{H}^{h}_{1,\mathrm{lo}}\) is the projection of \(\mathbf{H}^{h}_{1}\) on the first half of the spectrum and \(\mathbf{H}^{h,\mathrm{a}}_{2,\mathrm{lo}}\) that of \(\mathbf {H}^{h}_{2}\) on the first half of the acoustic diagram (see Fig. 6).

Fig. 6
figure 6

The discrete Heaviside functions \(\mathbf{H}^{h}_{p}\) and their projections

Since the datum H(x) in (6) is irregular due to the presence of the jump, it involves high-frequency eigenfunctions. This also happens with its numerical approximations \(\mathbf{H}^{h}_{p}\), as it can be easily observed in Fig. 7. These high-frequency components will lead to the divergence of the corresponding numerical controls.

Fig. 7
figure 7

The Fourier coefficients of \(\mathbf{H}^{h}_{p}\) for p=1 (left), p=2 (center, blue = acoustic, red = optic), p=2—the optic branch (right)

In order to find the minimum of the discrete functional \(\mathcal {J}_{p}^{h}\), we will apply the Conjugate Gradient (CG) algorithm (see [1, 4]) to iteratively solve the Euler–Lagrange equation (20). Let us briefly recall it when no-filtering technique is applied.

Firstly, fix the initial data to be controlled \((\mathbf {Y}^{h,0}_{p,0},\mathbf{Y}^{h,1}_{p,0})\), a tolerance ϵ (=0.001 in our particular case) and a maximum number of iterations n max (=200), aimed to be a stopping criterium. In order to better follow the CG algorithm, we divide it into several steps as follows:

Step 1. :

We initialize the algorithm solving the adjoint problem (10) with arbitrary data \((\mathbf {U}^{h,0}_{p},\mathbf{U}^{h,1}_{p})=(\mathbf{U}^{h,0}_{p,0},\mathbf {U}^{h,1}_{p,0})\in\mathcal{V}_{p}^{h}\), for example, the trivial one. This step yields the solution \(\mathbf{U}^{h}_{p,0}(t)\).

Step 2. :

Compute the first gradient \((\mathbf {G}^{h,0}_{p,0},\mathbf{G}^{h,1}_{p,0}):=\nabla\mathcal {J}_{p}^{h}(\mathbf{U}^{h,0}_{p,0},\mathbf{U}^{h,1}_{p,0})\) by solving the non-homogeneous problem (18) with initial data \((\mathbf{Y}^{h,0}_{p,0},\mathbf{Y}^{h,1}_{p,0})\) and \(\tilde{\mathbf{U}}^{h}_{p}(t)=\mathbf{U}^{h}_{p,0}(t)\). This produces the solution \(\mathbf{Y}^{h}_{p,0}(t)\). Then

$$\mathbf{G}^{h,0}_{p,0}=-\bigl(S_p^h \bigr)^{-1}M_p^h\mathbf{Y}^h_{p,0,t}(T) \quad\mbox{and}\quad\mathbf{G}^{h,1}_{p,0}=\mathbf{Y}^h_{p,0}(T). $$
Step 3. :

If \(\|\mathbf{G}^{h,0}_{p,0}\|_{\mathcal {H}^{h,1}_{p}}^{2}+\|\mathbf{G}^{h,1}_{p,0}\|_{\mathcal{H}^{h,0}_{p}}^{2}\geq \epsilon^{2}\), compute the first descent direction

$$\bigl(\mathbf{D}^{h,0}_{p,0},\mathbf{D}^{h,1}_{p,0} \bigr)=-\bigl(\mathbf{G}^{h,0}_{p,0},\mathbf{G}^{h,1}_{p,0} \bigr). $$
Step 4. :

Given \((\mathbf{U}^{h,0}_{p,n},\mathbf {U}^{h,1}_{p,n}),\ (\mathbf{G}^{h,0}_{p,n},\mathbf {G}^{h,1}_{p,n})\mbox{ and }(\mathbf{D}^{h,0}_{p,n},\mathbf {D}^{h,1}_{p,n})\) in \(\mathcal{V}_{p}^{h}\), we compute these quantities at the next iteration n+1 as follows:

Step 4.a. :

Solve (10) for \((\mathbf {U}^{h,0}_{p},\mathbf{U}^{h,1}_{p})= (\mathbf{D}^{h,0}_{p,n},\mathbf{D}^{h,1}_{p,n})\). Denote the solution by \(\mathbf{D}^{h}_{p,n}(t)\).

Step 4.b. :

Solve (18) with trivial initial data and \(\tilde{\mathbf{U}}^{h}_{p}(t)=\mathbf{D}^{h}_{p,n}(t)\) and denote the solution by \(\mathbf{Y}^{h}_{p,n+1}(t)\). Take

$$\mathbf{Z}^{h,0}_{p,n}=-\bigl(S_p^h \bigr)^{-1}M_p^h\mathbf{Y}^h_{p,n+1,t}(T) \quad\mbox{and}\quad\mathbf{Z}^{h,1}_{p,n}=\mathbf{Y}^h_{p,n+1}(T). $$
Step 4.c. :

Set

$$\rho_{p,n}:=-\frac{\|\mathbf{G}^{h,0}_{p,n}\|_{\mathcal {H}^{h,1}_p}^2+\|\mathbf{G}^{h,1}_{p,n}\|_{\mathcal{H}^{h,0}_p}^2}{ (\mathbf{Z}^{h,0}_{p,n},\mathbf{D}^{h,0}_{p,n})_{\mathcal {H}^{h,1}_p}+(\mathbf{Z}^{h,1}_{p,n},\mathbf {D}^{h,1}_{p,n})_{\mathcal{H}^{h,0}_p}}. $$
Step 4.d. :

Compute the next iteration

$$\bigl(\mathbf{U}^{h,0}_{p,n+1},\mathbf{U}^{h,1}_{p,n+1} \bigr):=\bigl(\mathbf{U}^{h,0}_{p,n},\mathbf{U}^{h,1}_{p,n} \bigr)+\rho_{p,n}\bigl(\mathbf{D}^{h,0}_{p,n}, \mathbf{D}^{h,1}_{p,n}\bigr). $$
Step 4.e. :

Compute the next gradient

$$\bigl(\mathbf{G}^{h,0}_{p,n+1},\mathbf{G}^{h,1}_{p,n+1} \bigr):=\nabla\mathcal{J}_p^h\bigl(\mathbf{U}^{h,0}_{p,n+1}, \mathbf{U}^{h,1}_{p,n+1}\bigr) $$

by

$$\bigl(\mathbf{G}^{h,0}_{p,n+1},\mathbf{G}^{h,1}_{p,n+1} \bigr):=\bigl(\mathbf{G}^{h,0}_{p,n},\mathbf{G}^{h,1}_{p,n} \bigr)+ \rho_{p,n}\bigl(\mathbf{Z}^{h,0}_{p,n}, \mathbf{Z}^{h,1}_{p,n}\bigr). $$
Step 4.f. :

Compute the next descent direction

The algorithm ends up when for some n<n max we obtain

$$\bigl\|\mathbf{G}^{h,0}_{p,n}\bigr\|_{\mathcal{H}^{h,1}_p}^2+ \bigl\| \mathbf{G}^{h,1}_{p,n}\bigr\|_{\mathcal{H}^{h,0}_p}^2< \epsilon^2 $$

or when nn max. When the second stopping criterium holds, we understand that the CG algorithm does not converge (due to the fact that \(\mathcal{J}_{p}^{h}\) looses coercivity). For both stopping criteria, we take the minimizer of \(\mathcal{J}_{p}^{h}\) to be \((\tilde{\mathbf {U}}^{h,0}_{p}, \tilde{\mathbf{U}}^{h,1}_{p}):=(\mathbf {U}^{h,0}_{p,n},\mathbf{U}^{h,1}_{p,n})\), where n is the last iteration number before stopping.

Let us now describe the changes we have to do in the CG algorithm to implement the bi-grid filtering we propose in Theorem 1. The linear case p=1 has been implemented in [3]. For this reason, we restrict ourselves to the quadratic case p=2. However, whenever we have to implement any filtering technique, the only steps we have to modify are Steps 2 and 4.b above. In order to simplify the presentation, we describe only the modifications to be done on Step 2, the ones on Step 4.b being similar. Firstly, set

$$\mathbf{F}^{h,0}_2=-M_2^h \mathbf{Y}^h_{2,0,t}(T)\quad\mbox{and}\quad\mathbf {F}^{h,1}_2=M_2^h \mathbf{Y}^h_{2,0}(T), $$

and observe that, for all test function \((\mathbf{U}^{h,0}_{2},\mathbf {U}^{h,1}_{2})\in\mathcal{V}_{2}^{h}\), the Gateaux derivative of \(\mathcal {J}_{2}^{h}\) at \((\mathbf{U}^{h,0}_{2,0},\mathbf{U}^{h,1}_{2,0})\) has the following expressions:

Let us observe that the linear functions with nodal components given by a bi-grid algorithm in \(\mathcal{B}_{2}^{h}\) are in fact linear functions on a grid of size 2h. We consider that both the test functions \((\mathbf{U}^{h,0}_{2},\mathbf{U}^{h,1}_{2})\) and the gradient \((\mathbf {G}^{h,0}_{2,0},\mathbf{G}^{h,1}_{2,0})\) belong to \(\mathcal {B}_{2}^{h}\times\mathcal{B}_{2}^{h}\). Consider the restriction operator Π that associates to any quadratic function of coefficients \(\mathbf{E}^{h}_{2}=(E_{2,j/2})_{1\leq j\leq2N+1}\) the linear function on the mesh of size 2h of coefficients \((\varPi \mathbf {E}^{h}_{2})_{j}=E_{2,2j}\), for all 1≤j≤(N−1)/2. When both \((\mathbf{U}^{h,0}_{2},\mathbf{U}^{h,1}_{2})\) and \((\mathbf {G}^{h,0}_{2,0},\mathbf{G}^{h,1}_{2,0})\) belong to \(\mathcal {B}_{2}^{h}\times\mathcal{B}_{2}^{h}\), then

Consider another restriction operator Γ defined as

Then

$$\bigl(\mathbf{F}^{h,0}_2,\mathbf{U}^{h,0}_2 \bigr)_{2,e}+\bigl(\mathbf{F}^{h,1}_2, \mathbf{U}^{h,1}_2\bigr)_{2,e}= \bigl(\varGamma \mathbf{F}^{h,0}_2,\varPi\mathbf{U}^{h,0}_2 \bigr)_{1,e}+\bigl(\varGamma\mathbf{F}^{h,1}_2,\varPi \mathbf{U}^{h,1}_2\bigr)_{1,e} $$

and the two components of the gradient are explicitly given by

$$\mathbf{G}^{h,0}_{2,0}=\varPi^{-1} \bigl(S^{2h}_1\bigr)^{-1}\varGamma\mathbf {F}^{h,0}_2\quad \mbox{and}\quad \mathbf{G}^{h,1}_{2,0}= \varPi^{-1}\bigl(M^{2h}_1\bigr)^{-1}\varGamma \mathbf{F}^{h,1}_2, $$

where Π −1 is the inverse of the restriction operator Π defined as the linear interpolation on a grid of size h/2 of a function defined on a grid of size 2h. Therefore, our filtering mechanism in Theorem 1 for p=2 acts in fact like a classical bi-grid algorithm of mesh ratio 1/4. This is very similar to the bi-grid algorithm designed in [5] to ensure discrete dispersive estimates for the finite difference semi-discretization of the Schrödinger equation uniformly in the mesh size parameter h. In that case, the bi-grid algorithm has to face the two singularities of the Fourier symbol p(η)=4sin2(η/2) defined on η∈[0,π]: the vanishing group velocity at η=π, yielding the non-uniform gain of 1/2-derivative, and the vanishing group acceleration at η=π/2, related to the non-uniform \(L^{p}_{x}\)\(L^{q}_{t}\)-integrability (see Fig. 8, right). In our case, by ordering in an increasing way the eigenvalues on the two dispersion curves and constructing \(\lambda_{2}(\eta)=\lambda_{2}^{a}(\eta )\), for η∈[0,π] and \(\lambda_{2}(\eta)=\lambda_{2}^{o}(2\pi -\eta)\), for η∈[π,2π], we formally obtain a discrete wave equation on the grid h/2 whose dispersion relation λ 2(η), η∈[0,2π], has vanishing group velocity at η=π ± and at η=2π (see Fig. 8, left). In order to remedy the pathologies associated to both singular points π ± and 2π, a bi-grid of mesh ratio 1/4 should suffice, despite of the discontinuity of λ 2 at η=π.

Fig. 8
figure 8

Dispersion relations for the P 2-approximation of the wave equation (left): continuous (black), \(\lambda_{2}^{a}\) (red), \(\lambda _{2}^{o}\) (green), λ 2 (blue), versus Fourier symbols of the finite difference semi-discretization of the Schrödinger equation (right): continuous (black) and discrete (blue). At the marked points, the symbols have vanishing group velocity (circles) or vanishing group acceleration (squares)

Remark 1

In practice, one has to reduce the semi-discrete problem to be solved, \(M^{h}\mathbf {U}^{h}_{tt}(t)+S^{h}\mathbf{U}^{h}(t)=\mathbf{F}^{h}(t)\), to a fully discrete system with time-step of size δt and to take U h,kU h(kδt). Set μ:=δt/h to be the Courant number. When using an explicit time scheme, for example, the leap-frog one,

$$\mathbf{U}^{h,k+1}-2\mathbf{U}^{h,k}+\mathbf{U}^{h,k-1}+ \mu^2h^2\bigl(M^h\bigr)^{-1}S^h \mathbf{U}^{h,k}=(\delta t)^2 \bigl(M^h \bigr)^{-1}\mathbf{F}^h(k\delta t), $$

a careful von Neumann analysis shows that the Courant–Friedrichs–Lewy (CFL) condition for μ is \(\mu\leq\min_{\varLambda}\sqrt{4/\varLambda}\), where the minimum is taken over all the eigenvalues Λ of the matrix h 2(M h)−1 S h. When dealing with (10) or (18) for the linear approximation p=1, this analysis gives \(\mu\leq1/\sqrt{3}\). For the case p=2, we obtain \(\mu\leq 1/\sqrt{15}\) if we work with solutions involving both modes or \(\mu \leq\sqrt{2/5}\) if the numerical solution involves only the acoustic mode. We observe that, globally, the quadratic scheme requires smaller Courant numbers than the linear one, whereas the resolution of the homogeneous problem (10) with data concentrated only on the acoustic mode admits larger μ’s than in the linear case.

We end up this section by discussing the numerical results in Figs. 9, 10, 11, 12. For the P 1-approximation, we take h=1/200 and for the P 2-one, h=1/100, in order to have the same number of degrees of freedom in both approximations.

  • Without restricting the space where the functional \(\mathcal {J}^{h}_{p}\) is minimized, the numerical controls are highly oscillatory and diverge (see Figs. 9 and 10(a)–(b)). This is due to the fact that the initial data \(\mathbf {H}^{h}_{p}\) involves the critical modes on the high-frequency regime of the dispersion relations for which the numerical controls diverge. These pathological effects can be seen separately by controlling the corresponding projections of the data \(\mathbf{H}^{h}_{p}\) on the high frequency modes (see Fig. 9(f) for p=1 and Fig. 10(f), (h), (j) for p=2). As long as the initial data \(\mathbf{H}^{h}_{p}\) is projected on the first half of the acoustic mode for p=2 or on the first half of the spectrum for p=1, the CG algorithm and the numerical controls converge (see Figs. 9(d) and 10(d)). The numerical controls obtained for these projections \(\mathbf {H}_{1,\mathrm{lo}}^{h}\) and \(\mathbf{H}_{2,\mathrm{lo}}^{h,\mathrm {a}}\) as initial positions in the control problem (18) without filtering are approximately the same as the ones obtained by the bi-grid filtering mechanism taking as initial position the whole \(\mathbf{H}_{p}^{h}\) (see also Fig. 11(b) and 12(b)). This is due to the fact that the controls obtained by the bi-grid algorithm damp out the high-frequency effects and for this reason they act mainly on the eigenmodes involved in \(\mathbf{H}_{1,\mathrm{lo}}^{h}\) or \(\mathbf{H}_{2,\mathrm {lo}}^{h,\mathrm{a}}\).

    Fig. 9
    figure 9

    Solutions of the control problem (18) versus their numerical controls for p=1 arising by minimizing \(\mathcal{J}^{h}_{1}\) over the whole space \(\mathcal {V}^{h}_{1}\)

    Fig. 10
    figure 10

    Solutions of the control problem (18) versus their numerical controls for p=2 arising by minimizing \(\mathcal{J}^{h}_{2}\) over the whole space \(\mathcal{V}^{h}_{2}\)

    Fig. 11
    figure 11

    Solutions of the control problem (18) versus their numerical controls for p=1 arising by minimizing \(\mathcal{J}^{h}_{1}\) over the restricted space \((\mathcal {B}^{h}_{1}\times\mathcal{B}^{h}_{1})\cap\mathcal{V}^{h}_{1}\)

    Fig. 12
    figure 12

    Solutions of the control problem (18) versus their numerical controls for p=2 arising by minimizing \(\mathcal{J}^{h}_{2}\) over the restricted space \((\mathcal {B}^{h}_{2}\times\mathcal{B}^{h}_{2})\cap\mathcal{V}^{h}_{2}\)

  • Without filtering, the high-frequency modes produce instabilities in the form of oscillations of larger and larger amplitude which accumulate as time evolves in the solutions of the control problem (18) (see Figs. 9(a), (e) and 10(a), (e), (g), (i)). These high frequency effects are larger in the P 2 case than in the P 1 one, due to the presence of the optic mode whose largest eigenvalues are much above the largest ones for the linear approximation (60/h 2 versus 12/(h/2)2=48/h 2). The solutions of the adjoint problem (10) corresponding to the minimizer \((\tilde{\mathbf {U}}^{h,0}_{p},\tilde{\mathbf{U}}^{h,1}_{p})\) of \(\mathcal{J}^{h}_{p}\) over \(\mathcal{V}^{h}_{p}\) are typically highly oscillatory wave packets whose energy is concentrated far from the boundary x=1 at any time t∈[0,2] (see Fig. 13, left).

    Fig. 13
    figure 13

    Typical solution of the adjoint problem (10) corresponding to the minimizer \((\tilde{\mathbf {U}}^{h,0}_{p},\tilde{\mathbf{U}}^{h,1}_{p})\) of \(\mathcal{J}^{h}_{p}\) over \(\mathcal{V}^{h}_{p}\) (left) or over \((\mathcal{B}^{h}_{p}\times\mathcal {B}^{h}_{p})\times\mathcal{V}^{h}_{p}\) (right)

  • When the space over which the functional \(\mathcal{J}^{h}_{p}\) is minimized is restricted to the bi-grid class \((\mathcal{B}^{h}_{p}\times \mathcal{B}^{h}_{p})\cap\mathcal{V}^{h}_{p}\), the high-frequency modes diminish in time for both the linear and the quadratic approximation as it can be observed in Figs. 11(e) and 12(e), (g), (i). For the case p=2, the optic modes are more dissipated than the acoustic ones. However, by comparing Figs. 11(a)–(b) and 12(a)–(b), we observe that the numerical controls and the solutions of the discrete control problem (18) under filtering are much more accurate in the quadratic case than in the linear one. As we made it precise before, for p=2, the bi-grid filtering acts mainly like a Fourier truncation of the whole optic mode \(\lambda_{2}^{\mathrm{o}}\) and of the second half of the acoustic one \(\lambda_{2}^{\mathrm{a}}\), whereas for p=1, it behaves like a Fourier truncation of the second half of the dispersion diagram λ 1. But the low frequencies of the acoustic mode approximate much better the continuous dispersion relation λ(η)=η, η∈ℝ, than the dispersion diagram of the linear approximation. Indeed, as η∼0, λ 1(η)∼η+η 3/24+η 5/1920, whereas \(\lambda_{2}^{\mathrm {a}}(\eta)\sim\eta+\eta^{5}/1440\). According to the results in [3], this improves the convergence rate h 2/3 of the numerical controls towards the continuous ones corresponding to the case p=1 for initial data (y 0,y 1) in the continuous control problem (1) belonging to the more regular space \(H_{0}^{1}\times L^{2}(0,1)\), so that a convergence order h 4/5 is obtained for p=2 under the same regularity assumptions. In Fig. 14, we represent the errors in the numerical controls obtained by the bi-grid filtering at the logarithmic scale for both approximations p=1 and p=2 for the initial position \(\mathbf {H}_{p}^{h}\) approximating the Heaviside function H. The continuous initial data (y 0,y 1)=(H,0)∈H 1/2−ϵ×H −1/2−ϵ(0,1), for any ϵ>0, which is less than the regularity imposed in [3]. Consequently, by interpolation, a natural sharp bound for the convergence orders of the numerical controls should be h 1/3 for p=1 versus h 2/5 for p=2. This is confirmed by our numerical results.

    Fig. 14
    figure 14

    The error \(\|\tilde{v}^{h}_{p}-\tilde{v}\|_{L^{2}(0,T)}\) for p=1 (blue) and p=2 (red) versus the number of degrees of freedom N at the logarithmic scale. In dotted blue/red, we represent N −1/3 and N −2/5 also at the logarithmic scale. Here N takes values from 99 to 999 with increments of 100

All the numerical simulations in this paper are realized under the Matlab environment. The corresponding numerical codes can be found following the link

www.bcamath.org/projects/MTM2008-03541/sim/images/.

4 Conclusions and Open Problems

In this paper, we have discussed and illustrated numerically the high frequency pathological effects of the P 2 approximation of the 1D wave equation, previously analyzed rigorously in [11] in what concerns the boundary observation and control problems. We have also illustrated the efficiency of the bi-grid filtering algorithm in recovering the convergence of the numerical controls and compared the results obtained by this quadratic finite element method with those one recovers by means of the P 1-approximation. Our conclusion is that, after applying the bi-grid filtering, the quadratic approximation leads to more accurate controls than the linear one.

This filtering technique can be easily generalized to higher order finite element approximation methods of waves (p≥3) on uniform meshes, a higher and higher accuracy of the numerical controls being expected. However, the high-frequency effects of the numerical approximations on irregular meshes is a completely open problem.