Introduction

The null controllability problem for various kinds of linear and semilinear parabolic equations has been studied intensively in recent decades; a survey can be found in the monograph [2]. Here, we discuss the null controllability problem for a parabolic equation with hysteresis of the form

$$\begin{aligned} u_t(x,t)-\Delta u(x,t)+\mathcal {F}[u](x,t) = v(x,t), \quad x \in \Omega \subset \mathbb {R}^n, \ t\in (0,T) \end{aligned}$$
(1)

with a hysteresis operator \(\mathcal {F}\), a right-hand side \(v\) called the control, and initial and boundary conditions specified below. Existence, uniqueness, and regularity results for Eq. (1) with a given right-hand side \(v\) can be found in the monograph [7]. The null controllability problem for Eq. (1) consists in proving that, for an arbitrary initial condition and an arbitrary final time \(T\), the control \(v\) can be chosen in a suitable class of functions of \(x\) and \(t\) in such a way that the solution satisfies \(u(\cdot ,T)=0\) a.e. in \(\Omega \).

The first results on the null controllability of Eq. (1) were obtained by F. Bagagiolo in [1]. His technique relies on a linearization followed by a fixed-point procedure, and we briefly comment on it in Sect. 2. We will see that hysteresis operators arising from phase transition modeling cannot be linearized in this way. To establish the null controllability of the system, new techniques based on M. Brokate’s previous works [3, 4] on optimal control of ODEs with hysteresis need to be developed, and this will be done in Sect. 3.

1 The Physical Problem

Consider a bounded connected Lipschitzian domain \(\Omega \subset \mathbb {R}^3\), fix an arbitrary \(T>0\) and define \(Q=\Omega \times (0,T)\), \(\Gamma =\partial \Omega \times (0,T)\). The unknown functions of the space variable \(x\in \Omega \) and time \(t\in [0,T]\) are \(s(x,t)\in [-1,1]\) for the phase parameter (\(s=-1\) solid, \(s=1\) liquid, \(s\in (-1,1)\) mixture), and \(\theta (x,t)>0\) for the absolute temperature.

The system we consider is the following:

$$\begin{aligned} {\left\{ \begin{array}{ll} c\theta _t+Ls_t-\kappa \Delta \theta =h&{}\text {in }Q,\\ \rho s_t+\partial I(s)\ni L(\theta -\theta _c)&{}\text {in }Q,\\ \text {initial and boundary conditions}, \end{array}\right. } \end{aligned}$$
(2)

where \(I\) is the indicator function of the interval \([-1,1]\), \(\partial I\) is its subdifferential, \(h = h(x,t)\) is the heat source density, and the specific heat \(c\), the latent heat \(L\), the heat conductivity \(\kappa \), the phase relaxation parameter \(\rho \), and the critical temperature \(\theta _c\) are given positive constants. In the literature this is known as the relaxed Stefan problem (see, e.g., A. Visintin’s monograph [8]), and it models the phase transition in solid–liquid systems: the first equation is the energy balance, whereas the second one describes the phase dynamics. In particular:

  (i) the smaller \(\rho \) is, the faster the transition takes place. When \(\rho =0\) we get the classical Stefan problem, in which the phase transition is assumed to be instantaneous;

  (ii) when \(\theta >\theta _c\) then \(s_t \ge 0\), which means that the substance is melting; when \(\theta <\theta _c\) then \(s_t \le 0\), which means that the substance is freezing.

We now show that system (2) can be transformed into the form (1). Indeed, we define a new unknown \(u\) by the formula

$$\begin{aligned} u_t=\frac{L}{\rho }(\theta -\theta _c). \end{aligned}$$

Then the phase dynamics equation in (2) is of the form \(s_t+\partial I(s)\ni u_t\). This is nothing but the definition of the stop operator with threshold 1, \(s=\mathfrak {s}[u]\); see Fig. 1.

Fig. 1 Hysteresis loop of the stop operator
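The behaviour of the stop operator can be illustrated on sampled data. The following is a minimal sketch that is not taken from the source: it assumes the input is given on a time grid and uses the usual projection update \(s_k = \min \{1,\max \{-1,\, s_{k-1}+u_k-u_{k-1}\}\}\); the name `stop_operator` is hypothetical.

```python
def stop_operator(u, s0, r=1.0):
    """Time-discrete stop operator with threshold r (illustrative sketch).

    u  : list of input samples u(t_0), u(t_1), ...
    s0 : initial state; it is clipped to [-r, r] if necessary
    Returns the list of output samples s(t_k).
    """
    s = [min(r, max(-r, s0))]
    for k in range(1, len(u)):
        # Follow the input increment, then project back onto [-r, r].
        trial = s[-1] + (u[k] - u[k - 1])
        s.append(min(r, max(-r, trial)))
    return s


# Piecewise monotone input: up to 3, down to -3, back up to 0.
u = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, -1.0, -2.0, -3.0, -2.0, -1.0, 0.0]
s = stop_operator(u, s0=0.0)
```

In this example the input vanishes at three sample times, while the corresponding outputs are \(0\), \(-1\), and \(1\): the output depends on the history of the input and not only on its current value.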

The first equation in (2) thus reads

$$\begin{aligned} \frac{c\rho }{L}u_{tt}+L\mathfrak {s}[u]_t-\frac{\kappa \rho }{L}\Delta u_t=h. \end{aligned}$$

Integrating the above equation in time leads, up to constants, to an equation of the form (1), more specifically,

$$\begin{aligned} \frac{c\rho }{L}u_t+L\mathfrak {s}[u]-\frac{\kappa \rho }{L}\Delta u=v \end{aligned}$$
(3)

with \(\mathcal {F}[u] = \mathfrak {s}[u]\), and with \(v\) containing the time integral of \(h\) and additional terms coming from the initial conditions.
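As a formal illustration of the integration step, assuming enough regularity to evaluate all terms at \(t=0\), integrating the previous equation over \((0,t)\) gives

$$\begin{aligned} \frac{c\rho }{L}\big (u_t(x,t)-u_t(x,0)\big )+L\big (\mathfrak {s}[u](x,t)-\mathfrak {s}[u](x,0)\big )-\frac{\kappa \rho }{L}\big (\Delta u(x,t)-\Delta u(x,0)\big )=\int _0^t h(x,\tau )\,d \tau , \end{aligned}$$

so that one admissible choice of the right-hand side is

$$\begin{aligned} v(x,t)=\int _0^t h(x,\tau )\,d \tau +c\big (\theta (x,0)-\theta _c\big )+L\,\mathfrak {s}[u](x,0)-\frac{\kappa \rho }{L}\Delta u(x,0), \end{aligned}$$

where the identity \(\frac{c\rho }{L}u_t(x,0)=c(\theta (x,0)-\theta _c)\) follows from the definition of \(u\).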

2 Null Controllability by Linearization

The system considered by F. Bagagiolo in [1] is the following:

$$\begin{aligned} {\left\{ \begin{array}{ll} u_t(x,t)-\Delta u(x,t)+\mathcal {F}[u](x,t)=m(x)v(x,t)&{}\text {in }Q,\\ u(x,t)=0&{}\text {on }\Gamma ,\\ u(x,0)=u_0(x)&{}\text {in }\Omega , \end{array}\right. } \end{aligned}$$
(4)

where \(m\) is the characteristic function of a set \(\omega \subset \subset \Omega \) and \(\mathcal {F}:L^2(\Omega ;C^0([0,T])) \longrightarrow L^2(\Omega ;C^0([0,T]))\) is a hysteresis operator satisfying the following condition: there exist two constants \(L>0\) and \(m\in \mathbb {R}\) such that, for all \(z\in L^2(\Omega ;C^0([0,T]))\), for all \(t\in [0,T]\) and for a. e. \(x\in \Omega \)

$$\begin{aligned} |\mathcal {F}[z](x,t)|\le L|z(x,t)|, \end{aligned}$$
(5)
$$\begin{aligned} \text {if }z(x,t)=0\text { then }\lim _{\tau \rightarrow t,\, z(x,\tau )\ne 0} \frac{\mathcal {F}[z](x,\tau )}{z(x,\tau )} = m\text { uniformly in }[0,T]. \end{aligned}$$
(6)

As above, \(v:Q\rightarrow \mathbb {R}\) is the control function which, being multiplied by the characteristic function \(m\), acts only on the subregion \(\omega \) of the original domain. The null controllability of system (4) strongly relies on the following result from V. Barbu’s paper [2].

Theorem 1

(Null controllability in the linear case) Let \(\Omega \subset \mathbb {R}^n\) be an open and bounded domain with boundary of class \(C^2\), let \(\omega \subset \Omega \) be a compactly embedded subset, and let \(a\in L^{\infty }(Q)\) be given. Then for every initial datum \(u_0\in L^2(\Omega )\) there is a control function \(v\in L^2(Q)\) such that the (unique) corresponding solution \(u^v\in C^0([0,T];L^2(\Omega ))\cap L^2(0,T;W^{1,2}_0(\Omega ))\) of

$$\begin{aligned} {\left\{ \begin{array}{ll} u_t(x,t)-\Delta u(x,t)+a(x,t)u(x,t)=m(x)v(x,t)&{}\text {in }Q,\\ u(x,t)=0&{}\text {on }\Gamma ,\\ u(x,0)=u_0(x)&{}\text {in }\Omega , \end{array}\right. } \end{aligned}$$
(7)

satisfies \(u^v(x,T)=0\) a. e. \(x\in \Omega \). Moreover, the control \(v\) can be taken in such a way that

$$\begin{aligned} \Vert v\Vert _{L^2(Q)}\le C\Vert u_0\Vert _{L^2(\Omega )}, \end{aligned}$$

where the constant \(C\) only depends on \(\Vert a\Vert _{L^{\infty }(Q)}\).

Note that V. Barbu’s proof of this result relies on Pontryagin’s Maximum Principle and on Carleman estimates.

In particular, Pontryagin’s Maximum Principle requires the study of the dual system associated with (7), whereas Carleman estimates allow us to bound the \(L^2\)-norm of the dual variable in terms of the \(L^2\)-norm of the control on the subregion \(\omega \times (0,T)\).

F. Bagagiolo’s condition (5)–(6) implies, in particular, that all hysteresis branches pass through the origin. System (4) thus can be reduced to the form (7) with a factor \(a(x,t)\) depending on the unknown function \(u\). The null controllability result is then obtained by a fixed-point argument.

If the operator \(\mathcal {F}\) is the stop operator \(\mathfrak {s}\) appearing in Eq. (3), assumption (5) is not satisfied: it is well known (cf. [7]) that typical hysteresis branches of the stop operator do not pass through the origin, as condition (5) would require; see Fig. 1.

3 A New Approach: Null Controllability by Penalization

The exact values of the physical constants \(c,\rho ,L,\kappa \) are irrelevant for our analysis. We can therefore represent Eq. (3) as a system of the form

$$\begin{aligned} {\left\{ \begin{array}{ll} u_t-\Delta u+s=v&{}\text {in }Q,\\ s_t + \partial I(s) \ni u_t&{}\text {in }Q,\\ \nabla u \cdot n =0&{}\text {on }\Gamma ,\\ u(x,0)=u_0(x)&{}\text {in }\Omega ,\\ s(x,0)=s_0(x)&{}\text {in }\Omega . \end{array}\right. } \end{aligned}$$
(8)

The homogeneous Neumann boundary condition for \(u\) has the physical meaning of a thermally insulated body in the original setting (2). The main result for the system (8) reads as follows.

Theorem 2

Let \(u_0 \in W^{1,2}(\Omega )\cap L^\infty (\Omega )\) and \(s_0 \in L^\infty (\Omega )\) be given with \(|s_0(x)|\le 1\) a.e. Then the system (8) is null controllable, that is, there exists \(v\in L^2(Q)\) such that the corresponding solution \(u^v\in W^{1,2}((0,T);L^2(\Omega ))\cap L^\infty (0,T;W^{1,2}(\Omega ))\) of (8) satisfies \(u^v(\cdot ,T)=0\) a.e. in \(\Omega \).

Note that controls with support restricted to a subdomain \(\omega \subset \Omega \) as in Theorem 1 are not admissible in Theorem 2. This is related to the question of whether Carleman estimates are compatible with the penalty approximation, which will be addressed in future work.

Proof

The argument consists in penalizing the subdifferential \(\partial I\) and replacing the differential inclusion with an ODE. In particular, we choose the penalty function

$$\begin{aligned} \Psi (s) = \left\{ \begin{array}{ll} \phi (s-1) &{} \text { for }\ s>1,\\ 0 &{} \text { for }\ s\in [-1,1],\\ \phi (-s-1) &{} \text { for }\ s<-1, \end{array} \right. \end{aligned}$$
(9)

with a convex \(C^2\)-function \(\phi :[0,\infty ) \rightarrow [0,\infty )\) with quadratic growth, for example

$$ \phi (r) = \left\{ \begin{array}{ll} \frac{1}{6}r^3 &{} \text { for }\ r\in [0,1],\\ \frac{1}{2}r^2 - \frac{1}{2} r + \frac{1}{6} &{} \text { for }\ r>1. \end{array} \right. $$
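The properties required of \(\phi \) and \(\Psi \) can be checked numerically for this example. The sketch below is an illustration only: the function names and the grid are hypothetical, and the derivatives are coded by hand from the formula above. It measures the jumps of \(\phi \), \(\phi '\), \(\phi ''\) across the junction \(r=1\) and tests on a grid that \(\Psi '\) is nondecreasing.

```python
def phi(r):
    """phi(r): cubic on [0, 1], quadratic beyond."""
    return r**3 / 6.0 if r <= 1.0 else 0.5 * r * r - 0.5 * r + 1.0 / 6.0


def dphi(r):
    """phi'(r): r^2/2 on [0, 1], r - 1/2 beyond."""
    return 0.5 * r * r if r <= 1.0 else r - 0.5


def d2phi(r):
    """phi''(r): equals r on [0, 1] and 1 beyond."""
    return r if r <= 1.0 else 1.0


def Psi(s):
    """Penalty (9): zero on [-1, 1], phi(|s| - 1) outside."""
    r = abs(s) - 1.0
    return phi(r) if r > 0.0 else 0.0


def dPsi(s):
    """Psi'(s): odd function vanishing on [-1, 1]."""
    r = abs(s) - 1.0
    if r <= 0.0:
        return 0.0
    return dphi(r) if s > 0.0 else -dphi(r)


# Jumps of phi, phi', phi'' across the junction r = 1 (expected to be tiny).
h = 1e-9
jumps = [abs(f(1.0 + h) - f(1.0)) for f in (phi, dphi, d2phi)]

# Psi' nondecreasing (convexity of Psi), tested on a grid over [-4, 4].
grid = [-4.0 + 0.01 * k for k in range(801)]
monotone = all(dPsi(a) <= dPsi(b) for a, b in zip(grid, grid[1:]))
```

A grid test of this kind is of course no proof; it only guards against sign or matching errors when the formula is transcribed.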

Choosing a small parameter \(\gamma >0\), we replace (8) with a system of one PDE and one ODE for unknown functions \((u,s) = (u^\gamma , s^\gamma )\)

$$\begin{aligned} {\left\{ \begin{array}{ll} u_t-\Delta u+s=v&{}\text {in }Q,\\ s_t+\frac{1}{\gamma }\Psi '(s)=u_t&{}\text {in }Q, \end{array}\right. } \end{aligned}$$
(10)

with the same initial and boundary conditions, and with the intention to let \(\gamma \) tend to \(0\). We choose another small parameter \(\varepsilon > 0\) independent of \(\gamma \) and define the cost functional

$$\begin{aligned} J(u,s,v)=\frac{1}{2}\iint _Q v^2d xd t +\frac{1}{2\varepsilon }\int _\Omega u^2(x,T)d x, \end{aligned}$$

where the two summands represent the cost to implement the control and to reach the desired null final state. Then, for each \(\gamma >0\) we solve the following optimal control problem:

$$\begin{aligned} \text {minimize }J(u,s,v)\text { subject to } (10). \end{aligned}$$
(11)

One can show (see, e.g., Tröltzsch [6]) that for each \(\varepsilon >0\) problem (11) has a unique solution \((u^\gamma _{\varepsilon },s^\gamma _{\varepsilon },v^\gamma _{\varepsilon })\). It is found as a critical point of the Lagrangian

$$\begin{aligned} L(u,s,v)=J(u,s,v)+\langle p,G_1(u,s,v)\rangle +\langle q,G_2(u,s,v)\rangle \end{aligned}$$

where \(p, q\) are Lagrange multipliers, the brackets denote the canonical scalar product in \(L^2(Q)\), and the constraints are

$$\begin{aligned} G_1(u,s,v)=u_t-\Delta u+s-v, \qquad G_2(u,s,v)=s_t+\frac{1}{\gamma }\Psi '(s)-u_t. \end{aligned}$$

The first-order necessary optimality condition for \((u,s,v,p,q)= (u^\gamma _{\varepsilon },s^\gamma _{\varepsilon },v^\gamma _{\varepsilon },p^\gamma _{\varepsilon },q^\gamma _{\varepsilon })\) reads

$$\begin{aligned} v=p \,\, \text {a. e. in }Q, \end{aligned}$$
(12)

and \(p\in W^{1,2}(0,T;L^2(\Omega ))\cap L^\infty (0,T;W^{1,2}(\Omega ))\), \(q\in W^{1,2}(0,T;L^2(\Omega ))\) are the solutions to the backward dual problem

$$\begin{aligned} {\left\{ \begin{array}{ll} p_t +\Delta p - q_t=0&{}\text {in }Q,\\ q_t-\frac{1}{\gamma }\Psi ''(s)q - p = 0&{}\text {in }Q,\\ \nabla p \cdot n =0&{}\text {on }\Gamma ,\\ p(x,T)= -\frac{1}{\varepsilon }u(x,T)&{}\text {in }\Omega ,\\ q(x,T)=0&{}\text {in }\Omega . \end{array}\right. } \end{aligned}$$
(13)
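The conditions (12) and (13) can be recovered by a formal computation, sketched here under the assumptions that all functions are smooth enough to integrate by parts and that the pairings in the Lagrangian are read as integrals over \(Q\). For variations \(\delta u, \delta s, \delta v\) with \(\delta u(\cdot ,0)=\delta s(\cdot ,0)=0\) and \(\nabla \delta u\cdot n=0\) on \(\Gamma \), stationarity of the Lagrangian gives

$$\begin{aligned} 0&=\iint _Q (v-p)\,\delta v\,d xd t,\\ 0&=\frac{1}{\varepsilon }\int _\Omega u(x,T)\,\delta u(x,T)\,d x+\iint _Q \big ((p-q)\,\delta u_t-p\,\Delta \delta u\big )\,d xd t\\&=\iint _Q (-p_t-\Delta p+q_t)\,\delta u\,d xd t+\int _\Omega \Big (p-q+\frac{1}{\varepsilon }u\Big )(x,T)\,\delta u(x,T)\,d x+\int _\Gamma (\nabla p\cdot n)\,\delta u,\\ 0&=\iint _Q \Big (\big (p+\tfrac{1}{\gamma }\Psi ''(s)q\big )\,\delta s+q\,\delta s_t\Big )\,d xd t\\&=\iint _Q \Big (p+\tfrac{1}{\gamma }\Psi ''(s)q-q_t\Big )\,\delta s\,d xd t+\int _\Omega q(x,T)\,\delta s(x,T)\,d x. \end{aligned}$$

Since the variations are arbitrary, the first line yields (12), the last line yields the second equation of (13) together with \(q(x,T)=0\), and the middle identity yields the first equation of (13), the boundary condition \(\nabla p\cdot n=0\), and, using \(q(x,T)=0\), the terminal condition \(p(x,T)=-\frac{1}{\varepsilon }u(x,T)\).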

3.1 Estimates

In order to pass to the limits \(\varepsilon \rightarrow 0, \gamma \rightarrow 0\), we first derive a series of estimates for \((u,s,v,p,q)=(u^\gamma _{\varepsilon }, s^\gamma _{\varepsilon }, v^\gamma _{\varepsilon },p^\gamma _{\varepsilon },q^\gamma _{\varepsilon })\) satisfying the system (10), (12), and (13). In what follows, we denote by \(C\) any constant independent of \(\gamma \) and \(\varepsilon \).

We first multiply the second equation of (13) by \(-\mathrm {sign}(q)\) and integrate from an arbitrary \(t \in [0,T)\) to \(T\) to obtain

$$\begin{aligned} |q(x,t)| + \int _t^T\frac{1}{\gamma }\Psi ''(s(x,\tau )) |q(x,\tau )|d \tau \le \int _t^T|p(x,\tau )| d \tau \quad \text { for a. e. }\ x \in \Omega . \end{aligned}$$
(14)

In the next step, we combine the first and the second equation of (13) to get

$$ p_t + \Delta p -\frac{1}{\gamma }\Psi ''(s)q - p = 0, $$

and multiply the resulting equation by an approximation \( S_n(p)\) of \(-\mathrm {sign}(p)\), say, \(S_n(p) = -\mathrm {sign}(p)\) for \(|p| \ge 1/n\), \(S_n(p) = - np\) for \(|p| < 1/n\). Integrating over \(\Omega \) and letting \(n\) tend to infinity we obtain

$$\begin{aligned} -\frac{d }{d t}\int _\Omega |p(x,t)|d x + \int _\Omega |p(x,t)|d x \le \int _\Omega \frac{1}{\gamma }\Psi ''(s(x,t)) |q(x,t)|d x \ \text { for a. e. } t \in (0,T). \end{aligned}$$
(15)

Integrating (15) first with respect to \(t\) over \((0,\tau )\) and then with respect to \(\tau \) over \((0,T)\), and using the estimate (14), gives a bound for \(p(x,0)\), namely

$$\begin{aligned} \int _\Omega |p(x,0)|d x \le C \int _0^T\int _\Omega |p(x,t)|d xd t. \end{aligned}$$
(16)
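In more detail, and as a sketch of one way to track the constant: integrating (15) over \((0,\tau )\), dropping the nonnegative term \(\int _0^\tau \int _\Omega |p|\) on the left, and using (14) with \(t=0\) integrated over \(\Omega \), we get

$$\begin{aligned} \int _\Omega |p(x,0)|\,d x-\int _\Omega |p(x,\tau )|\,d x \le \int _0^T\int _\Omega \frac{1}{\gamma }\Psi ''(s)\,|q|\,d xd t \le \int _0^T\int _\Omega |p(x,t)|\,d xd t. \end{aligned}$$

Integrating this inequality over \(\tau \in (0,T)\) yields

$$\begin{aligned} T\int _\Omega |p(x,0)|\,d x \le (1+T)\int _0^T\int _\Omega |p(x,t)|\,d xd t, \end{aligned}$$

so that (16) holds with \(C=(1+T)/T\), which is independent of \(\gamma \) and \(\varepsilon \).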

Finally, we multiply the first equation in (10) by \(p\), the second equation in (10) by \(q\), the first equation in (13) by \(u\), the second equation in (13) by \(s\), integrate in space and time, and sum up (note that \(p=v\) by virtue of (12)):

$$\begin{aligned} \begin{aligned}&\int _0^T\int _\Omega p^2d xd t+\frac{1}{\varepsilon }\int _\Omega u^2(x,T)d x = -\int _\Omega u_0(x)p(x,0)d x+\int _\Omega u_0(x)q(x,0)d x\\&\ -\int _\Omega s_0(x)q(x,0)d x+\frac{1}{\gamma }\int _0^T\int _\Omega q\left( \Psi '(s)- s\Psi ''(s)\right) d xd t. \end{aligned} \end{aligned}$$
(17)

The choice (9) of \(\Psi \) guarantees that

$$ |\Psi '(s)- s\Psi ''(s)| \le \frac{3}{2} \Psi ''(s). $$
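For the example \(\phi \) given after (9), this inequality can be verified directly. Put \(r=s-1\) for \(s>1\), so that \(\Psi '(s)=\phi '(r)\) and \(\Psi ''(s)=\phi ''(r)\). Then

$$\begin{aligned} r\in (0,1]:\quad&\Psi '(s)-s\Psi ''(s)=\tfrac{1}{2}r^2-(1+r)r=-r\big (1+\tfrac{1}{2}r\big ),&\big |\Psi '(s)-s\Psi ''(s)\big |\le \tfrac{3}{2}r=\tfrac{3}{2}\Psi ''(s),\\ r>1:\quad&\Psi '(s)-s\Psi ''(s)=r-\tfrac{1}{2}-(1+r)=-\tfrac{3}{2},&\big |\Psi '(s)-s\Psi ''(s)\big |=\tfrac{3}{2}=\tfrac{3}{2}\Psi ''(s). \end{aligned}$$

For \(|s|\le 1\) both sides vanish, and the case \(s<-1\) follows by symmetry, since \(\Psi '\) is odd and \(\Psi ''\) is even.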

Hence, by virtue of (14)–(16), we infer from (17) the estimate

$$\begin{aligned} \int _0^T\int _\Omega p^2(x,t)d xd t+\frac{1}{\varepsilon }\int _\Omega u^2(x,T)d x \le C \int _0^T\int _\Omega |p(x,t)|d xd t \end{aligned}$$
(18)

with a constant \(C\) depending on the \(L^\infty \)-norm of \(u_0\), which, together with Hölder’s inequality, implies in turn that

$$\begin{aligned} \int _0^T\int _\Omega (v^\gamma _\varepsilon )^2(x,t)d xd t +\frac{1}{\varepsilon }\int _\Omega (u^\gamma _\varepsilon )^2(x,T)d x \le C \end{aligned}$$
(19)

with a constant \(C\) independent of \(\gamma \) and \(\varepsilon \).
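The two steps leading from (17) to (19) can be spelled out as follows. Denoting by \(C_0\) the constant in (16) and using (14) with \(t=0\), the bound \(|s_0|\le 1\), and the above inequality for \(\Psi \), the terms on the right-hand side of (17) are estimated by

$$\begin{aligned} \Big |\int _\Omega u_0(x)p(x,0)\,d x\Big |&\le C_0\Vert u_0\Vert _{L^\infty (\Omega )}\iint _Q |p|\,d xd t,\\ \Big |\int _\Omega (u_0(x)-s_0(x))\,q(x,0)\,d x\Big |&\le \big (\Vert u_0\Vert _{L^\infty (\Omega )}+1\big )\iint _Q |p|\,d xd t,\\ \frac{1}{\gamma }\Big |\iint _Q q\big (\Psi '(s)-s\Psi ''(s)\big )\,d xd t\Big |&\le \frac{3}{2}\iint _Q \frac{1}{\gamma }\Psi ''(s)\,|q|\,d xd t\le \frac{3}{2}\iint _Q |p|\,d xd t, \end{aligned}$$

which gives (18). Writing \(X=\big (\iint _Q p^2\,d xd t\big )^{1/2}\), Hölder’s inequality gives \(\iint _Q |p|\,d xd t\le |Q|^{1/2}X\), hence \(X^2\le C|Q|^{1/2}X\) and therefore \(X\le C|Q|^{1/2}\). Inserting this back into (18) and recalling that \(v=p\) by (12), we obtain

$$\begin{aligned} \iint _Q v^2\,d xd t+\frac{1}{\varepsilon }\int _\Omega u^2(x,T)\,d x\le C|Q|^{1/2}X\le C^2|Q|. \end{aligned}$$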

3.2 Passage to the Limit

As a consequence of (19), we see by a standard result on parabolic PDEs that the solutions \(u^\gamma _\varepsilon , s^\gamma _\varepsilon \) of (10) are, for each fixed \(\gamma >0\), bounded in \(W^{1,2}(0,T; L^2(\Omega ))\cap L^\infty (0,T; W^{1,2}(\Omega ))\) independently of \(\varepsilon \). Keeping \(\gamma \) fixed for the moment, letting \(\varepsilon \rightarrow 0\), and using the compact embedding of \(W^{1,2}(0,T; L^2(\Omega ))\cap L^\infty (0,T; W^{1,2}(\Omega ))\) into \(L^2(\Omega ;C([0,T]))\), we conclude that, along a subsequence, for each fixed \(\gamma \) we have

$$\begin{aligned}&v^\gamma _\varepsilon \rightharpoonup v^\gamma _*, \ s^\gamma _\varepsilon \rightarrow s^\gamma _*, \ (s^\gamma _\varepsilon )_t\rightharpoonup (s^\gamma _*)_t \text { in }L^2(Q), \quad \Vert u^\gamma _\varepsilon (x,T)\Vert ^2_{L^2(\Omega )}\rightarrow 0,\\&u^\gamma _\varepsilon \rightarrow u^\gamma _*\text { in }L^2(\Omega ;C([0,T]))\text { and }u^\gamma _*(x,T)=0\text { a. e.} \end{aligned}$$

The convergence \(\gamma \rightarrow 0\) is more delicate. By (19), the controls contain a weakly convergent subsequence

$$\begin{aligned} v^\gamma _*\rightharpoonup v_*\text { in }L^2(Q). \end{aligned}$$

The same parabolic PDE argument as above yields

$$\begin{aligned} (u^\gamma _*)_t\rightharpoonup (u_*)_t\text { in }L^2(Q), \quad u^\gamma _*\rightarrow u_*\text { in }L^2(\Omega ;C([0,T]))\text { with } u_*(x,T)=0. \end{aligned}$$
(20)

It remains to prove that the solutions \(s^\gamma _*\) to the equation

$$ (s^\gamma _*)_t+\frac{1}{\gamma }\Psi '(s^\gamma _*)=(u^\gamma _*)_t $$

converge weakly to \(\mathfrak {s}[u_*]\). To this end, we denote by \(y^\gamma \) the solution of the ODE

$$\begin{aligned} y^\gamma _t+\frac{1}{\gamma }\Psi '(y^\gamma )=(u_*)_t, \quad y^\gamma (x,0) = s_0(x). \end{aligned}$$
(21)

By [5, Theorem 1.12], we have for a. e. \((x,t) \in Q\)

$$\begin{aligned} |s^\gamma _*(x,t) - y^\gamma (x,t)| \le 2\max _{\tau \in [0,t]} |u^\gamma _*(x,\tau ) - u_*(x,\tau )|. \end{aligned}$$
(22)

Multiplying (21) by \(y^\gamma _t\) and integrating over \(Q\) we see that the \(L^2(Q)\)-norm of \(y^\gamma _t\) is bounded independently of \(\gamma \). Hence, up to a subsequence,

$$\begin{aligned} y^\gamma _t\rightharpoonup y_t, \ y^\gamma \rightharpoonup y, \ \frac{1}{\gamma }\Psi '(y^\gamma )\rightharpoonup w \text { in }L^2(Q), \end{aligned}$$
(23)

and it suffices to prove that \(y = \mathfrak {s}[u_*]\). To this end note that \(y\) and \(w\) satisfy the equation

$$\begin{aligned} y_t+ w =(u_*)_t, \quad y(x,0) = s_0(x). \end{aligned}$$
(24)

Furthermore, for every function \(z \in L^\infty (Q)\) we have

$$ \iint _Q y^\gamma z \,d xd t \le \iint _Q |y^\gamma |\, |z| \,d xd t \le \iint _Q (|y^\gamma |-1)^+\, |z| \,d xd t + \iint _Q |z|\,d xd t, $$

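where the first integral on the right-hand side tends to \(0\) as \(\gamma \rightarrow 0\): by (23), the functions \(\frac{1}{\gamma }\Psi '(y^\gamma )\) are bounded in \(L^2(Q)\), so that \(\Psi '(y^\gamma )\rightarrow 0\) in \(L^2(Q)\), which forces \((|y^\gamma |-1)^+\rightarrow 0\) in \(L^1(Q)\). Hence, choosing \(z\) such that \(\iint _Q |z|d xd t \le 1\), we obtain from the weak convergence in (23) that \(\iint _Q y z \,d xd t \le 1\), which in turn implies that \(|y(x,t)| \le 1\) a. e.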

We now multiply (21) by \(y^\gamma \) and (24) by \(y\) and integrate over \(Q\). By virtue of the weak convergence we have \(\int _\Omega y^2(x,T)\,d x \le \liminf _{\gamma \rightarrow 0} \int _\Omega (y^\gamma )^2(x,T)\,d x\), hence

$$ \liminf _{\gamma \rightarrow 0} \frac{1}{\gamma } \iint _Q \Psi '(y^\gamma )y^\gamma \,d xd t \le \iint _Q w y\,d xd t. $$

Since \(\Psi '\) is monotone and vanishes in \([-1,1]\), it follows that

$$ \iint _Q \Psi '(y^\gamma )(y^\gamma - \rho )\,d xd t \ge 0 $$

for every measurable function \(\rho \) such that \(|\rho (x,t)| \le 1\) a. e. Hence, for every \(\rho \) we have

$$ \iint _Q w (y - \rho )\,d xd t \ge 0, $$

which implies that \(y = \mathfrak {s}[u_*]\), and the proof is complete.
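The convergence of the penalized phase equation to the stop operator can also be observed numerically at a fixed point \(x\). The following sketch is not part of the proof and rests on several assumptions: the ODE (21) is discretized by the implicit Euler scheme, each implicit step is solved by bisection, the input is a hypothetical triangular signal with \(|u_t|=1\), and the reference is the time-discrete projection formula for the stop operator. All function names are hypothetical.

```python
def dPsi(s):
    """Psi'(s) for the penalty (9) with the example phi."""
    r = abs(s) - 1.0
    if r <= 0.0:
        return 0.0
    d = 0.5 * r * r if r <= 1.0 else r - 0.5
    return d if s > 0.0 else -d


def stop_operator(u, s0):
    """Time-discrete stop operator with threshold 1 (projection update)."""
    s = [s0]
    for k in range(1, len(u)):
        s.append(min(1.0, max(-1.0, s[-1] + u[k] - u[k - 1])))
    return s


def penalized(u, s0, gamma, dt):
    """Implicit Euler for y_t + Psi'(y)/gamma = u_t, y(0) = s0."""
    c = dt / gamma
    y = [s0]
    for k in range(1, len(u)):
        b = y[-1] + u[k] - u[k - 1]
        if abs(b) <= 1.0:
            y.append(b)  # penalty inactive: the step is explicit
            continue
        # Solve z + c*Psi'(z) = b by bisection; the left-hand side is increasing in z.
        lo, hi = (1.0, b) if b > 0.0 else (b, -1.0)
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if mid + c * dPsi(mid) < b:
                lo = mid
            else:
                hi = mid
        y.append(0.5 * (lo + hi))
    return y


dt = 0.01
# Triangular input 0 -> 3 -> -3 -> 0 with slope +1 or -1.
ramp = list(range(0, 300)) + list(range(300, -300, -1)) + list(range(-300, 1))
u = [dt * j for j in ramp]

s_ref = stop_operator(u, 0.0)
errors = {}
for gamma in (1e-2, 1e-4, 1e-6):
    y = penalized(u, 0.0, gamma, dt)
    # Largest deviation from the stop operator over the whole time grid.
    errors[gamma] = max(abs(a - b) for a, b in zip(y, s_ref))
```

For this input one expects the deviation stored in `errors` to shrink as \(\gamma \) decreases. A heuristic balance \(\frac{1}{\gamma }\phi '(r)=|u_t|\) at the threshold suggests an overshoot of order \(\sqrt{2\gamma }\) here, but this is an observation about the example, not a statement from the source.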