1 Introduction

The notion of ensemble control was proposed by Brockett [8], and further developed in [9, 10], in the context of the trade-off between the complexity of implementing a control strategy and the performance of the control system. For the former, Brockett discusses the concept of minimum attention control, which leads to control costs involving a time derivative of the control function. For the latter, he emphasizes the advantage of considering an ensemble of trajectories, stemming from a distribution of initial conditions, rather than individual trajectories. Based on these two considerations, Brockett concludes that the natural setting for investigating both aspects of the resulting control problem is the Liouville (or continuity) equation, which governs the evolution of the ensemble of trajectories.

The Liouville equation is a hyperbolic-type PDE, which arises in areas of science as diverse as biology, finance, mechanics, and physics; see e.g. [13, 15, 16, 19, 22]. It is often used to model the evolution of density functions representing either the probability density of multiple trials of a single evolving ordinary differential equation (ODE, in brief) system, or the physical (e.g. particle) density of multiple non-interacting systems. In both cases, the function defining the dynamics of the ODE model appears as the drift coefficient of the Liouville equation. Therefore, the problem of controlling a trajectory of a finite-dimensional dynamical system is lifted to the problem of controlling a continuum of dynamical systems with the same control strategy. Specifically, this setting results in the problem of determining a single closed- or open-loop controller, which either applies to a particular system over an infinite number of repeated trials, or steers a family of finite-dimensional dynamical systems. As discussed by Brockett, this approach represents a new control framework that is able to address a number of issues, such as uncertainty in the initial conditions and the trade-off mentioned above.

1.1 The Liouville equation and a control mechanism

Given some time \(T>0\), consider a smooth vector field \(a(t,x)\) over \({{\mathbb {R}}}^d\), where \((t,x)\in [0,T]\times {{\mathbb {R}}}^d\). We refer to a as the drift function. It is well-known that, if a scalar function \(\rho \), defined on \([0,T]\times {{\mathbb {R}}}^d\), satisfies the Liouville equation

$$\begin{aligned} \partial _t\rho (t,x)\, +\, \mathrm{div}\,\bigl ( a(t,x) \, \rho (t,x)\bigr )\, =\,0, \end{aligned}$$
(1.1)

with some (say) smooth initial datum \(\rho _{|t=0}\,=\,\rho _0\), then we can represent \(\rho \) by the formula

$$\begin{aligned} \rho (t,x)\,=\,\frac{1}{\mathrm{det}\,J\big (t,\psi _t^{-1}(x)\big )}\,\rho _0\bigl (\psi _t^{-1}(x)\bigr ), \end{aligned}$$

where \(\psi _t(x)\,=\,\psi (t,x)\) denotes the flow map associated to a, \(J(t,x)\,=\,\nabla _x\psi _t(x)\) is its Jacobian matrix, and \(\psi _t^{-1}(x)\) means the inverse with respect to the space variable, at t fixed. By definition of flow map, \(\psi \) verifies the following system of ODEs:

$$\begin{aligned} \partial _t\psi (t,x)\,=\,a\bigl (t,\psi (t,x)\bigr ), \qquad \psi (0,x)=x. \end{aligned}$$
(1.2)

In view of physical considerations, it is often natural to assume an initial condition \(\rho _0\) verifying \(\rho _0\ge 0\), together with the normalization \(\int _{{{\mathbb {R}}}^d} \rho _0(x)dx=1\). By Eq. (1.1), if a is smooth enough and does not grow too fast at infinity (see e.g. [18]), it is standard to deduce that, for all times \(t\ge 0\), there holds

$$\begin{aligned} \rho (t,x)\,\ge \,0\qquad \text{ and } \qquad \int _{{{\mathbb {R}}}^d} \rho (t,x)\,dx\,=\,\int _{{{\mathbb {R}}}^d} \rho _0(x)dx\,=\,1. \end{aligned}$$

Nonetheless, we remark that most of our results do not require the latter two assumptions on \(\rho _0\).
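
The representation formula above, together with the flow equation (1.2), suggests a simple numerical check: transport a grid of points along the characteristics and push the initial density forward. The following Python sketch (ours and purely illustrative; the drift \(a(t,x)=-x\), the grid and all names are our own choices, not taken from the paper) verifies mass conservation for a one-dimensional Gaussian initial datum.

```python
# A minimal numerical illustration (ours, not part of the paper) of the
# representation formula above: the density is pushed forward along the
# characteristics (1.2). Example: d = 1, drift a(t, x) = -x, Gaussian rho_0.
import numpy as np

def a(t, x):
    return -x                     # smooth drift with (at most) linear growth

def rho0(x):
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

# Solve d psi / dt = a(t, psi), psi(0, x) = x, on a grid of initial points (RK4).
x0 = np.linspace(-8.0, 8.0, 2001)
psi = x0.copy()
t, dt, T = 0.0, 1e-3, 1.0
for _ in range(int(round(T / dt))):
    k1 = a(t, psi)
    k2 = a(t + dt / 2, psi + dt / 2 * k1)
    k3 = a(t + dt / 2, psi + dt / 2 * k2)
    k4 = a(t + dt, psi + dt * k3)
    psi = psi + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    t += dt

# In d = 1, det J(t, x) = d(psi_t)/dx; approximate it by finite differences.
detJ = np.gradient(psi, x0)
rho_T = rho0(x0) / detJ           # values of rho(T, .) at the points psi_T(x0)

# Mass conservation: trapezoidal rule on the transported grid.
mass = np.sum(0.5 * (rho_T[1:] + rho_T[:-1]) * np.diff(psi))
print("mass at time T:", mass)                                  # ~ 1.0
print("flow map check:", np.allclose(psi, x0 * np.exp(-T)))     # exact flow: x e^{-t}
```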

Next, let us discuss the control mechanism we will consider throughout this paper. The focus of ensemble control is the development of a control strategy for the differential model (1.2) augmented with a control mechanism, as follows:

$$\begin{aligned} {\dot{x}} = a(t,x;u), \end{aligned}$$
(1.3)

where u denotes the control function. We refer to [9, 10] for a discussion on the choice of u as a function of time only, which corresponds to a so-called open-loop control, or as a function of time and of the state variable, which may represent a feedback law. In this paper, while considering the controlled Liouville model in a general setting that accommodates both choices, we focus our attention on open-loop optimal control problems: this point of view is motivated by the fact that the most commonly used control mechanisms for (1.3) are the linear and the bilinear ones, as follows:

$$\begin{aligned} a(t,x;u)\,=\,a_0(t,x)\,+\,u_1(t)\,+\, x \circ u_2(t), \end{aligned}$$
(1.4)

where \(a_0\) is a given smooth vector field and \(u\,=\,(u_1,u_2)\) is the control, which, for the purpose of the present discussion, we assume to be smooth. The control \(u_1\) represents a linear control mechanism, and \(u_2\), multiplying the state variable x, represents the bilinear control term. Both functions \(u_1\) and \(u_2\) are defined on the time interval [0, T] with values in \({{\mathbb {R}}}^d\). The symbol \(\circ :{{\mathbb {R}}}^d \times {{\mathbb {R}}}^d \rightarrow {{\mathbb {R}}}^d\) denotes the Hadamard product of two vectors, i.e. their componentwise multiplication.
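
As a small illustrative sketch (ours, not part of the paper), the controlled drift (1.4) can be written as a plain function of the state and of the two control components; the '*' operator below realizes the Hadamard product componentwise. All function and variable names are our own choices.

```python
# A small sketch (illustrative only) of the controlled drift (1.4): the bilinear
# term acts on the state through the Hadamard (componentwise) product.
import numpy as np

def controlled_drift(t, x, a0, u1, u2):
    """a(t, x; u) = a0(t, x) + u1(t) + x o u2(t), with x, u1(t), u2(t) in R^d."""
    return a0(t, x) + u1(t) + x * u2(t)      # '*' is the componentwise product

# Example with d = 2 and a0 = 0: a constant push u1 and a componentwise dilation u2.
a0 = lambda t, x: np.zeros_like(x)
u1 = lambda t: np.array([1.0, 0.0])
u2 = lambda t: np.array([-0.5, -0.5])
print(controlled_drift(0.0, np.array([2.0, 4.0]), a0, u1, u2))   # [ 0. -2.]
```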

Notice that, corresponding to the controlled evolution model (1.3), we have the following controlled Liouville equation:

$$\begin{aligned} \partial _t\rho (t,x)\, +\, \mathrm{div}\,\bigl ( a(t,x;u) \, \rho (t,x)\bigr )\, =\,0. \end{aligned}$$
(1.5)

The Liouville equation offers a convenient framework to accommodate any control mechanism and any possible initial distribution, including multi-modal ones. Indeed, with (1.4) and \(a_0=0\), in the simplest case of a unimodal Gaussian distribution, the Liouville dynamics can be completely described by the first- and second-moment equations, where the control \(u_1\) appears as the main driving force of the mean value of the density, and \(u_2\) determines the evolution of the variance of the density. We refer to [10] for more details on this interpretation. We also remark that the work in [12], which deals with a time-optimal control problem in the framework of differential inclusions, is related to this interpretation.
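
To make this interpretation concrete, take \(a_0=0\) in (1.4), and denote by \(m(t)\,:=\,\int _{{{\mathbb {R}}}^d} x\,\rho (t,x)\,dx\) the mean and by \(\sigma _i^2(t)\,:=\,\int _{{{\mathbb {R}}}^d}\big (x^i-m^i(t)\big )^2\,\rho (t,x)\,dx\) the componentwise variance of the density (the symbols m and \(\sigma _i^2\) are introduced here only for this illustration). Multiplying (1.5) by \(x^i\), respectively by \(\big (x^i-m^i(t)\big )^2\), integrating by parts and using the normalization \(\int \rho \,dx=1\) discussed above, a direct computation gives

$$\begin{aligned} \frac{d}{dt}\,m^i(t)\,=\,u_1^i(t)\,+\,u_2^i(t)\,m^i(t), \qquad \qquad \frac{d}{dt}\,\sigma _i^2(t)\,=\,2\,u_2^i(t)\,\sigma _i^2(t), \qquad 1\le i\le d, \end{aligned}$$

so that \(u_1\) drives the mean of the density, while \(u_2\) governs the exponential growth or decay of its variance.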

As a final comment, we point out that, for the characterization of the solution to our Liouville optimal control problems, we shall deal with (1.5) and with an adjoint Liouville problem, namely a transport problem, given by

$$\begin{aligned} \partial _tq(t,x)\,+\,a\bigl (t,x;u\bigr )\cdot \nabla q(t,x) \,=\,g(t,x), \qquad \hbox { with }\quad q_{|t=0}\, =\,q_0, \end{aligned}$$
(1.6)

where g and \(q_0\) depend on the optimization data.

1.2 Formulation of ensemble optimal control problems

In order to discuss Brockett’s formulation of ensemble control, consider the following ODE optimal control problem:

$$\begin{aligned}&\min j(x,u)\,:=\, \int _0^T \Big ( \theta \big (x(t)\big )\, +\, \kappa \big (u(t)\big ) \Big ) \,dt\, +\, \varphi \big (x(T)\big ) \end{aligned}$$
(1.7)
$$\begin{aligned}&\text{ s.t. } \qquad {\dot{x}}(t)\,= \,a\big (t,x(t);u(t)\big ), \qquad x(0)\,=\,x_0, \end{aligned}$$
(1.8)

where “s.t.” stands for “subject to”. Here, \(\theta \), \(\kappa \) and \(\varphi \) are usually taken to be continuous convex functions of their arguments; we specify their properties more precisely later on.

The optimal control function u is sought in the following set of admissible controls:

$$\begin{aligned} U_{ad}\,: =\, \left\{ u \,\in \, \mathbb {L}^\infty _T(\mathbb {R}^d)\;\bigl |\quad u^a\, \le \, u(t)\, \le \, u^b \qquad \hbox { for a.e. }\; t\,\in \,[0,T]\right\} . \end{aligned}$$
(1.9)

In particular, in the case of (1.4), we have two box constraints \(u^a\,=\,(u_1^a,u_2^a)\) and \(u^b\,=\, (u_1^b,u_2^b)\), where \(u_j^a < u_j^b\), \(j=1,2\), are given vectors in \({{\mathbb {R}}}^{d}\). Clearly, the optimal control function u that solves (1.7)–(1.8) with \(u \in U_{ad}\) depends on the initial condition \(x_0\), which is fixed, and it represents a control strategy that is determined once and for all times for the given \(x_0\) and the given optimization setting. Therefore, no uncertainty on the initial condition is taken into account in the formulation (1.7)–(1.8); hence, from this point of view, the resulting control is not robust. On the other hand, a closed-loop control, say \(u=u(t,x)\), would appropriately control the system based on its actual state, but, as pointed out in [9], the cost of implementing such a control mechanism is often prohibitive and may not be justified in real applications.
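
As an aside, the box constraints in (1.9) act pointwise in time and componentwise in \({{\mathbb {R}}}^d\); the following sketch (ours, not part of the paper's analysis) shows how a discretized control would be projected onto \(U_{ad}\) by simple clipping, a step that is natural, for instance, in gradient-projection schemes. All names are our own choices.

```python
# Illustration (ours) of the admissible set (1.9): the box constraints
# u^a <= u(t) <= u^b hold pointwise in time and componentwise in R^d, so the
# projection of a discretized control onto U_ad is a pointwise clipping.
import numpy as np

def project_onto_Uad(u, ua, ub):
    """u: array of shape (N_t, d) with control values on a time grid;
    ua, ub: lower/upper bounds in R^d. Returns the pointwise projection."""
    return np.clip(u, ua, ub)

u = np.array([[1.5, -2.0], [0.2, 0.4], [-3.0, 5.0]])    # three time samples, d = 2
ua, ub = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
print(project_onto_Uad(u, ua, ub))
```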

With the purpose of striking a balance between the desired performance of the system and the cost of implementing an effective control, the ensemble control strategy considers instead a density of initial conditions, and therefore an ensemble of trajectories. In this way, it aims at achieving robustness, while choosing control costs which promote controls that are easier to implement. Thus, one is led to the formulation of the following ensemble optimal control problem:

$$\begin{aligned}&\min _{u \in U_{ad}} J(\rho ,u)\,:=\,\int _0^T \int _{{{\mathbb {R}}}^d} \theta (x) \, \rho (x,t) \, dx\, dt\, +\, \int _{{{\mathbb {R}}}^d} \varphi (x) \, \rho (x,T) \, dx\, +\, \int _0^T \kappa \big (u(t)\big ) \, dt \end{aligned}$$
(1.10)
$$\begin{aligned}&\hbox {s.t.} \qquad \partial _t\rho \, +\, \mathop {\mathrm{div}}\bigl ( a(t,x;u) \, \rho \bigr )\, =\,0, \qquad \rho _{|t=0}\,=\, \rho _0. \end{aligned}$$
(1.11)

This problem is defined on the space-time cylinder \(\mathbb {R}^d \times [0,T]\), for some \(T >0\) fixed. In this formulation, the initial density \(\rho _0\) represents the probability distribution of the initial condition \(x_0\) in (1.7)–(1.8), and thus it models the known uncertainty on the initial data.

Next, we discuss some specific choices of the optimization components in (1.7)–(1.8), and correspondingly in (1.10)–(1.11).

For example, if \(x=0\) is a critical point for (1.8), which requires \(a(t,0;u)=0\), then the choice \(\theta (x)=x^2\) appears standard for stabilization purposes. Usually, in this context, the so-called \(L^2\) cost of the control is considered, which corresponds to the choice \(\kappa (u)=\gamma \, u^2\), where \(\gamma >0\) is the weight of the cost of the control. On the other hand, if the purpose of the control in (1.7)–(1.8) is to track a desired, possibly non-attainable, trajectory \(x_d \in L^2(0,T;{{\mathbb {R}}}^d)\), and to come close to a given final configuration \(x_T \in {{\mathbb {R}}}^d\) at the final time (possibly with \(x_d(T) \ne x_T\)), then a natural choice is \(\theta \big (x(t)\big )=\alpha \big (x(t) - x_d(t)\big )^2\) and \(\varphi \big (x(T)\big )=\beta \big (x(T)-x_T\big )^2\), with appropriately chosen weights \(\alpha , \,\beta >0\). Notice, however, that the role of these functions is to define an attracting potential (i.e. a well centred at a minimum point, such that the negative gradient of the potential is directed towards this minimum): hence, other choices are possible.

As discussed in [8,9,10], the choice of the cost function \(\kappa \) should be such that the effort of implementing the control strategy is as small as possible. In this sense, the cost of implementing a slowly varying control function, and (we add) a control that does not act for all times, should be smaller than that corresponding to a control having large variations. From this perspective, a constant input that controls the system is the cheapest choice, and the next possible choice is a control that slowly changes in time. This requirement leads naturally to a cost of the form

$$\begin{aligned} \nu \, \int _0^T \left( \frac{d u}{d t}(t) \right) ^2 \, dt, \end{aligned}$$

where \(\nu \ge 0\). In fact, as \(\nu \) is taken larger, the resulting optimal control has a time derivative of smaller magnitude, that is, it is a slowly varying control, which is called “minimum attention control” in [8].

More recently, there has been a surge of interest in \(L^1\)-costs, originating from signal reconstruction and magnetic resonance imaging [11]. This cost is given by

$$\begin{aligned} \delta \, \int _0^T \left| u(t) \right| \, dt, \end{aligned}$$

where \(\delta \ge 0\). The effect of this cost is that it promotes sparsity of the control function, in the sense that, as \(\delta >0\) is increased, the u resulting from the minimisation procedure will be zero on open intervals in \(\,]0,T[\,\), and these intervals become larger and eventually cover all of \(\,]0,T[\,\) as \(\delta \rightarrow +\infty \). In the present paper, we introduce the \(L^1\)-cost in the context of ensemble control and call the resulting sparse control a “minimum action control”.

All together, we specify the term \(\int _0^T \kappa \big (u(t)\big )\,dt\) in (1.7) and in (1.10) as follows:

$$\begin{aligned} \kappa \big (u(t)\big )\, :=\, \frac{\gamma }{2} \, \big ( u(t) \big )^2\, +\, \delta \, \left| u(t) \right| \, +\, \frac{\nu }{2} \, \left( \frac{d u}{d t} (t)\right) ^2, \end{aligned}$$
(1.12)

where \(\gamma + \delta + \nu > 0\) and the factor 1/2 is chosen for convenience in later calculations. Notice that different choices of the values of the coefficients \(\gamma , \, \delta , \, \nu \) result in different features of the resulting optimal control function.
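
For a discretized control on a uniform time grid, the three contributions in (1.12) can be evaluated as in the following rough sketch (ours; the quadrature rule, the forward differences and all names are our own assumptions, not taken from the paper):

```python
# A rough sketch (our own discretization, not the paper's) of the control cost
# (1.12) on a uniform time grid: L^2 term, sparsity-promoting L^1 term, and the
# derivative term penalizing time variations ("minimum attention").
import numpy as np

def control_cost(u, dt, gamma, delta, nu):
    """u: array of shape (N_t, d) with u(t_i); dt: time step.
    Returns an approximation of int_0^T kappa(u(t)) dt."""
    l2_term = 0.5 * gamma * np.sum(u**2) * dt
    l1_term = delta * np.sum(np.abs(u)) * dt
    du_dt   = np.diff(u, axis=0) / dt              # forward differences in time
    h1_term = 0.5 * nu * np.sum(du_dt**2) * dt
    return l2_term + l1_term + h1_term

T, N, d = 1.0, 200, 2
t = np.linspace(0.0, T, N)
u = np.stack([np.sin(2 * np.pi * t), np.zeros_like(t)], axis=1)
print(control_cost(u, T / (N - 1), gamma=1.0, delta=0.1, nu=0.01))
```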

1.3 Goals of the paper and overview of the main results

The purpose of this paper is to give a solid theoretical basis to Brockett’s ensemble control based on Liouville models. For this, we present a rigorous investigation of a class of Liouville optimal control problems with unbounded coefficients and cost functionals that are formulated in terms of the density and of different control costs. To the best of our knowledge, no similar investigation is available in the literature yet.

The first step of our analysis, carried out in Sect. 2, consists in investigating the well-posedness of continuity and transport equations with an unbounded drift function having the structure (1.4). We do not strive for minimal regularity hypotheses on the drift vector field a, and frame our work within a setting that can be considered classical. We refer to e.g. Chapter 3 of [4] for the case of bounded drifts, and to the cornerstone paper [18] for the case of unbounded drifts having at most linear growth at infinity (see also [1,2,3, 17] and references therein for recent advances).

However, in order to give full rigorous justification to Brockett’s formulation, including the presence of quadratic potentials in the cost functional, we need to extend (in Sect. 2.2) the classical well-posedness theory to a class of weighted Sobolev spaces \(H^m_k\), see Definition 2.1 below. Roughly speaking, a tempered distribution \(\rho \in H^m\) belongs to \(H^m_k\) if \(\rho \) and all its derivatives up to order m belong to the weighted space \(\big (L^2({{\mathbb {R}}}^d),(1+|x|)^k\,dx\big )\). Such an extension, which is natural in our context, seems to be new in the literature. We point out also that existence, uniqueness and regularity properties are derived in this context by standard arguments: the key point of the analysis reduces to showing suitable a priori estimates on the solutions in weighted norms.

In passing, we mention that the well-posedness theory in weighted spaces can be adapted with no special difficulties to \(L^p\)-based spaces, for any \(1\le p<+\infty \). In addition, we believe that the \(H^m_k\) theory should also generalise to general hyperbolic systems which are symmetrizable in the sense of Friedrichs (see e.g. [6, 21]). However, extensions of the present work (well-posedness and investigation of optimal control problems) in both of these directions go beyond the scope of our paper, and we leave them for further studies.

After establishing well-posedness of the Liouville equation in a suitable framework, we turn to the related optimal control problem. The main novelties of this part are the following: the adoption of Brockett’s problem setting and of a non-smooth functional framework; the fact that we deal with optimal control problems governed by a hyperbolic PDE; the control mechanism, which acts in the coefficients of the principal part of the partial differential operator.

We follow a standard scheme. First of all, in Sect. 3 we define the Liouville control-to-state map G, namely the map which associates to any control state u the unique solution \(\rho \,=\,G(u)\) of the corresponding Liouville equation, and study its main properties. A fundamental issue in this part is to show Fréchet differentiability (in a suitable topology) of G: our method to prove this property (see Sect. 3.2) relies on performing stability estimates on the Liouville equation. Now, dealing with the growth in space of our drift function at \(+\infty \) requires the use of weighted norms; moreover, due to the hyperbolicity of transport and continuity equations, a loss of regularity occurs, which requires us to consider both higher smoothness and higher integrability of the initial data (namely, both \(m\ge 2\) and \(k\ge 2\)).

Then, in Sect. 4, we complete the investigation of the ensemble optimal control problem in the case of attracting potentials which are, moreover, in \(L^2\); the adaptations needed to treat the case of quadratic potentials are mentioned in Sect. 4.4. The first step consists in establishing (see Theorem 4.1) the existence of optimal controls. Then, we characterise these optimal controls as solutions of a first-order optimality system, which can be interpreted in terms of the Fréchet differential of the reduced functional \(\widehat{J}(u)\,:=\,J\big (u,G(u)\big )\). We remark that the differentiability properties of J (and \(\widehat{J}\)) change radically depending on the choice of the optimization weights. For instance, if \(\gamma >0\) and \(\delta =\nu =0\), then the optimization space is \(L^2(0,T)\) and we have Fréchet differentiability of the cost functional. This is the “standard” case. If instead \(\delta > 0\), then we have a semi-smooth optimal control problem and we have to resort to the use of sub-differentials. Finally, if \(\nu >0\), then \(H^1(0,T)\) is the appropriate control space, and the optimality condition accounts for this fact. If all weights are positive and control constraints are present, we have an optimal control problem whose structure (to the best of our knowledge) has never been investigated in PDE optimization. For this general case, we prove (see Theorem 4.2) the existence of Lagrange multipliers, and derive the optimality system.

In Sect. 4.3, we address the uniqueness of optimal ensemble controls, in the special case \(\gamma >0\) and \(\delta =\nu =0\). More precisely, in Theorem 4.3 we show uniqueness of optimal controls for the control-constrained problem, provided a smallness condition is satisfied; such a condition requires the time T and the size of the data \(\rho \), g, \(\theta \) and \(\varphi \) to be small enough, or the coefficient \(\gamma \) to be sufficiently large. This part of the analysis exploits in a fundamental way the optimality system previously derived, and the characterisation of optimal controls as solutions to it.

1.3.1 Notation

In this section, we present the notation that we use throughout the paper.

Given a domain \(\Omega \subset {{\mathbb {R}}}^d\), the symbol \(C_c^\infty (\Omega )\) denotes the space of infinitely often differentiable functions with compact support in \(\Omega \). Given \(k\in {{\mathbb {N}}}\), we denote by \(C^k(\Omega )\) the space of all k-times continuously differentiable functions defined on \(\Omega \), and by \(C_b^k(\Omega )\) the subspace of \(C^k(\Omega )\) formed by functions which are uniformly bounded together with all their derivatives up to the order k. We equip \(C_b^k(\Omega )\) with the \(W^{k,\infty }\)-norm as follows: \(\left\| v \right\| _{C^{k}_b}\, :=\, \sum _{|\alpha | \le k} \left\| D^\alpha v \right\| _{L^\infty }\).

For \(\alpha \in \,]0,1]\,\), we denote with \(C^{0,\alpha }(\Omega )\) the classical Hölder space (Lipschitz space if \(\alpha =1\)), endowed with the norm \(\left\| \Phi \right\| _{C^{0,\alpha }}\,:=\,\sup _{x \in \Omega } |\Phi (x)|+\sup \big (|\Phi (x)-\Phi (y)|\,/\,|x-y|^\alpha \big )\), where the \(\sup \) is taken over all \(x\ne y\in \Omega \) such that \(|x-y|<1\). In particular, \(C^{0,1}(\Omega )\,\equiv \,W^{1,\infty }(\Omega )\).

For \(k \in \mathbb {N}\) and \(1 \le p \le +\infty \), we denote with \(W^{k,p}(\Omega )\) the usual Sobolev space of \(L^p\) functions with all the derivatives up to the order k in \(L^p\); we also set \(H^k(\Omega ):=W^{k,2}(\Omega )\). For \(1\le p < +\infty \), let \(W^{-k,p}(\Omega )\) denote the dual space of \(W^{k,p}(\Omega )\). For any \(p\in [1,+\infty ]\), the space \(L^p_{loc}(\Omega )\) is the set formed by all functions which belong to \(L^p(\Omega _0)\), for any compact subset \(\Omega _0\) of \(\Omega \).

Furthermore, we make use of the so-called Bochner spaces. Given two Banach spaces X and Y and a fixed time \(T>0\), we define \(X_T(Y)\, := \,X\bigl ([0,T];Y\bigr )\), endowed with the norm \(\left\| u \right\| _{X_T(Y)}\,:=\,\bigl \Vert \left\| u(\cdot ) \right\| _Y \bigr \Vert _{X([0,T])}\); for instance, \(\left\| u \right\| _{L^1_T(Y)}\,=\,\int _0^T \left\| u(t) \right\| _Y\, dt\).

Given a Banach space X and a sequence \(\bigl (\Phi _n\bigr )_n\), we use the notation \(\bigl (\Phi _n\bigr )_n \prec X\) meaning that \(\Phi _n \in X\) for all \(n \in \mathbb {N}\) and that this sequence is uniformly bounded in X: there exists some constant \(M>0\) such that \(\left\| \Phi _n \right\| _X \le M~\forall n \in \mathbb {N}\).

Given two Banach spaces X and Y, the space \(X\cap Y\), endowed with the norm \(\Vert \cdot \Vert _{X\cap Y}\,:=\,\Vert \cdot \Vert _{X}\,+\,\Vert \cdot \Vert _{Y}\), is still a Banach space.

For every \(p\in [1,+\infty ]\), we use the notation \(\mathbb {L}^p_T(\mathbb {R}^d)\,:=\, L^p_T(\mathbb {R}^d) \times L^p_T(\mathbb {R}^d)\). Analogously, \(\mathbb {H}^1_T(\mathbb {R}^d)\,:=\, H^1_T(\mathbb {R}^d) \times H^1_T(\mathbb {R}^d)\). In addition, given two vectors u and v in \({{\mathbb {R}}}^d\), we write \(u\le v\) if the inequality is satisfied component by component by the two vectors: namely, \(u^i\le v^i\) for all \(1\le i\le d\).

Given two operators A and B, we use the standard symbol \([A,B]\) to denote their commutator: \([A,B]\,:=\,AB-BA\).

2 Theory of Liouville and transport equations with unbounded drifts

In this section, we present results concerning the well-posedness theory of Liouville and transport equations in the class of Sobolev spaces. In view of formula (1.4), we are especially interested in the case when the drift function a may be unbounded, but has at most linear growth at infinity.

In Sect. 2.1, we review the well-posedness theory in classical \(H^m\) spaces, for \(m\in {{\mathbb {N}}}\) (for simplicity); we do not present all the proofs here, and refer to e.g. [1, 17, 18] for the details and more general results. Afterwards in Sect. 2.2, motivated by the study of our optimal control problem, we extend these results to weighted Sobolev spaces.

2.1 Classical theory of Liouville and transport equations

We start our discussion by considering the Liouville equation. Notice that our statements can be repeated in a very similar way (with just slight modifications) also for the adjoint Liouville problem, namely the transport equation: we treat this case in Paragraph 2.1.2, without giving details.

2.1.1 Liouville equations in classical Sobolev spaces

Consider the following Liouville initial-value problem

$$\begin{aligned} \left\{ \begin{array}{ll} \partial _t\rho \, + \mathop {\mathrm{div}}\bigl ( a(t,x) \, \rho \bigr )\, = \, g(t,x) \qquad \qquad &{} \text{ in } \qquad [0,T]\times {{\mathbb {R}}}^d \\ \rho _{|t=0}\,=\, \rho _0 \qquad \qquad &{} \text{ on } \qquad {{\mathbb {R}}}^d. \end{array} \right. \end{aligned}$$
(2.1)

When attempting to solve Eq. (2.1), we have in mind its weak formulation. Namely, for all \(\phi \, \in \, C_c^\infty \bigl (\mathbb {R}^d \times [0, T[\,\bigr )\), we want to verify the following equality:

$$\begin{aligned} -\int _0^T\int _{\mathbb {R}^d}\rho \,\partial _t \phi \,dx\,dt\,-\,\int _0^T \int _{\mathbb {R}^d}\rho \, a\cdot \nabla \phi \,dx\,dt\,=\, \int _0^T \int _{\mathbb {R}^d}g\,\phi \,dx\,dt\, +\, \int _{\mathbb {R}^d} \rho _0\,\phi (0)\,dx. \end{aligned}$$
(2.2)

The theory for this equation is classical, at least in the case of a bounded drift function a. The following well-posedness result is adapted to our needs from Theorem 3.19 in [4].

Theorem 2.1

Let us fix \(T>0\) and \(m \in \mathbb {N}\), and let \(a \in L^1\bigl ([0,T];C_b^{m+1}(\mathbb {R}^d)\bigr )\), \(\rho _0 \in H^m(\mathbb {R}^d)\) and \(g \in L^1\bigl ([0,T];H^m(\mathbb {R}^d)\bigr )\).

Then there exists a unique weak solution \(\rho \) to (2.1), with \(\rho \,\in \,C\bigl ([0,T];H^{m}(\mathbb {R}^d)\bigr )\). Moreover, there exists a “universal” constant \(C>0\), independent of \(\rho _0\), a, g, \(\rho \) and T, such that the following estimate holds true for any \(t\in [0,T]\):

$$\begin{aligned} \left\| \rho (t) \right\| _{H^m}\, \le \,C \left( \left\| \rho _0 \right\| _{H^m}\, +\, \int _0^t \left\| g(\tau ) \right\| _{H^m}\, d\tau \right) \;\exp \left( C\,\int _0^t\left\| \nabla a(\tau ) \right\| _{C^{m}_b}\,d\tau \right) . \end{aligned}$$

Remark 2.1

In the case \(m=0\), one can replace \(\left\| \nabla a \right\| _{C^{0}_b}\) with \(\Vert \mathrm{div}\,a\Vert _{L^\infty }\) inside the integral in the exponential term.

Motivated by the study of our optimal control problem, see Sect. 1.1 and especially formula (1.4), we are rather interested in the case when a may be unbounded, with at most linear growth at infinity. More precisely, given \(m\in {{\mathbb {N}}}\), we assume

$$\begin{aligned} {\left\{ \begin{array}{ll} g \,\in \,L^1\bigl ([0,T];H^m(\mathbb {R}^d)\bigr ) \quad \text{ and } \quad \rho _0 \,\in \, H^m(\mathbb {R}^d) \\ a\, \in \, L^1\bigl ([0,T];C^{m+1}({{\mathbb {R}}}^d)\bigr ),\qquad \text{ with } \quad \nabla a\,\in \,L^1\bigl ([0,T];C_b^{m}({{\mathbb {R}}}^d)\bigr ). \end{array}\right. } \end{aligned}$$
(2.3)

Remark 2.2

Notice that hypotheses (2.3) imply, in particular, that \(a(t,\cdot )\) has at most linear growth in space at infinity: for almost every \((t,x)\,\in \,[0,T]\times {{\mathbb {R}}}^d\), one has

$$\begin{aligned} \left| a(t,x)\right| \,\le \,C\,c(t)\,(1+|x|),\qquad \qquad \hbox { for }\qquad c\,=\,\left\| \nabla a\right\| _{L^\infty }\,\in \,L^1\bigl ([0,T]\bigr ). \end{aligned}$$

The condition of at most linear growth at infinity can be proved to be, in some sense, sharp for well-posedness; see e.g. [17, 18] and the references therein.

The main result of this section is the following statement, proved by DiPerna and Lions in [18] (see also [17]). Nonetheless, we give here a self-contained presentation of its proof.

Theorem 2.2

Let \(T>0\) and \(m\in {{\mathbb {N}}}\) be fixed, and let a, \(\rho _0\) and g satisfy hypotheses (2.3).

Then there exists a unique solution \(\rho \,\in \,C\bigl ([0,T];H^{m}(\mathbb {R}^d)\bigr )\) to problem (2.1). Moreover, there exists a “universal” constant \(C>0\), independent of \(\rho _0\), a, g, \(\rho \) and T, such that the following estimate holds true for any \(t\in [0,T]\):

$$\begin{aligned} \left\| \rho (t) \right\| _{H^m}\, \le \,C \left( \left\| \rho _0 \right\| _{H^m}\, +\, \int _0^t \left\| g(\tau ) \right\| _{H^m}\, d\tau \right) \;\exp \left( C\,\int _0^t\left\| \nabla a(\tau ) \right\| _{C^{m}_b}\,d\tau \right) . \end{aligned}$$
(2.4)

We notice that Remark 2.1 applies also in this case.

The rest of this paragraph is devoted to a sketch of the proof of Theorem 2.2. As for existence, we derive it from Theorem 2.1, after truncation, approximation and passage to the limit in the truncation parameter. We conclude by discussing time regularity and uniqueness issues.

Existence. The first step is to construct a suitable truncation of the drift function. For this purpose, let us introduce a smooth cut-off function \(\chi \in C^\infty _c(\mathbb {R}^d)\) such that \(\chi \) is radially decreasing, \(\chi (x)=1\) for \(|x|\le 1\) and \(\chi (x)=0\) for \(|x|\ge 2\). For all real \(M>0\), we define

$$\begin{aligned} a_M(t,x)\,:=\, \chi \Big ( \frac{x}{M} \Big )\,a(t,x). \end{aligned}$$
(2.5)

Notice that, by assumptions (2.3), we immediately get that \(a_M\,\in \,L^1_T(C^{m+1}_b)\) for all \(M>0\). Moreover, in view of Remark 2.2, an easy computation shows that

$$\begin{aligned} \bigl (\nabla a_M\bigr )_M\,\prec \,L^1_T(C^m_b),\qquad \qquad \text{ with } \qquad \left\| \nabla a_M\right\| _{L^1_T(L^\infty )}\,\le \,C, \end{aligned}$$
(2.6)

for a constant \(C>0\) independent of M. Indeed, denoting by \(\mathbb {1}_A\) the characteristic function of a set \(A\subset {{\mathbb {R}}}^d\) and by \(B(x,R)\) the ball in \({{\mathbb {R}}}^d\) of center x and radius \(R>0\), we can compute

$$\begin{aligned} \left\| \nabla a_M\right\| _{L^\infty }\,&=\,\left\| \frac{1}{M}\,\nabla \chi \Big (\frac{x}{M} \Big )\,a\, +\, \chi \Big (\frac{x}{M} \Big )\,\nabla a\right\| _{L^\infty }\,\\&\le \, C\,\frac{1}{M}\,\left\| a\,\mathbb {1}_{B(0,2M)}\right\| _{L^\infty }\,+\,\left\| \nabla a\right\| _{L^\infty }\;\le \,C. \end{aligned}$$

The bounds for higher order derivatives follow the same lines, after noticing that, at each order of differentiation, we gain a factor 1/M in front of a.
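
The uniform bound (2.6) can also be checked numerically; the following sketch (ours, purely illustrative; the particular cutoff profile and the drift \(a(x)=x\) are arbitrary choices in dimension \(d=1\)) shows that the gradients of the truncated drifts \(a_M\) remain of order one, independently of M.

```python
# Numerical sanity check (ours, not from the paper) of the uniform bound (2.6):
# for a drift with linear growth, a(x) = x in d = 1, the truncated drifts
# a_M(x) = chi(x / M) a(x) have gradients bounded uniformly in M.
import numpy as np

def h(s):
    return np.where(s > 0, np.exp(-1.0 / np.maximum(s, 1e-300)), 0.0)

def chi(r):
    r = np.abs(r)
    return h(2.0 - r) / (h(2.0 - r) + h(r - 1.0))   # = 1 for |r|<=1, = 0 for |r|>=2

def a(x):
    return x                                         # |a(x)| <= 1 + |x|

for M in [1.0, 10.0, 100.0, 1000.0]:
    x = np.linspace(-3.0 * M, 3.0 * M, 200001)
    aM = chi(x / M) * a(x)
    grad_aM = np.gradient(aM, x)
    print(f"M = {M:7.1f}   sup |a_M'| ~ {np.max(np.abs(grad_aM)):.3f}")
```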

At this point, for each fixed \(M>0\), we can consider the truncated problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t \rho \, +\, \mathop {\mathrm{div}}\left( a_M\,\rho \right) \, =\, g \\ \rho _{|t=0}\, =\, \rho _0, \end{array}\right. } \end{aligned}$$
(2.7)

which possesses a unique weak solution \(\rho _M\,\in \,C\bigl ([0,T];H^m({{\mathbb {R}}}^d)\bigr )\), by virtue of Theorem 2.1. Moreover, each \(\rho _M\) satisfies the energy estimate (2.4), up to replacing a by \(a_M\). Thus, we have

$$\begin{aligned} \left\| \rho _M(t) \right\| _{H^m}\, \le \,C \left( \left\| \rho _0 \right\| _{H^m}\, +\, \int _0^t \left\| g(\tau ) \right\| _{H^m}\, d\tau \right) \;\exp \left( C\,\int _0^t\left\| \nabla a_M(\tau ) \right\| _{C^{m}_b}\,d\tau \right) . \end{aligned}$$
(2.8)

Thanks to property (2.6), we deduce the uniform bounds \(\bigl (\rho _M\bigr )_M\,\prec \,L^\infty \bigl ([0,T];H^m({{\mathbb {R}}}^d)\bigr )\). As a consequence, we obtain the existence of a \(\rho \,\in \,L^\infty _T(H^m)\) such that, up to the extraction of a subsequence, one has \(\rho _M\,{\mathop {\rightharpoonup }\limits ^{*}}\,\rho \) in \(L^\infty _T(H^m)\).

Our next goal is to show that \(\rho \) actually solves problem (2.1) in the weak form, see (2.2). For this purpose, we need to pass to the limit, for \(M\rightarrow +\infty \), in the weak formulation of (2.7): for any \(\phi \in C^\infty _c\bigl (\mathbb {R}^d \times [0,T[\,\bigr )\), we have

$$\begin{aligned}&-\int _0^T \int _{\mathbb {R}^d}\rho _M\, \partial _t \phi \,dx\,dt \,-\, \int _0^T \int _{\mathbb {R}^d}\rho _M\, a_M \cdot \nabla \phi \,dx\,dt \nonumber \\&\quad = \int _0^T \int _{\mathbb {R}^d}g\, \phi \,dx\,dt\, +\, \int _{\mathbb {R}^d}\rho _0 \,\phi (0)\,dx. \end{aligned}$$
(2.9)

Of course, it is enough to prove the convergence in the case of minimal regularity, i.e. for \(m=0\). Thus, we restrict to this case in the next argument.

The only term which presents some difficulties in (2.9) is the “non-linear” term \(\rho _M\,a_M\): its convergence is based on the next lemma, whose proof is elementary and hence omitted.

Lemma 2.1

For every compact set \(K \subset \mathbb {R}^d\), there holds

$$\begin{aligned} \left\| a_M -a\right\| _{L^1_T(L^\infty (K))}\, \longrightarrow \, 0\qquad \qquad \text{ as } \quad M\,\rightarrow \,+\infty . \end{aligned}$$

Let now K denote the support in x of \(\phi \), where \(\phi \) is the test function appearing in (2.9). Thanks to the uniform bounds, to the strong convergence of \(a_M\) to a in \(L^1_T\bigl (L^\infty (K)\bigr )\) (given by Lemma 2.1) and to the weak-\(*\) convergence of \(\rho _M\) to \(\rho \) in \(L^\infty _T(L^2)\), it is an easy exercise to deduce that \(\bigl (\rho _M\,a_M\bigr )_M\) is uniformly bounded in \(L^1_T\bigl (L^2(K)\bigr )\), and \(\rho _M\,a_M\,{\mathop {\rightharpoonup }\limits ^{*}}\,\rho \,a\) in that space, in the limit \(M\rightarrow +\infty \).

In the end, we have proved that the limit function \(\rho \) is a weak solution to (2.1). Observe that, thanks to (2.8), the uniform bounds (2.6) and lower semicontinuity of the norm, we also deduce that \(\rho \) verifies the energy estimate (2.4).

Time regularity and uniqueness. It remains to prove uniqueness of solutions and their time regularity. They are both consequences of the next proposition.

Proposition 2.1

Let \(T>0\) and take \(m\in {{\mathbb {N}}}\). Let \(\rho \,\in \,L^\infty _T(H^m)\) be a weak solution to Eq. (2.1) under hypotheses (2.3).

Then \(\rho \,\in \,C\bigl ([0,T];H^m({{\mathbb {R}}}^d)\bigr )\) and it verifies the energy estimate (2.4).

We present the proof of the previous claim just in the minimal regularity case, namely for \(m=0\); the general case follows by the same arguments. To start with, let us state a classical lemma (see e.g. [1, 18] for details), whose proof is therefore omitted.

For later use, let us fix a function \(s\,\in \,C^\infty _c({{\mathbb {R}}}^d)\), with \(s\equiv 1\) for \(|x|\le 1\) and \(s\equiv 0\) for \(|x|\ge 2\), s radially decreasing and such that \(\int _{{{\mathbb {R}}}^d}s\,=\,1\). For all \(n\in {{\mathbb {N}}}\), we then define \(s_n(x)\,:=\,n^d\,s(nx)\). We refer to the family \(\big (s_n\big )_n\) as a family of standard mollifiers.
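
For concreteness, a one-dimensional discrete realization of this construction might look as follows (a sketch under our own choices of profile, grid and evaluation point; none of it is taken from the paper):

```python
# A discrete illustration (d = 1, ours) of the mollifiers s_n(x) = n^d s(n x)
# and of the smoothing S_n rho = s_n *_x rho, evaluated at a single point.
import numpy as np

dx = 2e-3
x  = np.arange(-8.0, 8.0, dx)

def bump(y):                      # smooth profile supported in [-2, 2]
    return np.where(np.abs(y) < 2.0,
                    np.exp(-1.0 / np.maximum(4.0 - y**2, 1e-300)), 0.0)

Z = np.sum(bump(x)) * dx          # normalization so that int s = 1
def s(y):
    return bump(y) / Z

rho = np.sign(x)                  # a discontinuous "density", for illustration
x0  = 2.5                         # evaluation point away from the jump of rho
for n in [1, 4, 16]:
    s_n = n * s(n * x)                            # s_n on the grid
    kernel = n * s(n * (x0 - x))                  # y -> s_n(x0 - y)
    print(n, np.sum(s_n) * dx,                    # mass of s_n ~ 1
          np.sum(rho * kernel) * dx)              # (S_n rho)(x0) ~ rho(x0) = 1
```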

Lemma 2.2

Let \(\bigl (s_n\bigr )_n\) be a family of standard mollifiers, as constructed here above. For all \(n\in {{\mathbb {N}}}\), define the operator \(S_n\), acting on tempered distributions over \({{\mathbb {R}}}_+\times {{\mathbb {R}}}^d\), by the formula

$$\begin{aligned} S_n\rho \,:=\,s_n\,*_x\,\rho , \end{aligned}$$

where the symbol \(*_x\) means that the convolution is taken only with respect to the space variable. For given \(\rho \,\in \,L^\infty _T(L^2)\) and \(a\,\in \,L^1_T(C^1)\) such that \(\nabla a\,\in \,L^1_T(C_b)\), we set, for all \(n\in {{\mathbb {N}}}\) and \(1\le j\le d\),

$$\begin{aligned} r^j_n(\rho )\,:=\,\partial _j\left( \big [a,S_n\big ]\rho \right) . \end{aligned}$$

Then, for all j fixed, we have \(\big (r^j_n\big )_n\,\subset \,L^1_T(L^2)\); moreover, for \(n\rightarrow +\infty \), we have the strong convergence \(r^j_n\,\longrightarrow \,0\) in \(L^1_T(L^2)\).

Let us also recall the following standard notation. For X a Banach space and \(X^*\) its topological dual, we denote by \(C_w\bigl ([0,T];X\bigr )\) the set of measurable functions \(f:[0,T]\rightarrow X\) which are continuous with respect to the weak topology. Namely, for any \(\phi \in X^*\), the function \(t\,\mapsto \,\langle \phi ,f(t)\rangle _{X^*\times X}\) is continuous over [0, T].

With this preparation, we are now ready to prove Proposition 2.1.

Proof of Proposition 2.1

With the same notations as in Lemma 2.2, let us define \(\rho _n\,:=\,S_n\rho \). Notice that \(\big (\rho _n\big )_n\,\subset \,L^\infty _T(L^2)\). Moreover, \(\rho _n\) satisfies the equation

$$\begin{aligned} \partial _t\rho _n\,+\,\mathrm{div}\,\big (a\,\rho _n\big )\,=\,g_n\,+\,r_n,\qquad \qquad \hbox { with }\qquad \big (\rho _n\big )_{|t=0}\,=\,S_n\rho _0, \end{aligned}$$
(2.10)

where we have set \(g_n\,:=\,S_ng\) and \(r_n\,:=\,\mathrm{div}\,\left( \big [a,S_n\big ]\rho \right) \). Notice that one has \(\left\| S_n\rho _0\right\| _{L^2}\,\le \,C\,\left\| \rho _0\right\| _{L^2}\) and \(\left\| g_n\right\| _{L^1_T(L^2)}\,\le \,C\,\Vert g\Vert _{L^1_T(L^2)}\). Furthermore, when \(n\rightarrow +\infty \), we have the strong convergences \(g_n\,\longrightarrow \,g\) in \(L^1_T(L^2)\) and \(S_n\rho _0\,\longrightarrow \,\rho _0\) in \(L^2\). In addition, by Lemma 2.2, we know that \(\left\| r_n\right\| _{L^1_T(L^2)}\,\le \,C\) and \(r_n\,\longrightarrow \,0\) in \(L^1_T(L^2)\).

Next, an easy inspection of (2.10) shows that \(\big (\partial _t\rho _n\big )_n\,\prec \,L^1_T(H^{-1}_{\mathrm{loc}})\), which in turn gives us the uniform embedding \(\big (\rho _n\big )_n\,\prec \,C_T(H^{-1}_{\mathrm{loc}})\). From this latter property, combined with a density argument and the uniform boundedness of \(\big (\rho _n\big )_n\) in \(L^\infty _T(L^2)\), we deduce that \(\big (\rho _n\big )_n\) is uniformly bounded in \(C_w\bigl ([0,T];L^2({{\mathbb {R}}}^d)\bigr )\).

Now, let us take the \(L^2\) scalar product of Eq. (2.10) by \(\rho _n\): by standard computations we get

$$\begin{aligned} \frac{1}{2}\,\frac{d}{dt}\left\| \rho _n\right\| ^2_{L^2}\,+\,\frac{1}{2}\int \mathrm{div}\,a\,\left| \rho _n\right| ^2\,dx\,=\,\int \big (g_n\,+\,r_n\big )\,\rho _n\,dx, \end{aligned}$$
(2.11)

which implies that, for all \(n\in {{\mathbb {N}}}\), one has \(\left\| \rho _n(t)\right\| _{L^2}\,\in \,C\big ([0,T]\big )\). Thanks to this property, together with the fact that \(\rho _n\,\in \,C_w\bigl ([0,T];L^2({{\mathbb {R}}}^d)\bigr )\), after writing

$$\begin{aligned} \left\| \rho _n(t+h)\,-\,\rho _n(t)\right\| _{L^2}^2\,=\,\left\| \rho _n(t+h)\right\| ^2_{L^2}\,-\,2\,\langle \rho _n(t+h),\rho _n(t)\rangle _{L^2\times L^2}\,+\,\left\| \rho _n(t)\right\| ^2_{L^2}, \end{aligned}$$

one immediately deduces that \(\rho _n\) belongs to \(C_T(L^2)\) for all \(n\in {{\mathbb {N}}}\).

On the other hand, by straightforward computations, relation (2.11) also yields

$$\begin{aligned} \left\| \rho _n(t) \right\| _{L^2}\,&\le \,C\,\exp \left( C\int _0^t \left\| \mathrm{div}\,a(\tau ) \right\| _{L^\infty }\,d\tau \right) \,\nonumber \\&\quad \times \left( \left\| S_n\rho _0 \right\| _{L^2}\,+\,\int _0^t \left( \left\| g_n(\tau ) \right\| _{L^2}\,+\,\left\| r_n(\tau ) \right\| _{L^2}\right) d\tau \right) \nonumber \\&\le \,C\,\exp \left( C\int _0^t \left\| \mathrm{div}\,a(\tau ) \right\| _{L^\infty }\,d\tau \right) \, \left( \left\| \rho _0 \right\| _{L^2}\,+\,\int _0^t\left\| g(\tau ) \right\| _{L^2}\,d\tau \right) , \end{aligned}$$
(2.12)

for all \(t \in [0,T]\), thanks also to the previous properties on \(\big (S_n\rho _0\big )_n\), \(\big (g_n\big )_n\) and \(\big (r_n\big )_n\). In view of this energy estimate, we deduce that \(\big (\rho _n\big )_n\) is uniformly bounded in \(C_T(L^2)\).

By a similar argument, using the fact that \(\big (S_n\rho _0\big )_n\), \(\big (g_n\big )_n\) and \(\big (r_n\big )_n\) are strongly convergent in the respective functional spaces, one can moreover deduce that \(\left( \rho _n\right) _n\) is a Cauchy sequence in \(C_T(L^2)\).

This property implies that the limit \(\rho \) of the sequence \(\big (\rho _n\big )_n\) belongs to \(C_T(L^2)\), and that the convergence \(\rho _n\longrightarrow \rho \) is strong in this space. Finally, passing to the limit in (2.12), we find that \(\rho \) verifies the energy estimate (2.4). \(\square \)

Now, stability and uniqueness are easy consequences of Proposition 2.1.

Proposition 2.2

Fix \(T>0\) and \(m\in {{\mathbb {N}}}\), and let a be as in (2.3). For \(i=1,2\), take an initial datum \(\rho _0^i\,\in \,H^m({{\mathbb {R}}}^d)\) and an external force \(g^i\,\in \,L^1\bigl ([0,T];H^m({{\mathbb {R}}}^d)\bigr )\), and let \(\rho ^i\in L^\infty _T(H^m)\) be a corresponding solution to (2.1) (whose existence is guaranteed by the previous arguments).

Then, after defining \(\delta \rho _0\,:=\,\rho _0^1-\rho ^2_0\), \(\delta g\,:=\,g^1-g^2\) and \(\delta \rho \,:=\,\rho ^1-\rho ^2\), the following estimate holds true for all \(t\in [0,T]\), for some constant C independent of the data and the respective solutions:

$$\begin{aligned} \left\| \delta \rho (t)\right\| _{H^m}\,\le \,C\,\left( \left\| \delta \rho _0 \right\| _{H^m}\, +\,\int _0^t \left\| \delta g(\tau ) \right\| _{H^m}\, d\tau \right) \; \exp \left( C\,\int _0^t \left\| \nabla a(\tau ) \right\| _{C^{m}_b}\, d\tau \right) . \end{aligned}$$

Indeed, it is enough to remark that, by taking the difference of the equations satisfied by \(\rho ^1\) and \(\rho ^2\), one deduces that \(\delta \rho \,\in \,L^\infty _T(L^2)\) is a weak solution to the following equation:

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t\delta \rho \, +\, \mathop {\mathrm{div}}\bigl (a\,\delta \rho \bigr )\,=\,\delta g \\ \delta \rho _{|t=0}\, =\, \delta \rho _0. \end{array}\right. } \end{aligned}$$

2.1.2 The case of the transport equation

The characterization of ensemble controls by means of the optimality conditions given in Sect. 4.2 requires the solution of an adjoint Liouville problem, which is given by a linear transport problem. In preparation for that discussion, and to complete the analysis of the present section, we consider the following transport problem:

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _tq\, +\, a\cdot \nabla q\, +\, b\,q\, =\, g &{}\qquad \text { in }\; [0,T]\times {{\mathbb {R}}}^d\\ q_{|t=0}\, =\, q_0 &{}\qquad \text { on } \mathbb {R}^d. \end{array}\right. } \end{aligned}$$
(2.13)

We assume that the data \(q_0\), a and g verify the assumptions in (2.3), where \(\rho _0\) is replaced by \(q_0\). Moreover, we assume that b has the same regularity as \(\mathop {\mathrm{div}}a\): that is, \(b\,\in \,L^1\bigl ([0,T];C^m_b({{\mathbb {R}}}^d)\bigr )\).

The weak formulation of (2.13) now reads as follows: for all \(\phi \, \in \, C_c^\infty \bigl (\mathbb {R}^d \times [0, T[\,\bigr )\), one has

$$\begin{aligned}&-\int _0^T\int _{\mathbb {R}^d}q\,\partial _t \phi \,-\int _0^T \int _{\mathbb {R}^d}q\, a\cdot \nabla \phi \,-\int ^T_0\int _{{{\mathbb {R}}}^d}q\,\mathrm{div}\,a\,\phi \,+\int ^T_0\int _{{{\mathbb {R}}}^d}q\,b\,\phi \nonumber \\&\quad = \int _0^T\int _{\mathbb {R}^d}g\,\phi \, + \int _{\mathbb {R}^d} q_0\,\phi (0). \end{aligned}$$
(2.14)

For (2.13), we have the following well-posedness result, analogous to Theorem 2.2 for the Liouville equation.

Theorem 2.3

Fix \(T>0\) and \(m\in {{\mathbb {N}}}\), and let a, b, \(q_0\) and g satisfy the assumptions stated above.

Then there exists a unique solution \(q\,\in \,C\big ([0,T];H^m({{\mathbb {R}}}^d)\big )\) to Eq. (2.13). Moreover, there exists a “universal” constant \(C>0\), independent of \(q_0\), a, b, g, q and T, such that the following estimate holds true for any \(t\in [0,T]\):

$$\begin{aligned} \left\| q(t) \right\| _{H^m}\,&\le \,C\,\left( \left\| q_0 \right\| _{H^m}\, +\, \int _0^t \left\| g(\tau ) \right\| _{H^m}\, d\tau \right) \;\nonumber \\&\quad \times \exp \left( C\,\int _0^t\left( \left\| \nabla a(\tau ) \right\| _{C^{m}_b}\,+\,\left\| b(\tau ) \right\| _{C^{m}_b}\right) \,d\tau \right) . \end{aligned}$$
(2.15)

The proof is analogous to the one given for Theorem 2.1, so it is omitted here. The only point which deserves some attention is passing to the limit in the weak formulation (2.14) at step n of the regularization procedure, especially in the terms involving \(\mathrm{div}\,a^n\) and \(b^n\): for this, one can use Proposition 4.21 and Theorem 4.22 of [7] to deduce that both terms converge respectively to \(\mathrm{div}\,a\) and b in \(L^1_T\bigl (L^\infty (K)\bigr )\) for \(n\rightarrow +\infty \).

2.2 Well-posedness theory in weighted spaces

In this section, we extend the previous theory to Sobolev spaces with weights. This analysis is especially important for the investigation of the Liouville control-to-state map and of the Liouville ensemble optimal control problem, carried out in the next sections.

Remark 2.3

We limit ourselves to treating the case of the Liouville equation. However, the statements that follow also hold for the transport problem, with minor modifications in the proofs.

2.2.1 Definition of weighted spaces

For the analysis of the Liouville control-to-state map in Sect. 3, we need to prove weighted integrability of \(\rho \), due to the growth of the drift function. For this purpose, we introduce the following definition.

Definition 2.1

Fix \((m,k)\in {{\mathbb {N}}}^2\). We define the space \(H^m_k(\mathbb {R}^d)\) in the following way:

$$\begin{aligned} H^0_k({{\mathbb {R}}}^d)\,=\,L^2_k({{\mathbb {R}}}^d)\,:=\,\left\{ f\in L^2({{\mathbb {R}}}^d)\;\bigl |\quad |x|^k\,f\;\in \;L^2(\mathbb {R}^d)\right\} , \end{aligned}$$

and, for \(m\ge 1\), we set

$$\begin{aligned} H^m_k({{\mathbb {R}}}^d)\,:=\,\left\{ f\in H^m({{\mathbb {R}}}^d)\cap H^{m-1}_k({{\mathbb {R}}}^d)\;\big |\quad |x|^k\,D^\alpha f\;\in \;L^2(\mathbb {R}^d)\quad \forall \;|\alpha |=m\right\} . \end{aligned}$$

The space \(H^m_k\) is endowed with the following norm:

$$\begin{aligned} \left\| f \right\| _{H^m_k} := \sum _{|\alpha | \le m} \left\| \big (1\,+\,|x|^k\big )\,D^\alpha f\right\| _{L^2}. \end{aligned}$$

Sometimes, given \(m\in {{\mathbb {N}}}\), we will use the notation \(\left\| \nabla ^mf\right\| _{L^2}\,=\,\sum _{|\alpha |=m}\left\| D^\alpha f\right\| _{L^2}\), and analogous notation for weighted norms.
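
As a simple numerical illustration of Definition 2.1 (ours; the Gaussian, the grid and the case \(m=1\), \(k=2\) in dimension \(d=1\) are arbitrary choices), the weighted norm can be approximated on a grid as follows:

```python
# A numerical illustration (d = 1, ours) of the weighted norm in Definition 2.1:
# for m = 1 and k = 2 we approximate
#   ||f||_{H^1_2} = ||(1 + |x|^2) f||_{L^2} + ||(1 + |x|^2) f'||_{L^2}
# for the Gaussian f(x) = exp(-x^2 / 2), which belongs to every H^m_k.
import numpy as np

x  = np.linspace(-20.0, 20.0, 400001)
dx = x[1] - x[0]
f  = np.exp(-x**2 / 2)
df = np.gradient(f, x)
w  = 1.0 + np.abs(x)**2

def weighted_L2(g):
    return np.sqrt(np.sum((w * g)**2) * dx)

print("||f||_{H^1_2} ~", weighted_L2(f) + weighted_L2(df))
```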

Notice that, for all m and k in \({{\mathbb {N}}}\), one has \(H^m_k\,\subset \,H^m\). Of course, \(H^m\,=\,H^m_0\) for all \(m\ge 0\). Furthermore, since we want to avoid too singular behaviours close to 0, we will often focus on the special case (which will be enough for our purposes) \(m\,\le \,k\). In that case, we have a simple characterization of the spaces \(H^m_k\), which will be useful especially in Sect. 3, when studying the control-to-state map related to our optimal control problem.

Proposition 2.3

(i) Given \(k\in {{\mathbb {N}}}\), one has \(f\,\in \,L^2_k\) if and only if \((1+|x|^k)\,f\,\in \,L^2\).

(ii) For \(k\in {{\mathbb {N}}}{\setminus }\{0\}\) and \(1\le m\le k\), let \(f\,\in \,H^m\cap H^{m-1}_k\). Then \(f\,\in \,H^m_k\) if and only if \(|x|^k\,f\,\in \,H^m\).

In particular, a tempered distribution f belongs to \(H^1_1\) if and only if both f and \(|x|\,f\) belong to \(H^1\); it belongs to \(H^2_2\) if and only if both f and \(|x|^2\,f\) belong to \(H^2\) and \(\nabla f\) belongs to \(L^2_2\).

Proposition 2.3 relies on the next lemma, whose proof is elementary, hence omitted.

Lemma 2.3

Let \((m,k)\in {{\mathbb {N}}}^2\), with \(m\le k\). If \(f\,\in \,H^m_k\), then \((1+|x|^k)\,f\,\in \,H^{m}\).

Thanks to Lemma 2.3, we can prove Proposition 2.3.

Proof of Proposition 2.3

Assertion (i) is trivial. So, let us focus on the proof of (ii).

Suppose that \(f\in H^m\cap H^{m-1}_k\). Then, by Lemma 2.3 above, we have that \(|x|^k\,f\,\in \,H^{m-1}\). At this point, for \(|\alpha |=m\), we write, using the Leibniz rule,

$$\begin{aligned} D^\alpha \left( |x|^k\,f\right) \,=\,|x|^k\,D^\alpha f\,+\,\sum _\beta D^\beta |x|^k\,D^{\alpha -\beta }f, \end{aligned}$$

where the sum is performed for all \(\beta \le \alpha \) such that \(|\beta |\ge 1\). By the previous arguments, and the fact that \(m\le k\), we have that all the terms in the sum belong to \(L^2\). Then, the term on the left-hand side belongs to \(L^2\) if and only if the first term on the right-hand side does.

The last sentences follow by straightforward computations, using the equality \(\partial _j\big (|x|\,f\big )\,=\,\partial _j|x|\,f\,+\,|x|\,\partial _jf\), where \(1\le j\le d\), and the relation

$$\begin{aligned} \nabla ^2\big (|x|^2\,f\big )\,\sim \,\nabla \big (|x|\,f\,+\,|x|^2\,\nabla f\big )\,\sim \,\nabla |x|\,f\,+\,\big (|x|+|x|^2\big )\,\nabla f\,+\,|x|^2\,\nabla ^2f. \end{aligned}$$

We omit the details here. \(\square \)

2.2.2 The Liouville equation in weighted spaces

After the above preliminaries, we are ready to state the main result of this section, which shows well-posedness of the Liouville equation in the \(H^m_k\) spaces.

Theorem 2.4

Let \(T>0\) and \((m,k)\in {{\mathbb {N}}}^2\) be fixed, and let a be a vector field satisfying hypotheses (2.3). Assume also that \(\rho _0\in H^m_k({{\mathbb {R}}}^d)\) and \(g\in L^1\big ([0,T];H^m_k({{\mathbb {R}}}^d)\big )\).

Then there exists a unique solution \(\rho \,\in \,C\bigl ([0,T];H^{m}_k(\mathbb {R}^d)\bigr )\) to problem (2.1). Moreover, there exists a “universal” constant \(C>0\), independent of \(\rho _0\), a, g, \(\rho \) and T, such that the following estimate holds true for any \(t\in [0,T]\):

$$\begin{aligned} \left\| \rho (t) \right\| _{H^m_k}\, \le \,C\,\exp \left( C\,\int _0^t\left\| \nabla a(\tau )\right\| _{C^m_b}\,d\tau \right) \, \left( \left\| \rho _0 \right\| _{H^m_k}\, +\, \int _0^t \left\| g(\tau ) \right\| _{H^m_k}\, d\tau \right) . \end{aligned}$$
(2.16)

Most of the claims of the previous statement follow directly from Theorem 2.2. We only have to prove the propagation of higher integrability (i.e. \(k\ge 1\)). Omitting a standard regularization procedure for the sake of brevity, we focus only on energy estimates for Eq. (2.1).

Before proving Theorem 2.4 in its full generality, let us consider its versions in simpler cases, which will be needed in the proof of the general case. Moreover, their precise form is important in view of their application in Sect. 3.

We start with the case \(m=0\).

Lemma 2.4

Assume that the hypotheses of Theorem 2.4 hold true with \(m=0\).

Then there exists a unique solution \(\rho \,\in \,C\bigl ([0,T];L^2_k(\mathbb {R}^d)\bigr )\) to problem (2.1). Moreover, there exists a “universal” constant \(C>0\) such that the following estimate holds true for any \(t\in [0,T]\):

$$\begin{aligned} \left\| \rho (t) \right\| _{L^2_k}\, \le \,C\,\exp \left( C\,\int _0^t\left\| \nabla a(\tau )\right\| _{L^\infty }\,d\tau \right) \, \left( \left\| \rho _0 \right\| _{L^2_k}\, +\, \int _0^t \left\| g(\tau ) \right\| _{L^2_k}\, d\tau \right) . \end{aligned}$$

Proof of Lemma 2.4

Recall that, in the case \(k=0\), taking the \(L^2\) scalar product of Eq. (2.1) by \(\rho \) and performing standard computations yield

$$\begin{aligned} \frac{1}{2}\,\frac{d}{dt}\left\| \rho \right\| ^2_{L^2}\,+\,\frac{1}{2}\int \mathrm{div}\,a\,|\rho |^2\,dx\,=\,\int g\,\rho \,dx, \end{aligned}$$

which readily implies

$$\begin{aligned} \frac{d}{dt}\left\| \rho \right\| _{L^2}\,\le \,\Vert \mathrm{div}\,a\Vert _{L^\infty }\,\left\| \rho \right\| _{L^2}\,+\,\left\| g\right\| _{L^2}. \end{aligned}$$
(2.17)

Analogously, multiplying Eq. (2.1) by \(|x|^k\), we get that \(\rho _k\,:=\,|x|^k\,\rho \) satisfies

$$\begin{aligned} \partial _t\rho _k\,+\,\mathrm{div}\,\big (a\,\rho _k\big )\,=\,|x|^k\,g\,+\,\rho \,a\cdot \nabla |x|^k. \end{aligned}$$

Taking now the \(L^2\) scalar product by \(\rho _k\) and repeating the same computations as above, we find

$$\begin{aligned} \frac{d}{dt}\left\| \rho _k\right\| _{L^2}\,\le \,\Vert \mathrm{div}\,a\Vert _{L^\infty }\,\left\| \rho _k\right\| _{L^2}\,+\,\left\| |x|^k\,g\right\| _{L^2}\,+\,\left\| \rho \,a\cdot \nabla |x|^k\right\| _{L^2}. \end{aligned}$$
(2.18)

We need to control the last term on the right-hand side of the previous estimate. For this, we use the fact that \(\nabla |x|^k\,\sim \,|x|^{k-1}\) for all \(k\ge 1\), and Remark 2.2, to obtain

$$\begin{aligned} \left\| \rho \,a\cdot \nabla |x|^k\right\| _{L^2}\,\le \,C\,\Vert \nabla a\Vert _{L^\infty }\,\left\| \big (1+|x|^k\big )\,\rho \right\| _{L^2}. \end{aligned}$$

Inserting this bound into (2.18) and summing up the resulting expression to (2.17), we have

$$\begin{aligned} \frac{d}{dt}\left\| \big (1+|x|^k\big )\,\rho \right\| _{L^2}\,\le \,C\,\Vert \nabla a\Vert _{L^\infty }\,\left\| \big (1+|x|^k\big )\,\rho \right\| _{L^2}\,+\,\left\| \big (1+|x|^k\big )\,g\right\| _{L^2}. \end{aligned}$$
(2.19)

Hence, an application of Grönwall’s lemma gives the desired estimate. \(\square \)

Next, we present the result for \(m=1\). For notational convenience, let us set

$$\begin{aligned}{}[x]_k\,:=\,1\,+\,|x|^k. \end{aligned}$$

Lemma 2.5

Assume that the hypotheses of Theorem 2.4 hold true with \(m=1\).

Then there exists a unique weak solution \(\rho \,\in \,C\bigl ([0,T];H^1_k(\mathbb {R}^d)\bigr )\) to (2.1), which moreover satisfies, for some “universal” constant \(C>0\) and for all \(t\in [0,T]\), the estimate

$$\begin{aligned} \left\| \rho (t) \right\| _{H^1_k}\, \le \,C\,\exp \left( C\,\int _0^t\left\| \nabla a(\tau )\right\| _{C^1_b}\,d\tau \right) \, \left( \left\| \rho _0 \right\| _{H^1_k}\, +\, \int _0^t \left\| g(\tau ) \right\| _{H^1_k}\, d\tau \right) . \end{aligned}$$

Proof of Lemma 2.5

We start by differentiating equation (2.1) with respect to \(x^j\), for some \(1\le j\le d\), getting

$$\begin{aligned} \partial _t\partial _j\rho \,+\,\mathrm{div}\,\big (a\,\partial _j\rho \big )\,=\,\partial _jg\,-\,\partial _j\mathrm{div}\,a\;\rho \,-\,\partial _ja\cdot \nabla \rho . \end{aligned}$$

Applying estimate (2.19) to this equation gives

$$\begin{aligned} \frac{d}{dt}\left\| [x]_k\,\partial _j\rho \right\| _{L^2}\,&\le \,C\,\Vert \nabla a\Vert _{L^\infty }\,\left\| [x]_k\,\partial _j\rho \right\| _{L^2}\,+\,\left\| [x]_k\,\partial _jg\right\| _{L^2}\,\\&\quad +\, \left\| [x]_k\,\partial _j\mathrm{div}\,a\;\rho \right\| _{L^2}\,+\,\left\| [x]_k\,\partial _ja\cdot \nabla \rho \right\| _{L^2}, \end{aligned}$$

from which we obtain, for another constant \(C>0\), the following bound:

$$\begin{aligned} \frac{d}{dt}\left\| [x]_k\,\nabla \rho \right\| _{L^2}\,&\le \,C\,\Vert \nabla a\Vert _{L^\infty }\,\left\| [x]_k\,\nabla \rho \right\| _{L^2}\,+\, \left\| [x]_k\,\nabla g\right\| _{L^2}\,+\,\left\| \nabla ^2a\right\| _{L^\infty }\,\left\| [x]_k\,\rho \right\| _{L^2}. \end{aligned}$$
(2.20)

We can now sum up (2.19) and (2.20) to get

$$\begin{aligned} \frac{d}{dt}\left\| \rho \right\| _{H^1_k}\,\le \,C\,\left\| \nabla a\right\| _{C^1_b}\,\left\| \rho \right\| _{H^1_k}\,+\,\left\| g\right\| _{H^1_k}, \end{aligned}$$
(2.21)

and Grönwall’s lemma allows us to get the result. \(\square \)

Now, we can address the proof of the general case, namely of Theorem 2.4.

Proof of Theorem 2.4

We argue by induction on the order of derivatives, i.e. on m, the cases \(m=0\) and \(m=1\) being given by Lemmas 2.4 and 2.5, respectively.

Let \(m\ge 2\), and let us assume that, for any \(0\le \ell \le m-1\), the following inequality holds true

$$\begin{aligned} \frac{d}{dt}\left\| [x]_k\,\nabla ^\ell \rho \right\| _{L^2}\,&\le \,C\Vert \nabla a\Vert _{L^\infty }\left\| [x]_k\,\nabla ^\ell \rho \right\| _{L^2}\nonumber \\&\quad + \left\| [x]_k\,\nabla ^\ell g\right\| _{L^2}+\sum _{0\le p\le \ell -1}\left\| \nabla ^{p+1}a\right\| _{L^\infty }\left\| [x]_k\,\nabla ^p\rho \right\| _{L^2}. \end{aligned}$$
(2.22)

Our goal is to prove an analogous estimate also for \(\left\| [x]_k\,\nabla ^m\rho \right\| _{L^2}\).

For this purpose, let us take an \(\alpha \in {{\mathbb {N}}}^d\) such that \(|\alpha |=m\). Applying the operator \(D^\alpha \) to (2.1), we deduce

$$\begin{aligned} \partial _tD^\alpha \rho \,+\,\mathrm{div}\,\big (a\,D^\alpha \rho \big )\,=\,D^\alpha g\,-\,\sum _{0<\beta \le \alpha }D^\beta \mathrm{div}\,a\;D^{\alpha -\beta }\rho \,-\,\sum _{0<\beta \le \alpha }D^\beta a\cdot \nabla D^{\alpha -\beta }\rho , \end{aligned}$$
(2.23)

where the notation \(0<\beta \) means that \(\beta \in {{\mathbb {N}}}^d\) has at least one non-zero component.

Following the computations of Lemma 2.5, we need to estimate the \(L^2_k\) norm of the last two terms in the right-hand side of the previous equation. First of all, we have

$$\begin{aligned} \left\| [x]_k\,D^\beta \mathrm{div}\,a\;D^{\alpha -\beta }\rho \right\| _{L^2}\,\le \,\left\| \nabla ^{|\beta |+1}a\right\| _{L^\infty }\,\left\| [x]_k\,D^{\alpha -\beta }\rho \right\| _{L^2}. \end{aligned}$$

Notice that, since \(\beta >0\), the terms \(D^{\alpha -\beta }\rho \) are lower order. The same can be said of the terms

$$\begin{aligned} \left\| [x]_k\,D^\beta a\cdot \nabla D^{\alpha -\beta }\rho \right\| _{L^2}\,\le \,\left\| \nabla ^{|\beta |}a\right\| _{L^\infty }\,\left\| [x]_k\,\nabla D^{\alpha -\beta }\rho \right\| _{L^2}, \end{aligned}$$

whenever \(|\beta |\ge 2\); on the contrary, when \(|\beta |=1\), the terms \(\nabla D^{\alpha -\beta }\rho \) contain exactly m derivatives.

Therefore, applying estimate (2.19) to Eq. (2.23), and using the previous bounds, we infer

$$\begin{aligned} \frac{d}{dt}\left\| [x]_k\,\nabla ^m\rho \right\| _{L^2}\,&\le \,C\Vert \nabla a\Vert _{L^\infty }\,\left\| [x]_k\,\nabla ^m\rho \right\| _{L^2}+ \left\| [x]_k\,\nabla ^mg\right\| _{L^2}\\&\quad +\sum _{0<\beta \le \alpha }\left\| \nabla ^{|\beta |+1}a\right\| _{L^\infty }\left\| [x]_k\,D^{\alpha -\beta }\rho \right\| _{L^2}\\&\le \,C\Vert \nabla a\Vert _{L^\infty }\left\| [x]_k\,\nabla ^m\rho \right\| _{L^2}+ \left\| [x]_k\,\nabla ^mg\right\| _{L^2}\\&\quad + \sum _{0\le \ell \le m-1}\left\| \nabla ^{\ell +1}a\right\| _{L^\infty }\left\| [x]_k\,\nabla ^\ell \rho \right\| _{L^2}, \end{aligned}$$

which proves formula (2.22) at the level m. Therefore that formula is true for any \(m\in {{\mathbb {N}}}\), by induction.

Now, it is just a matter of summing up inequality (2.22) for \(\ell =0\) to m to get, for some constant also depending on m, the following bound:

$$\begin{aligned} \frac{d}{dt}\left\| \rho \right\| _{H^m_k}\,\le \,C\,\left\| \nabla a\right\| _{C^m_b}\,\left\| \rho \right\| _{H^m_k}\,+\,\left\| g\right\| _{H^m_k}, \end{aligned}$$

which immediately implies the claimed estimate. Theorem 2.4 is now proved. \(\square \)
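For the reader's convenience, we record the integrated form of this bound: applying Grönwall's lemma to the last differential inequality gives, for every \(t\in [0,T]\),

$$\begin{aligned} \left\| \rho (t)\right\| _{H^m_k}\,\le \,\exp \left( C\int _0^t\left\| \nabla a(\tau )\right\| _{C^m_b}\,d\tau \right) \left( \left\| \rho _0\right\| _{H^m_k}\,+\,\int _0^t\left\| g(\tau )\right\| _{H^m_k}\,d\tau \right) , \end{aligned}$$

which is the quantitative form of the statement used repeatedly in the next sections.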

3 The Liouville control-to-state map

In this section, we define the Liouville control-to-state map and investigate its continuity and differentiability properties. For reasons that will become clear in the following analysis, we need to resort to the weighted spaces \(H^m_k\), as introduced in Sect. 2.2.

We start by making an important remark.

Remark 3.1

Throughout this section, the data of the Liouville equation have to be thought of as fixed. Specifically, for \(m\ge 0\) and \(k\ge 0\), we take an initial datum \(\rho _0\,\in \,H^m_k\), a source term \(g\,\in \,L^1_T(H^m_k)\), and a drift function \(a_0\,\in \,L^1_T(C^{m+1})\), with \(\nabla a_0\,\in \,L^1_T(C^m_b)\).

We are then interested in the dependence of the solution \(\rho \) to the Liouville equation (2.1), with drift a given by (1.4), on the control state \(u\in U_{ad}\), where \(U_{ad}\) has been defined in (1.9).

3.1 Definition and continuity properties

We remark that the statements of Theorems 2.2 and 2.4 cover the case of the Liouville equation with the controlled drift function given by (1.4), where \(u \in U_{ad}\). In particular, the next proposition-definition immediately follows.

Proposition 3.1

With the data \(\rho _0\), g and \(a_0\) fixed as in Remark 3.1, let us consider drift functions a of the form (1.4), with \(u\in U_{ad}\). Introduce the Liouville control-to-state map G, defined by

$$\begin{aligned} G:\, U_{ad}\, \longrightarrow \, L^\infty \big ([0,T];L^2({{\mathbb {R}}}^d)\big ), \qquad u\, \mapsto \, \rho := G(u), \end{aligned}$$

where \(\rho \) is the unique solution to the Liouville equation with the given data.

Then G is well-defined.

Let us make an important comment about the previous definition.

Remark 3.2

The theory developed in Sects. 2.1 and 2.2 entails that the solution \(\rho \) actually belongs to \(C_T(H^m_k)\). However, since the proof of Fréchet differentiability of G entails a loss of regularity, both in m and in k, it is convenient to regard G as a map taking values in the space with the weakest topology.

Finally, we consider \(L^\infty \) regularity with respect to time because it will also be convenient to look at weak continuity properties of G; see Proposition 3.2.
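Although the analysis in this paper is entirely theoretical, it may help intuition to see how a discrete counterpart of G can be realised. The following Python sketch (purely illustrative, and not part of our results) advances a one-dimensional analogue of the Liouville equation (2.1), with a controlled drift of the form (1.4), by a first-order conservative upwind scheme on a truncated grid; the function name solve_liouville, the zero-flux treatment of the artificial boundary and all discretization parameters are our own choices.

```python
import numpy as np

def solve_liouville(rho0, a0, u1, u2, x, t, g=None):
    """Illustrative 1D control-to-state map: march
    d_t rho + d_x( (a0(t,x) + u1(t) + x*u2(t)) rho ) = g
    with a first-order conservative upwind flux (CFL is the caller's duty)."""
    dx = x[1] - x[0]
    rho = rho0.copy()
    out = [rho.copy()]
    if g is None:
        g = lambda tt, xx: np.zeros_like(xx)
    for n in range(len(t) - 1):
        dt = t[n + 1] - t[n]
        a = a0(t[n], x) + u1[n] + x * u2[n]        # controlled drift, cf. (1.4)
        ap, am = np.maximum(a, 0.0), np.minimum(a, 0.0)
        flux = np.zeros(len(x) + 1)                # zero flux at the boundary
        flux[1:-1] = ap[:-1] * rho[:-1] + am[1:] * rho[1:]
        rho = rho - dt / dx * (flux[1:] - flux[:-1]) + dt * g(t[n], x)
        out.append(rho.copy())
    return np.array(out)                           # shape (len(t), len(x))
```

Here rho0 is the discretized initial datum, a0 a callable for the uncontrolled drift, and u1, u2 arrays sampling the control on the time grid; the returned array plays the role of \(G(u)\) on the grid.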

Next, we study some properties of the map G that are relevant for the analysis of ensemble optimal control problems. We start by establishing that G is weak-weak continuous from \(U_{ad}\) into \(L^\infty _T(L^2)\). Notice that we do not need any restriction on m and k (and so, on the initial data) in this case.

Proposition 3.2

Take \(m\ge 0\) and \(k\ge 0\) and data \(\rho _0\), g and \(a_0\) as in Remark 3.1. Let \(u\in U_{ad}\) and \(\big (u^l\big )_l\,\subset \,U_{ad}\) be a sequence of controls, and assume that \(u^l\,{\mathop {\rightharpoonup }\limits ^{*}}\,u\) in \(\mathbb {L}^\infty _T\).

Then \(G(u^l)\,{\mathop {\rightharpoonup }\limits ^{*}}\,G(u)\) in the weak-\(*\) topology of \(L^\infty _T(L^2)\).

Proof

Of course, it is enough to prove the previous proposition in the case of minimal regularity and integrability, namely for \(m=k=0\).

By definition of the set \(U_{ad}\), we infer that \((u^l)_l\) is uniformly bounded in \(\mathbb {L}^\infty _T\). On the other hand, by hypotheses and Theorem 2.2, for all \(l\in {{\mathbb {N}}}\) there exists a unique \(\rho ^l\,:=\,G(u^l)\,\in \,C_T(L^2)\) which solves the Liouville equation (2.1). In addition, by inequality (2.4), we deduce that \(\big (\rho ^l\big )_l\) is uniformly bounded in \(C_T(L^2)\); then there exists \(\rho \in L^\infty _T(L^2)\) such that, up to extraction of a subsequence, \(\rho ^l\,{\mathop {\rightharpoonup }\limits ^{*}}\,\rho \) in \(L^\infty _T(L^2)\).

So, the proof reduces to showing that \(\rho \) is a weak solution to the Liouville equation

$$\begin{aligned} \partial _t\rho \,+\,\mathrm{div}\,\big (a(t,x;u)\,\rho \big )\,=\,g,\qquad \qquad \text{ with } \quad \rho _{|t=0}\,=\,\rho _0. \end{aligned}$$
(3.1)

Indeed, if this is the case, by uniqueness we get \(\rho \,=\,G(u)\) and that the whole sequence \(\big (\rho ^l\big )_l\) converges.

The previous property follows by passing to the limit in the weak formulation of the equation for \(\rho ^l\). This can be easily obtained; notice that, in order to treat the products between \(\rho ^l\) and \(u^l\), one also needs to establish strong convergence for \(\rho ^l\) in suitable spaces (as done in the proof of Proposition 2.1). We leave the details to the reader. \(\square \)

For the analysis of our optimal control problem, we need stronger regularity properties for G. We start by showing Lipschitz continuity, which will be the basis to prove Gâteaux differentiability of G. The key here is to perform careful estimates in order to identify the right topology: the reason is that, due to hyperbolicity of the Liouville equation, stability estimates involve a loss of regularity.

Lemma 3.1

Let the data \(\rho _0\), g and \(a_0\) be fixed as in Remark 3.1 above, with \(m\ge 1\) and \(k\ge 1\). Let u and v be in \(U_{ad}\), and denote by G(u) and G(v) the corresponding \(C_T(H^m_k)\) solutions to (2.1), with drift a given by (1.4). Set \(\delta G\,:=\,G(u)-G(v)\).

Then there exists a constant \(C>0\), independent of the data and respective solutions, such that, for all \(1\le \ell \le k\), if we set

$$\begin{aligned} K^{(\ell )}_0\,:=\,C\,\exp \Big (C\left( \left\| \nabla a_0\right\| _{L^1_T(C^1_b)}\,+\,\Vert u\Vert _{\mathbb {L}^1_T}\,+\,\Vert v\Vert _{\mathbb {L}^1_T}\right) \Big )\times \left( \left\| \rho _0\right\| _{H^1_\ell }\,+\,\left\| g\right\| _{L^1_T(H^1_\ell )}\right) , \end{aligned}$$
(3.2)

then, for all \(t\in [0,T]\), one has

$$\begin{aligned} \left\| \delta G(t)\right\| _{L^2_{\ell -1}}\, \le \,K^{(\ell )}_0\,\int ^t_0|u(\tau )-v(\tau )|\,d\tau . \end{aligned}$$

If moreover \(m\ge 2\) and we set

$$\begin{aligned} K^{(\ell )}_1\,:=\,C\,\exp \Big (C\left( \left\| \nabla a_0\right\| _{L^1_T(C^2_b)}\,+\,\Vert u\Vert _{\mathbb {L}^1_T}\,+\,\Vert v\Vert _{\mathbb {L}^1_T}\right) \Big )\times \left( \left\| \rho _0\right\| _{H^2_\ell }\,+\,\left\| g\right\| _{L^1_T(H^2_\ell )}\right) , \end{aligned}$$
(3.3)

we also have

$$\begin{aligned} \left\| \delta G(t) \right\| _{H^1_{\ell -1}}\,\le \,K^{(\ell )}_1\,\int ^t_0|u(\tau )-v(\tau )|\,d\tau . \end{aligned}$$

Proof

By linearity of the Liouville equation, we find that \(\delta G\) satisfies

$$\begin{aligned} \partial _t\delta G\,+\,\mathrm{div}\,\big (a(t,x;u)\,\delta G\big )\,=\,-\,\mathrm{div}\,\big (\overline{a}(t,x;u-v)\,G(v)\big ),\qquad \qquad \delta G_{|t=0}\,=\,0, \end{aligned}$$
(3.4)

where we have set \(\overline{a}(t,x;u-v)\,:=\,a(t,x;u)-a(t,x;v)\,=\,(u_1-v_1)+x\circ (u_2-v_2)\).

Applying \(L^2_{\ell -1}\) estimates of Theorem 2.2 to Eq. (3.4), we immediately get

$$\begin{aligned} \left\| \delta G(t)\right\| _{L^2_{\ell -1}}\le & {} C\,\exp \left( C\int ^t_0\left\| \nabla a(\tau ,x;u)\right\| _{L^\infty }\,d\tau \right) \,\\&\times \int ^t_0\left\| \mathrm{div}\,\big (\overline{a}(\tau ,x;u-v)\,G(v)\big )\right\| _{L^2_{\ell -1}}\,d\tau . \end{aligned}$$

By explicit computations and using the Leibniz rule, we deduce that

$$\begin{aligned}&\left\| \mathrm{div}\,\big (\overline{a}(\tau ,x;u-v)\,G(v)\big )\right\| _{L^2_{\ell -1}}\,\le \,|u(\tau )-v(\tau )|\,\left( \left\| G(v)\right\| _{L^2_{\ell -1}}\,+\,\left\| \nabla G(v)\right\| _{L^2_\ell }\right) \nonumber \\&\qquad \qquad \le \,C\,|u(\tau )-v(\tau )|\,\exp \left( C\int ^\tau _0\left\| \nabla a(s,x;v)\right\| _{C^1_b}\,ds\right) \,\left( \Vert \rho _0\Vert _{H^1_\ell }+\int ^\tau _0\Vert g(s)\Vert _{H^1_\ell }\,ds\right) , \end{aligned}$$
(3.5)

where the second inequality holds true in view of the bound \(\left\| G(v)\right\| _{L^2_{\ell -1}}+\left\| \nabla G(v)\right\| _{L^2_\ell }\,\le \,\left\| G(v)\right\| _{H^1_\ell }\) and Lemma 2.5. This estimate completes the proof of the first inequality.

Now, we focus on \(H^1_{\ell -1}\) bounds for \(\delta G\). Thanks to Lemma 2.5, we have

$$\begin{aligned} \left\| \delta G(t)\right\| _{H^1_{\ell -1}}\,\le \,C\,\exp \left( C\int ^t_0\left\| \nabla a(\tau ,x;u)\right\| _{C^1_b}\,d\tau \right) \, \int ^t_0\left\| \mathrm{div}\,\big (\overline{a}(\tau ,x;u-v)\,G(v)\big )\right\| _{H^1_{\ell -1}}\,d\tau . \end{aligned}$$
(3.6)

By definition, we have that \(\Vert f\Vert _{H^1_{\ell -1}}\,=\,\Vert f\Vert _{L^2_{\ell -1}}+\Vert \nabla f\Vert _{L^2_{\ell -1}}\). Concerning the first term, we have

$$\begin{aligned}&\left\| \mathrm{div}\,\big (\overline{a}(\tau ,x;u-v)\,G(v)\big )\right\| _{L^2_{\ell -1}}\,\le \,\left\| \mathrm{div}\,\overline{a}\;G(v)\right\| _{L^2_{\ell -1}}\,+\, \left\| \overline{a}\cdot \nabla G(v)\right\| _{L^2_{\ell -1}} \nonumber \\&\qquad \qquad \le \,C\,|u(\tau )-v(\tau )|\,\left( \left\| G(v)\right\| _{L^2_{\ell -1}}\,+\,\left\| \nabla G(v)\right\| _{L^2_{\ell }}\right) \;\le \;C\,|u(\tau )-v(\tau )|\,\left\| G(v)\right\| _{H^1_\ell }. \end{aligned}$$
(3.7)

Next, we need to bound in \(L^2_{\ell -1}\) the quantity \(\nabla \mathrm{div}\,\big (\overline{a}(\tau ,x;u-v)\,G(v)\big )\): we have then to control four terms. First of all, we notice that \(\nabla \mathrm{div}\,\overline{a}\equiv 0\). Moreover, we can write

$$\begin{aligned} \left\| \mathrm{div}\,\overline{a}\;\nabla G(v)\right\| _{L^2_{\ell -1}}\,&\le \,|u(\tau )-v(\tau )|\,\left\| \nabla G(v)\right\| _{L^2_{\ell -1}}, \end{aligned}$$
(3.8)

and the same estimate holds true also for the term \(\nabla \overline{a}\cdot \nabla G(v)\). Finally, we have

$$\begin{aligned} \left\| \overline{a}\cdot \nabla ^2G(v)\right\| _{L^2_{\ell -1}}\,&\le \,C\,|u(\tau )-v(\tau )|\,\left\| \,\nabla ^2G(v)\right\| _{L^2_\ell }. \end{aligned}$$
(3.9)

Putting (3.7), (3.8) and (3.9) together, we infer the control

$$\begin{aligned} \left\| \mathrm{div}\,\big (\overline{a}(\tau ,x;u-v)\,G(v)\big )\right\| _{H^1_{\ell -1}}\,&\le \,C\,|u(\tau )\,-\,v(\tau )|\,\left\| G(v)\right\| _{H^2_\ell }. \end{aligned}$$

Inserting this last inequality into (3.6) and using the bounds of Theorem 2.4, we finally get the claimed estimate for the \(H^1\)-type norms of \(\delta G\). \(\square \)

3.2 Differentiability of the control-to-state map

In this section, we investigate differentiability properties of the control-to-state map G, defined above.

First of all, with Lemma 3.1 at hand, we can establish Gâteaux differentiability of G. For any given u in an open set \(U_0\subset U_{ad}\), let G(u) be the corresponding solution to the Liouville equation, as defined in Proposition 3.1, and let \(\delta u=(\delta u_1,\delta u_2)\) be an admissible variation of u, such that \(u+\varepsilon \delta u \in U_{ad}\) for \(\varepsilon \in {{\mathbb {R}}}{\setminus }\{0\}\) sufficiently small. Then the Gâteaux derivative of G with respect to the variation \(\delta u\) at u is defined as the limit (whenever such a limit exists)

$$\begin{aligned} \delta _{\delta u}G(u)\,:=\,\lim _{\varepsilon \rightarrow 0} \frac{G(u+ \varepsilon \delta u)\,-\,G(u)}{\varepsilon }. \end{aligned}$$
(3.10)

The next proposition holds true.

Proposition 3.3

Let \(m\ge 2\) and \(k\ge 2\). Let the data \(\rho _0\), g and \(a_0\) be fixed as in Remark 3.1 above. Let u belong to \(\mathrm{int\,}U_{ad}\), where \(\mathrm{int\,}U_{ad}\) denotes the interior part of the set \(U_{ad}\).

Then, for any admissible variation \(\delta u\) of u, the limit (3.10) exists in \(L^\infty _T(L^2)\). In particular, the control-to-state map G is Gâteaux differentiable at u. Moreover, \(\delta _{\delta u}G\) satisfies the Liouville problem

$$\begin{aligned} \partial _t\delta _{\delta u}G \, + \, \mathrm{div}\,\bigl ( a(t,x;u) \, \delta _{\delta u}G \bigr )\, = \, -\,\mathrm{div}\,\bigl ( \overline{a}(t,x;\delta u) \, G(u) \bigr ),\qquad \qquad \text{ with } \quad \delta _{\delta u} G_{|t=0}\,=\, 0, \end{aligned}$$
(3.11)

where we have defined \(\overline{a}(t,x;\delta u)\,:=\,\delta u_1 + x \circ \delta u_2\).

Proof

For any \(0<|\varepsilon |<1\) small enough, let us define \(\delta G^\varepsilon \,:=\,\big (G(u+ \varepsilon \delta u)\,-\,G(u)\big )/\varepsilon \). It is easy to see that \(\delta G^\varepsilon \) solves the equation

$$\begin{aligned} \partial _t\delta G^\varepsilon \,+\,\mathrm{div}\,\big (a(t,x;u)\,\delta G^\varepsilon \big )\,=\,-\,\mathrm{div}\,\big (\overline{a}(t,x;\delta u)\,G(u+\varepsilon \delta u)\big ), \end{aligned}$$
(3.12)

with initial datum \(\delta G^\varepsilon _{|t=0}=0\).

Notice that, by uniform bounds provided by Lemma 3.1 (which holds for \(m\ge 1\) and \(k\ge 1\)) and weak compactness methods, we can prove that \(\delta G^\varepsilon \) converges (up to extraction of a suitable subsequence) to some \(\rho \in L^\infty _T(L^2)\) in the weak-\(*\) topology of that space, and that \(\rho \) satisfies the same equation as (3.12), with right-hand side equal to \(-\,\mathrm{div}\,\big (\overline{a}(t,x;\delta u)\,G(u)\big )\,\in \,L^1_T(L^2)\). Now, by uniqueness we deduce that \(\rho \) has to coincide with \(\delta _{\delta u}G\), and in addition the whole sequence \(\big (\delta G^\varepsilon \big )_\varepsilon \) converges to it.

Unfortunately, the previous argument does not yet yield the Gâteaux differentiability of G, because we need the limit to exist in the strong topology, namely in the \(L^\infty _T(L^2)\) norm. In order to get this property, let us write the equation for \(\rho ^\varepsilon \,:=\,\delta G^\varepsilon -\rho \): since \(G(u+\varepsilon \delta u)-G(u)\,=\,\varepsilon \,\delta G^\varepsilon \), we find

$$\begin{aligned} \partial _t\rho ^\varepsilon \,+\,\mathrm{div}\,\big (a(t,x;u)\,\rho ^\varepsilon \big )\,=\,-\,\varepsilon \,\mathrm{div}\,\big (\overline{a}(t,x;\delta u)\,\delta G^\varepsilon \big ), \end{aligned}$$

with zero initial datum. Then, an energy estimate immediately gives

$$\begin{aligned} \left\| \rho ^\varepsilon (t)\right\| _{L^2}\,&\le \,C\,\varepsilon \,\exp \left( C\int ^t_0\Vert \mathrm{div}\,a(\tau ,x;u)\Vert _{L^\infty }\right) \, \int ^t_0\left\| \mathrm{div}\,\big (\overline{a}(\tau ,x;\delta u)\,\delta G^\varepsilon \big )\right\| _{L^2}\,d\tau \\&\le \,C\,\varepsilon \,\exp \left( C\int ^t_0\Vert \mathrm{div}\,a(\tau ,x;u)\Vert _{L^\infty }\right) \,\int ^t_0|\delta u(\tau )|\,\left\| \delta G^\varepsilon \right\| _{H^1_1}\,d\tau , \end{aligned}$$

where we have argued as in the first line of (3.5) in order to pass from the first inequality to the second one. At this point, applying the second estimate of Lemma 3.1 to Eq. (3.12) yields

$$\begin{aligned} \left\| \delta G^\varepsilon (\tau )\right\| _{H^1_1}\,\le \,C_0\,\int ^\tau _0|\delta u(s)|\,ds, \end{aligned}$$
(3.13)

for any \(\tau \in [0,t]\), \(t\le T\), for a fixed constant \(C_0\) (depending on T, \(u^a\), \(u^b\), and \(\Vert \nabla a_0\Vert _{L^1_T(C^2_b)}\), \(\Vert \rho _0\Vert _{H^2_2}\) and \(\Vert g\Vert _{L^1_T(H^2_2)}\)). Putting this bound in the previous estimate entails

$$\begin{aligned} \left\| \rho ^\varepsilon (t)\right\| _{L^2}\,&\le \,C\,C_0\,\varepsilon \,e^{\left( C\int ^t_0\Vert \mathrm{div}\,a(\tau ,x;u)\Vert _{L^\infty }\right) }\,\left( \int ^t_0|\delta u(\tau )|\,d\tau \right) ^2\\&\le \, \varepsilon \,C\,\Vert \delta u\Vert ^2_{L^\infty _T}\,e^{C\Vert \mathrm{div}\,a(t,x;u)\Vert _{L^1_T(L^\infty )}}, \end{aligned}$$

from which we deduce that \(\rho ^\varepsilon \,\longrightarrow \,0\) in \(L^\infty _T(L^2)\), for \(\varepsilon \rightarrow 0\). The proposition is now proved. \(\square \)
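As a purely numerical illustration (again outside the scope of the analysis), the convergence of the difference quotients in (3.10) can be monitored with the hypothetical solver sketched in Sect. 3.1: in the following snippet, gateaux_quotient, sup_L2_distance and all their arguments are our own notation.

```python
import numpy as np

def gateaux_quotient(solve, rho0, a0, u1, u2, du1, du2, x, t, eps):
    """Difference quotient (G(u + eps*du) - G(u)) / eps on the grid,
    computed with the illustrative solver from the previous sketch."""
    base = solve(rho0, a0, u1, u2, x, t)
    pert = solve(rho0, a0, u1 + eps * du1, u2 + eps * du2, x, t)
    return (pert - base) / eps

def sup_L2_distance(r1, r2, dx):
    """Sup-in-time, L^2-in-space distance between two grid functions."""
    return np.max(np.sqrt(np.sum((r1 - r2) ** 2, axis=1) * dx))
```

Comparing sup_L2_distance between the quotients obtained for, say, \(\varepsilon =10^{-1},10^{-2},10^{-3}\) should exhibit the \(O(\varepsilon )\) behaviour established for \(\rho ^\varepsilon \) in the proof above.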

Next, we tackle the proof of the Fréchet differentiability of G.

Theorem 3.1

Let \(m\ge 2\) and \(k\ge 2\). Let the data \(\rho _0\), g and \(a_0\) be fixed as in Remark 3.1 above, and let \(u\in \mathrm{int\,}U_{ad}\). Define \(DG(u)[\delta u]\) to be the unique solution to Eq. (3.11).

Then there exists a constant \(C>0\) (depending only on T, \(u^a\), \(u^b\), and \(\Vert \nabla a_0\Vert _{L^1_T(C^2_b)}\), \(\Vert \rho _0\Vert _{H^2_2}\) and \(\Vert g\Vert _{L^1_T(H^2_2)}\)) such that

$$\begin{aligned} \Big \Vert G(u+\delta u)\,-\,G(u)\,-\,DG(u)[\delta u]\Big \Vert _{L^\infty _T(L^2)}\,\le \,C\,\left\| \delta u \right\| ^2_{L^\infty _T}. \end{aligned}$$

In particular, the map G is Fréchet differentiable from \(\mathrm{int\,}U_{ad}\) into \(L^\infty _T(L^2)\), and its Fréchet differential at any point \(u\in \mathrm{int\,}U_{ad}\) is given by DG(u).

Proof

In order to prove that G is Fréchet differentiable, with Fréchet differential given by \(DG(u)[\delta u]\), we have to show that

$$\begin{aligned} \lim _{\left\| \delta u \right\| _{L^\infty _T} \rightarrow 0} \frac{\Big \Vert G(u+\delta u) - G(u) - DG(u)[ \delta u]\Big \Vert _{L^\infty _T(L^2)}}{\left\| \delta u \right\| _{L^\infty _T}}\,=\,0. \end{aligned}$$

We recall also that, if G is Fréchet differentiable at u, then it is also Gâteaux differentiable at the same point, and one has \(\delta _{\delta u}G =DG(u)[ \delta u]\).

For simplicity, let us introduce the notation \(\mathcal G_u(\delta u)\,:=\,G(u+\delta u) - G(u) - DG(u)[ \delta u]\). The same computations performed on \(\rho ^\varepsilon \), in the proof of Proposition 3.3 above, lead us to an equation for \(\mathcal G_u(\delta u)\):

$$\begin{aligned} \partial _t\mathcal G_u(\delta u)\,+\,\mathrm{div}\,\big (a(t,x;u)\,\mathcal G_u(\delta u)\big )\,=\,-\,\mathrm{div}\,\Bigl ( \overline{a}(t,x;\delta u) \,\big ( G(u+\delta u)-G(u) \big ) \Big ), \end{aligned}$$

with initial datum \(\mathcal G_u(\delta u)_{|t=0}=0\). Now, it is just a matter of repeating the estimates performed on \(\rho ^\varepsilon \): we easily find, for every \(t\in [0,T]\), the inequality

$$\begin{aligned} \left\| \mathcal G_u(\delta u)(t)\right\| _{L^2}\,&\le \,C\,\exp \left( C\int ^t_0\Vert \mathrm{div}\,a(\tau ,x;u)\Vert _{L^\infty }\right) \,\int ^t_0|\delta u(\tau )|\,\left\| G(u+\delta u)-G(u)\right\| _{H^1_1}\,d\tau . \end{aligned}$$

Observe that an inequality analogous to (3.13) holds also for \(G(u+\delta u)-G(u)\): inserting this relation in the previous estimate, we find

$$\begin{aligned} \left\| \mathcal G_u(\delta u)(t)\right\| _{L^2}\,&\le \,C\,C_0\,\exp \left( C\int ^t_0\Vert \mathrm{div}\,a(\tau ,x;u)\Vert _{L^\infty }\right) \,\left( \int ^t_0|\delta u(\tau )|\,d\tau \right) ^2\,\le \,K\,\Vert \delta u\Vert ^2_{L^\infty _T}, \end{aligned}$$

for a new positive constant K. From this last inequality, the claims of the theorem follow.

\(\square \)

4 Analysis of the Liouville optimal control problem

In this section, we investigate our Liouville ensemble optimal control problem. In the first part, we prove the existence of optimal controls by means of classical arguments. However, notice that one has to carefully justify that the reduced functional \(\widehat{J}\) (see its definition below) is weakly lower semi-continuous. In fact, this property is not obvious, since \(\rho =G(u)\) depends non-linearly on u. After that, in Sect. 4.2 we characterise optimal controls as solutions of a related first-order optimality system. In Sect. 4.3 we discuss uniqueness of optimal controls.

4.1 Existence of optimal controls

In this section, we deal with existence of optimal solutions to an ensemble optimal control problem. Our analysis is based on the following assumptions.

(A.1) :

We fix \((m,k)\in {{\mathbb {N}}}^2\), and we take an initial datum \(\rho _0\in H^m_k({{\mathbb {R}}}^d)\), a force \(g\in L^1\big ([0,T];H^m_k({{\mathbb {R}}}^d)\big )\) and a vector field \(a_0\in L^1\big ([0,T];C^{m+1}({{\mathbb {R}}}^d)\big )\), with \(\nabla a_0\in L^1\big ([0,T];C^{m}_b({{\mathbb {R}}}^d)\big )\).

(A.2) :

We fix parameters \((\gamma ,\delta ,\nu )\in {{\mathbb {R}}}^3\) such that \(\gamma >0\), \(\delta \ge 0\) and \(\nu \ge 0\).

(A.3) :

Given \(u^a=\big (u^a_1,u^a_2\big )\) and \(u^b=\big (u^b_1,u^b_2\big )\) in \({{\mathbb {R}}}^{2d}\), with \(u^a\le u^b\), we define the set of admissible controls to be

$$\begin{aligned} U_{ad}\,&:=\,\left\{ u \,\in \, \mathbb {L}^\infty _T(\mathbb {R}^d)\;\bigl |\quad u^a\,\le \,u(t)\,\le \,u^b \qquad \text{ for } \text{ a.e. } \; t\,\in \,[0,T]\right\} \qquad \text{ if } \quad \nu =0 \end{aligned}$$
(4.1)
$$\begin{aligned} U_{ad}\,&:=\,\left\{ u \,\in \, \mathbb {H}^1_T(\mathbb {R}^d)\;\bigl |\quad u^a\,\le \,u(t)\,\le \,u^b \qquad \text{ for } \text{ all } \; t\,\in \,[0,T]\right\} \qquad \text{ if } \quad \nu >0. \end{aligned}$$
(4.2)
(A.4) :

We take two attracting potentials \(\theta \) and \(\varphi \) in \(L^2({{\mathbb {R}}}^d)\), in the sense specified in Sect. 1.2.

Remark 4.1

We point out that assumption (A.4) (which will be strengthened in Sect. 4.3 for getting uniqueness, see condition (A.4)* there) is taken for simplicity of presentation, since more general \(\theta \) and \(\varphi \) can be considered in our framework. For instance, we can allow \(\theta \) to depend on time: \(\theta \,\in \,L^1_T(L^2)\), or \(\theta \,\in \,L^1_T(H^1_1)\) in (A.4)* below. The case \(\theta (x)=|x|^2\) and \(\varphi (x)=|x|^2\) is more delicate, and will be the matter of further discussion in Sect. 4.4.

Now, consider our cost functional given by

$$\begin{aligned} J(\rho ,u)\,&:=\,\int _0^T\int _{{{\mathbb {R}}}^d}\theta (x)\,\rho (x,t)\,dx\,dt\, +\, \int _{{{\mathbb {R}}}^d} \varphi (x) \, \rho (x,T) \, dx \nonumber \\&\qquad +\frac{\gamma }{2}\,\int _0^T\big |u(t)\big |^2\,dt\,+\,\delta \,\int _0^T\big |u(t)\big |\,dt\,+\,\frac{\nu }{2}\,\int _0^T\left| \frac{d}{d t} u(t)\right| ^2 \,dt. \end{aligned}$$
(4.3)

Remark that J is well-defined whenever \(u\in \mathbb {L}^2_T\) if \(\nu =0\), or \(u\in \mathbb {H}^1_T\) if \(\nu >0\), and \(\rho \in C\big ([0,T];L^2({{\mathbb {R}}}^d)\big )\).
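For orientation, the cost (4.3) is straightforward to evaluate on a grid once a discrete state is available. The following sketch (illustrative only; the name cost_J, the trapezoidal quadrature and the one-dimensional setting are our own choices) mirrors the five terms of J.

```python
import numpy as np

def cost_J(rho, u, theta, phi, x, t, gamma, delta=0.0, nu=0.0):
    """Discrete analogue of the cost (4.3) in d = 1.
    rho: array (len(t), len(x));  u: array (len(t), 2) with columns (u1, u2)."""
    dx = x[1] - x[0]
    tracking = np.trapz(np.trapz(theta(x) * rho, dx=dx, axis=1), t)  # theta-term
    terminal = np.trapz(phi(x) * rho[-1], dx=dx)                     # phi-term
    usq = np.sum(u ** 2, axis=1)                                     # |u(t)|^2
    J = tracking + terminal + 0.5 * gamma * np.trapz(usq, t) \
        + delta * np.trapz(np.sqrt(usq), t)
    if nu > 0:                                                       # H^1 cost
        du = np.gradient(u, t, axis=0)
        J += 0.5 * nu * np.trapz(np.sum(du ** 2, axis=1), t)
    return J
```

Together with the discrete state produced by the hypothetical solver of Sect. 3.1, this gives a discrete counterpart of the reduced functional \(\widehat{J}\) introduced below.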

Our ensemble optimal control problem consists in finding

$$\begin{aligned} \min _{u \in U_{ad}} J(\rho ,u), \end{aligned}$$
(4.4)

subject to the differential constraint

$$\begin{aligned} \left\{ \begin{array}{ll} \partial _t\rho \,+\,\mathrm{div}\,\bigl ( a(t,x;u) \, \rho \bigr )\, =\,g \qquad \qquad &{} \text{ in } \quad [0,T]\times {{\mathbb {R}}}^d \\ \rho _{|t=0}\,=\, \rho _0 \qquad \qquad &{} \text{ on } \quad {{\mathbb {R}}}^d, \end{array} \right. \end{aligned}$$
(4.5)

where the drift function \(a(t,x;u)\) is defined as

$$\begin{aligned} a(t,x;u)\,:=\,a_0(t,x)\,+\,u_1(t)\,+\, x \circ u_2(t). \end{aligned}$$
(4.6)

Under our assumptions, Theorem 2.4 applies. Thus, for every \(u\in U_{ad}\), there exists a unique corresponding solution \(\rho \in C\big ([0,T];H^{m}_k(\mathbb {R}^d)\big )\) to the problem (4.5). Therefore, resorting to the control-to-state map G defined in Sect. 3, we can introduce the so-called reduced cost functional, given by

$$\begin{aligned} \widehat{J}(u)\,:=\,J\big (G(u),u\big ). \end{aligned}$$
(4.7)

Hence, the ensemble optimal control problem (4.4)–(4.5) can be rephrased as follows:

$$\begin{aligned} \min _{u \in U_{ad}}\widehat{J}(u). \end{aligned}$$
(4.8)

Remark 4.2

Recall that we have defined G with values in \(L^\infty _T(L^2)\). However, under our assumptions, we know that the solution to the Liouville equation actually belongs to \(C_T(L^2)\), so that the \(\varphi \)-term in (4.3) is well-defined, and so is \(\widehat{J}\).

In the following, we prove existence of a minimizer to (4.8).

Theorem 4.1

Under assumptions (A.1)-(A.2)-(A.3)-(A.4), the ensemble optimal control problem (4.8) admits at least one solution \(u^*\,\in \,U_{ad}\). The corresponding state \(\rho ^*\,:=\,G(u^*)\) belongs to the space \(C\big ([0,T];H^m_k({{\mathbb {R}}}^d)\big )\).

Proof

Let us focus on the case \(\nu =0\) for simplicity; the case \(\nu >0\) follows from the same argument.

The functional J given in (4.3) is well-defined for \((\rho ,u)\,\in \,C_T(L^2)\times \mathbb {L}^\infty _T\), and \(U_{ad}\) is a bounded subset of \(\mathbb {L}^\infty _T\). On the other hand, owing to estimate (2.16) in Theorem 2.4, and the embedding \(C_T(H^m_k)\hookrightarrow L^\infty _T(L^2)\), the map G takes its values in a bounded set of \(L^\infty _T(L^2)\). It follows that \(\widehat{J}\) is bounded; in particular, \(\widehat{J}\) is a proper map, i.e. \(\inf _{U_{ad}}\widehat{J}\,>\,-\infty \), and \(\widehat{J}\) is not identically equal to \(+\infty \).

Next, we claim that \(\widehat{J}\) is weakly lower semi-continuous. To prove this fact, it is enough to use the weak-weak continuity of G, as stated in Proposition 3.2, and to remark that J is weakly lower semi-continuous. Indeed, the last three terms in (4.3) are norms, so they are weakly lower semi-continuous. On the other hand, the first two terms are linear in \(\rho \), and hence they are weakly continuous with respect to the \(L^\infty _T(L^2)\) and \(L^2\) topologies, respectively. Thus we immediately get that, if \(\big (u_n\big )_n\subset U_{ad}\) is a sequence which converges weakly-\(*\) in \(\mathbb {L}^\infty _T\) to some \(u\in U_{ad}\), we have

$$\begin{aligned} \liminf _{n\rightarrow +\infty }\widehat{J}(u_n)\,=\,\liminf _{n\rightarrow +\infty }J\big (G(u_n),u_n\big )\,\ge \,J\big (G(u),u\big )\,=\,\widehat{J}(u). \end{aligned}$$

At this point, proving the existence of a minimizer for \(\widehat{J}\) is standard. Let us take a minimizing sequence \(\big (u_n\big )_n\subset U_{ad}\). Since \(U_{ad}\) is a bounded set in \(\mathbb {L}^\infty _T\), we can extract a weakly-\(*\) convergent subsequence, which we do not relabel for simplicity; let us call \(u^*\in U_{ad}\) its limit point. Then, by the weak lower semi-continuity of \(\widehat{J}\), we can conclude that \(u^*\) is a minimizer for \(\widehat{J}\). \(\square \)

We discuss uniqueness of the minimizers in Sect. 4.3 below. For this purpose, we use the characterization of minimizers as solutions to a suitable optimality system, which we derive in the next section.

4.2 Liouville optimality systems

This section is devoted to the characterization of ensemble optimal controls as solutions of the related first-order optimality system. For this purpose, in addition to hypotheses (A.1)-(A.2)-(A.3)-(A.4) stated above, from now on we take

$$\begin{aligned} m\,\ge \,1 \qquad \qquad \text{ and } \qquad \qquad k\,\ge \,1. \end{aligned}$$

In correspondence with (4.3)–(4.4)–(4.5), we consider the Lagrange multiplier framework, see e.g. [20, 24], and introduce the Lagrange functional \(\mathcal {L}\) as follows:

$$\begin{aligned} \mathcal {L}(\rho ,u,q)&:=\,J(\rho ,u)+\int _0^T\int _{\mathbb {R}^d}\Big ( \partial _t\rho +\mathrm{div}\,\big (a(x,t;u)\rho \big )-g\Big )q\,dxdt\nonumber \\&+\int _{{{\mathbb {R}}}^d}\big (\rho (0,x)-\rho _0(x)\big )q_0(x)\,dx, \end{aligned}$$
(4.9)

where, for the sake of generality, we have included a right-hand side g. The variable q represents the Lagrange multiplier. Notice that \(\mathcal {L}\) is well-defined whenever \(u\in \mathbb {L}^\infty _T\) if \(\nu =0\), \(u\in \mathbb {H}^1_T\) if \(\nu >0\), \(q\in L^\infty _T(L^2)\), \(q_0\,\in \,L^2\) and \(\rho \in C_T(L^2)\) such that both \(\partial _t\rho \) and \(\mathrm{div}\,\big (a(x,t;u)\,\rho \big )\) belong to \(L^1_T(L^2)\). In particular, it is enough to have \(\rho \,\in \,W^{1,1}_T(L^2)\,\cap \,L^\infty _T(H^1_1)\), recall also Proposition 2.3. Notice that, a posteriori, we will find \(q\in C_T(L^2)\) and \(q_0=q(0)\); see the discussion below for details.

For clarity, in order to derive the optimality system, we first discuss the case with \(L^2\) costs only, then the case with \(L^2-H^1\) costs, and finally the case with \(L^2-L^1- H^1\) costs.

The case \(\delta =\nu =\mathbf {0}\). If \(\delta =0\), then J is Fréchet differentiable over \(C_T(L^2)\times \mathrm{int\,}U_{ad}\), since it is linear in \(\rho \) and the control costs with \(\gamma >0\), \(\nu \ge 0\) are given by squared \(L^2\)-type norms, hence differentiable. It is then an easy computation to show that \(\mathcal {L}\) is Fréchet differentiable over the space

$$\begin{aligned} \mathbb {X}_T\;:=\;\left( W^{1,1}_T(L^2)\,\cap \,L^\infty _T(H^1_1)\right) \;\times \;\mathbb {L}^2_T\;\times \;C_T(L^2), \end{aligned}$$

where \(\mathbb {L}^2_T\) has to be replaced by \(\mathbb {H}^1_T\) in the case when \(\nu >0\). The Fréchet differential of \(\mathcal {L}\) at \((\rho ,u,q)\) is given by the linearization of each of its terms at that point.

Now, consider in addition \(\nu =0\). The optimality system is obtained by setting to zero the Fréchet derivatives of \(\mathcal {L}(\rho ,u,q)\) with respect to each of its arguments separately. We obtain

$$\begin{aligned}&\partial _t\rho \,+\,\mathrm{div}\,\big (a(x,t;u)\,\rho \big )\,=\,g, \qquad \qquad \text{ with } \quad \rho _{|t=0}\,=\,\rho _0 \end{aligned}$$
(4.10)
$$\begin{aligned}&\quad -\,\partial _tq\,-\,a(x,t;u)\cdot \nabla q\,=\,-\,\theta , \qquad \qquad \text{ with } \quad q_{|t=T}\,=\,-\,\varphi \end{aligned}$$
(4.11)
$$\begin{aligned}&\quad \left( \gamma \,u^r_j\,+\,\int _{\mathbb {R}^d}\,\mathrm{div}\,\left( \frac{\partial a}{\partial u^r_j}\,\rho \right) \, q\,dx\;,\;v^r_j\,-\,u^r_j \right) _{L^2(0,T)}\,\ge \,0\nonumber \\&\qquad \forall v \in U_{ad},\;j\,=\,1,2, r\,=\,1\ldots d. \end{aligned}$$
(4.12)

We remark that, denoting by \(e^r\) the r-th unit vector of the canonical basis of \({{\mathbb {R}}}^d\) and by \(x^r\) the r-th component of the vector \(x\in {{\mathbb {R}}}^d\), by the definition (4.6) we have \(\partial a/\partial u^r_1\,=\,e^r\) and \(\partial a/\partial u^r_2\,=\,x^r\,e^r\). Then, equation (4.12) can be equivalently written in the following form: for any \(1\le r\le d\),

$$\begin{aligned} \left\{ \begin{array}{l} \left( \gamma \, u^r_1 \,+\, \displaystyle {\int }_{\mathbb {R}^d} \partial _r\rho \,q\,dx\;,\;v^r_1\,-\,u^r_1 \right) _{L^2(0,T)}\,\ge \,0\\ \left( \gamma \, u^r_2\,+\,\displaystyle {\int }_{\mathbb {R}^d}\partial _r\big (x^r\,\rho \big )\,q\,dx\;,\;v^r_2\,-\,u^r_2 \right) _{L^2(0,T)}\,\ge \,0. \end{array} \right. \end{aligned}$$

Further, if we sum up equations (4.12) over all j and all r, we can write them in the following compact form:

$$\begin{aligned} \left( \gamma \,u\,+\,\int _{\mathbb {R}^d}\,\mathrm{div}\,\big ((e+x)\,\rho \big )\, q\,dx\;,\;v\,-\,u \right) _{\mathbb {L}^2_T}\,\ge \,0\qquad \qquad \text{ for } \text{ all } \quad v\,\in \,U_{ad}, \end{aligned}$$
(4.13)

where we have defined the vector \(e\,=\,(1\ldots 1)\).

Equation (4.10) is our Liouville model (also called the forward equation in this context). The results of Sect. 2.2 guarantee that, under our assumptions, there exists a unique solution \(\rho \,\in \,C_T(H^1_1)\). Moreover, since \(u\,\in \,U_{ad}\), an inspection of (4.10) reveals that \(\partial _t\rho \,\in \,L^1_T(L^2)\).

Equation (4.11) is the adjoint Liouville equation; it is obtained by taking the Fréchet derivative of (4.9) with respect to \(\rho \). This is a transport equation that evolves backwards in time. By setting \(\widetilde{q}(t,x)\,=\,q(T-t,-x)\), we obtain a transport problem for \(\widetilde{q}\), as in (2.13), with source term \(-\theta \) and initial condition \(\widetilde{q}_{|t=0}\,=\,-\varphi \). Thus, the results of Paragraph 2.1.2 guarantee the existence and uniqueness of a Lagrange multiplier \(q \in C_T(L^2)\), provided that \(\theta \) and \(\varphi \) are in \(L^2\).
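In the same illustrative spirit as the sketch in Sect. 3.1 (and with no claim of being the scheme one should use in practice), the backward evolution of the adjoint variable can be mimicked in one space dimension as follows; the name solve_adjoint, the explicit upwind discretization and the crude boundary treatment are our own assumptions.

```python
import numpy as np

def solve_adjoint(phi, theta, a0, u1, u2, x, t):
    """Illustrative backward-in-time march for the adjoint equation (4.11):
    -d_t q - a(t,x;u) d_x q = -theta,  q(T) = -phi   (d = 1)."""
    dx = x[1] - x[0]
    q = np.zeros((len(t), len(x)))
    q[-1] = -phi(x)                                  # terminal condition
    for n in range(len(t) - 1, 0, -1):               # march from t = T down to 0
        dt = t[n] - t[n - 1]
        a = a0(t[n], x) + u1[n] + x * u2[n]
        dq_fwd = np.zeros_like(x)
        dq_bwd = np.zeros_like(x)
        dq_fwd[:-1] = (q[n, 1:] - q[n, :-1]) / dx    # one-sided differences
        dq_bwd[1:] = (q[n, 1:] - q[n, :-1]) / dx
        adq = np.where(a > 0.0, a * dq_fwd, a * dq_bwd)  # upwind in reversed time
        q[n - 1] = q[n] + dt * (adq - theta(x))
    return q                                          # shape (len(t), len(x))
```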

From the discussion above, we get that any solution to the optimality system (4.10)–(4.11)–(4.12), with \(u\in U_{ad}\), belongs indeed to the space \(\mathbb {X}_T\).

Equation (4.12) represents the optimality condition. To better illustrate this fact, we suppose from now on that

$$\begin{aligned} m\,\ge \,2 \qquad \qquad \hbox { and }\qquad \qquad k\,\ge \,2. \end{aligned}$$

Then, the reduced cost functional \(\widehat{J}\), defined in (4.7), is Fréchet differentiable; in terms of the reduced minimisation problem (4.8), the optimal solution \(u^*\) in the convex, closed and bounded set \(U_{ad}\) is characterized by the optimality condition given by

$$\begin{aligned} \left( \nabla _u\widehat{J}(u^*)\;,\; \, v\,-\,u^* \right) _{\mathbb {L}^2_T} \ge 0, \qquad \text { for all } v \in U_{ad}, \end{aligned}$$

where \(\nabla _u \widehat{J}\) denotes the \(L^2\)-gradient of \(\widehat{J}\) with respect to u. In fact, a direct computation of \(\nabla _u J\big (G(u),u\big )\), with the introduction of the auxiliary adjoint variable q, gives the optimality system above, and the following relation:

$$\begin{aligned} \nabla _{u^r_j} \widehat{J}(u)\,= \,\gamma \, u^r_j\, +\, \int _{\mathbb {R}^d} \mathrm{div}\,\left( \frac{\partial a}{\partial u^r_j}\, \rho \right) \, q\,dx. \end{aligned}$$

The case \(\mathbf {\delta =0,\;\nu >0}\). Next, assume that \(\delta =0\) and \(\gamma , \nu >0\). Recall that, in this case, the set \(U_{ad}\) is defined by (4.2). Then, the natural Hilbert space where \(u^*\) is sought is \(\widetilde{\mathbb {H}}^1_T(\mathbb {R}^d)\,:=\, \widetilde{H}^1_T(\mathbb {R}^d) \times \widetilde{H}^1_T(\mathbb {R}^d)\), where \(\widetilde{H}^1_T\) corresponds to the \(H^1_T\) space, endowed with the weighted \(H^1\)-product given by

$$\begin{aligned} (u,v)_{\widetilde{H}^1_T}\, :=\,\gamma \, \int _0^T u(t) \cdot v(t) \, dt\, +\, \nu \, \int _0^T u'(t) \cdot v'(t) \, dt . \end{aligned}$$

The notation \({}'\,=\,d/dt\) stands for the weak time derivative.

Now, let \(\mu \) be the \({\widetilde{H}}^1\)-Riesz representative of the continuous linear functional

$$\begin{aligned} v\;\mapsto \; \left( \int _{\mathbb {R}^d} \mathrm{div}\,\left( \frac{\partial a}{\partial u} \,\rho \right) q\,dx\;,\;v \right) _{\mathbb {L}^2_T}. \end{aligned}$$

Assuming that \(u\,\in \,U_{ad}\cap H^1_0\big ([0,T];{{\mathbb {R}}}^{2d}\big )\), \(\mu \) can be computed by solving the equation

$$\begin{aligned} \left( -\, \nu \,\frac{d^2}{dt^2}\,+\,\gamma \right) \mu \,=\,\int _{\mathbb {R}^d}\mathrm{div}\,\left( \frac{\partial a}{\partial u}\,\rho \right) \,q\,dx, \qquad \mu (0)\,=\,\mu (T)\,=\,0, \end{aligned}$$
(4.14)

which is understood in a weak sense. Notice that the choice \(u\in H^1_0\big ([0,T];{{\mathbb {R}}}^{2d}\big )\) corresponds to the modelling requirement that the control is switched on at \(t=0\) and switched off at \(t=T\). Other initial and final time conditions on u may be required and encoded as boundary conditions in (4.14).
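Equation (4.14) is a one-dimensional two-point boundary value problem in time and is immediate to discretize; the following sketch (illustrative only, uniform time grid assumed, dense linear algebra for brevity) computes the Riesz representative componentwise by central finite differences.

```python
import numpy as np

def riesz_representative(f, t, gamma, nu):
    """Solve (-nu * d^2/dt^2 + gamma) mu = f with mu(0) = mu(T) = 0
    on a uniform grid t; f may have shape (len(t),) or (len(t), m)."""
    h = t[1] - t[0]
    n = len(t) - 2                                   # number of interior nodes
    A = (nu / h**2) * (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) \
        + gamma * np.eye(n)
    f = np.asarray(f, dtype=float)
    mu = np.zeros_like(f)
    mu[1:-1] = np.linalg.solve(A, f[1:-1])
    return mu

# The weighted H^1-gradient (4.15) below is then simply u + mu, componentwise.
```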

With the setting above, the \({\widetilde{H}}^1\)-gradient is given, for \(j=1,2\) and all \(r=1\ldots d\), by

$$\begin{aligned} {\widetilde{\nabla }}_{u^r_j} \widehat{J}(u)\,=\, u^r_j\, +\, \mu ^r_j. \end{aligned}$$
(4.15)

The optimality condition (4.12) then becomes

$$\begin{aligned} \big ( u^r_j\, +\, \mu ^r_j \;,\; v^r_j\,-\,u^r_j \big )_{\widetilde{H}^1_T}\, \ge \, 0 \end{aligned}$$
(4.16)

for all \(v\,\in \,U_{ad}\), \(j=1,2\) and \(1\le r\le d\).

The case \(\mathbf {\delta >0}\). In this case, an \(L^1\) norm of the control appears in the cost functional. This term is not Gâteaux differentiable, and the discussion becomes more involved. Using the control-to-state map, we start by defining

$$\begin{aligned} f(u)\,&:=\,\int _0^T \int _{{{\mathbb {R}}}^d} \theta (x) \, G(u)(x,t) \, dx\, dt\, +\, \int _{{{\mathbb {R}}}^d} \varphi (x) \, G(u)(x,T) \, dx\,\\&\quad +\,\frac{\gamma }{2} \int _0^T \big |u(t)\big |^2 \, dt\,+\, \frac{\nu }{2}\int _0^T \left| \frac{d}{d t} u(t)\right| ^2 \, dt \\ g(u)\,&:= \,\delta \, \left\| u \right\| _{L^1_T}. \end{aligned}$$

The \(L^1\)-cost, represented by g, admits a subdifferential \(\partial g(u)\, =\, \delta \, \partial \big (\left\| u \right\| _{L^1}\big ) \); see e.g. Section 2.3 of [5]. If we denote by \(\mathbb {L}_T^*\,:=\,\big (\mathbb {L}^\infty _T(\mathbb {R}^d)\big )^*\) the dual space and by \(\langle \cdot ,\cdot \rangle \) the duality product in \(\mathbb {L}^*_T\times \mathbb {L}^\infty _T\), the following formula holds true:

$$\begin{aligned} \partial \big (\left\| u \right\| _{L^1}\big )\,&= \,\Big \{ \phi \in \mathbb {L}^*_T\;\big |\quad \left\| v \right\| _{L^1}\, -\, \left\| u \right\| _{L^1}\,\ge \,\big \langle \phi ,\,v-u\big \rangle \quad \forall \, v \in U_{ad} \Big \} \nonumber \\&=\,{\left\{ \begin{array}{ll} \left\{ \phi \in \mathbb {L}^*_T\;\big |\quad \left\| \phi \right\| _{\mathbb {L}^*_T}\, =\,1,\;\langle \phi ,u\rangle \,=\, \left\| u \right\| _{\mathbb {L}^\infty _T}\right\} &{} \text{ if } \quad u\not \equiv 0 \\ \text {unit ball in }\;\mathbb {L}^*_T &{} \text{ if } \quad u\equiv 0 . \end{array}\right. } \end{aligned}$$
(4.17)

Now, the reduced functional can be written as \(\widehat{J}(u)\,=\, f(u)\, +\, g(u)\). In this case, Eqs. (4.10) and (4.11) in the corresponding optimality system are the same; however, we have a different optimality condition in place of (4.12). In the case \(\nu =0\), as in Theorem 2.2 of [14], we have the following result; for its proof, we refer to [14] and [23]. Notice that, as for Eqs. (4.10)–(4.11)–(4.12), Eq. (4.18) can be written even when G, and hence \(\widehat{J}\), are not Fréchet differentiable.

Theorem 4.2

Under assumptions (A.1)-(A.2)-(A.3)-(A.4), where we take \(m\ge 1\) and \(k\ge 1\), suppose moreover that the pair \((\rho ,u)\,\in \,C_T(H^m_k)\times U_{ad}\) is a minimizer for (4.8).

Then there exists a unique \(q\,\in \,C_T(L^2)\) which solves (4.11), and a \(\widehat{\lambda }\,\in \,\partial g(u)\) such that the following inequality condition is satisfied:

$$\begin{aligned}&\left( \gamma \,u^r_j\, +\, \widehat{\lambda }_j^r \,+\, \int _{\mathbb {R}^d}\mathrm{div}\,\left( \frac{\partial a}{\partial u^r_j}\,\rho \right) q\,dx\;,\; v^r_j-u^r_j \right) _{L^2(0,T)}\,\ge 0\nonumber \\&\quad \forall \,v \in U_{ad},\;j= 1,2,\;r = 1\ldots d . \end{aligned}$$
(4.18)

Moreover, there exist \(\lambda _+\) and \(\lambda _-\), belonging to \(L^ \infty _T(\mathbb {R}^d)\), such that (4.18) is equivalent to the equations

$$\begin{aligned} \left\{ \begin{array}{l} \gamma \, u^r_j\, + \, \displaystyle {\int _{\mathbb {R}^d}} \mathrm{div}\,\left( \dfrac{\partial a}{\partial u^r_j}\, \rho \right) q\,dx\,+\,(\lambda _+)_j^r-(\lambda _-)_j^r+\widehat{\lambda }_j^r\,=\, 0\\ (\lambda _+)_j^r\,\ge \, 0,\qquad u^b-u_j^r\, \ge \, 0,\qquad (\lambda _+)_j^r\;(u^b-u_j^r) \,=\, 0 \\ (\lambda _-)_j^r \,\ge \, 0,\qquad u_j^r - u^a\, \ge \, 0,\qquad (\lambda _-)_j^r\;(u_j^r-u^a)\, =\, 0 \\ \widehat{\lambda }_j^r\, =\, \delta \qquad \text{ a.e. } \text{ in } \quad \left\{ t \in [0,T]\;\big |\quad u_j^r(t)\,>\,0 \right\} \\ \left| \widehat{\lambda }_j^r\right| \,\le \,\delta \qquad \text{ a.e. } \text{ in } \quad \left\{ t \in [0,T]\;\big |\quad u_j^r(t)\,=\,0 \right\} \\ \widehat{\lambda }_j^r\, =\, -\,\delta \qquad \text{ a.e. } \text{ in } \quad \left\{ t \in [0,T]\;\big |\quad u_j^r(t)\,<\,0 \right\} , \end{array} \right. \end{aligned}$$

for \(j=1,2\) and all \(1\le r\le d\).

Remark 4.3

In our case, \(\widehat{\lambda }_j^r\) can be understood to be \(\delta \,\text {sgn}(u_j^r)\), where \(\text {sgn}(x)\) is the sign function, equal to 1 or \(-1\) depending on whether \(x>0\) or \(x<0\), respectively, and equal to 0 if \(x=0\).

Furthermore, we notice that the additional Lagrange multipliers \((\lambda _\pm )_j^r\) are due to the constraints \(u^a\,\le \, u(t)\, \le \, u^b\) for almost all \(t \in [0,T]\).

Finally, the case \(\delta >0\) and \(\nu >0\) can be treated as done before. After resorting once again to the space \({\widetilde{\mathbb {H}}}^1_T\), let \(\mu \) be the \({\widetilde{H}}^1\)-Riesz representative of the continuous linear functional

$$\begin{aligned} v\;\mapsto \; \left( \widehat{\lambda }\, +\, \int _{\mathbb {R}^d} \mathrm{div}\,\left( \frac{\partial a}{\partial u}\,\rho \right) q\,dx\;,\;v\right) _{\mathbb {L}^2_T}. \end{aligned}$$

Then, assuming that \(u\in U_{ad}\cap H^1_0\big ([0,T];{{\mathbb {R}}}^{2d}\big )\), we can compute \(\mu \) as above, by solving the equation

$$\begin{aligned} \left( -\, \nu \,\frac{d^2}{dt^2}\,+\,\gamma \right) \mu \,=\,\widehat{\lambda }\,+\,\int _{\mathbb {R}^d}\mathrm{div}\,\left( \frac{\partial a}{\partial u}\,\rho \right) \,q\,dx, \qquad \mu (0)\,=\,\mu (T)\,=\,0. \end{aligned}$$

With this definition, relation (4.15) still holds true, and the optimality condition (4.12) can be expressed once again by equations (4.16).

4.3 Uniqueness of optimal controls

In this section, we prove uniqueness of optimal controls in the situation where \(\delta =0\) and \(\nu =0\) in (4.3). Our proof relies on the characterization of optimal controls as solutions to the corresponding optimality system. The cases \(\delta >0\) or \(\nu >0\) are more involved and are left aside in our discussion.

To begin with, in order to prove uniqueness, we need additional regularity on the cost functions \(\theta \) and \(\varphi \). We then formulate the following assumption, which strengthens (A.4).

(A.4)* :

Suppose that both \(\theta \) and \(\varphi \) belong to \(H^1_1({{\mathbb {R}}}^d)\).

In the constrained-control case, the characterization of optimal controls is given by an inequality, see (4.12). This is very weak information, and it is the reason why we are able to prove uniqueness only under a smallness condition, either on the time T or on the size of the data \(\rho _0\), g, \(\nabla a_0\), \(\theta \) and \(\varphi \) in their respective functional spaces.

Let us recall that existence of an optimal control has been proved in Theorem 4.1 above.

Theorem 4.3

Under assumptions (A.1)-(A.2)-(A.3)-(A.4)*, suppose that both \(m\ge 2\) and \(k\ge 2\). Take moreover \(\delta =\nu =0\) in (4.3). Finally, define

$$\begin{aligned}&\widetilde{K}\,:=\,C\,\exp \Big (C\left( \left\| \nabla a_0\right\| _{L^1_T(C^2_b)}+T\,\max \left\{ \big |u^a\big |,\,\big |u^b\big |\right\} \right) \Big )\, \left( \left\| \rho _0\right\| _{H^2_2}\,+\,\left\| g\right\| _{L^1_T(H^2_2)}\right) \,\\&\quad \times \left( \left\| \varphi \right\| _{H^1_1}\,+\,T\,\left\| \theta \right\| _{H^1_1}\right) , \end{aligned}$$

where the constant \(C>0\) can be taken as the maximum of the constants C appearing in (4.20), (4.22), (4.23) and in the definition (3.3) of \(K^{(2)}_1\).

If the condition \({\widetilde{K}}\,T/\gamma \,<\,1\) holds true, then there exists at most one optimal control \(u^*\) in \(\mathrm{int\,}U_{ad}\).

Proof

The previous result being classical in optimal control problems, let us just give a sketch of the proof. Let \((u,\rho _1,q_1)\) and \((v,\rho _2,q_2)\) be two optimal triplets solving the minimization problem (4.8). From (4.13) we deduce that, for all \(w\in U_{ad}\),

$$\begin{aligned}&\left( \gamma u + \int _{\mathbb {R}^d}\mathrm{div}\,\big ((e+x)\rho _1\big )q_1,\,u-w\right) _{\mathbb {L}^2_T}\le 0\\&\quad \text{ and } \qquad \left( \gamma v + \int _{\mathbb {R}^d}\mathrm{div}\,\big ((e+x)\rho _2\big )q_2,\,w-v\right) _{\mathbb {L}^2_T}\ge 0. \end{aligned}$$

Take \(w=v\) in the former inequality, \(w=u\) in the latter and compute the difference of the resulting expressions. After setting \(\delta \rho \,:=\,\rho _1-\rho _2\) and \(\delta q\,:=\,q_1-q_2\), straightforward computations lead to

$$\begin{aligned} \gamma \int ^T_0|u(t)-v(t)|^2\,dt\,&\le \, \int ^T_0\left( \int _{\mathbb {R}^d}\left| \mathrm{div}\,\big ((e+x)\,\delta \rho \big ) q_1\right| +\int _{\mathbb {R}^d}\left| \mathrm{div}\,\big ((e+x)\,\rho _2\big )\,\delta q\right| \right) \nonumber \\&\quad \times |u(t)-v(t)|\,dt. \end{aligned}$$
(4.19)

Now we estimate the two space integrals, at any time \(t\in [0,T]\). We start with the former term, for which we obtain

$$\begin{aligned}&\int _{\mathbb {R}^d}\left| \mathrm{div}\,\big ((e+x)\,\delta \rho (t)\big )\,q_1(t)\right| \,dx\,\le \,\left\| q_1(t)\right\| _{L^2}\,\left\| \mathrm{div}\,\big ((e+x)\,\delta \rho (t)\big )\right\| _{L^2}\\&\quad \le \,C_1\,\left( \left\| \delta \rho (t)\right\| _{L^2}\,+\,\left\| \big (1+|x|\big )\,\nabla \delta \rho (t)\right\| _{L^2}\right) \;\le \,C_1\,\left\| \delta \rho (t)\right\| _{H^1_1}, \end{aligned}$$

where we have also used Theorem 2.3 applied to the transport Eq. (4.11) for treating the \(q_1\) term. Notice that the constant \(C_1\) can be expressed as

$$\begin{aligned} C_1\,:=\,C\,\exp \Big (C\left( \left\| \mathrm{div}\,a_0\right\| _{L^1_T(L^\infty )}+\left\| u\right\| _{\mathbb {L}^1_T}\right) \Big ) \big (\left\| \varphi \right\| _{L^2}\,+\,T\,\left\| \theta \right\| _{L^2}\big ), \end{aligned}$$
(4.20)

for a “universal” constant \(C>0\) that depends on the space dimension d. At this point, we recall that both \(\rho _1\) and \(\rho _2\) satisfy Eq. (4.10), with controls u and v, respectively. Then, taking their difference and applying Lemma 3.1 finally yields, for a new constant \({\widetilde{C}}_1\,=\,C_1\,K_1^{(2)}\) depending only on the data of the problem, the following bound:

$$\begin{aligned} \int _{\mathbb {R}^d}\left| \mathrm{div}\,\big ((e+x)\,\delta \rho (t)\big ) q_1(t)\right| \,dx\,\le \,{\widetilde{C}}_1\,\int ^t_0\big |u(\tau )-v(\tau )\big |\,d\tau . \end{aligned}$$
(4.21)

Next, consider the second integral in (4.19). The computations are similar to the previous ones: first of all, we can estimate

$$\begin{aligned} \int _{\mathbb {R}^d}\left| \mathrm{div}\,\big ((e+x)\rho _2(t)\big )\,\delta q(t)\right| \,dx\,&\le \, \left\| \delta q(t)\right\| _{L^2}\,\left\| \rho _2(t)\right\| _{H^1_1}\;\le \,C_2\,\left\| \delta q(t)\right\| _{L^2}, \end{aligned}$$

where we have applied Theorem 2.4 to equation (4.10) for \(\rho _2\) to control its \(H^1_1\) norm. In particular, it follows from that theorem that

$$\begin{aligned} C_2\,:=\,C\,\exp \left( C\left( \left\| \nabla a_0\right\| _{L^1_T(C^1_b)}\,+\,\left\| v\right\| _{\mathbb {L}^1_T}\right) \right) \, \left( \left\| \rho _0 \right\| _{H^1_1}\, +\, \left\| g \right\| _{L^1_T(H^1_1)} \right) , \end{aligned}$$
(4.22)

for a “universal” constant \(C>0\).

Now, we use the fact that \(q_1\) and \(q_2\) are both solutions of (4.11), related to the controls u and v respectively. Hence, taking the difference of those equations and arguing as in the proof of Lemma 3.1 (keep in mind also Remark 2.3), one easily infers the existence of a “universal” constant \(C>0\) such that

$$\begin{aligned} \left\| \delta q(t)\right\| _{L^2}\,&\le \,C\,\exp \Big (C\left( \left\| \mathrm{div}\,a_0\right\| _{L^1_T(L^\infty )}\,+\,\left\| u\right\| _{\mathbb {L}^1_T}\right) \Big )\,\\&\quad \times \int ^T_t|u(\tau )-v(\tau )|\,\left\| \big (1+|x|\big )\,\nabla q_2(\tau )\right\| _{L^2}\,d\tau \\&\le \,C\,\exp \Big (C\left( \left\| \nabla a_0\right\| _{L^1_T(C^1_b)}+\left\| u\right\| _{\mathbb {L}^1_T}+\left\| v\right\| _{\mathbb {L}^1_T}\right) \Big ) \left( \left\| \varphi \right\| _{H^1_1}+T\,\left\| \theta \right\| _{H^1_1}\right) \\&\quad \times \int ^T_t\big |u(\tau )-v(\tau )\big |\,d\tau . \end{aligned}$$

Notice that the integral is from t to T, because (4.11) is a backward transport equation. After defining the constants

$$\begin{aligned} {\widetilde{K}}^{(1)}_1\,:=\,C\,\exp \Big (C\left( \left\| \nabla a_0\right\| _{L^1_T(C^1_b)}+\left\| u\right\| _{\mathbb {L}^1_T}+\left\| v\right\| _{\mathbb {L}^1_T}\right) \Big ) \left( \left\| \varphi \right\| _{H^1_1}+T\,\left\| \theta \right\| _{H^1_1}\right) \end{aligned}$$
(4.23)

and \({\widetilde{C}}_2\,:=\,C_2\,{\widetilde{K}}^{(1)}_1\), we obtain

$$\begin{aligned} \int _{\mathbb {R}^d}\left| \mathrm{div}\,\big ((e+x)\,\rho _2(t)\big )\,\delta q(t)\right| \,dx\,\le \,{\widetilde{C}}_2\,\int ^T_t\big |u(\tau )-v(\tau )\big |\,d\tau . \end{aligned}$$
(4.24)

At this point, we can insert estimates (4.21) and (4.24) into (4.19), and get, for a new constant \(K\,=\,{\widetilde{C}}_1+{\widetilde{C}}_2\), the relation

$$\begin{aligned} \gamma \int ^T_0\big (\sigma (t)\big )^2\,dt\,\le \,K\,\int ^T_0\sigma (t)\,\left( \int ^T_0\sigma (s)\,ds\right) \,dt\,=\,K\left( \int ^T_0\sigma (s)\,ds\right) ^2, \end{aligned}$$

where, for simplicity of notation, we have defined \(\sigma (t)\,:=\,\big |u(t)-v(t)\big |\). Hence, by the Cauchy-Schwarz inequality we easily deduce

$$\begin{aligned} \gamma \int ^T_0\big (\sigma (t)\big )^2\,dt\,\le \,K\,T\int ^T_0\big (\sigma (t)\big )^2\,dt, \end{aligned}$$

which obviously implies \(\sigma \equiv 0\) almost everywhere on [0, T] whenever \(K\,T/\gamma \,<\,1\). Then, we conclude the proof by remarking that \(K\,\le \,\widetilde{K}\). \(\square \)

4.4 The case of confining \(\theta \) and \(\varphi \)

As pointed out in Remark 4.1, from the applications viewpoint, it may be desirable to consider the case when both \(\theta \) and \(\varphi \) are quadratic potentials. In this section, we discuss the necessary adaptations to be implemented in our arguments in order to address this case.

Therefore, from now on we choose

$$\begin{aligned} \theta (x)\,=\,|x|^2\qquad \text{ and } \qquad \varphi (x)\,=\,|x|^2, \end{aligned}$$

although the discussion can be further adapted, in order to treat more general polynomial growths. In order to simplify the presentation, we also assume that \(\delta =\nu =0\).

First of all, we notice that, in view of (4.3), for J to be well-defined it is necessary that \(|x|^2\,\rho \) belongs to \(L^1\). Then, we have to assume higher integrability on \(\rho \), namely that

$$\begin{aligned} \rho \,\in \,C\big ([0,T];L^2_k({{\mathbb {R}}}^d)\big ),\qquad \qquad \text{ for } \text{ some } \quad k\,>\,2\,+\,\frac{d}{2}. \end{aligned}$$

This of course entails that, in (A.1), one has to take \(\rho _0\,\in \,H^m_k\) and \(g\,\in \,L^1_T(H^m_k)\), with the same restriction \(k>2+d/2\). However, Theorem 4.1 still holds true.

The main changes pertain to Sect. 4.2, starting from the definition (4.9) of the functional \(\mathcal {L}\). First of all, let us focus on the Lagrange multiplier q. On the one hand, we need it to be in some duality pairing with \(\rho \): then, keeping in mind Definition 2.1, we introduce, for \((m,k)\in {{\mathbb {N}}}^2\), the spaces

$$\begin{aligned} H^m_{-k}({{\mathbb {R}}}^d)\,:=\,\left\{ f\in H^m_{\mathrm{loc}}({{\mathbb {R}}}^d)\;\big |\quad \big (1+|x|\big )^{-k}\,D^\alpha f\;\in \;L^2(\mathbb {R}^d)\quad \forall \;0\le |\alpha |\le m\right\} . \end{aligned}$$

This space is endowed with the natural norm

$$\begin{aligned} \Vert f\Vert _{H^m_{-k}}\,=\,\sum _{0\le |\alpha |\le m}\left\| \big (1+|x|\big )^{-k}\,D^\alpha f\right\| _{L^2}. \end{aligned}$$

On the other hand, we still expect q to solve (4.11) in some sense, although the meaning of that equation is now less clear, since \(\theta \) and \(\varphi \) no longer belong to \(L^2\). To deal with both issues, we need the following lemma, whose proof can be performed by arguing as in the proof of Theorem 2.4 above, this time using the weight \(\big (1+|x|\big )^{-k}\). We omit the details here.

Lemma 4.1

Let \(T>0\) and \((m,k)\in {{\mathbb {N}}}^2\) be fixed, and let a be a vector field satisfying hypotheses (2.3). Moreover, assume that \(q_0\in H^m_{-k}({{\mathbb {R}}}^d)\) and \(g\in L^1\big ([0,T];H^m_{-k}({{\mathbb {R}}}^d)\big )\).

Then there exists a unique solution \(q\,\in \,C\bigl ([0,T];H^{m}_{-k}(\mathbb {R}^d)\bigr )\) to the problem

$$\begin{aligned} \partial _tq\,+\,a\cdot \nabla q\,=\,g,\qquad \qquad \hbox { with }\quad q_{|t=0}\,=\,q_0. \end{aligned}$$

Moreover, there exists a constant \(C>0\) such that the following estimate holds true for any \(t\in [0,T]\):

$$\begin{aligned} \left\| q(t) \right\| _{H^m_{-k}}\,\le \,C\,\exp \left( C\,\int _0^t\left\| \nabla a(\tau )\right\| _{C^m_b}\,d\tau \right) \, \left( \left\| q_0 \right\| _{H^m_{-k}}\, +\, \int _0^t \left\| g(\tau ) \right\| _{H^m_{-k}}\, d\tau \right) . \end{aligned}$$
(4.25)

Let us come back to our optimal control problem. In view of Lemma 4.1, we can solve Eq. (4.11) with \(\theta \) and \(\varphi \) equal to \(|x|^2\), getting a unique solution in the space \(C_T(L^2_{-k})\) for any \(k>2+d/2\). Let us fix, once and for all, the choice

$$\begin{aligned} k_0\,=\,3\,+\,\left[ \frac{d}{2}\right] . \end{aligned}$$

Then, it is easy to see that the functional \(\mathcal {L}\) is well-defined on the space

$$\begin{aligned} \widetilde{\mathbb {X}}_T\;:=\;\left( W^{1,1}_T(L^2_{k_0})\,\cap \,L^\infty _T(H^1_{k_0+1})\right) \;\times \;\mathbb {L}^2_T\;\times \;C_T(L^2_{-k_0}). \end{aligned}$$

Of course, we also need to take \(\rho _0\) and g as in assumption (A.1), with \(m\ge 1\) and \(k\ge k_0+1\).

Thereafter, we can write the optimality system (4.10)–(4.11)–(4.12), as done above. In order to characterize Eq. (4.12) in terms of the gradient of the reduced functional \(\widehat{J}\), we need to further assume that \(m\ge 2\) and \(k\ge k_0+2\).

Finally, the analysis in Sect. 4.3 also works similarly to the above. Of course, assumption (A.4)* is now too strong, and we have to dispense with it.

However, it is still possible to get a result analogous to Theorem 4.3. More precisely, we have the following statement for the unconstrained problem.

Proposition 4.1

Under assumptions (A.1)-(A.2)-(A.3), suppose also that both \(m\ge 2\) and \(k\ge k_0+2\). In addition, take \(\delta =\nu =0\) in (4.3), and \(\theta (x)\,=\,\varphi (x)\,=\,|x|^2\). Finally, define

$$\begin{aligned} \widetilde{\mathcal K}\,&:=\,C\,(1+T)\,\left\| \big (1+|x|\big )^{-k_0+2}\right\| _{L^2}\,\times \\&\qquad \qquad \times \,\exp \Big (C\left( \left\| \nabla a_0\right\| _{L^1_T(C^2_b)}+T\,\max \left\{ \big |u^a\big |,\,\big |u^b\big |\right\} \right) \Big )\, \left( \left\| \rho _0\right\| _{H^2_{k_0+2}}\,+\,\left\| g\right\| _{L^1_T(H^2_{k_0+2})}\right) , \end{aligned}$$

where \(C>0\) is a suitable positive constant.

If the condition \(\widetilde{\mathcal K}\,T/\gamma \,<\,1\) holds true, then there exists at most one optimal control \(u^*\) in \(\mathrm{int\,}U_{ad}\).

Proof

The proof is very similar to that of Theorem 4.3; therefore, we limit ourselves to highlighting the main changes to be adopted and to treating the most delicate points of the analysis.

As before, let \((u_1,\rho _1,q_1)\) and \((u_2,\rho _2,q_2)\) be two optimal controls with corresponding state and adjoint state. Arguing as above, we find that \(\delta u=u_1-u_2\) fulfils estimate (4.19). Let us now focus on the estimate of each integral appearing in that relation.

As for the former integral, using also Lemma 4.1, we can write

$$\begin{aligned} \int _{\mathbb {R}^d}\left| \mathrm{div}\,\big ((e+x)\,\delta \rho (t)\big )\,q_1(t)\right| \,dx\,&\le \,\left\| q_1(t)\right\| _{L^2_{-k_0}}\,\left\| \mathrm{div}\,\big ((e+x)\,\delta \rho (t)\big )\right\| _{L^2_{k_0}}\\&\quad \le \,C_3\,\left\| \delta \rho (t)\right\| _{H^1_{k_0+1}}. \end{aligned}$$

Notice that the constant \(C_3\) can be expressed as follows

$$\begin{aligned} C_3\,:=\,C\,(1+T)\,\left\| |x|^2\,\big (1+|x|\big )^{-k_0}\right\| _{L^2}\,\exp \Big (C\left( \left\| \mathrm{div}\,a_0\right\| _{L^1_T(L^\infty )}+\left\| u_1\right\| _{\mathbb {L}^1_T}\right) \Big ), \end{aligned}$$
(4.26)

for a “universal” constant \(C>0\). At this point, the estimate for \(\delta \rho \) works as before, finally leading to

$$\begin{aligned} \int _{\mathbb {R}^d}\left| \mathrm{div}\,\big ((e+x)\,\delta \rho (t)\big ) q_1(t)\right| \,dx\,\le \,{\widetilde{C}}_3\,\int ^t_0\big |\delta u(\tau )\big |\,d\tau , \end{aligned}$$
(4.27)

where we have defined \(\widetilde{C}_3\,=\,C_3\,K_1^{(k_0+2)}\), which depends only on the data of the problem.

Next, consider the second integral in (4.19): we can estimate

$$\begin{aligned} \int _{\mathbb {R}^d}\left| \mathrm{div}\,\big ((e+x)\,\rho _2(t)\big )\,\delta q(t)\right| \,dx\,&\le \, \left\| \delta q(t)\right\| _{L^2_{-k_0}}\,\left\| \rho _2(t)\right\| _{H^1_{k_0+1}}\;\le \,C_4\,\left\| \delta q(t)\right\| _{L^2_{-k_0}}, \end{aligned}$$

where, by Theorem 2.4 applied to Eq. (4.10) for \(\rho _2\), we obtain that

$$\begin{aligned} C_4\,:=\,C\,\exp \left( C\left( \left\| \nabla a_0\right\| _{L^1_T(C^1_b)}\,+\,\left\| u_2\right\| _{\mathbb {L}^1_T}\right) \right) \, \left( \left\| \rho _0 \right\| _{H^1_{k_0+1}}\, +\, \left\| g \right\| _{L^1_T(H^1_{k_0+1})} \right) , \end{aligned}$$
(4.28)

for a “universal” constant \(C>0\). On the other hand, Lemma 4.1 applied to the equation for \(\delta q\) gives, for a new constant \(C>0\), the estimate

$$\begin{aligned} \left\| \delta q(t)\right\| _{L^2_{-k_0}}\,&\le \,C\,\exp \Big (C\left( \left\| \mathrm{div}\,a_0\right\| _{L^1_T(L^\infty )}\,+\,\left\| u_1\right\| _{\mathbb {L}^1_T}\right) \Big )\,\\&\quad \times \int ^T_t|\delta u(\tau )|\,\left\| \big (1+|x|\big )\,\nabla q_2(\tau )\right\| _{L^2_{-k_0}}\,d\tau . \end{aligned}$$

Notice that \(\left\| \big (1+|x|\big )\,\nabla q_2(\tau )\right\| _{L^2_{-k_0}}\,\le \,\left\| \nabla q_2(\tau )\right\| _{L^2_{-k_0+1}}\). In order to bound this quantity, we can differentiate the equation for \(q_2\) with respect to \(x^j\), for \(1\le j\le d\), and get (notice that \(\partial _j|x|^2\,=\,2\,x^j\))

$$\begin{aligned}&\partial _t\left( \big (1+|x|\big )^{-k_0+1}\,\partial _jq_2\right) \,+\,a(t,x;u_2)\cdot \nabla \left( \big (1+|x|\big )^{-k_0+1}\,\partial _jq_2\right) \,= \\&\qquad =\,2\,x^j\,\big (1+|x|\big )^{-k_0+1}\,-\,\big (1+|x|\big )^{-k_0+1}\,\partial _ja(t,x;u_2)\cdot \nabla q_2, \end{aligned}$$

with initial datum equal to \(2\,x^j\,\big (1+|x|\big )^{-k_0+1}\). Obviously, the latter term on the right-hand side can be absorbed by a Grönwall argument; in addition, an easy computation shows that the former is in \(L^2\). Therefore, applying an \(L^2\) estimate of Theorem 2.3 to the previous equation implies, for a “universal” constant \(C>0\), the following bound:

$$\begin{aligned} \left\| \nabla q_2(\tau )\right\| _{L^2_{-k_0+1}}\,\le \,C\,\exp \Big (C\left( \left\| \nabla a_0\right\| _{L^1_T(L^\infty )}\,+\,\left\| u_2\right\| _{\mathbb {L}^1_T}\right) \Big )\,(1+T)\, \left\| |x|\,(1+|x|)^{-k_0+1}\right\| _{L^2}. \end{aligned}$$

By use of this latter estimate, we finally obtain

$$\begin{aligned} \int _{\mathbb {R}^d}\left| \mathrm{div}\,\big ((e+x)\,\rho _2(t)\big )\,\delta q(t)\right| \,dx\,\le \,\widetilde{C}_4\,\int ^T_t\big |\delta u(\tau )\big |\,d\tau , \end{aligned}$$
(4.29)

where we have defined \(\widetilde{C}_4\,:=\,C_4\,\widetilde{\mathcal K}^{(1)}_1\) and

$$\begin{aligned} \widetilde{\mathcal K}^{(1)}_1\,:{=}\,C\,\exp \Big (C\left( \left\| \nabla a_0\right\| _{L^1_T(C^1_b)}{+}\left\| u_1\right\| _{\mathbb {L}^1_T}{+}\left\| u_2\right\| _{\mathbb {L}^1_T}\right) \Big )\,(1+T)\, \left\| |x|\,(1+|x|)^{-k_0+1}\right\| _{L^2}. \end{aligned}$$
(4.30)

We can now insert (4.27) and (4.29) into (4.19), and conclude as done in the proof of Theorem 4.3. \(\square \)