Abstract
In this paper we model the role of a government of a large population as a mean field optimal control problem. Such control problems are constrained by a PDE of continuity-type, governing the dynamics of the probability distribution of the agent population. We show the existence of mean field optimal controls both in the stochastic and deterministic setting. We derive rigorously the first order optimality conditions useful for numerical computation of mean field optimal controls. We introduce a novel approximating hierarchy of sub-optimal controls based on a Boltzmann approach, whose computation requires a very moderate numerical complexity with respect to the one of the optimal control. We provide numerical experiments for models in opinion formation comparing the behavior of the control hierarchy.
1 Introduction
Self-organization in social interactions is a fascinating mechanism, which has inspired the mathematical modeling of multi-agent interactions leading to the formation of coherent global behaviors, with applications in the study of biological, social, and economic phenomena. Recently there has been a vigorous development of the literature in applied mathematics and physics describing the collective behavior of multi-agent systems [41,42,43, 52, 56, 58, 82], aimed at modeling phenomena in biology, such as cell aggregation and motility [21, 59, 60, 73], coordinated animal motion [12, 27, 34, 37,38,39, 43, 66, 70, 71, 74, 79, 85], coordinated human behavior [40, 44, 76], and synthetic agent behavior and interactions, such as cooperative robots [35, 63, 72, 77]. As it is very hard to account exhaustively for all the developments of this fast-growing field, we refer to [28,29,30, 33, 81] for recent surveys.
Two main mechanisms are considered in such models to drive the dynamics. The first, which takes inspiration, e.g., from the physical laws of motion, is based on binary forces encoding observed “first principles” of biological, social, or economic interactions. Most of these models start from particle-like systems, borrowing a leaf from Newtonian physics, by including fundamental “social interaction” forces within classical systems of first or second order equations. In this paper we mix general principles with concrete modeling instances, to meet the need both for a certain level of generality and for immediately concrete applications. Accordingly, we consider here mainly large particle/agent systems of the form:
where \(P(\cdot ,\cdot )\) represents the communication function between agents \(x_i \in \mathbb R^d\) and \(B_i^t\) is a d-dimensional Brownian motion.
The second mechanism, which we do not address in detail here, is based on evolutive games, where the dynamics is driven by the simultaneous optimization of costs by the players, perhaps subjected to selection, from game theoretic models of evolution [54] to mean field games, introduced in [62] and independently under the name Nash Certainty Equivalence (NCE) in [55], later greatly popularized, e.g., within consensus problems, for instance in [67, 68].
The common viewpoint of these branches of mathematical modeling of multi-agent systems is that the dynamics are based on the free interaction of the agents or decentralized control. The phenomenon one wishes to describe is their self-organization in terms of the formation of complex macroscopic patterns.
One fundamental goal of these studies is in fact to reveal the possible relationship between the simple binary forces acting at individual level, being the “first principles” of social interaction or the game rules, and the potential emergence of a global behavior in the form of specific patterns.
For instance one can use the model in (1.1), for \(d=1\) and \(x_i\in I=[-1,1]\), a bounded interval, to formulate classical opinion models, where \(x_i\) represents an opinion in the continuous set between two opposite opinions \(\{-1,1\}\). Depending on the choice of the communication function \(P(\cdot ,\cdot )\), consensus may or may not emerge, and several studies have addressed how to enforce the emergence of a global consensus [5,6,7, 45, 80]. The mathematical property that allows a system to form patterns is its persistent compactness, and there are several mechanisms that promote compactness and eventually yield self-organization. In the recent paper [65], for instance, the authors identify heterophilia, i.e., the tendency to bond more with those who are “different” rather than those who are similar, as a positive mechanism in consensus models to reach accord. However, also in homophilous societies influenced by more local interactions, global self-organization towards consensus can be expected as soon as enough initial coherence is given. At this point, and perhaps reminiscent of biblical stories from Genesis, one could enthusiastically argue “Let us give them good rules and they will find their way!” Unfortunately, this is not true at all. In fact, in homophilous regimes there are plenty of situations where patterns will not spontaneously form. In Sect. 5 below we demonstrate with a few simple numerical examples the incompleteness of the self-organization paradigm, and we refer to [18] for its systematic discussion. Consequently, we propose to amend it by allowing possible external interventions in the form of centralized controls. Human society calls them government.
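The failure of spontaneous consensus in a homophilous regime can be illustrated with a few lines of code. The following is only a minimal sketch under stated assumptions: it uses the deterministic dynamics \(\dot{x}_i = \frac{1}{N}\sum _j P(x_i,x_j)(x_j-x_i)\) without noise, a forward Euler discretization, and a bounded-confidence communication function \(P(x,y)=1\) if \(|x-y|\le r\) and 0 otherwise. The function name `bounded_confidence_run` and all parameter values are ours and purely illustrative; this indicator function is not Lipschitz and therefore lies outside the regularity assumptions used in Sect. 2.

```python
import numpy as np

def bounded_confidence_run(x0, radius, T=20.0, steps=2000):
    """Forward Euler for dx_i/dt = (1/N) sum_j P(x_i, x_j) (x_j - x_i),
    with the (assumed) indicator weight P = 1 if |x_i - x_j| <= radius."""
    x = np.asarray(x0, dtype=float).copy()
    dt = T / steps
    for _ in range(steps):
        diff = x[None, :] - x[:, None]        # diff[i, j] = x_j - x_i
        w = np.abs(diff) <= radius            # P(x_i, x_j) in {0, 1}
        x = x + dt * np.mean(w * diff, axis=1)
    return x

# Two opinion groups farther apart than the confidence radius
x0 = np.concatenate((np.linspace(-0.9, -0.6, 10), np.linspace(0.6, 0.9, 10)))
x_end = bounded_confidence_run(x0, radius=0.5)
```

In this configuration each group contracts to its own mean and the two groups never interact, so the opinions in `x_end` remain split into two clusters rather than reaching a global consensus.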
The general idea consists in considering dynamics of the form
where the control \(f = (f_1,\ldots ,f_N)\) minimizes a given functional J(x, f). As an example we can consider the following variational formulation
where \(\bar{x}\) represents a target point, \(\gamma \) is the penalization parameter of the control g, which is chosen among the admissible controls in \({\mathcal {U}}\), and \(\Psi :\mathbb R^d \rightarrow \mathbb R_+ \cup \{0 \}\) is a convex function. The choice of this particular cost function, and especially of the term \(\int _0^T \frac{1}{2}\int |x-\bar{x}|^2\mu (x,t)\,dx\), is arbitrary. It is consistent with our wish to mix general statements with instances of applications, and the cost function is chosen so as to provide immediately a specific instance oriented to opinion consensus problems. Models similar to (1.3) have recently been studied also for flocking dynamics in [4, 17, 24, 51], and one can of course consider many more instances, as long as one ensures enough continuity of the cost, see, e.g., [51].
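As the number of particles \(N \rightarrow \infty \), the finite dimensional optimal control problem with ODE constraints (1.2)–(1.3) converges to the following mean field optimal control problem [1, 15, 51, 61]:
$$\begin{aligned} \partial _t \mu + \nabla \cdot \left( \left( \mathcal {P}[\mu ] + f\right) \mu \right) = \sigma \Delta \mu , \end{aligned}$$ (1.4)
where the interaction force \(\mathcal P\) is given by
$$\begin{aligned} \mathcal {P}[\mu ](x) = \int P(x,y)(y-x)\,\mu (y,t)\,dy, \end{aligned}$$ (1.5)
and the solution \(\mu \) is controlled by the minimizer of the cost functional
$$\begin{aligned} J(\mu ,f) = \int _0^T \left( \frac{1}{2}\int |x-\bar{x}|^2\mu (x,t)\,dx + \gamma \int \Psi (f)\,\mu (x,t)\,dx \right) dt. \end{aligned}$$ (1.6)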
To a certain extent, the mean field optimal control problem (1.4)–(1.6) can be viewed as a generalization of optimal transport problems [14], for which \(P \equiv 0\), the term \(\int _0^T \frac{1}{2}\int |x-\bar{x}|^2\mu (x,t)\,dx\) does not appear in the cost, and final conditions are given. Unlike in mean field games [62], the goal here is not to derive the equilibria of a multi-player game, but rather to compute mean field optimal government strategies for a population so large that the curse of dimensionality would otherwise prohibit numerical solutions. The mean field optimal control problem (1.4)–(1.6) provides an artificial confinement vector field f, inducing the right amount of compactness to have global convergence to steady states (pattern formation). Local convergence, e.g., to global Maxwellians, is established for certain second order mean field-type equations in [32, 46]. Hence, our results can also be interpreted as an external model perturbation to induce global stability.
In this paper we provide a friendly introduction to mean field optimal controls of the type (1.4)–(1.6), showing their main analytical properties and furnishing a simple route to their numerical solution, which we call “the control hierarchy”. Although some of the results contained in this paper are certainly also derived elsewhere, see, e.g., [15, 51], we have made an effort to present them in a simplified form and to provide rigorous derivations.
In particular, in Sect. 2, we show the existence of mean field optimal controls for first order models for both stochastic and deterministic control problems. We also derive rigorously in Sect. 3 the corresponding first order optimality conditions, resulting in a coupled system of forward/backward time-dependent PDEs. The forward equation is given by (1.4), while the backward one is a nonlocal integro-differential advection-reaction-diffusion equation. The presence of nonlocal interaction terms in the form of integral functions is another feature which distinguishes mean field optimal control problems from classical mean field games [62] and optimal transport problems [14], where usually \(P \equiv 0\). The nonlocal terms pose additional challenges in the numerical solution, which are the subject of recent studies [22].
Although mean field optimal controls are designed to be independent of the number N of agents, providing a way to circumvent the curse of dimensionality as \(N\rightarrow \infty \), their numerical computation still requires solving the first-order optimality conditions. The complexity of their solution depends on the intrinsic dimensionality d of the agents, which is affordable only at moderate dimensions (e.g., \(d \le 3\)). For this reason, in Sect. 4 we approach the solution of the mean field optimal control by means of a novel hierarchy of suboptimal controls, computed by a Boltzmann approach: first one derives a control for a system of two representative particles, then one plugs it into a collisional operator considering the statistics of the interactions of a distribution of agents, and finally one performs a quasi-invariant limit to approximate the PDE of continuity-type governing the dynamics of the probability distribution of the agent population. For the two-particle system considered in the first step, we propose two suboptimal controls: the first is given by an instantaneous model predictive control on two interacting agents, which we call instantaneous control (IC), while the second stems from the solution of the binary optimal control problem by means of the Bellman dynamic programming principle, which we call finite horizon control (FH). These two controls have the advantage that the complexity of their computation is dramatically reduced with respect to the full mean field optimal control (OC), while still retaining the ability to induce government of the population. We describe in detail how they can be computed efficiently. In Sect. 5 we provide simple, easily implementable numerical approaches for solving one-dimensional mean field optimal control problems of the type (1.4)–(1.6).
Finally, we numerically compare the control hierarchy with the mean field optimal control in a model of opinion formation and show the quasi-optimality of the Boltzmann–Bellman (FH) control. To facilitate the reproducibility of our results and to allow other scientists to easily access this very exciting field, we provide at the link https://www-m15.ma.tum.de/Allgemeines/SoftwareSite the Matlab code used to produce our numerical experiments.
2 Existence of Mean Field Optimal Controls
2.1 Deterministic Case
In this section, we study global existence and uniqueness of weak solutions for Eq. (1.4) in \(\mathbb R^d\) without the diffusion, i.e., \(\sigma =0\), namely
We also investigate the mean field limit of the ODE constrained control problem (1.2)–(1.3) in the deterministic setting. Let us denote by \( {\mathcal {M}}(\mathbb R^d)\) and \( {\mathcal {M}}_p(\mathbb R^d)\) the sets of all probability measures on \(\mathbb R^d\) and of those with finite moments of order \(p \in [1,\infty )\), respectively. We first define a notion of weak solution to Eq. (2.1).
Definition 2.1
For a given \(T > 0\), we call \(\mu \in \mathcal {C}([0,T]; {\mathcal {M}}_1(\mathbb R^d))\) a weak solution of (2.1) on the time-interval [0, T] if for all compactly supported test functions \(\varphi \in \mathcal {C}^\infty _c(\mathbb R^d \times [0,T])\),
We also introduce a set of admissible controls \(\mathcal {F}_\ell ([0,T])\) in the definition below.
Definition 2.2
For a given T and \(q \in [1,\infty )\), we fix a control bound function \(\ell \in L^q(0,T)\). Then \(f \in \mathcal {F}_\ell ([0,T])\) if and only if
- (i) \(f : {\mathbb R^d \times [0,T]} \rightarrow \mathbb R^d\) is a Carathéodory function.
- (ii) \(f(\cdot ,t) \in W^{1,\infty }_{loc}(\mathbb R^d)\) for almost every \(t \in [0,T]\).
- (iii) \(|f(0,t)| + \Vert f(\cdot ,t)\Vert _\mathrm{{Lip}} \le \ell (t)\) for almost every \(t \in [0,T]\).
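For the existence and mean field limit, we use the topology on probability measures induced by the Wasserstein distance of order one, which is defined by
$$\begin{aligned} \mathcal {W}_1(\mu ,\nu ) := \inf _{\pi \in \Gamma (\mu ,\nu )} \int _{\mathbb R^{2d}} |x-y| \,d\pi (x,y), \end{aligned}$$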
where \(\Gamma (\mu ,\nu )\) is the set of all probability measures on \(\mathbb R^{2d}\) with first and second marginals \(\mu \) and \(\nu \), respectively. Note that \( {\mathcal {M}}_1(\mathbb R^d)\) is a complete metric space endowed with the \(\mathcal {W}_1\) distance, and \(\mathcal {W}_1\) is equivalently characterized in duality with Lipschitz continuous functions [84].
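For intuition, in dimension \(d=1\) the distance \(\mathcal {W}_1\) between two empirical measures with the same number of equally weighted atoms reduces to the mean absolute difference of the sorted samples, because the monotone rearrangement is an optimal coupling on the real line. The following minimal sketch assumes exactly this one-dimensional, equal-weight setting; the function name `w1_empirical_1d` is ours and is not part of the code accompanying the paper.

```python
import numpy as np

def w1_empirical_1d(x, y):
    """W_1 distance between two 1D empirical measures with the same
    number of equally weighted atoms (sorted-sample formula)."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("both samples must contain the same number of atoms")
    return float(np.mean(np.abs(x - y)))
```

For example, shifting every atom of a sample by a constant c gives a distance of |c|, and permuting the atoms of a sample leaves the distance at zero.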
The following result is a rather straightforward adaptation from [51] and we shall prove it rather concisely. For more details we refer the interested reader to [51], which is written in a more scholastic and perhaps more accessible form.
Theorem 2.1
Let the initial data \(\mu _0 \in { {\mathcal {M}}(\mathbb R^d)}\) and assume that \(\mu _0\) is compactly supported, i.e., there exists \(R > 0\) such that
where \(B(0,R) := \{ x \in \mathbb R^d : |x| < R\}\). Furthermore, we assume that \(P \in W^{1,\infty }(\mathbb R^{2d})\). Then, for a given \(f \in \mathcal {F}_\ell ([0,T])\), there exists a unique weak solution \(\mu \in \mathcal {C}([0,T]; {\mathcal {M}}_1(\mathbb R^d))\) to Eq. (1.4) with \(\sigma =0\). Furthermore, \(\mu \) is determined as the push-forward of the initial measure \(\mu _0\) through the flow map generated by the locally Lipschitz velocity field \(\mathcal {P}[\mu ] + f\). Moreover, if \(\mu ^i,i=1,2\) are two such solutions with initial data \(\mu _0^i\) satisfying the above assumption, we have
where \(C > 0\) depends only on \(\Vert P\Vert _{W^{1,\infty }}\), R, T, and \(\Vert \ell \Vert _{L^q}\).
Proof
\(\bullet \) Existence and Uniqueness Let \(\mu \in \mathcal {C}([0,T]; {\mathcal {M}}_1(\mathbb R^d))\) with compact support in B(0, R) for some positive constant \(R > 0\). Then we can easily show that the interaction force \(\mathcal {P}\) is locally bounded and Lipschitz:
and
On the other hand, since \(f \in \mathcal {F}_\ell ([0,T])\), the vector field \(\mathcal {P}[\mu ] + f\) is also locally bounded and Lipschitz. Combining this with the argument in [23, Theorem 3.10] and the existence theory for Carathéodory differential equations in [50], we obtain the local-in-time existence and uniqueness of weak solutions to the system (1.4) with \(\sigma =0\) in the sense of Definition 2.1. Note that these solutions exist as long as they remain compactly supported. Set
Let us consider the following characteristic \(X(t):=X(t;s,x): \mathbb R_+ \times \mathbb R_+ \times \mathbb R^d \rightarrow \mathbb R^d\):
with the initial data \(X_0 = x \in \mathbb R^d\). We notice that the characteristic is well-defined on the time interval [0, T] due to the regularity of the velocity field. A straightforward computation yields that for \(x,y \in \) supp\((\mu _0)\)
This yields
and
where C depends only on T, \(\Vert P\Vert _{L^\infty }\), and \(\Vert \ell \Vert _{L^q}\). Thus, by continuity arguments, we have the global existence of weak solutions. We can also find that for \(h \in \mathcal {C}^\infty _c(\mathbb R^d)\)
This implies that \(\mu \) is determined as the push-forward of the initial density through the flow map (2.2).
\(\bullet \) Stability Estimate Let \(T>0\) and \(\mu ^i,i=1,2\) be the weak solutions to Eq. (1.4) with \(\sigma = 0\) obtained in the above. Let \(X_i\) be the characteristic flows defined in (2.2) generated by the velocity fields \(\mathcal {P}[\mu ^i] + f\), respectively. For a fixed \(t_0 \in [0,T]\), we choose an optimal transport map for \(\mathcal {W}_1\) denoted by \(\mathcal {T}^0(x)\) between \(\mu ^1_{t_0}\) and \(\mu ^2_{t_0}\), i.e., \(\mu ^2_{t_0} = \mathcal {T}^0 \# \mu ^1_{t_0}\). It also follows from the above that \(\mu ^i_t = X_i(t;t_0,\cdot ) \# \mu ^i_{t_0}\) for \(t \ge t_0\). Furthermore, we get \(\mathcal {T}^t \# \mu ^1_t = \mu ^2_t\) with \(\mathcal {T}^t = {X_2(t;t_0,\cdot )} \circ \mathcal {T}^0 \circ X_1(t_0;t,\cdot )\) for \(t \in [t_0,T]\). Then we obtain
where \(I_i,i=1,2\) are estimated as follows.
where we used the fact that \(\mu \) has compact support for the estimate of \(I_1\). Combining the above estimates and using that \(t_0\) is arbitrary in [0, T], we conclude
This completes the proof. \(\square \)
In Theorem 2.1, we show the global existence and uniqueness of weak solutions \(\mu \) to Eq. (1.4) with \(\sigma = 0\) for a given control \(f \in \mathcal {F}_\ell ([0,T])\). In the rest of this part, we show the rigorous derivation of the infinite dimensional optimal control problem from the finite dimensional one as \(N \rightarrow \infty \). Let us recall the finite/infinite dimensional optimal control problems:
- Finite dimensional optimal control problem:
$$\begin{aligned} \min _{f \in \mathcal {F}_\ell }J(x,f):= \min _{f \in \mathcal {F}_\ell }\int _0^T\frac{1}{N}\sum _{i=1}^N\left( \frac{1}{2}|x_i-\bar{x}|^2 + \gamma \Psi ({f(x_i,t)})\right) \,dt, \end{aligned}$$ (2.3)
where \(x_i\) is the unique solution of
$$\begin{aligned} \dot{x}_i =\frac{1}{N}\sum _{j=1}^N P(x_i,x_j)(x_j-x_i) + {f(x_i,t)}, \quad i=1,\ldots ,N, \quad t > 0. \end{aligned}$$ (2.4)
- Infinite dimensional optimal control problem:
$$\begin{aligned} \min _{f \in \mathcal {F}_\ell }J(\mu _t,f):= \min _{f \in \mathcal {F}_\ell }\int _0^T\left( \frac{1}{2}\int _{\mathbb R^d} |x-\bar{x}|^2 \,\mu _t(dx) + \gamma \int _{\mathbb R^d} \Psi (f) \,\mu _t(dx) \right) \,dt, \end{aligned}$$ (2.5)
where \(\mu \in \mathcal {C}([0,T]; {\mathcal {M}}_1(\mathbb R^d))\) is the unique weak solution of
$$\begin{aligned} \partial _t \mu _t + \nabla \cdot \left( \left(\mathcal {P}[\mu _t] + f\right)\mu _t \right)&= 0, \quad (x,t) \in \mathbb R^d \times [0,T],\nonumber \\ \mathcal {P}[\mu _t](x)&= \int _{\mathbb R^d} P(x,y)(y-x)\mu _t(dy). \end{aligned}$$ (2.6)
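To make the finite dimensional problem concrete, the particle system (2.4) and the cost (2.3) can be evaluated numerically for a given control. The sketch below is only an illustration under stated assumptions: dimension \(d=1\), a forward Euler time discretization with a left Riemann sum for the time integral, the quadratic penalty \(\Psi (f)=|f|^2/2\), a smooth bounded communication function chosen by us, and a hypothetical linear feedback \(f(x,t)=-(x-\bar{x})\) that is not claimed to be optimal. The names `P` and `simulate` are ours.

```python
import numpy as np

def P(x, y):
    # Hypothetical smooth, bounded communication function (an assumption).
    return 1.0 / (1.0 + (x - y) ** 2)

def simulate(x0, f, T=1.0, steps=200, x_bar=0.0, gamma=0.5):
    """Forward Euler for (2.4) in d = 1; returns the final state and the
    cost (2.3), assuming Psi(f) = |f|^2 / 2 and a left Riemann sum in time."""
    x = np.asarray(x0, dtype=float).copy()
    dt = T / steps
    cost = 0.0
    for n in range(steps):
        u = f(x, n * dt)                              # control at each agent
        cost += dt * np.mean(0.5 * (x - x_bar) ** 2 + gamma * 0.5 * u ** 2)
        diff = x[None, :] - x[:, None]                # diff[i, j] = x_j - x_i
        drift = np.mean(P(x[:, None], x[None, :]) * diff, axis=1)
        x = x + dt * (drift + u)
    return x, cost

# Uncontrolled dynamics versus a simple (non-optimal) feedback towards x_bar = 0
x0 = np.linspace(-1.0, 1.0, 50)
x_free, J_free = simulate(x0, lambda x, t: np.zeros_like(x))
x_ctrl, J_ctrl = simulate(x0, lambda x, t: -(x - 0.0))
```

Because the chosen `P` is symmetric, the uncontrolled dynamics preserve the mean opinion, while the feedback adds a contraction towards the target; comparing `J_free` and `J_ctrl` shows the trade-off between the tracking term and the control penalty.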
For the convergence from (2.3)–(2.4) to (2.5)–(2.6), we need a weak compactness result in \(\mathcal {F}_\ell \) whose proof can be found in [51, Corollary 2.7].
Lemma 2.2
Let \(p \in (1,\infty )\). Suppose that \((f_j)_{j \in \mathbb N} \subset \mathcal {F}_\ell \) with \(\ell \in L^q(0,T)\) for \(1 \le q < \infty \). Then there exist a subsequence \((f_{j_k})_{k \in \mathbb N}\) and a function \(f \in \mathcal {F}_\ell \) such that
i.e.,
Define the empirical measure \(\mu ^N\) associated to the particle system (2.4) as
$$\begin{aligned} \mu ^N_t := \frac{1}{N}\sum _{i=1}^N \delta _{x_i(t)}. \end{aligned}$$
We are now in a position to state our theorem on the mean field limit of the optimal control problem.
Theorem 2.3
Let \(T >0\). Suppose that \(P \in W^{1,\infty }(\mathbb R^{2d})\) and \(\Psi \) satisfies that there exist \(C \ge 0\) and \(1 \le q < \infty \)
Let \(\ell (t)\) be a fixed function in \(L^q(0,T)\). Furthermore, we assume that \(\{x_i^0\}_{i=1}^N \subset B(0,R_0)\) for some \(R_0 > 0\) independent of N. For all \(N \in \mathbb N\), let the control function \(f_N \in \mathcal {F}_\ell \) be a solution of the finite dimensional optimal control problem (2.3)–(2.4). If there exists a compactly supported initial datum \(\mu _0 \in {\mathcal {M}}_1(\mathbb R^d)\) such that \(\lim _{N \rightarrow \infty }\mathcal {W}_1(\mu _0^N, \mu _0) = 0\), then there exist a subsequence \((f^{N_k}_t)_{k \in \mathbb N}\) and a function \(f^\infty _t\) such that \(f^{N_k}_t \rightarrow f^\infty _t\) in the sense of (2.7). Moreover, \(f^\infty _t\) and the corresponding \(\mu ^\infty _t\) are solutions of the infinite dimensional optimal control problem (2.5)–(2.6).
Proof
We first notice that the existence of an optimal control \(f^N_t\) on the time interval [0, T] for the finite dimensional optimal control problem (2.3)–(2.4) can be obtained by using the weak compactness estimate in Lemma 2.2 together with the strong regularity of the velocity field \(\mathcal P + f\), see [51, Theorem 3.3]. For any \(f \in \mathcal {F}_\ell ([0,T])\), let us denote by \((\mu _f)^N_t\) the solution to Eq. (2.4) with the initial data \((\mu _f)_0^N\) satisfying \(\lim _{N \rightarrow \infty } \mathcal {W}_1((\mu _f)_0^N,\mu _0) = 0\). Let us also denote by \(\mu ^{f_t}_t\) the solution of (2.6) with the control \(f_t\) and the initial data \(\mu _0\), whose existence is ensured by Theorem 2.1. Moreover, by the stability estimate in Theorem 2.1, \(\lim _{N \rightarrow \infty } \mathcal {W}_1((\mu _f)_t^N, \mu _t^{f_t}) = 0\). On the other hand, it follows from Lemma 2.2 that there exists a subsequence \(f^{N_k}_t\) such that \(f^{N_k}_t \rightharpoonup f^\infty _t\) weakly* in \(L^q(0,T;W^{1,p}(\mathbb R^d))\) as \(k \rightarrow \infty \) for some \(f^\infty _t \in \mathcal {F}_\ell \). Let \(\mu ^{\infty }_t\) be the solution to (2.6) with the control function \(f^\infty _t\). Then, by the lower semicontinuity of the cost functional, we get
where \(\mu ^{N_k}_t\) is a solution to the particle equation (2.4) with the optimal control \(f^{N_k}_t\). Then, due to the minimality of \(f^{N_k}_t\), it is clear that
We finally use the convergence \(\lim _{k \rightarrow \infty } \mathcal {W}_1((\mu _f)_t^{N_k}, \mu _t^f) = 0\) together with the compact support of the solution \(\mu _t\) to obtain
Since \(f_t\) is arbitrarily chosen in \(\mathcal {F}_\ell ([0,T])\), this concludes
i.e., \(f^\infty _t\) is the optimal control for the problem (2.5)–(2.6). \(\square \)
2.2 Stochastic Case
In this section, we study the parabolic optimal control problem in a bounded domain. We are to a certain extent inspired by the work [20]. As we deviate from it in certain estimates, we present the results in more detail than in the previous section.
Let \(\Omega \) denote an open, bounded, smooth subset of \(\mathbb R^d\). We first introduce function spaces:
and the set of admissible controls
for a given \(M>0\). Then our optimization problem is to show the existence of
where \(\mu \) is a weak solution to the following parabolic equation:
with the initial data
and the zero-flux boundary condition
where n(x) is the outward normal to \(\partial \Omega \) at the point \(x \in \partial \Omega \). Here the interaction term is given by
We next provide a notion of weak solution to Eq. (2.9).
Definition 2.3
For a given \(T >0\), a function \(\mu : \Omega _T \rightarrow [0,\infty )\) is a weak solution of Eq. (2.9) on the time-interval [0, T] if and only if
1. \(\mu \in L^2(0,T;H^1(\Omega ))\) and \(\partial _t \mu \in L^2(0,T; H^{-1}_*(\Omega ))\).
2. For any \(\varphi \in \mathcal {C}^1(\overline{\Omega _T})\) with \(\varphi (\cdot ,0) = \varphi (\cdot ,T) = 0\),
$$\begin{aligned} \int _0^T \int _\Omega \mu \partial _t \varphi +\left( \mathcal {P}[\mu ] \mu + f \mu - \sigma \nabla \mu \right)\cdot \nabla \varphi \,dx dt =0. \end{aligned}$$
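The weak formulation above corresponds to a conservative equation with total flux \(\mathcal {P}[\mu ]\mu + f\mu - \sigma \nabla \mu \) vanishing on the boundary. A discretization that mimics this structure is a finite-volume scheme in which the flux through the two boundary faces is set to zero, so that the total mass is preserved exactly. The following is a minimal one-dimensional sketch under our own assumptions (explicit Euler in time, upwind advection, centred diffusion, a midpoint quadrature for \(\mathcal {P}[\mu ]\), and illustrative choices of P, f, and all parameters); it is not the scheme used for the experiments in Sect. 5.

```python
import numpy as np

def fokker_planck_step(mu, x, dx, dt, sigma, f, P):
    """One explicit finite-volume step for
        d_t mu + d_x((P[mu] + f) mu) = sigma d_xx mu
    on a uniform 1D grid of cell centres x, with zero-flux boundaries."""
    # Nonlocal drift: P[mu](x_i) ~ sum_j P(x_i, x_j) (x_j - x_i) mu_j dx
    K = P(x[:, None], x[None, :]) * (x[None, :] - x[:, None])
    v = K @ mu * dx + f                        # velocity at cell centres
    v_face = 0.5 * (v[:-1] + v[1:])            # velocity at interior faces
    # Upwind advective flux plus centred diffusive flux at interior faces
    adv = np.where(v_face > 0.0, v_face * mu[:-1], v_face * mu[1:])
    flux = adv - sigma * (mu[1:] - mu[:-1]) / dx
    # Zero total flux through both boundary faces
    flux = np.concatenate(([0.0], flux, [0.0]))
    return mu - dt / dx * (flux[1:] - flux[:-1])

# Illustrative run on Omega = (-1, 1) with a uniform initial density of mass 1
M = 100
dx = 2.0 / M
x = -1.0 + dx * (np.arange(M) + 0.5)
mu = np.full(M, 0.5)
P = lambda a, b: 1.0 / (1.0 + (a - b) ** 2)    # hypothetical communication function
f = -0.5 * x                                   # hypothetical bounded control
for _ in range(200):
    mu = fokker_planck_step(mu, x, dx, dt=0.002, sigma=0.01, f=f, P=P)
```

The time step is chosen small enough that `dt * (max|v| / dx + 2 * sigma / dx**2)` stays below one, which keeps this explicit scheme stable and the density nonnegative; an implicit treatment of the diffusion would relax this restriction.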
Theorem 2.4
For a given \(T, M >0\), let \(f \in Q_M\) and \(\mu _0 \in L^2(\Omega )\). Furthermore, we assume \(P \in L^\infty (\Omega ^2)\). Then there exists a unique weak solution \(\mu \) to Eq. (2.9) in the sense of Definition 2.3.
Proof
Existence.- We first employ the following iteration scheme: Let \(\mu ^1(x,t) := \mu _0(x)\) for \((x,t) \in \Omega _T\). For \(n \ge 1\), let \(\mu ^{n+1}\) be the solution of
with the initial data \(\mu ^n(x)|_{t=0} = \mu _0(x)\) for all \(n \ge 1\) and \(x \in \Omega \), and the zero-flux boundary conditions. It is clear that \(\int _\Omega \mu ^n(x,t)\,dx = \int _\Omega \mu _0(x)\,dx\). Note that for given \(\mu ^n \in V\) there exists a unique weak solution to Eq. (2.10) since \(\mathcal {P}[\mu ^n] \in L^\infty (\Omega )\) and \(f \in L^\infty (\Omega )\). We next show that \(\mu ^{n+1} \in V\). A straightforward computation yields
where \(I_2\) can be easily estimated as
For the estimate of \(I_1\), we use the fact that
to obtain
Combining the above estimates and choosing \(\epsilon < \sigma \), we find
Applying Gronwall’s inequality to the above differential inequality yields
We also get that for all \(\psi \in H^1(\Omega )\)
Thus we obtain \(\partial _t \mu ^{n+1}\in L^2(0,T; H^{-1}_*(\Omega ))\) due to (2.11) and (2.12). This shows that \(\mu ^n \in V\) for all \(n \ge 2\). Note that this also implies \(\mu ^{n} \in \mathcal {C}([0,T];L^2(\Omega ))\) for all \(n \ge 2\). Indeed, we have
where C only depends on T. Then, by the Aubin–Lions lemma, there exist a subsequence \(\mu ^{n_k}\) and a function \(\mu \in L^2(\Omega _T)\) such that
We next show that the above limiting function \(\mu \) solves Eq. (2.9) in the sense of Definition 2.3. For this, it suffices to take into account the interaction term \(\mathcal {P}[\mu ]\mu \) since the other terms are linear with respect to \(\mu \). Using the linearity of the functional \(\mathcal {P}\) together with (2.11) and the following fact
we get
where \(C_0 > 0\) is given by
Hence we have that the limiting function \(\mu \) satisfies
Uniqueness.- Let \(\mu _i,i=1,2\) be two solutions to Eq. (2.9) with initial data \(\mu _i(0) \in L^2(\Omega )\). Then, by an estimate similar to (2.14), we find
where \(C_\epsilon \) depends only on \(\Omega \), \(\epsilon \), \(\Vert \mu _1\Vert _{L^\infty (0,T;L^2)}\), and \(\Vert \mu _2(0)\Vert _{L^1}\). Finally, we apply Gronwall’s inequality to the above differential inequality to get
where \(C_1\) depends only on \(T,\sigma ,\Vert \mu _2(0)\Vert _{L^2}, M, \Omega \), and \( \Vert \mu _1\Vert _{L^\infty (0,T;L^2)}\). This completes the proof. \(\square \)
Theorem 2.5
For a given \(T, M> 0\), let us assume \(\mu _0 \in L^2(\Omega )\). Furthermore, we assume that \(P \in L^\infty (\Omega ^2)\) and \(\Psi \) satisfies that for all \(R > 0\)
for some \(C > 0\). Then there exist \(f^\infty \in Q_M\) and the corresponding density \(\mu ^\infty \) solving the optimal control problem (2.8)–(2.9).
Proof
For \(f \in Q_M\), by Theorem 2.4, there exists a weak solution \(\mu \) in the sense of Definition 2.3. Note that \(0 \in Q_M\) and
where \(\mu ^0\) is a weak solution of Eq. (2.9) with \(f=0\). Since \(J(\mu ,f) \ge 0\) for all \((\mu ,f)\in V \times Q_M\), there exist a sequence \((f^j)_{j \in \mathbb N} \in Q_M\) and the corresponding density \((\mu ^j)_{j \in \mathbb N} \in V\) solving (2.9) such that
On the other hand, since \((\mu ^j,f^j)_{j \in \mathbb N} \in V \times Q_M\), by the Banach–Alaoglu theorem, there exist a subsequence \((\mu ^{j_k},f^{j_k}) \in V \times Q_M\) and \((\mu ^\infty ,f^\infty )\in V \times Q_M\) such that
We next show that \((\mu ^\infty ,f^\infty )\) is a solution to (2.9). Since the term involving \(\mathcal {P}[\mu ]\) can be easily handled by an estimate similar to (2.14) and the above strong convergence (2.15), it is enough to show that
for \(\phi \in L^2(0,T;H^1(\Omega ))\). For this, we decompose \(I_k\) into two parts as
Since
it is clear from (2.15) that \(I_k^1 \rightarrow 0\) as \(k \rightarrow \infty \). For the convergence of \(I_k^2\), we get
Thus we conclude that \((\mu ^\infty ,f^\infty )\) is a solution to (2.9). Furthermore, we obtain
due to \(|\Omega | < \infty \). We also find
More precisely, we estimate
where we used the convexity of \(\Psi \) and the positivity of \(\mu ^{j_k}\). We then claim that \(\lim _{k \rightarrow \infty } J_k^1 \ge 0\) and \(\lim _{k \rightarrow \infty } J_k^2 \ge 0\), and for this, we show that
For \(\phi \in \mathcal {C}_c(\Omega _T)\), we get
thus \(J_k^2 \rightarrow 0 \) as \(k \rightarrow \infty \). Similarly, for the estimate of \(J_k^1\), we use the fact that \(\nabla \Psi (f^\infty )\, \mu ^{j_k}\,\phi \in L^2(0,T; L^1(\Omega ))\) uniformly in k and \((L^2(0,T;L^1))' = L^2(0,T;L^\infty )\) to obtain \(J_k^1 \rightarrow 0\) as \(k \rightarrow \infty \). This, together with de la Vallée–Poussin’s theorem, provides the semicontinuity (2.16). This yields
Hence we conclude
3 First Order Optimality Conditions
In this section, we derive first order optimality conditions for the mean field optimal control problem studied in Sect. 2:
where the control f is the solution of the minimization of the following cost functional:
3.1 Formal Derivation of the Optimality Conditions
Let us first write the Lagrangian of the mean field optimal control defined by (3.1) and (3.2), as follows
Integrating by parts and taking the terminal data \(\psi (x,T) = 0\), we get
where we omit the dependency on (x, t) where not necessary. We compute the functional derivatives of the Lagrangian with respect to the state function \(\mu \) and the control f,
Let \((\mu ^*,\psi ^*,f^*)\) be the solution to the optimal control problem. Then we have
This yields from (3.5) that
We also find from (3.6) that \(\psi ^*\) satisfies
or equivalently
due to (3.7), where \(\mu ^*\) satisfies
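Since signs are easy to get wrong in this formal computation, the following LaTeX sketch records one self-consistent version of it. It rests on assumptions that we state explicitly: the forward equation is taken in the form \(\partial _t \mu + \nabla \cdot ((\mathcal {P}[\mu ]+f)\mu ) = \sigma \Delta \mu \), the Lagrangian is the cost plus the multiplier \(\psi \) tested against the constraint, \(\psi (\cdot ,T)=0\), and all boundary terms from the integrations by parts are assumed to vanish. A different sign convention for the multiplier flips the sign of \(\psi \) throughout, so this sketch need not agree line by line with the numbered formulas of this subsection.

```latex
\begin{aligned}
\mathcal{L}(\mu,\psi,f)
 &= \int_0^T\!\!\int_\Omega \Big(\tfrac12|x-\bar x|^2 + \gamma\Psi(f)\Big)\mu \,dx\,dt
  + \int_0^T\!\!\int_\Omega \psi\,\Big(\partial_t\mu
  + \nabla\cdot\big((\mathcal{P}[\mu]+f)\mu\big) - \sigma\Delta\mu\Big)\,dx\,dt, \\[4pt]
\frac{\delta\mathcal{L}}{\delta f}
 &= \mu\,\big(\gamma\nabla\Psi(f) - \nabla\psi\big) = 0
 \quad\Longrightarrow\quad
 \gamma\nabla\Psi(f^*) = \nabla\psi^* \ \text{ wherever } \mu^* > 0, \\[4pt]
\frac{\delta\mathcal{L}}{\delta\mu}
 &= \tfrac12|x-\bar x|^2 + \gamma\Psi(f) - \partial_t\psi
  - \big(\mathcal{P}[\mu]+f\big)\cdot\nabla\psi \\
 &\quad + \int_\Omega P(y,x)\,(y-x)\cdot\nabla\psi(y,t)\,\mu(y,t)\,dy
  - \sigma\Delta\psi = 0 .
\end{aligned}
```

The nonlocal integral in the last identity comes from varying \(\mu \) inside \(\mathcal {P}[\mu ]\) and exchanging the roles of x and y. For the quadratic penalty \(\Psi (f)=|f|^2/2\), the first condition reduces, under the same conventions, to \(f^* = \nabla \psi ^*/\gamma \).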
3.2 Rigorous Derivation of the Optimality Conditions
The first order optimality conditions (3.10) are of utmost relevance, as they are often used for the numerical computation of mean field optimal controls; we show how to proceed in Sect. 5. Although they are very often derived formally, as we do above, and used in several contributions, see, e.g., [15], as a relatively straightforward consequence of the Lagrange multiplier theorem, we feel that presenting their rigorous derivation can be useful for a reader not familiar with such derivations. Moreover, by doing so, we highlight more precisely certain technical difficulties and aspects which one may encounter along the way and which are often taken for granted. Let us then recall the Lagrange multiplier theorem in Banach spaces.
Let X and Y be Banach spaces, and let a functional \(J: U(x^*) \subseteq X \rightarrow \mathbb R\) and a mapping \(G: U(x^*)\subseteq X \rightarrow Y\) be continuously differentiable on an open neighbourhood of \(x^*\). Consider the following optimal problem:
Then we recall the following first order optimality condition whose proof can be found in [86, Sect. 4.14].
Theorem 3.1
Let \(x^*\) be a solution to the problem (3.9), and let the range of the operator \(G'(x^*) : X \rightarrow Y\) be closed. Then there exists a nonzero pair \((\lambda ,p) \in \mathbb R\times Y'\) such that
where
Moreover, if Im \(G'(x^*) = Y\), then \(\lambda \ne 0\) in the above, thus we can assume that \(\lambda = 1\).
In order to apply the above theorem, we set
and
for \(\psi \in Y' = L^2(0,T;H^1_0(\Omega ))\). Then straightforward computations yield
for \((\nu ,\psi ) \in V \times Y'\), and
Note that the interaction terms on the right hand side of the equality for \(G'_\mu (\mu ,f)(\nu ,\psi )\) can be rewritten as
We now present our main result on the first order optimality condition in the theorem below.
Theorem 3.2
Let \((\mu ^*,f^*) \in V \times Q_M\) be a solution to the problem (3.1)–(3.2). Suppose that there exists a \(\mu _\ell > 0\) such that \(\mu ^* \ge \mu _\ell \) for all \((x,t) \in \Omega _T\). Then there exists \(\psi ^* \in Y'\) such that
Before presenting the proof of the first order optimality conditions (3.10), let us comment on the positivity assumption, i.e., the existence of \(\mu _\ell > 0\) such that \(\mu ^* \ge \mu _\ell \) for all \((x,t) \in \Omega _T\). If we assume that \(\mu _0, f, P \in \mathcal {C}^2\) and \(\mu _0\) is bounded from below by a positive constant, then by the Feynman–Kac formula we can show that \(\mu \) is bounded from below by some positive constant up to the fixed time T. However, we assume it a priori to avoid any stronger regularity assumption on the control f. We verify this property numerically in Sect. 5.
Proof
For the proof, we show that linear operators \(G'_\mu (\mu ^*,f^*): V \rightarrow Y\) and \(G'_f (\mu ^*,f^*): L^2(\Omega _T)\left(\supseteq Q_M\right) \rightarrow Y\) are surjective. Then, by Theorem 3.1, we conclude our desired results.
Surjectivity of \(G'_\mu (\mu ^*,f^*)\). Let \((\mu ^*,f^*) \in V \times Q_M\) be a solution to (3.1)–(3.2). We want to show that for any \(\eta \in Y\) there exists a \(\nu \in V\) such that
Note that finding \(\nu \) satisfying the above equality is equivalent to showing that, for given \((\mu ^*,f^*,\eta ) \in V \times Q_M \times Y\), there exists a solution \(\nu \in V\) to the Cauchy problem:
with the initial data \(\nu _0 \in L^2(\Omega )\) and the boundary condition:
We notice that (3.11) is a linear parabolic equation for \(\nu \). Thus, to obtain the existence of \(\nu \in V\) it is enough to show the following a priori estimates, which are very similar to those in the proof of Theorem 2.4:
Here we used
and similarly
This yields
and
Surjectivity of \(G'_f(\mu ^*,f^*)\). For \(\xi \in Y\), we first consider the following weak formulation of Poisson equation:
where we have already taken into account the space–time decomposition of the test function. Note that solving Eq. (3.12) is equivalent to finding \(u \in L^2(0,T;H^1_0(\Omega ))\) such that
with
and \((\cdot ,\cdot )\) is the inner product in \(L^2(\Omega _T)\). Due to the Poincaré inequality, we find that \(a(\cdot ,\cdot )\) is an inner product on \(L^2(0,T;H^1_0(\Omega ))\) with the induced norm:
Define
Then this functional is continuous on \(L^2(0,T;H^1_0(\Omega ))\) since \(|F(v)| \le \Vert f\Vert _{L^2(0,T;H^{-1})}\Vert v\Vert _{L^2(0,T;H^1_0)}\). Thus, by the Riesz representation theorem, there exists a unique \(u \in L^2(0,T;H^1_0(\Omega ))\) solving Eq. (3.12).
We now get back to our original problem. Our goal was to show that for given \(\mu ^* \in V\) and \(\xi \in Y\), there exists a function \(g \in L^2(\Omega _T)\) such that
We now construct the solution g to the above equation by
where the existence of \(u \in L^2(0,T;H^1_0(\Omega ))\) was guaranteed in the first part of this step. Moreover, by the assumption \(\mu ^*(x,t)\ge \mu _\ell > 0\) in \(\Omega \times [0,T]\), we have
due to \(u \in L^2(0,T;H^1_0(\Omega ))\). This completes the proof. \(\square \)
4 Hierarchy of Controls via Boltzmann Equation
For large values of N, the solution of finite horizon control problems of the type (1.2)–(1.3) through standard methods stumbles upon prohibitive computational costs, due to the nonlinear constraints and the lack of convexity in the cost. Although mean field optimal controls (1.4)–(1.6) are designed to be independent of the number N of agents, providing a way to circumvent the curse of dimensionality as \(N\rightarrow \infty \), their numerical computation still needs to be realized by solving the first-order optimality conditions. The complexity of their solution depends on the intrinsic dimensionality d of the agents, which is affordable only at moderate dimensions (e.g., \(d \le 3\)). In order to tackle these difficulties, we introduce a novel reduced setting, based on a binary dynamics whose evolution can be described by means of a Boltzmann-type equation [3, 69]. We will then show that this description, under a proper scaling [80, 83], converges to the mean field equation (1.4) [6, 36, 80]. This type of approach allows us to embed the control dynamics in two different ways:
(i) we can assume the control f to be a given function, possibly obtained from the solution of the optimal control problem (1.2)–(1.3);
(ii) alternatively, the control is obtained as a solution of the reduced optimal control problem associated to the dynamics of two single agents. We refer to this approach as binary control.
Similar ideas have been used in a control context in [5,6,7, 45, 49]. We devote the forthcoming sections to showing different strategies to derive such binary controls. Thus we want to approach the mean field optimal control problem (1.2)–(1.3) as the last step of a control hierarchy, starting from an instantaneous control strategy and going towards a binary Hamilton–Jacobi–Bellman control.
4.1 Binary Controlled Dynamics
We consider the discrete controlled system (1.2)–(1.3) in the simplified case of only two interacting agents \((x_i(t),x_j(t))\) and in absence of noise, i.e. \(\sigma = 0\). Hence, by defining the sample time \(\Delta t\) such that \(t_m = m\Delta t\), so that \(0=t_0<\cdots<t_m<\cdots <t_M=T\) and introducing a forward Euler discretization, we write (1.2) as follows
where from now on we denote the control pair \(u:=(u_i,u_j)\) associated to the state variable \(x:=(x_i,x_j)\), and having used the compact notation for \(x^m_i=x_i(t_m), u^m_i=u_i(t_m)\).
The discretized form for the functional (1.3) for the binary dynamics (4.1) reads
where the stage cost is given by
In the following we propose two alternative methods in order to characterize \(u_i,u_j\) as (sub-)optimal feedback controllers. In both cases, we will consider the controlled dynamics in the deterministic case. Nonetheless, we will show in Sect. 5.3 that such controls are robust with respect to the presence of noise (\(\sigma > 0\)), and they shall be employed in the corresponding stochastic setting as well.
4.1.1 Instantaneous Control
A first approach towards obtaining a low complexity computational realization of the solution of the optimal control problem (4.1)–(4.2) is the so-called model predictive control (MPC). This strategy furnishes a suboptimal control by an iterative solution over a sequence of finite time steps, representing the predictive horizon [6, 8, 64]. Since we are only interested in instantaneous control strategies, we limit the MPC method to a single time prediction horizon, therefore we reduce the original optimization into the minimization on every time interval \([t_m,t_{m+1}]\) of the following functional
Note that from (4.1) we have that \(x^{m+1}\) depends linearly on \(u^m\), thus
can be directly computed from the following system
In the case of a quadratic penalization of the control, i.e. \(\Psi (c) := |c|^2/2\), we can furnish the following explicit expression for the minimizers
hence (4.5) gives a feedback control for the full binary dynamics, which can be plugged as an instantaneous control into (4.1).
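The single-step minimization can be illustrated with a minimal sketch for a scalar opinion and one agent of the pair. Two forms are assumptions made for the illustration, not taken from the displayed equations: the Euler step is \(x^{m+1} = x^m + \frac{\Delta t}{2}(\text{drift} + u)\), consistent with \(\alpha = \Delta t/2\) in Remark 4.3, and the one-step functional is \(\frac{1}{2}|x^{m+1}-\bar{x}|^2 + \frac{\gamma }{2}|u|^2\). The function names are hypothetical.

```python
def one_step_cost(u, x, drift, x_bar, gamma, dt):
    """Assumed one-step functional: 0.5*|x_next - x_bar|^2 + 0.5*gamma*u^2."""
    x_next = x + 0.5 * dt * (drift + u)   # assumed forward Euler step
    return 0.5 * (x_next - x_bar) ** 2 + 0.5 * gamma * u ** 2


def instantaneous_control(x, drift, x_bar, gamma, dt):
    """Closed-form minimizer of one_step_cost with respect to u.

    With a = dt/2, the stationarity condition
        a * (x + a * (drift + u) - x_bar) + gamma * u = 0
    is linear in u and yields the feedback law returned below.
    """
    a = 0.5 * dt
    return -a * (x + a * drift - x_bar) / (gamma + a * a)


if __name__ == "__main__":
    x_i, x_j, x_bar = 0.8, -0.2, 0.0
    drift = 1.0 * (x_j - x_i)             # P = 1 chosen only for illustration
    u_star = instantaneous_control(x_i, drift, x_bar, gamma=0.1, dt=0.1)
    print("feedback control:", u_star)
```

Under these assumptions the control vanishes with \(\Delta t\) for fixed \(\gamma \), whereas choosing \(\gamma \) proportional to \(\Delta t\) keeps it of order one, which is the point of the scaling discussed in Remark 4.1.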
Remark 4.1
Note that the instantaneous control (4.6), embedded into the discretized dynamics (4.1), is of order \(o(\Delta t)\). To obtain an effective contribution of the control in the dynamics we will assume that the penalization parameter \(\gamma \) scales with the time discretization; in this way the leading order is recovered [6, 8], e.g. for \(\gamma = \Delta t \bar{\gamma }\) we have
4.1.2 Finite Horizon Optimal Control
The instantaneous feedback control derived in the previous section is the optimal control action for the binary system with a single step prediction horizon. An improved, yet more complex optimal feedback synthesis can be performed by considering an extended finite horizon control problem. Let us define the value function associated to the finite horizon discrete cost (4.2) as
with terminal condition \(V(x_i,x_j,t_M)=0\). It is well-known that the application of the Dynamic Programming Principle [13] with the discrete time dynamics (4.1) characterizes the value function as the solution of the following recursive Bellman equation
where \(x=(x_i,x_j)\), \(u=(u_i,u_j)\), and \(F(x_i,x_j) := (P(x_i,x_j)(x_j-x_i),P(x_j,x_i)(x_i-x_j))\). Once this functional relation has been solved, for every time step the optimal control is recovered from the optimality condition as follows
As in the expression (4.5), this optimal control is also in feedback form, depending not only on the current states of the binary system \((x_i,x_j)\), but also on the discrete time variable \(t_m\).
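The backward recursion and the resulting time-dependent feedback table can be sketched as follows for scalar opinions. This is a simplified illustration rather than the scheme used in the paper: it assumes the Euler step \(x^{m+1} = x^m + \frac{\Delta t}{2}(F(x^m) + u^m)\), a quadratic stage cost \(\frac{1}{2}(|x_i-\bar{x}|^2+|x_j-\bar{x}|^2) + \frac{\gamma }{2}(|u_i|^2+|u_j|^2)\), a finite set of admissible control values, and bilinear interpolation of the value function on a uniform grid. All names are hypothetical.

```python
def bilinear(V, grid, xi, xj):
    """Bilinear interpolation of V on a uniform grid; queries are clamped to the domain."""
    n, h = len(grid), grid[1] - grid[0]
    si = min(max((xi - grid[0]) / h, 0.0), n - 1.0)
    sj = min(max((xj - grid[0]) / h, 0.0), n - 1.0)
    i0, j0 = min(int(si), n - 2), min(int(sj), n - 2)
    wi, wj = si - i0, sj - j0
    return ((1 - wi) * (1 - wj) * V[i0][j0] + wi * (1 - wj) * V[i0 + 1][j0]
            + (1 - wi) * wj * V[i0][j0 + 1] + wi * wj * V[i0 + 1][j0 + 1])


def solve_bellman(P, x_bar, gamma, dt, M, grid, controls):
    """Backward recursion V^m(x) = min_u { dt * L(x, u) + V^{m+1}(x_next(x, u)) }.

    Returns the value function at t_0 and policy[m][a][b], the control u_i
    selected at time t_m in the state (grid[a], grid[b]).
    """
    n = len(grid)
    V = [[0.0] * n for _ in range(n)]          # terminal condition V(., t_M) = 0
    policy = [None] * M
    for m in range(M - 1, -1, -1):
        V_new = [[0.0] * n for _ in range(n)]
        U = [[0.0] * n for _ in range(n)]
        for a, xi in enumerate(grid):
            for b, xj in enumerate(grid):
                fi = P(xi, xj) * (xj - xi)     # interaction acting on agent i
                fj = P(xj, xi) * (xi - xj)     # interaction acting on agent j
                state_cost = 0.5 * ((xi - x_bar) ** 2 + (xj - x_bar) ** 2)
                best, best_ui = float("inf"), 0.0
                for ui in controls:            # exhaustive search over control pairs
                    for uj in controls:
                        q = dt * (state_cost + 0.5 * gamma * (ui * ui + uj * uj))
                        q += bilinear(V, grid,
                                      xi + 0.5 * dt * (fi + ui),
                                      xj + 0.5 * dt * (fj + uj))
                        if q < best:
                            best, best_ui = q, ui
                V_new[a][b], U[a][b] = best, best_ui
        V, policy[m] = V_new, U
    return V, policy
```

The table `policy` is computed once, offline; during a simulation the control for a pair of agents is then read off (or interpolated) at the current state and time step. The exhaustive search over control pairs is only viable because the binary state space is low dimensional.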
Remark 4.2
The system (4.9) is a first-order approximation of the Hamilton–Jacobi–Bellman equation
related to the continuous time optimal control problem. In fact, this latter equation corresponds to the adjoint (3.6) when the nonlocal integral terms are neglected; therefore this approach, although optimal for the binary system, cannot be expected to satisfy the optimality system (3.5)–(3.6) related to the mean field optimal control problem.
4.2 Boltzmann Description
We introduce now a Boltzmann framework in order to describe the statistical evolution of a system of agents ruled by binary interactions, [8, 69].
Let \(\mu (x,t)\) denote the kinetic density of agents in position \(x\in \Omega \) at time \(t\ge 0\), such that the total mass is normalized
and the time evolution of the density \(\mu \) is given as a balance between the bilinear gain and loss of the agents position due to the binary interaction. In a general formulation, we assume that two agents have positions \(x, y\in \Omega \) and modify their positions according to the following rule
where \((x^*,y^*)\) are the post-interaction positions, the parameter \(\alpha \) measures the influence strength of the different terms, \((\xi ,\zeta )\) is a vector of i.i.d. random variables with a symmetric distribution \(\Theta (\cdot )\) with zero mean and variance \(\sigma \), and \(U_\alpha (x,y,t)\) indicates the forcing term due to the control dynamics.
We consider now a kinetic model for the evolution of the density \(\mu =\mu (x,t)\) of agents with \(x \in \mathbb R^d\) at time \(t\ge 0\) and ruled by the following Boltzmann-type equation
where the interaction operator \(Q_{\alpha }(\mu ,\mu )\) in (4.13) accounts for the loss and gain of agents in position x at time t, as follows
where \((x_*,y_*)\) are the pre-interaction positions that generate arrivals (x, y). The bilinear operator \(Q_\alpha (\cdot ,\cdot )\) includes the expectation value with respect to \(\xi ^x\) and \(\xi ^y\), while \(\mathcal {J}_\alpha \) represents the Jacobian of the transformation \((x,y)\rightarrow (x^*,y^*)\), described by (4.12). Here \(\mathcal {B}_*=\mathcal {B}_{(x_*,y_*)\rightarrow (x,y)} \) and \(\mathcal {B}=\mathcal {B}_{(x,y)\rightarrow (x^*,y^*)} \) are the transition rate functions. In more detail, we consider
as the functions with an interaction rate \(\eta >0\), and where \(\chi _\Omega \) is the characteristic function of the domain \(\Omega \). Note that in this case the transition functions depend on the relative position, similarly to [80], as we introduced a bounded domain \(\Omega \) into the dynamics. A major simplification occurs when the bounded domain is preserved by the binary interactions themselves; then the transition rate is constant and the interaction operator (4.14) reads
In [6, 80] the authors showed that in opinion dynamics binary interactions are able to preserve the boundary, provided that the symmetric random variable \(\xi \) has a suitably small support and that a suitable function D(x) is introduced, acting as a local weight on the noise in (4.12).
In the next section we will perform the analysis of this model in the simplified case of \(\Omega = \mathbb {R}^d\) and constant rate of interaction \(\eta \).
Remark 4.3
Note that the binary dynamics (4.12) is equivalent to the Euler–Maruyama discretization for Eq. (1.2) in the two agents case
where we impose that \(\alpha = \Delta t/2\), \(\alpha U_\alpha (x_i,x_j) = \Delta t U^m_{ij}\), and \(\sqrt{2\alpha }\xi = \sqrt{2\sigma }\Delta B^m_i\), where the increment \(\Delta B^m_i=B_i(t_{m+1})-B_i(t_m)\) is a normally distributed random variable with zero mean and variance \(\Delta t\).
4.2.1 The Quasi-Invariant Limit
We consider now the Boltzmann operator (4.15) in the case \(\Omega = \mathbb R^d\), and in order to obtain a more regular description we introduce the so-called quasi-invariant interaction limit, whose basic idea is to consider a regime where the interaction strength is low and the frequency is high. This technique, analogous to the grazing collision limit in plasma physics, has been thoroughly studied in [83] and specifically for first order models in [36, 80], and allows one to pass from the Boltzmann equation (4.13) to a mean field equation of Fokker–Planck type [5, 6]. In order to state the main result we start by fixing some notation and terminology.
Definition 4.1
(Multi-index) For any \(a \in \mathbb {N}^d\) we set \(|a| = \sum ^d_{i = 1} a_i\), and for any function \(h \in C^q(\mathbb R^d \times \mathbb R^d,\mathbb R)\), with \(q\ge 0\) and any \(a \in \mathbb {N}^d\) such that \(|a| \le q\), we define for every \((x,v) \in \mathbb R^d \times \mathbb R^d\)
with the convention that if \(a= (0, \ldots , 0)\) then \(\partial ^{a}_x h(x) := h(x)\).
Definition 4.2
(Test functions) We denote by \(\mathcal {T}_{\delta }\) the set of compactly supported functions \(\varphi \) from \(\mathbb R^{d}\) to \(\mathbb R\) such that for any multi-index \(a \in \mathbb {N}^d\) we have,
1. if \(|a| < 2\), then \(\partial ^{a}_x \varphi (\cdot )\) is continuous for every \(x \in \mathbb R^d\);
2. if \(|a| = 2\), then there exists \(C > 0\) such that \(\partial ^{a}_x \varphi (\cdot )\) is uniformly Hölder continuous of order \(\delta \) for every \(x \in \mathbb R^d\) with Hölder bound C, that is for every \(x,y \in \mathbb R^d\)
$$\begin{aligned} \left\| \partial ^{a}_x \varphi (x) - \partial ^{a}_x \varphi (y) \right\| \le C \left\| x - y \right\| ^{\delta }, \end{aligned}$$
and \(\Vert \partial ^{a}_x \varphi (x)\Vert \le C\) for every \(x \in \mathbb R^{d}\).
Definition 4.3
(\(\delta \)-weak solution) Let \(T > 0\) and \(\delta > 0\). We call \(\mu \) a \(\delta \)-weak solution of the initial value problem for Eq. (4.13), with initial datum \(\mu ^0=\mu (x,0) \in \mathcal {M}_0(\mathbb R^{d})\) in the interval [0, T], if \(\mu \in L^2([0,T], \mathcal {M}_0(\mathbb R^{d}))\), \(\mu (x,0) = \mu ^0(x)\) for every \(x \in \mathbb R^{d}\), there exists \(R_T > 0\) such that \(\text {supp}(\mu (t)) \subset B_{R_T}(0)\) for every \(t \in [0,T]\), and \(\mu \) satisfies the weak form of Eq. (4.13), i.e.,
for all \(t \in (0,T]\) and all \(\varphi \in \mathcal {T}_{\delta }\), where
Moreover, we assume that
(a) the system (4.12) constitutes invertible changes of variables from (x, y) to \((x^*,y^{*})\);
(b) there exists an integrable function K(x, y, t) such that the following limit is well defined
$$\begin{aligned} \lim _{\alpha \rightarrow 0}U_\alpha (x,y,t) = K(x,y,t). \end{aligned}$$ (4.19)
In the case of instantaneous control of type (4.6), we can explicitly give an expression to the limit as \(K(x,y,t) = (\bar{x}-x)/\gamma \).
We state the following theorem.
Theorem 4.4
Let us fix a control \(U_\alpha \in {\mathcal {U}}\) and \(\alpha \ge 0\), and \(T > 0\), \(\delta > 0\), \(\varepsilon >0\), and assume that the density \(\Theta \in {\mathcal {M}}_{2+\delta }(\mathbb R^d)\) and the function \(P(\cdot ,\cdot )\in L^q_{loc}\) for \(q = 2, 2+\delta \) and for every \(t\ge 0\). We consider a \(\delta \)-weak solution \(\mu \) of Eq. (4.13) with initial datum \(\mu _0(x)\). Then, introducing the following scaling
for the binary interaction (4.12) and denoting by \(\mu ^\varepsilon (x,t)\) a solution of the scaled equation (4.13), as \(\varepsilon \rightarrow 0\) the density \(\mu ^\varepsilon (x,t)\) converges pointwise, up to a subsequence, to \(\mu (x,t)\), where \(\mu \) satisfies the following Fokker–Planck-type equation,
with initial data \(\mu _0(x)=\mu (x,0)\) and where \(\mathcal {P}\) represents the interaction kernel (1.5) and \(\mathcal {K}\) is the control.
with K(x, y, t) defined as in (4.19).
Proof
\(\bullet \) Taylor approximation We consider the weak formulation of the Boltzmann equation (4.17) and we expand \(\varphi (x^{*})\) inside the operator (4.18) in Taylor series of \(x^* - x\) up to the second order, obtaining
where the first and second order terms are
and \(R_1^{\varphi }({\varepsilon })\) is the remainder of the Taylor expansion, of the form
with \(\overline{x} := (1-\theta ) x^* + \theta x\), for some \(\theta \in [0,1]\). By using the relation given by the scaled interaction rule (4.12), i.e.
where for the sake of brevity we denoted \(F_\alpha (x,y) := P(x,y)(y-x) +U_\alpha (x,y)\). Note that from the hypothesis it follows that \(F_\alpha \in L^q_{loc}\). Thus we obtain
where the noise term \(\xi \) cancels out since it has zero mean. For the same reason, in the second order term \(T^\varphi _2\) all the mixed products between \(F_\alpha \) and \(\xi \) vanish; the same holds for all the cross terms \(\xi _{i} \xi _{j}\), since the \(\xi _{i}\) are assumed to be independent variables. Hence the only contribution we have reads
\(\bullet \) Quasi-invariant limit We now introduce the scaling (4.20), for which we can substitute in the previous equations \(\eta \alpha =1\) and \(\eta \alpha ^2={\varepsilon }\); thus the terms \(T_1^\varphi \) and \(T_{22}^\varphi \) represent the leading order and \(R^\varphi ({\varepsilon }) := R^\varphi _1+R^\varphi _2\) a remainder, so we can recast the scaled expression (4.23) as follows
Let us now consider the limit \(\varepsilon \rightarrow 0\), assuming that for every \(\varphi \in \mathcal {T}_{\delta }\)
holds true, we have thanks to (4.19) and (4.26) that the weak scaled Boltzmann equation (4.17) converges pointwise to the Fokker–Planck-type equation (4.21) as follows
where the operators \(\mathcal {P}[\mu ]\) and \(\mathcal {K}[\mu ]\) are defined in (1.5) and (4.22). Since \(\varphi \) has compact support, Eq. (4.28) can be reverted to strong form by means of integration by parts, and we eventually obtain system (4.21).
\(\bullet \) Estimates for the remainder In order to conclude the proof it is sufficient to show that the limit (4.27) for \(R^\varphi ({\varepsilon })\) vanishes. From the definition of \(\overline{x}\) it follows that \(\left\| \overline{x} -x\right\| \le \left\| x^* - x \right\| \), then for every \(\varphi \in \mathcal {T}_{\delta }\) we have
Hence for \(R_1^\varphi \) we get
from the inequality \(|a+b|^{2+\delta }\le 2^{2+2\delta }(|a|^{2+\delta }+|b|^{2+\delta })\), valid for any a, b, we obtain
An analogous computation can be carried out for \(R_2^\varphi \), for which we have the following inequality
Since \(F_{\varepsilon }\in L^q_{loc}\) for \(q=2,2+\delta \) and \(\Theta \in {\mathcal {M}}_{2+\delta }(\mathbb R^d)\) we can conclude that for \({\varepsilon }\rightarrow 0\) the limit (4.27) holds true. \(\square \)
Remark 4.4
Note that in the case \(U_\alpha (x,y,t)=U_\alpha (x,t)\), namely if the feedback control depends only on the position x of the agents at time t, then the kernel \({\mathcal {K}}[\mu ](x,t)\) reduces to K(x, t). This observation holds also if we consider a sampling from the optimal control, i.e. \(U_\alpha (x,y,t) = f(x,t)\), thus Eq. (4.21) becomes exactly the original equation (1.2).
5 Numerical Methods
In this section we are concerned with the development of numerical methods for the mean field optimal control problem (1.2)–(1.3). First, we present direct simulation Monte Carlo methods for the constrained Boltzmann-type model (4.13), and discuss the implementation of the binary feedback controllers introduced in Sect. 4.1. Next, we describe a sweeping algorithm based on the iterative solution of the optimality system, (3.1)–(3.8).
5.1 Asymptotic Constrained Binary Algorithms
One of the most common approaches to solve Boltzmann-type equations is based on Monte Carlo methods. For this, we consider the initial value problem given by Eq. (4.13), in the grazing interaction regime (4.20), with initial data \(\mu (x,t=0)=\mu _0(x)\), as follows
Here we have made explicit the dependence of the interaction operator \(Q_{\varepsilon }(\cdot ,\cdot )\) on the frequency of interactions \(1/\varepsilon \), and decomposed it into its gain and loss parts according to (4.15). With \(Q^{+}_\varepsilon (\cdot ,\cdot )\) we denote the gain part, which accounts for the density of agents gained at position x after the binary interaction (4.12).
We tackle the Boltzmann-type equation (5.1) by means of a binary interaction algorithm [3, 69], where the basic idea is to solve the binary exchange of information described by (4.12), under the grazing interaction scaling (4.20), in order to obtain in the limit an approximate solution of the mean field equation (4.21). Note that the consistency of this procedure is given by Theorem 4.4.
Let us now consider a time interval [0, T] discretized in \(M_{tot}\) intervals of size \(\Delta t\). We denote by \(\mu ^m\) the approximation of \(\mu (x,m\Delta t)\), thus the first order forward scheme of the scaled Boltzmann-type equation (5.1) reads
where, since \(\mu ^m\) is a probability density, thanks to mass conservation \(Q_\varepsilon ^{+}(\mu ^m,\mu ^m)\) is a probability density as well. Under the restriction \(\Delta t\le \varepsilon \), \(\mu ^{m+1}\) is a probability density, since it is a convex combination of probability densities.
From a Monte Carlo point of view, Eq. (5.2) can be interpreted as follows: an individual with position x will not interact with other individuals with probability \(1-\Delta t/\varepsilon \) and it will interact with others with probability \(\Delta t/\varepsilon \) according to the interaction law stated by \(Q_\varepsilon ^{+}(\mu ^m,\mu ^m)\). Note that, since we aim at small values of \(\varepsilon \) and we have to fulfill the condition \(\Delta t\le {\varepsilon }\), the natural choice is to take \(\Delta t=\varepsilon \). At every time step, this choice maximizes the number of interactions among the agents.
For the numerical treatment of the operator \(Q_\varepsilon ^{+}(\mu ^m,\mu ^m)\), we have to take into account that every interaction includes the action of the feedback control. In the case of instantaneous control this can be evaluated directly, for example, in the case of a quadratic functional, by defining the scaled version of (4.7) as
On the other hand, the realization of the optimal feedback controller in the finite horizon setting requires the numerical approximation of the Bellman equation (4.9). This approximation is performed offline and only once, prior to the simulation of the mean field model. For a state space of moderate dimension, such as in our binary model, several numerical schemes for the approximation of Hamilton–Jacobi–Bellman equations are available, and we refer the reader to [47, Chap. 8] for a comprehensive description of the different available techniques. Since the binary model is already introduced in discrete time, a natural choice is to solve Eq. (4.9) by means of a sequential semi-Lagrangian scheme, following the same guidelines as in the recent works [11, 48, 57]. Once the value function has been approximated, online feedback controllers can be implemented through the evaluation of the optimality condition (4.10).
We report in Algorithm 1 a stochastic procedure to solve (5.2), based on Nanbu’s method for plasma physics, [3, 16].
Here the function \(\textsc {Iround}(\cdot )\) denotes the integer stochastic rounding defined as
with \(\zeta \) a uniform [0, 1] random number and \([\cdot ]\) the integer part.
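A minimal sketch of one time step of such a binary interaction algorithm, for scalar opinions, is given below. Several details are assumptions made for illustration and are not taken from Algorithm 1: the rounding rule is the usual one (\([x]+1\) with probability \(x-[x]\), and \([x]\) otherwise); the number of interacting pairs is \(\textsc {Iround}(N_s\Delta t/(2\varepsilon ))\); pairs are drawn uniformly without repetition; and the scaled interaction rule has the form \(x^* = x + \varepsilon (P(x,y)(y-x) + U(x,y)) + \sqrt{2\varepsilon \sigma }\,\xi \) with \(\xi \) standard normal. The function names are hypothetical.

```python
import random


def iround(x, rng=random):
    """Stochastic rounding: [x] + 1 with probability x - [x], otherwise [x]."""
    base = int(x)                          # integer part (x is assumed non-negative)
    return base + 1 if rng.random() < x - base else base


def binary_step(samples, eps, dt, P, U, sigma, rng=random):
    """One time step of a Nanbu-like binary interaction scheme (assumed form).

    Selected pairs (x, y) are replaced in place by their post-interaction
    positions; unselected samples keep their position.
    """
    n_pairs = min(iround(len(samples) * dt / (2.0 * eps), rng), len(samples) // 2)
    chosen = rng.sample(range(len(samples)), 2 * n_pairs)   # distinct indices
    amp = (2.0 * eps * sigma) ** 0.5                        # noise amplitude
    for k in range(n_pairs):
        i, j = chosen[2 * k], chosen[2 * k + 1]
        x, y = samples[i], samples[j]
        samples[i] = x + eps * (P(x, y) * (y - x) + U(x, y)) + amp * rng.gauss(0.0, 1.0)
        samples[j] = y + eps * (P(y, x) * (x - y) + U(y, x)) + amp * rng.gauss(0.0, 1.0)
    return samples


if __name__ == "__main__":
    rng = random.Random(1)
    opinions = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
    for _ in range(100):                   # 100 steps with dt = eps
        binary_step(opinions, eps=0.01, dt=0.01,
                    P=lambda x, y: 1.0, U=lambda x, y: -x, sigma=0.0, rng=rng)
    print("mean opinion:", sum(opinions) / len(opinions))
```

With \(\Delta t = \varepsilon \) and an even number of samples, every sample takes part in exactly one interaction per step, so each step costs \(O(N_s)\) operations.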
Remark 5.1
(Efficiency) In general, computing the interactions in a multi-agent system is a procedure of quadratic cost with respect to the number of agents, since every agent needs to evaluate its interaction with every other agent. Note that with the proposed algorithm this cost becomes linear with respect to the number of samples, \(O(N_s)\), since only binary interactions are accounted for. A major difference compared to standard algorithms for Boltzmann equations is the way in which particles are sampled from \(Q_{\varepsilon }^+(\mu ^m,\mu ^m)\), which does not require the introduction of a space grid [16].
Remark 5.2
(Accuracy) The choice \(\Delta t=\varepsilon \) is optimal if \({\varepsilon }\) is of the order of \(O({N_s}^{-1/2})\). Indeed, the accuracy of the method will not increase for smaller values of \(\Delta t\), because the numerical error is dominated by the fluctuations of the Monte Carlo method. For further details we refer to [3, 69].
5.2 Numerical Approximation of the Optimality Conditions
As shown in Sect. 3, the solution of the mean field optimal control problem (3.1)–(3.2) satisfies the optimality system
5.2.1 Forward Equation
In order to solve Eq. (5.3), we consider a first order forward scheme for the time evolution and the Chang–Cooper scheme for the space discretization [31]. The formulation is based on the finite volume approximation of the density \(\mu \) and f. Defining the operator \(\mathcal {G}[\mu ,f] := \mathcal {F}[\mu ,f] +\sigma \nabla \mu \), with \(\mathcal {F}[\mu ,f] = \mathcal {P}[\mu ] + f\), then we can write in the one-dimensional domain \([-L,L]\) the (semi)-discretized equation (5.3) as
where we have introduced the uniform grid \(x_{i}=-L+i\delta x\), \(i=0,\ldots ,N,\) with \(\delta x = 2L/N\), and denoted by \(x_{i \pm 1/2}=x_i \pm \delta x/2\). Thus, the operator \(\mathcal {G}_{i+1/2}[\mu ,f]\) in the case of constant diffusion \(\sigma \) reads
where the weights \(\theta _{i+1/2}\) depend in general on the solution and on the parameters of Eq. (5.3). Hence the flux functions are defined as a combination of upwind and centered discretizations, such that for \(\sigma = 0\) the scheme reduces to an upwind scheme, i.e. \(\theta _{i+1/2} = 0\). The choice of the weights is the key point of the scheme (5.6), which allows one to preserve steady state solutions and the non-negativity of the numerical density. We refer to [9, 19, 31] for the details on the properties and analysis of the Chang–Cooper scheme for similar Fokker–Planck models and to [75], and references therein, for applications to control problems.
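To make the role of the weights concrete, the following sketch implements the classical Chang–Cooper weight \(\theta (w) = 1/w - 1/(e^{w}-1)\) for the generic one-dimensional model \(\partial _t \mu = \partial _x (B\mu + C\partial _x \mu )\), where \(w = \delta x\, B/C\). This generic form, its sign convention, and the function names are assumptions made for illustration; mapping it to Eq. (5.3) (presumably \(B = -(\mathcal {P}[\mu ]+f)\) and \(C = \sigma \)) should be checked against the conventions of (5.6)–(5.7).

```python
import math


def chang_cooper_theta(w):
    """Chang-Cooper weight theta(w) = 1/w - 1/(exp(w) - 1), with theta(0) = 1/2."""
    if abs(w) < 1e-8:
        return 0.5 - w / 12.0              # Taylor expansion near w = 0
    if w > 50.0:
        return 1.0 / w                     # 1/(exp(w) - 1) is negligible here
    if w < -50.0:
        return 1.0 / w + 1.0               # exp(w) - 1 is essentially -1 here
    return 1.0 / w - 1.0 / math.expm1(w)


def chang_cooper_flux(mu_left, mu_right, B, C, dx):
    """Numerical flux B*mu + C*d(mu)/dx at the interface between two cells.

    Assumed model form: d(mu)/dt = d/dx (B*mu + C*d(mu)/dx) with C > 0;
    B is the drift coefficient evaluated at the interface.
    """
    theta = chang_cooper_theta(dx * B / C)
    mu_interface = (1.0 - theta) * mu_right + theta * mu_left
    return B * mu_interface + C * (mu_right - mu_left) / dx


if __name__ == "__main__":
    # Stationary state of d(mu)/dt = d/dx (x*mu + C*d(mu)/dx): mu = exp(-x^2 / (2*C))
    C, dx = 0.5, 0.1
    xs = [-2.0 + dx * k for k in range(41)]
    mu = [math.exp(-x * x / (2.0 * C)) for x in xs]
    fluxes = [chang_cooper_flux(mu[k], mu[k + 1], 0.5 * (xs[k] + xs[k + 1]), C, dx)
              for k in range(40)]
    print("largest flux at equilibrium:", max(abs(f) for f in fluxes))
```

For a drift that is linear in x the discrete flux vanishes at the exact Gaussian equilibrium up to rounding, which is the steady-state preservation property mentioned above; as \(w \rightarrow \pm \infty \) the weight tends to 0 or 1 and the flux becomes one-sided (upwind).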
Alternatively, scheme (5.2) furnishes a consistent method to solve the forward equation (5.3), which we expect to be more efficient for problems with high dimensionality, since it relies on a stochastic evaluation of the nonlocal operator \(\mathcal {P}[\mu ]\).
5.2.2 Backward Equation
The main difficulty of the integro-differential advection-reaction-diffusion equation (5.4) lies in the efficient approximation of the integral term. We follow a finite difference approach, which we describe in the following. First, with time parameter \(\delta t\) as in the forward problem, we consider the first-order temporal approximation
where \(\psi ^M=0\). At this level, f, \(\mu \), and \(\nabla \psi \) are treated as external data available at every discrete instant. In particular \(\nabla _y\psi \) (inside the integral) is reconstructed by numerical differentiation. Then, the integral terms are evaluated with a Monte Carlo method generating \(M_s\) samples according to the distribution \(\mu \), and values of \(\nabla _y\psi \) are obtained by interpolation of the reconstructed variable. The advection term is approximated with a space-dependent upwind scheme, and diffusion is approximated with centered differences.
5.2.3 Optimality Condition and Sweeping Iteration
Once the forward–backward system has been discretized, what remains is to establish a coupling procedure in order to find the solution of the optimality system matching both initial and terminal conditions. For this, a first possibility is to consider the full space–time discretization of the forward–backward system, together with the optimality condition \(\nabla \Psi (f) = {-\frac{1}{\gamma }\nabla \psi }\), and cast it as a large-scale set of nonlinear equations, which can be solved via a Newton method. This idea has been already successfully applied in the context of mean field games in [2]. We pursue a different approach that has proven to be equally effective, developed in [26], where the authors apply a sweeping algorithm, which in our setting reads as follows.
Our numerical experience is consistent with what has been already reported in [26], in the sense that solutions satisfying the optimality system can be found after a few sweeps. Convergence of a similar sweeping iteration in the context of mean-field games has been recently proven in [25]. An alternative approach is to follow a gradient-type method, as in [20].
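The structure of a sweeping iteration (forward pass for the state, backward pass for the adjoint, update of the control through the optimality condition) can be illustrated on a scalar toy problem. The sketch below is not the PDE system (5.3)–(5.5): it treats the assumed ODE analogue \(\dot{x} = u\) with cost \(\frac{1}{2}\int _0^T (|x-\bar{x}|^2 + \gamma |u|^2)\,dt\), whose adjoint satisfies \(\dot{p} = -(x-\bar{x})\), \(p(T)=0\), and whose optimality condition is \(\gamma u + p = 0\). The relaxation parameter and all names are hypothetical choices.

```python
def forward(x0, u, dt):
    """Forward Euler pass for x' = u."""
    x = [x0]
    for um in u:
        x.append(x[-1] + dt * um)
    return x


def sweep_lq(x0, x_bar, gamma, T, M, relax=0.5, tol=1e-10, max_sweeps=500):
    """Forward-backward sweep for the toy problem described in the lead-in."""
    dt = T / M
    u = [0.0] * M                           # initial guess for the control
    sweeps = 0
    for sweeps in range(1, max_sweeps + 1):
        x = forward(x0, u, dt)              # 1) state from the current control
        p = [0.0] * (M + 1)                 # 2) adjoint, integrated backward in time
        for m in range(M - 1, -1, -1):
            p[m] = p[m + 1] + dt * (x[m + 1] - x_bar)
        # 3) optimality condition gamma*u + p = 0, applied with relaxation
        u_new = [(1.0 - relax) * u[m] - relax * p[m] / gamma for m in range(M)]
        change = max(abs(a - b) for a, b in zip(u_new, u))
        u = u_new
        if change < tol:
            break
    return forward(x0, u, dt), u, sweeps


if __name__ == "__main__":
    x, u, sweeps = sweep_lq(x0=1.0, x_bar=0.0, gamma=1.0, T=1.0, M=200)
    print("sweeps:", sweeps, "final state:", x[-1])
```

For this toy problem the continuous optimal trajectory is \(x(t) - \bar{x} = (x_0-\bar{x})\cosh ((T-t)/\sqrt{\gamma })/\cosh (T/\sqrt{\gamma })\), which gives a simple check of the iteration. Whether a plain or relaxed sweep converges for the full mean field system depends on the problem data and is not addressed by this sketch.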
5.3 Numerical Experiments
In order to validate our previous analysis we focus on models for opinion dynamics [53, 69, 78, 80]. In the one-dimensional case the state variable \(x\in [-L,L]\) represents the agent opinion with respect to two opposite opinions \(\{-L,+L\}\), and the control f(x, t) can be interpreted as the strategy of a policy maker [5, 6].
Therefore we consider the following initial value problem
with no-flux boundary conditions, and where f denotes the control term, solution of
where we consider a quadratic penalization of the control, i.e. \(\Psi (c) = |c|^2/2\).
For different interaction kernels \(P(\cdot ,\cdot )\), we will study the performance of the proposed controllers \(f=f(x,t)\), obtained through the following synthesis procedures: instantaneous control (IC), finite horizon (FH), and the sweeping algorithm (OC).
We report in Table 1 the choice of the algorithms and parameters, indicating for which method they have been used to compute (5.8)–(5.9).
5.3.1 Test 1: Sznajd Model
We consider the Sznajd model, [10, 78] for which the interaction operator \(P(\cdot ,\cdot )\) in (5.8) is defined as follows
for \(\beta \) a constant. Note that in this case the interaction kernel \(P(\cdot ,\cdot )\) models the propensity of voters to change their opinions within the domain \(\Omega = [-1,1]\): for values close to the extremal opinions \(\{-1,1\}\) the influence is low, whereas for opinions close to zero the influence is high. The dynamics is such that for \(\beta >0\) concentration of the density profile appears, whereas for \(\beta <0\) separation occurs, namely concentration around \(x=1\) and \(x=-1\), see [10].
For our first test we fix \(\beta =-1\) and consider the time interval [0, T] with \(T= 8\). We solve the control problem (5.8)–(5.9) with bimodal initial data \(\mu ^0(x) := \varrho _+(x+0.75;0.05,0.5)+\varrho _+(x-0.5;0.15,1),\) where \(\varrho _+(y;a,b) := \max \{-(y/b)^2+a,0\}\), with diffusion coefficient \(\sigma = 0.01\) and desired state \(\bar{x} = -0.5\).
In Fig. 1 we depict the final state of (5.10) at time \(T = 8\) for the uncontrolled and controlled dynamics. The simulations show the concentration of the profiles around the reference position \(\bar{x}\) in the presence of the control, whereas in the uncontrolled case the density tends to concentrate around the boundary. The left-hand side figure refers to a penalization of the control \(\gamma = 0.5\), the right-hand side figure to \(\gamma = 0.05\). As expected, with smaller control penalizations, the final state is driven closer to the desired reference.
Figure 2 illustrates the transient behavior of the density \(\mu (x,t)\) and the control f(x, t) in the \([-1,+1]\times [0,T]\) frame, respectively for \(\gamma =0.5\) and \(\gamma = 0.05\), and we report the values of the cost function \(J(\mu ,f)\) corresponding to the different methods. Note that the action of the instantaneous control is almost constant in time, steering the system toward \(\bar{x}\) but with the higher cost \(J(\mu ,f)\); on the other hand, the optimal finite horizon control for the binary dynamics (FH) produces a control similar to the optimal control obtained by the sweeping algorithm (OC), with a small difference between the values of the cost functional.
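The initial profile of this test can be evaluated directly from the formula for \(\mu ^0\). The sketch below (function names are hypothetical) also approximates its total mass with a midpoint rule. Each truncated parabola \(\varrho _+(\cdot ;a,b)\) integrates to \(\frac{4}{3}a^{3/2}b\), so the profile as written has mass of roughly 0.085 rather than 1; a normalization constant is presumably implied, which is an assumption and not something stated in the text.

```python
def rho_plus(y, a, b):
    """Truncated parabola max{-(y/b)^2 + a, 0}."""
    return max(a - (y / b) ** 2, 0.0)


def mu0(x):
    """Initial profile of Test 1 as written in the text (no normalization applied)."""
    return rho_plus(x + 0.75, 0.05, 0.5) + rho_plus(x - 0.5, 0.15, 1.0)


def mass(density, left=-1.0, right=1.0, cells=4000):
    """Midpoint-rule approximation of the integral of density over [left, right]."""
    h = (right - left) / cells
    return h * sum(density(left + (k + 0.5) * h) for k in range(cells))


if __name__ == "__main__":
    total = mass(mu0)
    print("mass of the profile as written:", total)
    # Rescaled profile with unit mass (assumed normalization)
    print("normalized peak value at x = 0.5:", mu0(0.5) / total)
```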
5.3.2 Test 2: Hegselmann–Krause Model
In this second test we consider the mean field Hegselmann–Krause model [53], also known as bounded confidence model, whose interaction kernel reads
This type of model describes the propensity of agents to interact only within a confidence range \(K=[x-\kappa ,x+\kappa ]\) of their opinion x; in the present experiment we fix \(\kappa = 0.15\). We study the evolution of the control problem (5.8)–(5.9) up to time \(T = 20\) with initial data defined as \(\mu ^0(x) = C_0(0.5+\epsilon (1-x^2)),\) for \(\epsilon = 0.01\) and \(C_0\) such that the total density is a probability distribution. The diffusion coefficient is \(\sigma = 10^{-5}\), the penalization parameter \(\gamma = 2.5\), and the desired state \(\bar{x} = 0\).
The uncontrolled evolution of this model shows the emergence of multiple clusters, as it is shown in the top picture of Fig. 3, due to the small value of \(\kappa \) and small diffusion. Figure 3 depicts the transient behavior of the density \(\mu (x,t)\) and the control signal f(x, t) in the frame \(\Omega \times [0,T]\).
We observe in Fig. 3 that for the instantaneous control (IC), consensus is slowly reached with a cost functional value of \(J_{IC}(\mu ,f)=0.8807\); the finite horizon control (FH) and the solution of the optimality conditions (OC) are able to steer faster the system towards \(\bar{x}\), respectively with cost \(J_{FH}(\mu ,f)= 0.6079\), and \({J_{OC}=0.5080}\).
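To put the three reported cost values on a common scale, one can compute the relative excess of each cost over the (OC) value. The short sketch below does only this arithmetic on the numbers quoted above; the function names are hypothetical and the choice of (OC) as the reference is an assumption made for illustration.

```python
# Cost functional values reported for Test 2 (taken from the text).
COSTS = {"IC": 0.8807, "FH": 0.6079, "OC": 0.5080}


def relative_gap(cost, reference):
    """Relative excess of `cost` over `reference`."""
    return (cost - reference) / reference


def gaps_to_reference(costs, reference_key="OC"):
    """Relative gap of every method with respect to the reference method."""
    reference = costs[reference_key]
    return {name: relative_gap(value, reference) for name, value in costs.items()}


if __name__ == "__main__":
    for name, gap in gaps_to_reference(COSTS).items():
        print(f"{name}: {100.0 * gap:.1f}% above OC")
```

With the values above, (FH) lies roughly 20% above (OC) and (IC) roughly 73% above it, which is consistent with the ordering (IC) \(\rightarrow \) (FH) \(\rightarrow \) (OC).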
These experiments show very clearly the hierarchy of the controls (IC) \(\rightarrow \) (FH) \(\rightarrow \) (OC). In particular, the quasi-optimality of (FH) is evident, to the extent that we can claim (FH) \(\approx \) (OC). The intuition is that (FH) is an optimal control for the binary dynamics of two particles and, through the Boltzmann collisional operator, its binary optimality is “smeared” over the entire population. However, we do not yet have a quantitative method to assess such an approximation. In fact, as commented in Remark 4.2, although (FH) fulfills a Hamilton–Jacobi–Bellman equation, its synthesis by means of (4.22) to control (4.21) unfortunately does not fulfill the backward equation (5.4) of the optimality conditions, not even approximately: by testing (4.22) within (5.4), there are a few useful cancellations but, because of the lack of symmetry, certain terms remain, whose magnitude is still hard to estimate. We expect that those terms are actually not so large, which would somehow justify the quasi-optimality of (FH). This issue remains an interesting open problem.
6 Concluding Remarks
In this paper, we have presented a hierarchy of control designs for mean field dynamics. At the bottom of the hierarchy, we have introduced optimal feedback controls which are derived for two-agent models, and which are subsequently realized at the mean field level through a Boltzmann approach. At the top of the hierarchy, one finds the mean field optimal control problem and its corresponding optimality conditions. In both cases, we presented a theoretical and numerical analysis of the proposed designs, as well as computational implementations. From the numerical experiments presented in the last section, we observe that, although the numerical realization of the mean field optimality system yields the best controller in terms of the cost functional value, feedback controllers obtained for the binary system perform reasonably well and provide a much simpler control synthesis. We expect to proceed further along this direction of research, in particular in relation to the computation of feedback controllers via Dynamic Programming and Hamilton–Jacobi–Bellman equations for the binary system, as it provides a versatile framework to address different control problems.
References
Achdou, Y., Laurière, M.: Mean field type control with congestion. Appl. Math. Optim. 73(3), 393–418 (2016)
Achdou, Y., Camilli, F., Capuzzo-Dolcetta, I.: Mean field games: numerical methods for the planning problem. SIAM J. Control Optim. 50(1), 77–109 (2012)
Albi, G., Pareschi, L.: Binary interaction algorithms for the simulation of flocking and swarming dynamics. Multiscale Model. Simul. 11, 1–29 (2013)
Albi, G., Pareschi, L.: Modeling of self-organized systems interacting with a few individuals: from microscopic to macroscopic dynamics. Appl. Math. Lett. 26, 397–401 (2013)
Albi, G., Pareschi, L., Zanella, M.: Boltzmann-type control of opinion consensus through leaders. Philos. Trans. R. Soc. A 372, 20140138/1–20140138/18 (2014)
Albi, G., Herty, M., Pareschi, L.: Kinetic description of optimal control problems and applications to opinion consensus. Commun. Math. Sci. 13(6), 1407–1429 (2015)
Albi, G., Bongini, M., Cristiani, E., Kalise, D.: Invisible control of self-organizing agents leaving unknown environments. SIAM J. Appl. Math. 76, 1683–1710 (2016)
Albi, G., Pareschi, L., Toscani, G., Zanella, M.: Recent advances in opinion modeling: control and social influence. In: Bellomo, N., Degond, P., Tadmor, E. (eds.) Active Particles, vol. 1: Theory, Methods, and Applications. Birkhäuser-Springer, Boston (2016)
Albi, G., Pareschi, L., Zanella, M.: Opinion dynamics over complex networks: kinetic modeling and numerical methods. Kinet. Relat. Models 10(1), 1–32 (2017)
Aletti, G., Naldi, G., Toscani, G.: First-order continuous models of opinion formation. SIAM J. Appl. Math. 67(3), 837–853 (2007)
Alla, A., Falcone, M., Kalise, D.: An efficient policy iteration algorithm for dynamic programming equations. SIAM J. Sci. Comput. 37(1), A181–A200 (2015)
Ballerini, M., Cabibbo, N., Candelier, R., Cavagna, A., Cisbani, E., Giardina, L., Lecomte, L., Orlandi, A., Parisi, G., Procaccini, A., Viale, M., Zdravkovic, V.: Interaction ruling animal collective behavior depends on topological rather than metric distance: evidence from a field study. PNAS 105(4), 1232–1237 (2008)
Bellman, R., Kalaba, R.E.: Dynamic Programming and Modern Control Theory, vol. 81. Citeseer (1965)
Benamou, J.-D., Brenier, Y.: A computational fluid mechanics solution to the Monge-Kantorovich mass transfer problem. Numer. Math. 84(3), 375–393 (2000)
Bensoussan, A., Frehse, J., Yam, P.: Mean Field Games and Mean Field Type Control Theory. Springer, New York (2013)
Bobylev, A., Nanbu, K.: Theory of collision algorithms for gases and plasmas based on the Boltzmann equation and the Landau-Fokker-Planck equation. Phys. Rev. E 61(4), 4576 (2000)
Bongini, M., Fornasier, M.: Sparse stabilization of dynamical systems driven by attraction and avoidance forces. Netw. Heterog. Media 9(1), 1–31 (2014)
Bongini, M., Fornasier, M.: Sparse control of multiagent systems. In: Bellomo, N., Degond, P., Tadmor, E. (eds.) Active Particles, vol. 1: Theory, Methods, and Applications. Birkhäuser-Springer, Boston (2016)
Buet, C., Dellacherie, S.: On the Chang and Cooper scheme applied to a linear Fokker-Planck equation. Commun. Math. Sci. 8(4), 1079–1090 (2010)
Burger, M., Francesco, M.D., Markowich, P.A., Wolfram, M.-T.: Mean field games with nonlinear mobilities in pedestrian dynamics. Discret. Contin. Dyn. Syst. Ser. B 19(5), 1311–1333 (2014)
Camazine, S., Deneubourg, J., Franks, N., Sneyd, J., Theraulaz, G., Bonabeau, E.: Self-organization in Biological Systems. Princeton University Press, Princeton (2003)
Camilli, F., Jakobsen, E.R.: A finite element like scheme for integro-partial differential Hamilton-Jacobi-Bellman equations. SIAM J. Numer. Anal. 47(4), 2407–2431 (2009)
Cañizo, J.A., Carrillo, J.A., Rosado, J.: A well-posedness theory in measures for some kinetic models of collective motion. Math. Models Methods Appl. Sci. 21(3), 515–539 (2011)
Caponigro, M., Fornasier, M., Piccoli, B., Trélat, E.: Sparse stabilization and optimal control of the Cucker-Smale model. Math. Control Relat. Fields 3, 447–466 (2013)
Cardaliaguet, P., Hadikhanloo, S.: Learning in mean field games: the fictitious play. ESAIM: COCV 23(2), 569–591 (2017)
Carlini, E., Silva, F.J.: A fully discrete semi-Lagrangian scheme for a first order mean field game problem. SIAM J. Numer. Anal. 52(1), 45–67 (2014)
Carrillo, J.A., D’Orsogna, M.R., Panferov, V.: Double milling in self-propelled swarms from kinetic theory. Kinet. Relat. Models 2(2), 363–378 (2009)
Carrillo, J.A., Fornasier, M., Toscani, G., Vecil, F.: Particle, kinetic, and hydrodynamic models of swarming. In: Naldi, G., Pareschi, L., Toscani, G., Bellomo, N. (eds.) Mathematical Modeling of Collective Behavior in Socio-Economic and Life Sciences. Modeling and Simulation in Science, Engineering and Technology, pp. 297–336. Birkhäuser Boston, Boston (2010)
Carrillo, J.A., Choi, Y.-P., Hauray, M.: The derivation of swarming models: mean-field limit and Wasserstein distances. In: Muntean, A., Toschi, F. (eds.) Collective Dynamics from Bacteria to Crowds. CISM International Centre for Mechanical Sciences, pp. 1–46. Springer, New York (2014)
Carrillo, J.A., Choi, Y.-P., Pérez, S.: A review on attractive–repulsive hydrodynamics for consensus in collective behavior. In: Bellomo, N., Degond, P., Tadmor, E. (eds.) Active Particles. Modeling and Simulation in Science, Engineering and Technology, vol. 1, pp. 259–298. Birkhäuser, Cham (2017)
Chang, J., Cooper, G.: A practical difference scheme for Fokker-Planck equations. J. Comput. Phys. 6(1), 1–16 (1970)
Choi, Y.-P.: Global classical solutions of the Vlasov-Fokker-Planck equation with local alignment forces. Nonlinearity 29(7), 1887–1916 (2016)
Choi, Y.-P., Ha, S.-Y., Li, Z.: Emergent dynamics of the Cucker–Smale flocking model and its variants. In: Bellomo, N., Degond, P., Tadmor, E. (eds.) Active Particles. Modeling and Simulation in Science, Engineering and Technology, vol 1, pp. 299–331. Birkhäuser, Cham (2017)
Chuang, Y., D’Orsogna, M., Marthaler, D., Bertozzi, A., Chayes, L.: State transition and the continuum limit for the 2D interacting, self-propelled particle system. Physica D 232, 33–47 (2007)
Chuang, Y., Huang, Y., D’Orsogna, M., Bertozzi, A.: Multi-vehicle flocking: scalability of cooperative control algorithms using pairwise potentials. In: IEEE International Conference on Robotics and Automation, pp. 2292–2299 (2007)
Cordier, S., Pareschi, L., Toscani, G.: On a kinetic model for a simple market economy. J. Stat. Phys. 120(1–2), 253–277 (2005)
Couzin, I., Franks, N.: Self-organized lane formation and optimized traffic flow in army ants. Proc. R. Soc. Lond. B 270, 139–146 (2002)
Couzin, I., Krause, J., Franks, N., Levin, S.: Effective leadership and decision making in animal groups on the move. Nature 433, 513–516 (2005)
Cristiani, E., Piccoli, B., Tosin, A.: Modeling self-organization in pedestrians and animal groups from macroscopic and microscopic viewpoints. In: Naldi, G., Pareschi, L., Toscani, G., Bellomo, N. (eds.) Mathematical Modeling of Collective Behavior in Socio-Economic and Life Sciences. Modeling and Simulation in Science, Engineering and Technology. Birkhäuser Boston, Boston (2010)
Cristiani, E., Piccoli, B., Tosin, A.: Multiscale modeling of granular flows with application to crowd dynamics. Multiscale Model. Simul. 9(1), 155–182 (2011)
Cucker, F., Dong, J.-G.: A general collision-avoiding flocking framework. IEEE Trans. Autom. Control 56(5), 1124–1129 (2011)
Cucker, F., Mordecki, E.: Flocking in noisy environments. J. Math. Pures Appl. (9) 89(3), 278–296 (2008)
Cucker, F., Smale, S.: Emergent behavior in flocks. IEEE Trans. Autom. Control 52(5), 852–862 (2007)
Cucker, F., Smale, S., Zhou, D.: Modeling language evolution. Found. Comput. Math. 4(5), 315–343 (2004)
Degond, P., Herty, M., Liu, J.-G.: Meanfield games and model predictive control. Commun. Math. Sci. 15(5), 1403–1422 (2017)
Duan, R., Fornasier, M., Toscani, G.: A kinetic flocking model with diffusion. Commun. Math. Phys. 300, 95–145 (2010)
Falcone, M., Ferretti, R.: Semi-Lagrangian Approximation Schemes for Linear and Hamilton-Jacobi Equations. Society for Industrial and Applied Mathematics, Philadelphia (2013)
Festa, A.: Reconstruction of independent sub-domains for a class of Hamilton–Jacobi equations and application to parallel computing. ESAIM: M2AN 50(4), 1223–1240 (2016)
Festa, A., Wolfram, M.-T.: Collision avoidance in pedestrian dynamics. In: 2015 54th IEEE Conference on Decision and Control (CDC), pp. 3187–3192 (2015)
Filippov, A.F.: Differential Equations with Discontinuous Righthand Sides. Mathematics and Its Applications. Kluwer, Dordrecht (1988)
Fornasier, M., Solombrino, F.: Mean-field optimal control. ESAIM Control Optim. Calc. Var. 20(4), 1123–1152 (2014)
Grégoire, G., Chaté, H.: Onset of collective and cohesive motion. Phys. Rev. Lett. 92(2), 025702 (2004)
Hegselmann, R., Krause, U.: Opinion dynamics and bounded confidence: models, analysis and simulation. J. Artif. Soc. Soc. Simul. 5(3), 1–33 (2002)
Hofbauer, J., Sigmund, K.: Evolutionary Games and Population Dynamics. Cambridge University Press, Cambridge (1998)
Huang, M., Caines, P., Malhamé, R.: Individual and mass behaviour in large population stochastic wireless power control problems: centralized and Nash equilibrium solutions. In: Proceedings of the 42nd IEEE Conference on Decision and Control Maui, Hawaii, USA, December 2003, pp. 98–103 (2003)
Jadbabaie, A., Lin, J., Morse, A.S.: Correction to: “Coordination of groups of mobile autonomous agents using nearest neighbor rules”. IEEE Trans. Autom. Control 48(9), 1675 (2003)
Kalise, D., Kröner, A., Kunisch, K.: Local minimization algorithms for dynamic programming equations. SIAM J. Sci. Comput. 38(3), A1587–A1615 (2016)
Ke, J., Minett, J., Au, C.-P., Wang, W.-Y.: Self-organization and selection in the emergence of vocabulary. Complexity 7, 41–54 (2002)
Keller, E.F., Segel, L.A.: Initiation of slime mold aggregation viewed as an instability. J. Theor. Biol. 26(3), 399–415 (1970)
Koch, A., White, D.: The social lifestyle of myxobacteria. Bioessays 20, 1030–1038 (1998)
Lacker, D.: Limit theory for controlled McKean-Vlasov dynamics. SIAM J. Control Optim. 55(3), 1641–1672 (2017)
Lasry, J.-M., Lions, P.-L.: Mean field games. Jpn. J. Math. (3) 2(1), 229–260 (2007)
Leonard, N., Fiorelli, E.: Virtual leaders, artificial potentials and coordinated control of groups. In: Proceeding of 40th IEEE Conference on Decision and Control, pp. 2968–2973 (2001)
Mayne, D.Q., Rawlings, J.B., Rao, C.V., Scokaert, P.O.M.: Constrained model predictive control: stability and optimality. Autom. J. IFAC 36(6), 789–814 (2000)
Motsch, S., Tadmor, E.: Heterophilious dynamics enhances consensus. SIAM Rev. 56(4), 577–621 (2014)
Niwa, H.: Self-organizing dynamic model of fish schooling. J. Theor. Biol. 171, 123–136 (1994)
Nourian, M., Caines, P., Malhamé, R.: Synthesis of Cucker–Smale type flocking via mean field stochastic control theory: Nash equilibria. In: Proceedings of the 48th Allerton Conference on Communication, Control, and Computing, Monticello, Illinois, September 2010, pp. 814–819 (2010)
Nourian, M., Caines, P., Malhamé, R.: Mean field analysis of controlled Cucker–Smale type flocking: linear analysis and perturbation equations. In: Proceedings of 18th IFAC World Congress Milano (Italy), 28 August–2 September 2011, pp. 4471–4476 (2011)
Pareschi, L., Toscani, G.: Interacting Multi-agent Systems. Kinetic Equations & Monte Carlo Methods. Oxford University Press, Oxford (2013)
Parrish, J., Edelstein-Keshet, L.: Complexity, pattern, and evolutionary trade-offs in animal aggregation. Science 294, 99–101 (1999)
Parrish, J., Viscido, S., Gruenbaum, D.: Self-organized fish schools: an examination of emergent properties. Biol. Bull. 202, 296–305 (2002)
Perea, L., Gómez, G., Elosegui, P.: Extension of the Cucker-Smale control law to space flight formations. AIAA J. Guid. Control Dyn. 32, 527–537 (2009)
Perthame, B.: Transport Equations in Biology. Birkhäuser, Basel (2007)
Romey, W.: Individual differences make a difference in the trajectories of simulated schools of fish. Ecol. Model. 92, 65–77 (1996)
Roy, S., Annunziato, M., Borzì, A.: A Fokker-Planck feedback control-constrained approach for modeling crowd motion. J. Comput. Theor. Transp. 45(6), 442–458 (2016)
Short, M.B., D’Orsogna, M.R., Pasour, V.B., Tita, G.E., Brantingham, P.J., Bertozzi, A.L., Chayes, L.B.: A statistical model of criminal behavior. Math. Models Methods Appl. Sci. 18(suppl.), 1249–1267 (2008)
Sugawara, K., Sano, M.: Cooperative acceleration of task performance: foraging behavior of interacting multi-robots system. Physica D 100, 343–354 (1997)
Sznajd-Weron, K., Sznajd, J.: Opinion evolution in closed community. Int. J. Mod. Phys. C 11(06), 1157–1165 (2000)
Toner, J., Tu, Y.: Long-range order in a two-dimensional dynamical xy model: how birds fly together. Phys. Rev. Lett. 75, 4326–4329 (1995)
Toscani, G.: Kinetic models of opinion formation. Commun. Math. Sci. 4(3), 481–496 (2006)
Vicsek, T., Zafeiris, A.: Collective motion. Phys. Rep. 517, 71–140 (2012)
Vicsek, T., Czirok, A., Ben-Jacob, E., Cohen, I., Shochet, O.: Novel type of phase transition in a system of self-driven particles. Phys. Rev. Lett. 75, 1226–1229 (1995)
Villani, C.: On a new class of weak solutions to the spatially homogeneous Boltzmann and Landau equations. Arch. Ration. Mech. Anal. 143(3), 273–307 (1998)
Villani, C.: Optimal Transport, vol. 338 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer, Berlin (2009)
Yates, C., Erban, R., Escudero, C., Couzin, I., Buhl, J., Kevrekidis, I., Maini, P., Sumpter, D.: Inherent noise can facilitate coherence in collective swarm motion. Proc. Natl Acad. Sci. U.S.A. 106, 5464–5469 (2009)
Zeidler, E.: Applied Functional Analysis. Applied Mathematical Sciences. Springer, New York (1995)
Acknowledgements
G.A., Y.P.C., and M.F. acknowledge the support of the ERC-Starting Grant HDSPCONTR “High-Dimensional Sparse Optimal Control”. Y.P.C. is also supported by the Alexander Humboldt Foundation through the Humboldt Research Fellowship for Postdoctoral Researchers. D.K. acknowledges the support of the ERC-Advanced Grant OCLOC “From Open-Loop to Closed-Loop Optimal Control of PDEs”.
Albi, G., Choi, YP., Fornasier, M. et al. Mean Field Control Hierarchy. Appl Math Optim 76, 93–135 (2017). https://doi.org/10.1007/s00245-017-9429-x
DOI: https://doi.org/10.1007/s00245-017-9429-x