1 Introduction

The dynamics of pedestrian crowds can exhibit highly complex phenomena, which stem from the complexity of the cognitive processes behind human actions but also from self-organization, emerging from the combination of simple interaction rules. The ubiquity of self-organization in systems of several interacting agents has been shown in many different fields (see [9, 19, 24, 29, 30, 46]). To model such complex behaviors mathematically, researchers have resorted to many different approaches, which encompass different scales. This chapter focuses on a measure-theoretic approach, which allows one to combine different scales, taking advantage of different modeling capabilities.

One of the most famous models at the microscopic level is the celebrated social force model, proposed by Helbing and Molnár [27]. The main idea is that the reaction of pedestrians to the environment can be modeled as the effect of forces. The latter are not true physical forces but rather a modeling abstraction to represent the reactions of humans as social beings and thus are referred to as social forces. Such a model is mainly based on a desired velocity, depending on the single pedestrian's characteristics and goals, and on terms representing interactions among pedestrians and with the environment (such as walls and barriers). Interestingly enough, such a model shares many similarities with models proposed, independently, by biologists for animal groups. Moreover, the social force model is similar to the well-known Cucker-Smale alignment model, which has been studied extensively by the applied mathematics community and was first defined to model the dynamics of languages.

As in the theory of dilute gases, one may pass to the limit in the number of agents (here instead of particles) and obtain mean-field limit equations, usually of Vlasov-Poisson type [12, 23]. Moreover, such equations allow, for nonlocal velocity fields, a well-developed theory for measure solutions and convergence analysis using the Wasserstein distance. The latter, widely used in optimal transport [45], metrizes the weak convergence on compact sets. These facts naturally call for the use of measures to represent the dynamics of crowds. Moreover, a measure can naturally represent different scales, with Dirac masses corresponding to microscopic components and absolutely continuous measures to macroscopic components.

One can define general nonlinear nonlocal transport equations, which include both the microscopic models as empirical measure solutions and the macroscopic mean-field limits as absolutely continuous solutions. The theory for such equations is strongly based on Wasserstein distances. In particular, Lipschitz conditions w.r.t. the Wasserstein distance for measures and the uniform norm for vector fields allow one to prove existence and uniqueness of solutions. The same result is not true if one uses the total variation norm, which corresponds to the \(L^1\) distance for functions. The Wasserstein distance also has modeling advantages, as explained in Sect. 2.5.

After reviewing the theory developed in recent works, we propose a new modeling framework for crowds. The latter is based on the idea of using the mass to represent the social influence of a pedestrian. In other words, a bigger mass would represent a pedestrian with a stronger effect on the others. We propose a dynamic model where the masses may vary in time. The variation will depend on the mass of the pedestrian under consideration, but also on the interaction with the other pedestrians. We first detail the microscopic model, providing also analytical properties. Natural modeling choices lead to lack of conservation of the mass; thus we resort to the generalized Wasserstein distance, which is defined for measures with different masses. It is possible to define the mean-field limit of these models with time-varying mass. The obtained equations exhibit transport as well as source terms. Using the generalized Wasserstein distance, it is then possible to develop a complete theory for such equations.

To complete the presentation, we include simulation results which show the difference between mass-preserving and mass-varying models for an evacuation problem.

2 Microscopic and Multi-scale Models

The main idea behind the use of measure-theoretic models is the possibility of representing different scales in a unique framework. For this purpose, we will first recall some microscopic models, which provide the basic ingredients for the motion of a single pedestrian. There is a wealth of mesoscopic and macroscopic models as well: they can be either obtained as mean-field limits of microscopic models or based on general principles, such as conservation of mass and balance of momentum. We will not include a review of mesoscopic and macroscopic models and refer the reader to [3, 5, 6, 7, 17] for details. Then we introduce the measure-theoretic approach, which allows the inclusion of different scales in a unique framework.

2.1 Microscopic: The Social Force Models

The most used microscopic model for pedestrian motion and crowd dynamics is the celebrated social force model first introduced by Helbing and Molnár (see, for instance, [27]). The popularity of the model is mostly due to the fact that it is relatively simple yet capable of capturing various self-organization phenomena observed in crowd dynamics. The work of Helbing and Molnár was inspired by the previous work of Lewin, which considered forces to represent the influence of the environment on social behavior [31]. The main concept behind such an approach is the idea that variations in the velocity of pedestrians (physically, accelerations and decelerations) arise as reactions to the perceived environment, including the presence of other pedestrians, and can be mimicked by forces. The latter are not real forces but rather the effect of “social” interactions with the environment; this explains the name of the model.

Due to its popularity, many authors contributed variations of the original model: listing all the proposed models would require too much space; thus we will rather point out the main variations considered. Notice that many effects can be neglected if pedestrians are assumed to be dimensionless points in space, but more realistically each pedestrian should be modeled according to the space occupied.

Let us start by indicating with \(x_i\) the position of the i-th pedestrian in a walkable area \(\varOmega \subset \mathbb {R}^2\). One may consider also the case of \(\varOmega \subset \mathbb {R}^3\), but this is much less common. Each pedestrian possesses a desired velocity \(\bar {v}_i\), which is usually the vector pointing toward the desired destination and having modulus equal to a comfort speed. Each pedestrian tends to reach the desired velocity within a given relaxation time τ; thus, a first force is described by:

$$\displaystyle \begin{aligned} \begin{array}{rcl} F_i(v_i)=\frac{\bar{v}_i-v_i}{\tau}. {}\end{array} \end{aligned} $$
(1)

Notice that \(F_i\) may also depend on the position \(x_i\) and time t.

Pedestrians do interact with each other. These interactions may include repulsion effects, when the distance among pedestrians is lower than a desired personal space, and attraction. Both effects can be taken into account by some attraction-repulsion potential, giving rise to the forces:

$$\displaystyle \begin{aligned} \begin{array}{rcl} F_{ij}(x_i,x_j,v_i,v_j)=F_{int}(x_j-x_i,v_j-v_i)=\nabla \varPhi(x_j-x_i, v_j-v_i). {}\end{array} \end{aligned} $$
(2)

Notice that terms of this type are very common for models used in other domains, such as animal groups. See also Sect. 2.2.

Then one considers the presence of walls and other obstacles characterizing the environment. The interactions with the environment can be captured by potentials which depend only on the position and speed of the pedestrian, thus giving rise to forces of the type:

$$\displaystyle \begin{aligned} \begin{array}{rcl} F_{E}(x_i,v_i)=\nabla\varPsi(x_i,v_i). {}\end{array} \end{aligned} $$
(3)

As mentioned above, many variations of the original social force model have been proposed, including body compression, sliding frictions, other frictions, group forces, and others. While the modeling of the forces \(F_i\), \(F_{ij}\), and \(F_E\) appears to be comparable with experimental results, the other terms are usually less easy to tune with data [4].

One of the main assumptions of the social force model is the summability of the effects of the different forces. This is clearly an idealization, but it is, most of the time, the working assumption. Summarizing, given N pedestrians in positions \(x_i\) and having velocities \(v_i\), their dynamics is described by the system of ordinary differential equations:

$$\displaystyle \begin{aligned} \left\{ \begin{array}{l} \dot{x}_i=v_i\\ \dot{v}_i=F_i(v_i)+\sum_j F_{ij}(x_i,x_j,v_i,v_j)+F_{E}(x_i,v_i). \end{array} \right. \end{aligned} $$
(4)
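As an illustration of system (4), the following minimal sketch integrates the relaxation term (1) together with a pairwise repulsion using the explicit Euler method. The exponential form of the repulsion, the parameter values, and all function names are assumptions made for this example and are not taken from the model above; the environment term \(F_E\) is omitted.

```python
import math

def desired_force(v, v_bar, tau):
    """Relaxation term (1): (v_bar - v) / tau, componentwise."""
    return ((v_bar[0] - v[0]) / tau, (v_bar[1] - v[1]) / tau)

def repulsion_force(xi, xj, strength=2.0, length=0.5):
    """Hypothetical pairwise term: exponential repulsion pushing i away from j."""
    dx, dy = xi[0] - xj[0], xi[1] - xj[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (0.0, 0.0)  # direction undefined for coincident points
    mag = strength * math.exp(-dist / length)
    return (mag * dx / dist, mag * dy / dist)

def euler_step(xs, vs, v_bars, tau, dt):
    """One explicit Euler step of system (4), without the environment term F_E."""
    new_xs, new_vs = [], []
    for i, (x, v) in enumerate(zip(xs, vs)):
        fx, fy = desired_force(v, v_bars[i], tau)
        for j, xj in enumerate(xs):
            if j != i:
                rx, ry = repulsion_force(x, xj)
                fx, fy = fx + rx, fy + ry
        new_xs.append((x[0] + dt * v[0], x[1] + dt * v[1]))
        new_vs.append((v[0] + dt * fx, v[1] + dt * fy))
    return new_xs, new_vs

def simulate(xs, vs, v_bars, tau=0.5, dt=0.01, steps=1000):
    """Iterate euler_step; positions and velocities are lists of 2D tuples."""
    for _ in range(steps):
        xs, vs = euler_step(xs, vs, v_bars, tau, dt)
    return xs, vs
```

With a single pedestrian, the velocity relaxes toward the desired velocity on the time scale τ, as prescribed by (1); with several pedestrians the forces are simply summed, which is the summability assumption discussed above.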

If the functions \(F_i\), \(F_{ij}\), and \(F_E\) are Lipschitz continuous, then for every initial condition \(x_0 = (x_1(0), \ldots, x_N(0))\), \(v_0 = (v_1(0), \ldots, v_N(0))\), there exists a unique solution. The only singularity usually considered occurs when the repulsion component of \(F_{ij}\) is unbounded for \(x_j - x_i\) tending to 0. This is a well-studied problem in many different fields, for instance, for its application to conflict resolution in aviation [44], robot groups [36], and general mathematical models [11].

Oftentimes researchers include uncertainties by adding stochastic terms. The social force model has similarities with various microscopic and kinetic approaches to gas and fluid dynamics, and a wide literature is available, including stochastic models (see, for instance, [13] and references therein). However, to our knowledge, most researchers using the social force model focus on Langevin-type approaches for simulations, rather than investigating the mathematically rigorous aspects.

2.1.1 Panic

From a modeling point of view, a lot of attention has been devoted to distinguishing situations where pedestrians behave normally from emergency situations where rational behavior ceases and other phenomena occur. Many authors refer to “panic” for such situations. The social force model includes panic situations by appropriately modifying the involved forces. Notice that in most panic situations one should include the role played by the mechanics of the pedestrian body; indeed contacts and interactions occur in a fully 3D setting rather than the usual 2D one. Authors refer to these forces as body forces. One of the best-known phenomena is the formation of arches at exits, which usually slow down or even block the flow through doors or other restricted passages. A full treatment of this situation goes beyond the scope of this paper, and we refer the reader to [17, 33] for details and references. Let us just mention that a wealth of models has been proposed for pedestrian motion at the nanoscopic level, i.e., considering also the dynamics of the pedestrian body. One of the most celebrated is the Laumond model [2], based on lab experiments. The same problem has been studied also in [14]. See also [22] for a model combining the Laumond and social force models.

2.2 Microscopic: Models for Animal Groups

As mentioned above, a parallel literature was developed for animal group dynamics. Let us briefly review the main ingredients of the models commonly used for animal groups, to point out the similarities with and the differences from social force models. We notice that such an approach has been applied to many different species, including fish, birds, mammals, and others. We refer the reader to [16] for a more extensive discussion.

Microscopic models for animal group dynamics are also based mainly on attraction and repulsion forces. One has to notice that models are either of Newtonian type, i.e., mimicking physical forces as in the social force model, or first order, i.e., prescribing directly the velocity of single animals. The modeling explanation of first-order models is based on the fact that animals (but also pedestrians) have a high capability of changing their speed quickly in many situations; thus, the control they exert on their motion tends to overcome physical forces. Regarding energy, even if theoretically it is possible to write an energy balance equation, the latter would have to take into account the internally stored energy of the animals and thus encompass different time scales and very complex energy processes. A complete discussion goes beyond the scope of this paper, but we think that both approaches do have merits. The role of attraction is much better understood and modeled in animal groups than in pedestrian crowds: it is based on advantages in foraging, mating, and escaping predators. Also, attraction is differentiated between group attraction, where attraction depends and acts only on distances and could enter first-order models, and alignment, where attraction acts on velocities and depends on distances, as usual for Newtonian models. Another feature is the fact that the presence of leaders is well discussed in the literature and models may or may not have leader(s). Also in this case, the biological explanations for the presence or absence of leaders appear to be well developed.

In biological literature there are two elements of key importance: the number of interacting members of the group and the shape of interaction zone. Models are classified as metric or topological. The former refers to interaction occurring with all mates present in the interaction zones (thus a variable number). The latter refers to interactions with a fixed number of mates (ordered, for instance, by distance). The interaction zone is usually different depending on the acting force (attraction, repulsion, and alignment) and not isotropic, to reflect the animal body and eyes’ positions (or position of other sensing organs).
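The distinction between metric and topological interactions can be sketched as two neighbor-selection rules. The function names below are hypothetical, and the interaction zone is taken to be an isotropic ball for simplicity, whereas the zones described above are generally anisotropic.

```python
import math

def metric_neighbors(i, positions, radius):
    """Indices of all agents within `radius` of agent i (variable count)."""
    return [j for j, p in enumerate(positions)
            if j != i and math.dist(positions[i], p) <= radius]

def topological_neighbors(i, positions, k):
    """Indices of the k agents closest to agent i (fixed count), nearest first."""
    others = [j for j in range(len(positions)) if j != i]
    others.sort(key=lambda j: math.dist(positions[i], positions[j]))
    return others[:k]
```

For an isolated agent the metric rule may return no neighbors at all, while the topological rule always returns k of them (if available), however far away they are.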

We notice that most social force models tend to consider all pedestrians to interact with each other. This is not realistic since pedestrians, as animals, tend to interact with closest neighbors or in restricted interaction zones. The assumption of all agents interacting renders the mean-field limit approach easier to manage (see Sect. 3), but formal limits are possible also with topological type models (see [26]).

2.3 Microscopic: Cucker-Smale Model

A special role in the literature is played by the well-known Cucker-Smale (CS) model [18]. Interestingly enough, many authors consider this model a prototype for alignment (thus considered mainly as an animal group model or even an aviation model [37]). However, the model was first introduced to study linguistic dynamics. The CS model has many similarities with the social force ones and has been the most studied in the applied mathematics community (see, for instance, [10, 12, 25] and references therein). The Cucker-Smale model reads:

$$\displaystyle \begin{aligned} \begin{cases} \dot{x}_i(t)&=v_i(t)\\ \dot{v}_i(t)&=\frac{1} {N}\sum_{j=1}^N a(\Vert x_{j}(t) - x_{i}(t)\Vert )(v_j(t)-v_i(t)), \end{cases} \qquad \qquad i=1,\ldots,N \end{aligned} $$
(5)

where \(x_i\in \mathbb {R}^d\), \(v_i\in \mathbb {R}^d\), and \(a \in C^1([0,+\infty))\) is a nonincreasing positive function, called interaction potential or rate of communication. In the original paper, the authors set \(a(s)=\frac {1}{(1+s^2)^\beta }\), with β > 0. Notice that the state of each agent is given by the pair \((x_i, v_i)\) and, as in alignment models, the dynamics promotes consensus on the variable \(v_i\).
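A minimal numerical sketch of system (5) follows, using explicit Euler time stepping and the rate of communication \(a(s)=(1+s^2)^{-\beta}\). The value β = 0.25, the step size, and the function names are assumptions chosen for illustration.

```python
import math

def communication_rate(s, beta=0.25):
    """a(s) = 1 / (1 + s^2)^beta."""
    return 1.0 / (1.0 + s * s) ** beta

def cs_step(xs, vs, dt, beta=0.25):
    """One explicit Euler step of system (5); states are lists of d-tuples."""
    n = len(xs)
    new_vs = []
    for i in range(n):
        acc = [0.0] * len(vs[i])
        for j in range(n):
            a = communication_rate(math.dist(xs[i], xs[j]), beta)
            for k in range(len(acc)):
                acc[k] += a * (vs[j][k] - vs[i][k]) / n
        new_vs.append(tuple(v + dt * f for v, f in zip(vs[i], acc)))
    # Positions are advanced with the velocities at the beginning of the step.
    new_xs = [tuple(x + dt * v for x, v in zip(xs[i], vs[i])) for i in range(n)]
    return new_xs, new_vs

def velocity_spread(vs):
    """Largest distance between two velocities; zero means full alignment."""
    return max(math.dist(u, w) for u in vs for w in vs)
```

Iterating `cs_step` on a small group, one can monitor `velocity_spread` to observe the alignment of velocities; since the weights are symmetric in i and j, the mean velocity is preserved by each step up to rounding.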

2.4 Multi-scale Models

We start here by introducing a multi-scale model based on time-evolving measures. The main idea is that a microscopic dynamics as well as a macroscopic one with nonlocal interactions can be included together in a single equation for a measure, which possesses an atomic part (representing the microscopic component) and an absolutely continuous part (representing the macroscopic component).

To deal with the general case, we will consider a measure μ which evolves in time according to a velocity field v. The system is then written as a single first-order equation but can easily encompass Newton-type models, as we will explain later on. Therefore, the main modeling aspect of the multi-scale model is the velocity field v, which has to account for the various “forces” described above. Once a velocity field is assigned, the evolution equation for a measure \(\mu_t = \mu(t)\) is formally written as:

$$\displaystyle \begin{aligned} \frac{\partial\mu_t}{\partial t}+\nabla\cdot{(\mu_t v)}=0,{} \end{aligned} $$
(6)

together with an initial condition \(\mu(0) = \mu_0\). The equation must be interpreted in the weak sense, i.e., for every ϕ smooth with compact support and almost every t we have:

$$\displaystyle \begin{aligned} \frac{d}{dt}\int_{\mathbb{R}^d}\phi(x)\,d\mu_t(x)=\int_{\mathbb{R}^d}v(t,\,x)\cdot\nabla{\phi}(x)\,d\mu_t(x), \end{aligned}$$

where we assume that the integral on the right-hand side is well defined, which amounts to integrability of v w.r.t. \(\mu_t\) uniformly in t, and the map \(t \mapsto \mu_t\) is continuous for the weak-∗ topology.
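The weak formulation can be checked numerically on an empirical measure \(\mu_t=\frac{1}{N}\sum_i \delta_{x_i(t)}\) whose atoms follow the characteristics of v. The sketch below is only an illustration under stated assumptions: the field v(x) = −x on the real line (chosen because its flow is explicit), the Gaussian test function used as a rapidly decaying stand-in for a compactly supported one, and all function names are choices made for this example.

```python
import math

def flow(x0, t):
    """Exact trajectory of dx/dt = v(x) for the sample field v(x) = -x."""
    return x0 * math.exp(-t)

def v(x):
    return -x

def phi(x):
    """Smooth, rapidly decaying test function."""
    return math.exp(-x * x)

def dphi(x):
    """Derivative of phi."""
    return -2.0 * x * math.exp(-x * x)

def integral_phi(points0, t):
    """Integral of phi against mu_t = (1/N) * sum_i delta_{x_i(t)}."""
    return sum(phi(flow(x0, t)) for x0 in points0) / len(points0)

def weak_form_residual(points0, t, h=1e-4):
    """Difference between d/dt of the integral (central difference)
    and the integral of v * phi' against mu_t."""
    lhs = (integral_phi(points0, t + h) - integral_phi(points0, t - h)) / (2.0 * h)
    rhs = sum(v(flow(x0, t)) * dphi(flow(x0, t)) for x0 in points0) / len(points0)
    return lhs - rhs
```

The residual vanishes up to the discretization error of the finite difference, which reflects the fact that empirical measures transported along the characteristics of v are weak solutions of (6).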

We now discuss possible choices for the velocity field v considering the general situation, i.e., v = v[μ]. As for the social force model, v must take into account a desired velocity \(v^d\) which depends only on the position x of the pedestrian. Such velocities are usually determined by a final destination and point toward it. If the pedestrian just followed the integral curves of \(v^d\), then she would reach the final destination avoiding obstacles. If other pedestrians are present in the environment, then we assume there is another velocity component \(v^i\), called interaction velocity, which corresponds to the tendency of avoiding more crowded zones. Clearly, \(v^i = v^i[\mu]\) because it depends on the position of other pedestrians, thus on the whole measure μ. The main mathematical question is the expected regularity of \(v^d\) and \(v^i\) for models reflecting the social force and other approaches. It is natural to assume that \(v^d\) is locally Lipschitz and locally bounded; thus trajectories of \(v^d\) exist and are unique. This automatically implies existence and uniqueness of weak solutions to (6) when \(v = v^d\) (see, for instance, [45]).

The regularity of \(v^i\) is more delicate. The main purpose of \(v^i\) is to model the attraction-repulsion with other pedestrians. For simplicity we limit ourselves to repulsion, which acts on areas close to the pedestrian position and is thus more problematic because of the possible presence of singularities. We assume that there exists a kernel function \(\eta :\, \mathbb {R}^d \rightarrow [ 0,+\infty )\), representing a weighted interaction potential with nearby pedestrians. A possibility is the following: define the center of mass of the crowd w.r.t. η by

$$\displaystyle \begin{aligned}x^*:=\frac{\int_{\mathbb{R}^d}y\,\eta(x-y)\,d\mu(y)}{\int_{\mathbb{R}^d}\eta(x-y)\,d\mu(y)},\end{aligned}$$

and set

$$\displaystyle \begin{aligned} \begin{array}{rcl} v^i\left[ \mu \right] (x):=(x-x^*) f\left( \int_{\mathbb{R}^d}\eta(x-y)\,d\mu(y) \right). {}\end{array} \end{aligned} $$
(7)

where f is a nondecreasing function. In simple words, the velocity field drives away from the weighted barycenter \(x^*\) with strength depending on the crowding. To avoid singularities, we set \(v^i\left [ \mu \right ] (x)=0\) when \(\int _{\mathbb {R}^d}\eta (x-y)\,d\mu (y)=0\).
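For an empirical measure \(\mu=\sum_k m_k\delta_{y_k}\), the integrals in (7) reduce to finite sums, which the following sketch evaluates. The specific kernel (a truncated quadratic profile, Lipschitz but not smooth), the default choice f(s) = s, and the function names are assumptions introduced for illustration.

```python
import math

def eta(z, radius=1.0):
    """Hypothetical kernel: (1 - |z|/radius)^2 inside the ball, 0 outside."""
    r = math.hypot(*z)
    return (1.0 - r / radius) ** 2 if r < radius else 0.0

def interaction_velocity(x, atoms, weights, f=lambda s: s, radius=1.0):
    """Evaluate (7) at x for mu = sum_k weights[k] * delta_{atoms[k]}."""
    kernel = [w * eta(tuple(a - b for a, b in zip(x, y)), radius)
              for y, w in zip(atoms, weights)]
    crowding = sum(kernel)               # integral of eta(x - y) d mu(y)
    if crowding == 0.0:
        return tuple(0.0 for _ in x)     # convention stated after (7)
    x_star = tuple(sum(k * y[c] for k, y in zip(kernel, atoms)) / crowding
                   for c in range(len(x)))
    return tuple((xc - sc) * f(crowding) for xc, sc in zip(x, x_star))
```

A pedestrian at the origin with a single neighbor to the right inside the kernel support is pushed to the left, while neighbors placed symmetrically around the pedestrian produce no net velocity, since the weighted barycenter then coincides with x.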

The main question we address now is the well-posedness of the transport equation with nonlocal velocity (6). Notice that \(v^i=v^i\left [ \mu \right ] \); therefore, the equation is nonlinear in μ. With this goal, we first introduce the main analytic tool to study such equations, that is, the Wasserstein distance. Then, we recall our main results of existence and uniqueness of solutions to (6). Finally some possible choices of velocities (7) for crowd models are presented, discussing the regularity of the corresponding transport equation.

2.4.1 The Wasserstein Distance

In this section, we briefly recall the definition and the key properties of the Wasserstein distance, referring to [45] for a complete overview. We first need to introduce a few concepts of general measure theory.

We denote by \(\mathcal {M}\) the set of positive Radon measures with finite mass. If \(\mu '\in \mathcal {M}\) is absolutely continuous with respect to \(\mu \in \mathcal {M}\), we write μ′≪ μ. If μ′≪ μ and μ′(A) ≤ μ(A) for all Borel sets A, we write μ′≤ μ. Given \(\mu \in \mathcal {M}\), we denote by \(|\mu |:=\mu (\mathbb {R}^d)\) its norm (or total mass). More generally, if \(\mu = \mu^+ - \mu^-\) is a signed Borel measure, we set \(|\mu| := |\mu^+| + |\mu^-|\). Such a norm defines a distance on \(\mathcal {M}\), namely |μ − ν|.

Given two positive measures μ, ν, one can always write in a unique way \(\mu = \mu_{ac} + \mu_s\) such that \(\mu_{ac} \ll \nu\) and \(\mu_s \perp \nu\), i.e., there exists B such that \(\mu_s(B) = 0\) and \(\nu (\mathbb {R}^d\setminus B)=0\). This is the Lebesgue decomposition theorem. Then, there exists a unique \(f \in L^1(d\nu)\) such that \(d\mu_{ac}(x) = f(x)\,d\nu(x)\). Such a function is called the Radon-Nikodym derivative of μ with respect to ν. We denote it by \(D_\nu \mu\) and we have \(|\mu _{ac}|=\int |{D_{\nu }\mu }|\,d\nu \). For more details, see, e.g., [21].

Given a Borel map \(\gamma :\, \mathbb {R}^d \rightarrow \mathbb {R}^d\), one can consider the following action on a measure \(\mu \in \mathcal {M}\), called the push-forward of measures:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \gamma\#\mu(A):=\mu(\gamma^{-1}(A)). \end{array} \end{aligned} $$

An evident property is that the mass of μ, i.e., \(\mu (\mathbb {R}^d)\), is identical to the mass of γ#μ. Then, given two measures μ, ν with the same mass, it is natural to seek a γ such that ν = γ#μ, in which case we say that γ sends μ to ν. One can add a cost by integrating over the distances covered by the masses moved by γ. More precisely, we define the cost of a map as:

$$\displaystyle \begin{aligned}I\left[ \gamma \right] :=|\mu|{}^{-1}\,\int_{\mathbb{R}^d} |x-\gamma(x)|{}^p \,d\mu(x).\end{aligned}$$

This means that each infinitesimal mass δμ is sent to δν and that its infinitesimal cost is related to the p-th power of the distance between them. The problem of finding a map γ that sends μ to ν and minimizes the cost I is known as the Monge problem and was first formulated in 1781. A minimizing γ exists only for special μ, ν and p. Indeed, there exist simple examples of μ, ν for which a γ sending μ to ν does not exist. For example, the measures \(\mu = 2\delta_1\) and \(\nu = \delta_0 + \delta_2\) on the real line have the same mass, but there exists no γ with ν = γ#μ. The main issue is that a map γ cannot separate masses.
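The obstruction can be made concrete for discrete measures, represented here as dictionaries mapping locations to masses (a representation and a function name chosen only for this sketch).

```python
from collections import defaultdict

def push_forward(measure, gamma):
    """Push-forward of a discrete measure {location: mass} by the map gamma."""
    image = defaultdict(float)
    for x, mass in measure.items():
        image[gamma(x)] += mass   # all the mass at x goes to the single point gamma(x)
    return dict(image)
```

Whatever the map γ, the push-forward of a measure with a single atom of mass 2 at the point 1 is again a single atom of mass 2, located at γ(1), so it can never equal a measure with two atoms of mass 1. Conversely, a constant map merges the two unit atoms at 0 and 2 into one atom of mass 2: maps can merge masses but not split them.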

One could resort to multifunctions, to send masses to different locations. However, we need to split mass in all possible ways, and this is naturally realized by a probability measure π on the product space \(\mathbb {R}^d\times \mathbb {R}^d\), seen as a generalization of a function mapping one measure onto the other. Each infinitesimal mass at a location x is sent to a location y with a probability given by π(x, y). Formally, π is “sending” the measure μ onto ν if the following holds:

$$\displaystyle \begin{aligned} \begin{array}{rcl} |\mu|\,\int_{\mathbb{R}^d} d\pi(x,\cdot)=d\mu(x),\qquad \qquad |\mu|\,\int_{\mathbb{R}^d} d\pi(\cdot,y)=d\nu(y). {}\end{array} \end{aligned} $$
(8)

Such a probability measure π is called a transference plan between μ and ν, and the set of transference plans between μ and ν is denoted by Π(μ, ν). The condition (8) is equivalent to requiring that the following equality holds for all \(f,g\in C^\infty _c(\mathbb {R}^d)\):

$$\displaystyle \begin{aligned}|\mu|\,\int_{\mathbb{R}^d\times \mathbb{R}^d} (f(x)+g(y))\,d\pi(x,y) = \int_{\mathbb{R}^d} f(x)\,d\mu(x)+ \int_{\mathbb{R}^d} g(y)\,d\nu(y).\end{aligned}$$

Following the same logic as for maps, one defines the cost of a transference plan π as

$$\displaystyle \begin{aligned}J\left[ \pi \right] :=\int_{\mathbb{R}^d\times\mathbb{R}^d} |x-y|{}^p \,d\pi(x,y).\end{aligned}$$

The problem of minimizing J over the set Π(μ, ν) is known as the Monge-Kantorovich problem. The Monge-Kantorovich problem is a generalization of the Monge one: given a γ with γ#μ = ν, a transference plan can be defined by \(\pi = |\mu|^{-1}\,(\mathrm{Id}\times\gamma)\#\mu\), which we denote by Id × γ for brevity. In other words, \(d\pi (x,y)=\mu (\mathbb {R}^d)^{-1}\, d\mu (x)\delta _{y=\gamma (x)}\). It is easy to check that \(J\left [ \mathrm {Id}\times \gamma \right ] =I\left [ \gamma \right ] \). Notice that there always exists a transference plan between \(\mu ,\nu \in \mathcal {M}\) with the same mass; indeed one can, e.g., choose \(\pi(A \times B) = |\mu|^{-2}\, \mu(A)\nu(B)\), which is a probability measure satisfying (8) since |μ| = |ν|; i.e., the mass from μ is split proportionally to the mass of ν.

The Monge-Kantorovich problem can be more generally stated on the space of Radon measures with finite p-moment, that is

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mathcal{M}^p:=\left\{ \mu\in\mathcal{M}\ |\ \int |x|{}^p\, d\mu(x)<\infty \right\}. \end{array} \end{aligned} $$

The minimum realizing the solution to the Monge-Kantorovich problem always exists in such spaces, when μ, ν have the same mass. Such minimum defines a distance on the set of measures of \(\mathcal {M}^p\) with a given mass, called the Wasserstein distance:

$$\displaystyle \begin{aligned} \begin{array}{rcl} W_p(\mu,\nu)=(|\mu|\,\min_{\pi\in\varPi(\mu,\nu)} J\left[ \pi \right] )^{1/p}. \end{array} \end{aligned} $$
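On the real line and for p ≥ 1, a standard fact is that an optimal plan between two measures made of the same number of atoms with equal masses is obtained by matching the sorted atoms. Under this assumption the Wasserstein distance as normalized above can be computed directly; the function name and the restriction to equal-mass atoms are choices made for this sketch.

```python
def wasserstein_1d(xs, ys, p=1.0, mass_per_atom=1.0):
    """W_p between sum_i m*delta_{xs[i]} and sum_i m*delta_{ys[i]} on the line.

    Assumes p >= 1 and len(xs) == len(ys); m is `mass_per_atom`.
    With the normalization of the text, |mu| * min J = m * sum_i |x_(i) - y_(i)|^p,
    where x_(i), y_(i) denote the sorted atoms.
    """
    if len(xs) != len(ys):
        raise ValueError("both measures must have the same number of atoms")
    total = sum(abs(x - y) ** p for x, y in zip(sorted(xs), sorted(ys)))
    return (mass_per_atom * total) ** (1.0 / p)
```

For the measures \(2\delta_1\) and \(\delta_0+\delta_2\) considered in the Monge example, written as `xs = [1.0, 1.0]` and `ys = [0.0, 2.0]`, each unit of mass travels a distance 1, so the function returns \(2^{1/p}\): no transport map exists, but the optimal plan splits the atom at 1 in two halves.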

The Wasserstein distance metrizes the topology of weak convergence under assumptions of bounds on the p-moments, namely, we have the following:

Proposition 1

The two following statements are equivalent for \(\mu _i,\mu \in \mathcal {M}^p(\mathbb {R}^d)\) :

  • \(\lim_{i\to\infty} W_p(\mu_i, \mu) = 0\);

  • \(\mu_i \rightharpoonup \mu\) and \(\lim_{R\to\infty}\limsup_{i\to\infty}\int_{|x|>R}|x|^p\,d\mu_i(x) = 0\).

We also notice that \(W_p(k\mu, k\nu) = k^{1/p}\, W_p(\mu, \nu)\) for k ≥ 0, by observing that \(\varPi(k\mu, k\nu) = \varPi(\mu, \nu)\) and that \(J\left [ \pi \right ] \) does not depend on the mass of the measures.

For future use, we recall an important duality property of the Wasserstein distance (for p = 1):

$$\displaystyle \begin{aligned} W_1(\mu,\nu)=\sup\left\{\int_{\mathbb{R}^d} f\,d(\mu-\nu)\ : f\in{\mathrm{Lip}}(\mathbb{R}^d,\mathbb{R}),\ Lip(f)\leq 1 \right\}, \end{aligned} $$
(9)

where \(\mathrm {Lip}(\mathbb {R}^d,\mathbb {R})\) is the space of globally Lipschitz functions and Lip(f) indicates the Lipschitz constant of f. The equality (9) is known as the Kantorovich-Rubinstein duality.
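The duality (9) can be illustrated on the real line for measures made of equal-mass atoms: every 1-Lipschitz function gives a lower bound on \(W_1\), and a suitable one attains it. In the sketch below, \(W_1\) is computed by matching sorted atoms (optimal in one dimension); the function names and the sample measures are assumptions made for this example.

```python
import math

def w1_atoms(xs, ys, mass_per_atom=1.0):
    """W_1 between equal-mass atomic measures on the line (sorted matching)."""
    return mass_per_atom * sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys)))

def dual_value(f, xs, ys, mass_per_atom=1.0):
    """Integral of f against (mu - nu) for the same atomic measures."""
    return mass_per_atom * (sum(f(x) for x in xs) - sum(f(y) for y in ys))

# mu = delta_0 + delta_3 and nu = delta_1 + delta_5: nu lies to the right of mu.
MU_ATOMS = [0.0, 3.0]
NU_ATOMS = [1.0, 5.0]

# A few 1-Lipschitz test functions; f(x) = -x attains the supremum here.
CANDIDATES = [lambda x: -x, abs, math.sin, lambda x: min(x, 2.0)]
```

Here `w1_atoms(MU_ATOMS, NU_ATOMS)` equals 3, and `dual_value` returns 3 for f(x) = −x and a smaller value for the other candidates, in agreement with (9).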

2.4.2 Existence and Uniqueness of Solutions to (6)

In this section, we recall results of existence and uniqueness of solutions to (6). From now on we focus, for simplicity, on the space \(\mathcal {P}\) of probability measures (positive Radon measures with mass equal to one) and the subspace \(\mathcal {P}_c\) of probability measures with compact support. The key idea is that the correct topology to deal with equations such as (6) is the one induced by the Wasserstein distance. More precisely, we will use the classical conditions on each vector field v[μ] of boundedness and Lipschitz continuity, while we will ask the map v[⋅] to be Lipschitz with respect to the Wasserstein distance and the usual \(C^0\) norm on vector fields.

Our main assumptions are the following: The function

$$\displaystyle \begin{aligned}{v\left[ \mu \right] } : \left\{ \begin{array}{ccl} {\mathcal{P}_c(\mathbb{R}^d)} & \rightarrow & {C^{1}(\mathbb{R}^d)\cap L^\infty(\mathbb{R}^d)} \\ {\mu} & \mapsto& {v\left[ \mu \right] } \end{array} \right.\end{aligned}$$

satisfies

  1. (H1)

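    \(v\left [ \mu \right ] \) is uniformly Lipschitz and uniformly bounded, i.e., there exist L, M not depending on μ, such that for all \(\mu \in \mathcal {P}_c(\mathbb {R}^d), x,y\in \mathbb {R}^d,\)

    $$\displaystyle \begin{aligned} \begin{array}{rcl} |v\left[ \mu \right] (x)-v\left[ \mu \right] (y)|\leq L\,|x-y|, \qquad |v\left[ \mu \right] (x)|\leq M. \end{array} \end{aligned} $$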

  2. (H2)

    \(v\left [ \cdot \right ] \) is a Lipschitz function of the measure, i.e., there exists K such that for all \(\mu ,\nu \in \mathcal {P}_c(\mathbb {R}^d)\)

    $$\displaystyle \begin{aligned} \begin{array}{rcl} \|v\left[ \mu \right] -v\left[ \nu \right] \|{}_{\mathrm{C^0}} \leq K \,W_p(\mu,\nu). \end{array} \end{aligned} $$

Under these assumptions the following holds:

Theorem 1

Assume that (H1)–(H2) hold true. Then for every \(\mu _0\in \mathcal {P}_c(\mathbb {R}^d)\), there exists a solution to (6). Moreover, given two solutions μ, ν of (6) in \(C([0,T],\mathcal {P}_c(\mathbb {R}^d))\), we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} W_p(\mu_t,\nu_t)\leq e^{2t (L+K)}W_p(\mu_0,\nu_0). \end{array} \end{aligned} $$

In particular, if \(\mu_0 = \nu_0\), then \(\mu_t = \nu_t\) for all t ∈ [0, T]; thus, uniqueness of solutions holds true.

Proof

See [38].

2.4.3 Regularity of Interaction Kernels

In view of Theorem 1, to ensure a well-posed theory for crowd dynamics, we need to investigate whether velocity models satisfy assumptions (H1) and (H2). As explained above, the regularity of the component \(v^d\) is quite standard; thus, we focus on the component \(v^i\), assuming that it is given in the form (7). We consider two cases: the first is given by f(x) ≡ 1, while the second is given by \(f(x) = x^\alpha\) with α ≥ 1. We show that with the first choice the assumptions (H1)–(H2) are not satisfied, while in the second case they are.

We start with the first case, so f ≡ 1. If \(v^i\) is nontrivial, i.e., if η is not vanishing everywhere, then the corresponding velocity field \(v\left [ \mu \right ] \) is not even continuous w.r.t. the Wasserstein distance. We construct a counterexample based on the graphical idea explained in Fig. 1.

Fig. 1 The velocity field \(v^i\) given by (7) is not continuous if f ≡ 1

We denote by \(B_s(y)\) the ball centered at y of radius s and by \(\overline {B}_s(y)\) its closure. Assume \(\mathrm {supp}\left ( \eta \right )\subset B_R(0)\) for R > 0. By continuity of η, the set \(A:=\left \{ \eta >0 \right \}\) is open. Take r > 0 (r < R) sufficiently small so that \(B_r(0) \subset A\) and there exists \(\tilde {x}\in A\backslash \overline {B}_r(0)\). Since A is open and \(\overline {B}_r(0)\) closed, there exists ε > 0 such that \(B_\varepsilon (\tilde x)\subset A\backslash B_r(0)\). Finally, we let C be a compact set with positive Lebesgue measure such that \(C\cap \overline {B}_R(0)=\emptyset \), so that η(−y) = 0 for all y ∈ C, and define \(s=\sup \left \{ |x-y|\,\mbox{ s.t. }\, x\in B_\varepsilon (\tilde x),\,y\in C \right \}\).

We now define a family of measures \(\mu_t\) that will provide a counterexample to continuity of \(v^i\). Set:

$$\displaystyle \begin{aligned} \mu_t:=\left( t \frac{{\chi}_{B_\varepsilon(\tilde x)}}{\mathcal{L}(B_\varepsilon(\tilde x))} + (1-t) \frac{{\chi}_{C}}{\mathcal{L}(C)} \right)\,\mathcal{L},\end{aligned} $$

where \(\mathcal {L}\) is the Lebesgue measure. From \(\int _{\mathbb {R}^d} \eta (-y)\, d\mu _0(y)=\frac {1}{\mathcal {L}(C)} \int _C 0\, d\mathcal {L}(y)=0\), we deduce

$$\displaystyle \begin{aligned} v\left[ \mu_0 \right] (0)=0.\end{aligned} $$
(10)

For t > 0, we have \(\int _{\mathbb {R}^d} \eta (-y) \, d\mu _t(y)>0\); hence:

$$\displaystyle \begin{aligned} \begin{array}{rcl}{} |v\left[ \mu_t \right] (0)|&\displaystyle =&\displaystyle \left| {\frac{\int_{\mathbb{R}^d} y\eta(-y)\, d\mu_t(y)}{\int_{\mathbb{R}^d} \eta(-y)\, d\mu_t(y)}} \right|=\left| \frac{ \frac{t}{\mathcal{L}(B_\varepsilon(\tilde x))} \int_{B_\varepsilon(\tilde x)} y\eta(-y)\, d\mathcal{L}(y)}{ \frac{t}{\mathcal{L}(B_\varepsilon(\tilde x))} \int_{B_\varepsilon(\tilde x)} \eta(-y)\, d\mathcal{L}(y)} \right|\geq\\ &\displaystyle \geq&\displaystyle \frac{\inf \left\{ |y|\, \mbox{ s.t. }y\in B_\varepsilon(\tilde x) \right\} \int_{B_\varepsilon(\tilde x)} \eta(-y)\, d\mathcal{L}(y)}{\int_{B_\varepsilon(\tilde x)} \eta(-y)\, d\mathcal{L}(y)}\geq r. \end{array} \end{aligned} $$
(11)

From (10) and (11), we have that \(v^i[\mu_t]\) is not continuous at t = 0. Then assumption (H2) will be violated if we prove that \(\mu_t\) is continuous at t = 0 w.r.t. the Wasserstein distance, i.e., if \(\lim_{t\to 0} W_p(\mu_0, \mu_t) = 0\). For this, we define \(\nu _t:=(1-t) \frac {{\chi }_{C}}{\mathcal {L}(C)}\,\mathcal {L}\); since the mass common to both measures can be left in place, \(W_p(\mu _0,\mu _t)\leq W_p(\mu _0-\nu _t,\mu _t-\nu _t) = W_p \left ( t \frac {{\chi }_{C}}{\mathcal {L}(C)}\mathcal {L},t \frac {{\chi }_{B_\varepsilon (\tilde x)}} {\mathcal {L}(B_\varepsilon (\tilde x))}\mathcal {L} \right )\). Since all measures are absolutely continuous w.r.t. \(\mathcal {L}\), there exists a map γ realizing the Wasserstein distance. Moreover, we can estimate |x − γ(x)|≤ s and then \(W_p(\mu_0, \mu_t) \leq s\,t^{1/p}\), proving the continuity of \(\mu_t\).
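The same mechanism can be observed numerically in a simplified one-dimensional variant of the construction, in which the two uniform densities are replaced by Dirac masses: \(\mu_t = t\,\delta_{\tilde x} + (1-t)\,\delta_c\), with \(\tilde x\) inside the support of η and c outside it. The kernel, the values \(\tilde x = 0.5\) and c = 3, and the function names are assumptions made for this sketch; the text above works with absolutely continuous measures.

```python
def eta(z):
    """Hypothetical kernel supported in (-1, 1): (1 - |z|)^2 inside, 0 outside."""
    return max(0.0, 1.0 - abs(z)) ** 2

def velocity_at_origin(t, x_tilde=0.5, c=3.0):
    """v[mu_t](0) from (7) with f = 1, for mu_t = t*delta_{x_tilde} + (1-t)*delta_c."""
    atoms = [(x_tilde, t), (c, 1.0 - t)]
    crowding = sum(w * eta(0.0 - y) for y, w in atoms)
    if crowding == 0.0:
        return 0.0                        # convention stated after (7)
    x_star = sum(w * y * eta(0.0 - y) for y, w in atoms) / crowding
    return 0.0 - x_star

def w1_to_mu0(t, x_tilde=0.5, c=3.0):
    """W_1(mu_0, mu_t): mu_0 is a single Dirac mass at c, so the plan is unique
    and only the mass t travels, from c to x_tilde."""
    return t * abs(c - x_tilde)
```

For every t > 0, however small, the velocity at the origin equals \(-\tilde x\), whereas it vanishes for t = 0; at the same time \(W_1(\mu_0,\mu_t)=t\,|c-\tilde x|\) tends to zero. The jump arises because, with f ≡ 1, the normalization by the crowding removes any dependence on the amount of mass inside the kernel support.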

Let us now pass to the case \(f(x)=x^\alpha\) with α ≥ 1. We have the following:

Proposition 2

Let \(v_i\) be defined by (7), where η is a smooth, positive function with bounded support. If \(f(x)=x^\alpha\), with α ≥ 1, then \(v_i\) satisfies (H1) and (H2).

Proof

We first prove that (H1) is satisfied. Notice that:

$$\displaystyle \begin{aligned} \left| {\int_{\mathbb{R}^d}\eta(x-y)\,d\mu(y)} \right|\leq |\eta|{}_\infty, \end{aligned}$$

thus we get \(|v\left [ \mu \right ] (x)| \leq R |\eta |{ }_\infty ^\alpha \), assuming supp(η) ⊂ B R(0). Moreover, indicating by L the Lipschitz constant of η, it holds:

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle |v\left[ \mu \right] (x)-v\left[ \mu \right] (z)| \leq |\eta|{}_\infty^{\alpha-1} R \int_{\mathbb{R}^d} |\eta(x-y)-\eta(z-y)|\,d\mu(y)\leq\\ &\displaystyle &\displaystyle \leq |\eta|{}_\infty^{\alpha-1} R L |x-z| \int_{\mathbb{R}^d} \,d\mu(y)=|\eta|{}_\infty^{\alpha-1} R L |x-z|, \end{array} \end{aligned} $$

thus v[μ] is bounded and Lipschitz continuous. Similarly, we prove (H2) with the following estimate:

$$\displaystyle \begin{aligned} |v\left[ \mu \right] (x)-v\left[ \nu \right] (x)|\leq |\eta|{}_\infty^{\alpha-1} \left| {\int_{\mathbb{R}^d}(x-y)\eta(x-y)\,d(\mu-\nu)(y)} \right|. \end{aligned}$$

Since the function φ(y) := (x − y)η(x − y) is Lipschitz continuous, using the Kantorovich-Rubinstein duality (9), we get

$$\displaystyle \begin{aligned}\|v\left[ \mu \right] -v\left[ \nu \right] \|{}_{C^0}\leq |\eta|{}_\infty^{\alpha-1} R L W_1(\mu,\nu).\end{aligned}$$

2.5 Wasserstein Distance and Total Variation Norm

The Wasserstein distance \(W_p\) is a natural distance since it metrizes (over compact sets) the weak* topology as dual of the space \(\mathcal {C}_0\) (closure of continuous functions with compact support for the uniform norm). On the other hand, one may consider the total variation norm over signed measures, \(\|\mu ^+-\mu ^-\|{ }_{TV}=\mu ^+(\mathbb {R}^d)+\mu ^-(\mathbb {R}^d)\), equal to \(\mu (\mathbb {R}^d)\) for positive measures, which corresponds to strong convergence. Mathematically, weak convergence is easier to achieve; however, there are also modeling reasons to prefer the Wasserstein distance. In this section we provide a comparison of the two metrics.

First, let us notice that the space \(\mathcal {M}\) can be endowed with many different distances (see, e.g., [43]). The total variation norm coincides with the L 1 distance for absolutely continuous measures:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \|\mu-\nu\|{}_{L^1}:= \int |\mu(x)-\nu(x)| dx. \end{array} \end{aligned} $$

Assume now that two crowd configurations in the ambient space are represented by the measures \(\mu _i=\frac {1}{N}\sum _{j=1}^N \delta _{x^i_j}\), i = 1, 2, j = 1, …, N. Then the Wasserstein distance is given by the minimum over all permutations σ : {1, …, N}→{1, …, N} of the quantity \(\frac {1}{N} \sum _j |x^1_j-x^2_{\sigma (j)}|\). Indeed, all possible ways to move the mass from μ 1 to μ 2 correspond to the maps between the points \(x^1_j\) and \(x^2_j\), which in turn can be represented by a permutation σ. Consider, for instance, the following situation in \(\mathbb {R}\): \(\mu _1=\frac {1}{2}\delta _0+\frac {1}{2}\delta _1\) and \(\mu _2=\frac {1}{2}\delta _{\epsilon }+\frac {1}{2}\delta _{1+\epsilon }\). In other words μ 1 is given by two pedestrians in position 0 and 1, while μ 2 by two pedestrians in position 𝜖 and 1 + 𝜖. The total variation distance verifies ∥μ 1 − μ 2∥ = 1, while the Wasserstein distance is W(μ 1, μ 2) = 𝜖. Clearly, if 𝜖 is small, the two configurations are close to each other. This is reflected in the Wasserstein distance but not in the total variation one.
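The two-pedestrian example can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `wasserstein_1_equal_atoms` and `tv_norm_atoms` are hypothetical, the first evaluates the permutation formula above by brute force (feasible for small N only), and the second evaluates \(\mu ^+(\mathbb {R})+\mu ^-(\mathbb {R})\) for the signed measure μ 1 − μ 2. Normalization conventions for the total variation differ by a factor 1∕2 in the literature; the point illustrated here is only that its value does not depend on 𝜖, while the Wasserstein distance does.

```python
from collections import Counter
from itertools import permutations


def wasserstein_1_equal_atoms(xs, ys):
    """W_1 between (1/N) sum_j delta_{xs[j]} and (1/N) sum_j delta_{ys[j]} on the
    real line, as the minimum over permutations (brute force, small N only)."""
    n = len(xs)
    assert n == len(ys)
    return min(
        sum(abs(x - ys[s]) for x, s in zip(xs, sigma)) / n
        for sigma in permutations(range(n))
    )


def tv_norm_atoms(xs, ys):
    """mu^+(R) + mu^-(R) for the signed measure mu = mu_1 - mu_2 (atoms of mass 1/N)."""
    n = len(xs)
    weights = Counter()
    for x in xs:
        weights[x] += 1.0 / n
    for y in ys:
        weights[y] -= 1.0 / n
    return sum(abs(w) for w in weights.values())


for eps in (0.1, 0.01):
    mu1 = [0.0, 1.0]              # pedestrians in positions 0 and 1
    mu2 = [eps, 1.0 + eps]        # pedestrians in positions eps and 1 + eps
    print(eps, wasserstein_1_equal_atoms(mu1, mu2), tv_norm_atoms(mu1, mu2))
```

The first printed distance shrinks with 𝜖, whereas the second stays the same for every 𝜖 > 0, since the two configurations share no atom.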

Besides the modeling reasons, the Wasserstein distance is preferable also for the uniqueness of solutions to transport equations. For instance, we may replace the assumption (H2) with the following:

(H3) The function v[⋅] satisfies for some K > 0:

    $$\displaystyle \begin{aligned} \begin{array}{rcl}\|v\left[ \mu \right] -v\left[ \nu \right] \|{}_{\mathrm{C^0}} \leq K \|\mu-\nu\|{}_{TV}. \end{array} \end{aligned} $$

It is possible to define a velocity field v that satisfies assumptions (H1) and (H3) but does not guarantee uniqueness of solutions to the Cauchy problem. The idea is depicted in Fig. 2 and is based on the lack of uniqueness in the classical example for ordinary differential equations: \(\dot x=\sqrt {x}\), x(0) = 0. We provide a sketch of the proof, referring the reader to [38] for details.

Fig. 2 (H3) does not guarantee uniqueness of the solution

Fix d = 2 and define a curve \(\nu_t\) in the space of probability measures as follows. The squares \(Q^i_t\) have sides of length \(s_i\) parallel to the coordinate axes, share a side, and have their upper sides on the line \(y = 1 + t^2\). The measure \(\nu_t\) is given by:

$$\displaystyle \begin{aligned} \nu_t:=\sum_{i=0}^\infty m_i \chi_{Q^i_t} \mathcal{L}, \end{aligned}$$

where \(m_i\) are positive and, as before, \(\mathcal {L}\) is the Lebesgue measure. We then define the velocity field by \(v\left [ \nu _t \right ] :=(0,2t)\). Choosing \(s_i := 4^{-i}\) and \(m_i=\frac {1}{2} 8^i\), one can prove that v satisfies (H3). Moreover, one can define v on the whole \(\mathcal {M}\). It is easy to show that the Cauchy problem with initial condition \(\nu_0\) has two solutions: \(\mu_1(t) \equiv \nu_0\) and \(\mu_2(t) = \nu_t\). We can also estimate \(W_p(\nu_t, \nu_s) = |t^2 - s^2|\); thus, v does not satisfy assumption (H2).
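As a consistency check (added here, not taken from [38]), reading the side lengths as \(s_i=4^{-i}\), the weights \(m_i=\frac {1}{2} 8^i\) make \(\nu_t\) a probability measure, since each square \(Q^i_t\) has area \(s_i^2\):

$$\displaystyle \begin{aligned} \nu_t(\mathbb{R}^2)=\sum_{i=0}^\infty m_i\, s_i^2=\frac{1}{2}\sum_{i=0}^\infty 8^i\, 16^{-i}=\frac{1}{2}\sum_{i=0}^\infty 2^{-i}=1. \end{aligned}$$

The failure of (H2) can be read off in the same way: the velocities differ by \(|v\left[ \nu_t \right]-v\left[ \nu_s \right]|=2|t-s|\), while the measures are at distance \(|t-s|\,(t+s)\), so the ratio 2∕(t + s) is unbounded as t, s → 0.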

3 Mean-Field Limits of Microscopic Models

In this section, we introduce the mean-field limit of microscopic models for crowd dynamics. The goal is to describe the dynamics of the crowd when the number of agents tends to infinity. As a result, the description of each agent is lost, and the crowd is then represented by a spatial density evolving in time.

3.1 Definition of the Mean-Field Limit

In this section, we recall the definition of the mean-field limit. Historically, the mean-field limit has been introduced as the limit of classical and quantum mechanical systems (see, e.g., [34] and references therein). In the case of crowd dynamics, some standard physical interaction laws are not satisfied (e.g., the action-reaction principle). We then use an approach less influenced by the physical intuition, following Neunzert in [35]. Even though his description focuses on ordinary differential equations of the second order, the method presented there can be applied verbatim to first-order systems.

Consider an ordinary differential equation describing the dynamics of N particles in the phase space \(\mathbb {R}^d\). In a very general form, it can be written as follows:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \dot x_i=f_N(x_i;x_1,\ldots,x_N),~~~i=1,\ldots,N.{} {}\end{array} \end{aligned} $$
(12)

Here, we highlight that the expression of the dynamics f N depends on the number N of particles. Then, one can see (12) as a family of ordinary differential equations indexed by N, each of them describing the dynamics of N particles in the phase space \(\mathbb {R}^d\), hence each of them describing a dynamics in the space \(\mathbb {R}^{dN}\).

Assume now that each f N satisfies some properties ensuring existence, uniqueness, and well-posedness of solutions to (12), e.g., the classical Lipschitz condition. Then, for each N and an initial data \(X_N^0=(x_1^0,\dots ,x_N^0)\), there exists a unique trajectory X N(t) = (x 1(t), …, x N(t)).

The goal of mean-field limit is to describe the limit of the trajectories of such systems when N tends to infinity. The first difficulty is that each trajectory, indexed by N, lives in a different space, that is, \(\mathbb {R}^{dN}\). One then needs to add the following key hypothesis: particles x i are identical or indistinguishable. As a consequence, an exchange of the particle x i with x j induces no change in the dynamics of the whole system, in the following sense: the trajectories of these two particles are exchanged, and the trajectories of the other particles are kept. Clearly, this requirement strongly restricts the set of possible dynamics f N. The most classical expressions are of the form

$$\displaystyle \begin{aligned} \begin{array}{rcl} f_N(x_i;x_1,\ldots,x_N)=f^0(x_i)+\frac{1}{N}\sum_{j=1}^N f^1(x_j-x_i). {}\end{array} \end{aligned} $$
(13)

Under such hypothesis of indistinguishability, one can then replace the trajectory X N(t) with its description in terms of measures, by introducing the empirical measure. Given X N(t) = (x 1(t), …, x N(t)), define the corresponding empirical measure as

$$\displaystyle \begin{aligned}\mu_N(t):=\frac{1}{N} \sum_{i=1}^N \delta_{x_i(t)}.\end{aligned} $$

Then, all trajectories μ N now evolve in the same space \(\mathcal {P}(\mathbb {R}^d)\) of probability measures defined on the phase space \(\mathbb {R}^d\) that does not depend on N anymore. Moreover, the space \(\mathcal {P}(\mathbb {R}^d)\) is naturally endowed with the topology of the weak convergence of measures, that is

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mu^i\rightharpoonup_{i\to\infty} \mu^*~~~\Leftrightarrow~~~\lim_{i\to\infty}\int f\,d\mu^i=\int f\,d\mu^*~~~ \mbox{for all}~~f\in C^\infty_c(\mathbb{R}^d). \end{array} \end{aligned} $$
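The role of indistinguishability can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions: the functions `f0` and `f1` are hypothetical choices for \(f^0\) and \(f^1\) in (13), the space dimension is one, the time discretization is explicit Euler, and the test function is smooth and bounded rather than compactly supported. Relabelling the particles permutes their trajectories but leaves the empirical measure, and hence every integral against it, unchanged.

```python
import math
import random


def f0(x):
    # Hypothetical choice of f^0: relaxation toward the origin.
    return -x


def f1(z):
    # Hypothetical interaction kernel f^1 (bounded and Lipschitz).
    return math.tanh(z)


def euler_step(xs, dt):
    """One explicit Euler step of (12)-(13) for N particles on the real line."""
    n = len(xs)
    return [x + dt * (f0(x) + sum(f1(y - x) for y in xs) / n) for x in xs]


def simulate(xs, dt=0.01, steps=200):
    for _ in range(steps):
        xs = euler_step(xs, dt)
    return xs


def integrate_against(xs, phi):
    """Integral of phi against the empirical measure (1/N) sum_i delta_{x_i}."""
    return sum(phi(x) for x in xs) / len(xs)


random.seed(0)
x0 = [random.uniform(-1.0, 1.0) for _ in range(20)]
x0_relabelled = x0[:]
random.shuffle(x0_relabelled)      # same particles, different labels

xt = simulate(x0)
xt_relabelled = simulate(x0_relabelled)


def phi(x):
    # Smooth bounded test function.
    return math.cos(3.0 * x)


# The two values agree up to floating-point rounding.
print(integrate_against(xt, phi), integrate_against(xt_relabelled, phi))
```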

In the space \(\mathcal {P}(\mathbb {R}^d)\), one can define dynamical systems too. For simplicity, we focus on the case that we will use most in the following: a Cauchy problem for measure with dynamics given by a transport equation, that is

$$\displaystyle \begin{aligned} \begin{array}{rcl} \begin{cases} \partial_t \mu +\nabla\cdot (V[\mu]\mu)=0,\\ \mu(0)=\mu^0. \end{cases} {}\end{array} \end{aligned} $$
(14)

Assume that V has properties ensuring existence and uniqueness of solutions to (14), such as (H1)–(H2) in Sect. 2.4.2. Denote with μ(t) the corresponding unique solution. In a more abstract setting, one can simply consider a functional (e.g., a semigroup) that associates to each initial state μ 0 a unique trajectory μ(t).

We are now ready to define the mean-field limit. We say that (14) is the mean-field limit of (12) if the following property holds:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mu_N(0)\rightharpoonup_{N\to\infty} \mu(0)~~~~~~\Rightarrow ~~~~~~\mu_N(t)\rightharpoonup_{N\to\infty} \mu(t) ~~~ \mbox{for all}~~t\geq 0. {}\end{array} \end{aligned} $$
(15)

A particular but relevant case for mean-field limits is the following. Assume that a measure dynamics of the form (14) has the following property:

(MF-N)::

When the initial data μ 0 is an empirical probability measure \(\mu ^0_N\) associated to an initial data \(X_N^0\) of N particles, then the dynamics (14) rewrites as the ordinary differential equation (12).

Hence, given an initial data \(X_N^0\) with N particles, the trajectory of (14) with initial data \(\mu ^0_N\) coincides with the empirical measure associated to the solution for (12) with initial data \(X_N^0\). This property is somehow stronger than (15) for empirical measures, since it requires identity of trajectories for each N, and not only convergence for N → ∞. Nevertheless, such property is often naturally imposed in crowd models with arbitrary N agents, e.g., by choosing dynamics of the form (13). Instead, some relevant physical models do not satisfy such property (see, e.g., [34, Sec. 1.5]).

Clearly, condition (MF-N) makes sense for empirical measures only. We then need to add a condition for all other measures to ensure that (14) is the mean-field limit of (12). The most natural one is to require the following continuity condition:

(C)::

The solution μ(t) to (14) is continuous with respect to the initial data μ 0.

Such property is somehow natural in crowd models that are written in terms of (14), since they are always approximated models of a crowd with a large but finite number of agents. Hence, continuous dependence is necessary to ensure that the behavior of the approximated model is sufficiently close to the real dynamics.

We now prove that (14) is the mean-field limit of (12), under the hypotheses (MF-N)-(C). Indeed, recall that the set of empirical probability measures is dense in \(\mathcal {P}(\mathbb {R}^d)\) endowed with the topology of weak convergence. Then, take any initial data μ(0) and a sequence of empirical measures \(\mu_N(0)\rightharpoonup_{N\to\infty}\mu(0)\) that exists by density. Observe now that, by (MF-N), in this specific case μ N(t) is both the empirical measure associated to the solution to (12), as in the definition (15) of mean-field limit, and the unique solution to (14). Then, (15) holds, since it is the continuity property (C) for the measure dynamics (14).

It is clear that proving (15) can be easier if there exists a metric that metrizes the weak topology of measures. This is the case of the Wasserstein distance (Sect. 2.4.1), under some restrictive hypotheses that usually hold for crowd models.

For crowd modeling, trajectories usually have a uniformly bounded support, e.g., when the initial measure has bounded support and velocities are bounded. Then, in our setting, Proposition 1 ensures that convergence in Wasserstein distance is equivalent to weak convergence of measures. Hence, we can restate (15) in terms of the Wasserstein distance: the dynamics (14) is the mean-field limit of (12) if the following property holds:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \lim_{N\to\infty}W_p(\mu_N(0),\mu(0))=0~~~\Rightarrow ~~~\lim_{N\to\infty}W_p(\mu_N(t),\mu(t))=0 ~~~ \mbox{for all}~~t\geq 0. \end{array} \end{aligned} $$

It is important to observe that such statement is a rewriting of (15) in the space \(\mathcal {P}_p(\mathbb {R}^d)\) only. In particular, we will see in the following Sect. 5 that the original definition (15) of mean-field limit makes sense also for models with varying mass, while the Wasserstein distance between two measures with different masses is undefined.
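The Wasserstein formulation of continuity with respect to the initial data can be observed on a toy example. The sketch below is an illustration under explicit assumptions, not the general proof: it uses the hypothetical linear kernel \(f^1(z)=z\) with \(f^0=0\) in (13), one space dimension, explicit Euler time stepping, and the fact that in one dimension the sorted matching is optimal for two empirical measures with the same number of equally weighted atoms. The helper names `w1_sorted` and `evolve` are introduced here only for illustration.

```python
import random


def w1_sorted(xs, ys):
    """W_1 between two empirical measures on the real line with the same number of
    equally weighted atoms; the monotone (sorted) matching is optimal in 1D."""
    assert len(xs) == len(ys)
    return sum(abs(p - q) for p, q in zip(sorted(xs), sorted(ys))) / len(xs)


def evolve(xs, dt=0.01, steps=300):
    """Explicit Euler for x_i' = (1/N) sum_j (x_j - x_i), i.e. (13) with
    f^0 = 0 and the hypothetical kernel f^1(z) = z."""
    for _ in range(steps):
        mean = sum(xs) / len(xs)
        xs = [x + dt * (mean - x) for x in xs]
    return xs


random.seed(1)
n = 200
xs0 = [random.gauss(0.0, 1.0) for _ in range(n)]
# A second initial datum, close to the first one in Wasserstein distance.
ys0 = [x + 0.05 * random.uniform(-1.0, 1.0) for x in xs0]

w_initial = w1_sorted(xs0, ys0)
w_final = w1_sorted(evolve(xs0), evolve(ys0))
print(w_initial, w_final)
```

For this particular contractive dynamics the distance at the final time does not exceed the initial one; for general Lipschitz dynamics one only expects a bound growing exponentially in time.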

3.2 The Mean-Field Limit of the Helbing-Molnár Model

In this section, we derive the mean-field limit of the Helbing-Molnár model of social forces recalled in Sect. 2.1. The idea is to follow the method described in Sect. 3.1: we first write a partial differential equation of the form (14) satisfying the property (C) of continuity with respect to the initial data. We then prove that the original Helbing-Molnár model (4) satisfies the property (MF-N), i.e., it is the rewriting of the partial differential equation when the initial data is an empirical measure.

We start by writing the measure μ = μ(t, x, v), that is, a time-varying probability measure in the space \(\mathcal {P}(\mathbb {R}^d\times \mathbb {R}^d)\), i.e., in the space of probability measures on the configurations (x, v), with position x and velocity v both in \(\mathbb {R}^d\). For the Helbing-Molnár model, one usually has d = 2 or 3, as recalled above.

We then write the partial differential equation for the mean-field limit of the Helbing-Molnár model, that is

$$\displaystyle \begin{aligned} \begin{array}{rcl} \begin{cases} \partial_t \mu +\nabla\cdot (V_{HM}[\mu]\mu)=0,\\ \mu(0)=\mu^0. \end{cases} {}\end{array} \end{aligned} $$
(16)

where the vector field for the Helbing-Molnár model is

$$\displaystyle \begin{aligned} \begin{array}{rcl} V_{HM}[\mu](x,v):=\left( \begin{array}{c} v\\ F_i(v)+(F_{int}\star\mu)(x,v)+F_e(x,v) \end{array} \right) {}\end{array} \end{aligned} $$
(17)

Here, the functions F i, F int, F e are defined in (1)–(2)–(3), respectively. We already observed that such functions are chosen to be globally Lipschitz, to ensure existence and uniqueness of the solutions to the ordinary differential equation (4) of the Helbing-Molnár model for all times. We now prove that the vector field (17) satisfies the hypotheses (H1)–(H2) of Theorem 1. More precisely, we will prove that V HM can be modified outside a sufficiently large compact set so as to have the same solutions to (16) and to satisfy (H1)–(H2).

It is clear that the three functions (x, v) → v, F i and F e are Lipschitz with respect to (x, v). Moreover, they are independent of μ. We now need to study the term F int ⋆ μ, for which it holds

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle |F_{int}\star \mu(x,v)-F_{int}\star \mu(y,w)|\leq\\ &\displaystyle &\displaystyle \int_{\mathbb{R}^{2d}}\left|F_{int}(x-\alpha,v-\beta)-F_{int}(y-\alpha,w-\beta)\right|d\mu(\alpha,\beta)\leq\\ &\displaystyle &\displaystyle \int_{\mathbb{R}^{2d}}L \left| (x-\alpha,v-\beta)-(y-\alpha,w-\beta)\right|d\mu(\alpha,\beta)=L|(x,v)-(y,w)|, \end{array} \end{aligned} $$

where L is the Lipschitz constant of F int. Moreover, it also holds

$$\displaystyle \begin{aligned} \begin{array}{rcl} &\displaystyle &\displaystyle |F_{int}\star \mu(x,v)-F_{int}\star \nu(x,v)|=\\ &\displaystyle &\displaystyle \left| {\int_{\mathbb{R}^{2d}}F_{int}(x-\alpha,v-\beta)d(\mu(\alpha,\beta)-\nu(\alpha,\beta))} \right|\leq L W_1(\mu,\nu),\end{array} \end{aligned} $$

where we used the Kantorovich-Rubinstein duality formula (9). Then, V  satisfies the first condition of (H1), as well as (H2) with p = 1.

We are now left to prove that V also satisfies the second condition of (H1), i.e., uniform boundedness. This is clearly false, e.g., since (x, v) → v is an unbounded function. Nevertheless, observe that V being uniformly Lipschitz implies that V has sublinear growth, in the following sense: there exists C > 0 such that supp(μ) ⊂ B R(0) implies \(|V[\mu](x,v)|\leq C(1+R)\) for all (x, v) ∈ B R(0). Indeed, observe that the following conditions hold:

  • (x, v) ∈supp(μ) ⊂ B R(0) implies |v|≤ R;

  • |F i(v)|≤ L|v| + |F i(0)|≤ LR + |F i(0)|;

  • |F e(x, v)|≤ L|(x, v)| + |F e(0, 0)|≤ LR + |F e(0, 0)|.

To prove boundedness of F int ⋆ μ, observe that it holds

$$\displaystyle \begin{aligned} \begin{array}{rcl} |F_{int}\,{\star}\,\mu(x,v)|&\displaystyle {\leq}&\displaystyle |F_{int}\,{\star}\,\mu(x,v){-}F_{int}\,{\star}\,\delta_0(x,v)|{+}|F_{int}{\star}\delta_0(x,v){-}F_{int}{\star}\delta_0(0,0)|{+}\\ &\displaystyle &\displaystyle |F_{int}\star \delta_0(0,0)|\leq LW_1(\mu,\delta_0)+L|(x,v)|+|F_{int}\star \delta_0(0,0)|\leq\\ &\displaystyle &\displaystyle 2LR+|F_{int}(0,0)|. \end{array} \end{aligned} $$

Here we used the fact that for a measure μ satisfying supp(μ) ⊂ B R(0), we have W 1(μ, δ 0) ≤ R, since transportation plans all have rays with length smaller than R. Then, choosing \(C:=2\max \left \{ 1+4L, |F_i(0)|+|F_e(0,0)|+|F_{int}(0,0)| \right \}\), one has sublinear growth of the vector field V, independent of the measure μ. Similar to classical techniques for ODEs, this implies that, when \(\mathrm {supp}(\mu _0)\subset B_R(0)\), then \(\mathrm {supp}(\mu (t,\cdot ,\cdot ))\subset B_{S(t)}(0)\) with \(S(t) = e^{Ct}(1 + R) - 1\).
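The ODE argument behind the expression of S(t) can be sketched as follows. This is a standard Grönwall-type computation added for convenience, where r(t) is a name introduced here for the radius of the smallest ball centered at 0 containing supp(μ(t, ⋅, ⋅)). Since the support is transported along the characteristics of V, and the speed is bounded by C(1 + r) on the ball of radius r, one has

$$\displaystyle \begin{aligned} 1+r(t)\leq 1+R+C\int_0^t \left(1+r(s)\right)\,ds \quad\Longrightarrow\quad 1+r(t)\leq e^{Ct}(1+R), \end{aligned}$$

that is, \(r(t)\leq e^{Ct}(1+R)-1=S(t)\).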

Let us now consider an initial compact set \(K\subset \mathbb {R}^d\times \mathbb {R}^d\) and a time T > 0. When supp(μ 0) ⊂ K ⊂ B R(0) for some R, then the solution satisfies supp(μ(t, ⋅, ⋅)) ⊂ B S(T)(0) for all t ∈ [0, T]. Choose now V[μ](x, v) coinciding with V HM for \(\mu \in \mathcal {P}(B_{S(T)}(0))\) and (x, v) ∈ B S(T)(0), and being bounded and Lipschitz with respect to (x, v) and μ outside of it: then, V satisfies both conditions of (H1), as well as (H2). Moreover, solutions of (16) with \(\mu _0\in \mathcal {P}(K)\) coincide with the ones where V HM is replaced by V. Then, we have existence and uniqueness of solutions to (16) when \(\mu _0\in \mathcal {P}(K)\). Since K is an arbitrary compact set, we have existence and uniqueness of solutions to (16) for any μ 0 with compact support. Moreover, continuity with respect to the initial data (i.e., condition (C)) is satisfied too.

We are now left to prove that (MF-N) is satisfied. A direct rewriting of (16) with \(\mu _0=\frac {1}{N} \sum _{i=1}^N \delta _{(x_i,v_i)}\) shows that it coincides with (4). Then, both conditions (C)-(MF-N) are satisfied; hence (16) is the mean-field limit of the Helbing-Molnár model (4).

Remark 1

A relevant particular case of the Helbing-Molnár model is given by the Cucker-Smale model for alignment that we introduced in Sect. 2.3. There, the functions F i, F e in (4) are identically zero, while the interaction term F ij(x i, x j, v i, v j) is given by a(∥x j − x i∥)(v j − v i).

Then, it is clear that, by choosing

$$\displaystyle \begin{aligned}F_{int}(x,v)=-a(|x|)v,\end{aligned}$$

the measure evolution (16) is the mean-field limit of the Cucker-Smale model (5). This result was already obtained with different techniques in [25]. A classical rewriting, splitting the differential operator in the (x, v) variables, is

$$\displaystyle \begin{aligned} \partial_t \mu +\langle v, \nabla_x \mu\rangle+\mathrm{div}_v \left( (F_{int}\star\mu)\, \mu \right)=0. \end{aligned}$$
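The identification of Remark 1 can be verified on a small example. The sketch below assumes one space dimension, a hypothetical communication rate `a`, and a 1∕N normalization of the Cucker-Smale interaction (the precise form of (5) is not restated in this section). It checks that the convolution of \(F_{int}(x,v)=-a(|x|)v\) with an empirical measure on the (x, v) space reproduces the alignment term.

```python
def a(r):
    # Hypothetical communication rate; any bounded Lipschitz choice would do.
    return 1.0 / (1.0 + r * r)


def F_int(x, v):
    """Interaction kernel of Remark 1 in one space dimension: -a(|x|) v."""
    return -a(abs(x)) * v


def convolution_at(x, v, atoms):
    """(F_int * mu)(x, v) for mu = (1/N) sum_j delta_{(x_j, v_j)}."""
    return sum(F_int(x - xj, v - vj) for xj, vj in atoms) / len(atoms)


def alignment_term(i, atoms):
    """(1/N) sum_j a(|x_j - x_i|) (v_j - v_i), the Cucker-Smale interaction."""
    xi, vi = atoms[i]
    return sum(a(abs(xj - xi)) * (vj - vi) for xj, vj in atoms) / len(atoms)


# Positions and velocities of four agents (arbitrary illustrative values).
atoms = [(0.0, 1.0), (0.5, -0.3), (2.0, 0.7), (-1.2, 0.1)]
for i, (xi, vi) in enumerate(atoms):
    print(convolution_at(xi, vi, atoms), alignment_term(i, atoms))
```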

4 Microscopic Models with Varying Mass

In this section, we introduce some microscopic models in which each agent has a mass that varies in time. In crowd models, the mass of an agent may represent his influence with respect to the rest of the crowd, such as leadership, reputation, or persuasion.

The key difficulty for these models, strongly correlated to the goal of building mean-field limits for them, is that we do not aim to label agents in different classes (such as leaders vs followers, possibly switching from one to another; see [1, 8, 20]), but to keep a form of homogeneity for them.

In this section, we describe a model of N agents, each of them represented by its position x i in the phase space and its mass m i. The dynamics for the crowd is given in the following form:

$$\displaystyle \begin{aligned} \begin{array}{rcl}{} \begin{cases} \dot x_i=V_0(x_i)+\sum_{j=1}^N m_j V_1(x_j-x_i),\\ \dot m_i=m_i(S_0(x_i)+\sum_{j=1}^N m_j S_1(x_j-x_i)), \end{cases}&\displaystyle ~~~~~~i=1,\ldots,N. \end{array} \end{aligned} $$
(18)

All the functions V 0, V 1, S 0, S 1 are required to be uniformly bounded and uniformly Lipschitz with respect to their variables, to ensure existence and uniqueness of the solution to the associated Cauchy problem. Moreover, we also require V 1(0) = S 1(0) = 0.

We highlight some key properties of the model (18). The first is that, in the first equation, the term m j is the weight of the interaction term V 1(x j − x i). In this sense, the mass m j plays the role of the influence of the j-th particle onto the i-th one.

The second, crucial property is that one can replace the i-th particle with position-mass (x i, m i) with two (or more) new particles (y 1, n 1), (y 2, n 2) that lie in the same position (y 1 = y 2 = x i) and whose total mass coincides with the initial one (n 1 + n 2 = m i). Indeed, consider on one side the trajectories of the N-particle system ((x 1, m 1), …, (x i, m i), …, (x N, m N)) satisfying (18) and starting from an initial data \(((x^0_1,m^0_1),\ldots ,(x^0_i,m^0_i),\ldots ,(x^0_N,m^0_N))\). On the other side, consider the trajectories of the (N + 1)-particle system

$$\displaystyle \begin{aligned}((\tilde x_1,\tilde m_1),\ldots,(\tilde x_{i-1},\tilde m_{i-1}),(y_1,n_1),(y_2,n_2),(\tilde x_{i+1},\tilde m_{i+1}),\ldots,(\tilde x_N,\tilde m_N))\end{aligned}$$

satisfying (18) and starting from an initial data

$$\displaystyle \begin{aligned}((\tilde x^0_1,\tilde m^0_1),\ldots,(\tilde x^0_{i-1},\tilde m^0_{i-1}),(y_1^0,n_1^0),(y_2^0,n_2^0),(\tilde x^0_{i+1},\tilde m^0_{i+1}),\ldots,(\tilde x^0_N,\tilde m^0_N))\end{aligned}$$

with the following properties:

  • it holds \(x^0_j=\tilde x^0_j\) and \(m_j^0=\tilde m_j^0\) for all j ≠ i;

  • it holds \(x^0_i=y^0_1=y^0_2\) and \(m_i^0=n_1^0+n_2^0\).

Then, the following identities hold true for all times \(t\in \mathbb {R}\):

  • the trajectories satisfy \((x_j(t),m_j(t))=(\tilde x_j(t),\tilde m_j(t))\) for all j ≠ i;

  • the trajectory of the i-th particle satisfies x i(t) = y 1(t) = y 2(t), while its mass satisfies m i(t) = n 1(t) + n 2(t).

The proof follows from a direct computation of the derivatives and from uniqueness of the solution to (18). Such property is instrumental for the mean-field limit. Indeed, one needs to preserve the property of indistinguishability of particles also for time-varying masses.
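The splitting property can also be observed numerically. The following sketch is illustrative only: the functions `V0`, `V1`, `S0`, `S1` are hypothetical bounded Lipschitz choices with V 1(0) = S 1(0) = 0, the space dimension is one, and (18) is discretized with explicit Euler steps. One particle of a three-particle system is split into two particles at the same position, and the two simulations are compared.

```python
import math


def V0(x):
    return -math.tanh(x)


def V1(z):
    return z * math.exp(-z * z)    # satisfies V1(0) = 0


def S0(x):
    return 0.1 * math.cos(x)


def S1(z):
    return 0.1 * math.sin(z)       # satisfies S1(0) = 0


def euler_step(state, dt):
    """One explicit Euler step of (18); state is a list of (position, mass) pairs."""
    new_state = []
    for xi, mi in state:
        velocity = V0(xi) + sum(mj * V1(xj - xi) for xj, mj in state)
        growth = S0(xi) + sum(mj * S1(xj - xi) for xj, mj in state)
        new_state.append((xi + dt * velocity, mi + dt * mi * growth))
    return new_state


def simulate(state, dt=0.01, steps=200):
    for _ in range(steps):
        state = euler_step(state, dt)
    return state


original = [(-1.0, 0.5), (0.3, 1.0), (1.5, 0.8)]
# Split the middle particle (0.3, 1.0) into two particles of masses 0.4 and 0.6.
split = [(-1.0, 0.5), (0.3, 0.4), (0.3, 0.6), (1.5, 0.8)]

final_original = simulate(original)
final_split = simulate(split)

x_i, m_i = final_original[1]
(y_1, n_1), (y_2, n_2) = final_split[1], final_split[2]
print(x_i, y_1, y_2)       # positions agree up to rounding
print(m_i, n_1 + n_2)      # the masses of the two pieces add up to m_i
```

The same identities hold for the Euler scheme step by step, which is why the agreement is up to rounding errors and not only up to the discretization error.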

Observe that the model (18) preserves positivity/negativity of the mass. In our interpretation of the mass as a degree of influence, one might accept the presence of negative influences. Moreover, the presence of negative masses would produce signed measures at the mean-field limit that can be efficiently treated with a generalization of the methods described here (see [41]).

A direct computation also shows that the total mass is not preserved in general, since \(\partial_t\left(\sum_i m_i\right)\neq 0\). To ensure such a property, it is sufficient to add a correction term in \(\dot m_i\), such as the rescaling term

$$\displaystyle \begin{aligned}-\frac{m_i}{\sum_{k=1}^N m_k}\left(\sum_{k=1}^N m_k\left(S_0(x_k)+\sum_{j=1}^N m_j S_1(x_j-x_k)\right)\right).\end{aligned}$$

Also in this case, in our setting there is no general reason to assume a constant total influence, nor is such a constraint necessary for the mean-field limit.
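The effect of the rescaling term can be checked with a short computation. The sketch below uses hypothetical source functions `S0` and `S1` and arbitrary positions and masses in one space dimension; it evaluates the right-hand side of the mass equation in (18) with and without the correction and sums it over the particles.

```python
import math


def S0(x):
    return 0.1 * math.cos(x)


def S1(z):
    return 0.1 * math.sin(z)


def mass_rates(state, corrected):
    """Right-hand side of the mass equation in (18) for each particle,
    optionally with the rescaling term added; state holds (position, mass) pairs."""
    growth = [S0(xi) + sum(mj * S1(xj - xi) for xj, mj in state) for xi, _ in state]
    rates = [mi * gi for (_, mi), gi in zip(state, growth)]
    if corrected:
        total_mass = sum(mi for _, mi in state)
        total_rate = sum(rates)
        rates = [ri - mi / total_mass * total_rate for ri, (_, mi) in zip(rates, state)]
    return rates


state = [(-1.0, 0.5), (0.3, 1.0), (1.5, 0.8)]
print(sum(mass_rates(state, corrected=False)))   # total mass rate without correction
print(sum(mass_rates(state, corrected=True)))    # vanishes up to rounding
```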

We now present a classical pedestrian evacuation problem: the simulation of the exit of a crowd from a room through a single large door. We compare two models. The first is the classical social force model, where the mass of each agent has a constant value for the whole simulation. In the second case, the mass (modeling influence) exponentially decreases when the agent exits the door. These two approaches model two different known behaviors of pedestrians: in the first, the agent wanders around the exit door (confused, trying to find help, or simply stopping in the proximity of the exit), while in the second he runs toward a far safe place (e.g., meeting point). Our simulations show that the average exit time is reduced by 8% when the second model is implemented. The maximal exit time is reduced even more, by 11%, in the second model.

Following the first-order model by Piccoli-Tosin [42], inspired by the Helbing-Molnár model, we describe the behavior of a single agent as follows. His velocity is the sum of two terms: first a desired velocity, which in our case is a unit vector pointing to the exit, and second, a term of repulsion from other agents, to model the tendency of avoiding overcrowded areas. In our simulation, the agent computes the barycenter of the mass of agents in a ball of radius 2 around himself and then moves in the opposite direction of such barycenter.

In Fig. 3 (left), we show three different times of the simulation with no variation of the mass: the initial random configuration of 200 agents, then an intermediate time T = 6 in which clusters appear, and finally time T = 16.4 in which the last agent exits the room. The average exit time is 8.075 s.

Fig. 3 Evolution of the microscopic model. Left: no mass reduction. Right: mass reduction

In Fig. 3 (right), we show the simulation in which the mass decreases exponentially when an agent exits the door. This is represented by the circle reducing its radius. The initial configuration coincides with the previous case. The simulation is shown at the intermediate time T = 6 and then at the time T = 14.6 in which the last agent exits the room. The average exit time is 7.415 s.
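The percentages quoted for the microscopic simulations follow from the reported average exit times (8.075 s and 7.415 s) and maximal exit times (16.4 and 14.6):

$$\displaystyle \begin{aligned} \frac{8.075-7.415}{8.075}\approx 0.082,\qquad \frac{16.4-14.6}{16.4}\approx 0.110, \end{aligned}$$

i.e., reductions of about 8% and 11%, respectively.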

5 Measure Dynamics for Mass-Varying Models

In this section, we describe the mean-field limit of microscopic models with varying mass introduced in Sect. 4. With this goal, we first recall the definition of the generalized Wasserstein distance that we introduced in [39, 40]. It will be the main analytical tool for the study of the mean-field limit. We will then write the limit measure dynamics and prove that it is the mean-field limit of the microscopic model. We finally recall the main well-posedness results for the limit PDE.

Other relevant distances between measures of different masses, further generalizing the one presented here, have been recently described (see [15, 28, 32]).

5.1 The Generalized Wasserstein Distance

We recall here the definition of the generalized Wasserstein distance \(W^{a,b}_{p}(\mu ,\nu )\). We first give a rough description of the idea. Imagine having three different admissible actions on μ, ν: either add/remove mass to μ or add/remove mass to ν or transport mass from μ to ν. The three techniques have their cost: add/remove mass has a unit cost a (in both cases) and transport of mass has the classic Wasserstein cost, multiplied by a fixed constant b. The distance is the minimal cost of a mix of such techniques.

From now on, we denote by \(\mathcal {M}\) the space of Borel measures with finite mass on \(\mathbb {R}^d\) and by \(\mathcal {M}_c\) its subset of measures with bounded mass and compact support. We now formally define the generalized Wasserstein distance.

Definition 1

Let a, b ∈ (0, ∞) and p ≥ 1 be fixed. The generalized Wasserstein distance is

$$\displaystyle \begin{aligned} \begin{array}{rcl} W^{a,b}_{p}(\mu,\nu)=\inf_{\scriptstyle \tilde\mu,\tilde\nu\in\mathcal{M}, |\tilde\mu|=|\tilde\nu|}\left( a|\mu-\tilde\mu|+a|\nu-\tilde\nu|+bW_p(\tilde\mu,\tilde\nu) \right). \end{array} \end{aligned} $$

Proposition 3

The operator \(W^{a,b}_p\) is a distance. Moreover, the infimum is always attained.
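To get a feeling for Definition 1, consider two weighted Dirac masses \(\mu = m_1\delta_x\) and \(\nu = m_2\delta_y\) on the real line with p = 1. The sketch below is a simplified illustration, not the general algorithm: it assumes that the competitors can be restricted to \(\tilde\mu = c\,\delta_x\), \(\tilde\nu = c\,\delta_y\) with 0 ≤ c ≤ min(m 1, m 2) (i.e., that mass is only removed), so in general the computed value should be read as an upper bound for the infimum. The function names are introduced here for illustration only.

```python
def generalized_w1_two_diracs(m1, x, m2, y, a=1.0, b=1.0, grid=1000):
    """Minimise a*|mu - mu~| + a*|nu - nu~| + b*W_1(mu~, nu~) over the restricted
    family mu~ = c*delta_x, nu~ = c*delta_y, 0 <= c <= min(m1, m2), on a grid in c."""
    c_max = min(m1, m2)
    best = float("inf")
    for k in range(grid + 1):
        c = c_max * k / grid
        # Removing mass costs a per unit; moving mass c from x to y costs b*c*|x - y|.
        cost = a * (m1 - c) + a * (m2 - c) + b * c * abs(x - y)
        best = min(best, cost)
    return best


def closed_form(m1, x, m2, y, a=1.0, b=1.0):
    """The cost is affine in c, so the minimum is attained at c = 0 or c = min(m1, m2)."""
    return a * abs(m1 - m2) + min(m1, m2) * min(2.0 * a, b * abs(x - y))


# Nearby atoms: transporting the common mass is cheaper than removing it.
print(generalized_w1_two_diracs(1.0, 0.0, 1.5, 0.3), closed_form(1.0, 0.0, 1.5, 0.3))
# Distant atoms: removing all the mass is cheaper than transporting it.
print(generalized_w1_two_diracs(1.0, 0.0, 1.5, 10.0), closed_form(1.0, 0.0, 1.5, 10.0))
```

In this restricted setting, transport is preferred exactly when b|x − y| < 2a, which reflects the trade-off between the two costs in the rough description above.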

We now observe that the generalized Wasserstein distance metrizes the weak convergence of measures, with the additional requirement of tightness.

Theorem 2

Let μ n be a sequence of measures in \(\mathcal {M}\) and \(\mu \in \mathcal {M}\) . Then, the two following statements are equivalent:

  • \(W^{a,b}_{p}(\mu _n,\mu )\to 0\) ;

  • \(\mu _n\rightharpoonup \mu \) and μ n is a tight sequence (i.e., for each ε > 0 there exists a compact set K ε such that \(\mu _n(\mathbb {R}^d\setminus K_\varepsilon )<\varepsilon \) for all n).

Proof

See [39, Thm. 3].

5.2 The Mean-Field Limit for Mass-Varying Models

In this section, we write a measure dynamics with varying mass and prove that it is the mean-field limit of the microscopic model (18) introduced in Sect. 4.

Consider the following dynamics of measures with a transport and a source term:

$$\displaystyle \begin{aligned} \begin{array}{rcl} \begin{cases} \partial_t\mu+\nabla\cdot(v\left[ \mu \right] \, \mu)=h\left[ \mu \right] ,\\ \mu_{|{}_{t=0}}=\mu_0. \end{cases} {} \end{array} \end{aligned} $$
(19)

We assume the following hypotheses about the functions v and h.

(H4) The function

$$\displaystyle \begin{aligned}{v\left[ \mu \right] } : \left\{ \begin{array}{ccl} {\mathcal{M}} & \rightarrow & {C^{1}(\mathbb{R}^d)\cap L^\infty(\mathbb{R}^d)} \\ {\mu} & \mapsto& {v\left[ \mu \right] } \end{array} \right.\end{aligned}$$

satisfies

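  • \(v\left [ \mu \right ] \) is uniformly Lipschitz and uniformly bounded, i.e., there exist L, M not depending on μ, such that for all \(\mu \in \mathcal {M}\) and \(x,y\in \mathbb {R}^d\), it holds:

    $$\displaystyle \begin{aligned} \begin{array}{rcl} |v\left[ \mu \right] (x)-v\left[ \mu \right] (y)|\leq L|x-y|,\qquad |v\left[ \mu \right] (x)|\leq M. \end{array} \end{aligned} $$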

  • v is a Lipschitz function, i.e., there exists N such that

    $$\displaystyle \begin{aligned} \begin{array}{rcl}\|v\left[ \mu \right] -v\left[ \nu \right] \|{}_{\mathrm{C^0}} \leq N W^{a,b}_{p}(\mu,\nu). \end{array} \end{aligned} $$

(H5) The function

$$\displaystyle \begin{aligned}{h\left[ \mu \right] } : \left\{ \begin{array}{ccl} {\mathcal{M}} & \rightarrow & {\mathcal{M}} \\ {\mu} & \mapsto& {h\left[ \mu \right] } \end{array} \right.\end{aligned}$$

satisfies

  • \(h\left [ \mu \right ] \) has uniformly bounded mass and support, i.e., there exist P, R such that

    $$\displaystyle \begin{aligned} \begin{array}{rcl} h\left[ \mu \right] (\mathbb{R}^d)\leq P,\qquad \mathrm{supp} \left( h\left[ \mu \right] \right)\subseteq B_R(0). \end{array} \end{aligned} $$
  • h is a Lipschitz function, i.e., there exists Q such that

    $$\displaystyle \begin{aligned} \begin{array}{rcl} W^{a,b}_{p}(h\left[ \mu \right] ,h\left[ \nu \right] ) \leq Q W^{a,b}_{p}(\mu,\nu). \end{array} \end{aligned} $$

Under such hypotheses, we proved in [39] the well-posedness of the Cauchy problem (19).

Theorem 3

Assume that (H4)–(H5) hold true. Then, for each initial measure with finite mass and compact support \(\mu _0\in \mathcal {M}_c\) , there exists a solution to (19) . Moreover, given μ, ν, two solutions of (19) in \(C([0,T],\mathcal {M}_c)\) , we have

$$\displaystyle \begin{aligned} \begin{array}{rcl} W^{a,b}_{p}(\mu_t,\nu_t)\leq e^{2t (L+(|\mu_0|+tP)N+Q)}W^{a,b}_{p}(\mu_0,\nu_0). \end{array} \end{aligned} $$

In particular, if \(\mu_0=\nu_0\), then \(\mu_t=\nu_t\) for all \(t\in [0,T]\); thus, uniqueness of solutions holds true.

Proof

See [39, Prop. 7 and Thm. 6].
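To get a feeling for the estimate in Theorem 3, the following minimal Python sketch evaluates the growth factor \(e^{2t (L+(|\mu_0|+tP)N+Q)}\) multiplying the initial distance. The function name, the argument names, and the sample constants are hypothetical and serve only as an illustration; the argument `N_lip` stands for the Lipschitz constant N of (H4).

```python
import math

def stability_factor(t, L, N_lip, Q, P, mass0):
    """Factor multiplying W(mu_0, nu_0) in the estimate of Theorem 3.

    L, N_lip : constants of (H4); P, Q : constants of (H5);
    mass0    : total mass |mu_0| of the initial measure.
    """
    return math.exp(2.0 * t * (L + (mass0 + t * P) * N_lip + Q))

# Hypothetical constants, chosen only for illustration.
factors = [stability_factor(t, L=1.0, N_lip=0.5, Q=0.2, P=0.1, mass0=1.0)
           for t in (0.0, 0.5, 1.0)]
```

The factor equals 1 at t = 0 and, since the exponent contains the term \(2t^2PN\), it grows faster than a pure exponential in t whenever P and N are positive.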

Remark 2

As already stated, the application to pedestrian dynamics also explains the choice of the basic assumptions (H4)–(H5), namely, that we deal with measures with bounded support.

Recall now the definition of mean-field limit given in Sect. 3: given a microscopic model with the associated time-dependent empirical measures \(\mu_N(t)\) and a macroscopic model with trajectories \(\mu(t)\), the macroscopic model is the mean-field limit of the microscopic one if

$$\displaystyle \begin{aligned} \begin{array}{rcl} \mu_N(0)\rightharpoonup_{N\to\infty} \mu(0)~~~~~~\Rightarrow ~~~~~~\mu_N(t)\rightharpoonup_{N\to\infty} \mu(t) ~~~ \mbox{for all}~~t\geq 0. \end{array} \end{aligned} $$

In this definition, the fact that both \(\mu_N(t)\) and \(\mu(t)\) have masses varying in time plays no role. Moreover, we can apply the methods described in Sect. 3 in the particular case in which the \(\mu_N(t)\) are already solutions of the macroscopic model. Then, (19) is the mean-field limit of (18) if the following properties hold:

(MF-N) When the initial datum \(\mu_0\) is an empirical measure \(\mu ^0_N\) associated with an initial datum \((X_N, M_N)_0\) of N particles, the dynamics (19) can be rewritten as the ordinary differential equation (18);

(C) The solution \(\mu(t)\) to (19) is continuous with respect to the initial datum \(\mu_0\).
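The weak convergence of empirical measures used in this definition can be illustrated numerically. The sketch below is not taken from the chapter: it assumes, purely for illustration, that the limit measure is the uniform probability measure on [0, 1], that the empirical measure has N equal atoms of mass 1/N at the midpoints of a uniform grid, and that convergence is tested against the single bounded continuous function cos.

```python
import math
import numpy as np

def empirical_integral(f, points, masses):
    """Integrate f against mu_N = sum_i masses[i] * delta_{points[i]}."""
    return float(np.sum(masses * f(points)))

def midpoint_atoms(n):
    """Hypothetical empirical measure: n atoms of mass 1/n at grid midpoints of [0, 1]."""
    points = (np.arange(n) + 0.5) / n
    masses = np.full(n, 1.0 / n)
    return points, masses

# Integral of cos against the uniform measure on [0, 1].
exact = math.sin(1.0)

# Error of the empirical integral for increasing numbers of atoms.
errors = []
for n in (10, 100, 1000):
    pts, ms = midpoint_atoms(n)
    errors.append(abs(empirical_integral(np.cos, pts, ms) - exact))
```

The entries of `errors` decrease as the number of atoms grows. A single test function only illustrates the notion; weak convergence requires the same behavior for every admissible test function.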

Property (C) holds in general for solutions to (19), according to Theorem 3. Then, we are left to find functionals \(v[\mu]\), \(h[\mu]\) such that (MF-N) holds. It is straightforward to prove that the mean-field limit of (18) is then given by

$$\displaystyle \begin{aligned} \begin{array}{rcl} v[\mu]=V_0+V_1\star \mu,\qquad \qquad h[\mu]=S_0\mu+S_1\star \mu. \end{array} \end{aligned} $$
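For an empirical measure \(\mu=\sum_j m_j\delta_{x_j}\), the convolution in the velocity functional reduces to a finite sum, \((V_1\star\mu)(x)=\sum_j m_j V_1(x-x_j)\). The following Python sketch evaluates \(V_0+V_1\star\mu\) on the atoms and advances positions and masses with an explicit Euler step. It is only an illustration under stated assumptions: the kernels `V0` and `V1`, the target point, the door located at `x[0] = 0`, and the mass law (exponential decay at rate `decay_rate` once an agent is outside the room, in the spirit of the second model described below) are hypothetical choices, not the ones used in the chapter or in [38, 40, 42].

```python
import numpy as np

TARGET = np.array([1.0, 0.0])  # hypothetical goal just beyond the door at x[0] = 0

def V0(x):
    """Hypothetical desired velocity: smooth, bounded field pointing to TARGET."""
    d = TARGET - x
    return d / np.sqrt(np.sum(d**2, axis=-1, keepdims=True) + 1e-2)

def V1(z):
    """Hypothetical repulsion kernel: smooth, bounded, Lipschitz, with V1(0) = 0."""
    return z * np.exp(-np.sum(z**2, axis=-1, keepdims=True))

def velocity(x, m):
    """Evaluate V0(x_i) + sum_j m_j V1(x_i - x_j) for every atom x_i."""
    diff = x[:, None, :] - x[None, :, :]          # diff[i, j] = x_i - x_j
    return V0(x) + np.einsum("j,ijd->id", m, V1(diff))

def simulate(x, m, dt, steps, decay_rate):
    """Explicit Euler for positions and masses (assumed mass law, see lead-in)."""
    x, m = x.copy(), m.copy()
    for _ in range(steps):
        outside = x[:, 0] > 0.0                   # agents that have left the room
        x_new = x + dt * velocity(x, m)
        m = m - dt * decay_rate * m * outside     # decay only outside the room
        x = x_new
    return x, m

rng = np.random.default_rng(0)
n_agents = 50
x0 = np.column_stack([rng.uniform(-2.0, -1.0, n_agents),
                      rng.uniform(-0.5, 0.5, n_agents)])
m0 = np.full(n_agents, 1.0 / n_agents)

x_const, m_const = simulate(x0, m0, dt=0.05, steps=100, decay_rate=0.0)
x_decay, m_decay = simulate(x0, m0, dt=0.05, steps=100, decay_rate=1.0)
```

With `decay_rate=0.0` the total mass is conserved, whereas with a positive rate it decreases once agents cross the door; the explicit step keeps masses positive as long as `dt * decay_rate < 1`.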

We end this section with simulations of the mean-field limit of the two pedestrian models presented in Sect. 4, which describe the dynamics of a pedestrian crowd exiting a room. For the details of the dynamics of each agent, we refer to the previous description. Here, we just recall that the mass (influence) of each agent can be treated in two different ways: either it is constant, or it decreases exponentially once the agent exits the room. Our simulations show that the average exit time can be reduced by 8.7% in the second model, and the maximal exit time by 8.1%.

We now show the dynamics of the mean-field limit for these two choices. The numerical method used to solve the nonlocal equation, with or without mass reduction, has been introduced and studied in [38, 40, 42]. In Fig. 4, the darker areas represent higher crowd density.

Fig. 4 Evolution of the mean-field model. Left: no mass reduction. Right: mass reduction

In Fig. 4 (left), we show three different times of the simulation with no variation of the mass: the initial random configuration of agents, then an intermediate time T = 6 at which concentration near the exit appears, and finally time T = 18 at which the whole crowd has exited the room. The average exit time is 8.9 s.

In Fig. 4 (right), we show the simulation in which the mass decreases exponentially once an agent exits through the door. The initial configuration coincides with that of the previous case. The simulation is then shown at two different times, T = 6 and T = 16.6, the latter being the time at which the last agent exits the room. The average exit time is 8.2 s.