1 Introduction

The general problem of approximating transport PDEs by the empirical measure associated with moving particles is quite classical in many contexts, such as particle physics and gravitation. We refer to cornerstone papers such as [19, 33, 35, 41] and to the review paper [22]. A prototypical variation of the pure transport PDE, which has gained considerable attention in recent decades, is the nonlocal transport-diffusion equation

$$\begin{aligned} \partial _t \rho = \mathrm {div}(\nabla a(\rho ) + \rho \nabla G*\rho ), \end{aligned}$$
(1)

where \(a=a(\rho )\) is a nonlinear diffusion function and G is a space-dependent kernel modelling nonlocal interactions. In the aforementioned contexts in particle physics and gravitation, G is typically a singular kernel, which makes the analysis of (1) quite challenging. A similar situation occurs in the study of the Keller-Segel model for chemotaxis, more precisely in its parabolic-elliptic version, see e.g. [4,5,6, 27]. A different situation arises e.g. in [3, 15, 43] in the analysis of mean-field models for granular media, in which G is typically a power law of the form \(G(x)=|x|^\alpha \) with \(\alpha >1\).

In the context of modern applications and real-world problems, equations of the form (1) naturally arise in the description of aggregation phenomena in population dynamics, see [7, 24, 32, 42]. In these works the nonlocal terms are coupled with a linear or nonlinear diffusion arising from stochastic noise, see [29, 34]. Clearly, the classical results in [25, 40] are also relevant in this context although less related from the methodological point of view.

Starting from the early 2000s, the theory of gradient flows in Wasserstein spaces developed in [2, 28, 36] became an important tool to provide well-posedness results for the class of models (1). Otto [36] first provided the seminal ideas leading to the formulation of the porous medium equation as a gradient flow in the Wasserstein sense, whereas [28] adapted the “minimising movement” idea by De Giorgi to the new metric framework. The case with nonlocal interactions was first studied in [15], which partly anticipated the results in [2] without providing the full metric framework but adapting the theory to the case of (1), whereas [2] addresses a more general theory of gradient flows in metric spaces. The result in [14] is also relevant in this context in that it made it possible to extend the theory to kernels G displaying a discontinuity of the gradient at the origin, both in the resolution of the JKO scheme [28] and in the proof of \(\lambda \)-convexity [31] of the related functional. Moreover, Carrillo et al. [14] also provide a finite-time blow-up result for solutions with data in the space of probability measures.

The role of \(\lambda \)-convexity of the functional \({\mathcal {F}}:{{{\mathcal {P}}_2({\mathbb {R}})}}\rightarrow {\mathbb {R}}\) defined by

$$\begin{aligned} \mu \mapsto {\mathcal {F}}[\mu ]=\int _{\mathbb {R}}G*\mu \, d\mu , \end{aligned}$$

(here \({{{\mathcal {P}}_2({\mathbb {R}})}}\) denotes the space of probability measures with finite second moment) is essential in order to prove a stability result for two solution curves \(\mu (t), \nu (t)\) (leading also to uniqueness of measure solutions) of the form

$$\begin{aligned} W_2(\mu (t),\nu (t))\le e^{-\lambda t}W_2(\mu (0),\nu (0)), \end{aligned}$$

which often implies as a byproduct a many-particle approximation result for the target equation (1). This is true both in the diffusion-free case and in the case with diffusion, see the recent [12].

The situation is more complicated in cases in which the functional lacks \(\lambda \)-convexity, which is typically the case when G has a singularity at the origin. Attractive singularities make the study of well-posedness quite challenging in \(L^p\) spaces. In this context, the result in [4] allows one to prove existence and uniqueness up to the blow-up time, or globally when the singularity is not too strong. The repulsive case is also challenging, especially if one wants to prove a many-particle approximation result, because a strong repulsion at the origin for G forces point particles to resolve into absolutely continuous measures. A quite thorough study of the many-particle approximation in the absence of diffusion and with “almost Newtonian” singular kernels (both attractive and repulsive) was provided in [10], based on a technique developed in [26]. In the case of \(a=0\), the result in [8] provides, so far, the only many-particle approximation result for (1) with G being the Newtonian potential, albeit in one space dimension (i.e. \(G(x)=\pm |x|\)). Such a result also explores the connection of (1) with a scalar conservation law satisfied by the cumulative distribution variable \(\int ^x \rho (y,t) dy\).

The diffusion-free case \(a=0\) often allows for a significant “reduction of complexity” of the PDE under consideration in that it often permits to approximate it by a set of deterministic particles, i.e. not subject to stochastic noise and simply obeying a system of ordinary differential equations. Obvious advantages are the possibility of approximating the density under consideration by a discrete set of Lagrangian trajectories (a feature of great impact in some applications such as traffic flow or pedestrian movements) and the availability of a new numerical “particle” method for the target PDE. In fact, recent contributions to the literature try to provide deterministic approximations to transport PDEs in the case with diffusion as well: see the classical [38] for one-dimensional linear diffusion, the result in [23] for one-dimensional nonlinear diffusion, and the results in [11] for multidimensional diffusion.

Recently, the specialised literature has displayed an increasing interest in systems of gradient flows, i.e. systems of more than one transport equation of the form (1), modelling the mutual interplay of more than one species of individuals. The case with diffusion has a very rich literature, in that it is quite challenging at the level of well-posedness due to the possibility of cross-diffusion effects. We refer for instance to the recent [17], which provides a general existence result for two-species gradient flows of functionals with cross-diffusion and nonlocal interaction terms.

As in the one-species case, the well-posedness and the stability in a Wasserstein gradient flow sense are closely related to the convergence of a deterministic particle approximation scheme. We mention in this context the result in [18], which makes it possible to prove singular behavior such as a total collapse of particles and the formation of clusters via stability in the Wasserstein gradient flow sense of [2]. The general (diffusion-free) system considered in [18] reads

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t\rho =\partial _x(\rho H_1'*\rho )+\partial _x(\rho K_1'*\eta ),\\ \partial _t\eta =\partial _x(\eta H_2'*\eta )+\partial _x(\eta K_2'*\rho ), \end{array}\right. } \end{aligned}$$
(2)

where the given potentials \(H_1, H_2, K_1, K_2\) are smooth enough and convex up to a quadratic perturbation.

The recent result in [13] extended the existence and uniqueness proven in [18] to the one-dimensional Newtonian case

$$\begin{aligned} H_1(x)=H_2(x)=-|x|\, , \qquad K_1(x)=K_2(x)=|x|, \end{aligned}$$
(3)

corresponding to a set of particles of two species, with mutual repulsion within the same species (self-repulsion, or intra-specific repulsion) and attraction between particles of opposite species (cross-attraction, or inter-specific attraction), the driving interaction kernels being multiples of the Newtonian potential. The result of [13] holds in one space dimension. In particular, in the case of absolutely continuous initial data \(\rho _0, \eta _0\), [13] proves global-in-time existence and uniqueness of solutions by posing system (2)–(3) as gradient flow of the interaction energy functional

$$\begin{aligned} {\mathcal {F}}(\rho ,\eta )=-\frac{1}{2}\int _{\mathbb {R}}N*\rho \,d\rho - \frac{1}{2}\int _{\mathbb {R}}N*\eta \,d\eta + \int _{\mathbb {R}}N*\eta \,d\rho , \end{aligned}$$
(4)

where

$$\begin{aligned} N(x):=|x|, \qquad x\in {\mathbb {R}}. \end{aligned}$$

When dealing with general measures as initial data, in particular Dirac deltas, the sub-differential of \({\mathcal {F}}\) may be empty (see [8]). Hence, in [13], global-in-time existence and uniqueness of solutions to system (2)–(3) is proven by (formally) rewriting the system in the pseudo-inverse formalism and by using the concept of gradient flows in Hilbert spaces à la Brézis, cf. [9]. With potentials \(H_1=H_2=-K_1=-K_2\) featuring a repulsive singularity of logarithmic type at the origin, system (2) has also been studied in the context of multi-sign systems (arising e.g. in semiconductor theory) and evolution models for dislocations in crystals, cf. [1, 21, 30].

In this paper we prove that the PDE system (2)–(3), namely

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t\rho =-\partial _x(\rho \partial _x |\cdot |*\rho )+\partial _x(\rho \partial _x |\cdot |*\eta ),\\ \partial _t\eta =-\partial _x(\eta \partial _x|\cdot |*\eta )+\partial _x(\eta \partial _x |\cdot |*\rho ), \end{array}\right. } \end{aligned}$$
(5)

can be obtained as the many-particle limit of the deterministic ODE system

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle {\dot{x}_i(t)=\sum _{x_k(t)\ne x_i(t)}m_k\mathrm {sign}(x_i(t)-x_k(t))-\sum _{y_k(t)\ne x_i(t)}n_k\mathrm {sign}(x_i(t)-y_k(t))},\\ \displaystyle {\dot{y}_j(t)=\sum _{y_k(t)\ne y_j(t)}n_k\mathrm {sign}(y_j(t)-y_k(t))-\sum _{x_k(t)\ne y_j(t)}m_k\mathrm {sign}(y_j(t)-x_k(t))}, \end{array}\right. } \end{aligned}$$
(6)

with equal masses, i.e., \(m_i=n_j=1/N\), for \(i=1,\ldots ,N\), and \(j=1,\ldots ,N\). In general, system (6) models the movement of N particles of each species, with (not necessarily equal) masses \(m_1,\ldots ,m_N\) for the x-species and \(n_1,\ldots ,n_N\) for the y-species, under the effect of repulsive Newtonian potentials for same-species interactions and attractive Newtonian potentials for cross-species interactions.
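The right-hand side of (6) is straightforward to implement for readers who wish to experiment with the dynamics. Below is a minimal Python sketch: the function names are ours, the convention \(\mathrm{sign}(0)=0\) encodes the restrictions \(x_k\ne x_i\), \(y_k\ne x_i\) in the sums of (6), and the explicit Euler step is only a heuristic away from collisions, where the rigorous dynamics is the one given by the gradient flow structure discussed below.

```python
def sign(z):
    # sign(0) = 0, so overlapping particles drop out of the sums,
    # matching the restrictions x_k != x_i and y_k != x_i in (6)
    return (z > 0) - (z < 0)

def velocities(x, y, m, n):
    """Right-hand side of the ODE system (6): Newtonian repulsion within
    each species, Newtonian attraction across species."""
    xdot = [sum(mk * sign(xi - xk) for xk, mk in zip(x, m))
            - sum(nk * sign(xi - yk) for yk, nk in zip(y, n)) for xi in x]
    ydot = [sum(nk * sign(yj - yk) for yk, nk in zip(y, n))
            - sum(mk * sign(yj - xk) for xk, mk in zip(x, m)) for yj in y]
    return xdot, ydot

def euler_step(x, y, m, n, dt):
    # naive explicit step; valid only between collision times
    xdot, ydot = velocities(x, y, m, n)
    return ([xi + dt * v for xi, v in zip(x, xdot)],
            [yj + dt * v for yj, v in zip(y, ydot)])
```

For instance, a single x-particle and a single y-particle of unit mass move towards each other with unit speed, while two x-particles with no y-species present repel each other.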

We stress that, unlike the associated scalar model studied in [8], particles in the ODE system (6) may overlap. When this happens, the right-hand side of (6) features a jump discontinuity, which brings additional difficulties. To bypass this problem and to better understand the dynamics of (6), we frame it rigorously as the (finite dimensional) gradient flow of the (convex, in a suitable metric sense) functional

$$\begin{aligned} -\frac{1}{2}\sum _{i,j}m_i m_j|x_i-x_j| - \frac{1}{2}\sum _{i,j}n_i n_j|y_i-y_j| + \sum _{i,j}m_i n_j |x_i-y_j|, \end{aligned}$$

in the convex cone \({\mathcal {C}}^N\times {\mathcal {C}}^N\) of ordered configurations

$$\begin{aligned} x_1\le x_2\le \ldots \le x_N\,,\qquad y_1\le y_2\le \ldots \le y_N. \end{aligned}$$

More precisely, among other issues:

  • We prove that the sub-differential of this functional is always non-empty for any given configuration in \({\mathcal {C}}^N\times {\mathcal {C}}^N\) (including overlapping of particles of opposite species).

  • We analyse collisions among particles (which are possible because, owing to the lack of regularity of the interaction potential, particles do not “slow down” as they approach one another) and prove that particles of the same species never collide. Moreover, we provide explicit necessary and sufficient conditions for particles of opposite species to cross each other.

  • We explore the case of initial overlapping of particles and provide the explicit solution to the corresponding particle system.

These properties are preparatory to prove the main result of this paper, which is the rigorous derivation of solutions to (5) with \(L^1\) initial data as many-particle limits of the empirical measures of the particle system (6).

More precisely, we consider a pair of nonnegative initial densities \(\rho _0, \eta _0 \in L^1\), both with unit mass. We approximate them via atomic measures by considering N particles, \(x_1,x_2,\ldots ,x_N\), of the first species and another N particles, \(y_1,\ldots ,y_N\), of the second species, with non-zero and equal masses, \(m_1=\cdots =m_N=1/N\) and \(n_1=\cdots =n_N=1/N\), respectively, such that \(\sum _{i=1}^Nm_i=\sum _{k=1}^Nn_k=1\). We let those particles evolve according to the ODE system (6). We then prove that the empirical measures

$$\begin{aligned} \rho ^N(t,x)=\frac{1}{N}\sum _{i=1}^{N}\delta _{x_i(t)}(x), \quad \text {and}\quad \eta ^N(t,x)=\frac{1}{N}\sum _{j=1}^{N}\delta _{y_j(t)}(x), \end{aligned}$$

converge to the unique gradient flow solution to (5) in a suitable distributional sense as \(N\rightarrow +\infty \).
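One natural way to produce such an atomic approximation of a given density is to place the i-th particle at the \((i-\tfrac{1}{2})/N\) quantile of its cumulative distribution function. The sketch below (the function name, the default search interval, and the tolerance are ours; the paper's precise construction is the one given in Sect. 5) computes these quantiles by bisection, i.e., it evaluates \(\inf \{x : F(x)\ge s\}\), which agrees with the pseudo-inverse introduced in Sect. 2 whenever F is continuous and strictly increasing.

```python
def atomize(cdf, N, lo=-1e6, hi=1e6, tol=1e-9):
    """Place N equal-mass atoms at the (i - 1/2)/N quantiles of a
    probability measure given through its CDF, via bisection on
    cdf(x) >= s.  Assumes the support lies inside (lo, hi)."""
    atoms = []
    for i in range(1, N + 1):
        s = (i - 0.5) / N
        a, b = lo, hi  # invariant: cdf(a) < s <= cdf(b)
        while b - a > tol:
            mid = 0.5 * (a + b)
            if cdf(mid) >= s:
                b = mid
            else:
                a = mid
        atoms.append(0.5 * (a + b))
    return atoms
```

For the uniform density on [0, 1], for example, the four atoms land (up to the bisection tolerance) at 1/8, 3/8, 5/8, 7/8.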

Such a result relies heavily on uniform estimates at the discrete level. In particular, we observe that in the two-species case weak compactness in the measure sense is by itself insufficient to obtain consistency in the limit, due to the cross-interaction terms. Indeed, since these terms cannot be symmetrised (unlike, for instance, in the one-species Keller-Segel model), weak \(L^1\) compactness is needed in this case. We shall explain this issue in detail in Sect. 5.

We emphasise that system (5) is not included in the theory of [18] since the interaction potential in the (repulsive) intraspecific parts of \({\mathcal {F}}\) is neither convex nor \(\lambda \)-convex, i.e., convex up to a quadratic perturbation. In one dimension this problem can be overcome as shown in [13]. Another difference from [18] is that the analysis in [13] implies that particle solutions are not gradient flow solutions to system (5). Thus, the mean-field limit cannot be treated via the stability result mentioned previously since the atoms of the empirical measure may diffuse instantaneously.

Finally, we observe that our result is of interest in the framework posed in [21] for two-species models for dislocations with logarithmic singular potentials, more precisely (2) with \(H_1=H_2=-K_1=-K_2\) having a singularity at the origin “not stronger” than a logarithmic one. In [21], the convergence of a discrete particle system to the corresponding PDE system is proven in arbitrary dimensions on the torus. The approximating particle scheme is based on a regularisation of the singular kernels. It is important to emphasise that our approach is fundamentally different in that it does not hinge on a regularisation argument. Instead, it relies on identifying the particle system as a gradient flow and on an in-depth treatment of particle-particle interactions. Not only are we able to circumvent the regularisation argument by taking into account particle collisions, but we also uncover and use the underlying gradient flow structure of the problem which, ultimately, provides existence and uniqueness. As a result we obtain a particle approximation of the system that is less restrictive in that it does not depend on the regularisation strength, albeit in one dimension and for a less singular interaction kernel. We expand upon this aspect in more detail in Remark 10.

The paper is organised as follows.

  • In Sect. 2 we present the proper setting for our problem, including our concept of weak measure solution for (5) in Definition 1, and provide some preliminary concepts related to one-dimensional optimal transport.

  • Section 3 is devoted to proving global-in-time existence and uniqueness of solutions to system (6), using the theory of gradient flows in Hilbert spaces. The main result of this section is the one in Lemma 5 proving that the sub-differential of the discrete functional is always non-empty in \({\mathcal {C}}^N\times {\mathcal {C}}^N\). The existence and uniqueness result in the discrete case is provided in Theorem 1.

  • Upon establishing well-posedness for the system of ODEs, we focus on some important properties of its solutions in Sect. 4. In Theorem 2 we provide explicit conditions describing the behavior of particles of opposite species after collision. In Theorem 3 we prove that particles of the same species can never collide. In Theorem 4 we provide an explicit solution to (6) in case the initial condition features a “cluster” of overlapping particles of the two species.

  • Finally, in Sect. 5 we prove our many-particle approximation result. We show that the empirical measure of the particle system (6) converges in a suitable sense to the unique gradient flow solution to (5). The main result is stated in Theorem 5. The basic estimates needed for the proof are provided in Propositions 4 and 5.

2 Preliminaries

Throughout the paper we denote by \({{{\mathcal {P}}_2({\mathbb {R}})}}\) the set of probability measures with finite second moment, i.e.,

$$\begin{aligned} {{{\mathcal {P}}_2({\mathbb {R}})}}=\left\{ \mu \in {{\mathcal {P}}}({\mathbb {R}}) \, |\, m_2(\mu )<+\infty \right\} , \text{ where } m_2(\mu )=\int _{{\mathbb {R}}}|x|^2\,d\mu (x). \end{aligned}$$

We use the symbol \({{{\mathcal {P}}_2^a({\mathbb {R}})}}\) to denote the set of measures in \({{{\mathcal {P}}_2({\mathbb {R}})}}\) which are absolutely continuous with respect to the Lebesgue measure, i.e., \({{\mathcal {P}}_2^a({\mathbb {R}})}={{\mathcal {P}}}({\mathbb {R}})\cap L^1((1+|x|^2)\,dx)\). Next, for any measure \(\mu \in {{\mathcal {P}}}({\mathbb {R}})\) and a Borel map \(T:{\mathbb {R}}\rightarrow {\mathbb {R}}\), we denote by \(\nu = T_{\#}\mu \) the push-forward of \(\mu \) through T, defined by

$$\begin{aligned} \nu (A)&=\mu (T^{-1}(A)),\qquad \qquad \ \ \text{ for } \text{ any } \text{ Borel } \text{ set }\ A\subset {\mathbb {R}},\\ \text{ or }\ \int _{\mathbb {R}}f(y)\,dT_{\#}\mu (y)&=\int _{\mathbb {R}}f(T(x))\,d\mu (x), \qquad \text{ for } \text{ any } \text{ measurable } \text{ function }\ f. \end{aligned}$$

Here, T is usually referred to as transport map pushing \(\mu \) to \(\nu \). Next, we equip the set \({{{\mathcal {P}}_2({\mathbb {R}})}}\) with the 2-Wasserstein distance, which is defined for any \(\mu ,\nu \in {{{\mathcal {P}}_2({\mathbb {R}})}}\) as

$$\begin{aligned} W_2(\mu ,\nu )=\left( \inf _{\gamma \in \varGamma (\mu ,\nu )}\int _{{\mathbb {R}}^2}|x-y|^2\, d\gamma (x,y)\right) ^{1/2}, \end{aligned}$$
(7)

where \(\varGamma (\mu ,\nu )\) is the class of transport plans between \(\mu \) and \(\nu \), that is,

$$\begin{aligned} \varGamma (\mu , \nu ):= \{ \gamma \in {{\mathcal {P}}}({\mathbb {R}}^2)\,|\, \pi ^1_{\#}\gamma = \mu , \,\pi ^2_{\#}\gamma = \nu \}, \end{aligned}$$

where \(\pi ^i:{\mathbb {R}}\times {\mathbb {R}}\rightarrow {\mathbb {R}}\), \(i=1,2\), denotes the projection operator on the \(i\mathrm {th}\) component of the product space \({\mathbb {R}}^2\). Setting \(\varGamma _0(\mu ,\nu )\) as the class of optimal plans, i.e., minimisers of (7), the (squared) Wasserstein distance can be written as

$$\begin{aligned} W_2^2(\mu ,\nu )=\int _{{\mathbb {R}}^2}|x-y|^2\,d\gamma (x,y), \end{aligned}$$

for any \(\gamma \in \varGamma _0(\mu ,\nu )\). The set \({{{\mathcal {P}}_2({\mathbb {R}})}}\) equipped with the 2-Wasserstein metric is a complete metric space which can be seen as a length space, see for instance [2, 39, 44, 45]. Since we are dealing with the evolution of two interacting species, we shall work on the product space \({{{\mathcal {P}}_2({\mathbb {R}})}}\times {{{\mathcal {P}}_2({\mathbb {R}})}}\) equipped with the 2-Wasserstein product distance defined via

$$\begin{aligned} {\mathcal {W}}_2^2(\gamma ,{\tilde{\gamma }})=W_2^2(\rho ,{\tilde{\rho }})+W_2^2(\eta ,{\tilde{\eta }}), \end{aligned}$$

for all \(\gamma =(\rho ,\eta ), {\tilde{\gamma }}=({\tilde{\rho }},\tilde{\eta })\) in \({{{\mathcal {P}}_2({\mathbb {R}})}}\times {{{\mathcal {P}}_2({\mathbb {R}})}}\). Now, let us introduce a crucial tool for the one-dimensional case. For a given \(\mu \in {{{\mathcal {P}}_2({\mathbb {R}})}}\) its cumulative distribution function is given by

$$\begin{aligned} F_\mu (x)=\mu ((-\infty ,x]). \end{aligned}$$
(8)

Since \(F_\mu \) is a non-decreasing, right-continuous function such that

$$\begin{aligned} \lim _{x\rightarrow -\infty } F_\mu (x) = 0, \quad \text{ and } \quad \lim _{x\rightarrow +\infty } F_\mu (x) = 1, \end{aligned}$$

we may define the pseudo-inverse function \(X_\mu \) associated to \(F_\mu \), by

$$\begin{aligned} X_\mu (s):=\inf \{x\in {\mathbb {R}}\,:\,F_\mu (x)>s\}, \end{aligned}$$
(9)

for any \(s\in (0,1)\). It is easy to see that \(X_\mu \) is right-continuous and non-decreasing as well. Having introduced the pseudo-inverse, let us now recall some of its important properties. First we notice that it is possible to pass from \(X_\mu \) to \(F_\mu \) as follows

$$\begin{aligned} F_\mu (x)=\int _0^1\mathbb {1}_{(-\infty ,x]}(X_\mu (s))\,ds=|\{X_\mu (s)\le x\}|. \end{aligned}$$
(10)

For any probability measure \(\mu \in {{{\mathcal {P}}_2({\mathbb {R}})}}\) and the pseudo-inverse, \(X_\mu \), associated to it, we have

$$\begin{aligned} \int _{\mathbb {R}}f(x)\,d\mu (x)=\int _0^1f(X_\mu (s))\,ds, \end{aligned}$$
(11)

for every bounded continuous function f. Moreover, for \(\mu ,\nu \in {{{\mathcal {P}}_2({\mathbb {R}})}}\), the Hoeffding-Fréchet theorem [37, Section 3.1] allows us to represent the 2-Wasserstein distance, \(W_2(\mu ,\nu )\), in terms of the associated pseudo-inverse functions via

$$\begin{aligned} W_2^2(\mu ,\nu )=\int _0^1|X_\mu (s)-X_\nu (s)|^2\,ds, \end{aligned}$$
(12)

since the optimal plan is given by \((X_\mu (s)\otimes X_\nu (s))_{\#}{\mathcal {L}}\), where \({\mathcal {L}}\) is the Lebesgue measure on the interval [0, 1], cf. also [16, 44]. We have seen that for every \(\mu \in {{{\mathcal {P}}_2({\mathbb {R}})}}\) we can construct a non-decreasing \(X_\mu \) according to (9), and by the change of variables formula (11) we also know that \(X_\mu \) is square integrable. Let us recall that this mapping is indeed a distance-preserving bijection between the space of probability measures with finite second moments and the convex cone of non-decreasing \(L^2\)-functions

$$\begin{aligned} {\mathcal {C}}:=\{f\in L^2(0,1)\,|\,f\ \text{ is } \text{ non-decreasing }\} \subset L^2(0,1). \end{aligned}$$
(13)

For \(p\ge 1\), let us also introduce the p-Wasserstein distance

$$\begin{aligned} W_p(\mu ,\nu )=\inf _{\gamma \in \varGamma (\mu ,\nu )} \left( \int _{{\mathbb {R}}^2}|x-y|^p\, d\gamma (x,y)\right) ^{1/p}, \end{aligned}$$
(14)

for any \(\mu ,\nu \in {{\mathcal {P}}}_p({\mathbb {R}}):=\{\mu \in {{\mathcal {P}}}({\mathbb {R}})\ |\ m_p(\mu ):=\int _{\mathbb {R}}|x|^p\,d\mu (x)<+\infty \}\). Since our problem is set in one space dimension, we have

$$\begin{aligned} W_p(\mu ,\nu )=\Vert X_\mu -X_\nu \Vert _{L^p([0,1])}. \end{aligned}$$
(15)

In the case \(p=1\) we also have

$$\begin{aligned} W_1(\mu ,\nu )=\Vert F_\mu - F_\nu \Vert _{L^1({\mathbb {R}})}. \end{aligned}$$

We refer to [2, 39, 44, 45] for further details. The p-Wasserstein product distance is given by

$$\begin{aligned} {\mathcal {W}}_p(\gamma ,{\tilde{\gamma }})=W_p(\rho ,{\tilde{\rho }})+W_p(\eta ,\tilde{\eta }), \end{aligned}$$

for all \(\gamma =(\rho ,\eta ), {\tilde{\gamma }}=({\tilde{\rho }},\tilde{\eta })\) in \({{\mathcal {P}}}_p({\mathbb {R}})\times {{\mathcal {P}}}_p({\mathbb {R}})\).
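For two equal-mass empirical measures with the same number of atoms, the pseudo-inverses in (15) are step functions, so the distance reduces to matching the atoms in sorted order. The following Python sketch (the function name is ours; it assumes both measures have N atoms of mass 1/N) makes this concrete:

```python
def wasserstein_p(xs, ys, p=2):
    """W_p between two empirical measures (1/N) sum_i delta_{xs[i]} and
    (1/N) sum_j delta_{ys[j]}: by (15) the pseudo-inverses are step
    functions, so the optimal coupling pairs the atoms in sorted order."""
    assert len(xs) == len(ys)
    xs, ys = sorted(xs), sorted(ys)
    return (sum(abs(a - b) ** p for a, b in zip(xs, ys)) / len(xs)) ** (1.0 / p)
```

In particular, for two Dirac masses \(\delta _a\) and \(\delta _b\) this returns \(|a-b|\) for every p, and the distance is insensitive to the labelling of the atoms.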

For all \((\mu ,\nu ) \in {{{\mathcal {P}}_2({\mathbb {R}})}}\times {{{\mathcal {P}}_2({\mathbb {R}})}}\), we define the interaction energy functional

$$\begin{aligned} {\mathcal {F}}(\mu ,\nu )= & {} -\frac{1}{2}\iint _{{\mathbb {R}}\times {\mathbb {R}}}|x-y|d\mu (y)d\mu (x)\\&-\frac{1}{2}\iint _{{\mathbb {R}}\times {\mathbb {R}}}|x-y|d\nu (y)d\nu (x)+\iint _{{\mathbb {R}}\times {\mathbb {R}}}|x-y|d\mu (x)d\nu (y). \end{aligned}$$

The following lemma is proven for absolutely continuous measures in [13]. Below, we extend it to general measures to keep the paper self-contained.

Lemma 1

For all \((\mu ,\nu ) \in {{{\mathcal {P}}_2({\mathbb {R}})}}\times {{{\mathcal {P}}_2({\mathbb {R}})}}\) we have

$$\begin{aligned} {\mathcal {F}}(\mu ,\nu )\ge 0. \end{aligned}$$

Proof

We first consider the case in which both \(\mu \) and \(\nu \) are absolutely continuous with respect to the Lebesgue measure and have continuous densities \(\rho \) and \(\eta \) respectively. In this case

$$\begin{aligned} {\mathcal {F}}(\mu ,\nu )&= -\frac{1}{2}\iint _{{\mathbb {R}}\times {\mathbb {R}}}|x-y|\left( \rho (x)\rho (y)+\eta (x)\eta (y)-\rho (x)\eta (y)-\rho (y)\eta (x)\right) dx dy\\&= -\frac{1}{2}\iint _{{\mathbb {R}}\times {\mathbb {R}}}|x-y|\sigma (x)\sigma (y) dx dy,\qquad \sigma =\rho -\eta . \end{aligned}$$

Recall that \(N(x)=|x|\) satisfies \(N'(x)=\mathrm {sign}(x)\) and \(N''=2\delta _0\) in \({\mathcal {D}}'\). Therefore

$$\begin{aligned} {\mathcal {F}}(\mu ,\nu )=-\frac{1}{2}\int _{{\mathbb {R}}}N*\sigma (x)\delta _0*\sigma (x) dx =-\frac{1}{4}\int _{{\mathbb {R}}}N*\sigma (x)N''*\sigma (x) dx. \end{aligned}$$

Integration by parts yields

$$\begin{aligned} {\mathcal {F}}(\mu ,\nu )=\frac{1}{4}\int _{\mathbb {R}}\left( N'*\sigma (x)\right) ^2 dx -\frac{1}{4}\left[ N*\sigma (x)N'*\sigma (x)\right] _{x=-\infty }^{x=+\infty }. \end{aligned}$$

Now, arguing as in the proof of [13, Lemma 3.7], the boundary term at infinity vanishes due to the fact that \(\rho \) and \(\eta \) have finite second moment and \(\sigma \) has zero average. Hence, \({\mathcal {F}}(\mu ,\nu )\ge 0\).

Let us now consider the general case \((\mu ,\nu )\in {{{\mathcal {P}}_2({\mathbb {R}})}}\times {{{\mathcal {P}}_2({\mathbb {R}})}}\). Assume there exists a pair \(({\bar{\mu }},{\bar{\nu }})\in {{{\mathcal {P}}_2({\mathbb {R}})}}\times {{{\mathcal {P}}_2({\mathbb {R}})}}\) such that \({\mathcal {F}}({\bar{\mu }},{\bar{\nu }})<0\). By density of \((C({\mathbb {R}})\cap {{{\mathcal {P}}_2({\mathbb {R}})}})^2\) in \({{{\mathcal {P}}_2({\mathbb {R}})}}^2\) with respect to the 2-Wasserstein distance, there exist sequences \(\rho _n, \eta _n\in C({\mathbb {R}})\) such that \(W_2({\bar{\mu }},\rho _n)\rightarrow 0\) and \(W_2({\bar{\nu }},\eta _n)\rightarrow 0\) as \(n\rightarrow +\infty \). Consequently, \(\rho _n\otimes \rho _n\rightarrow {\bar{\mu }}\otimes {\bar{\mu }}\) as \(n\rightarrow +\infty \) in the weak measure sense, as well as \(\eta _n\otimes \eta _n\rightarrow {\bar{\nu }}\otimes {\bar{\nu }}\) and \(\rho _n\otimes \eta _n\rightarrow {\bar{\mu }}\otimes {\bar{\nu }}\) as \(n\rightarrow +\infty \). Moreover, since \((x,y)\mapsto |x-y|\) has a sub-quadratic growth at infinity, by a standard cut-off argument, we have

$$\begin{aligned}&\iint _{{\mathbb {R}}\times {\mathbb {R}}}|x-y|d\rho _n(x)d\rho _n(y)\rightarrow \iint _{{\mathbb {R}}\times {\mathbb {R}}}|x-y|d{\bar{\mu }}(x)d{\bar{\mu }}(y),\\&\quad \iint _{{\mathbb {R}}\times {\mathbb {R}}}|x-y|d\eta _n(x)d\eta _n(y)\rightarrow \iint _{{\mathbb {R}}\times {\mathbb {R}}}|x-y|d{\bar{\nu }}(x)d{\bar{\nu }}(y),\\&\quad \iint _{{\mathbb {R}}\times {\mathbb {R}}}|x-y|d\rho _n(x)d\eta _n(y)\rightarrow \iint _{{\mathbb {R}}\times {\mathbb {R}}}|x-y|d{\bar{\mu }}(x)d{\bar{\nu }}(y), \end{aligned}$$

which implies

$$\begin{aligned} {\mathcal {F}}(\rho _n,\eta _n)\rightarrow {\mathcal {F}}({\bar{\mu }},{\bar{\nu }}). \end{aligned}$$

On the other hand, the previous case implies

$$\begin{aligned} {\mathcal {F}}(\rho _n,\eta _n)\ge 0, \qquad \hbox {for all}\, n, \end{aligned}$$

which contradicts the assumption \({\mathcal {F}}({\bar{\mu }},{\bar{\nu }})<0\). \(\square \)

We conclude this subsection by providing the rigorous concept of solution to the continuum model (5) to be used in the many-particle limit. Such a concept of solution only refers to the case of absolutely continuous initial data.

Definition 1

Let \((\rho _0,\eta _0)\in {{\mathcal {P}}_2^a({\mathbb {R}})}^2\). We say that the absolutely continuous curve \((\rho (\cdot ),\eta (\cdot ))\in C([0,+\infty )\,;\,{{\mathcal {P}}_2^a({\mathbb {R}})}^2)\) is a weak measure solution to (5) if, for all test functions \(\varphi ,\phi \in C^1_c([0,+\infty )\times {\mathbb {R}})\) we have

$$\begin{aligned} \begin{aligned} - \int _0^{+\infty }&\int _{\mathbb {R}}\rho (x,t)\varphi _t(x,t)\, dx\, dt - \int _{\mathbb {R}}\rho _0(x)\varphi (x,0) dx\\&\ = \int _0^{+\infty }\iint _{{\mathbb {R}}\times {\mathbb {R}}} \rho (x,t) \rho (y,t)\mathrm {sign}(x-y)\varphi _x(x,t)\,dy\, dx\, dt\\&\quad \ - \int _0^{+\infty }\iint _{{\mathbb {R}}\times {\mathbb {R}}}\rho (x,t) \mathrm {sign}(x-y)\eta (y,t)\varphi _x(x,t)\,dy \,dx\, dt\,, \end{aligned} \end{aligned}$$
(16a)

and

$$\begin{aligned} \begin{aligned} - \int _0^{+\infty }&\int _{\mathbb {R}}\eta (x,t)\phi _t(x,t)\, dx\, dt - \int _{\mathbb {R}}\eta _0(x)\phi (x,0)\, dx \\&\ = \int _0^{+\infty }\iint _{{\mathbb {R}}\times {\mathbb {R}}} \eta (x,t) \eta (y,t)\mathrm {sign}(x-y)\phi _x(x,t)\,dy\, dx\, dt\\&\quad \ - \int _0^{+\infty }\iint _{{\mathbb {R}}\times {\mathbb {R}}}\eta (x,t) \mathrm {sign}(x-y)\rho (y,t)\phi _x(x,t)\,dy\, dx dt. \end{aligned} \end{aligned}$$
(16b)

Remark 1

The existence and uniqueness of solutions according to Definition 1 follow from the existence and uniqueness of gradient flow solutions proven in [13], arguing as in [2, Theorem 11.2.8].

3 Discrete gradient flow

In this section we pose system (6) as gradient flow of a suitable discrete interaction energy functional. We shall denote by \(x=(x_1,\ldots ,x_N)\) and \(y=(y_1,\ldots ,y_N)\) the vectors corresponding to the particles of the two different species, where each particle \(x_i\) has mass \(m_i\in {\mathbb {R}}_+\) and \(y_j\) has mass \(n_j\in {\mathbb {R}}_+\), for all \(i,j\in \{1,2,\ldots ,N\}\). Throughout, we shall work in the Hilbert space \(({{\mathbb {R}}^{N}}\times {{\mathbb {R}}^{N}},\langle \cdot ,\cdot \rangle _w)\), with the weighted scalar product defined by

$$\begin{aligned} \langle Z^1,Z^2\rangle _{w} := \sum _{i=1}^N m_ix_i^1 x_i^2+\sum _{j=1}^N n_j y_j^1 y_j^2, \end{aligned}$$

where \(Z^1=(x^1,y^1),\ Z^2=(x^2,y^2)\in {{\mathbb {R}}^{N}}\times {{\mathbb {R}}^{N}}\). We drop the w-subscript in the definition of the weighted norm

$$\begin{aligned} \Vert Z\Vert ^2=\langle Z,Z\rangle _w. \end{aligned}$$

Since the problem is posed in one spatial dimension, we may label the particles such that they are monotonically ordered. Therefore, up to possibly relabelling, we may restrict the evolution to the convex cone \({\mathcal {C}}^N\times {\mathcal {C}}^N\), with

$$\begin{aligned} {\mathcal {C}}^N:=\left\{ x\in {{\mathbb {R}}^{N}}:x_1\le x_2\le \cdots \le x_N\right\} . \end{aligned}$$

Since the analysis below requires a careful treatment of all particles of the two species, it is useful to introduce the following index notation that provides a precise way to label particles depending on the particles’ relative location.

Definition 2

(Index Notation) Given \(Z=(x,y)\in {\mathcal {C}}^N\times {\mathcal {C}}^N\), for all \(i \in \{1,\ldots ,N\}\) we define

$$\begin{aligned}&\sigma [x_i]=\left\{ k\in \{1,\ldots ,N\}\,:\,\,x_k=x_i\right\} ,\qquad \gamma [x_i]=\left\{ j\in \{1,\ldots ,N\}\,:\,\, y_j=x_i\right\} ,\\&\sigma ^+[x_i]=\left\{ k\in \{1,\ldots ,N\}\,:\,\,x_k>x_i\right\} ,\qquad \gamma ^+[x_i]=\left\{ j\in \{1,\ldots ,N\}\,:\,\, y_j>x_i\right\} ,\\&\sigma ^-[x_i]=\left\{ k\in \{1,\ldots ,N\}\,:\,\,x_k<x_i\right\} ,\qquad \gamma ^-[x_i]=\left\{ j\in \{1,\ldots ,N\}\,:\,\, y_j<x_i\right\} , \end{aligned}$$

and for all \(j\in \{1,\ldots , N\}\) we define

$$\begin{aligned}&\sigma [y_j]=\left\{ h\in \{1,\ldots ,N\}\,:\,\, y_h=y_j\right\} \,,\qquad \gamma [y_j]=\left\{ i\in \{1,\ldots ,N\}\,:\,\, x_i=y_j\right\} ,\\&\sigma ^+[y_j]=\left\{ h\in \{1,\ldots ,N\}\,:\,\, y_h>y_j\right\} \,,\qquad \gamma ^+[y_j]=\left\{ i\in \{1,\ldots ,N\}\,:\,\, x_i>y_j\right\} ,\\&\sigma ^-[y_j]=\left\{ h\in \{1,\ldots ,N\}\,:\,\, y_h<y_j\right\} \,,\qquad \gamma ^-[y_j]=\left\{ i\in \{1,\ldots ,N\}\,:\,\, x_i<y_j\right\} . \end{aligned}$$

Remark 2

(Index Notation) Clearly, some of the sets \(\sigma [x_i]\) and \(\sigma [y_j]\) may be singletons. If a set \(\sigma [x_i]\) contains more than one index, it means that the particle configuration Z contains an x-cluster, i.e., a group of colliding particles of the x-species. Similarly, some of the sets \(\gamma [x_i]\) and \(\gamma [y_j]\) may be empty. A non-empty \(\gamma [x_i]\) implies that there are particles of the y-species attached to \(x_i\) in the Z configuration. Moreover, note that the sets \(\sigma ^-[x_1]\), \(\sigma ^+[x_N]\), \(\sigma ^-[y_1]\), \(\sigma ^+[y_N]\) are always empty.
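The index sets of Definition 2 are elementary to compute. The following Python sketch (the function name is ours, and indices are 0-based rather than the paper's 1-based convention) may help fix the notation:

```python
def index_sets(x, y, i):
    """Index sets of Definition 2 for the particle x_i (0-based indices).
    sigma-sets collect same-species indices, gamma-sets cross-species ones."""
    xi = x[i]
    sigma       = {k for k, xk in enumerate(x) if xk == xi}
    sigma_plus  = {k for k, xk in enumerate(x) if xk > xi}
    sigma_minus = {k for k, xk in enumerate(x) if xk < xi}
    gamma       = {j for j, yj in enumerate(y) if yj == xi}
    gamma_plus  = {j for j, yj in enumerate(y) if yj > xi}
    gamma_minus = {j for j, yj in enumerate(y) if yj < xi}
    return sigma, sigma_minus, sigma_plus, gamma, gamma_minus, gamma_plus
```

For the ordered configuration \(x=(0,1,1)\), \(y=(1,2)\) and \(i=1\), the particle \(x_1\) belongs to an x-cluster \(\{1,2\}\) with the y-particle \(y_0\) attached to it, illustrating Remark 2.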

Remark 3

(Discrete Fubini) Throughout the main body we need to rearrange sums over indices of the particles involved in the dynamics. It is useful to highlight the following equality of index sets:

$$\begin{aligned} \left\{ (i,j)\; \big | \; i< j \le N,\; i = 1, \ldots , N \right\} = \left\{ (i,j)\; \big | \; 1 \le i < j,\; j = 1, \ldots , N \right\} , \end{aligned}$$

and therefore

$$\begin{aligned} \sum _{i=1}^N \sum _{j> i} Q_{i,j} = \sum _{j=1}^N \sum _{i < j} Q_{i,j} = \sum _{i=1}^N \sum _{j < i} Q_{j,i}, \end{aligned}$$
(17)

for any quantity \(Q\in {\mathbb {R}}^{N\times N}\). Note that the first equality holds due to Fubini and the second one is due to swapping the roles of i and j.
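The rearrangements in (17) amount to summing the same strictly ordered index pairs \(i<j\) in different orders; a quick numerical sanity check (illustrative only, with a randomly filled array) is the following.

```python
# Numerical sanity check of the discrete Fubini rearrangement (illustrative,
# not from the paper): all three sums run over the strictly ordered pairs i < j.
import random

random.seed(0)
N = 6
Q = [[random.random() for _ in range(N)] for _ in range(N)]

s1 = sum(Q[i][j] for i in range(N) for j in range(i + 1, N))  # sum_i sum_{j>i} Q_{i,j}
s2 = sum(Q[i][j] for j in range(N) for i in range(j))         # sum_j sum_{i<j} Q_{i,j}
s3 = sum(Q[j][i] for i in range(N) for j in range(i))         # relabelled: sum_i sum_{j<i} Q_{j,i}

assert abs(s1 - s2) < 1e-12 and abs(s1 - s3) < 1e-12
```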

Lemma 2

(Properties of Index Notation-I) For a given distribution of particles \(Z\in {\mathcal {C}}^N\times {\mathcal {C}}^N\), there exists \(\epsilon _0>0\) such that for all \(Z'\in {\mathcal {C}}^N\times {\mathcal {C}}^N\) with \(|Z'-Z|_\infty < \epsilon _0/3\) there holds

$$\begin{aligned} \sigma [x_i'] = \sigma [x_i'] \cap \sigma [x_i], \end{aligned}$$
(18)

and the statement remains true when replacing \(x_i\) by \(y_i\).

Proof

The inclusion “\(\supset \)” is trivial and we only show the reverse inclusion “\(\subset \)”. Here and below, \(\epsilon _0\) denotes the minimal distance between distinct coordinates of Z, i.e., \(\epsilon _0 := \min \left\{ |Z_k-Z_l| \,:\, 1\le k,l\le 2N,\ Z_k\ne Z_l\right\} >0\), as in the proof of Lemma 3 below. To this end, let \(j \in \sigma [x_i']\) and assume that \(j\notin \sigma [x_i]\). This implies that either \(j\in \sigma ^-[x_i]\) or \(j\in \sigma ^+[x_i]\). Assume, for instance, that \(j\in \sigma ^-[x_i]\), so that \(x_j < x_i\) by definition. However, then

$$\begin{aligned} \epsilon _0\le x_i-x_j = x_i-x_i'+x_i' -x_j =x_i-x_i'+x_j' -x_j< 2\epsilon _0/3, \end{aligned}$$
(19)

which is impossible. Similarly, \(j\notin \sigma ^+[x_i]\), which completes the proof. \(\square \)

Lemma 3

(Properties of Index Notation-II) For a given distribution of particles \(Z \in {\mathcal {C}}^N\times {\mathcal {C}}^N\), there exists \(\epsilon _0>0\) such that for all \(Z' \in {\mathcal {C}}^N\times {\mathcal {C}}^N\) with \(|Z'-Z|_\infty < \epsilon _0/3\) there holds

$$\begin{aligned} \begin{aligned} \sigma ^-[x'_i]&= \sigma ^-[x_i]{\dot{\cup }}\left( \sigma [x_i]\cap \sigma ^-[x'_i]\right) ,\\ \sigma ^-[y'_i]&= \sigma ^-[y_i]{\dot{\cup }}\left( \sigma [y_i]\cap \sigma ^-[y'_i]\right) , \end{aligned} \end{aligned}$$
(20)

as well as

$$\begin{aligned} \begin{aligned} \sigma [x'_i]&= \sigma [x_i] \setminus (\sigma ^-[x'_i]{\dot{\cup }} \sigma ^+[x'_i]),\\ \sigma [y'_i]&= \sigma [y_i] \setminus (\sigma ^-[y'_i]{\dot{\cup }} \sigma ^+[y'_i]). \end{aligned} \end{aligned}$$
(21)

Concerning the interspecies index sets, there holds

$$\begin{aligned} \begin{aligned} \gamma ^\pm [x'_i]&=\gamma ^\pm [x_i]{\dot{\cup }}(\gamma [x_i]\cap \gamma ^\pm [x'_i])\\ \gamma ^\pm [y'_j]&=\gamma ^\pm [y_j]{\dot{\cup }}(\gamma [y_j]\cap \gamma ^\pm [y'_j]), \end{aligned} \end{aligned}$$
(22)

as well as

$$\begin{aligned} \begin{aligned} \gamma [x'_i]&=\gamma [x_i]\setminus (\gamma ^-[x'_i]{\dot{\cup }}\gamma ^+[x'_i])\\ \gamma [y'_j]&= \gamma [y_j]\setminus (\gamma ^-[y'_j]{\dot{\cup }}\gamma ^+[y'_j]). \end{aligned} \end{aligned}$$
(23)

Proof

Let \(Z \in {\mathcal {C}}^N\times {\mathcal {C}}^N\) be given. We set

$$\begin{aligned} \epsilon _0 := \min \left\{ |Z_k-Z_l| \; \big |\; 1 \le k,l \le 2N,\, \text {s.t. } Z_k \ne Z_l\right\} >0. \end{aligned}$$

Throughout, we assume that \(Z'\in {\mathcal {C}}^N\times {\mathcal {C}}^N\) satisfies \(|Z'-Z|_\infty < \epsilon _0/3\). We begin by proving statement (20).

\(\subset \)”: Let \(j \in \sigma ^-[x_i']\). By definition this means that \(x_j' < x_i'\). Since \(x'\) is ordered, we infer \(j < i\). Since x is ordered, too, we have \(x_j \le x_i\), which means that

$$\begin{aligned} j&\in \sigma ^-[x_i'] \cap \left( \sigma [x_i] {{\dot{\cup }}} \sigma ^-[x_i]\right) \\&\quad = \left( \sigma ^-[x_i'] \cap \sigma [x_i]\right) {{\dot{\cup }}} \left( \sigma ^-[x_i'] \cap \sigma ^-[x_i]\right) \\&\qquad \subset \left( \sigma ^-[x'_i]\cap \sigma [x_i]\right) {\dot{\cup }} \sigma ^-[x_i], \end{aligned}$$

where the last inclusion is due to the fact that dropping the intersection with \(\sigma ^-[x_i']\) only enlarges the second set in the union, i.e., \(\sigma ^-[x'_i]\cap \sigma ^-[x_i] \subset \sigma ^-[x_i]\). “\(\supset \)”: Conversely, let \(j \in \sigma ^-[x_i]{\dot{\cup }}\left( \sigma [x_i]\cap \sigma ^-[x'_i]\right) \). Clearly, if j belongs to the second set the statement is trivially satisfied. If, on the other hand, \(j\in \sigma ^-[x_i]\), we have \(x_j < x_i\) and, due to the fact that both Z and \(Z'\) are ordered, we first obtain \(j<i\) and therefore \(x_j'\le x_i'\). We conclude \(j \in \sigma ^-[x_i'] {{\dot{\cup }}} \sigma [x_i']\). We now show that \(j\in \sigma [x_i']\) is impossible, which implies the statement. Arguing by contradiction, assume \(j\in \sigma [x_i']\), i.e., \(x_j' = x_i'\). By the definition of \(\epsilon _0>0\), this implies

$$\begin{aligned} \epsilon _0\le x_i - x_j = x_i - x_i' + x_i' - x_j = x_i - x_i' + x_j' - x_j < 2\epsilon _0/3, \end{aligned}$$

where the closeness assumption \(|Z'-Z|_\infty <\epsilon _0/3\) entered in the last inequality. Clearly this statement is absurd and therefore, \(j\in \sigma ^-[x_i']\) which completes the proof of the first statement. The same statement is true for the y-species using the same line of reasoning.

We continue with statement (21).

\(\subset \)”: If \(j\in \sigma [x_i']\), then trivially, \(j\notin \sigma ^-[x_i'] {{\dot{\cup }}} \sigma ^+[x_i']\) and it remains to show that \(j \in \sigma [x_i]\). As before the argument is by contradiction and we assume, for instance, that \(j \in \sigma ^-[x_i]\). As before

$$\begin{aligned} \epsilon _0\le x_i-x_j = x_i-x_i'+x_j'-x_j < 2\epsilon _0/3, \end{aligned}$$

ruling out the case \(j \in \sigma ^-[x_i]\). Similarly, \(j \notin \sigma ^+[x_i]\), leaving \(j \in \sigma [x_i]\) as the only possibility, which proves the inclusion.

\(\supset \)”: Conversely, if \(j\in \sigma [x_i] \setminus \left( \sigma ^-[x_i'] {{\dot{\cup }}} \sigma ^+[x_i']\right) \), there holds \(x_j = x_i\) and \(x_i'\le x_j' \le x_i'\). In particular, \(x_i'=x_j'\), and therefore \(j\in \sigma [x_i']\), which concludes the proof of this inclusion.

Next, we prove statement (22). We only focus on the “-” case, as the statement for “+” follows in a similar manner. Let us begin with “\(\subset \)”: Let \(j \in \gamma ^-[x_i']\), i.e., \(y_j' < x_i'\). Assume that \(j \in \gamma ^+[x_i]\), i.e., that \(y_j > x_i\). Using these two inequalities yields

$$\begin{aligned} \epsilon _0\le y_j - x_i = y_j - y_j' + y_j' - x_i \le y_j - y_j' + x_i' - x_i < 2 \epsilon _0/3, \end{aligned}$$

which is absurd. Hence \(j\in \gamma [x_i] {{\dot{\cup }}} \gamma ^-[x_i]\), which yields the statement together with the fact that \(j \in \gamma ^-[x_i']\).

Regarding the opposite inclusion, “\(\supset \)”, it suffices to show \(\gamma ^-[x_i] \subset \gamma ^-[x_i']\), as the statement is trivially satisfied if j is in the set \((\gamma [x_i]\cap \gamma ^-[x'_i])\). Again, arguing by contradiction, let us assume \(j \notin \gamma ^-[x_i']\), i.e., \(y_j'\ge x_i'\). Since \(j\in \gamma ^-[x_i]\), i.e., \(y_j < x_i\), we observe

$$\begin{aligned} \epsilon _0 \le x_i - y_j = x_i - x_i' + x_i' - y_j' + y_j' - y_j \le x_i - x_i' + y_j' - y_j < 2\epsilon _0/3, \end{aligned}$$

which yields the statement. Finally, we prove statement (23) beginning with “\(\subset \)”.

Let \(j \in \gamma [x_i']\), i.e., \(y_j'=x_i'\), and assume that \(j\notin \gamma [x_i]\), for instance, \(y_j < x_i\). In this case

$$\begin{aligned} \epsilon _0 < x_i-y_j = x_i-x_i' + y_j' - y_j \le 2\epsilon _0/3, \end{aligned}$$

which is a contradiction. Similarly, the case \(y_j>x_i\) is ruled out, which shows the assertion. Finally, we show the reverse inclusion, “\(\supset \)”:

Let \(j\in \gamma [x_i]\setminus (\gamma ^-[x'_i]{\dot{\cup }}\gamma ^+[x'_i])\). In this case \(x_i' \le y_j' \le x_i'\), i.e., \(y_j' = x_i'\), and therefore \(j \in \gamma [x_i']\). This concludes the proof of the lemma. \(\square \)

Now, we introduce the discrete interaction energy functional acting on a given \(Z=(x,y)\in {{\mathbb {R}}^{N}}\times {{\mathbb {R}}^{N}}\), as follows

$$\begin{aligned} {\mathcal {F}}[Z]= & {} -\frac{1}{2}\sum _{i\ne j}m_i m_j|x_i-x_j|-\frac{1}{2}\sum _{i\ne j}n_i n_j|y_i-y_j|\nonumber \\&+\sum _{i,j}m_in_j|x_i-y_j| + {\mathcal {I}}_{{\mathcal {C}}^N}(x)+{\mathcal {I}}_{{\mathcal {C}}^N}(y), \end{aligned}$$
(24)

where \({\mathcal {I}}_{{\mathcal {C}}^N}\) is the indicator function of the cone \({\mathcal {C}}^N\), i.e.,

$$\begin{aligned} {\mathcal {I}}_{{\mathcal {C}}^N}(x)={\left\{ \begin{array}{ll} 0 &{}\text {if}\ x\in {\mathcal {C}}^N,\\ +\infty &{}\text {otherwise}. \end{array}\right. } \end{aligned}$$
(25)

We shall often use the notation

$$\begin{aligned}&S(x):=-\frac{1}{2}\sum _{i\ne j}m_i m_j|x_i-x_j|, \qquad S(y):=-\frac{1}{2}\sum _{i\ne j}n_i n_j|y_i-y_j|, \end{aligned}$$

and

$$\begin{aligned}&C(x,y):=\sum _{i,j}m_in_j|x_i-y_j|, \end{aligned}$$

so that

$$\begin{aligned} {\mathcal {F}}[Z] = S(x) + S(y) + C(x,y) + {\mathcal {I}}_{{\mathcal {C}}^N}(x)+{\mathcal {I}}_{{\mathcal {C}}^N}(y). \end{aligned}$$

S represents the self-interaction part, i.e., interactions within the same species, while C accounts for cross-interactions.

The functional \({\mathcal {F}}\) is proper, i.e., \(D({\mathcal {F}})=\left\{ Z\in {{\mathbb {R}}^{N}}\times {{\mathbb {R}}^{N}}:{\mathcal {F}}[Z]<+\infty \right\} \ne \emptyset \), since we have

$$\begin{aligned} {\mathcal {F}}[Z]\le C(x,y)\le \sum _{i,j}m_in_j\left( |x_i|+|y_j|\right) <+\infty , \end{aligned}$$

for any \(Z=(x,y)\in {\mathcal {C}}^N\times {\mathcal {C}}^N\), since \(S(x)\le 0\) and \(S(y)\le 0\). Moreover, note that the self-interaction part of the functional can also be rewritten as

$$\begin{aligned} S(x) + S(y)=-\sum _{i=1}^N\sum _{\{j:\,x_j>x_i\}}m_i m_j(x_j-x_i)-\sum _{i=1}^N\sum _{\{j:\,y_j>y_i\}}n_i n_j(y_j-y_i). \end{aligned}$$

Remark 4

Note that the terms corresponding to the index \(i=N\) give a null contribution in the above sum. Nevertheless, we keep them in order to have a further expression for the self-interaction part of the functional that we will use later on in Eqs. (26) and (27). Moreover, for the sake of completeness, let us point out the equivalent formulation

$$\begin{aligned} S(x) + S(y)=-\sum _{i=1}^N\sum _{\{j:\,x_j<x_i\}}m_i m_j(x_i-x_j)-\sum _{i=1}^N\sum _{\{j:\,y_j<y_i\}}n_i n_j(y_i-y_j), \end{aligned}$$

where the terms corresponding to the index \(i=1\) give a null contribution.

The above observation allows us to prove the next lemma.

Lemma 4

The functional \({\mathcal {F}}:{{\mathbb {R}}^{N}}\times {{\mathbb {R}}^{N}}\rightarrow {\mathbb {R}}\cup \{+\infty \}\) is convex.

Proof

Take \(Z^1=(x^1,y^1),\ Z^2=(x^2,y^2)\in {{\mathbb {R}}^{N}}\times {{\mathbb {R}}^{N}}\) and a convex combination between them \(Z^\alpha =\alpha Z^1+(1-\alpha )Z^2=(\alpha x^1+(1-\alpha )x^2, \alpha y^1+(1-\alpha )y^2)=(x^\alpha ,y^\alpha )\), with \(\alpha \in [0,1]\). We need to prove

$$\begin{aligned} {\mathcal {F}}[\alpha Z^1+(1-\alpha )Z^2]\le \alpha {\mathcal {F}}[Z^1]+(1-\alpha ){\mathcal {F}}[Z^2]. \end{aligned}$$

If either \(Z^1\not \in {\mathcal {C}}^N\times {\mathcal {C}}^N\) or \(Z^2\not \in {\mathcal {C}}^N\times {\mathcal {C}}^N\) then the above inequality is trivial since the right-hand side is \(+\infty \). When both \(Z^1\) and \(Z^2\) are in \({\mathcal {C}}^N\times {\mathcal {C}}^N\), then so is \(Z^\alpha \) as this set is convex. Hence, the convexity of \({\mathcal {F}}\) can be checked, as follows, by means of the order-preserving property in \({\mathcal {C}}^N\times {\mathcal {C}}^N\),

$$\begin{aligned} {\mathcal {F}}[Z^\alpha ]&=-\sum _{i=1}^N\sum _{j\in \sigma ^+[x_i^\alpha ]} m_i m_j(x_j^\alpha -x_i^\alpha ) - \sum _{i=1}^N\sum _{j\in \sigma ^+[y_i^\alpha ]}n_i n_j(y_j^\alpha -y_i^\alpha )+\sum _{i,j}m_in_j|x_i^\alpha -y_j^\alpha |\\&=-\alpha \sum _{i=1}^N\sum _{j\in \sigma ^+[x_i^1]}m_i m_j(x_j^1-x_i^1)-(1-\alpha )\sum _{i=1}^N\sum _{j\in \sigma ^+[x_i^2]}m_i m_j(x_j^2-x_i^2) \\&\quad -\alpha \sum _{i=1}^N\sum _{j\in \sigma ^+[y_i^1]}n_i n_j(y_j^1-y_i^1)-(1-\alpha )\sum _{i=1}^N\sum _{j\in \sigma ^+[y_i^2]} n_i n_j(y_j^2-y_i^2)\\&\quad +\sum _{i,j} m_in_j|\alpha (x_i^1-y_j^1)+(1-\alpha )(x_i^2-y_j^2)|, \end{aligned}$$

and the assertion follows by using the triangle inequality in the last term. \(\square \)
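Lemma 4 can also be sanity-checked numerically. The sketch below is illustrative only: the masses and configurations are arbitrary choices, and the indicator part of (24) is omitted since all test points lie in the cone.

```python
# Numerical sanity check of Lemma 4 (illustrative; masses and configurations are
# arbitrary choices). The indicator part of (24) is omitted since all test
# points below are ordered, i.e., lie in the cone C^N x C^N.

def F(x, y, m, n):
    S = lambda z, w: -0.5 * sum(w[i] * w[j] * abs(z[i] - z[j])
                                for i in range(len(z)) for j in range(len(z)) if i != j)
    C = sum(m[i] * n[j] * abs(x[i] - y[j]) for i in range(len(x)) for j in range(len(y)))
    return S(x, m) + S(y, n) + C

m = n = [1.0 / 3] * 3
x1, y1 = [0.0, 1.0, 2.0], [-1.0, 0.5, 3.0]
x2, y2 = [0.0, 0.0, 1.0], [0.0, 1.0, 1.0]   # contains clusters, still ordered

for k in range(11):
    a = k / 10
    xa = [a * u + (1 - a) * v for u, v in zip(x1, x2)]
    ya = [a * u + (1 - a) * v for u, v in zip(y1, y2)]
    # convexity: F[a Z1 + (1-a) Z2] <= a F[Z1] + (1-a) F[Z2]
    assert F(xa, ya, m, n) <= a * F(x1, y1, m, n) + (1 - a) * F(x2, y2, m, n) + 1e-12
```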

We shall now investigate in greater detail the functional \({\mathcal {F}}\). Our next goal is to provide an expression of \({\mathcal {F}}[Z]\) for \(Z\in {\mathcal {C}}^N\times {\mathcal {C}}^N\) accounting for a possible superposition of groups of particles.

We now rewrite the functional \({\mathcal {F}}[Z]\) using the above index notation and Remark 4. Let us start with the self-interaction part:

$$\begin{aligned} \begin{aligned} S(x)&=-\sum _{i=1}^N\sum _{j:x_i>x_j}m_i m_j (x_i - x_j) \\&= \sum _{i=1}^N m_i x_i \left[ -\sum _{j\in \sigma ^-[x_i]}m_j + \sum _{j\in \sigma ^+[x_i]}m_j\right] . \end{aligned} \end{aligned}$$
(26)

A similar expression may be obtained for the y-part:

$$\begin{aligned} S(y)=\sum _{j=1}^N n_j y_j \left[ - \sum _{i\in \sigma ^-[y_j]}n_i + \sum _{i\in \sigma ^+[y_j]}n_i\right] . \end{aligned}$$
(27)

We now consider the cross-interaction term

$$\begin{aligned} C(x,y)&=\sum _{\{(i,j):\,x_i>y_j\}}m_i n_j(x_i-y_j) + \sum _{\{(i,j):\,x_i<y_j\}}m_i n_j(y_j-x_i)\nonumber \\&= \sum _{i=1}^N m_i x_i\left[ \sum _{j\in \gamma ^-[x_i]} n_j -\sum _{j\in \gamma ^+[x_i]}n_j\right] + \sum _{j=1}^N n_j y_j \left[ \sum _{i\in \gamma ^-[y_j]} m_i-\sum _{i\in \gamma ^+[y_j]} m_i\right] . \end{aligned}$$
(28)
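The rewrites (26)–(28) are easy to verify numerically against the original double-sum definitions of S and C. The following sketch (illustrative only, with arbitrary masses and a configuration containing both a cluster and a mixed collision) does so.

```python
# Illustrative check (not from the paper) that the index-notation rewrites of the
# self-interaction part, cf. (26)-(27), and the cross-interaction part, cf. (28),
# agree with the original double sums; masses and positions are arbitrary.

def S_direct(z, w):
    return -0.5 * sum(w[i] * w[j] * abs(z[i] - z[j])
                      for i in range(len(z)) for j in range(len(z)) if i != j)

def S_index(z, w):
    # sum_i w_i z_i ( -sum_{j in sigma^-} w_j + sum_{j in sigma^+} w_j ), cf. (26)
    return sum(w[i] * z[i] * (-sum(w[j] for j in range(len(z)) if z[j] < z[i])
                              + sum(w[j] for j in range(len(z)) if z[j] > z[i]))
               for i in range(len(z)))

def C_direct(x, y, m, n):
    return sum(m[i] * n[j] * abs(x[i] - y[j]) for i in range(len(x)) for j in range(len(y)))

def C_index(x, y, m, n):
    # cf. (28): the gamma^- / gamma^+ sets written out as inequality constraints
    a = sum(m[i] * x[i] * (sum(n[j] for j in range(len(y)) if y[j] < x[i])
                           - sum(n[j] for j in range(len(y)) if y[j] > x[i]))
            for i in range(len(x)))
    b = sum(n[j] * y[j] * (sum(m[i] for i in range(len(x)) if x[i] < y[j])
                           - sum(m[i] for i in range(len(x)) if x[i] > y[j]))
            for j in range(len(y)))
    return a + b

x, y = [0.0, 0.0, 1.5], [-1.0, 1.5, 2.0]  # an x-cluster and a mixed collision
m, n = [0.2, 0.3, 0.5], [0.4, 0.4, 0.2]

assert abs(S_direct(x, m) - S_index(x, m)) < 1e-12
assert abs(S_direct(y, n) - S_index(y, n)) < 1e-12
assert abs(C_direct(x, y, m, n) - C_index(x, y, m, n)) < 1e-12
```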

In order to deal with gradient flows in Hilbert spaces, we need to introduce the concept of Fréchet sub-differential. We adapt the definition of this classical concept to our specific case.

Definition 3

(Fréchet sub-differential) For a given proper, convex, and lower semi-continuous functional \({\mathcal {F}}\) on \({{\mathbb {R}}^{N}}\times {{\mathbb {R}}^{N}}\), we say that \(P\in {{\mathbb {R}}^{N}}\times {{\mathbb {R}}^{N}}\) belongs to the sub-differential of \({\mathcal {F}}\) at \(Z\in {\mathbb {R}}^{N}\times {\mathbb {R}}^{N}\) if and only if

$$\begin{aligned} {\mathcal {F}}[Z']-{\mathcal {F}}[Z]\ge \langle P,Z'-Z\rangle _w, \end{aligned}$$
(29)

for all \(Z'\in {\mathbb {R}}^N\times {\mathbb {R}}^N\). The sub-differential of \({\mathcal {F}}\) at Z is denoted by \(\partial {\mathcal {F}}(Z)\), and if \(\partial {\mathcal {F}}(Z)\ne \emptyset \) then we denote by \(\partial ^0 {\mathcal {F}}(Z)\) the element of minimal (weighted) norm of \(\partial {\mathcal {F}}(Z)\).

Remark 5

We recall that, since \({\mathcal {F}}\) is convex, requiring condition (29) to be satisfied for all \(Z'\in {\mathbb {R}}^N\times {\mathbb {R}}^N\) can be relaxed to

$$\begin{aligned} {\mathcal {F}}[Z']-{\mathcal {F}}[Z]\ge \langle P,Z'-Z\rangle _w + o(\Vert Z'-Z\Vert ), \qquad \hbox {as}\, Z'\rightarrow Z. \end{aligned}$$
(30)

As \({\mathcal {F}}[Z]\) attains the value \(+\infty \) outside the cone \({\mathcal {C}}^N\times {\mathcal {C}}^N\), it is natural to expect that \(Z\in {\mathcal {C}}^N\times {\mathcal {C}}^N\) is a necessary condition for \(\partial {\mathcal {F}}(Z)\ne \emptyset \). In the one species case (see [8, Proposition 2.10]) one can actually prove that being in the cone is a necessary and sufficient condition for a non-empty sub-differential. Such a property is non-trivial in the many species case. We prove it in the next lemma.

Lemma 5

Let \(Z=(x,y)\in {{\mathbb {R}}^{N}}\times {{\mathbb {R}}^{N}}\). Then \(\partial {\mathcal {F}}(Z)\ne \emptyset \) if and only if \(Z\in {\mathcal {C}}^N\times {\mathcal {C}}^N\).

Proof

Similarly to [8, Proposition 2.10], let us assume \(Z=(x,y)\not \in {\mathcal {C}}^N\times {\mathcal {C}}^N\). Without restriction, we assume for instance \(x\not \in {\mathcal {C}}^N\), which implies \(I_{{\mathcal {C}}^N}(x)=+\infty \). Assuming \(P\in \partial {\mathcal {F}}(Z)\), we have

$$\begin{aligned} {\mathcal {F}}[Z']-S(x)-S(y)-C(x,y)\ge \langle P,Z'-Z\rangle _w+I_{{\mathcal {C}}^N}(x), \end{aligned}$$

for all \(Z'=(x',y')\in {\mathbb {R}}^{2N}\). In particular, the previous inequality would hold for any \(Z'\in {\mathcal {C}}^N\times {\mathcal {C}}^N\), which is clearly a contradiction since, in this case, the left-hand side is finite, while the right-hand side is infinite.

Let us now assume that \(Z\in {\mathcal {C}}^N\times {\mathcal {C}}^N\). The inequality (29) is trivially satisfied for an arbitrary P in case \(Z'\not \in {\mathcal {C}}^N\times {\mathcal {C}}^N\), therefore we can assume without restriction that \(Z'=(x',y')\in {\mathcal {C}}^N\times {\mathcal {C}}^N\). Our goal is to show that there exists a vector \(P\in {\mathbb {R}}^N\times {\mathbb {R}}^N\) such that (30) holds as \(Z'\rightarrow Z\). Therefore, without restriction we assume that \(\Vert Z'-Z\Vert <\varepsilon _0\) for some \(\varepsilon _0>0\) to be chosen later on.

We now compute

$$\begin{aligned}&S(x')-S(x) \\&\ = \sum _{i=1}^N m_i \left[ x'_i\left( -\sum _{j\in \sigma ^-[x'_i]}m_j + \sum _{j\in \sigma ^+[x'_i]}m_j \right) - x_i \left( -\sum _{j\in \sigma ^-[x_i]}m_j + \sum _{j\in \sigma ^+[x_i]}m_j \right) \right] \\&\ = \sum _{i=1}^N m_i (x'_i-x_i)\left( -\sum _{j<i}m_j+\sum _{j>i} m_j \right) + R, \end{aligned}$$

with

$$\begin{aligned} R&= \sum _{i=1}^N m_i x'_i \left( \sum _{j<i\,:\,\,j\in \sigma [x'_i]}m_j - \sum _{j> i\,:\,\,j\in \sigma [x_i']}m_j\right) \end{aligned}$$
(31)
$$\begin{aligned}&\qquad + \sum _{i=1}^N m_i x_i \left( -\sum _{j<i\,:\,\, j\in \sigma [x_i]}m_j + \sum _{j> i\,:\,\, j\in \sigma [x_i]}m_j\right) \end{aligned}$$
(32)
$$\begin{aligned}&=: R_1 - R_2. \end{aligned}$$
(33)

Note that

$$\begin{aligned} R_1 = \sum _{i=1}^N \sum \limits _{\begin{array}{c} j<i \\ j \in \sigma [x_i'] \end{array}} m_j m_i x_i' - \sum _{i=1}^N \sum \limits _{\begin{array}{c} j<i \\ j \in \sigma [x_i] \end{array}}m_j m_i x_i, \end{aligned}$$
(34)

and

$$\begin{aligned} R_2 = \sum _{i=1}^N \sum \limits _{\begin{array}{c} j>i \\ j \in \sigma [x_i'] \end{array}} m_j m_i x_i' - \sum _{i=1}^N \sum \limits _{\begin{array}{c} j>i \\ j \in \sigma [x_i] \end{array}}m_j m_i x_i. \end{aligned}$$
(35)

Using the fact that

$$\begin{aligned} \sigma [x_i] = (\sigma [x_i] \cap \sigma [x_i']) \,{{\dot{\cup }}}\, (\sigma [x_i] \setminus \sigma [x_i']), \end{aligned}$$

we may split the second sum and simplify the term \(R_1\), i.e.,

$$\begin{aligned} R_1&= \sum _{i=1}^N \sum \limits _{\begin{array}{c} j<i \\ j \in \sigma [x_i'] \end{array}}m_j m_i x_i' - \sum _{i=1}^N \sum \limits _{\begin{array}{c} j<i \\ j \in \sigma [x_i]\cap \sigma [x_i'] \end{array}}m_j m_i x_i - \sum _{i=1}^N \sum \limits _{\begin{array}{c} j<i \\ j \in \sigma [x_i]\setminus \sigma [x_i'] \end{array}}m_j m_i x_i \end{aligned}$$
(36)
$$\begin{aligned}&= \sum _{i=1}^N m_i (x_i'-x_i) \sum \limits _{\begin{array}{c} j<i \\ j \in \sigma [x_i'] \end{array}}m_j - \sum _{i=1}^N m_ix_i \sum \limits _{\begin{array}{c} j<i \\ j \in \sigma [x_i]\setminus \sigma [x_i'] \end{array}}m_j, \end{aligned}$$
(37)

having used Eq. (18) of Lemma 2 in the last line. In the same vein, we have

$$\begin{aligned} R_2&= \sum _{i=1}^N \sum \limits _{\begin{array}{c} j>i \\ j \in \sigma [x_i'] \end{array}}m_j m_i x_i' - \sum _{i=1}^N \sum \limits _{\begin{array}{c} j>i \\ j \in \sigma [x_i]\cap \sigma [x_i'] \end{array}} m_j m_i x_i - \sum _{i=1}^N \sum \limits _{\begin{array}{c} j>i \\ j \in \sigma [x_i]\setminus \sigma [x_i'] \end{array}}m_j m_i x_i \end{aligned}$$
(38)
$$\begin{aligned}&= \sum _{i=1}^N m_i (x_i'-x_i) \sum \limits _{\begin{array}{c} j>i \\ j \in \sigma [x_i'] \end{array}}m_j - \sum _{i=1}^N m_i x_i \sum \limits _{\begin{array}{c} j>i \\ j \in \sigma [x_i]\setminus \sigma [x_i'] \end{array}}m_j. \end{aligned}$$
(39)

Upon subtraction, we obtain

$$\begin{aligned} R_1-R_2&= \sum _{i=1}^N m_i (x_i'-x_i) \left( \sum \limits _{\begin{array}{c} j<i \\ j \in \sigma [x_i'] \end{array}}m_j -\sum \limits _{\begin{array}{c} j>i \\ j \in \sigma [x_i'] \end{array}}m_j\right) + R_3, \end{aligned}$$
(40)

where

$$\begin{aligned} R_3 = \sum _{i=1}^N \sum \limits _{\begin{array}{c} j>i \\ j \in \sigma [x_i]\setminus \sigma [x_i'] \end{array}}m_j m_i x_i - \sum _{i=1}^N \sum \limits _{\begin{array}{c} j<i \\ j \in \sigma [x_i]\setminus \sigma [x_i'] \end{array}}m_j m_i x_i. \end{aligned}$$
(41)

Rearranging the sum according to Remark 3, the first term can be rewritten,

$$\begin{aligned} \sum _{i=1}^N \sum \limits _{\begin{array}{c} j>i \end{array}} \chi _{ \sigma [x_i]\setminus \sigma [x_i']}(j) m_j m_i x_i = \sum _{j=1}^N \sum _{i<j} \chi _{ \sigma [x_i]\setminus \sigma [x_i']}(j) m_j m_i x_i. \end{aligned}$$
(42)

Since

$$\begin{aligned} j \in \sigma [x_i]\setminus \sigma [x_i'] \Longleftrightarrow i \in \sigma [x_j]\setminus \sigma [x_j'], \end{aligned}$$

we can simplify the expression further by relabelling, i.e., switching the roles of i and j, to obtain

$$\begin{aligned} \sum _{j=1}^N \sum \limits _{\begin{array}{c} i<j \end{array}} \chi _{ \sigma [x_i]\setminus \sigma [x_i']}(j) m_j m_i x_i&= \sum _{i=1}^N \sum _{j<i} \chi _{ \sigma [x_i]\setminus \sigma [x_i']}(j) m_i m_j x_j \end{aligned}$$
(43)
$$\begin{aligned}&= \sum _{i=1}^N \sum \limits _{\begin{array}{c} j<i\\ j \in \sigma [x_i]\setminus \sigma [x_i'] \end{array}} m_i m_j x_j. \end{aligned}$$
(44)

Substituting the simplified expression into the first term of \(R_3\), i.e., Eq. (41), and using that \(x_j=x_i\) for every \(j\in \sigma [x_i]\), we obtain

$$\begin{aligned} R_3 = \sum _{i=1}^N \sum \limits _{\begin{array}{c} j<i\\ j\in \sigma [x_i]\setminus \sigma [x_i'] \end{array}} m_i m_j (x_j-x_i) = 0. \end{aligned}$$
(45)

Thus, revisiting Eq. (40), we get

$$\begin{aligned} R_1-R_2&= \sum _{i=1}^N m_i (x_i'-x_i) \left( \sum \limits _{\begin{array}{c} j<i \\ j \in \sigma [x_i'] \end{array}}m_j -\sum \limits _{\begin{array}{c} j>i \\ j \in \sigma [x_i'] \end{array}}m_j\right) . \end{aligned}$$
(46)

The right-hand side vanishes upon swapping the labels i and j, using Remark 3 together with Eq. (18) of Lemma 2, which guarantees \(x_j=x_i\) whenever \(x_j'=x_i'\). We have therefore proven

$$\begin{aligned} S(x')-S(x) = \sum _{i=1}^N m_i (x'_i-x_i)\left( \sum _{j>i}m_j-\sum _{j<i}m_j\right) , \end{aligned}$$
(47a)

as well as

$$\begin{aligned} S(y')-S(y) = \sum _{j=1}^N n_j (y'_j-y_j)\left( \sum _{i>j}n_i-\sum _{i<j}n_i\right) . \end{aligned}$$
(47b)

Due to (28), we have

$$\begin{aligned}&C(x',y')-C(x,y)\\&\ =\sum _{i=1}^N m_i (x'_i-x_i)\left( \sum _{j\in \gamma ^-[x_i]}n_j -\sum _{j\in \gamma ^+[x_i]}n_j\right) \\&\qquad + \sum _{j=1}^N n_j (y'_j-y_j)\left( \sum _{i\in \gamma ^-[y_j]}m_i -\sum _{i\in \gamma ^+[y_j]}m_i\right) + {\widetilde{R}}, \end{aligned}$$

with

$$\begin{aligned} {{\widetilde{R}}} = {{\widetilde{R}}}_1 + {{\widetilde{R}}}_2 + {{\widetilde{R}}}_3 + {{\widetilde{R}}}_4, \end{aligned}$$
(48)

where

$$\begin{aligned}&{{\widetilde{R}}}_1 = \sum _{i=1}^N m_i x_i' \left( \sum _{j \in \gamma ^-[x_i']} n_j - \sum _{j\in \gamma ^-[x_i]}n_j \right) ,\nonumber \\&\quad \quad \text{ and } \quad {{\widetilde{R}}}_2 = \sum _{i=1}^N m_i x_i' \left( \sum _{j \in \gamma ^+[x_i]} n_j - \sum _{j\in \gamma ^+[x_i']}n_j \right) \end{aligned}$$
(49)

as well as

$$\begin{aligned}&{{\widetilde{R}}}_3 = \sum _{j=1}^N n_j y_j' \left( \sum _{i \in \gamma ^-[y_j']}m_i - \sum _{i\in \gamma ^-[y_j]}m_i\right) ,\nonumber \\&\quad \text{ and }\quad {{\widetilde{R}}}_4 = \sum _{j = 1}^N n_j y_j'\left( \sum _{i\in \gamma ^+[y_j]} m_i - \sum _{i \in \gamma ^+[y_j']}m_i\right) . \end{aligned}$$
(50)

Using Eq. (22) of Lemma 3, we may write

$$\begin{aligned} {{\widetilde{R}}}_1&= \sum _{i=1}^N m_i x_i' \sum _{j \in \gamma [x_i] \cap \gamma ^-[x_i']} n_j \end{aligned}$$
(51)
$$\begin{aligned}&\ge \sum _{i=1}^N \sum _{j =1}^N \chi _{\gamma [x_i] \cap \gamma ^-[x_i']}(j) m_i n_j y_j' \end{aligned}$$
(52)
$$\begin{aligned}&= \sum _{j=1}^N \sum _{i =1}^N \chi _{\gamma [y_j] \cap \gamma ^+[y_j']}(i) m_i n_j y_j' \end{aligned}$$
(53)
$$\begin{aligned}&= \sum _{j=1}^N \sum _{i \in \gamma [y_j] \cap \gamma ^+[y_j']} m_i n_j y_j', \end{aligned}$$
(54)

where the inequality is due to the fact that \(j \in \gamma ^-[x_i']\) and thus \(x_i' > y_j'\) and the penultimate line is by rearranging terms in the sum since

$$\begin{aligned} j \in \gamma [x_i] \cap \gamma ^-[x_i'] \Leftrightarrow \left( x_i = y_j \text{ and } x_i' > y_j'\right) \Leftrightarrow i \in \gamma [y_j] \cap \gamma ^+[y_j']. \end{aligned}$$
(55)

Using a similar argument, we see that

$$\begin{aligned} {{\widetilde{R}}}_3 = \sum _{j=1}^N \sum _{i\in \gamma [y_j]\cap \gamma ^-[y_j']} m_i n_j y_j' \ge \sum _{i=1}^N \sum _{j\in \gamma [x_i]\cap \gamma ^+[x_i']} m_i n_j x_i'. \end{aligned}$$
(56)

Finally, we note that, upon using Eq. (22) of Lemma 3, we may write

$$\begin{aligned} {{\widetilde{R}}}_2 =- \sum _{i=1}^N \sum _{j\in \gamma [x_i] \cap \gamma ^+[x_i']} m_i n_j x_i', \quad \text{ and }\quad {{\widetilde{R}}}_4 = -\sum _{j=1}^N \sum _{i\in \gamma [y_j]\cap \gamma ^+[y_j']} m_i n_j y_j'. \end{aligned}$$
(57)

Combining the terms pairwise, we obtain \({{\widetilde{R}}}_1+{{\widetilde{R}}}_4\ge 0\) and \({{\widetilde{R}}}_2+{{\widetilde{R}}}_3\ge 0\), whence \({{\widetilde{R}}}\ge 0\). The above estimate, together with (47), implies that the vector \(P=(p,q)\in {\mathbb {R}}^N\times {\mathbb {R}}^N\) with

$$\begin{aligned} p&=(p_i)_{i=1}^N\,,\qquad q=(q_j)_{j=1}^N,\\ p_i&= -\sum _{j<i}m_j + \sum _{j>i}m_j + \sum _{j\in \gamma ^-[x_i]}n_j - \sum _{j\in \gamma ^+[x_i]}n_j,\\ q_j&=-\sum _{i<j}n_i + \sum _{i>j}n_i + \sum _{i\in \gamma ^-[y_j]}m_i - \sum _{i\in \gamma ^+[y_j]}m_i, \end{aligned}$$

satisfies (30), i.e., \(P\in \partial {\mathcal {F}}(Z)\). \(\square \)
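For concreteness, the element \(P=(p,q)\) constructed above can be evaluated explicitly. The sketch below is an illustration (not part of the paper), with an arbitrary two-particle-per-species configuration and 0-based indices; recall that the gradient-flow velocity is \(-P\), so in the example the two separated species indeed attract each other.

```python
# Explicit element P = (p, q) of the sub-differential from the proof of Lemma 5
# (illustrative sketch with 0-based indices; the configuration is arbitrary).

def subdifferential_element(x, y, m, n):
    N = len(x)
    p = [-sum(m[j] for j in range(i)) + sum(m[j] for j in range(i + 1, N))
         + sum(n[j] for j in range(N) if y[j] < x[i])
         - sum(n[j] for j in range(N) if y[j] > x[i])
         for i in range(N)]
    q = [-sum(n[i] for i in range(j)) + sum(n[i] for i in range(j + 1, N))
         + sum(m[i] for i in range(N) if x[i] < y[j])
         - sum(m[i] for i in range(N) if x[i] > y[j])
         for j in range(N)]
    return p, q

# Two particles per species, the x-species entirely to the left of the y-species:
x, y = [0.0, 1.0], [2.0, 3.0]
m = n = [0.5, 0.5]
p, q = subdifferential_element(x, y, m, n)

# Velocities are -P: both x-particles drift right, both y-particles drift left.
assert p == [-0.5, -1.5] and q == [1.5, 0.5]
```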

The functional \({\mathcal {F}}\) defined in (24) is proper, lower semi-continuous, and convex on the Hilbert space \(({\mathbb {R}}^N\times {\mathbb {R}}^N,\langle \cdot ,\cdot \rangle _w)\), in view of Lemma 4. As a consequence of these properties, \(\partial {\mathcal {F}}\) is a maximal monotone operator. Hence, we can use the theory of Brézis [9, Theorem 3.1], e.g. in the form stated in [20, Section 9.6, Theorem 3], in order to pose system (6) as the gradient flow associated to (24).

Definition 4

Let \(Z_0=(x_0,y_0)\in {\mathcal {C}}^N\times {\mathcal {C}}^N\). An absolutely continuous curve \((x(t),y(t))\in {\mathbb {R}}^{2N}\) is a gradient flow for the functional \({\mathcal {F}}\) if \(Z(t):=(x(t),y(t))\) is a Lipschitz function on \([0,+\infty )\), i.e., \(\frac{dZ}{dt}\in L^\infty ([0,+\infty );{{\mathbb {R}}^{N}}\times {{\mathbb {R}}^{N}})\) (in the sense of distributions) and if it satisfies the sub-differential inclusion

$$\begin{aligned} -\begin{pmatrix} \dot{x}(t)\\ \dot{y}(t) \end{pmatrix}\in \partial {\mathcal {F}}(Z(t)), \end{aligned}$$
(58)

for almost every \(t\in [0,+\infty )\) with \((x(0),y(0))=(x_0,y_0)\).

Resorting to the theory of Brézis, we get the following theorem.

Theorem 1

Let \(Z_0=(x_0,y_0)\in {\mathcal {C}}^N\times {\mathcal {C}}^N\) be an initial datum. Then, there exists a unique solution \(Z(t)=(x(t),y(t))\) in the sense of Definition 4 such that \(Z(0)=Z_0\). Moreover, given \(Z^1_0, Z^2_0 \in {\mathcal {C}}^N\times {\mathcal {C}}^N\) and the two corresponding solutions \(Z^1(t), Z^2(t)\) in the sense of Definition 4 with initial data \(Z^1_0\) and \(Z^2_0\) respectively, the stability property

$$\begin{aligned} \Vert Z^1(t)-Z^2(t)\Vert \le \Vert Z^1_0-Z^2_0\Vert , \end{aligned}$$

holds for all \(t\ge 0\).

Remark 6

Lemma 5 affects the statement of Theorem 1 above in that the class of initial conditions for which existence and uniqueness of a gradient flow solution holds in the sense of Definition 4 coincides with the whole convex cone \({\mathcal {C}}^N\times {\mathcal {C}}^N\). According to [9, Theorem 3.1], initial data should belong to the domain of the sub-differential of \({\mathcal {F}}\) in order to have existence and uniqueness of solutions, and Lemma 5 ensures that \({\mathcal {C}}^N\times {\mathcal {C}}^N\) and the domain of \(\partial {\mathcal {F}}\) are the same set. Moreover, the result in Lemma 5 also provides an explicit expression \(P=(p,q)\) of at least one element in the sub-differential at any given configuration of particles, including possible collisions, both within the same species and among particles of opposite species. This expression anticipates what we shall see in the next section regarding the behaviour of particles in the presence of superpositions/collisions. For example, the i-th particle of the first species is subject to two self-repulsive drifts: the former, due to the accumulated mass of particles with label \(<i\), points in the positive direction; the latter, due to the mass of particles with label \(>i\), points in the negative direction, regardless of possible superpositions. At the same time, the i-th particle is subject to cross-attractive drifts depending on the particles of the opposite species. Particles of the y-species located strictly to the left of \(x_i\) contribute a cross-attractive drift pointing in the negative direction, whereas particles of the y-species located strictly to the right of \(x_i\) cause \(x_i\) to move in the positive direction. Particles of the y-species whose location coincides with \(x_i\) do not contribute to the cross-interaction part of P.

Remark 7

Note that the minimal selection in the sub-differential in (58) is achieved as a consequence of the convexity of \({\mathcal {F}}\), cf. [13, 20], for instance. More precisely, the quoted result in [9] implies that Z admits a right derivative for every \(t\in [0,+\infty )\) and

$$\begin{aligned} -\frac{d^+Z}{dt}(t)=\partial ^0{\mathcal {F}}(Z(t)), \end{aligned}$$

for every \(t\in [0,+\infty )\). Here,

$$\begin{aligned} \partial ^0{\mathcal {F}}(Z)={{\,\mathrm{{\mathrm{argmin}}}\,}}\left\{ \Vert P\Vert \,:\,\, P\in \partial {\mathcal {F}}(Z)\right\} . \end{aligned}$$

Moreover, the function \(t\mapsto \partial ^0{\mathcal {F}}(Z(t))\) is right-continuous and the function \(t\mapsto \left\| \partial ^0{\mathcal {F}}(Z(t))\right\| \) is non-increasing.

4 Qualitative properties of the ODEs system

Having established a well-posedness theory for system (6), let us now focus on some important properties of its solutions. In particular, we are interested in the dynamics of collisions and in the support of the solution.

The main results of this section are obtained under the assumption that all particles have the same mass, i.e., 1/N. Similar results may be obtained for more general masses. We highlight this in Remark 8. Since the main goal of this work is the convergence of the particle approximation scheme, we shall henceforth focus on the case of equal masses.

4.1 Collisions between particles of different species

In this subsection we discuss collisions between any two particles \(x_i\) and \(y_j\) of opposite species, for \(i,j\in \{1,2,\ldots ,N\}\). Let us define

$$\begin{aligned} t_*=\inf \{t\ge 0:x_i(t)=y_j(t)\ \text{ for } \text{ some }\ 1 \le i,j\le N\}. \end{aligned}$$
(59)

In Sect. 4.5 we shall see that, indeed, \(t_*<+\infty \). Since all trajectories are continuous and the number of particles is finite, the above infimum is attained. Let \(i_0,j_0 \in \{1,2,\ldots ,N\}\) be such that \(x_{i_0}(t_*) = y_{j_0}(t_*)\). The following theorem covers all possible configurations of the colliding particles right after the collision time \(t_*\).

Theorem 2

Let \(t_*\) be a collision time defined in (59) and \(i_0,j_0\) be such that \(x_{i_0}(t_*) = y_{j_0}(t_*)\). Assume that all other particles occupy a different position at \(t=t_*\). Then there exists \(\epsilon >0\) such that \(\theta (t):=x_{i_0}(t) - y_{j_0}(t)\) satisfies

  1. \(\theta (t) = x_{i_0}(t) - y_{j_0}(t)> 0, \quad \hbox {if}\quad i_0 > j_0\),

  2. \(\theta (t) = x_{i_0}(t) - y_{j_0}(t)< 0, \quad \hbox {if}\quad i_0 < j_0\),

  3. \(\theta (t) = x_{i_0}(t) - y_{j_0}(t) = 0, \quad \hbox {if}\quad i_0 = j_0\),

for all \(t \in (t_*, t_* + \epsilon )\). Moreover, in case \(i_0=j_0\),

  • the two particles \(x_{i_0}\) and \(y_{i_0}\) remain attached for all \(t\ge t_*\),

  • the two particles \(x_{i_0}\) and \(y_{i_0}\) have zero velocity on \([t_*,t_*+\epsilon ]\).

Proof

At time \(t_*\) we consider a particle closest to \(x_{i_0}(t_*) = y_{j_0}(t_*)\), denoted by \(z_k\), where \(z_k = x_i\) for some \(i\ne i_0\), or \(z_k=y_j\) for some \(j\ne j_0\). Let us denote \(d_* := |z_k(t_*) - x_{i_0}(t_*)|\). Since no particle moves at a speed larger than 2, cf. (6), there exists \(\epsilon \) with \(0< \epsilon < d_*/4\) such that this distance remains strictly positive for all \(t\in [t_*, t_* + \epsilon ]\). In particular, \(x_{i_0}\) and \(y_{j_0}\) remain the only particles in the interval \([x_{i_0}(t_*) - d_*/2, x_{i_0}(t_*) + d_*/2]\) for any \(t \in [t_*, t_*+ \epsilon ]\). As a consequence, on \((t_*, t_*+ \epsilon )\) the function \(\theta \) is either strictly positive, strictly negative, or identically zero, since the velocities remain constant on this interval.

If \(\theta (t)\ne 0\), no superpositions occur for the whole time interval \((t_*,t_*+\epsilon )\). As the functional \({\mathcal {F}}\) is \(C^1\) on a configuration without superpositions, the sub-differential of \({\mathcal {F}}\) is single-valued and corresponds to the right-hand side of (6). Therefore, for all \(t_*^+\in (t_*,t_*+\epsilon )\) we have

$$\begin{aligned} {\left\{ \begin{array}{ll} \dot{x}_{i_0}(t_*^+)= \displaystyle \sum _{k\in \sigma ^-[x_{i_0}]}\frac{1}{N} - \sum _{k\in \sigma ^+[x_{i_0}]} \frac{1}{N} - \sum _{k\in \gamma ^-[x_{i_0}]} \frac{1}{N} + \sum _{k\in \gamma ^+[x_{i_0}]}\frac{1}{N},\\ \dot{y}_{j_0}(t_*^+)= \displaystyle \sum _{k\in \sigma ^-[y_{j_0}]}\frac{1}{N} - \sum _{k\in \sigma ^+[y_{j_0}]}\frac{1}{N} - \sum _{k\in \gamma ^-[y_{j_0}]}\frac{1}{N} + \sum _{k\in \gamma ^+[y_{j_0}]}\frac{1}{N}. \end{array}\right. } \end{aligned}$$
(60)

We now compare the velocities of the two particles \(x_{i_0}\) and \(y_{j_0}\) in the two configurations (1) and (2). In case \(\theta >0\) on \((t_*,t_*+\epsilon )\) we have, for some \(t_*^+\in (t_*,t_*+\epsilon )\),

$$\begin{aligned} \dot{x}_{i_0}(t_*^+)>\dot{y}_{j_0}(t_*^+), \end{aligned}$$
(61)

which is equivalent to

$$\begin{aligned}&\sum _{\{k:x_k<x_{i_0}\}}\frac{1}{N}-\sum _{\{k:x_k>x_{i_0}\}}\frac{1}{N}-\sum _{\{k:y_k<x_{i_0}\}}\frac{1}{N}+\sum _{\{k:y_k>x_{i_0}\}}\frac{1}{N}\\&\quad> \sum _{\{k:y_k<y_{j_0}\}}\frac{1}{N}-\sum _{\{k:y_k>y_{j_0}\}}\frac{1}{N}-\sum _{\{k:x_k<y_{j_0}\}}\frac{1}{N}+\sum _{\{k:x_k>y_{j_0}\}}\frac{1}{N}. \end{aligned}$$

After multiplying by N, the above condition reads

$$\begin{aligned} (i_0-1) - (N-i_0) - j_0 + (N-j_0) > (j_0-1) - (N-j_0) - (i_0-1) + (N -i_0+1), \end{aligned}$$

which, upon simplification, is equivalent to

$$\begin{aligned} i_0-j_0&>\frac{1}{2}, \end{aligned}$$

which, in turn, is equivalent to \(i_0>j_0\), due to the fact that \(i_0,j_0\in \{1,\ldots ,N\}\). A similar computation yields that in case \(\theta (t)<0\) on the interval \((t_*,t_*+\epsilon )\) then \(\dot{x}_{i_0}(t_*^+)<\dot{y}_{j_0}(t_*^+)\) at some time \(t_*^+\), and then

$$\begin{aligned}&i_0-1-(N-i_0)-(j_0-1)+N-j_0+1\\&<j_0-1-(N-j_0)-i_0+(N-i_0)\iff i_0<j_0. \end{aligned}$$

Clearly, in case \(i_0=j_0\) none of the two above situations is possible, and we must necessarily have that the two particles \(x_{i_0}\) and \(y_{j_0}\) overlap on the time interval \([t_*,t_*+\epsilon )\). In order to determine the speed of the two particles in this case, we observe that, in the particle configuration in which \(x_{i_0}=y_{i_0}\) and all other particles occupy different positions, the \(i_{0}\)-th component of the sub-differential element \(P=(p,q)\) found in Lemma 5 reads

$$\begin{aligned} p_{i_0}=q_{i_0}=(i_0-1)-(N-i_0) -(i_0-1)+(N-i_0) = 0, \end{aligned}$$

and by uniqueness of the gradient flow solution according to Definition 4 the two particles have zero velocity on the time interval \([t_*,t_*+\epsilon ]\) on which they do not collide with other particles. \(\square \)
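The sign rule above can be checked numerically. Below is a minimal Python sketch (the helper `velocity` and the sample configuration are ours, not from the paper) that evaluates the right-hand side of (6) in a configuration without superpositions and confirms that, just after a crossing with \(i_0>j_0\), the particle \(x_{i_0}\) is indeed faster than \(y_{j_0}\):

```python
def velocity(pos, same, other, N):
    """Right-hand side of system (6) for a particle at `pos`:
    same-species particles repel, opposite-species particles attract.
    A particle located exactly at `pos` (e.g. the particle itself)
    contributes 0, so `same` may safely contain the particle's own position."""
    v = sum(1 for s in same if s < pos) - sum(1 for s in same if s > pos)
    v -= sum(1 for o in other if o < pos)
    v += sum(1 for o in other if o > pos)
    return v / N

# N = 3; x_2 has just crossed y_1 (i0 = 2 > j0 = 1), so theta > 0
N = 3
x = [0.1, 0.505, 0.9]
y = [0.5, 0.6, 0.8]

vx2 = velocity(x[1], x, y, N)   # 1/3
vy1 = velocity(y[0], y, x, N)   # -1/3
print(vx2 > vy1)                # True: the two particles keep separating
```

This matches case (1) of Theorem 2: the gap \(\theta (t)=x_{i_0}(t)-y_{j_0}(t)\) stays positive right after the collision.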

Remark 8

In case of different masses, the inequality in (61) reads

$$\begin{aligned} M_{i_0-1}-(1-M_{i_0})-N_{j_0}+1-N_{j_0} > N_{j_0-1}-(1-N_{j_0})-M_{i_0-1}+(1-M_{i_0-1}), \end{aligned}$$
(62)

which gives the more general condition \(m_{i_0}>4N_{j_0-1}+3n_{j_0}-4M_{i_0-1}\) for \(x_{i_0}(t)>y_{j_0}(t)\). Here \(M_{i_0}=m_1+\cdots +m_{i_0}\) and \(N_{j_0}=n_1+\cdots +n_{j_0}\).
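The simplification from (62) to this condition can be sanity-checked numerically. The following sketch (with hypothetical random masses, normalised so that each species has total mass one) verifies that the two inequalities agree:

```python
import random

random.seed(0)
for _ in range(1000):
    N = random.randint(2, 8)
    m = [random.random() for _ in range(N)]; sm = sum(m); m = [v / sm for v in m]
    n = [random.random() for _ in range(N)]; sn = sum(n); n = [v / sn for v in n]
    i0 = random.randint(1, N)          # 1-based indices as in the paper
    j0 = random.randint(1, N)
    M = lambda r: sum(m[:r])           # M_r = m_1 + ... + m_r  (M_0 = 0)
    Nc = lambda r: sum(n[:r])          # N_r = n_1 + ... + n_r  (N_0 = 0)
    lhs = M(i0 - 1) - (1 - M(i0)) - Nc(j0) + 1 - Nc(j0)          # (62), left
    rhs = Nc(j0 - 1) - (1 - Nc(j0)) - M(i0 - 1) + (1 - M(i0 - 1))  # (62), right
    cond = m[i0 - 1] > 4 * Nc(j0 - 1) + 3 * n[j0 - 1] - 4 * M(i0 - 1)
    assert (lhs > rhs) == cond         # the two formulations coincide
```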

4.2 Collisions between particles of the same species

Theorem 2 covers all possible types of collisions between two particles of opposing species. This subsection is dedicated to investigating whether or not two particles of the same species can collide.

Theorem 3

Assume the particles \(x_1,\ldots ,x_N\) do not overlap initially and that \(m_1 = \dots = m_N = n_1 = \dots = n_N = 1/N\). Then, no two particles of the x-species ever overlap at any time \(t> 0\). The same statement holds for particles of the y-species.

Proof

Arguing by contradiction, let us assume there exists a time \(t_*\) such that \(x_i(t_*)=x_{i+1}(t_*)\). Without loss of generality we may choose such \(t_*\) as the first collision time for those two particles. Again without loss of generality, we assume there exists an \(\epsilon >0\) such that no other collisions involving either \(x_i\) or \(x_{i+1}\) occur on \((t_*-\epsilon ,t_*)\). Clearly, there exists \(t_*^- \in (t_* - \epsilon , t_*)\) such that

$$\begin{aligned} \dot{x}_{i+1}(t_*^-)<\dot{x}_i(t_*^-). \end{aligned}$$
(63)

We shall cover all the possible cases.

Case 1 \(x_i\) and \(x_{i+1}\) collide “without any particles of the opposite species strictly between them”. This case also covers the situation in which one or more particles of the y-species collide with \(x_i\) and \(x_{i+1}\) at \(t_*\) but none of them are set strictly between \(x_i\) and \(x_{i+1}\) on the above time interval. Hence, there exists an index j such that both \(x_i\) and \(x_{i+1}\) have exactly j y-particles on their left and \(N-j\) on their right on the same time interval \((t_*-\epsilon ,t_*)\). Hence, denoting by \(M_i=m_1+\cdots +m_i\) and \(N_j=n_1+\cdots +n_j\) for any \(i,j\in \{1,2,\ldots ,N\}\), inequality (63) implies

$$\begin{aligned} M_i-(1-M_{i+1})-N_j+1-N_j<M_{i-1}-(1-M_i) -N_j+1-N_j\iff m_i+m_{i+1}<0, \end{aligned}$$

which is clearly false.

Case 2 as in Case 1 but with more colliding particles, such that none of them is set “strictly between \(x_i\) and \(x_{i+1}\)”. This situation can be covered as in Case 1. All particles moving strictly outside the interval \([x_i,x_{i+1}]\) prior to the collision do not affect the computations in Case 1.

Case 3 one y-particle is set “strictly between \(x_i\) and \(x_{i+1}\)” before the collision. Assume now there is an index j such that \(x_i(t_*)=x_{i+1}(t_*)=y_j(t_*)\) and \(x_i(t)<y_j(t)< x_{i+1}(t)\) for all \(t\in (t_*-\epsilon ,t_*)\). In this case, there must be a time \(t_*^-\) such that (63) is satisfied since \(y_j\) slows down \(x_{i+1}\) and attracts \(x_i\). The explicit computation of the velocities yields in this case

$$\begin{aligned}&M_i-(1-M_{i+1})-N_j+1-N_j<M_{i-1}-(1-M_i)-N_{j-1}+1-N_{j-1}\\&\quad \iff m_i+m_{i+1}<2n_j. \end{aligned}$$

Since all particles have the same mass 1/N, the above reads \(2/N<2/N\), a contradiction.

Case 4 one y-particle is attached to either \(x_i\) or \(x_{i+1}\) before the collision time \(t_*\). Assume now that we are in the same situation as in Case 3 except that the particle \(y_j\) is attached to \(x_{i+1}\) on the time interval \((t_*-\epsilon ,t_*)\). From Theorem 2 we know that this is possible only if \(j=i+1\) and \({\dot{x}}_{i+1}={\dot{y}}_j = 0\) on \((t_* - \epsilon , t_*)\). On the other hand, with the notation of Case 1, \({\dot{x}}_i\) is explicitly computed on \(t\in (t_*-\epsilon ,t_*)\),

$$\begin{aligned} {\dot{x}}_i(t)=M_{i-1} - (1-M_i) -N_i+1-N_i, \end{aligned}$$

and since all particles have mass 1/N we deduce

$$\begin{aligned} {\dot{x}}_i(t)=-1/N, \end{aligned}$$

which clearly shows that \(x_i\) and \(x_{i+1}\) cannot collide at time \(t_*\) in this case. We remark that this also covers the situation in which one or more particles of the y-species are set strictly between \(x_i\) and \(x_{i+1}=y_j\) before time \(t_*\).

Case 5 more than one y-particle is set between \(x_i\) and \(x_{i+1}\). Assume now that \(x_i\) and \(x_{i+1}\) collide having two or more particles of the y-species, say \(y_{j},\ldots ,y_{j+k}\) with \(k\ge 1\), strictly between them in the time interval \((t_*-\epsilon ,t_*)\). In this case, at least two particles of the y-species collide without particles of the x-species strictly between them, which is impossible by Case 1 (with the roles of the two species reversed). \(\square \)

We can collect the information in Theorems 2 and 3 as follows.

Corollary 1

Assume the particles \(x_1,\ldots ,x_N\) do not overlap initially, and assume the same holds for the particles \(y_1,\ldots ,y_N\). Then, particles of the same species never collide for all times. Particles of opposite species can only meet in a binary collision. When that occurs, they behave according to the three cases stated in Theorem 2.

Remark 9

The results in Theorem 2 and Corollary 1 clearly show that there can be no bouncing in the particle system (6), i.e., a particle cannot reach another particle and then move back strictly to the side it came from after touching it. This is immediate in the case of particles of the same species, as they simply never collide. As for the case of particles of opposite species, assume \(x_i\) reaches \(y_j\) at time \(t_*\). If they touched each other and were then bounced back, their post-collisional velocities would be the same as their pre-collisional ones, hence, for instance, \({\dot{x}}_i(t)>{\dot{y}}_j(t)\) for \(t>t_*\), but this is in contradiction with \(x_i(t)<y_j(t)\) for \(t>t_*\), recalling that \(x_i(t_*)=y_j(t_*)\).

4.3 Initial overlapping

The results in the previous two subsections are relevant in case of no initial overlapping of particles. In this subsection we analyse the situation of an initial “cluster” involving particles of both species. We prove the following result.

Lemma 6

Assume there exist non-negative integers \(0\le h,k,n,m\le N\) with \(h<k<n<m\) and some \(\lambda \in {\mathbb {R}}\) such that

$$\begin{aligned}&x_{i}(0)=\lambda , \qquad \hbox {for all}\, i=h+1,\ldots ,n,\\&y_{j}(0)=\lambda , \qquad \hbox {for all}\, j=k+1,\ldots ,m, \end{aligned}$$

and assume no other particles of the x or y species occupy the position \(\lambda \) at time \(t=0\). Assume further that no particles other than \(x_i\) with \(i=h+1,\ldots ,n\) and \(y_j\) with \(j=k+1,\ldots ,m\) overlap at \(t=0\). Then, for \(t>0\) and prior to the next collision, the following holds:

$$\begin{aligned}&x_{h+1}(t)<\ldots<x_{k}(t)<\lambda ,\nonumber \\&x_{k+1}(t)\equiv \ldots \equiv x_{n}(t)\equiv \lambda ,\nonumber \\&y_{k+1}(t)\equiv \ldots \equiv y_{n}(t)\equiv \lambda ,\nonumber \\&\lambda<y_{n+1}(t)<\ldots <y_{m}(t). \end{aligned}$$
(64)

Proof

We prove the assertion by providing an explicit particle trajectory

$$\begin{aligned} Z(t)=(x(t),y(t))=(x_1(t),\ldots ,x_N(t),y_1(t),\ldots ,y_N(t)), \end{aligned}$$

satisfying (64) for all \(t\in [0,\epsilon ]\) for a suitably small \(\epsilon \). To perform this task, we will prove that our chosen particle trajectory satisfies the sub-differential inclusion (58) for all \(t\in (0,\epsilon )\) for a suitably small \(\epsilon >0\). The result then follows by uniqueness, see Theorem 1. For simplicity we adopt the notation

$$\begin{aligned} x_i(0)={\bar{x}}_i\,,\qquad y_j(0)={\bar{y}}_j, \end{aligned}$$

for all \(i,j=1,\ldots ,N\). Moreover, we use the notation

$$\begin{aligned} I:=\{1,\ldots ,N\}\,,\qquad J:=\{k+1,\ldots ,n\}. \end{aligned}$$

By assumption, particles \(x_i\) and \(y_j\) with \(i=1,\ldots ,h,n+1,\ldots ,N\) and \(j=1,\ldots ,k,m+1,\ldots ,N\) occupy distinct positions at \(t=0\), therefore we verify that they move as follows, for \(t\in [0,\epsilon )\) and \(\epsilon >0\) small enough such that no collisions arise in \((0,\epsilon )\):

$$\begin{aligned}&x_i(t) ={\bar{x}}_i+\frac{1}{N}[i-1 - (N-i)-\#\{j\in I\,:\,\, {\bar{y}}_j<{\bar{x}}_i\} + \#\{j\in I\,:\,\,{\bar{y}}_j>{\bar{x}}_i\}]t,\\&y_j(t)={\bar{y}}_j + \frac{1}{N}[j-1-(N-j) - \#\{i\in I\,:\,\, {\bar{x}}_i<{\bar{y}}_j\} + \#\{i\in I\,:\,\,{\bar{x}}_i>{\bar{y}}_j\}]t. \end{aligned}$$

Moreover, we set

$$\begin{aligned}&x_i(t)=\lambda -\frac{1}{N}(2(k-i)+1)t\,,\qquad \hbox {for}\, i=h+1,\ldots ,k,\\&y_j(t)=\lambda +\frac{1}{N}(2(j-n)-1)t\,,\qquad \hbox {for}\, j=n+1,\ldots ,m,\\&x_i(t)\equiv \lambda \,,\qquad \hbox {for}\, i=k+1,\ldots ,n,\\&y_j(t)\equiv \lambda \,,\qquad \hbox {for}\, j=k+1,\ldots ,n, \end{aligned}$$

for \(t\in [0,\epsilon )\) for \(\epsilon >0\) small enough such that \(x_{h+1}\) does not collide with \(x_h\) and \(y_{m}\) does not collide with \(y_{m+1}\), which is guaranteed by the fact that \(|{\bar{y}}_{m+1}-{\bar{y}}_m|>0\) and \(|{\bar{x}}_{h+1}-{\bar{x}}_h|>0\). For a fixed \(t\in (0,\epsilon )\) we prove that the vector

$$\begin{aligned}&-{\dot{Z}}(t)=-({\dot{x}}(t),{\dot{y}}(t))=-(p,q)\\&\quad p=(p_i)_{i=1}^N\,,\qquad q= (q_j)_{j=1}^N\\&\quad p_i={\dot{x}}_i(t)\,,\qquad q_j={\dot{y}}_j(t) \end{aligned}$$

belongs to \(\partial {\mathcal {F}}(Z(t))\). For simplicity we denote \(x=x(t)\) and \(y=y(t)\). By means of a direct computation we obtain

$$\begin{aligned}&{\mathcal {F}}(x',y')-{\mathcal {F}}(x,y)\\&\ = \frac{1}{N^2}\sum _{i\in I\setminus J} (x_i'-x_i)[-(i-1)+N-i]+\frac{1}{N^2}\sum _{j\in I\setminus J}(y_j'-y_j)[-(j-1)+N-j]\\&\quad +\frac{1}{N^2}\sum _{i\in J}(x_i'-x_i)[-k+(N-n)]\\&\quad +\frac{1}{N^2}\sum _{i\in J}x_i'[-\#\{h\in J\,:\,\,x_h'<x_i'\}+\#\{h\in J\,:\,\,x_h'>x_i'\}]\\&\quad -\frac{1}{N^2}\sum _{i\in J}x_i\underbrace{[-\#\{h\in J\,:\,\,x_h<x_i\}+\#\{h\in J\,:\,\,x_h>x_i\}]}_{=0}\\&\quad +\frac{1}{N^2}\sum _{j\in J}(y_j'-y_j)[-k+(N-n)]\\&\quad +\frac{1}{N^2}\sum _{j\in J}y_j'[-\#\{k\in J\,:\,\,y_k'<y_j'\}+\#\{k\in J\,:\,\,y_k'>y_j'\}]\\&\quad -\frac{1}{N^2}\sum _{j\in J}y_j\underbrace{[-\#\{k\in J\,:\,\,y_k<y_j\}+\#\{k\in J\,:\,\,y_k>y_j\}]}_{=0}\\&\quad + \frac{1}{N^2}\sum _{i\in I\setminus J}(x_i'-x_i)[\#\{k\in I\,:\,\,y_k<x_i\}-\#\{k\in I\,:\,\,y_k>x_i\}]\\&\quad + \frac{1}{N^2}\sum _{i\in J}(x_i'-x_i)[k-(N-n)]\\&\quad +\frac{1}{N^2}\sum _{i\in J}x_i'[\#\{k\in J\,:\,\,y_k'<x_i'\}-\#\{k\in J\,:\,\,y_k'>x_i'\}]\\&\quad -\frac{1}{N^2}\sum _{i\in J}x_i\underbrace{[\#\{k\in J\,:\,\,y_k<x_i\}-\#\{k\in J\,:\,\,y_k>x_i\}]}_{=0}\\&\quad +\frac{1}{N^2}\sum _{j\in I\setminus J}(y_j'-y_j)[\#\{h\in I\,:\,\,x_h<y_j\}-\#\{h\in I\,:\,\,x_h>y_j\}]\\&\quad +\frac{1}{N^2}\sum _{j\in J}(y_j'-y_j)[k-(N-n)]\\&\quad +\frac{1}{N^2}\sum _{j\in J}y_j'[\#\{h\in J\,:\,\,x_h'<y_j'\}-\#\{h\in J\,:\,\,x_h'>y_j'\}]\\&\quad -\frac{1}{N^2}\sum _{j\in J}y_j\underbrace{[\#\{h\in J\,:\,\,x_h<y_j\}-\#\{h\in J\,:\,\,x_h>y_j\}]}_{=0}. \end{aligned}$$

Combining all the terms, and recalling that for small t

$$\begin{aligned}&\#\{k \in I\,:\,\, y_k<x_i\}-\#\{k\in I\,:\,\,y_k>x_i\} = 2k-N\qquad \hbox {for}\, i\in \{h+1,\ldots ,k\}\\&\#\{h \in I\,:\,\, x_h<y_j\}-\#\{h\in I\,:\,\,x_h>y_j\} = 2n-N\qquad \hbox {for}\, j\in \{n+1,\ldots ,m\} \end{aligned}$$

we obtain

$$\begin{aligned}&{\mathcal {F}}(x',y')-{\mathcal {F}}(x,y) \\&\quad =-\frac{1}{N^2}\sum _{i\in \{1,\ldots ,h,n+1,\ldots ,N\}}(x_i'-x_i)[i-1-(N-i)-\#\{j\in I\,:\,\,y_j<x_i\}\\&\qquad +\#\{j\in I\,:\,\,y_j>x_i\}]\\&\qquad - \frac{1}{N^2}\sum _{j\in \{1,\ldots ,k,m+1,\ldots ,N\}}(y_j'-y_j)[j-1-(N-j)-\#\{i\in I\,:\,\,x_i<y_j\}\\&\qquad +\#\{i\in I\,:\,\, x_i>y_j\}]\\&\qquad + \frac{1}{N^2}\sum _{i=h+1}^k(x_i'-x_i)[2(k-i)+1]+ \frac{1}{N^2}\sum _{j=n+1}^m(y_j'-y_j)[2(n-j)+1]\\&\qquad - \frac{1}{N^2}\sum \sum _{i,h\in J\,:\,\,x_h'<x_i'} x_i' +\frac{1}{N^2} \sum \sum _{i,h\in J\,:\,\,x_h'>x_i'} x_i'-\frac{1}{N^2}\sum \sum _{j,k\in J\,:\,\,y_k'<y_j'}y_j'\\&\qquad +\frac{1}{N^2} \sum \sum _{j,k\in J\,:\,\,y_k'> y_j'}y_j'\\&\qquad + \frac{1}{N^2}\sum \sum _{i,k\in J\,:\,\,y_k'<x_i'} x_i' - \frac{1}{N^2}\sum \sum _{i,k\in J\,:\,\,y_k'>x_i'} x_i' + \frac{1}{N^2}\sum \sum _{j,h\in J\,:\,\,x_h'<y_j'} y_j' - \frac{1}{N^2}\sum \sum _{j,h\in J\,:\,\,x_h'>y_j'} y_j'. \end{aligned}$$

We now observe that the last eight terms above can be rewritten as

$$\begin{aligned} -\frac{1}{2N^2}\sum _{i\in J}\sum _{h\in J} |x_i'-x_h'|-\frac{1}{2N^2}\sum _{j\in J}\sum _{k\in J} |y_j'-y_k'| + \frac{1}{N^2}\sum _{i\in J}\sum _{j\in J} |x'_i-y_j'|, \end{aligned}$$

which equals the functional \({\mathcal {F}}\) computed on the particle configuration \((x_i')_{i\in J},\ (y_j')_{j\in J}\). Due to Lemma 1, these terms amount to a non-negative quantity. This proves the assertion. \(\square \)

Theorem 4

Assume there exist non-negative integers \(i,j,h,k\) such that

$$\begin{aligned} x_i(0)=x_{i+1}(0)=\ldots =x_{h-1}(0)=x_{h}(0)= y_j(0)=y_{j+1}(0)=\ldots =y_{k-1}(0)=y_k(0), \end{aligned}$$

and no other particles occupy the same position. Then,

  • all particles (of both species) having indices in the set \(\{i,\ldots ,h\}\cap \{j,\ldots ,k\}\) remain in the same position for all \(t\ge 0\).

Moreover, on some time interval \([0,\epsilon ]\) for a suitably small \(\epsilon >0\),

  • if \(i<j\), particles \(x_i,\ldots ,x_{j-1}\) detach from \(x_j\) moving to the left, and are strictly ordered;

  • if \(i>j\), particles \(y_j,\ldots ,y_{i-1}\) detach from \(y_i\) moving to the left, and are strictly ordered;

  • if \(h>k\), particles \(x_{k+1},\ldots ,x_h\) detach from \(x_k\) moving to the right, and are strictly ordered;

  • if \(k>h\), particles \(y_{h+1},\ldots ,y_k\) detach from \(y_h\) moving to the right, and are strictly ordered.

Proof

Assume first there is only one group of particles occupying the same position initially as in the hypothesis of the theorem. The case in which \(k\ge h\) and \(i\le j\) is covered in Lemma 6. The symmetric case \(k\le h\) and \(i\ge j\) is analogous. The case in which either

  • \(i<j\) and \(h>k\)

or

  • \(i>j\) and \(h<k\)

can be proven similarly. We omit the details.

The general case in which there is more than one cluster of overlapping particles follows by similar computations as in Lemma 6 in case there are “excess” particles of one species on one side and “excess” particles of the opposite species on the other side, and by similar considerations as the one above in this proof for the general case. The computations are quite long and tedious but do not include any additional technical difficulty to the one in Lemma 6 and are therefore left to the reader. \(\square \)

As a consequence of the result in Theorem 4, we are able to describe in full detail the short-time solution of the particle system (6) in case of initial overlapping of particles:

  1. (1)

    If particles of the same species, for instance \(x_i,\ldots ,x_{i+h}\), occupy the same position initially and no particles of the other species are in the same initial position, then the particles “scatter apart”: they immediately detach and move apart, with \(x_i(t)<\ldots <x_{i+h}(t)\). The same situation occurs in the one-species case, see [8].

  2. (2)

    If particles of the two species occupy the same position initially, their behaviour depends on the cumulative mass of each species at that position. More precisely, particles of the two species featuring the same index are stationary for all times and remain attached to the initial cluster. The remaining particles move away from the cluster and “disperse” towards the particles of the opposite species with the same indices.

  3. (3)

    As a special case of case (2), let us highlight that, if no particles have the same index, no stationary cluster is formed and all particles diffuse.

  4. (4)

    In case (2), the whole particle system is split into two independent particle sub-systems separated by the initial cluster. Each of the two sub-systems is only subject to the interaction energy \({\mathcal {F}}\) restricted to it. The two sub-systems are totally decoupled.

4.4 Support of the solution

The solution to system (6) is a pair \((x(t),y(t))\in {\mathcal {C}}^N\times {\mathcal {C}}^N\), i.e., 2N particles of two opposing species distributed on the real line such that

$$\begin{aligned} x_1(t)\le x_2(t)\le \cdots \le x_N(t), \qquad \text{ and } \qquad y_1(t)\le y_2(t)\le \cdots \le y_N(t), \end{aligned}$$

for \(t\ge 0\). Hence, we may consider the support of the solution as the time-dependent interval [a(t), b(t)], where

$$\begin{aligned} a(t)=\min \{x_1(t), y_1(t)\}, \qquad \text{ and } \qquad b(t)=\max \{x_N(t), y_N(t)\}, \end{aligned}$$

for \(t\ge 0\). An interesting property concerning the support is that it is determined by the initial datum in the following way.

Proposition 1

Let [a(t), b(t)] be the support of the solution (x(t), y(t)) to system (6) in the sense of Definition 4 with an initial datum \((x_0,y_0)\in {\mathcal {C}}^N\times {\mathcal {C}}^N\). Then \([a(t),b(t)]\subseteq [a(0),b(0)]\).

Proof

Let us assume without loss of generality \(a(0)=x_1(0)\) and \(b(0)=y_N(0)\). Assume first that \(x_1\) is the only particle occupying the position a(0) at \(t=0\). In this case it is easily seen that \({\dot{x}}_1(0)=-1+1/N+1=1/N> 0\), by Eq. (6), and the particle moves to the right. Assume now that a cluster of particles \(x_1,\ldots ,x_h\) and \(y_1,\ldots ,y_k\) occupies the position a(0) at time \(t=0\). The result in Theorem 4 implies that particles of both species with indices in \(\{1,\ldots ,\min \{h,k\}\}\) remain at a(0) for all times, whereas the remaining particles move towards the positive direction. In particular \({\dot{x}}_1=0\).

Hence, in any case, \({\dot{x}}_1(t)\ge 0\) until \(x_1\) collides with another particle. Similarly, one can prove that \({\dot{y}}_N\le 0\) until \(y_N\) collides with another particle. When the first collision occurs, we have a cluster of particles and we can re-apply Theorem 4 and conclude that \(x_1\) will stop moving for all times. An analogous statement holds for \(y_N\). The assertion is therefore proven. \(\square \)

The uniform bound for the support of the particle system proven above has an important repercussion on the sub-differential of the functional \({\mathcal {F}}\):

Proposition 2

Let \(T\ge 0\) be fixed and let \(Z(\cdot )=(x(\cdot ),y(\cdot ))\) be the unique gradient flow solution to (6) according to Definition 4. Then, there exists a constant \(C\ge 0\) independent of N and only depending on the diameter of the initial support of the particles such that

$$\begin{aligned} \sup \left\{ \Vert P\Vert _w\,:\,\, P=(p,q)\in \partial {\mathcal {F}}(Z(t))\,,\,\, t\in [0,T]\right\} \le C. \end{aligned}$$

Proof

From Definition 3 with \(Z'=Z(t)+P\) we obtain

$$\begin{aligned} \langle P,P\rangle _w \le {\mathcal {F}}(Z(t)+P)-{\mathcal {F}}(Z(t)). \end{aligned}$$

Now, a simple triangle inequality implies

$$\begin{aligned} {\mathcal {F}}(Z(t)+P)\le & {} \frac{1}{N^2}\sum _{i,j}|x_i(t)+p_i-y_j(t)-q_j|\\\le & {} \frac{1}{N^2}\sum _{i,j}|x_i(t)-y_j(t)| + \frac{1}{N^2}\sum _{i,j}(|p_i| + |q_j|) \end{aligned}$$

and due to Proposition 1 and Young's inequality we get

$$\begin{aligned} {\mathcal {F}}(Z(t)+P) \le C_1 + \frac{\Vert P\Vert _w^2}{2} \end{aligned}$$

for some \(C_1\ge 0\) only depending on the diameter of the initial support of the particles. By a similar estimate we get \(-{\mathcal {F}}(Z(t))\le C_2\), where \(C_2\ge 0\) once again only depends on the diameter of the initial support of the particles. Therefore, we get

$$\begin{aligned} \Vert P\Vert _w^2 \le C_1+C_2 + \frac{\Vert P\Vert _w^2}{2}, \end{aligned}$$

and the assertion follows. \(\square \)

4.5 Number of collisions and long time behaviour

The results in Theorem 4 emphasise that some initial conditions may imply the formation of “mixed clusters”, i.e., groups of particles of both species that are stationary in time and split the whole set of particles into groups that move independently. As these groups can be considered as separate gradient flows of the same energy functional (up to rescaling the mass), in order to understand the long-time behaviour of our system we can assume without loss of generality that no “mixed clusters” are formed.

No such mixed clusters form immediately after time \(t=0\) precisely in the following three cases:

  1. (1)

    There is no superposition of particles initially;

  2. (2)

    The only superposition consists of particles of the same species;

  3. (3)

    There is an initial superposition of particles of mixed species but no particles of two opposite species and same index occupy the same position.

Therefore, without loss of generality we shall assume we are in one of the above three situations.

The first thing we emphasise here is that the velocity of each particle is constant between two consecutive collisions, and we know that collisions cannot occur among particles of the same species, see Theorem 3. Hence, only collisions with the opposite species can occur, according to Theorem 2. Clearly, velocities change after a collision only in case of crossings, and Remark 9 shows that crossings are only possible after a collision. Now, the result in Theorem 2 shows that two particles \(x_i\) and \(y_j\) collide and cross if and only if \(i>j\). After the collision, \(x_i\) slows down by 2/N, as it has one more particle of the y-species attracting it from the left, and one less particle of the y-species attracting it from the right. Nothing changes with respect to the interaction with particles of the x-species.

A simple computation allows to compare the velocities of two particles \(x_h\) and \(y_k\) of opposite species even when they are far apart, showing that \({\dot{x}}_h(t)-{\dot{y}}_k(t)\) is always positive when \(h>k\), provided \(x_h(t)<y_k(t)\). Indeed, we have the following bounds for the respective velocities:

$$\begin{aligned} \dot{x}_h(t)&\ge \frac{1}{N}\left[ h-1-(N-h)-(k-1)+(N-k+1)\right] =\frac{2h-2k+1}{N},\\ \dot{y}_k(t)&\le \frac{1}{N}\left[ k-1 - (N-k) - h+ (N-h)\right] = \frac{2k -2h-1}{N}. \end{aligned}$$

This easily implies

$$\begin{aligned} \dot{x}_h(t)-\dot{y}_k(t)\ge \frac{4h - 4k +2}{N}\ge 0 \iff h\ge k, \end{aligned}$$

since \(h,k\in {\mathbb {N}}\). Hence, having crossed \(y_j\), the particle \(x_i\) may continue colliding with the next particle of the y-species, namely \(y_{j+1}\), provided i is also larger than \(j+1\), and so on, for a certain number of times, n, until \(i=j+n\). Then the two particles \(x_i\) and \(y_i\) collide and stick together for all times.
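The velocity gap can be verified on random admissible configurations. The following sketch (the helper `vel` is ours) checks the bound \(\dot{x}_h-\dot{y}_k\ge (4h-4k+2)/N\), the factor 1/N accounting for the particle masses, whenever \(x_h<y_k\):

```python
import random

def vel(pos, same, other, N):
    # N particles per species, mass 1/N each: same-species repulsion,
    # cross-species attraction; a particle coinciding with `pos` contributes 0
    return (sum(1 for s in same if s < pos) - sum(1 for s in same if s > pos)
            - sum(1 for o in other if o < pos)
            + sum(1 for o in other if o > pos)) / N

random.seed(1)
N = 6
x = sorted(random.random() for _ in range(N))
y = sorted(random.random() for _ in range(N))
for h in range(1, N + 1):            # 1-based indices as in the paper
    for k in range(1, N + 1):
        if x[h - 1] < y[k - 1]:
            gap = vel(x[h - 1], x, y, N) - vel(y[k - 1], y, x, N)
            assert gap >= (4 * (h - k) + 2) / N - 1e-12
```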

As a consequence, we have an explicit control on the total number of collisions involving a given particle. More precisely, every particle with index i can collide with at most \(N_i=i\) particles, this number being reached, for example, if \(x_i\) has \(y_1,\ldots ,y_i\) on its right. Hence, the maximum possible number of collisions is

$$\begin{aligned} N_{\mathrm {max}} = 2\sum _{i=1}^N i = N (N+1). \end{aligned}$$

Since the minimum relative velocity between two consecutive particles that will collide is 2/N, the largest possible time between two consecutive collisions is of order

$$\begin{aligned} \varDelta _{\mathrm {max}}\sim \frac{N}{2}. \end{aligned}$$

Hence, one expects that particles will reach the “stationary solution”

$$\begin{aligned} x_i=y_i\,,\qquad \hbox {for all}\, i=1,\ldots ,N, \end{aligned}$$

by a time of order \(N^3\) at the latest.
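Indeed, combining the collision count with the maximal inter-collision time gives the heuristic (not sharp) estimate, where the symbol \(T_{\mathrm {stat}}\) for the time to reach the stationary configuration is ours:

$$\begin{aligned} T_{\mathrm {stat}} \lesssim N_{\mathrm {max}}\cdot \varDelta _{\mathrm {max}} \sim N(N+1)\cdot \frac{N}{2} = O(N^3). \end{aligned}$$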

5 The “continuum” gradient flow as a many-particle limit

In this section we deal with the rigorous derivation of system (5) as a many-particle limit of a system of the form (6).

We will pursue this task for general probability measures as initial conditions. In order to single out the mathematical difficulties arising from the case of singular measures as initial conditions, we will start for simplicity with the case of an absolutely continuous initial condition \(\rho _0, \eta _0 \in {{\mathcal {P}}_2^a({\mathbb {R}})}\) with compact support. We recall that the support of a measure \(\mu \in {{{\mathcal {P}}_2({\mathbb {R}})}}\) is the closed set

$$\begin{aligned} \text {supp}(\mu )=\{x\in {\mathbb {R}}\ |\ \mu (B_r(x))>0,\ \forall r>0\}. \end{aligned}$$

Following a standard atomisation strategy, we discretise the initial datum by splitting the total mass into equal parts as follows. Let \([{\bar{x}}_{min},{\bar{x}}_{max}]\) be the convex hull of the support of \(\rho _0\) and let \([{\bar{y}}_{min},{\bar{y}}_{max}]\) be the convex hull of the support of \(\eta _0\). Fixing \(N\in {\mathbb {N}}\) large enough, we split the non-negative subgraph of \(\rho _0\) into N regions of measure \(\frac{1}{N}\) as follows:

$$\begin{aligned}&{\bar{x}}_0={\bar{x}}_{min}, \end{aligned}$$
(65a)
$$\begin{aligned}&{\bar{x}}_i:=\sup \left\{ x\in {\mathbb {R}}: \int _{{\bar{x}}_{i-1}}^xd\rho _0(y)<\frac{1}{N}\right\} , \qquad i=1,\ldots ,N. \end{aligned}$$
(65b)

Clearly, we have \({\bar{x}}_N={\bar{x}}_{max}\). Repeating the same procedure for the non-negative subgraph of \(\eta _0\) we obtain \({\bar{y}}_0,\ldots ,{\bar{y}}_N\). Then, we solve system (6), for \(i=1,\ldots ,N\), with \(({\bar{x}}_1,\ldots ,{\bar{x}}_N,{\bar{y}}_1,\ldots ,{\bar{y}}_N)\) as initial condition. The choice of discarding the two particles labelled by \(i=0\) is dictated by the need of having exactly N particles, each with mass 1/N.
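For an absolutely continuous \(\rho _0\), the points in (65b) are just the \(i/N\)-quantiles of \(\rho _0\). A minimal Python sketch (the helper `atomise` is ours) computes them by bisection on the cumulative distribution function:

```python
def atomise(cdf, lo, hi, N, tol=1e-12):
    """Return xbar_1 <= ... <= xbar_N with cdf(xbar_i) = i/N, i.e. the
    equal-mass splitting (65a)-(65b) of a density supported in [lo, hi]."""
    def quantile(q):
        a, b = lo, hi
        while b - a > tol:          # bisection: cdf is nondecreasing
            mid = 0.5 * (a + b)
            if cdf(mid) < q:
                a = mid
            else:
                b = mid
        return 0.5 * (a + b)
    return [quantile(i / N) for i in range(1, N + 1)]

# uniform rho_0 on [0, 1]: the atoms sit at 1/N, 2/N, ..., 1
uniform_cdf = lambda x: min(max(x, 0.0), 1.0)
print(atomise(uniform_cdf, 0.0, 1.0, 4))   # ≈ [0.25, 0.5, 0.75, 1.0]
```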

In view of the results shown in Sect. 3, we know there exists a unique solution \(Z(t)=(x(t),y(t))\in {\mathcal {C}}^{N}\times {\mathcal {C}}^{N}\) in the sense of Definition 4 with support contained in the interval [a(0), b(0)] for all times, with \(a(0)=\min \{{\bar{x}}_0,{\bar{y}}_0\}\) and \(b(0)=\max \{{\bar{x}}_N,{\bar{y}}_N\}\). Now, we consider the piecewise constant densities

$$\begin{aligned}&{\tilde{\rho }}^N(t,x):=\sum _{i=0}^{N-1}d_i^1(t)\chi _{[x_i(t),x_{i+1}(t))}(x), \qquad {\tilde{\eta }}^N(t,x):=\sum _{i=0}^{N-1}d_i^2(t)\chi _{[y_i(t),y_{i+1}(t))}(x), \end{aligned}$$
(66)

where

$$\begin{aligned} d_i^1(t)=\frac{1}{N(x_{i+1}(t)-x_i(t))}, \qquad \text {and}\qquad d_i^2(t)=\frac{1}{N(y_{i+1}(t)-y_i(t))} \end{aligned}$$
(67)

are discrete Lagrangian versions of the densities. Note that \(d_i^1\) and \(d_i^2\) are well-defined since particles of the same species cannot collide, as proven in Sect. 4, Theorem 3. Moreover, we also consider the empirical measures

$$\begin{aligned} \rho ^N(t)=\frac{1}{N}\sum _{i=1}^{N}\delta _{x_i(t)}, \quad \text {and} \quad \eta ^N(t)=\frac{1}{N}\sum _{j=1}^{N}\delta _{y_j(t)}. \end{aligned}$$
(68)

We notice that \(\rho ^N, \eta ^N, {\tilde{\rho }}^N, {\tilde{\eta }}^N\) are probability measures with compact support. Moreover, \(({\tilde{\rho }}^N(t),{\tilde{\eta }}^N(t))\) belong to \({{\mathcal {P}}_2^a({\mathbb {R}})}\times {{\mathcal {P}}_2^a({\mathbb {R}})}\).
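The construction (66)–(67) is easily reproduced numerically; the sketch below (the helper name is ours) builds the heights \(d_i^1\) from ordered particle positions and confirms that each cell carries mass exactly \(1/N\):

```python
def lagrangian_density(particles):
    """Heights d_i = 1/(N (x_{i+1} - x_i)) of the piecewise constant
    density (66)-(67), given N+1 ordered, distinct positions x_0,...,x_N."""
    N = len(particles) - 1
    return [1.0 / (N * (particles[i + 1] - particles[i])) for i in range(N)]

x = [0.0, 0.5, 1.5, 2.0]                       # N = 3 cells
d = lagrangian_density(x)                      # [2/3, 1/3, 2/3]
masses = [d[i] * (x[i + 1] - x[i]) for i in range(3)]
print(masses)                                  # each cell has mass 1/3
```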

Both \((\rho ^N,\eta ^N)\) and \(({\tilde{\rho }}^N,{\tilde{\eta }}^N)\) are useful representations of the particle system \(x_1,\ldots ,x_N,y_1,\ldots ,y_N\) for large N. In fact, one can prove that these two sequences in \({{{\mathcal {P}}_2({\mathbb {R}})}}\times {{{\mathcal {P}}_2({\mathbb {R}})}}\) converge, up to a subsequence, to the same limit in the p-Wasserstein distance for all \(p\in [1,+\infty )\).

Lemma 7

Let \(p\in [1,+\infty )\). There exists a sequence \((N_k)_k\subset {\mathbb {N}}\) and an absolutely continuous curve

$$\begin{aligned} (\rho (\cdot ),\eta (\cdot ))\in AC([0,T]\,;\,\,{\mathcal {P}}_p({\mathbb {R}})\times {\mathcal {P}}_p({\mathbb {R}})) \end{aligned}$$

such that

$$\begin{aligned}&(\rho ^{N_k},\eta ^{N_k})\rightarrow (\rho ,\eta ) \qquad \hbox {in}\ \ C([0,T]\,;\,\,{\mathcal {P}}_p({\mathbb {R}})\times {\mathcal {P}}_p({\mathbb {R}}))\\&\quad ({\tilde{\rho }}^{N_k},{\tilde{\eta }}^{N_k})\rightarrow (\rho ,\eta ) \qquad \hbox {in}\ \ C([0,T]\,;\,\,{\mathcal {P}}_p({\mathbb {R}})\times {\mathcal {P}}_p({\mathbb {R}})) \end{aligned}$$

as \(k\rightarrow +\infty \).

Proof

Let us fix \(T\ge 0\). The results in Proposition 1 and Proposition 2 imply that

$$\begin{aligned} \Vert Z(\cdot )\Vert _{L^\infty ([0,T]\,;\,{\mathbb {R}}^N\times {\mathbb {R}}^N)}+\Vert {\dot{Z}}(\cdot )\Vert _{L^\infty ([0,T]\,;\,{\mathbb {R}}^N\times {\mathbb {R}}^N)}\le C, \end{aligned}$$

for some \(C>0\) only depending on the initial support. The estimate on \(\Vert Z(t)\Vert _\infty \) implies that all q-moments of \(\rho ^N\) and \(\eta ^N\) are uniformly bounded with respect to N, uniformly on \(t\in [0,T]\), for \(q\in [1, \infty )\).

Therefore we may infer that both \(\rho ^N\) and \(\eta ^N\) are contained in a pre-compact subset of \({\mathcal {P}}_p({\mathbb {R}})\), for all times \(t\in [0,T]\), by Prokhorov's theorem and the uniform bounds on the q-moments, with \(q>p\). Now, for \(0\le s<t\le T\), we set \(\pi _{s,t}^N\in {\mathcal {P}}({\mathbb {R}}\times {\mathbb {R}})\) as

$$\begin{aligned} \pi _{s,t}^N(x,y):=\frac{1}{N}\sum _{i=1}^N \delta _{x_i(s)}(x)\otimes \delta _{x_i(t)}(y). \end{aligned}$$

It is easily seen that \(\pi _{s,t}^N\) has marginal measures \(\rho ^N(s)\) in the x-variable and \(\rho ^N(t)\) in the y-variable respectively. Hence,

$$\begin{aligned} {\mathcal {W}}_p(\rho ^N(s),\rho ^N(t))^p&\le \iint _{{\mathbb {R}}\times {\mathbb {R}}}|x-y|^p d\pi ^N_{s,t}(x,y)\\&=\frac{1}{N}\sum _{i=1}^N\iint _{{\mathbb {R}}\times {\mathbb {R}}}|x-y|^p d\delta _{x_i(s)}(x) d\delta _{x_i(t)}(y)\\&= \frac{1}{N}\sum _{i=1}^N |x_i(s)-x_i(t)|^p = \frac{1}{N}\sum _{i=1}^N\left| \int _s^t {\dot{x}}_i(\tau ) d\tau \right| ^p, \end{aligned}$$

and the above estimate on \(\Vert {\dot{Z}}\Vert \) implies

$$\begin{aligned} {\mathcal {W}}_p(\rho ^N(s),\rho ^N(t))^p\le \frac{C}{N}\sum _{i=1}^N |t-s|^p = C|t-s|^p, \end{aligned}$$

for some constant \(C>0\) that is independent of N. The latter estimate implies equi-continuity of the sequence \(\{\rho ^N\,:\,\,N\in {\mathbb {N}}\}\) in \(C([0,T]\,;\,{\mathcal {P}}_p({\mathbb {R}}))\), and clearly an analogous statement holds for \(\eta ^N\). Hence, the Arzelà-Ascoli theorem implies the existence of a subsequence \((\rho ^{N_k},\eta ^{N_k})\), \(k\in {\mathbb {N}}\), such that

$$\begin{aligned} (\rho ^{N_k},\eta ^{N_k})\rightarrow (\rho ,\eta )\qquad \hbox {in}\, C([0,T]\,;\,{\mathcal {P}}_p({\mathbb {R}})\times {\mathcal {P}}_p({\mathbb {R}})), \end{aligned}$$

as \(k\rightarrow +\infty \), for some \((\rho ,\eta )\in C([0,T]\,;\,{\mathcal {P}}_p({\mathbb {R}})\times {\mathcal {P}}_p({\mathbb {R}}))\), see [2, Proposition 3.3.1].

We now prove that the sequence \(({\tilde{\rho }}^{N_k},{\tilde{\eta }}^{N_k})\) converges to the same limit \((\rho ,\eta )\) in the same topology \(C([0,T]\,;{{\mathcal {P}}}_p({\mathbb {R}})\times {{\mathcal {P}}}_p({\mathbb {R}}))\). For a fixed N, let \(\pi ^N\in C([0,T];{\mathcal {P}}({\mathbb {R}}\times {\mathbb {R}}))\) be defined by

$$\begin{aligned} \pi ^N(t;x,y) = \sum _{i=1}^N\delta _{x_i(t)}(x)\otimes {\tilde{\rho }}^N|_{[x_{i-1}(t),x_i(t))}(y). \end{aligned}$$

A simple computation shows that \(\pi ^N\) has marginal measures \(\rho ^N\) in the x-variable and \({\tilde{\rho }}^N\) in the y-variable, respectively. Hence, for almost every \(t\in [0,T]\),

$$\begin{aligned} {\mathcal {W}}_1(\rho ^N(t),{\tilde{\rho }}^N(t))&\le \iint _{{\mathbb {R}}\times {\mathbb {R}}} |x-y|d\pi ^N(x,y)\\&= \sum _{i=1}^N \frac{1}{N(x_i(t)-x_{i-1}(t))}\int _{x_{i-1}(t)}^{x_i(t)}|x_i(t)-y|dy\\&= \frac{1}{2}\sum _{i=1}^N \frac{1}{N(x_i(t)-x_{i-1}(t))}(x_i(t)-x_{i-1}(t))^2 =(x_N(t)-x_0(t))\frac{1}{2N}\\&\le ({\bar{x}}_N-{\bar{x}}_0)\frac{1}{2N} \end{aligned}$$

and the assertion is proven for \(p=1\) by taking the supremum over \(t\in [0,T]\), letting \(N\rightarrow \infty \), and using that \({\bar{x}}_N-{\bar{x}}_0={\bar{x}}_{\max }-{\bar{x}}_{\min }\). Note that \(\mathrm {supp}\,\rho ^N,\mathrm {supp}\,{\tilde{\rho }}^N\subseteq [a(0),b(0)]\), hence

$$\begin{aligned} {\mathcal {W}}_p(\rho ^N(t),{\tilde{\rho }}^N(t))&\le \left( \iint _{[a(0),b(0)]^2} |x-y|^p d\pi ^N(x,y)\right) ^\frac{1}{p}\\&\le \left( b(0)-a(0)\right) ^{\frac{p-1}{p}}\left( \iint _{{\mathbb {R}}\times {\mathbb {R}}} |x-y|d\pi ^N(x,y)\right) ^{\frac{1}{p}}\\&\le \left( b(0)-a(0)\right) ^{\frac{p-1}{p}}\left( ({\bar{x}}_N-{\bar{x}}_0)\frac{1}{2N}\right) ^{\frac{1}{p}}, \end{aligned}$$

which gives the result for \(p\in [1,+\infty )\) by again taking the supremum over \(t\in [0,T]\) and letting \(N\rightarrow \infty \). \(\square \)
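For the reader's convenience, the coupling estimate above can be checked numerically. The following minimal Python sketch uses hypothetical random data: \(N+1\) ordered locations, with mass 1/N at each atom for the empirical measure and mass 1/N on each cell for the piecewise-constant density. It computes \({\mathcal {W}}_1(\rho ^N,{\tilde{\rho }}^N)\) via quantile functions and compares it with \(({\bar{x}}_N-{\bar{x}}_0)\frac{1}{2N}\):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200
# Hypothetical data: N+1 ordered locations x_0 < ... < x_N; the empirical
# measure puts mass 1/N at x_1,...,x_N, the piecewise-constant density puts
# mass 1/N on each cell [x_{i-1}, x_i).
x = np.sort(rng.uniform(0.0, 1.0, N + 1))

# W_1 equals the L^1 distance of the quantile functions: on the i-th cell of
# [0,1] the empirical quantile is constant (= x_i) and the piecewise-constant
# quantile is affine from x_{i-1} to x_i.
M = 1000                                         # midpoint samples per cell
z = (np.arange(M) + 0.5) / (N * M)               # offsets inside one cell
w1 = 0.0
for i in range(1, N + 1):
    q_pc = x[i - 1] + N * (x[i] - x[i - 1]) * z  # affine quantile on cell i
    w1 += np.mean(np.abs(x[i] - q_pc)) / N       # cell has length 1/N in z

bound = (x[N] - x[0]) / (2 * N)                  # bound from the proof
assert abs(w1 - bound) < 1e-10
```

In one dimension the monotone coupling is optimal, so the bound constructed in the proof is in fact attained.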

We now establish the basic properties satisfied by the N-particle approximation of the initial data \(\rho _0,\eta _0\). In order to simplify the notation, we denote

$$\begin{aligned} ({\tilde{\rho }}^N,{\tilde{\eta }}^N)_{|_{t=0}} =({\tilde{\rho }}_0^N,{\tilde{\eta }}_0^N)\,,\qquad (\rho ^N,\eta ^N)_{|_{t=0}} =(\rho ^N_0,\eta ^N_0). \end{aligned}$$

Proposition 3

The two sequences \(\{({\tilde{\rho }}^N_0,{\tilde{\eta }}^N_0)\}_{N\in {\mathbb {N}}}\) and \(\{(\rho ^N_0,\eta ^N_0)\}_{N\in {\mathbb {N}}}\) converge to the initial datum \((\rho _0,\eta _0)\) with respect to \({\mathcal {W}}_1\). Moreover, assume that there exists a convex, non-decreasing function \(G:[0,+\infty )\rightarrow [0,+\infty )\) with \(G(0)=0\) and \(\lim _{r\rightarrow +\infty }\frac{G(r)}{r}=+\infty \) such that both \(G(\rho _0)\) and \(G(\eta _0)\) belong to \(L^1({\mathbb {R}})\). Then, the quantity

$$\begin{aligned} \int _{\mathbb {R}}G({\tilde{\rho }}_0^N(x)) dx + \int _{\mathbb {R}}G({\tilde{\eta }}_0^N(x)) dx \end{aligned}$$

is uniformly bounded with respect to N.

Proof

The 1-Wasserstein convergence of the initial data relies on the techniques adopted in the proof of Lemma 7 and is therefore left to the reader. As for the last property, Jensen’s inequality gives

$$\begin{aligned} \int _{\mathbb {R}}G({\tilde{\rho }}_0^N(x)) dx \le \sum _{i=0}^{N-1}\int _{{\bar{x}}_i}^{{\bar{x}}_{i+1}}\frac{1}{{\bar{x}}_{i+1}-{\bar{x}}_i}\int _{{\bar{x}}_i}^{{\bar{x}}_{i+1}} G(\rho _0(y)) dy dx \le \int _{{\mathbb {R}}}G(\rho _0(y)) dy, \end{aligned}$$

which proves the assertion for \({\tilde{\rho }}_0^N\). The assertion for \({\tilde{\eta }}_0^N\) is proven in the same way. \(\square \)
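The Jensen step can be illustrated numerically. In the sketch below (Python; a hypothetical smooth initial density and equal-width cells are used for simplicity, whereas the construction above uses equal-mass cells, for which the same bound applies), \(\int G\) of the cell averages never exceeds \(\int G(\rho _0)\):

```python
import numpy as np

# Hypothetical initial density rho0(x) = 6x(1-x) on [0,1]; G(r) = r^2 is
# convex, non-decreasing on [0,inf), with G(0) = 0.
G = lambda r: r ** 2
M = 200_000
dx = 1.0 / M
xs = (np.arange(M) + 0.5) * dx
rho0 = 6.0 * xs * (1.0 - xs)                 # integrates to 1 on [0,1]

# Cell averages over N equal-width cells play the role of tilde_rho_0^N.
N = 50
cells = (xs * N).astype(int)                 # cell index of each grid point
cell_avg = np.bincount(cells, weights=rho0) / np.bincount(cells)
lhs = np.sum(G(cell_avg)) / N                # ∫ G(tilde_rho_0^N) dx
rhs = np.sum(G(rho0)) * dx                   # ∫ G(rho0) dx  (≈ 1.2 here)
assert lhs <= rhs + 1e-9                     # Jensen's inequality
```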

The convergence of \(({\tilde{\rho }}^N,{\tilde{\eta }}^N)\) and \((\rho ^N,\eta ^N)\) to \((\rho ,\eta )\) proven in Lemma 7 is, alone, too weak to prove that \((\rho ,\eta )\) is a gradient flow solution of the continuum model (5) in the sense of [13]. This is due to the discontinuity of the gradient \(\nabla N\), which does not allow for the coupling with a singular measure in the mixed interaction terms of (6). In order to bypass this problem, we argue as follows. Assuming the initial data \(\rho _0,\eta _0\) belong to \(L^m({\mathbb {R}})\) for \(m\ge 1\), we aim at proving that the approximating sequences \({\tilde{\rho }}^N\) and \({{\tilde{\eta }}}^N\) are uniformly bounded in \(L^m({\mathbb {R}})\). This implies weak \(L^m\) compactness (the case \(m=1\) is treated separately) and therefore makes it possible to pass to the limit under the integral sign, using the pairing between an absolutely continuous measure and a discontinuous test function.

Proposition 4

Let us consider \(\rho _0,\eta _0\in {{\mathcal {P}}_2^a({\mathbb {R}})}\cap L^m({\mathbb {R}})\), for some \(m\in (1,+\infty ]\). Then, the piecewise constant densities \({\tilde{\rho }}^N\) and \({\tilde{\eta }}^N\) admit a subsequence converging to \(\rho \) and \(\eta \) respectively, weakly in \(L_{loc}^m([0,+\infty )\times {\mathbb {R}})\) if \(m<+\infty \) and weakly-\(\star \) in \(L_{loc}^\infty ([0,+\infty )\times {\mathbb {R}})\) if \(m=+\infty \). Moreover, \(\rho \) and \(\eta \) belong to \(C([0,+\infty );\,L^m({\mathbb {R}}))\).

Proof

The proof is based on establishing \(L^m\)-bounds that are uniform in time and on an application of the Banach-Alaoglu theorem to obtain the weak-star compactness. Let us start by computing the time-derivative of the \(L^m\) norm of the piecewise constant densities. In what follows, we use that particles of the same species do not cross, as proven in Theorem 3. Moreover, the computation below is justified at times t at which no particles of opposite species collide. As proven in Sect. 4.5, such collisions only happen finitely many times on each fixed time interval [0, T]. Hence, for every fixed \(T\ge 0\) and for all but finitely many \(t\in [0,T]\), we have

$$\begin{aligned} \begin{aligned} \frac{d}{dt} \left( \Vert {{\tilde{\rho }}}^N\Vert _m^m + \Vert \tilde{\eta }^N\Vert _m^m\right)&=\frac{d}{dt}\int _{\mathbb {R}}|{\tilde{\rho }}^N(t,x)|^m+|{\tilde{\eta }}^N(t,x)|^m\,dx\\&=-(m-1)\sum _{i=0}^{N-1}[d_i^1(t)]^m({\dot{x}}_{i+1}(t)-\dot{x}_i(t))\\&\quad -(m-1)\sum _{j=0}^{N-1}[d_j^2(t)]^m({\dot{y}}_{j+1}(t)-\dot{y}_j(t))\\&=-\frac{2(m-1)}{N}\sum _{i=0}^{N-1}[d_i^1(t)]^m(1-\alpha (i))\\&\quad -\frac{2(m-1)}{N}\sum _{j=0}^{N-1}[d_j^2(t)]^m(1-\beta (j)) \end{aligned} \end{aligned}$$
(69)

where \(\alpha ,\beta :{\mathbb {N}}\rightarrow {\mathbb {N}}\) are defined by

$$\begin{aligned} \alpha (i)=\#\{k:x_i<y_k<x_{i+1}\},\qquad \beta (j)=\#\{k:y_j<x_k<y_{j+1}\} \end{aligned}$$

for \(i,j\in \{0,\ldots ,N-1\}\). Clearly, the two maps \(\alpha \) and \(\beta \) also depend on time, but we omit this dependence for simplicity and assume we are considering the above computation between two consecutive collisions.

Our goal is to show \(\frac{d}{dt} \left( \Vert {{\tilde{\rho }}}^N\Vert _m^m + \Vert {{\tilde{\eta }}}^N\Vert _m^m\right) \le 0\), which gives the desired \(L^m\)-bound uniform in time. From the last line of (69), this is true if \(\alpha (i),\beta (j)\le 1\) for all \(i,j\in \{0,\ldots ,N-1\}\). Now, let us rewrite (69) as follows:

$$\begin{aligned} \frac{d}{dt} \left( \Vert {{\tilde{\rho }}}^N\Vert _m^m + \Vert \tilde{\eta }^N\Vert _m^m\right)&=-\frac{2(m-1)}{N}\sum _{i:\alpha (i)=0}[d_i^1(t)]^m\\&\quad +\frac{2(m-1)}{N}\sum _{i:\alpha (i)>1}[d_i^1(t)]^m(\alpha (i)-1)\\&\quad -\frac{2(m-1)}{N}\sum _{j:\beta (j)=0}[d_j^2(t)]^m\\&\quad +\frac{2(m-1)}{N}\sum _{j:\beta (j)>1}[d_j^2(t)]^m(\beta (j)-1)\\&=: A_1 + A_2 + A_3 + A_4. \end{aligned}$$

We notice that, in case \(\alpha (i)>1\), there exist exactly \(\alpha (i)\) particles of the y-species, say with indices \({\bar{j}},{\bar{j}}+1,\ldots ,{\bar{j}}+\alpha (i)-1\), which lie strictly between \(x_i\) and \(x_{i+1}\). For each \(j\in \{{\bar{j}},{\bar{j}}+1,\ldots ,{\bar{j}}+\alpha (i)-2\}\) we must have \(\beta (j)=0\), since there are no x-particles between \(y_j\) and \(y_{j+1}\) for all the intermediate particles \(y_j\) except the last one, \(j={\bar{j}}+\alpha (i)-1\). Hence, the number \(\alpha (i)-1\) equals exactly the number of y-particles between \(x_i\) and \(x_{i+1}\) characterised by \(\beta (j)=0\), and \(A_2\) can be re-written as

$$\begin{aligned} A_2&=\frac{2(m-1)}{N }\sum _{i:\alpha (i)>1} [d_i^1(t)]^m(\alpha (i)-1)\\&= \frac{2(m-1)}{N} \sum _{i:\alpha (i)>1} \sum _{\begin{array}{c} j:\beta (j)=0\\ x_i<y_j<x_{i+1} \end{array}} [d_i^1(t)]^m\\&\le \frac{2(m-1)}{N} \sum _{i:\alpha (i)>1} \sum _{\begin{array}{c} j:\beta (j)=0\\ x_i<y_j<x_{i+1} \end{array}} [d_j^2(t)]^m, \end{aligned}$$

where the last inequality is motivated by the fact that for any index i in the sum we have \(d_i^1(t)\le d_j^2(t)\), because \(y_{{\bar{j}}+k}-y_{{\bar{j}}+k-1} \le x_{i+1}-x_i\) for all \(k\in \{1,\ldots ,\alpha (i)-1\}\). Now, we claim that

$$\begin{aligned} \sum _{i:\alpha (i)>1} \sum _{\begin{array}{c} j:\beta (j)=0\\ x_i<y_j<x_{i+1} \end{array}}[d_j^2(t)]^m = \sum _{j:\beta (j)=0}[d_j^2(t)]^m. \end{aligned}$$
(70)

Indeed, the set of indices \(\{j\,:\,\beta (j)=0\}\) can be split into a finite number k of sets \(I_1,\ldots , I_k\), with \(I_i\cap I_j=\emptyset \) if \(i\ne j\), and with each \(I_\ell \) made up of \(h_\ell \) consecutive elements, say of the form \(I_\ell =\{{\bar{j}},\ldots ,{\bar{j}}+h_\ell -1\}\). Without restriction, we can assume that the sets \(I_\ell \) are maximal with respect to those properties, i.e. no union \(I_i\cup I_j\) with \(i\ne j\) is made up of consecutive indices. In this configuration, for each \(\ell \in \{1,\ldots ,k\}\) we can detect a unique \(i\in \{1,\ldots ,N\}\) such that \(x_i<y_j<x_{i+1}\) for all \(j\in \{{\bar{j}},\ldots ,{\bar{j}}+h_\ell -1\}\), and this implies \(h_\ell =\alpha (i)-1\), which proves the claim (70). As a consequence of (70), we immediately get \(A_2+A_3 \le 0\). Arguing in a similar way we also get \(A_1 + A_4 \le 0\), which gives \(\frac{d}{dt}\left( \Vert {{\tilde{\rho }}}^N\Vert _m^m + \Vert \tilde{\eta }^N\Vert _m^m\right) \le 0\) on each time interval between two consecutive collisions, whence
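The sign bookkeeping above can be tested numerically: for tie-free interleaved configurations, the bracket appearing in (69) is non-negative, so the \(L^m\) norm does not increase. A minimal Python sketch (hypothetical jittered-grid configurations, \(N+1\) ordered locations per species, with \(d_i=1/(N\Delta _i)\) as above):

```python
import numpy as np

rng = np.random.default_rng(1)

def bracket(x, y, m):
    """sum_i [d_i^1]^m (1-alpha(i)) + sum_j [d_j^2]^m (1-beta(j)):
    non-negativity of this quantity makes the derivative in (69) <= 0."""
    N = len(x) - 1
    d1 = 1.0 / (N * np.diff(x))              # cell values of tilde_rho^N
    d2 = 1.0 / (N * np.diff(y))              # cell values of tilde_eta^N
    alpha = np.array([((y > x[i]) & (y < x[i + 1])).sum() for i in range(N)])
    beta = np.array([((x > y[j]) & (x < y[j + 1])).sum() for j in range(N)])
    return (d1 ** m * (1 - alpha)).sum() + (d2 ** m * (1 - beta)).sum()

N = 20
for _ in range(500):
    # Jittered grids: ordered, tie-free configurations with comparable supports.
    x = np.sort(np.linspace(0, 1, N + 1) + rng.uniform(-0.4, 0.4, N + 1) / N)
    y = np.sort(np.linspace(0, 1, N + 1) + rng.uniform(-0.4, 0.4, N + 1) / N)
    assert bracket(x, y, m=2) >= -1e-9
```

When the two species alternate perfectly, every \(\alpha (i)=\beta (j)=1\) and the bracket vanishes, consistently with the pairing argument in the proof.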

$$\begin{aligned} \Vert {\tilde{\rho }}^N\Vert _{L^\infty ([0,T]; L^m({\mathbb {R}}))} + \Vert \tilde{\eta }^N\Vert _{L^\infty ([0,T]; L^m({\mathbb {R}}))} \le \Vert {\tilde{\rho }}^N(\cdot ,0)\Vert _{L^m({\mathbb {R}})} + \Vert {\tilde{\eta }}^N(\cdot ,0)\Vert _{L^m({\mathbb {R}})}. \end{aligned}$$

The last estimate can be extended to the case \(m=+\infty \) by observing that

$$\begin{aligned}&\Vert {\tilde{\rho }}^N(\cdot ,t)\Vert _{L^\infty ({\mathbb {R}})}+\Vert {\tilde{\eta }}^N(\cdot ,t)\Vert _{L^\infty ({\mathbb {R}})}\le \limsup _{m\rightarrow +\infty }\left[ \Vert {\tilde{\rho }}^N(\cdot ,t)\Vert _{L^m({\mathbb {R}})}+\Vert {\tilde{\eta }}^N(\cdot ,t)\Vert _{L^m({\mathbb {R}})}\right] \\&\ \le \limsup _{m\rightarrow +\infty }\left[ \Vert {\tilde{\rho }}^N(\cdot ,0)\Vert _{L^m({\mathbb {R}})} + \Vert {\tilde{\eta }}^N(\cdot ,0)\Vert _{L^m({\mathbb {R}})}\right] \\&\ \le \limsup _{m\rightarrow +\infty }\left[ \Vert {\tilde{\rho }}^N(\cdot ,0)\Vert _{L^\infty ({\mathbb {R}})}^{\frac{m-1}{m}}\Vert {\tilde{\rho }}^N(\cdot ,0)\Vert _{L^1({\mathbb {R}})}^{\frac{1}{m}}+\Vert {\tilde{\eta }}^N(\cdot ,0)\Vert _{L^\infty ({\mathbb {R}})}^{\frac{m-1}{m}} \Vert {\tilde{\eta }}^N(\cdot ,0)\Vert _{L^1({\mathbb {R}})}^{\frac{1}{m}}\right] \\&\ = \Vert {\tilde{\rho }}^N(\cdot ,0)\Vert _{L^\infty ({\mathbb {R}})} + \Vert {\tilde{\eta }}^N(\cdot ,0)\Vert _{L^\infty ({\mathbb {R}})}. \end{aligned}$$

Therefore, due to Proposition 3 with \(G(r)=r^m\), the sequences \(\{{\tilde{\rho }}^N\}_{N\in {\mathbb {N}}}\) and \(\{{\tilde{\eta }}^N\}_{N\in {\mathbb {N}}}\) are uniformly bounded in \(L_{loc}^\infty ([0,+\infty );L^m({\mathbb {R}}))\). By weak compactness, if \(m<+\infty \) there exists a subsequence for each of them converging weakly in \(L^m_{loc}([0,\infty )\times {\mathbb {R}})\) to some limits \(\rho ',\eta '\in L_{loc}^m([0,+\infty )\times {\mathbb {R}})\), respectively. In the case of \(m=+\infty \) the above subsequence converges in the weak-\(\star \) topology of \(L_{loc}^\infty ([0,+\infty )\times {\mathbb {R}})\). In view of Lemma 7, the limits \(\rho '\) and \(\eta '\) coincide with \(\rho \) and \(\eta \) respectively. The last statement follows by weak lower semi-continuity of the \(L^m\) norm. \(\square \)

The above weak compactness can be extended to the case \(m=1\).

Proposition 5

Let us consider \(\rho _0,\eta _0 \in {{{\mathcal {P}}_2({\mathbb {R}})}}\cap L^1({\mathbb {R}})\). Then \({\tilde{\rho }}^N\) and \({\tilde{\eta }}^N\) converge weakly (up to a subsequence) in \(L^1_{loc}([0,+\infty )\times {\mathbb {R}})\) to \(\rho \) and \(\eta \) respectively. Consequently, \(\rho \) and \(\eta \) belong to \(L^\infty ([0,+\infty );\,L^1({\mathbb {R}}))\).

Proof

By de la Vallée-Poussin’s Theorem, there exists a non-decreasing, convex function \(G:[0,+\infty )\rightarrow [0,+\infty )\) with \(G(0)=0\) and \(\lim _{r\rightarrow +\infty }\frac{G(r)}{r}=+\infty \) such that \(G(\rho _0), G(\eta _0) \in L^1({\mathbb {R}})\). Hence, Proposition 3 implies that both \(G({\tilde{\rho }}^N_0)\) and \(G({\tilde{\eta }}^N_0)\) are uniformly bounded in \(L^1({\mathbb {R}})\). By repeating the proof of Proposition 4 with \(G(d_i^j)\) instead of \((d_i^j)^m\) with \(j=1,2\) and \(i=0,\ldots ,N-1\), we easily get a uniform bound for

$$\begin{aligned} \Vert G({\tilde{\rho }}^N)\Vert _{L^\infty ([0,+\infty ); L^1({\mathbb {R}}))} + \Vert G({{\tilde{\eta }}}^N)\Vert _{L^\infty ([0,+\infty ); L^1({\mathbb {R}}))}. \end{aligned}$$

In fact, by using the same notation as in Proposition 4, we get

$$\begin{aligned}&\frac{d}{dt}\int _{\mathbb {R}}G({{\tilde{\rho }}}^N(t))+ G({{\tilde{\eta }}}^N(t))\,dx\\&\quad =\frac{d}{dt}\sum _{i=0}^{N-1}G(d_i^1(t))(x_{i+1}(t)-x_i(t))+\frac{d}{dt}\sum _{j=0}^{N-1}G(d_j^2(t))(y_{j+1}(t)-y_j(t))\\&\quad =-\frac{2}{N}\sum _{i=0}^{N-1}G'(d_i^1(t))d_i^1(t)(1-\alpha (i)) + \frac{2}{N}\sum _{i=0}^{N-1}G(d_i^1(t))(1-\alpha (i))\\&\qquad -\frac{2}{N}\sum _{j=0}^{N-1}G'(d_j^2(t))d_j^2(t)(1-\beta (j)) + \frac{2}{N}\sum _{j=0}^{N-1}G(d_j^2(t))(1-\beta (j))\\&\quad =-\frac{2}{N}\sum _{i=0}^{N-1}[G'(d_i^1(t))d_i^1(t)-G(d_i^1(t))](1-\alpha (i))\\&\qquad - \frac{2}{N}\sum _{j=0}^{N-1}[G'(d_j^2(t))d_j^2(t)-G(d_j^2(t))](1-\beta (j)). \end{aligned}$$

As mentioned above, we can argue as in the proof of Proposition 4, since G is convex and hence the function \(x\in (0,+\infty )\mapsto G'(x)x-G(x)\) is non-decreasing. Therefore, by the de la Vallée-Poussin theorem, we may infer the equi-integrability of the sequences \({\tilde{\rho }}^N\) and \({\tilde{\eta }}^N\), and thus, by an application of the Dunford-Pettis theorem, the two sequences are weakly compact in \(L^1_{loc}([0,+\infty )\times {\mathbb {R}})\). Hence, Lemma 7 implies that the weak \(L^1\) limits coincide with \(\rho \) and \(\eta \) respectively. The last statement follows by weak lower semi-continuity of the \(L^1\) norm. \(\square \)
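The role of the de la Vallée-Poussin function G can be illustrated on a concrete unbounded density. In the Python sketch below (hypothetical \(\rho _0(x)=\tfrac{1}{2}x^{-1/2}\) on (0, 1], \(G(r)=r^{3/2}\), and equal-width cells for simplicity), the Jensen bound is uniform in N even though \(\rho _0\notin L^\infty \):

```python
import numpy as np

# Hypothetical example: rho0(x) = x**(-1/2)/2 on (0,1] is in L^1 but not in
# L^infty; G(r) = r**1.5 is convex, superlinear, with G(0) = 0, and
# G(rho0) ~ x**(-3/4) is still integrable.
G = lambda r: r ** 1.5
M = 10 ** 6
dx = 1.0 / M
xs = (np.arange(M) + 0.5) * dx
rho0 = 0.5 / np.sqrt(xs)

ref = G(rho0).sum() * dx                    # ∫ G(rho0) dx, finite
for N in (10, 100, 1000):
    cells = (xs * N).astype(int)            # equal-width cells for simplicity
    avg = np.bincount(cells, weights=rho0) * N / M
    assert G(avg).sum() / N <= ref + 1e-9   # Jensen bound, uniform in N

# Convexity also makes r -> G'(r) r - G(r) non-decreasing, as used above:
r = np.linspace(0.0, 5.0, 1001)
h = 1.5 * r ** 0.5 * r - G(r)               # G'(r) r - G(r) = 0.5 r**1.5
assert np.all(np.diff(h) >= 0)
```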

The following technical lemma will be useful in the proof of our main result.

Lemma 8

For all \(N\in {\mathbb {N}}\), let

$$\begin{aligned} {\tilde{F}}^N(x,t)=\int _{-\infty }^x {\tilde{\rho }}^N(y,t) dy\,,\qquad {\tilde{H}}^N(x,t)=\int _{-\infty }^x {\tilde{\eta }}^N(y,t) dy. \end{aligned}$$

Then, the two families \(\{{\tilde{F}}^N\}_{N\in {\mathbb {N}}}\) and \(\{{\tilde{H}}^N\}_{N\in {\mathbb {N}}}\) are strongly compact in \(L^1_{loc}({\mathbb {R}}\times [0,+\infty ))\).

Proof

Since both \({\tilde{\rho }}^N(\cdot ,t)\) and \({\tilde{\eta }}^N(\cdot ,t)\) have unit mass for all \(t\ge 0\), we immediately get

$$\begin{aligned} \sup _{t\ge 0}\left[ \Vert {\tilde{F}}^N(\cdot ,t)\Vert _{L^\infty ({\mathbb {R}})} + \Vert {\tilde{H}}^N(\cdot ,t)\Vert _{L^\infty ({\mathbb {R}})}\right] <+\infty . \end{aligned}$$
(71)

Moreover, from the proof of Proposition 5 we easily obtain

$$\begin{aligned} \sup _{t\ge 0}\left[ \Vert G({\tilde{F}}^N_x(t,\cdot ))\Vert _{L^1({\mathbb {R}})} + \Vert G({\tilde{H}}^N_x(t,\cdot ))\Vert _{L^1({\mathbb {R}})}\right] <+\infty , \end{aligned}$$
(72)

where G is a function as in the statement of Proposition 3, the existence of which is guaranteed by de la Vallée-Poussin’s theorem. Now, in order to estimate the oscillations in time, we aim at proving some uniform equi-continuity in time of the curve \(t\mapsto ({\tilde{\rho }}^N(\cdot ,t),{\tilde{\eta }}^N(\cdot ,t))\) in the 1-Wasserstein distance. To perform this task, for \(0\le s<t\) we recall, from the content of Sect. 2, that

$$\begin{aligned} {\mathcal {W}}_1({\tilde{\rho }}^N(t),{\tilde{\rho }}^N(s)) = \Vert {\tilde{X}}^N(\cdot ,t)-{\tilde{X}}^N(\cdot ,s)\Vert _{L^1([0,1])}, \end{aligned}$$

where \({\tilde{X}}^N:[0,1]\times [0,+\infty )\rightarrow {\mathbb {R}}\) is the pseudo-inverse, with respect to the x-variable, of the cumulative distribution \({\tilde{F}}^N\) defined above. A simple computation yields

$$\begin{aligned} {\tilde{X}}^N(z,t)=X_{{\tilde{\rho }}^N}(z,t)=&\sum _{i=0}^{N-2}\left[ x_i(t)+\frac{1}{d_i^1(t)}\left( z-\frac{i}{N}\right) \right] \chi _{[\frac{i}{N},\frac{i+1}{N})}(z)\\&+\left[ x_{N-1}(t)+\frac{1}{d_{N-1}^1(t)}\left( z-\frac{N-1}{N}\right) \right] \chi _{[\frac{N-1}{N},1]}(z). \end{aligned}$$

Hence,

$$\begin{aligned}&\Vert {\tilde{X}}^N(\cdot ,t)-{\tilde{X}}^N(\cdot ,s)\Vert _{L^1([0,1])} \\&\ \le \sum _{i=0}^{N-1}\int _{i/N}^{(i+1)/N}\left[ |x_i(t)-x_i(s)|+ N \left( |x_{i+1}(t)-x_{i+1}(s)| + |x_i(t)-x_i(s)|\right) \left( z-\frac{i}{N}\right) \right] dz. \end{aligned}$$

Similarly to the proof of Lemma 7, Proposition 2 implies that there exists a constant \(C\ge 0\), independent of N, such that

$$\begin{aligned}&\Vert {\tilde{X}}^N(\cdot ,t)-{\tilde{X}}^N(\cdot ,s)\Vert _{L^1([0,1])} \le \frac{C}{N} \sum _{i=0}^{N-1}|t-s| = C|t-s|. \end{aligned}$$

Consequently, we obtain

$$\begin{aligned} \Vert {\tilde{F}}^N(\cdot ,t)-{\tilde{F}}^N(\cdot ,s)\Vert _{L^1({\mathbb {R}})}\le C|t-s|, \end{aligned}$$
(73)

and a similar estimate can also be deduced for \({\tilde{H}}^N(\cdot ,t)\). Combining estimates (71), (72), and (73), for every compact subset \(K\subset {\mathbb {R}}\) we obtain that \(\{{\tilde{F}}^N\}_{N}\) is an equi-continuous family of absolutely continuous curves with values in a compact subset of \(L^1(K)\), where we also use the Dunford-Pettis theorem. By the Arzelà-Ascoli theorem, \({\tilde{F}}^N\) is strongly compact in \(L^1([0,T]\times K)\), and the same holds for \({\tilde{H}}^N\), which proves the assertion. \(\square \)
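The pseudo-inverse \({\tilde{X}}^N\) can be implemented directly from the formula above. A Python sketch (hypothetical ordered particles; \({\tilde{F}}^N\) is the piecewise-linear cumulative distribution carrying mass 1/N per cell), checking that \({\tilde{F}}^N\circ {\tilde{X}}^N\) is the identity on [0, 1]:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 100
x = np.sort(rng.uniform(-1.0, 1.0, N + 1))   # ordered particles x_0 < ... < x_N

def X_tilde(z):
    """Pseudo-inverse of tilde_F^N: affine on each [i/N, (i+1)/N), mapping
    i/N -> x_i with slope 1/d_i^1 = N (x_{i+1} - x_i), as in the display."""
    i = np.minimum((z * N).astype(int), N - 1)
    return x[i] + N * (x[i + 1] - x[i]) * (z - i / N)

def F_tilde(v):
    """CDF of the piecewise-constant density with mass 1/N on each cell."""
    i = np.clip(np.searchsorted(x, v, side="right") - 1, 0, N - 1)
    frac = (v - x[i]) / (x[i + 1] - x[i])
    return np.clip((i + frac) / N, 0.0, 1.0)

z = rng.uniform(0.0, 1.0, 1000)
assert np.allclose(F_tilde(X_tilde(z)), z, atol=1e-8)
```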

We are now ready to prove the main result of this section.

Theorem 5

Let \(m\in [1,+\infty ]\) and \((\rho _0,\eta _0)\in ({{\mathcal {P}}_2^a({\mathbb {R}})}\cap L^m({\mathbb {R}}))^2\) with compact support. Then, the piecewise constant particle approximation \(({\tilde{\rho }}^N,{\tilde{\eta }}^N)\) converges, up to a subsequence, weakly in \(L^m_{loc}([0,+\infty )\times {\mathbb {R}})^2\) (weakly-\(\star \) if \(m=+\infty \)) to the unique weak measure solution \((\rho ,\eta )\) to system (5) according to Definition 1 with initial datum \((\rho _0,\eta _0)\). The empirical measure approximation \((\rho ^N,\eta ^N)\) converges, up to a subsequence, towards the same limit in \(C([0,+\infty )\,;\,{\mathcal {P}}_p({\mathbb {R}})^2)\) for every \(p\in [1,+\infty )\).

Proof

Our goal is to show that the limit pair \((\rho ,\eta )\) satisfies (16). We shall prove the statement for the first equation in (16), the proof for the second one being analogous. We start by proving that the approximating pair \((\rho ^N,\eta ^N)\) almost satisfies the first equation in (16), up to removing the diagonal \(x=y\) to avoid the discontinuity of the \(\mathrm {sign}\)-function. Let \(T\ge 0\) be a fixed time and let \(\varphi \in C_c^1([0,T)\times {\mathbb {R}})\). We have:

$$\begin{aligned} \begin{aligned}&\int _0^T \int _{\mathbb {R}}\varphi _t(x,t) d\rho ^N(t)(x) dt +\int _{\mathbb {R}}\varphi (x,0)d\rho _0^N(x)\\&\qquad + \int _0^T\iint _{{\mathbb {R}}\times {\mathbb {R}}\setminus \{x=y\}} \varphi _x(x,t)\mathrm {sign}(x-y)d\rho ^N(t)(y)d\rho ^N(t)(x)dt\\&\qquad - \int _0^T\iint _{{\mathbb {R}}\times {\mathbb {R}}\setminus \{x=y\}} \varphi _x(x,t)\mathrm {sign}(x-y)d\eta ^N(t)(y)d\rho ^N(t)(x)dt\\&\quad \ = \frac{1}{N}\sum _{i=1}^N\int _0^T\varphi _t(x_i(t),t) dt + \frac{1}{N}\sum _{i=1}^N\varphi ({\bar{x}}_i,0)\\&\qquad \ + \frac{1}{N^2}\int _0^T \sum _{i=1}^N \sum \limits _{\begin{array}{c} j=1 \\ x_i\ne x_j \end{array}}^N \mathrm {sign}(x_i(t)-x_j(t)) \varphi _x(x_i(t),t)dt\\&\qquad \ - \frac{1}{N^2}\int _0^T \sum _{i=1}^N \sum \limits _{\begin{array}{c} j=1 \\ x_i\ne y_j \end{array}}^N \mathrm {sign}(x_i(t)-y_j(t)) \varphi _x(x_i(t),t)dt. \end{aligned} \end{aligned}$$
(74)

Applying the chain rule to the first term and using the assumption on the support of \(\varphi \), we obtain

$$\begin{aligned}&\frac{1}{N}\sum _{i=1}^N\int _0^T\varphi _t(x_i(t),t) dt + \frac{1}{N}\sum _{i=1}^N\varphi ({\bar{x}}_i,0)= -\frac{1}{N}\sum _{i=1}^N\int _0^T {\dot{x}}_i(t)\varphi _x(x_i(t),t)dt. \end{aligned}$$

We remind the reader that particles of the same species never collide, and we observe that, as a consequence of Sect. 4.5, only a finite number of collisions between particles of the two species occurs in the time interval [0, T). Hence, since the particles \(x_i\) satisfy (6) away from the collision times, the right-hand side of (74) equals zero. In order to conclude the proof, we need to show that the left-hand side of (74) tends to

$$\begin{aligned}&\int _0^T \int _{\mathbb {R}}\varphi _t(x,t) \rho (x,t) dx dt +\int _{\mathbb {R}}\varphi (x,0)\rho _0(x) dx\\&\ + \int _0^T\iint _{{\mathbb {R}}\times {\mathbb {R}}} \varphi _x(x,t)\mathrm {sign}(x-y)\rho (y,t)\rho (x,t)dy dx dt\\&\ - \int _0^T\iint _{{\mathbb {R}}\times {\mathbb {R}}} \varphi _x(x,t)\mathrm {sign}(x-y)\eta (y,t)\rho (x,t)dy dx dt, \end{aligned}$$

as \(N\rightarrow +\infty \). This would complete the proof, since \(\rho (\cdot ,t)\) and \(\eta (\cdot ,t)\) belong to \(L^1({\mathbb {R}})\) at each time, which ensures that the diagonal terms in the above integrals do not bring any contribution.

First, the weak measure convergence of \(\rho ^N\) to \(\rho \) and of \(\rho ^N_0\) to \(\rho _0\) easily implies

$$\begin{aligned}&\int _0^T \int _{\mathbb {R}}\varphi _t(x,t) d\rho ^N(t)(x) dt +\int _{\mathbb {R}}\varphi (x,0)d\rho _0^N(x)\\&\ \rightarrow \int _0^T \int _{\mathbb {R}}\varphi _t(x,t) \rho (x,t) dx dt +\int _{\mathbb {R}}\varphi (x,0)\rho _0(x)dx. \end{aligned}$$

Hence, in order to conclude we only need to prove that in the \(N\rightarrow +\infty \) limit we have

$$\begin{aligned}&\frac{1}{N^2}\sum _{i=1}^N \sum \limits _{\begin{array}{c} j=1 \\ x_i\ne x_j \end{array}}^N \int _0^T\mathrm {sign}(x_i(t)-x_j(t)) \varphi _x(x_i(t),t)dt\\&\quad - \frac{1}{N^2}\sum _{i=1}^N \sum \limits _{\begin{array}{c} j=1 \\ x_i\ne y_j \end{array}}^N \int _0^T\mathrm {sign}(x_i(t)-y_j(t)) \varphi _x(x_i(t),t)dt \\&\quad \longrightarrow \int _0^T\iint _{{\mathbb {R}}\times {\mathbb {R}}}\varphi _x(x,t)\rho (x,t)\rho (y,t)\mathrm {sign}(x-y)dx dy dt \\&\quad - \int _0^T\iint _{{\mathbb {R}}\times {\mathbb {R}}} \varphi _x(x,t)\mathrm {sign}(x-y)\eta (y,t)\rho (x,t)dy dx dt. \end{aligned}$$

The following holds:

$$\begin{aligned} \begin{aligned}&\frac{1}{N^2}\sum _{i=1}^N \sum \limits _{\begin{array}{c} j=1 \\ x_i\ne x_j \end{array}}^N \int _0^T\mathrm {sign}(x_i(t)-x_j(t)) \varphi _x(x_i(t),t)dt\\&\qquad - \frac{1}{N^2}\sum _{i=1}^N \sum \limits _{\begin{array}{c} j=1 \\ x_i\ne y_j \end{array}}^N \int _{0}^{T}\mathrm {sign}(x_i(t)-y_j(t)) \varphi _x(x_i(t),t)dt \\&\quad =\frac{1}{N}\sum _{i=1}^N \int _0^T \varphi _x(x_i(t), t) \left\{ \frac{1}{N} \sum \limits _{\begin{array}{c} j=1 \\ x_i\ne x_j \end{array}}^N \mathrm {sign}(x_i(t)-x_j(t)) -\frac{1}{N} \sum \limits _{\begin{array}{c} j=1 \\ x_i\ne y_j \end{array}}^N \mathrm {sign}(x_i(t)-y_j(t)) \right\} dt. \end{aligned} \end{aligned}$$
(75)

Let us focus on the terms in the parentheses. We have

$$\begin{aligned}&\frac{1}{N} \sum \limits _{\begin{array}{c} j=1 \\ x_i\ne x_j \end{array} }^N \mathrm {sign}(x_i(t)-x_j(t)) - \frac{1}{N} \sum \limits _{\begin{array}{c} j=1 \\ x_i\ne y_j \end{array}}^N\mathrm {sign}(x_i(t)-y_j(t))\\&\quad = \rho ^N((-\infty , x_i(t))) - \rho ^N((x_i(t), \infty )) - \eta ^N((-\infty , x_i(t))) + \eta ^N((x_i(t), \infty )). \end{aligned}$$

It is now an easy consequence of the definition of the cumulative distribution functions,

$$\begin{aligned} F^N(x,t)=\rho ^N((-\infty ,x])\quad \text{ and }\quad H^N(x,t)=\eta ^N((-\infty ,x]), \end{aligned}$$

that

$$\begin{aligned} \begin{aligned}&\frac{1}{N} \sum \limits _{\begin{array}{c} j=1 \\ x_i\ne x_j \end{array} }^N \mathrm {sign}(x_i(t)-x_j(t)) -\frac{1}{N} \sum \limits _{\begin{array}{c} j=1 \\ x_i\ne y_j \end{array}}^N \mathrm {sign}(x_i(t)-y_j(t))\\&\quad = \rho ^N((-\infty , x_i(t))) - \rho ^N((x_i(t), \infty )) - \eta ^N((-\infty , x_i(t))) + \eta ^N((x_i(t), \infty ))\\&\quad = 2 F^N(x_i(t)) - 1 - \rho ^N(\{x_i(t)\}) - (2 H^N(x_i(t)) - 1 - \eta ^N(\{x_i(t)\}))\\&\quad = 2(F^N(x_i(t)) - H^N(x_i(t))) - \rho ^N(\{x_i(t)\}) + \eta ^N(\{x_i(t)\})\\&\quad = 2(F^N(x_i(t)) - H^N(x_i(t))) - 1/N + \eta ^N(\{x_i(t)\}). \end{aligned} \end{aligned}$$
(76)
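The bookkeeping in (76) can be verified numerically. A minimal Python sketch (hypothetical distinct particle positions, so that \(\rho ^N(\{x_i\})=1/N\) and \(\eta ^N(\{x_i\})=0\)):

```python
import numpy as np

rng = np.random.default_rng(4)
N = 200
x = np.sort(rng.uniform(-1.0, 1.0, N))    # distinct a.s., so rho^N({x_i}) = 1/N
y = np.sort(rng.uniform(-1.0, 1.0, N))    # and eta^N({x_i}) = 0

F = lambda v: np.sum(x <= v) / N          # F^N(v) = rho^N((-inf, v])
H = lambda v: np.sum(y <= v) / N          # H^N(v) = eta^N((-inf, v])

for i in range(N):
    lhs = (np.sign(x[i] - x[x != x[i]]).sum()     # same-species sign sum
           - np.sign(x[i] - y).sum()) / N         # cross-species sign sum
    rhs = 2.0 * (F(x[i]) - H(x[i])) - 1.0 / N     # + eta^N({x_i}) = 0 here
    assert abs(lhs - rhs) < 1e-12
```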

Substituting Eq. (76) into Eq. (75), we obtain

$$\begin{aligned}&\frac{1}{N^2}\sum _{i=1}^N \sum \limits _{\begin{array}{c} j=1 \\ x_i\ne x_j \end{array}}^N \int _0^T\mathrm {sign}(x_i(t)-x_j(t)) \varphi _x(x_i(t),t)dt\\&\qquad - \frac{1}{N^2}\sum _{i=1}^N \sum \limits _{\begin{array}{c} j=1 \\ x_i\ne y_j \end{array}}^N \int _{0}^{T}\mathrm {sign}(x_i(t)-y_j(t)) \varphi _x(x_i(t),t)dt \\&\ = \frac{2}{N}\sum _{i=1}^N \int _{0}^{T} \varphi _x(x_i(t),t) \left[ F^N(x_i(t),t)- H^N(x_i(t),t)\right] dt\\&\qquad +\frac{1}{N}\sum _{i=1}^N\int _{0}^{T} \varphi _x(x_i(t),t) \left( \eta ^N(\{x_i(t)\})- \frac{1}{N} \right) \,dt\\&\ = \frac{2}{N}\sum _{i=1}^N \int _{0}^{T} \varphi _x(x_i(t),t) \left[ F^N(x_i(t),t)- H^N(x_i(t),t)\right] dt + {\mathcal {O}}(1/N), \end{aligned}$$

since \(\eta ^N\left( \{x_i(t)\}\right) \) equals either 0 or 1/N; in the former case,

$$\begin{aligned} \left| -\frac{1}{N^2}\sum _{i=1}^N\int _{0}^{T} \varphi _x(x_i(t),t)\,dt\right| \le \frac{T}{N}\Vert \varphi _x\Vert _{L^\infty }. \end{aligned}$$

Now, denoting \({\tilde{F}}^N\) and \({\tilde{H}}^N\) as in Lemma 8, we get

$$\begin{aligned}&\frac{2}{N}\sum _{i=1}^N \int _{0}^{T} \varphi _x(x_i(t),t) F^N(x_i(t),t) dt \\&\quad = \frac{2}{N}\sum _{i=1}^N \int _{0}^{T} \varphi _x(x_i(t),t) \left( F^N(x_i(t),t)-{\tilde{F}}^N(x_i(t),t)\right) dt\\&\qquad + \frac{2}{N}\sum _{i=1}^N \int _{0}^{T} \varphi _x(x_i(t),t) {\tilde{F}}^N(x_i(t),t) dt\\&\quad =\frac{2}{N}\sum _{i=1}^N \int _{0}^{T} \varphi _x(x_i(t),t) {\tilde{F}}^N(x_i(t),t) dt, \end{aligned}$$

where we used that \(F^N(x_i(t),t)={\tilde{F}}^N(x_i(t),t)=i/N\) for every \(i\), so that the difference term vanishes. Moreover,

$$\begin{aligned}&\frac{2}{N}\sum _{i=1}^N \int _{0}^{T} \varphi _x(x_i(t),t) H^N(x_i(t),t) dt \\&\quad = \frac{2}{N}\sum _{i=1}^N \int _{0}^{T} \varphi _x(x_i(t),t) \left( H^N(x_i(t),t)-{\tilde{H}}^N(x_i(t),t)\right) dt\\&\qquad + \frac{2}{N}\sum _{i=1}^N \int _{0}^{T} \varphi _x(x_i(t),t) {\tilde{H}}^N(x_i(t),t) dt. \end{aligned}$$

Since

$$\begin{aligned}&\left| \frac{2}{N}\sum _{i=1}^N \int _{0}^{T} \varphi _x(x_i(t),t) \left( H^N(x_i(t),t)-{\tilde{H}}^N(x_i(t),t)\right) dt\right| \\&\ \le \Vert \varphi _x\Vert _{L^\infty }\frac{2}{N}\sum _{i=1}^N \int _{0}^{T} \left| (H^N(x_i(t),t)-{\tilde{H}}^N(x_i(t),t)\right| dt\le \frac{C(T)}{N}\Vert \varphi _x\Vert _{L^\infty } , \end{aligned}$$

we easily obtain

$$\begin{aligned}&\frac{1}{N^2}\sum _{i=1}^N \sum _{j=1}^N \int _0^T\mathrm {sign}(x_i(t)-x_j(t)) \varphi _x(x_i(t),t)dt\\&\qquad - \frac{1}{N^2}\sum _{i=1}^N \sum \limits _{\begin{array}{c} j=1 \\ x_i\ne y_j \end{array}}^N \int _0^T\mathrm {sign}(x_i(t)-y_j(t)) \varphi _x(x_i(t),t)dt \\&\quad = \frac{2}{N}\sum _{i=1}^N \int _{0}^{T} \varphi _x(x_i(t),t) {\tilde{F}}^N(x_i(t),t) dt \\&\qquad -\frac{2}{N}\sum _{i=1}^N \int _{0}^{T} \varphi _x(x_i(t),t) {\tilde{H}}^N(x_i(t),t) dt + O(1/N), \end{aligned}$$

as \(N\rightarrow +\infty \). We now compute

$$\begin{aligned}&\frac{2}{N}\sum _{i=1}^N \int _{0}^{T} \varphi _x(x_i(t),t) {\tilde{F}}^N(x_i(t),t) dt-\frac{2}{N}\sum _{i=1}^N \int _{0}^{T} \varphi _x(x_i(t),t) {\tilde{H}}^N(x_i(t),t) dt \\&\quad = 2\sum _{i=1}^N \int _{0}^{T} \int _{x_{i-1}(t)}^{x_{i}(t)}\frac{1}{N(x_i(t)-x_{i-1}(t))}\varphi _x(x_i(t),t) {\tilde{F}}^N(x_i(t),t) \,dx\, dt\\&\qquad -2\sum _{i=1}^N \int _{0}^{T} \int _{x_{i-1}(t)}^{x_{i}(t)}\frac{1}{N(x_i(t)-x_{i-1}(t))}\varphi _x(x_i(t),t) {\tilde{H}}^N(x_i(t),t) \,dx\, dt\\&\quad = 2\int _{0}^{T}\int _{\mathbb {R}}{\tilde{\rho }}^N(x,t)\varphi _x(x,t){\tilde{F}}^N(x,t) \,dx\, dt\\&\qquad -2\int _{0}^{T}\int _{\mathbb {R}}{\tilde{\rho }}^N(x,t)\varphi _x(x,t){\tilde{H}}^N(x,t) \,dx\, dt+ R(N,T), \end{aligned}$$

with

$$\begin{aligned} |R(N,T)|&\le 2\sum _{i=1}^N\int _{0}^{T}\int _{x_{i-1}(t)}^{x_i(t)}\frac{1}{N(x_i(t)-x_{i-1}(t))}\left| \varphi _x(x_i(t),t)-\varphi _x(x,t)\right| {\tilde{F}}^N(x_i(t),t)dx dt\\&\quad + 2\sum _{i=1}^N\int _{0}^{T}\int _{x_{i-1}(t)}^{x_i(t)}\frac{1}{N(x_i(t)-x_{i-1}(t))}|\varphi _x(x,t)|\left| {\tilde{F}}^N(x_i(t),t)-{\tilde{F}}^N(x,t)\right| dx dt\\&\quad + 2\sum _{i=1}^N\int _{0}^{T}\int _{x_{i-1}(t)}^{x_i(t)}\frac{1}{N(x_i(t)-x_{i-1}(t))}\left| \varphi _x(x_i(t),t)-\varphi _x(x,t)\right| {\tilde{H}}^N(x_i(t),t)dx dt\\&\quad +2\sum _{i=1}^N\int _{0}^{T}\int _{x_{i-1}(t)}^{x_i(t)}\frac{1}{N(x_i(t)-x_{i-1}(t))}|\varphi _x(x,t)|\left| {\tilde{H}}^N(x_i(t),t)-{\tilde{H}}^N(x,t)\right| dx dt\\&\le \frac{2}{N}\Vert \varphi _{xx}\Vert _{L^\infty }\sum _{i=1}^N\int _0^T (x_i(t)-x_{i-1}(t)) dt + \frac{4T}{N}\Vert \varphi _x\Vert _{L^\infty }\le \frac{C}{N}, \end{aligned}$$

for some constant \(C\ge 0\) depending on the support of the initial datum \(\rho _0\), on T, and on the test function \(\varphi \). Combining the above estimates we obtain

$$\begin{aligned} \begin{aligned}&\frac{1}{N^2}\sum _{i=1}^N \sum _{j=1}^N \int _0^T\mathrm {sign}(x_i(t)-x_j(t)) \varphi _x(x_i(t),t)dt\\&\qquad - \frac{1}{N^2}\sum _{i=1}^N \sum \limits _{\begin{array}{c} j=1 \\ x_i\ne y_j \end{array}}^N \int _0^T\mathrm {sign}(x_i(t)-y_j(t)) \varphi _x(x_i(t),t)dt \\&\quad = 2\int _{0}^{T}\int _{\mathbb {R}}{\tilde{\rho }}^N(x,t)\varphi _x(x,t){\tilde{F}}^N(x,t) \,dx\, dt\\&\qquad -2\int _{0}^{T}\int _{\mathbb {R}}{\tilde{\rho }}^N(x,t)\varphi _x(x,t){\tilde{H}}^N(x,t) \,dx\, dt + O(1/N)\\&\quad =\int _0^T \iint _{{\mathbb {R}}\times {\mathbb {R}}}\mathrm {sign}(x-y)\varphi _x(x,t){\tilde{\rho }}^N(y,t){\tilde{\rho }}^N(x, t)\, dy\, dx\, dt \\&\qquad -\int _0^T \iint _{{\mathbb {R}}\times {\mathbb {R}}}\mathrm {sign}(x-y)\varphi _x(x,t){\tilde{\eta }}^N(y,t){\tilde{\rho }}^N(x, t)\, dy\, dx\, dt+ O(1/N) \end{aligned} \end{aligned}$$
(77)

as \(N\rightarrow +\infty \). Now, \({\tilde{\rho }}^N\) and \({\tilde{\eta }}^N\) are weakly compact in \(L^1_{loc}([0,T]\times {\mathbb {R}})\) according to Proposition 5, whereas \({\tilde{F}}^N\) and \({\tilde{H}}^N\) are strongly compact in \(L^1_{loc}\) and uniformly bounded by Lemma 8. Hence, the products \({\tilde{\rho }}^N{\tilde{F}}^N\) and \({\tilde{\rho }}^N{\tilde{H}}^N\) converge weakly in \(L^1_{loc}\), and we can pass to the limit in the last term of (77) and obtain the desired assertion. It is straightforward to extend the result to an \(L_{loc}^m\)-setting since we can readily apply Proposition 4 to infer weak \(L^m\)-compactness. \(\square \)

Remark 10

As pointed out in the introduction, our results allow us to establish a rigorous link between a discrete model such as (6) and the continuum system of PDEs (5). A similar result is proven in [21] for a general interaction kernel, possibly with a logarithmic repulsive singularity, by regularising the interaction potential V in the discrete setting by \(V_{\delta _N}\) having second derivative bounded in \(L^\infty \) by \(\lambda _N:=\Vert D ^2V_{\delta _N}\Vert _{L^\infty }\). However, the result requires, see Theorem 3.3 and Remark 3.4 of [21], for a general initial condition in \(L^1\log L^1\), that

$$\begin{aligned} e^{3T\lambda _N} N^{-1}\rightarrow 0 \end{aligned}$$

as \(N\rightarrow +\infty \). By smoothing our interaction potential \(V(x)=-|x|\) on an interval \([-\delta _N,\delta _N]\), we obtain the necessary condition that \(\delta _N\) must tend to zero more slowly than \(\frac{3T}{\log N}\) as \(N\rightarrow +\infty \). For a simple initial condition \(\rho _0(x)={\mathbf {1}}_{[0,1]}\) this implies that a considerable portion of the interactions is artificially “damped” in the discrete model. Indeed, since any two consecutive particles are at a distance of order 1/N for this initial condition, the regularisation by \(V_{\delta _N}\) affects the interaction of each particle with a number of particles of order \(\frac{N}{\log N}\). Our approach, on the other hand, allows us, in the one-dimensional case and with \(V(x)=-|x|\), to avoid any regularisation in the discrete setting.
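To illustrate the last point, the unregularised particle system can be integrated directly, since the velocities are bounded sums of signs. A rough Python sketch (explicit Euler on hypothetical initial data, with the velocity field read off from the weak formulation (74): same-species terms repel, cross-species terms attract):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 50
x = np.sort(rng.uniform(-1.0, 0.5, N))    # rho-species particles
y = np.sort(rng.uniform(-0.5, 1.0, N))    # eta-species particles

def velocities(x, y):
    # Bounded sums of signs: no smoothing parameter delta_N is needed.
    vx = (np.sign(x[:, None] - x[None, :]).sum(1)
          - np.sign(x[:, None] - y[None, :]).sum(1)) / N
    vy = (np.sign(y[:, None] - y[None, :]).sum(1)
          - np.sign(y[:, None] - x[None, :]).sum(1)) / N
    return vx, vy

com0 = np.concatenate([x, y]).mean()      # joint centre of mass
dt = 1e-3
for _ in range(500):                      # explicit Euler up to T = 0.5
    vx, vy = velocities(x, y)
    x, y = x + dt * vx, y + dt * vy

assert np.abs(np.concatenate([x, y])).max() < 2.0         # support stays bounded
assert abs(np.concatenate([x, y]).mean() - com0) < 1e-10  # conserved
```

The joint centre of mass is conserved exactly by the antisymmetry of the sign interactions, and the support remains bounded since each speed is at most 2, in line with Proposition 2.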