1 Introduction

The paper is concerned with a nonlocal balance equation in which the source/sink term describes the probability rate for a particle to disappear or to produce an offspring. The examined equation takes the form:

$$\begin{aligned} \partial _t m(t)+{\text {div}}(f(t,x,m(t))m(t))=g(t,x,m(t))m(t). \end{aligned}$$
(1)

It is endowed with the initial condition

$$\begin{aligned} m(0)=m_0. \end{aligned}$$

Here, m(t) is a measure on the phase space that describes the distribution of particles. Within the particle interpretation of the nonlocal balance equation, for each time t and measure m, \(x\mapsto f(t,x,m)\) is a velocity field, while g(t, x, m) provides the probability rate for a particle to disappear or to produce an offspring at the point x when the distribution of all particles at time t is m. This form of the nonlocal balance equation is used, in particular, in the study of models of opinion dynamics with time-varying weights [1, 2].

The nonlocal balance equation is a natural extension of the nonlocal continuity equation. The latter was examined in several works (see [3,4,5] and the references therein) and finds various applications in the analysis of pedestrian flows, opinion dynamics, etc. [6,7,8,9]. Notice that balance Eq. (1) becomes a continuity equation if we simply put \(g\equiv 0\). In this case, the total quantity of elements of the system is preserved, i.e., the solution is a flow in the space of probability measures. In the general balance equation, the right-hand side is an arbitrary signed measure, and the solution is considered in the space of signed measures. This is due to the fact that, if the negative part of the right-hand side is singular w.r.t. the measure variable, the solution also has a negative part [4, Theorem 7.1.1]. Existence and uniqueness results for the general balance equation were discussed in [10,11,12]. Additionally, papers [12, 13] provide a sensitivity analysis of solutions to the general balance equation. Furthermore, let us mention papers [14, 15], where systems of balance laws were examined.

The main results of the paper are the following. First, we derive the superposition principle that represents a solution of balance Eq. (1) through a distribution on the space of curves with time-varying weights (see Theorem 2). The second result is the conservation form of the balance equation. Here, we rewrite the balance equation as an equation on a flow of probability measures whose right-hand side is given by a Lévy–Khintchine generator acting on an augmented space (see Theorem 3). The third result is an approximation of the solution of the original nonlocal balance equation by the solution of a system of ODEs. Here, we assume that the velocity field f and the initial measure \(m_0\) are supported on a compact set \(\mathcal {K}\). The approximating system of ODEs takes the form

$$\begin{aligned} \frac{d}{dt}\beta (t)=\beta (t)Q(t,\beta (t))+\beta (t)G(t,\beta (t)), \end{aligned}$$
(2)

where, given \(t\in [0,T]\), \(\beta (t)\) is a row-vector indexed by elements of some finite set \(\mathcal {S}\), while, for each time t and vector \(\beta \), \(Q(t,\beta )\) and \(G(t,\beta )\) are a Kolmogorov matrix and a diagonal matrix, respectively (we say that a square matrix is Kolmogorov if its entries outside the diagonal are nonnegative and the entries in each row sum to zero). The matrices Q and G are also indexed by elements of \(\mathcal {S}\). In the case where the set \(\mathcal {S}\) is close to the compact set \(\mathcal {K}\) and the matrix Q approximates the velocity field on \(\mathcal {K}\) [see conditions (QS1)–(QS3)], we evaluate in Theorem 4 the distance between the measure m(t) and the empirical distribution \(\sum _{{x}\in \mathcal {S}}\beta _{{x}}(t)\delta _{{x}}\) for each time t (here \(\delta _{{x}}\) stands for the Dirac measure concentrated at x). Additionally, we show how one can construct a set \({\mathcal {S}}\) and a matrix Q achieving an arbitrarily small approximation rate (see Proposition 9). The matrix G is entirely determined by the function g and the set \(\mathcal {S}\). Notice that, by applying a numerical scheme to the system of ODEs (2), one can obtain a numerical solution of nonlocal balance Eq. (1) with controlled accuracy.

Let us briefly comment on the main results.

The superposition principle obtained in the paper is a counterpart of the famous superposition principle for the continuity equation [16,17,18,19,20]. The latter states the equivalence between the Lagrangian and Eulerian descriptions of systems of identical particles on a Euclidean space. Notice that, for the linear balance equation, the superposition principle was obtained previously in [21, 22]. The key ingredient in the superposition principle obtained in this paper for the nonlinear nonlocal balance Eq. (1) is an analog of the evaluation operator that assigns to a measure on the space of trajectories with varying weights a measure on the phase space describing the instantaneous distribution of particles. The latter takes into account not only the position occupied by a particle but also its weight.

Further, recall that the natural phase space for the nonlocal balance equation is the space of nonnegative measures. We endow this space with the distance first proposed by Piccoli and Rossi (see [7, 10, 11, 23]), which also works for signed measures. It generalizes the famous Wasserstein distance; thus, we call this metric a PRW-distance (PRW-metric). It turns out that the PRW-distance between two nonnegative measures on a Polish space can be computed as the standard Wasserstein distance between the probability measures obtained by placing the missing mass at a remote point and normalizing (see Sect. 2 for the details).

The concept of the remote point gives an interpretation of the source/sink term. The appearance of a particle is viewed as a jump from this remote point, while a disappearance is a jump to the remote point. Writing down the operators that describe these jump processes, we arrive at the conservation form of the nonlocal balance equation.

Due to the fact that Q is a Kolmogorov matrix, system (2) describes the evolution of a distribution in an infinite particle system where each particle can jump within the set \(\mathcal {S}\) or disappear/produce an offspring. Adding the remote point to the set \(\mathcal {S}\), we derive the conservation form of (2), which determines a continuous-time Markov chain acting on the set \(\mathcal {S}\) augmented by the remote point (see Proposition 7). Here, as above, we view the disappearance of a particle as a jump to the remote point, while the appearance of a particle is regarded as a jump from the remote point. Notice that the approximation of a deterministic evolution with controlled accuracy by a Markov chain was previously developed for mean field type systems (in this case, the evolution is given by a continuity equation) in [24] and for zero-sum differential games in [25]. The main novelty of the case examined in the paper is the presence of jumps from/to the remote point with intensities depending on the current distribution. Thus, to evaluate the approximation rate, we propose a synchronization of these jump processes in the original and approximating systems.

The paper is organized as follows. In Sect. 2, we introduce general notation and recall the notion of the PRW-distance proposed in [10, 11]. Moreover, this section provides an equivalent form of the PRW-distance as a normalized usual Wasserstein distance between probabilities on a space augmented by a remote point. The equilibrium distribution of curves with weights is discussed in Sect. 3; in particular, we prove the existence and uniqueness result for this distribution. In the next section, we derive the superposition principle for the examined nonlocal balance equation. Further, we rewrite the balance equation in the conservation form (see Sect. 5). In Sect. 6, we introduce an approximating system of ODEs. Finally, we evaluate the approximation rate in Sect. 7.

2 Preliminaries

2.1 General Notation

Given \(a,b\in \mathbb {R}\), we denote by \(a\wedge b\) the minimum of these numbers. Similarly, \(a\vee b\) stands for the maximum of a and b.

If n is a natural number, \(X_1,\ldots ,X_n\) are sets, and \(i_1,\ldots ,i_k\) are distinct numbers from \(\{1,\ldots ,n\}\), then we denote by \({\text {p}}^{i_1,\ldots ,i_k}\) the natural projection from \(X_1\times \ldots \times X_n\) onto \(X_{i_1}\times \ldots \times X_{i_k}\), i.e.,

$$\begin{aligned} {\text {p}}^{i_1,\ldots ,i_k}(x_1,\ldots ,x_n)\triangleq (x_{i_1},\ldots ,x_{i_k}). \end{aligned}$$

If \((\Omega ,\mathcal {F})\) is a measurable space and m is a measure on \(\Omega \), then we denote by \(\Vert m\Vert \) the total mass of m, i.e., \(\Vert m\Vert \triangleq m(\Omega )\). Furthermore, if \(m_1,m_2\) are measures on \(\mathcal {F}\), then we write \(m_1\leqq m_2\) when, for each \(\Upsilon \in \mathcal {F}\),

$$\begin{aligned} m_1(\Upsilon )\le m_2(\Upsilon ). \end{aligned}$$

If \((\Omega ,\mathcal {F})\), \((\Omega ',\mathcal {F}')\) are two measurable spaces, m is a measure on \(\mathcal {F}\), and \(h:\Omega \rightarrow \Omega '\) is an \(\mathcal {F}/\mathcal {F}'\)-measurable mapping, then \(h\sharp m\) stands for the push-forward of the measure m through h defined by the rule: for each \(\Upsilon \in \mathcal {F}'\),

$$\begin{aligned} (h\sharp m)(\Upsilon )\triangleq m(h^{-1}(\Upsilon )). \end{aligned}$$

If \((X,\rho _X)\), \((Y,\rho _Y)\) are Polish spaces, then C(X, Y) denotes the set of all continuous functions from X to Y; \(C_b(X,Y)\) is the subset of bounded continuous functions. Further, \({\text {Lip}}_{c}(X,Y)\) stands for the set of functions from X to Y that are Lipschitz continuous with constant c. When \(Y=\mathbb {R}\), we will omit the second argument. In particular, \(C_b(X)\) denotes the set of all continuous bounded functions from X to \(\mathbb {R}\); it is endowed with the usual \(\sup \)-norm.

If \((X,\rho _X)\) is a Polish space, then we denote by \(\mathcal {M}(X)\) the set of all (nonnegative) finite Borel measures on X and by \(\mathcal {P}(X)\) the set of all probability measures on X, i.e.,

$$\begin{aligned} \mathcal {P}(X)=\{m\in \mathcal {M}(X):m(X)=1\}. \end{aligned}$$

On \(\mathcal {M}(X)\), we consider the narrow convergence defined as follows: we say that a sequence of measures \(\{m_n\}_{n=1}^\infty \) narrowly converges to \(m\in \mathcal {M}(X)\) if, for every \(\phi \in C_b(X)\),

$$\begin{aligned} \int _X \phi (x)m_n(dx)\rightarrow \int _X\phi (x)m(dx)\quad \text {as }n\rightarrow \infty . \end{aligned}$$

Notice that \(\mathcal {P}(X)\) is closed in the topology of narrow convergence.

Further, if \(p\ge 1\), then we consider the set \(\mathcal {P}^p(X)\) of all probabilities with finite p-th moment. This means that a probability m lies in \(\mathcal {P}^p(X)\) iff, for some (equivalently, every) \(x_*\in X\),

$$\begin{aligned} \int _X (\rho _X(x,x_*))^pm(dx)<\infty . \end{aligned}$$

The space \(\mathcal {P}^p(X)\) is endowed with the Wasserstein distance defined by the rule: for \(m_1,m_2\in \mathcal {P}^p(X)\),

$$\begin{aligned} W_p(m_1,m_2)\triangleq \Bigg [\inf _{\pi \in \Pi (m_1,m_2)}\int _{X\times X}(\rho _X(x_1,x_2))^p\pi (d(x_1,x_2))\Bigg ]^{1/p}. \end{aligned}$$

Here \(\Pi (m_1,m_2)\) denotes the set of plans between \(m_1\) and \(m_2\), i.e., the set of all probabilities \(\pi \) on \(X\times X\) such that \({\text {p}}^1\sharp \pi =m_1\), \({\text {p}}^2\sharp \pi =m_2\).

2.2 PRW-Distance

In this section, we discuss the extension of the Wasserstein distance to the space of nonnegative measures. The definition used in the paper follows the approaches to this problem proposed in [10, 11, 23].

Definition 1

Let \(m_1,m_2\in \mathcal {M}(X)\), and let \(p\ge 1\), \(b>0\). We call the quantity \(\mathcal {W}_{p,b}(m_1,m_2)\) defined by the rule

$$\begin{aligned}{} & {} (\mathcal {W}_{p,b}(m_1,m_2))^p\nonumber \\{} & {} \quad \triangleq \inf \Bigg \{b^p\Vert m_1-{\widehat{m}}{}_1\Vert +b^p\Vert m_2-{\widehat{m}}_2\Vert +\int _{X\times X}(\rho _X(x_1,x_2))^p\varpi (d(x_1,x_2)):\nonumber \\{} & {} \qquad {\widehat{m}}_1\leqq m_1,\, {\widehat{m}}_2\leqq m_2,\, \Vert {\widehat{m}}_1\Vert =\Vert {\widehat{m}}_2\Vert ,\, \varpi \in \Pi ({\widehat{m}}_1,{\widehat{m}}_2)\Bigg \} \end{aligned}$$
(3)

a PRW-distance.

Here, for \(m_1,m_2\in \mathcal {M}(X)\) of equal mass (\(\Vert m_1\Vert =\Vert m_2\Vert \)), \(\Pi (m_1,m_2)\) is the set of measures \(\pi \) on \(X\times X\) such that \({\text {p}}^1\sharp \pi =m_1\), \({\text {p}}^2\sharp \pi =m_2\).

Notice that \(\mathcal {W}_{p,b}\) coincides up to renormalization and renaming of parameters with the generalized Wasserstein distance introduced in [11, Definition 11].

In the paper, we will widely use the representation of the metric \(\mathcal {W}_{p,b}\) as a usual Wasserstein metric on an augmented space. To construct this representation, we first extend the Polish space \((X,\rho _X)\) by adding to X a remote point \(\star \). The distance on \(X\cup \{\star \}\) is defined by the rule:

$$\begin{aligned} \rho _\star (x,y)\triangleq \left\{ \begin{array}{ll} b, &{} x\in X, y=\star \text { or }x=\star ,y\in X,\\ 0, &{} x=y=\star ,\\ \rho _X(x,y)\wedge 2b, &{} x,y\in X \end{array} \right. \end{aligned}$$
(4)

Here b is the positive constant used in Definition 1. When it does not lead to confusion, we will omit the subindex \(\star \). In particular, if \(\mathcal {K}\) is a compact subset of a finite-dimensional space and \(b>{\text {diam}}(\mathcal {K})/2\), the restriction of the distance \(\rho _\star \) to \(\mathcal {K}\) coincides with the usual Euclidean distance. Thus, we will denote the distance on \(\mathcal {K}\cup \{\star \}\) by \(\Vert x_1-x_2\Vert \).

If m is a measure on X and \(R> \Vert m\Vert \), then we can extend m to the space \(X\cup \{\star \}\) by setting, for a Borel set \(\Upsilon \subset X\cup \{\star \}\),

$$\begin{aligned} (m\triangleright _{R})(\Upsilon )\triangleq m(\Upsilon \cap X)+(R-m(X))\mathbbm {1}_\Upsilon (\star ). \end{aligned}$$
(5)

If \(m_1,m_2\in \mathcal {M}(X)\) and \(R\ge \Vert m_1\Vert \vee \Vert m_2\Vert \), then we put

$$\begin{aligned} \widetilde{\mathcal {W}}_{p,b}(m_1,m_2)\triangleq R^{1/p}W_p(R^{-1}(m_1\triangleright _{R}),R^{-1}(m_2\triangleright _{R})). \end{aligned}$$
(6)

Proposition 1

If \(m_1,m_2\in \mathcal {M}(X)\), \(R> \Vert m_1\Vert \vee \Vert m_2\Vert \), then

$$\begin{aligned}\widetilde{\mathcal {W}}_{p,b}(m_1,m_2)=\mathcal {W}_{p,b}(m_1,m_2).\end{aligned}$$

In particular, the quantity \(\widetilde{\mathcal {W}}_{p,b}(m_1,m_2)\) does not depend on the choice of R greater than or equal to \(\Vert m_1\Vert \vee \Vert m_2\Vert \).

Proof

We have that

$$\begin{aligned} \begin{aligned} (\widetilde{\mathcal {W}}_{p,b}(m_1,m_2))^p&=\inf \Bigg \{\int _{(X\cup \{\star \})\times (X\cup \{\star \})}(\rho _\star (x_1,x_2))^p\pi (d(x_1,x_2)):\\&\qquad \pi \in \Pi ((m_1\triangleright _{R}),(m_2\triangleright _{R}))\Bigg \}.\end{aligned} \end{aligned}$$

Given an optimal plan \(\pi \in \Pi ((m_1\triangleright _{R}),(m_2\triangleright _{R}))\) between \((m_1\triangleright _{R})\) and \((m_2\triangleright _{R})\), we represent \(\pi \) as the sum of four measures:

$$\begin{aligned}\pi =\pi _{X,X}+\pi _{X,\star }+\pi _{\star ,X}+\pi _{\star ,\star },\end{aligned}$$

where

  • \({\text {supp}}(\pi _{X,X})\subset X\times X\);

  • \({\text {supp}}(\pi _{X,\star })\subset X\times \{\star \}\);

  • \({\text {supp}}(\pi _{\star ,X})\subset \{\star \}\times X\);

  • \({\text {supp}}(\pi _{\star ,\star })\subset \{(\star ,\star )\}\).

Notice that

$$\begin{aligned} m_1={\text {p}}^1\sharp (\pi _{X,X}+\pi _{X,\star }),\quad m_2={\text {p}}^2\sharp (\pi _{X,X}+\pi _{\star ,X}). \end{aligned}$$

Set \({\widehat{m}}_1\triangleq {\text {p}}^1\sharp \pi _{X,X}\), \({\widehat{m}}_2\triangleq {\text {p}}^2\sharp \pi _{X,X}\). By construction, we have that

$$\begin{aligned}{\widehat{m}}_1\leqq m_1,\quad {\widehat{m}}_2\leqq m_2. \end{aligned}$$

Due to the definition of the measures \((m_1\triangleright _{R})\), \((m_2\triangleright _{R})\), we have that

$$\begin{aligned}{} & {} \int _{(X\cup \{\star \})\times (X\cup \{\star \})}(\rho _\star (x_1,x_2))^p\pi (d(x_1,x_2))\nonumber \\{} & {} \quad = b^p\Vert m_1-{\widehat{m}}_1\Vert +b^p\Vert m_2-{\widehat{m}}_2\Vert +\int _{X\times X}(\rho _\star (x_1,x_2))^p\pi _{X,X}(d(x_1,x_2)). \end{aligned}$$
(7)

Further, the assumption that \(\pi \) is an optimal plan between \((m_1\triangleright _{R})\) and \((m_2\triangleright _{R})\) and the definition of the distance \(\rho _\star \) [see (4)] give that \(\rho _\star =\rho _X\) on \({\text {supp}}(\pi _{X,X})\). This and (7) imply that

$$\begin{aligned} (\widetilde{\mathcal {W}}_{p,b}(m_1,m_2))^p\ge (\mathcal {W}_{p,b}(m_1,m_2))^p. \end{aligned}$$
(8)

To prove the opposite inequality, we choose three measures \({\widehat{m}}_1\in \mathcal {M}(X)\), \({\widehat{m}}_2\in \mathcal {M}(X)\), \(\varpi \in \mathcal {M}(X\times X)\) such that

$$\begin{aligned} \Vert {\widehat{m}}_1\Vert =\Vert {\widehat{m}}_2\Vert ,\quad {\widehat{m}}_1\leqq m_1,\quad {\widehat{m}}_2\leqq m_2,\quad \varpi \in \Pi ({\widehat{m}}_1,{\widehat{m}}_2). \end{aligned}$$

Further, let \({\mathcalligra{r}}\) be the mapping that assigns to each element \(x\in X\) the remote point \(\star \). Since \(R\ge \Vert m_1\Vert \vee \Vert m_2\Vert \), the measure \(\pi \) defined by the rule:

$$\begin{aligned} \pi \triangleq R^{-1}\Bigg [\varpi +({\text {Id}},{\mathcalligra{r}})\sharp (m_1-{\widehat{m}}_1)+({\mathcalligra{r}},{\text {Id}})\sharp (m_2-{\widehat{m}}_2)+(R-\Vert m_1\Vert -\Vert m_2\Vert +\Vert {\widehat{m}}_1\Vert )\delta _{(\star ,\star )}\Bigg ] \end{aligned}$$

lies in \(\Pi (R^{-1}(m_1\triangleright _{R}),R^{-1}(m_2\triangleright _{R}))\). We have that

$$\begin{aligned}\begin{aligned}&R \int _{(X\cup \{\star \})\times (X\cup \{\star \})}(\rho _\star (x_1,x_2))^p\pi (d(x_1,x_2))\\&\quad = \int _{X\times X}(\rho _X(x_1,x_2))^p\varpi (d(x_1,x_2))+b^p\Vert m_1 -{\widehat{m}}_1\Vert +b^p\Vert m_2-{\widehat{m}}_2\Vert . \end{aligned} \end{aligned}$$

Hence,

$$\begin{aligned} (\widetilde{\mathcal {W}}_{p,b}(m_1,m_2))^p\le (\mathcal {W}_{p,b}(m_1,m_2))^p. \end{aligned}$$

This together with (8) gives the conclusion of the proposition. \(\square \)
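For intuition, Proposition 1 can be tested numerically on discrete measures. The following minimal Python sketch implements (4)–(6) for \(p=1\) and measures on the real line. It assumes the third-party POT library (whose function `ot.emd2` returns the optimal transport cost for given marginals and a cost matrix); the helper `prw_distance` and the sample data are ours, introduced purely for illustration.

```python
# A numerical sketch of Proposition 1 for p = 1 and discrete measures on the
# real line. Requires the POT library (pip install pot); ot.emd2 returns the
# optimal transport cost between two weight vectors for a given cost matrix.
import numpy as np
import ot

def prw_distance(x1, w1, x2, w2, b, R=None):
    """W_{1,b}(m1, m2) for m_i = sum_k w_i[k] * delta_{x_i[k]} via formula (6):
    the missing mass R - ||m_i|| is placed at the remote point, and the usual
    Wasserstein distance is computed for the metric rho_star of (4)."""
    if R is None:
        R = max(w1.sum(), w2.sum()) + 1.0      # any R > ||m1|| v ||m2||
    a1 = np.append(w1, R - w1.sum()) / R       # (m1 |> R) / R, cf. (5)
    a2 = np.append(w2, R - w2.sum()) / R
    M = np.full((len(x1) + 1, len(x2) + 1), float(b))  # jumps to/from the star
    M[:-1, :-1] = np.minimum(np.abs(x1[:, None] - x2[None, :]), 2.0 * b)
    M[-1, -1] = 0.0                            # star-to-star transport is free
    return R * ot.emd2(a1, a2, M)              # formula (6) with p = 1

x1, w1 = np.array([0.0, 1.0]), np.array([0.5, 0.5])   # m1, total mass 1.0
x2, w2 = np.array([0.2]), np.array([0.8])             # m2, total mass 0.8
print(prw_distance(x1, w1, x2, w2, b=1.0, R=2.0))
print(prw_distance(x1, w1, x2, w2, b=1.0, R=10.0))
```

In line with Proposition 1, the two printed values coincide even though different values of R are used.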

Now let us estimate the total mass of a measure in terms of its distance to a given measure.

Proposition 2

Let \(m_1,m_2\in \mathcal {M}(X)\), \(\mathcal {W}_{p,b}(m_1,m_2)\le c\). Then \(\Vert m_1\Vert \le b^{-p}c^p+\Vert m_2\Vert \).

Proof

From Definition 1 and the assumption of the proposition, we have that

$$\begin{aligned} b^p(\Vert m_1\Vert -\Vert {\widehat{m}}_1\Vert )\le b^p\Vert m_1-{\widehat{m}}_1\Vert \le c^p, \end{aligned}$$

where \({\widehat{m}}_1\), \({\widehat{m}}_2\), \(\varpi \) is a triple providing the minimum on the right-hand side of (3). Furthermore,

$$\begin{aligned} \Vert {\widehat{m}}_1\Vert =\Vert {\widehat{m}}_2\Vert \le \Vert m_2\Vert . \end{aligned}$$

This gives the conclusion of the proposition. \(\square \)

Proposition 3

The space \(\mathcal {M}(X)\) of finite measures on X endowed with the distance \(\mathcal {W}_{p,b}\) is complete. If, additionally, X is compact and \(R>0\), then \(\{m\in \mathcal {M}(X):\Vert m\Vert \le R\}\) is also compact.

Proof

To prove the first statement of the proposition, consider a Cauchy sequence of measures \(\{m_i\}_{i=1}^\infty \). From Proposition 2, it follows that \(\Vert m_i\Vert \le R\) for some positive constant R. Thus, the sequence of probabilities \(\{R^{-1}(m_i\triangleright _{R})\}_{i=1}^\infty \) on \(X\cup \{\star \}\) is Cauchy in \(W_p\). Therefore, \(\{R^{-1}(m_i\triangleright _{R})\}_{i=1}^\infty \) converges to some probability \({\widetilde{m}}\in \mathcal {P}^p(X\cup \{\star \})\). Letting m be the restriction of \(R{\widetilde{m}}\) to X, we conclude that \(\{m_i\}_{i=1}^\infty \) converges to m in \(\mathcal {W}_{p,b}\).

To prove the second statement of the proposition, it suffices to notice that the set of probabilities on \(X\cup \{\star \}\) is compact, while the mapping \( m\mapsto R^{-1}(m\triangleright _{R})\) is an isomorphism between the sets \(\{m\in \mathcal {M}(X):\Vert m\Vert \le R\}\) and \(\mathcal {P}^p(X\cup \{\star \})\). \(\square \)

Below, we restrict our attention to the case \(p=1\). This metric coincides, up to a multiplicative constant, with the one used in [10].

2.3 Phase Space and Space of Weighted Curves

Let \(\mathcal {M}^1(\mathbb {R}^d)\) be the set of measures m on \(\mathbb {R}^d\) such that \(\int _{\mathbb {R}^d}\Vert x\Vert m(dx)<\infty \). Additionally, denote by \(\mathcal {D}(\mathbb {R}^d)\) the set of continuous functions \(\phi \) on \(\mathbb {R}^d\) with sublinear growth.

If \(T>0\), we denote the set of continuous functions \(\gamma :[0,T]\rightarrow \mathbb {R}^d\times [0,+\infty )\) by \(\Gamma _T\). An element of \(\Gamma _T\) is interpreted as a curve with a time-varying weight. Indeed, if \(\gamma (\cdot )=(x(\cdot ),w(\cdot ))\in \Gamma _T\) and \(t\in [0,T]\), we regard w(t) as the relative weight of a particle at the point \(x=x(t)\) at time t. If \(C>0\), then we denote by \(\Gamma ^C_T\) the set of curves with weights bounded by the constant C, i.e.,

$$\begin{aligned} \Gamma _T^C\triangleq \big \{\gamma (\cdot )=(x(\cdot ),w(\cdot ))\in \Gamma _T:\, w(t)\in [0,C]\text { for each }t\in [0,T]\big \}. \end{aligned}$$

Below, we will consider the probabilities on \(\Gamma _T^C\) with a finite first moment. If \(\eta \in \mathcal {P}^1(\Gamma _T^C)\), \(t\in [0,T]\), then we denote by \(\lfloor {\eta }\rfloor _{t}\) the measure on \(\mathbb {R}^d\) defined by the rule: for \(\phi \in C_b(\mathbb {R}^d)\),

$$\begin{aligned} \int _{\mathbb {R}^d}\phi (x)\lfloor {\eta }\rfloor _{t}(dx)\triangleq \int _{\Gamma _T^C}\phi (x(t))w(t)\eta (d(x(\cdot ),w(\cdot ))). \end{aligned}$$
(9)

The measure \(\lfloor {\eta }\rfloor _{t}\) gives the distribution of the masses at time t on the phase space in the case when the particles and their weights are distributed according to \(\eta \).

Notice that the measure \(\lfloor {\eta }\rfloor _{t}\in \mathcal {M}^1(\mathbb {R}^d)\). Furthermore, equality (9) holds true for each \(\phi \in \mathcal {D}(\mathbb {R}^d)\).
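For example, if \(\eta =\frac{1}{N}\sum _{i=1}^N\delta _{(x_i(\cdot ),w_i(\cdot ))}\) is an empirical distribution of N weighted curves, then (9) yields the weighted empirical measure

$$\begin{aligned} \lfloor {\eta }\rfloor _{t}=\frac{1}{N}\sum _{i=1}^N w_i(t)\delta _{x_i(t)}, \end{aligned}$$

so each curve contributes to the mass distribution in proportion to its current weight.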

Now let us show that the mapping \(\mathcal {P}^1(\Gamma _T^C)\ni \eta \mapsto \lfloor {\eta }\rfloor _{t}\) is Lipschitz continuous.

Proposition 4

If \(C>0\), \(\eta _1,\eta _2\in \mathcal {P}^1(\Gamma _T^C)\), \(t\in [0,T]\), then

$$\begin{aligned} \mathcal {W}_{1,b}(\lfloor {\eta _1}\rfloor _{t},\lfloor {\eta _2}\rfloor _{t})\le C_0W_1(\eta _1,\eta _2), \end{aligned}$$

where \(C_0=C\vee b\).

Proof

Let \(m_1\triangleq \lfloor {\eta _1}\rfloor _{t}\), \(m_2\triangleq \lfloor {\eta _2}\rfloor _{t}\), and let \(\pi \in \Pi (\eta _1,\eta _2)\) be an optimal plan between \(\eta _1\) and \(\eta _2\). We define the measure \(\varpi \) by the following rule: for \(\phi \in C_b(\mathbb {R}^d\times \mathbb {R}^d)\),

$$\begin{aligned}{} & {} \int _{\mathbb {R}^d\times \mathbb {R}^d}\phi (x_1,x_2)\varpi (d(x_1,x_2))\nonumber \\{} & {} \quad \triangleq \int _{\Gamma _T^C\times \Gamma _T^C}\phi (x_1(t),x_2(t))(w_1(t)\wedge w_2(t))\pi (d((x_1(\cdot ),w_1(\cdot )),(x_2(\cdot ),w_2(\cdot )))). \end{aligned}$$
(10)

Further, set \({\widehat{m}}_1\triangleq {\text {p}}^1\sharp \varpi \), \({\widehat{m}}_2\triangleq {\text {p}}^2\sharp \varpi \). Obviously, \(\varpi \in \Pi ({\widehat{m}}_1,{\widehat{m}}_2)\). Further, for each \(\phi \in C_b(\mathbb {R}^d)\),

$$\begin{aligned} \begin{aligned}&\int _{\mathbb {R}^d}\phi (x)(m_1-{\widehat{m}}_1)(dx)\\&\quad =\int _{\Gamma _T^C\times \Gamma _T^C}\phi (x_1(t))(w_1(t)-(w_1(t)\wedge w_2(t)))\pi (d((x_1(\cdot ),w_1(\cdot )),(x_2(\cdot ),w_2(\cdot )))),\end{aligned} \end{aligned}$$

while

$$\begin{aligned}{} & {} \int _{\mathbb {R}^d}\phi (x)(m_2-{\widehat{m}}_2)(dx)\nonumber \\{} & {} \quad =\int _{\Gamma _T^C\times \Gamma _T^C}\phi (x_2(t)) (w_2(t)-(w_1(t)\wedge w_2(t)))\nonumber \\{} & {} \qquad \pi (d((x_1(\cdot ),w_1(\cdot )),(x_2(\cdot ),w_2(\cdot )))). \end{aligned}$$
(11)

Due to the fact that \(\eta _1,\eta _2\in \mathcal {P}^1(\Gamma _T^C)\), one can extend equalities (10)–(11) to the set of functions \(\phi \) with sublinear growth. Therefore,

$$\begin{aligned}{} & {} \int _{\mathbb {R}^d\times \mathbb {R}^d}\Vert x_1-x_2\Vert \varpi (d(x_1,x_2))\nonumber \\{} & {} \quad =\int _{\Gamma _T^C\times \Gamma _T^C}\Vert x_1(t)-x_2(t)\Vert (w_1(t)\wedge w_2(t))\pi (d((x_1(\cdot ),w_1(\cdot )),(x_2(\cdot ),w_2(\cdot )))) \nonumber \\{} & {} \quad \le C \int _{\Gamma _T^C\times \Gamma _T^C}\Vert x_1(t)-x_2(t)\Vert \pi (d((x_1(\cdot ),w_1(\cdot )),(x_2(\cdot ),w_2(\cdot )))). \end{aligned}$$
(12)

Furthermore, using Jensen’s inequality, we deduce that

$$\begin{aligned}{} & {} \Vert m_1-{\widehat{m}}_1\Vert +\Vert m_2-{\widehat{m}}_2\Vert \nonumber \\{} & {} \quad =\int _{\Gamma _T^C\times \Gamma _T^C} ((w_1(t)\vee w_2(t))-(w_1(t)\wedge w_2(t)))\pi (d((x_1(\cdot ),w_1(\cdot )),(x_2(\cdot ),w_2(\cdot ))))\nonumber \\{} & {} \quad =\int _{\Gamma _T^C\times \Gamma _T^C} |w_1(t)-w_2(t)|\pi (d((x_1(\cdot ),w_1(\cdot )),(x_2(\cdot ),w_2(\cdot )))). \end{aligned}$$
(13)

Combining this with (12), we obtain the statement of the proposition. \(\square \)

Proposition 5

Let \(\eta \in \mathcal {P}^1(\Gamma _T^C)\), and let c be a constant such that, for \(\eta \)-a.e. curves \((x(\cdot ),w(\cdot ))\in \Gamma _T^C\) and all \(s,r\in [0,T]\), one has that

$$\begin{aligned} \Vert x(s)-x(r)\Vert \le {c}|r-s|,\, |w(s)-w(r)|\le {c}|r-s|. \end{aligned}$$
(14)

Then,

$$\begin{aligned} \mathcal {W}_{1,b}(\lfloor {\eta }\rfloor _{s},\lfloor {\eta }\rfloor _{r})\le {c}(C+b)|r-s|. \end{aligned}$$

Proof

Without loss of generality, we assume that \(s<r\). For each curve \((x(\cdot ),w(\cdot ))\in \Gamma _T\) satisfying (14), we have that

$$\begin{aligned} \Vert x(s)(w(s)\wedge w(r))-x(r)(w(s)\wedge w(r))\Vert \le C{c}(r-s). \end{aligned}$$

Furthermore,

$$\begin{aligned} |w(s)-w(r)|\le {c}(r-s). \end{aligned}$$

Let \(\varpi \) be defined by the rule: for each \(\phi \in C_b(\mathbb {R}^d\times \mathbb {R}^d)\),

$$\begin{aligned} \int _{\mathbb {R}^d\times \mathbb {R}^d}\phi (x_1,x_2)\varpi (d(x_1,x_2))\triangleq \int _{\Gamma _T^C}\phi (x(s),x(r))(w(s)\wedge w(r))\eta (d(x(\cdot ),w(\cdot ))), \end{aligned}$$

and set \({\widehat{m}}_s\triangleq {\text {p}}^1\sharp \varpi \), \({\widehat{m}}_r\triangleq {\text {p}}^2\sharp \varpi \).

Therefore, we have that

$$\begin{aligned} \begin{aligned}&\mathcal {W}_{1,b}(\lfloor {\eta }\rfloor _{s},\lfloor {\eta }\rfloor _{r})\\&\quad \le \int _{\mathbb {R}^d\times \mathbb {R}^d}\Vert x_1-x_2\Vert \varpi (d(x_1,x_2))+b\big \Vert \lfloor {\eta }\rfloor _{s}-{\widehat{m}}_s\big \Vert +b\big \Vert \lfloor {\eta }\rfloor _{r}-{\widehat{m}}_r\big \Vert \\&\quad = \int _{\Gamma _T^C}\big (\Vert x(s)-x(r)\Vert (w(s)\wedge w(r))\\&\qquad +b((w(s)\vee w(r))-(w(s)\wedge w(r)))\big )\eta (d(x(\cdot ),w(\cdot )))\\&\quad \le {c}(C+b)(r-s). \end{aligned} \end{aligned}$$

\(\square \)

3 Distribution on the Space of Varying Weighted Curves

In this section, we examine the equilibrium distribution of curves with time-varying weights satisfying the dynamics

$$\begin{aligned}{} & {} \frac{d}{dt}x(t)=f(t,x(t),\lfloor {\eta }\rfloor _{t}),\nonumber \\{} & {} \frac{d}{dt}w(t)=g(t,x(t),\lfloor {\eta }\rfloor _{t})w(t). \end{aligned}$$
(15)

The initial distribution is assumed to be the same as for balance Eq. (1), i.e., equal to \(m_0\).

Definition 2

Let \(C>0\). We say that \(\eta \in \mathcal {P}^1(\Gamma _T^C)\) is an equilibrium distribution of the weighted trajectories provided that

  • \(\lfloor {\eta }\rfloor _{0}=m_0\);

  • \(\eta \)-a.e. curves \((x(\cdot ),w(\cdot ))\) satisfy (15) and the initial condition \(w(0)=\Vert m_0\Vert \).

Hereinafter, we impose the following conditions on f and g:

  • (A1) \(m_0\in \mathcal {M}^1(\mathbb {R}^d)\);

  • (A2) f and g are continuous;

  • (A3) f and g are Lipschitz continuous w.r.t. x and m with Lipschitz constants \(C_{Lf}\) and \(C_{Lg}\), respectively;

  • (A4) there exist constants \(C_f\) and \(C_g\) such that, for every \(t\in [0,T]\), \(x\in \mathbb {R}^d\), \(m\in \mathcal {M}(\mathbb {R}^d)\),

    $$\begin{aligned} \Vert f(t,x,m)\Vert \le C_f,\quad |g(t,x,m)|\le C_g. \end{aligned}$$

Theorem 1

For each \(T>0\) and \(C>C_1\triangleq \Vert m_0\Vert e^{TC_g}\), there exists a unique equilibrium distribution of weighted trajectories lying in \(\mathcal {P}^1(\Gamma _T^C)\).

Proof

First, let us define

$$\begin{aligned} {\widetilde{\Gamma }}_T\triangleq \Big \{(x(\cdot ),w(\cdot ))\in \Gamma _T^{C_1}:x(\cdot )\in {\text {Lip}}_{C_f}([0,T];\mathbb {R}^d),\, w(\cdot )\in {\text {Lip}}_{C_gC_1}([0,T];\mathbb {R})\Big \}. \end{aligned}$$

Furthermore, we denote

$$\begin{aligned} {\mathscr {P}}\triangleq \Big \{\eta \in \mathcal {P}^1({\widetilde{\Gamma }}_T):\ \ e_0\sharp \eta = ({\text {Id}},\Vert m_0\Vert )\sharp (\Vert m_0\Vert ^{-1}m_0)\Big \}. \end{aligned}$$

Hereinafter, \(e_t\) stands for the evaluation operator from \(\Gamma _T\) to \(\mathbb {R}^d\times [0,+\infty )\) that assigns to each \(\gamma \in \Gamma _T\) its value at time t, i.e., \(e_t(\gamma )=\gamma (t)\).

The definition of the set \({\mathscr {P}}\) gives, in particular, that, for \(\eta \in {\mathscr {P}}\) and \(\eta \)-a.e. \(\gamma \in {\widetilde{\Gamma }}_T\), one has

$$\begin{aligned} \Vert \gamma \Vert \le TC_f+C_1+\Vert \gamma (0)\Vert . \end{aligned}$$

Therefore, the probabilities from \({\mathscr {P}}\) are uniformly integrable. The inclusion \({\mathscr {P}}\subset \mathcal {P}^1({\widetilde{\Gamma }}_T)\), the definition of \({\widetilde{\Gamma }}_T\), and the Arzelà–Ascoli theorem give that \({\mathscr {P}}\) is tight. Therefore, \({\mathscr {P}}\) is compact in the topology produced by the metric \(W_1\) [16, Proposition 7.1.5].

If \(\eta \in {\mathscr {P}}\), then denote by \({\text {traj}}_{\eta }\) the operator that assigns to each \(x_0\in \mathbb {R}^d\) the pair of functions \((x(\cdot ),w(\cdot ))\) satisfying

$$\begin{aligned} \begin{aligned}&\frac{d}{dt}x(t)=f(t,x(t),\lfloor {\eta }\rfloor _{t}),\quad x(0)=x_0;\\&\frac{d}{dt}w(t)=g(t,x(t),\lfloor {\eta }\rfloor _{t})w(t),\quad w(0)=\Vert m_0\Vert . \end{aligned} \end{aligned}$$

Further, set

$$\begin{aligned} \Phi (\eta )\triangleq {\text {traj}}_{\eta }\sharp (\Vert m_0\Vert ^{-1}m_0). \end{aligned}$$
(16)

By construction, \(\Phi (\eta )\in \mathcal {P}(\Gamma _T)\). Further,

$$\begin{aligned} \lfloor {\Phi (\eta )}\rfloor _{0}=m_0. \end{aligned}$$

Notice that \(\eta \) is an equilibrium distribution of curves with time-varying weights if and only if

$$\begin{aligned}\Phi (\eta )=\eta .\end{aligned}$$

Let us show that the operator \(\Phi \) defined by (16) maps \({\mathscr {P}}\) into itself. To this end, we choose \((x(\cdot ),w(\cdot ))={\text {traj}}_{\eta }(x_0)\). For each \(s,r\in [0,T]\), one has that

$$\begin{aligned} \Vert x(r)-x(s)\Vert \le \int _s^r \Vert f(t,x(t),\lfloor {\eta }\rfloor _{t})\Vert dt. \end{aligned}$$

This and condition (A4) yield that \(x(\cdot )\in {\text {Lip}}_{C_f}([0,T];\mathbb {R}^d)\). Further, since \(|g(t,x(t),\lfloor {\eta }\rfloor _{t})|\le C_g\), we conclude, using Gronwall’s inequality, that \(|w(t)|\le C_1\). Additionally, for \(s,r\in [0,T]\),

$$\begin{aligned} |w(r)-w(s)|\le \int _s^rw(t)|g(t,x(t),\lfloor {\eta }\rfloor _{t})|dt\le C_gC_1|r-s|. \end{aligned}$$

Thus, \(w(\cdot )\in {\text {Lip}}_{C_gC_1}([0,T],\mathbb {R})\). Therefore, we have that \({\text {traj}}_{\eta }(x_0)\in {\widetilde{\Gamma }}_T\), while by construction

$$\begin{aligned} e_0\sharp (\Phi (\eta ))= ({\text {Id}},\Vert m_0\Vert )\sharp (\Vert m_0\Vert ^{-1}m_0). \end{aligned}$$

This means that \(\Phi (\eta )\in {\mathscr {P}}\).

Now we prove that \(\Phi \) is a continuous mapping on \({\mathscr {P}}\). Let \(\eta _1,\eta _2\in {\mathscr {P}}\), and let \((x_1(\cdot ),w_1(\cdot ))={\text {traj}}_{\eta _1}(x_0)\), \((x_2(\cdot ),w_2(\cdot ))={\text {traj}}_{\eta _2}(x_0)\). We have that

$$\begin{aligned} \Vert x_1(t)-x_2(t)\Vert \le \int _{0}^t\Vert f(t',x_1(t'),\lfloor {\eta _1}\rfloor _{t'}) -f(t',x_2(t'),\lfloor {\eta _2}\rfloor _{t'})\Vert dt'. \end{aligned}$$

Using the Lipschitz continuity and Proposition 4, we obtain the estimate

$$\begin{aligned} \Vert x_1(t)-x_2(t)\Vert \le \int _{0}^t C_{Lf}\Vert x_1(t')-x_2(t')\Vert dt'+C_2 W_1(\eta _1,\eta _2). \end{aligned}$$

Here \(C_{Lf}\) is the Lipschitz constant of the function f, while \(C_2\) is a constant determined by \(C_{Lf}\), T, and the constant \(C_0\) of Proposition 4. By Gronwall’s inequality, we conclude that

$$\begin{aligned} \Vert x_1(\cdot )-x_2(\cdot )\Vert \le C_3 W_1(\eta _1,\eta _2). \end{aligned}$$
(17)

Here \(C_3\) is a constant. Furthermore,

$$\begin{aligned} \begin{aligned} |w_1(t)-w_2(t)|&\le \int _0^t |w_1(t')g(t',x_1(t'),\lfloor {\eta _1}\rfloor _{t'})-w_2(t')g(t',x_2(t'),\lfloor {\eta _2}\rfloor _{t'})|dt'\\&\le \int _0^t\big [|w_1(t')-w_2(t')|\cdot |g(t',x_1(t'), \lfloor {\eta _1}\rfloor _{t'})|\\&\quad +|w_2(t')|\cdot |g(t',x_1(t'),\lfloor {\eta _1}\rfloor _{t'})-g(t',x_2(t'), \lfloor {\eta _2}\rfloor _{t'})|\big ]dt'. \end{aligned} \end{aligned}$$

Using the facts that g is bounded by \(C_g\) and Lipschitz continuous, the inequality \(|w_2(t')|\le C_1\) and estimate (17), we have that

$$\begin{aligned} |w_1(t)-w_2(t)|\le C_g\int _0^t |w_1(t')-w_2(t')|dt'+C_4 W_1(\eta _1,\eta _2). \end{aligned}$$

This and Gronwall’s inequality give the following estimate for some constant \(C_5\):

$$\begin{aligned} |w_1(\cdot )-w_2(\cdot )|\le C_5 W_1(\eta _1,\eta _2). \end{aligned}$$
(18)

Now, to estimate \(W_1(\Phi (\eta _1),\Phi (\eta _2))\), we consider the plan between \(\Phi (\eta _1)\) and \(\Phi (\eta _2)\) defined by the rule:

$$\begin{aligned} {\tilde{\pi }}\triangleq ({\text {traj}}_{\eta _1},{\text {traj}}_{\eta _2})\sharp \pi , \end{aligned}$$

where \(\pi \) is an optimal plan between \(\eta _1\) and \(\eta _2\). From (17) and (18), we conclude that

$$\begin{aligned} W_1(\Phi (\eta _1),\Phi (\eta _2))\le C_6 W_1(\eta _1,\eta _2), \end{aligned}$$

where \(C_6\) is a constant determined only by f, g, \(\Vert m_0\Vert \), and T. This gives the continuity of \(\Phi \). Therefore, by the Schauder fixed point theorem (see [26, 17.56 Corollary]), the mapping \(\Phi :{\mathscr {P}}\rightarrow {\mathscr {P}}\) admits a fixed point \(\eta ^*\). By the construction of \(\Phi \), \(\eta ^*\) is an equilibrium distribution of curves with weights. Furthermore, since \(\eta ^*\in {\mathscr {P}}\subset \mathcal {P}^1({\widetilde{\Gamma }}_T)\), the following estimate holds true:

$$\begin{aligned} \Vert \lfloor {\eta ^*}\rfloor _{t}\Vert \le \Vert m_0\Vert e^{C_gt}\le C_1. \end{aligned}$$

Now, let us show that the equilibrium distribution of the weighted curves is unique. Let \(\eta _1,\eta _2\) be two such distributions. This means that

$$\begin{aligned} \eta _1=\Phi (\eta _1),\quad \eta _2=\Phi (\eta _2). \end{aligned}$$

For each t, set

$$\begin{aligned} \varrho (t)\triangleq \mathcal {W}_{1,b}(\lfloor {\eta _1}\rfloor _{t},\lfloor {\eta _2}\rfloor _{t}). \end{aligned}$$

Now let \((x_1(\cdot ,x_0),w_1(\cdot ,x_0))={\text {traj}}_{\eta _1}(x_0)\), \((x_2(\cdot ,x_0),w_2(\cdot ,x_0))={\text {traj}}_{\eta _2}(x_0)\). We have that

$$\begin{aligned} \begin{aligned}&\Vert x_1(t,x_0)-x_2(t,x_0)\Vert \\&\quad \le \Vert x_1(0)-x_2(0)\Vert + \int _{0}^t(C_{Lf}\Vert x_1(t',x_0)-x_2(t',x_0)\Vert +C_2\varrho (t'))dt'.\end{aligned} \end{aligned}$$

This and Gronwall’s inequality imply that

$$\begin{aligned} \Vert x_1(t,x_0)-x_2(t,x_0)\Vert \le C_7\int _0^t \varrho (t')dt'. \end{aligned}$$
(19)

Using the same arguments, one can show that

$$\begin{aligned} |w_1(t,x_0)-w_2(t,x_0)|\le C_8\int _0^t \varrho (t')dt'. \end{aligned}$$
(20)

Now, notice that

$$\begin{aligned} \begin{aligned} \mathcal {W}_{1,b}(\lfloor {\eta _1}\rfloor _{t},\lfloor {\eta _2}\rfloor _{t})&\le \int _{\mathbb {R}^d} (\Vert x_1(t,x_0)-x_2(t,x_0)\Vert \\&\quad +b| w_1(t,x_0)-w_2(t,x_0)|) \Vert m_0\Vert ^{-1}m_0(dx_0). \end{aligned} \end{aligned}$$

From this, (19), (20), we conclude that

$$\begin{aligned} \varrho (t)=\mathcal {W}_{1,b}(\lfloor {\eta _1}\rfloor _{t},\lfloor {\eta _2}\rfloor _{t})\le C_9\int _0^t \varrho (t')dt'. \end{aligned}$$

Using Gronwall’s inequality once more, we deduce that

$$\begin{aligned} \varrho (t)=\mathcal {W}_{1,b}(\lfloor {\eta _1}\rfloor _{t},\lfloor {\eta _2}\rfloor _{t})\equiv 0. \end{aligned}$$

Plugging this equality back into (19) and (20), we deduce that \({\text {traj}}_{\eta _1}={\text {traj}}_{\eta _2}\). Using the definition of the operator \(\Phi \) [see (16)], we have that \(\Phi (\eta _1)=\Phi (\eta _2)\). Therefore, we have that \(\eta _1=\eta _2\). \(\square \)
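The proof above is constructive: the fixed point of \(\Phi \) can be approximated by the Picard iteration \(\eta _{k+1}=\Phi (\eta _k)\). The following Python sketch illustrates this for an empirical initial measure in dimension \(d=1\); the concrete velocity f, the rate g, the explicit Euler discretization of (15), and all numerical parameters are our own illustrative choices rather than objects from the paper.

```python
# A sketch of the Picard iteration eta_{k+1} = Phi(eta_k) from the proof of
# Theorem 1, for an empirical initial measure m0 with ||m0|| = 1 on the real
# line; each of the N atoms carries mass 1/N. The velocity f and the rate g
# are toy examples; the scheme itself follows (15) and (16).
import numpy as np

def f(t, x, xs, ws):
    # toy nonlocal velocity: drift toward the weighted mean of the measure
    mean = np.sum(ws * xs) / max(np.sum(ws), 1e-12)
    return mean - x

def g(t, x, xs, ws):
    # toy source/sink rate: sink proportional to the mass near each point
    near = np.abs(xs[None, :] - x[:, None]) < 0.5
    return -0.5 * np.sum(ws[None, :] * near, axis=1)

def picard_step(x0, w0, X, W, ts):
    """One application of Phi: integrate (15) with the flow (X, W) frozen."""
    dt = ts[1] - ts[0]
    Xn, Wn = [x0.copy()], [w0.copy()]
    for k, t in enumerate(ts[:-1]):            # explicit Euler for (15)
        x, w = Xn[-1], Wn[-1]
        Xn.append(x + dt * f(t, x, X[k], W[k]))
        Wn.append(w + dt * g(t, x, X[k], W[k]) * w)
    return np.array(Xn), np.array(Wn)

N, steps = 100, 100
ts = np.linspace(0.0, 1.0, steps + 1)          # time grid on [0, T], T = 1
x0 = np.random.randn(N)                        # atoms of m0
w0 = np.full(N, 1.0 / N)                       # initial weights
X, W = np.tile(x0, (steps + 1, 1)), np.tile(w0, (steps + 1, 1))
for _ in range(20):                            # Picard iterations eta -> Phi(eta)
    X, W = picard_step(x0, w0, X, W, ts)
print("approximate total mass ||m(T)||:", W[-1].sum())
```

If the iteration converges, its limit approximates the unique equilibrium distribution from Theorem 1, and the weighted empirical measure \(\frac{1}{N}\sum _i W_i(t)\delta _{X_i(t)}\) then approximates \(m(t)\).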

4 Superposition Principle for the Nonlocal Balance Equation

First, let us recall the definition of the weak solution to balance Eq. (1).

Definition 3

We say that a continuous flow of measures \(m(\cdot ):[0,T]\rightarrow \mathcal {M}(\mathbb {R}^d)\) is a weak solution of balance Eq. (1) if, for each \(\phi \in C^1_c([0,T]\times \mathbb {R}^d)\) and \(s\in [0,T]\), the following equality holds:

$$\begin{aligned} \begin{aligned}&\int _{\mathbb {R}^d}\phi (s,x)m(s,dx)-\int _{\mathbb {R}^d}\phi (0,x)m_0(dx)\\&\quad =\int _0^s\int _{\mathbb {R}^d}\big [\partial _t\phi (t,x)+\langle \nabla \phi (t,x),f(t,x,m(t))\rangle + \phi (t,x)g(t,x,m(t))\big ]m(t,dx)dt.\end{aligned} \end{aligned}$$

Remark 1

Using the same methods as in [16, Lemma 8.1.2], one can prove that \(m(\cdot )\) is a weak solution of (1) if and only if, for every \(\phi \in C^1_0((0,T)\times \mathbb {R}^d)\),

$$\begin{aligned} \int _0^T\int _{\mathbb {R}^d}\big [\partial _t\phi (t,x)+\langle \nabla \phi (t,x),f(t,x,m(t))\rangle + \phi (t,x)g(t,x,m(t))\big ]m(t,dx)dt=0. \end{aligned}$$

The main result of this section is the following superposition principle. In it, \(C>C_1\), where \(C_1\) is the constant introduced in Theorem 1.

Theorem 2

If \(\eta \in \mathcal {P}^1(\Gamma _T^C)\) is an equilibrium distribution of curves with time-varying weights, then \(m(t)\triangleq \lfloor {\eta }\rfloor _{t}\) is a weak solution of nonlocal balance Eq. (1) on [0, T]. Conversely, if \(m(\cdot )\) is a weak solution of (1) satisfying the initial condition \(m(0)=m_0\), then there exists an equilibrium distribution of curves with time-varying weights \(\eta \) such that \(\lfloor {\eta }\rfloor _{t}=m(t)\). In particular, there exists a unique solution of the initial value problem for balance Eq. (1).

Remark 2

As we mentioned in the Introduction, papers [10,11,12] contain existence and uniqueness results for the balance equation with the right-hand side given by an arbitrary signed measure. However, they, in particular, assume that the source/sink term is uniformly continuous w.r.t. the measure variable. This assumption does not directly apply to the examined balance Eq. (1).

The proof of Theorem 2 relies on the following uniqueness result.

Lemma 1

Let \(v:[0,T]\times \mathbb {R}^d\rightarrow \mathbb {R}^d\), \(z:[0,T]\times \mathbb {R}^d\rightarrow \mathbb {R}\) be continuous w.r.t. the time variable. Additionally, assume that v is Lipschitz continuous w.r.t. x. Then, the initial value problem for the linear balance equation:

$$\begin{aligned}{} & {} \partial _t m(t)+{\text {div}}(v(t,x)m(t))=z(t,x)m(t),\nonumber \\{} & {} m(0)=m_0 \end{aligned}$$
(21)

admits at most one weak solution.

This lemma was proved in [22, Lemma 3.5]. However, for the sake of completeness, we give its proof in Appendix A.

Proof of Theorem 2

First, let us prove the necessity part.

Let \(\eta \) be an equilibrium distribution of weighted curves. By Proposition 5, the mapping \(t\mapsto \lfloor {\eta }\rfloor _{t}\) is continuous. Now, let \(\phi \in C^1_c([0,T]\times \mathbb {R}^d)\). The definition of the operation \(\lfloor {\eta }\rfloor _{t}\) gives that, for every \(s\in [0,T]\),

$$\begin{aligned} \begin{aligned}&\int _{\mathbb {R}^d}\phi (s,x)\lfloor {\eta }\rfloor _{s}(dx)- \int _{\mathbb {R}^d}\phi (0,x)\lfloor {\eta }\rfloor _{0}(dx)\\&\quad =\int _{\Gamma _T^C}\big [\phi (s,x(s))w(s)-\phi (0,x(0))w(0)\big ] \eta (d(x(\cdot ),w(\cdot )))\\&\quad =\int _{\Gamma _T^C} \int _0^s\big [ (\partial _t\phi (t,x(t))+\langle \nabla \phi (t,x(t)),{\dot{x}}(t)\rangle )w(t)\\&\qquad +\phi (t,x(t)){\dot{w}}(t)\big ]dt\eta (d(x(\cdot ),w(\cdot )))\\&\quad =\int _{\Gamma _T^C} \int _0^s\Big [ (\partial _t\phi (t,x(t))+\langle \nabla \phi (t,x(t)),f(t,x(t),\lfloor {\eta }\rfloor _{t})\rangle ) w(t)\\&\qquad +\phi (t,x(t))g(t,x(t), \lfloor {\eta }\rfloor _{t})w(t)\Big ]dt\eta (d(x(\cdot ),w(\cdot )))\\&\quad = \int _0^s\int _{\mathbb {R}^d} \Big [ \partial _t\phi (t,x)+\langle \nabla \phi (t,x),f(t,x, \lfloor {\eta }\rfloor _{t})\rangle +\phi (t,x)g(t,x,\lfloor {\eta }\rfloor _{t})\Big ] \lfloor {\eta }\rfloor _{t}(dx)dt. \end{aligned} \end{aligned}$$

Therefore, the flow of measures \(t\mapsto \lfloor {\eta }\rfloor _{t}\) is a weak solution of nonlocal balance Eq. (1).

To prove the sufficiency part, given a weak solution \(m(\cdot )\) of balance Eq. (1), we construct a distribution of weighted curves \(\eta \) corresponding to the linear balance Eq. (21) with

$$\begin{aligned} v(t,x)\triangleq f(t,x,m(t)),\quad z(t,x)\triangleq g(t,x,m(t)). \end{aligned}$$

By the necessity part of the theorem, the mapping \(t\mapsto \lfloor {\eta }\rfloor _{t}\) solves (21). Lemma 1 yields that \(\lfloor {\eta }\rfloor _{t}\equiv m(t)\). This and the construction of \(\eta \) give that \(\eta \) is a distribution of weighted curves corresponding to (1) and the initial distribution \(m_0\). \(\square \)

5 Conservation Form of the Balance Equation

In this section, we rewrite balance Eq. (1) as a differential equation on a flow of probability measures defined on \(\mathbb {R}^d\cup \{\star \}\). The right-hand side of this equation is given by a Lévy–Khintchine generator that combines a drift part and jumps between points of \(\mathbb {R}^d\) and the remote point \(\star \) introduced in Sect. 2.2. A natural interpretation of this equation is a mean field particle system consisting of infinitely many identical particles whose individual evolution combines drift and jumps.

Notice that, due to Theorem 1 and Eq. (30), we can assume that \(m(\cdot )\) is such that \(\Vert m(t)\Vert \le R\), where \(R> ce^{C_gT}\) and c is an upper bound for the initial total mass (in particular, one can take \(c=\Vert m_0\Vert \) for (1)). Denoting \(\mathcal {M}^1_R(\mathbb {R}^d)\triangleq \{m\in \mathcal {M}^1(\mathbb {R}^d):\Vert m\Vert \le R\}\), we have that the mapping \(m\mapsto \mu \triangleq R^{-1}(m\triangleright _{R})\) provides an isomorphism between \(\mathcal {M}^1_R(\mathbb {R}^d)\) and \(\mathcal {P}^1(\mathbb {R}^d\cup \{\star \})=\{\mu \in \mathcal {P}(\mathbb {R}^d\cup \{\star \}):\int _{\mathbb {R}^d}\Vert x\Vert \mu (dx)<\infty \}\). Notice that the inverse mapping is \(\mu \mapsto R\mu |_{\mathbb {R}^d}\), where \(\mu |_{\mathbb {R}^d}\) denotes the restriction of the probability \(\mu \) to \(\mathbb {R}^d\), i.e., for each \(\Upsilon \in \mathcal {B}(\mathbb {R}^d)\),

$$\begin{aligned} (\mu |_{\mathbb {R}^d})(\Upsilon )\triangleq \mu (\Upsilon ). \end{aligned}$$

Recall that, by (5) and Proposition 1, for any two measures \(m_1,m_2\in \mathcal {M}^1_R(\mathbb {R}^d)\) and \(\mu _1\triangleq R^{-1}(m_1\triangleright _{R})\), \(\mu _2\triangleq R^{-1}(m_2\triangleright _{R})\),

$$\begin{aligned} \mathcal {W}_{1,b}(m_1,m_2)=RW_1(\mu _1,\mu _2). \end{aligned}$$

Now, for \(t\in [0,T]\), \(x\in \mathbb {R}^d\cup \{\star \}\), and \(\mu \in \mathcal {P}^1(\mathbb {R}^d\cup \{\star \})\), put

$$\begin{aligned} f_R(t,x,\mu )\triangleq \left\{ \begin{array}{ll} f(t,x,R\mu |_{\mathbb {R}^d}), &{} x\in \mathbb {R}^d,\\ 0, &{} x=\star , \end{array}\right. \\ g_R(t,x,\mu )\triangleq \left\{ \begin{array}{ll} g(t,x,R\mu |_{\mathbb {R}^d}), &{} x\in \mathbb {R}^d,\\ 0, &{} x=\star . \end{array}\right. \end{aligned}$$

In the Introduction, we mentioned that the function g combines the probability rates for a particle to disappear or to produce an offspring. Thus, it is convenient to distinguish the positive and negative parts of g:

$$\begin{aligned} g^+(t,x,m)\triangleq g(t,x,m)\vee 0,\quad g^-(t,x,m)\triangleq (-g(t,x,m))\vee 0. \end{aligned}$$

Similarly,

$$\begin{aligned} g^+_R(t,x,\mu )\triangleq g_R(t,x,\mu )\vee 0,\quad g^-_R(t,x,\mu )\triangleq (-g_R(t,x,\mu ))\vee 0. \end{aligned}$$

Given \(\mu \in \mathcal {P}(\mathbb {R}^d\cup \{\star \})\) and \(m\triangleq R\,\mu |_{\mathbb {R}^d}\), the quantity

$$\begin{aligned}Rg^+_R(t,y,\mu )\,\mu |_{\mathbb {R}^d}(dy)\Delta t=g^+(t,y,m)m(dy)\Delta t\end{aligned}$$

approximately describes the distribution of newly appeared particles on the time interval \([t,t+\Delta t]\), while \(g^-_R(t,x,\mu )\Delta t+o(\Delta t)\) is the probability that a particle occupying the state x at time t disappears on the time interval \([t,t+\Delta t]\).

For each \(t\in [0,T]\), \(x\in \mathbb {R}^d\cup \{\star \}\), \(\mu \in \mathcal {P}(\mathbb {R}^d\cup \{\star \})\), let \(\nu ^-(t,x,\mu ,\cdot )\) be a finite measure on \(\{0,1\}\) such that

$$\begin{aligned} \nu ^-(t,x,\mu ,\{1\})= g^-_R(t,x,\mu ),\quad \nu ^-(t,x,\mu ,\{0\})=0. \end{aligned}$$
(22)

Further, denote by \(\nu ^+(t,x,\mu ,\cdot )\) the measure on \(\mathbb {R}^d\cup \{\star \}\) defined by the rule: for \(\phi \in C_b(\mathbb {R}^d\cup \{\star \})\),

$$\begin{aligned} \int _{\mathbb {R}^d\cup \{\star \}}\phi (y)\nu ^+(t,\star ,\mu ,dy)\triangleq \mu ^{-1}(\{\star \})\int _{\mathbb {R}^d}\phi (y)g^+_R(t,y,\mu )\mu (dy) \end{aligned}$$
(23)

and

$$\begin{aligned} \int _{\mathbb {R}^d\cup \{\star \}}\phi (y)\nu ^+(t,x,\mu ,dy)=0 \end{aligned}$$
(24)

when \(x\in \mathbb {R}^d\).

With some abuse of notation, we denote by \(C^1(\mathbb {R}^d\cup \{\star \})\) the set of functions \(\phi \) from \(\mathbb {R}^d\cup \{\star \}\) to \(\mathbb {R}\) such that the restriction of \(\phi \) on \(\mathbb {R}^d\) lies in \(C^1(\mathbb {R}^d)\).

Let us consider the following generator on \(C^1(\mathbb {R}^d\cup \{\star \})\):

$$\begin{aligned} L_t[\mu ]\phi (x)\triangleq & {} \langle f_R(t,x,\mu ),\nabla \phi (x)\rangle \nonumber \\{} & {} +\,\int _{\{0,1\}}(\phi (x+u(\star -x))-\phi (x)) \nu ^-(t,x,\mu ,du)\nonumber \\{} & {} +\,\int _{\mathbb {R}^d\cup \{\star \}} (\phi (y)-\phi (x))\nu ^+(t,x,\mu ,dy). \end{aligned}$$
(25)

Now let us consider the following equation

$$\begin{aligned} \frac{d}{dt}\mu (t)=L_t^*[\mu (t)]\mu (t), \end{aligned}$$
(26)

where \(L_t^*[\mu ]\) is the operator adjoint to \(L_t[\mu ]\).

The solution of (26) is considered in the space of probability measures; in particular, the total mass is preserved and equal to 1.

Definition 4

We say that \(\mu (\cdot ):[0,T]\rightarrow \mathcal {P}^1(\mathbb {R}^d\cup \{\star \})\) solves (26) provided that, for each \(\phi \in C^1_0(\mathbb {R}^d\cup \{\star \})\),

$$\begin{aligned} \int _{\mathbb {R}^d\cup \{\star \}}\phi (x)\mu (t,dx)-\int _{\mathbb {R}^d\cup \{\star \}}\phi (x)\mu (0,dx) =\int _0^t\int _{\mathbb {R}^d\cup \{\star \}}L_\tau [\mu (\tau )]\phi (x)\mu (\tau ,dx)d\tau . \end{aligned}$$

Theorem 3

A flow of probabilities \(\mu (\cdot )\) is a solution of Eq. (26) if and only if \(m(\cdot )\) defined by the rule \(m(t)\triangleq R\mu (t)|_{\mathbb {R}^d}\) satisfies balance Eq. (1). In particular, for \(\mu _0\triangleq R^{-1}(m_0\triangleright _{R})\), there exists a unique solution to Eq. (26) with the initial condition \(\mu (0)=\mu _0\).

Proof

Choose \(\mu \in \mathcal {P}(\mathbb {R}^d\cup \{\star \})\) and set \(m\triangleq R\mu |_{\mathbb {R}^d}\). Notice that each function \(\phi \in C^1(\mathbb {R}^d\cup \{\star \})\) can be expressed in the form

$$\begin{aligned} \phi (x)={\tilde{\phi }}(x)\mathbbm {1}_{\mathbb {R}^d}(x)+c\mathbbm {1}_{\{\star \}}(x), \end{aligned}$$

where \({\tilde{\phi }}\) is a \(C^1\) function defined on \(\mathbb {R}^d\), while c is a constant. Therefore, due to the definitions of the functions \(f_R\), \(g_R\) and of the measures \(\nu ^-\), \(\nu ^+\), we obtain that

$$\begin{aligned} \begin{aligned} L_t[\mu ]\phi (x)&=\langle f(t,x,m),\nabla {\tilde{\phi }}(x)\rangle \mathbbm {1}_{\mathbb {R}^d}(x)+(c-{\tilde{\phi }}(x))g^-(t,x,m)\mathbbm {1}_{\mathbb {R}^d}(x)\\&\quad +\,R^{-1}\mu ^{-1}(\{\star \})\int _{\mathbb {R}^d}({\tilde{\phi }}(y)-c)g^+(t,y,m)m(dy)\mathbbm {1}_{\{\star \}}(x).\end{aligned} \end{aligned}$$

Thus, due to the fact that \(g(t,x,m)=g^+(t,x,m)-g^-(t,x,m)\), we have the equality

$$\begin{aligned}{} & {} R\int _{\mathbb {R}^d\cup \{\star \}}L_t[\mu ]\phi (x)\mu (dx)\nonumber \\{} & {} \quad =\int _{\mathbb {R}^d}\langle f(t,x,m),\nabla {\tilde{\phi }}(x)\rangle m(dx)\nonumber \\{} & {} \qquad +\,\int _{\mathbb {R}^d}{\tilde{\phi }}(x)g(t,x,m)m(dx) -c\int _{\mathbb {R}^d}g(t,x,m)m(dx). \end{aligned}$$
(27)

Now let \(\mu (\cdot )\) solve Eq. (26). We define \(m(t)\triangleq R(\mu (t)|_{\mathbb {R}^d})\). Using this and (27) with \(c=0\), we conclude that \(m(\cdot )\) satisfies (1).

To prove the converse statement, we let \(\mu _0\triangleq R^{-1}(m(0)\triangleright _{R})\) and find \(\mu (\cdot )\) solving (26) with the initial condition \(\mu (0)=\mu _0\). Therefore, \(t\mapsto R(\mu (t)|_{\mathbb {R}^d})\) is a solution of balance Eq. (1). The uniqueness result for this equation gives that \(m(t)=R(\mu (t)|_{\mathbb {R}^d})\). \(\square \)

We complete this section with the mean field particle system interpretation of the balance equation.

Remark 3

It is natural to say that a 5-tuple \((\Omega ,\mathcal {F},\{\mathcal {F}_t\}_{t\in [0,T]},\mathbb {P},X)\) is a mean field particle system representation of balance Eq. (1) provided that

  • \((\Omega ,\mathcal {F},\{\mathcal {F}_t\}_{t\in [0,T]},\mathbb {P})\) is a probability space with filtration;

  • X is an \(\{\mathcal {F}_t\}_{t\in [0,T]}\)-adapted stochastic process with values in \(\mathbb {R}^d\cup \{\star \}\);

  • for every \(\phi \in C^1(\mathbb {R}^d\cup \{\star \})\), the process

    $$\begin{aligned} \phi (X(t))-\int _0^t L_{\tau }[\mu (\tau )]\phi (X(\tau ))d\tau \end{aligned}$$

    is an \(\{\mathcal {F}_t\}_{t\in [0,T]}\)-martingale, where

    $$\begin{aligned} \mu (t)=X(t)\sharp \mathbb {P}. \end{aligned}$$
    (28)

Indeed, one can regard the stochastic process X as the dynamics of a sampling particle, while the probability \(\mu (t)\) gives the distribution of all particles at time t.

By construction, we have that \(\mu (\cdot )\) defined by (28) satisfies (26), while \(m(\cdot )\) defined by the rule \(m(t)=R\mu (t)|_{\mathbb {R}^d}\) solves (1).

One can show that there exists at least one particle system representation of (1). A detailed proof of this fact goes beyond the scope of the paper; the scheme of the proof follows the approach introduced in [27]. Assume that we are given a flow of probabilities \(\mu (\cdot )\) that solves (26). First, we discretize the dynamics and approximate the drift part \(f_R(t,x,\mu )\) by a Kolmogorov matrix, thereby obtaining a continuous-time Markov chain. The corresponding stochastic process is uniformly stochastically continuous [27, Theorem 5.4.1]. Finally, one can argue as in [27, Theorem 5.4.1] and show that, when the discretization parameter tends to zero, the continuous-time Markov chains converge. The limit provides the desired particle system representation.

6 Approximating System of ODEs

In this section, we introduce a system of ODEs that arises when we approximate the drift term f by a Kolmogorov matrix.

Let \(\mathcal {S}\) be a finite subset of \(\mathbb {R}^d\). The fineness of \(\mathcal {S}\) is evaluated by a number

$$\begin{aligned} d(\mathcal {S})\triangleq \min \big \{\Vert {x}-{y}\Vert :\, {x},{y}\in \mathcal {S},\, {x}\ne {y}\big \}. \end{aligned}$$

The diameter of \(\mathcal {S}\) is

$$\begin{aligned} {\text {diam}}(\mathcal {S}) =\max \big \{\Vert {x}-{y}\Vert :\, {x},{y}\in \mathcal {S},\, {x}\ne {y}\big \}. \end{aligned}$$

A signed measure on \(\mathcal {S}\) is always determined by a sequence \(\beta _{\mathcal {S}}=(\beta _{{x}})_{{x}\in \mathcal {S}}\subset \mathbb {R}\). We denote the set of such sequences by \({\mathcalligra{l}}_1(\mathcal {S})\). It is endowed with the norm

$$\begin{aligned} \Vert \beta _{\mathcal {S}}\Vert \triangleq \sum _{{x}\in \mathcal {S}}|\beta _{{x}}|. \end{aligned}$$

We will treat each such sequence as a row-vector. Further, the set of elements of \({\mathcalligra{l}}_1(\mathcal {S})\) with nonnegative entries is denoted by \({\mathcalligra{l}}_1^+(\mathcal {S})\). This set inherits the metric from \({\mathcalligra{l}}_1(\mathcal {S})\).

The mapping

$$\begin{aligned} \beta _{\mathcal {S}}\mapsto {\mathscr {I}}(\beta _{\mathcal {S}})\triangleq \sum _{{x}\in \mathcal {S}} \beta _{{x}}\delta _{{x}} \end{aligned}$$

provides the isomorphism between \({\mathcalligra{l}}_1^+(\mathcal {S})\) and \(\mathcal {M}(\mathcal {S})\).

Recall that, on \(\mathcal {M}(\mathcal {S})\), we consider the metric \(\mathcal {W}_{1,b}\).

The distance between the images under the isomorphism \({\mathscr {I}}\) is evaluated in the following statement, which is proved in the Appendix.

Proposition 6

Let \(b\ge {\text {diam}}(\mathcal {S})\). Then, the following estimates hold true:

$$\begin{aligned} b^{-1} \mathcal {W}_{1,b}({\mathscr {I}}(\beta ^1),{\mathscr {I}}(\beta ^2))\le \Vert \beta ^1-\beta ^2\Vert \le (d(\mathcal {S}))^{-1} \mathcal {W}_{1,b}({\mathscr {I}}(\beta ^1),{\mathscr {I}}(\beta ^2)). \end{aligned}$$
(29)

Now let us introduce a system of ODEs on \({\mathcalligra{l}}_1^+(\mathcal {S})\). In Sect. 7, we will show that it approximates the original nonlocal balance Eq. (1). To this end, for \({x}\in \mathcal {S}\), set

$$\begin{aligned} {\hat{g}}_{{x}}(t,\beta _{\mathcal {S}})\triangleq g(t,{x},{\mathscr {I}}(\beta _{\mathcal {S}})). \end{aligned}$$

Further, let \(Q(t,\beta _{\mathcal {S}})=(Q_{{x},{y}}(t,\beta _{\mathcal {S}}))_{{x},{y}\in \mathcal {S}}\) be a Kolmogorov matrix, i.e., for each \({x}\in \mathcal {S}\), \(Q_{{x},{y}}(t,\beta _{\mathcal {S}})\ge 0\) when \({x}\ne {y}\), whereas

$$\begin{aligned} \sum _{{y}\in \mathcal {S}}Q_{{x},{y}}(t,\beta _{\mathcal {S}})=0. \end{aligned}$$
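For instance, for \(\mathcal {S}=\{{x}_1,{x}_2\}\), the Kolmogorov matrices are exactly the matrices of the form

$$\begin{aligned} Q=\left( \begin{array}{cc} -\lambda &{} \lambda \\ \mu &{} -\mu \end{array}\right) ,\quad \lambda ,\mu \ge 0, \end{aligned}$$

i.e., the generators of continuous-time Markov chains that jump from \({x}_1\) to \({x}_2\) with rate \(\lambda \) and from \({x}_2\) to \({x}_1\) with rate \(\mu \).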

The approximating system takes the following form:

$$\begin{aligned} \frac{d}{dt}\beta _{{y}}(t)=\sum _{{x}\in \mathcal {S}}\beta _{{x}}(t) Q_{{x},{y}}(t,\beta _{\mathcal {S}}(t)) + \beta _{{y}}(t){\hat{g}}_{{y}}(t,\beta _{\mathcal {S}}(t)). \end{aligned}$$

Since we assume that \(\beta _{\mathcal {S}}=(\beta _{{x}})_{{x}\in \mathcal {S}}\) is a row-vector, one can rewrite this system in the vector form:

$$\begin{aligned} \frac{d}{dt}\beta _{\mathcal {S}}(t)=\beta _{\mathcal {S}}(t)Q (t,\beta _{\mathcal {S}}(t))+\beta _{\mathcal {S}}(t)G(t,\beta _{\mathcal {S}}(t)). \end{aligned}$$
(30)

Here, we denote by \(G(t,\beta _{\mathcal {S}})=(G_{{x},{y}}(t,\beta _{\mathcal {S}})) _{{x},{y}\in \mathcal {S}}\) the diagonal matrix determined by the rule:

$$\begin{aligned} G_{{x},{y}}(t,\beta _{\mathcal {S}})=\left\{ \begin{array}{lc} {\hat{g}}_{{x}}(t,\beta _{\mathcal {S}}), &{} {x}={y},\\ 0,&{} {x}\ne {y}. \end{array} \right. \end{aligned}$$

Due to the classical existence and uniqueness theorem for ODEs, system (30) has a unique solution for every initial condition.
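To make (30) concrete, the following Python sketch integrates the system by the explicit Euler method on a uniform grid \(\mathcal {S}\subset [0,1]\). The matrix Q is built from a velocity field by the standard upwind rule, which guarantees the Kolmogorov property; the velocity f, the rate g, and all numerical parameters are illustrative placeholders rather than the constructions of Sect. 7.

```python
# A sketch of system (30): beta' = beta Q(t, beta) + beta G(t, beta) on a
# uniform grid S in [0, 1], with Q an upwind Kolmogorov matrix built from a
# toy velocity field and G the diagonal matrix of a toy rate g.
import numpy as np

n, h, dt, T = 51, 1.0 / 50, 1e-3, 1.0
S = np.linspace(0.0, 1.0, n)

def f(t, x, beta):                  # toy nonlocal velocity (illustrative)
    return 0.5 - x + 0.1 * np.sum(beta * S)

def g(t, x, beta):                  # toy source/sink rate (illustrative)
    return 0.3 - x

def kolmogorov_Q(t, beta):
    v = f(t, S, beta)
    Q = np.zeros((n, n))
    i = np.arange(n - 1)
    Q[i, i + 1] = np.maximum(v[:-1], 0.0) / h        # jump right, rate f^+ / h
    Q[i + 1, i] = np.maximum(-v[1:], 0.0) / h        # jump left, rate f^- / h
    Q[np.arange(n), np.arange(n)] = -Q.sum(axis=1)   # each row sums to zero
    return Q

beta = np.exp(-50.0 * (S - 0.3) ** 2)                # initial row-vector
beta /= beta.sum()
for k in range(int(T / dt)):                         # explicit Euler for (30)
    t = k * dt
    G = np.diag(g(t, S, beta))
    beta = beta + dt * (beta @ kolmogorov_Q(t, beta) + beta @ G)
print("total mass at time T:", beta.sum())
```

Since each row of Q sums to zero, the term \(\beta Q\) only redistributes mass over \(\mathcal {S}\); any change of the total mass \(\sum _{{x}\in \mathcal {S}}\beta _{{x}}(t)\) comes from the G term alone, mirroring the role of g in (1).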

To rewrite Eq. (30) in the conservation form, we first extend the phase space to \(\mathcal {S}\cup \{\star \}\) and, for each \(t\in [0,T]\), \(\beta _{\mathcal {S}}\in {\mathcalligra{l}}_1^+(\mathcal {S})\), define the matrix \(\mathcal {Q}(t,\beta _{\mathcal {S}})=(\mathcal {Q}_{{x},{y}}(t,\beta _{\mathcal {S}})) _{{x},{y}\in \mathcal {S}\cup \{\star \}}\) by the rule:

$$\begin{aligned} \mathcal {Q}_{{x},{y}}(t,\beta _{\mathcal {S}})\triangleq \left\{ \begin{array}{ll} Q_{{x},{y}}(t,\beta _{\mathcal {S}}), &{} {x},{y}\in \mathcal {S},\, {x}\ne {y},\\ {\hat{g}}^-_{{x}}(t,\beta _{\mathcal {S}}), &{} {x}\in \mathcal {S},\, {y}=\star ,\\ (R-\Vert \beta _{\mathcal {S}}\Vert )^{-1}{\hat{g}}^+_{{y}}(t,\beta _{\mathcal {S}}) \beta _{{y}}, &{} {x}=\star ,\, {y}\in \mathcal {S},\\ Q_{{x},{x}}(t,\beta _{\mathcal {S}})-{\hat{g}}^-_{{x}} (t,\beta _{\mathcal {S}}), &{} {x}={y}\in \mathcal {S},\\ -(R-\Vert \beta _{\mathcal {S}}\Vert )^{-1}\sum _{{z}\in \mathcal {S}}{\hat{g}}^+_{{z}}(t,\beta _{\mathcal {S}})\beta _{{z}}, &{} {x}={y}=\star . \end{array}\right. \end{aligned}$$
(31)

One can directly check that \(\mathcal {Q}(t,\beta _{\mathcal {S}})\) is a Kolmogorov matrix.
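This check can also be carried out numerically. Reusing `kolmogorov_Q`, `g`, `S`, `n`, and `beta` from the sketch above (again, our illustrative placeholders), the following lines assemble \(\mathcal {Q}\) according to (31) and verify the Kolmogorov property:

```python
# Assemble the matrix curly-Q of (31) on S u {star}, with the star as the
# last index, and verify that off-diagonal entries are nonnegative and that
# every row sums to zero.
R = 2.0                                        # any R > ||beta|| works here
rate = g(0.0, S, beta)
gp, gm = np.maximum(rate, 0.0), np.maximum(-rate, 0.0)  # g^+ and g^- on S
cQ = np.zeros((n + 1, n + 1))
cQ[:n, :n] = kolmogorov_Q(0.0, beta) - np.diag(gm)  # drift; death rate on diagonal
cQ[:n, n] = gm                                      # jump to star: disappearance
cQ[n, :n] = gp * beta / (R - beta.sum())            # jump from star: birth
cQ[n, n] = -cQ[n, :n].sum()
assert np.allclose(cQ.sum(axis=1), 0.0)             # zero row sums
assert (cQ - np.diag(np.diag(cQ)) >= -1e-12).all()  # nonnegative off-diagonal
```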

Further, we denote the set of sequences of nonnegative numbers indexed by elements of \(\mathcal {S}\cup \{\star \}\) by \({\mathcalligra{l}}_1^+(\mathcal {S}\cup \{\star \})\). As above, \({\mathscr {I}}\) is the isomorphism between \({\mathcalligra{l}}_1^+(\mathcal {S}\cup \{\star \})\) and \(\mathcal {M}(\mathcal {S}\cup \{\star \})\) given by the rule: for \(\beta _{\mathcal {S},\star }=(\beta _{{x}})_{{x}\in \mathcal {S}\cup \{\star \}}\),

$$\begin{aligned} {\mathscr {I}}(\beta _{\mathcal {S},\star })\triangleq \sum _{{x}\in \mathcal {S}\cup \{\star \}}\beta _{{x}}\delta _{{x}}. \end{aligned}$$

If \(\beta _{\mathcal {S},\star }=(\beta _{{x}})_{{x}\in \mathcal {S}\cup \{\star \}}\in {\mathcalligra{l}}_1^+(\mathcal {S}\cup \{\star \})\) and \(\beta _{\mathcal {S}}\) is its restriction to \(\mathcal {S}\), then we set

$$\begin{aligned} \mathcal {Q}(t,\beta _{\mathcal {S},\star })\triangleq \mathcal {Q}(t,\beta _{\mathcal {S}}). \end{aligned}$$

Due to the definition of the matrix \(\mathcal {Q}\) [see (31)], we have that, if \(\beta _{\mathcal {S},\star }(\cdot ) =(\beta _{{x}}(\cdot ))_{{x}\in \mathcal {S}\cup \{\star \}}\) satisfies

$$\begin{aligned} \frac{d}{dt}\beta _{\mathcal {S},\star }(t)=\beta _{\mathcal {S},\star }(t) \mathcal {Q}(t,\beta _{\mathcal {S},\star }(t)), \end{aligned}$$

then \(\beta _{\mathcal {S}}(\cdot )\triangleq (\beta _{{x}}(\cdot ))_{{x}\in \mathcal {S}}\) solves (30).

Now let us describe system (30) through the generator technique. As above, we work with probability measures. Notice that each \(\beta _{\mathcal {S},\star }\) with \(\Vert \beta _{\mathcal {S},\star }\Vert =R\) corresponds to the probability measure \(\mu \triangleq R^{-1}{\mathscr {I}}(\beta _{\mathcal {S},\star })\).

If \(t\in [0,T]\), \({x}\in \mathcal {S}\cup \{\star \}\), \(\mu \in \mathcal {P}(\mathcal {S}\cup \{\star \})\), then

$$\begin{aligned} L^{Q}_t[\mu ]\phi ({x})\triangleq{} & \sum _{{y}\in \mathcal {S}}[\phi ({y})-\phi ({x})]Q_{{x},{y}}(t,{\mathscr {I}}^{-1}(R\mu |_{\mathcal {S}}))\mathbbm {1}_{\mathcal {S}}({x})\\ & +[\phi (\star )-\phi ({x})]{\hat{g}}_{{x}}^-(t,{\mathscr {I}}^{-1}(R\mu |_{\mathcal {S}}))\mathbbm {1}_{\mathcal {S}}({x})\\ & +\mu ^{-1}(\{\star \})\sum _{{y}\in \mathcal {S}}[\phi ({y})-\phi (\star )]{\hat{g}}_{{y}}^+(t,{\mathscr {I}}^{-1}(R\mu |_{\mathcal {S}}))\mu (\{{y}\})\mathbbm {1}_{\{\star \}}({x}). \end{aligned}$$
(32)

Using the definitions of the measures \(\nu ^-\) and \(\nu ^+\) [see (22)–(24)], one can rewrite this definition as

$$\begin{aligned} L^{Q}_t[\mu ]\phi ({x})\triangleq{} & \sum _{{y}\in \mathcal {S}}[\phi ({y})-\phi ({x})]Q_{{x},{y}}(t,{\mathscr {I}}^{-1}(R\mu |_{\mathcal {S}}))\mathbbm {1}_{\mathcal {S}}({x})\\ & +\int _{\{0,1\}}[\phi (x+u(\star -x))-\phi ({x})]\nu ^-(t,{x},\mu ,du)\\ & +\int _{\mathcal {S}\cup \{\star \}}[\phi (y)-\phi (x)]\nu ^+(t,{x},\mu ,dy). \end{aligned}$$

The conservation form of  (30) is the equation

$$\begin{aligned} \frac{d}{dt}\mu (t)=L_t^{Q,*}[\mu (t)]\mu (t). \end{aligned}$$
(33)

Here \(L_t^{Q,*}[\mu ]\) stands for the operator adjoint to \(L_t^Q[\mu ]\). Solutions of (33) are understood in the weak sense.

Proposition 7

If \(\mu _0\in \mathcal {P}(\mathcal {S}\cup \{\star \})\), then there exists a unique solution to the initial value problem for Eq. (33) and the initial condition \(\mu (0)=\mu _0\). Furthermore, \(\mu (\cdot )\) solves (33) if and only if the function \(\beta _{\mathcal {S}}(\cdot )\) defined by the rule \(\beta _{\mathcal {S}}(t)\triangleq R{\mathscr {I}}^{-1}(\mu (t)|_{\mathcal {S}})\) satisfies (30).

The proposition directly follows from the definition of the generator \(L^Q\) [see (32)] and the matrix \(\mathcal {Q}\).

As in the case of the balance equation, system of ODEs (30) admits a particle representation.

Remark 4

A 5-tuple \((\Omega ,\mathcal {F},\{\mathcal {F}_t\}_{t\in [0,T]},\mathbb {P},X^Q)\) is said to be a particle representation of (30) if

  • \((\Omega ,\mathcal {F},\{\mathcal {F}_t\}_{t\in [0,T]},\mathbb {P})\) is a probability space with filtration;

  • \(X^Q\) is a stochastic process with values in \(\mathcal {S}\cup \{\star \}\);

  • for each \(\phi \in C(\mathcal {S}\cup \{\star \})\),

    $$\begin{aligned} \phi (X^Q(t))-\int _0^t L^Q_{\tau }[\mu (\tau )]\phi (X^Q(\tau ))d\tau \end{aligned}$$

    is a \(\{\mathcal {F}_t\}_{t\in [0,T]}\)-martingale, where \(\mu (\cdot )\) is defined by the rule: for each \(t\in [0,T]\),

    $$\begin{aligned} \mu (t)=(X^Q(t))\sharp \mathbb {P}. \end{aligned}$$
    (34)

As above, \(\mu (\cdot )\) defined by (34) satisfies (33). Thus, \(\beta _{\mathcal {S}}(\cdot )\) such that \(\beta _{\mathcal {S}}(t)=R{\mathscr {I}}^{-1}(\mu (t)|_{\mathcal {S}})\) is a solution of (30).

The existence of the particle representation of (30) directly follows from Kolokoltsov [27, Theorem 5.4.2].
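To illustrate the construction, here is a schematic Monte Carlo sketch of such a particle representation (our own illustration, not the construction of [27]): N particles evolve on \(\mathcal {S}\cup \{\star \}\), and at each Euler step the jump probabilities are read off from the generator (32) with \(\mu \) replaced by the empirical distribution. The helper names Q_fn, g_plus_fn, g_minus_fn are placeholders for the data of the model.

```python
import numpy as np

# Schematic simulation of the chain generated by (32); state n encodes the
# remote point *. dt must be small enough that all probabilities stay in [0, 1].

def simulate(N, n, T, dt, Q_fn, g_plus_fn, g_minus_fn, seed=1):
    rng = np.random.default_rng(seed)
    states = rng.integers(0, n, size=N)                  # initial states in S
    for k in range(int(T / dt)):
        t = k * dt
        mu = np.bincount(states, minlength=n + 1) / N    # empirical distribution
        Q, gp, gm = Q_fn(t, mu), g_plus_fn(t, mu), g_minus_fn(t, mu)
        new = states.copy()
        for i, x in enumerate(states):
            p = np.zeros(n + 1)
            if x < n:                                    # particle sits in S
                p[:n] = np.maximum(Q[x], 0.0) * dt       # jumps inside S
                p[n] = gm[x] * dt                        # killing: jump to *
                p[x] = 0.0
                p[x] = 1.0 - p.sum()                     # stay put
            else:                                        # particle sits at *
                w = gp * mu[:n]                          # rates g^+_y mu({y})
                p[:n] = w * dt / max(mu[n], 1e-12)
                p[n] = 1.0 - p.sum()
            new[i] = rng.choice(n + 1, p=p)
        states = new
    return states
```

Note that the chain conserves the number of particles, in line with the conservation form (33), where the total mass R is preserved on the augmented space.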

7 Rate of Approximation of Balance Equation

7.1 Formulation of the Approximation Theorem

For simplicity, we assume that the balance equation operates on some compact set. This means that we impose the following condition.

  1. (A5)

    There exists a compact \(\mathcal {K}\) such that \({\text {supp}}(m_0)\subset \mathcal {K}\) and, for each \(t\in [0,T]\), \(x\in \mathbb {R}^d\setminus \mathcal {K}\), \(m\in \mathcal {M}(\mathcal {K})\),

    $$\begin{aligned} f(t,x,m)=0. \end{aligned}$$

The following property directly follows from the imposed conditions.

Proposition 8

Assume conditions (A1)–(A5). If \(m(\cdot )\) is a solution to balance Eq. (1) with the initial condition \(m(0)=m_0\), then, for every \(t\in [0,T]\),

$$\begin{aligned}{\text {supp}}(m(t))\subset \mathcal {K}.\end{aligned}$$

It suffices to define the functions \(f_R\), \(g^+\), \(g^-\), \(g_R^+\), \(g_R^-\) and the measure \(\nu ^-\) only for \(x\in \mathcal {K}\). Furthermore, \(\nu ^+(t,\star ,\mu ,\cdot )\) is now a measure on \(\mathcal {K}\cup \{\star \}\) defined by the rule: for \(\phi \in C_b(\mathcal {K}\cup \{\star \})\),

$$\begin{aligned} \int _{\mathcal {K}\cup \{\star \}}\phi (y)\nu ^+(t,\star ,\mu ,dy)\triangleq \mu ^{-1}(\{\star \})\int _{\mathcal {K}}\phi (y)g^+_R(t,y,\mu )\mu (dy). \end{aligned}$$
(35)

Finally, L takes the form

$$\begin{aligned} \begin{aligned} L_t[\mu ]\phi (x)&\triangleq \langle f_R(t,x,\mu ),\nabla \phi (x)\rangle \\&\quad +\int _{\{0,1\}}(\phi (x+u(\star -x))-\phi (x)) \nu ^-(t,x,\mu ,du)\\&\quad +\int _{\mathcal {K}\cup \{\star \}} (\phi (y)-\phi (x))\nu ^+(t,x,\mu ,dy). \end{aligned} \end{aligned}$$

The approximation result relies on the following conditions on \(\mathcal {S}\) and Q: there exists \(\varepsilon >0\) such that

  1. (QS1)

    for each \(x\in \mathcal {K}\),

    $$\begin{aligned}\min _{{y}\in \mathcal {S}}\Vert x-{y}\Vert \le \varepsilon ;\end{aligned}$$
  2. (QS2)

    for every \(t\in [0,T]\), \({x}\in \mathcal {S}\), \(\beta _{\mathcal {S}}\in {\mathcalligra{l}}_1^+(\mathcal {S})\),

    $$\begin{aligned} \Bigg \Vert f(t,{x},{\mathscr {I}}(\beta _{\mathcal {S}}))-\sum _{{y}\in \mathcal {S}}({y} -{x})Q_{{x},{y}}(t,\beta _{\mathcal {S}})\Bigg \Vert \le \varepsilon ; \end{aligned}$$
  3. (QS3)

    for every \(t\in [0,T]\), \({x}\in \mathcal {S}\), \(\beta _{\mathcal {S}}\in {\mathcalligra{l}}_1^+(\mathcal {S})\),

    $$\begin{aligned} \sum _{{y}\in \mathcal {S}}\Vert x-y\Vert ^2 Q_{{x},{y}}(t,\beta _{\mathcal {S}})\le \varepsilon ^2. \end{aligned}$$

It is reasonable to assume that \(\varepsilon \) is sufficiently small. Without loss of generality, we will consider the case where \(\varepsilon \le 1\). Additionally, we assume that \(b>{\text {diam}}(\mathcal {K})+1\ge {\text {diam}}(\mathcal {K}\cup \mathcal {S})\).

Theorem 4

Assume that conditions (QS1)–(QS3) are in force. Given \(c>0\), there exists a constant \({\widehat{C}}>0\) determined by functions f, g and the constant c such that, if

  • \(\beta _0\in {\mathcalligra{l}}_1^+(\mathcal {S})\), \(\Vert \beta _0\Vert _1\le c\);

  • \(m_0\in \mathcal {M}(\mathcal {K})\), \(\Vert m_0\Vert \le c\);

  • \(\beta _{\mathcal {S}}(\cdot ):[0,T]\rightarrow {\mathcalligra{l}}_1^+(\mathcal {S})\) solves (30) with the initial condition \(\beta _{\mathcal {S}}(0)=\beta _0\);

  • \(m(\cdot ):[0,T]\rightarrow \mathcal {M}(\mathcal {K})\) satisfies (1) and the initial condition \(m(0)=m_0\),

then

$$\begin{aligned} \mathcal {W}_{1,b}(m(t),{\mathscr {I}}(\beta _{\mathcal {S}}(t)))\le {\widehat{C}}\bigl (\varepsilon +\mathcal {W}_{1,b}(m_0,{\mathscr {I}}(\beta _0))\bigr ). \end{aligned}$$

This theorem is proved in Sect. 7.3. The proof relies on an auxiliary construction of the generator on \((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})\) introduced in Sect. 7.2.

Before we turn to this auxiliary construction, let us present an example of a system that approximates balance Eq. (1) with arbitrarily small accuracy.

Let \(h>0\), \(\mathcal {K}_h\triangleq \mathcal {K}+B_h\),

$$\begin{aligned} \mathcal {S}^h\triangleq \mathcal {K}_h\cap h\mathbb {Z}^d. \end{aligned}$$
(36)

Here, \(B_h\) is the closed ball of radius h centered at the origin, and \(\mathbb {Z}\) stands for the set of integers. Further, consider the coordinate-wise representation of the function f

$$\begin{aligned} f(t,x,m)=(f_1(t,x,m),\ldots ,f_d(t,x,m)). \end{aligned}$$

Put

$$\begin{aligned} Q^h_{{x},{y}}(t,\beta _{\mathcal {S}})\triangleq \left\{ \begin{array}{ll} h^{-1}|f_i(t,{x},{\mathscr {I}}(\beta _{\mathcal {S}}))|, &{} \begin{array}{l} {x}\in \mathcal {K},\\ {y}={x}+ he_i{\text {sgn}}(f_i(t,{x},{\mathscr {I}}(\beta _{\mathcal {S}}))),\end{array}\\ -h^{-1}\sum _{i=1}^{d}|f_i(t,{x},{\mathscr {I}}(\beta _{\mathcal {S}}))|, &{} {x}={y}\in \mathcal {K},\\ 0, &{} \text {otherwise}. \end{array} \right. \end{aligned}$$
(37)

Hereinafter, \(e_i\) stands for the i-th coordinate vector, while \({\text {sgn}}\) denotes the sign function defined by the rule:

$$\begin{aligned}{\text {sgn}}(a)\triangleq \left\{ \begin{array}{ll} 1, &{} a>0,\\ -1, &{} a<0,\\ 0, &{} a=0. \end{array}\right. \end{aligned}$$
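For concreteness, the following Python sketch implements the lattice (36) and the upwind matrix (37) for the illustrative choice \(\mathcal {K}=[0,1]^d\); the box, the helper names, and the way the velocity values are passed in are all our own assumptions.

```python
import numpy as np
from itertools import product

# Sketch of the construction (36)-(37): a grid S^h inside the box K = [0,1]^d and
# the upwind Kolmogorov matrix Q^h built from precomputed velocity values.

def build_lattice(h, d):
    # S^h = K_h ∩ hZ^d for K = [0,1]^d, where K_h fattens K by h in each direction.
    return np.array(list(product(np.arange(-h, 1 + 2 * h, h), repeat=d)))

def build_Q(h, pts, f_vals):
    # f_vals[k] is the velocity f(t, x_k, I(beta_S)) at the k-th lattice point.
    n, d = pts.shape
    index = {tuple(np.round(p / h).astype(int)): k for k, p in enumerate(pts)}
    Q = np.zeros((n, n))
    for k, (x, v) in enumerate(zip(pts, f_vals)):
        if not ((0 <= x).all() and (x <= 1).all()):
            continue                               # Q vanishes outside K
        for i in range(d):
            step = np.round(x / h).astype(int)
            step[i] += int(np.sign(v[i]))
            j = index.get(tuple(step))
            if j is not None and v[i] != 0:
                Q[k, j] += abs(v[i]) / h           # rate h^{-1}|f_i| to x + h e_i sgn(f_i)
                Q[k, k] -= abs(v[i]) / h
    return Q
```

Together with the diagonal matrix built from g as in (30), this yields a concrete instance of the approximating system, with rate \(\varepsilon =\max \{h,d^{1/4}C_f^{1/2}\sqrt{h}\}\) by Proposition 9 below.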

Proposition 9

The lattice \(\mathcal {S}^h\) and the Kolmogorov matrix defined by (36) and (37) respectively satisfy conditions (QS1)–(QS3) with \(\varepsilon =\max \{h,d^{1/4}C_f^{1/2}\sqrt{h}\}\).

Proof

The proof mimics the one given in [25]. Condition (QS1) directly follows from (36).

We have that

$$\begin{aligned} \sum _{{y}\in \mathcal {S}^h}({y}-{x})Q^h_{{x},{y}}(t,\beta _{\mathcal {S}})=\sum _{i=1}^d f_i(t,{x},{\mathscr {I}}(\beta _{\mathcal {S}}))e_i =f(t,{x},{\mathscr {I}}(\beta _{\mathcal {S}})). \end{aligned}$$

This gives (QS2).

Finally, condition (QS3) follows from the estimate:

$$\begin{aligned} \sum _{{y}\in \mathcal {S}^h}\Vert {y}-{x}\Vert ^2Q^h_{{x},{y}}(t,\beta _{\mathcal {S}}) =h^{-1}\sum _{i=1}^dh^2|f_i(t,{x},{\mathscr {I}}(\beta _{\mathcal {S}}))|\le C_f\sqrt{d}h. \end{aligned}$$

\(\square \)

7.2 Coupled Dynamics

We choose a constant R such that

$$\begin{aligned} R>ce^{C_gT}. \end{aligned}$$

Notice that, due to Theorem 1 and Eq. (30), we can assume that \(m(\cdot )\) and \(\beta _\mathcal {S}(\cdot )\) defined in Theorem 4 are such that \(\Vert m(t)\Vert < R\), \(\Vert \beta _{\mathcal {S}}(t)\Vert < R\).

The key idea of the proof of Theorem 4 is to construct a generator \(\Lambda ^Q\) such that, if \(\vartheta (\cdot ):[0,T]\rightarrow \mathcal {P}^1((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\) solves

$$\begin{aligned} \frac{d}{dt}\vartheta (t)=\Lambda ^{Q,*}_t[\vartheta (t)]\vartheta (t), \end{aligned}$$
(38)

then the flows of probabilities \(\mu _1(\cdot )\) and \(\mu _2(\cdot )\) defined by the rule:

$$\begin{aligned} \mu _i(t)\triangleq {\text {p}}^i\sharp \vartheta (t), \ \ i=1,2 \end{aligned}$$

are solutions of (26) and (33) respectively. Hereinafter, \(\Lambda ^{Q,*}_t[\vartheta ]\) is the operator adjoint to \(\Lambda ^Q_t[\vartheta ]\). Notice that \(\vartheta (t)\) is a plan between \(\mu _1(t)\) and \(\mu _2(t)\). We will evaluate \(\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\Vert x_1-x_2\Vert \vartheta (t,d(x_1,x_2))\). This, together with the link between the distance on \(\mathcal {P}^1(\mathcal {K}\cup \{\star \})\) and the metric \(\mathcal {W}_{1,b}\), will give the proof of Theorem 4.

The generator \(\Lambda ^Q\) will describe the behavior of a pair of particles, where the first particle can move on \(\mathcal {K}\), while the second one walks randomly on \(\mathcal {S}\). Additionally, both particles can jump to and from the remote point \(\star \). The probability rates of these synchronized jumps will be given by the Lévy measures \(\chi ^-\) and \(\chi ^+\). They are defined as follows.

Let \(t\in [0,T]\), \(x_1\in \mathcal {K}\cup \{\star \}\), \(x_2\in \mathcal {S}\cup \{\star \}\) and let \(\vartheta \in \mathcal {P}((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\). Denote \(\mu _1\triangleq {\text {p}}^1\sharp \vartheta \), \(\mu _2\triangleq {\text {p}}^2\sharp \vartheta \).

First, we choose the measure \(\chi ^-(t,x_1,x_2,\vartheta ,\cdot )\) to be a measure on \(\{0,1\}\times \{0,1\}\) satisfying

  • \(\chi ^-(t,x_1,x_2,\vartheta ,\{(1,1)\})\triangleq g^-_R(t,x_1,\mu _1)\wedge g^-_R(t,x_2,\mu _2)\);

  • \(\chi ^-(t,x_1,x_2,\vartheta ,\{(1,0)\})\triangleq g^-_R(t,x_1,\mu _1)-(g^-_R(t,x_1,\mu _1)\wedge g^-_R(t,x_2,\mu _2))\);

  • \(\chi ^-(t,x_1,x_2,\vartheta ,\{(0,1)\})\triangleq g^-_R(t,x_2,\mu _2)-(g^-_R(t,x_1,\mu _1)\wedge g^-_R(t,x_2,\mu _2))\);

  • \(\chi ^-(t,x_1,x_2,\vartheta ,\{(0,0)\})\triangleq 0\).

Recall that by definition \(g^-(t,\star ,\mu )\equiv 0\).
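The structure of \(\chi ^-\) is the classical coupling of two killing rates by their minimum; a tiny sketch (names ours) makes the marginal-preservation property explicit.

```python
# Synchronized-jump measure chi^- from the bullets above: given the two killing
# rates, the coupling puts mass min(r1, r2) on the joint jump (1,1) and the
# excesses on (1,0) and (0,1), so each marginal jump rate is preserved.

def chi_minus(r1, r2):
    m = min(r1, r2)
    return {(1, 1): m, (1, 0): r1 - m, (0, 1): r2 - m, (0, 0): 0.0}

rates = chi_minus(0.7, 0.3)
assert abs(rates[(1, 1)] + rates[(1, 0)] - 0.7) < 1e-12   # first marginal rate
assert abs(rates[(1, 1)] + rates[(0, 1)] - 0.3) < 1e-12   # second marginal rate
```

This choice puts as much mass as possible on the simultaneous jump (1, 1), which is what keeps the two components of the coupled dynamics synchronized.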

Now, let us introduce the measure \(\chi ^+(t,x_1,x_2,\vartheta )\). For every \(\phi \in C_b((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\), we set

$$\begin{aligned} & \int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\phi (y_1,y_2)\chi ^+(t,\star ,\star ,\vartheta ,d(y_1,y_2))\\ & \quad \triangleq (\vartheta \{(\star ,\star )\})^{-1}\int _{\mathcal {K}\times \mathcal {S}}\big [\phi (y_1,y_2)(g^+_R(t,y_1,\mu _1)\wedge g^+_R(t,y_2,\mu _2))\\ & \qquad +\phi (y_1,\star )(g^+_R(t,y_1,\mu _1)-(g^+_R(t,y_1,\mu _1)\wedge g^+_R(t,y_2,\mu _2)))\\ & \qquad +\phi (\star ,y_2)(g^+_R(t,y_2,\mu _2)-(g^+_R(t,y_1,\mu _1)\wedge g^+_R(t,y_2,\mu _2)))\big ]\vartheta (d(y_1,y_2)). \end{aligned}$$
(39)

When \((x_1,x_2)\ne (\star ,\star )\), we put

$$\begin{aligned} \chi ^+(t,x_1,x_2,\vartheta ,\cdot )\equiv 0. \end{aligned}$$
(40)

Notice that the synchronization of two jump-type processes was proposed in [27] (see also [28], where the synchronization of stopping times was considered).

Finally, let us define the generator \(\Lambda ^Q\) by the rule: for \(\phi \in C((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\) such that \(\phi \) is continuously differentiable w.r.t. the first variable on \(\mathcal {K}\):

$$\begin{aligned} & \Lambda ^Q_t[\vartheta ]\phi (x_1,x_2)\\ & \quad \triangleq \langle \nabla _{x_1}\phi (x_1,x_2),f_R(t,x_1,{\text {p}}^1\sharp \vartheta )\rangle \mathbbm {1}_{\mathcal {K}}(x_1)\\ & \qquad +\sum _{{y}\in \mathcal {S}}\phi (x_1,{y})Q_{x_2,{y}}(t,{\mathscr {I}}^{-1}(R({\text {p}}^2\sharp \vartheta )|_{\mathcal {S}}))\mathbbm {1}_{\mathcal {S}}(x_2)\\ & \qquad + \int _{\{0,1\}\times \{0,1\}}[\phi (x_1+u_1(\star -x_1),x_2+u_2(\star -x_2))-\phi (x_1,x_2)]\chi ^-(t,x_1,x_2,\vartheta ,d(u_1,u_2))\\ & \qquad + \int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}[\phi (y_1,y_2)-\phi (x_1,x_2)]\chi ^+(t,x_1,x_2,\vartheta ,d(y_1,y_2)). \end{aligned}$$
(41)

In the following, we, with some abuse of notation, denote by \(C^1((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\) the set of functions \(\phi :(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})\rightarrow \mathbb {R}\) such that the mapping \(x_1\mapsto \phi (x_1,x_2)\) is continuously differentiable on \(\mathcal {K}\).

Definition 5

We say that a flow of probabilities \(\vartheta :[0,T]\rightarrow \mathcal {P}^1((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\) is a solution to (38) if, for each \(\phi \in C^1((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\), one has that

$$\begin{aligned} & \int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\phi (x_1,x_2)\vartheta (t,d(x_1,x_2))-\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\phi (x_1,x_2)\vartheta (0,d(x_1,x_2))\\ & \quad =\int _0^t \int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}(\Lambda ^Q_\tau [\vartheta (\tau )]\phi (x_1,x_2))\vartheta (\tau ,d(x_1,x_2))d\tau . \end{aligned}$$
(42)

Remark 5

As above, one can consider a particle representation of the flow \(\vartheta (\cdot )\). It is a 6-tuple \((\Omega ,\mathcal {F},\{\mathcal {F}_t\}_{t\in [0,T]},\mathbb {P},X,X^Q)\) such that the following conditions hold true:

  • \((\Omega ,\mathcal {F},\{\mathcal {F}_t\}_{t\in [0,T]},\mathbb {P})\) is a probability space with filtration;

  • X and \(X^Q\) are \(\{\mathcal {F}_t\}_{t\in [0,T]}\)-adapted processes taking values in \(\mathcal {K}\cup \{\star \}\) and \(\mathcal {S}\cup \{\star \}\) respectively;

  • if \(\vartheta (\cdot )\) is defined by the rule \(\vartheta (t)\triangleq (X(t),X^Q(t))\sharp \mathbb {P}\), then the process

    $$\begin{aligned} \phi (X(t),X^Q(t))-\int _0^t\Lambda ^Q_\tau [\vartheta (\tau )] \phi (X(\tau ),X^Q(\tau ))d\tau \end{aligned}$$

    is a \(\{\mathcal {F}_t\}_{t\in [0,T]}\)-martingale.

Notice that \(X(\cdot )\) describes the continuous motion, while \(X^Q\) is its Markov chain approximation. The first and second terms of the generator \(\Lambda ^Q\) provide the independent evolution on \(\mathcal {K}\) and \(\mathcal {S}\) respectively. Simultaneously, the measure \(\chi ^-\) gives the coordinated jumps from \(\mathcal {K}\) and \(\mathcal {S}\) to the remote point \(\star \). The reverse jumps are described by the measure \(\chi ^+\). It produces jumps only if both components \(X(t-)\) and \(X^Q(t-)\) are at the remote point \(\star \), and it distributes the new state \((X(t),X^Q(t))\) according to the current measure \(\vartheta (t)\). The way one can construct a particle representation of a flow \(\vartheta (\cdot )\) is briefly discussed in Remark 4.

It was mentioned above that the proof of Theorem 4 relies on the properties of the solution to (38). Thus, we need an existence result.

Proposition 10

Given an initial measure \(\vartheta _0\in \mathcal {P}^1((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\), there exists a solution to (38) that starts from \(\vartheta _0\). If \(\vartheta (\cdot )\) is a solution of (38), then, at any time t, \(\vartheta (t)\) is a probability measure concentrated on \((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})\).

Proof

Such existence results are analogous to those proved in [4, 27], where the approximation of the dynamics by Markov chains was used. However, for the sake of completeness, we give a direct proof. The approximation technique follows the way introduced in Sect. 6.

Let \(a>0\), \(\mathcal {K}_a\triangleq \mathcal {K}+B_a\), \(\mathcal {S}^a\triangleq \mathcal {K}_a\cap a\mathbb {Z}^d\). Further, if \(x,y\in \mathcal {S}^a\), \(\vartheta ^a\in \mathcal {P}^1((\mathcal {S}^a\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\), put

$$\begin{aligned} \mathcal {Q}^a_{x,y}(t,\vartheta ^a)\triangleq \left\{ \begin{array}{ll} a^{-1}|f_i(t,x,{\text {p}}^1\sharp \vartheta ^a)|, &{} \begin{array}{l} x\in \mathcal {K},\\ y=x+ ae_i{\text {sgn}}(f_i(t,x,{\text {p}}^1\sharp \vartheta ^a)),\end{array}\\ -a^{-1}\sum _{i=1}^{d}|f_i(t,x,{\text {p}}^1\sharp \vartheta ^a)|, &{} x=y\in \mathcal {K},\\ 0, &{} \text {otherwise}. \end{array} \right. \end{aligned}$$
(43)

For each \(t\in [0,T]\), \(\vartheta ^a\in \mathcal {P}^1((\mathcal {S}^a\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\), let us consider the following generator:

$$\begin{aligned} \begin{aligned}&\Lambda ^a_t[\vartheta ^a]\phi (x_1,x_2)\\&\quad \triangleq \sum _{y\in \mathcal {S}^a}\phi (y,x_2)\mathcal {Q}^a_{x_1,y}(t,\vartheta ^a) +\sum _{{y}\in \mathcal {S}}\phi (x_1,{y})Q_{x_2,{y}}(t,{\mathscr {I}}^{-1}(R({\text {p}}^2\sharp \vartheta ^a)|_{\mathcal {S}}))\\&\qquad +\int _{\{0,1\}\times \{0,1\}}[\phi (x_1+u_1(\star -x_1),x_2+u_2(\star -x_2))-\phi (x_1,x_2)]\chi ^-(t,x_1,x_2,\vartheta ^a,d(u_1,u_2))\\&\qquad +\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}[\phi (y_1,y_2)-\phi (x_1,x_2)]\chi ^+(t,x_1,x_2,\vartheta ^a,d(y_1,y_2)). \end{aligned} \end{aligned}$$

Due to Kolokoltsov [4, Theorem 7.2.1], for each \(\vartheta _0^a\) that is concentrated on \((\mathcal {S}^a\cup \{\star \})\times (\mathcal {S}\cup \{\star \})\), there exists a solution of the equation

$$\begin{aligned} \frac{d}{dt}\vartheta ^a(t)=\Lambda ^{a,*}_t[\vartheta ^a(t)]\vartheta ^a(t). \end{aligned}$$
(44)

As above, \(\Lambda ^{a,*}_t[\theta ]\) stands for the operator adjoint to \(\Lambda ^a_t[\theta ]\). The solution of (44) is considered in the weak sense. Notice that, for each t, \(\vartheta ^a(t)\) is concentrated on \((\mathcal {S}^a\cup \{\star \})\times (\mathcal {S}\cup \{\star \})\). Moreover, letting \(\phi \equiv 1\), we conclude that each measure \(\vartheta ^a(t)\) is a probability.

Further, for \(\phi \in {\text {Lip}}_1((\mathcal {S}^a\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\), we have that

$$\begin{aligned} \Bigg |\int _{(\mathcal {S}^a\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}(\Lambda ^a_t[\vartheta ^a(t)]\phi (x_1,x_2))\vartheta ^a(t,d(x_1,x_2))\Bigg | \le C_f+C_{10}\Vert \phi \Vert . \end{aligned}$$

Here \(C_{10}\) is a constant that can depend on \(\mathcal {Q}\) but does not depend on a. Notice that, without loss of generality, one can assume that \(\phi (\star ,\star )=0\) and, thus, for some constant \(C_{11}\), \(\Vert \phi \Vert \le C_{11}\). Therefore, for each \(s,r\in [0,T]\), and \(\phi \in {\text {Lip}}_1((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\) such that \(\phi (\star ,\star )=0\),

$$\begin{aligned} \begin{aligned}&\Bigg |\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\phi (x_1,x_2)\vartheta ^a(r,d(x_1,x_2))\\&\quad -\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\phi (x_1,x_2)\vartheta ^a(s,d(x_1,x_2))\Bigg |\le C_{12}|r-s|.\end{aligned} \end{aligned}$$

As above, \(C_{12}\) stands for a constant. Due to this estimate and the Kantorovich–Rubinstein duality [16], the set of curves \(\{\vartheta ^a(\cdot )\}\) is relatively compact in \(C([0,T],\mathcal {P}^1((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})))\). Assuming that \(\{\vartheta ^{a}_0\}\) converges to \(\vartheta _0\), one can construct a sequence \(\{a_n\}_{n=1}^\infty \) and a limiting curve \(\vartheta (\cdot )\) such that \(\vartheta (0)=\vartheta _0\), \(a_n\rightarrow 0\) as \(n\rightarrow \infty \), while \(\{\vartheta ^{a_n}(\cdot )\}\) converges to \(\vartheta (\cdot )\) in \(C([0,T],\mathcal {P}^1((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})))\).

Let us prove that \(\vartheta (\cdot )\) is a solution to (38). To this end, we prove that, if

  • \(\{\vartheta _n\}_{n=1}^\infty \) with \(\vartheta _n\in \mathcal {P}^1((\mathcal {S}^{a_n}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\),

  • \(\vartheta \in \mathcal {P}^1((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\),

  • \(\{a_n\}_{n=1}^\infty \subset (0,+\infty )\),

  • \(a_n\rightarrow 0\) and \(\vartheta _n\rightarrow \vartheta \) as \(n\rightarrow \infty \),

then, for each \(x_1\in \mathcal {K}\cup \{\star \}\), \(x_2\in \mathcal {S}\cup \{\star \}\), \(t\in [0,T]\), \(\phi \in C^1((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\),

$$\begin{aligned} \Lambda ^{a_n}_t[\vartheta _n]\phi (x_1,x_2)\rightarrow \Lambda ^Q_t[\vartheta ]\phi (x_1,x_2). \end{aligned}$$
(45)

First, notice that the dependence

$$\begin{aligned} \begin{aligned} \vartheta \mapsto&\sum _{{y}\in \mathcal {S}}\phi (x_1,{y})Q_{x_2,{y}}(t,{\mathscr {I}}^{-1}(R({\text {p}}^2\sharp \vartheta )|_{\mathcal {S}}))\\&\quad + \int _{\{0,1\}\times \{0,1\}}[\phi (x_1+u_1(\star -x_1),x_2+u_2(\star -x_2))-\phi (x_1,x_2)]\chi ^-(t,x_1,x_2,\vartheta ,d(u_1,u_2))\\&\quad +\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}[\phi (y_1,y_2)-\phi (x_1,x_2)]\chi ^+(t,x_1,x_2,\vartheta ,d(y_1,y_2)) \end{aligned} \end{aligned}$$

is continuous for each \(t\in [0,T]\), \(x_1\in \mathcal {K}\), \(x_2\in \mathcal {S}\), \(\phi \in C((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\).

Thanks to the choice of the function \(\phi \), the modulus of continuity \(\varsigma (\cdot )\) of the dependence \(\mathcal {K}\ni x_1\mapsto \nabla _{x_1}\phi (x_1,x_2)\) is well-defined. In particular, \(\varsigma (a)\rightarrow 0\) as \(a\rightarrow 0\). Further,

$$\begin{aligned} |a^{-1}(\phi (x_1\pm ae_i,x_2)-\phi (x_1,x_2))\mp \langle \nabla _{x_1}\phi (x_1,x_2),e_i\rangle |\le \varsigma (a). \end{aligned}$$

Therefore, by (43), we have that, for \(x_1\in \mathcal {K}\), \(x_2\in \mathcal {S}\cup \{\star \}\),

$$\begin{aligned} \Bigg |\sum _{y\in \mathcal {S}^a}\phi (y,x_2)\mathcal {Q}^a_{x_1,y}(t,\vartheta )-\big \langle f(t,x_1,{\text {p}}^1\sharp \vartheta ),\nabla _{x_1}\phi (x_1,x_2)\big \rangle \Bigg |\le C_f\varsigma (a). \end{aligned}$$

Combining this with the continuity of the last three terms in the formula for the generator \(\Lambda ^Q\) [see (41)] and the continuity of the function f, we obtain the convergence result (45). Thus, we can pass to the limit in (44) and conclude that \(\vartheta (\cdot )\) solves (38).

The fact that \(\vartheta (t)\) is concentrated on \((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})\) directly follows from the fact that \(\Lambda ^Q_t[\vartheta ]\) does not transport mass outside \((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})\) whenever \(\vartheta \) is concentrated on this set. Finally, choosing \(\phi (x_1,x_2)\equiv 1\), we conclude that \(\vartheta (t,(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))=1\). This means that each measure \(\vartheta (t)\) is a probability. \(\square \)

Further, we show that the first and second marginal distributions of the solution of (38) solve (26) and (33), respectively.

Proposition 11

Let \(\vartheta (\cdot )\) solve Eq. (38) and let \(\vartheta (0)\) be concentrated on \((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})\). Then,

  • the flow of probabilities \(\mu _1(\cdot )\) defined by the rule \(\mu _1(t)\triangleq {\text {p}}^1\sharp (\vartheta (t))\) satisfies Eq. (26);

  • the flow of probabilities \(\mu _2(\cdot )\) such that \(\mu _2(t)\triangleq {\text {p}}^2\sharp (\vartheta (t))\) is a solution of (33).

Proof

To prove the first statement, we assume that \(\phi \) depends only on \(x_1\in \mathcal {K}\cup \{\star \}\). For each \(\vartheta \in \mathcal {P}^1((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\), put \(\mu _1\triangleq {\text {p}}^1\sharp \vartheta \). We have that

$$\begin{aligned} \Lambda ^Q_t[\vartheta ]\phi (x_1)\triangleq{} & \langle \nabla \phi (x_1),f_R(t,x_1,\mu _1)\rangle \\ & + \int _{\{0,1\}\times \{0,1\}}[\phi (x_1+u_1(\star -x_1))-\phi (x_1)]\chi ^-(t,x_1,x_2,\vartheta ,d(u_1,u_2))\\ & + \int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}[\phi (y_1)-\phi (x_1)]\chi ^+(t,x_1,x_2,\vartheta ,d(y_1,y_2)). \end{aligned}$$
(46)

Further, recall that

$$\begin{aligned} & \int _{\{0,1\}\times \{0,1\}}[\phi (x_1+u_1(\star -x_1))-\phi (x_1)]\chi ^-(t,x_1,x_2,\vartheta ,d(u_1,u_2))\\ & \quad = \int _{\{0,1\}}[\phi (x_1+u_1(\star -x_1))-\phi (x_1)]\nu ^-(t,x_1,\mu _1,du_1). \end{aligned}$$
(47)

Additionally, from the definition of the measure \(\chi ^+\) [see (39), (40)], we have that

$$\begin{aligned} \begin{aligned}&\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}[\phi (y_1)-\phi (x_1)]\chi ^+(t,x_1,x_2,\vartheta ,d(y_1,y_2))\\&\quad =(\vartheta \{(\star ,\star )\})^{-1}\int _{\mathcal {K}\times \mathcal {S}} (\phi (y_1)-\phi (\star ))g^+_R(t,y_1,\mu _1)\vartheta (d(y_1,y_2)) \mathbbm {1}_{\{(\star ,\star )\}}(x_1,x_2). \end{aligned} \end{aligned}$$

Integrating both sides of this equality against the measure \(\vartheta (d(x_1,x_2))\), we have that

$$\begin{aligned} \begin{aligned}&\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}[\phi (y_1)-\phi (x_1)]\chi ^+(t,x_1,x_2,\vartheta ,d(y_1,y_2))\vartheta (d(x_1,x_2))\\&\quad =\int _{\mathcal {K}\cup \{\star \}}(\phi (y_1)-\phi (\star ))g^+_R(t,y_1,\mu _1)\mu _1(dy_1). \end{aligned} \end{aligned}$$

Using this and the definition of the measure \(\nu ^+\) [see (23), (24),  (35)], we conclude that

$$\begin{aligned} \begin{aligned}&\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}[\phi (y_1)-\phi (x_1)]\chi ^+(t,x_1,x_2,\vartheta ,d(y_1,y_2))\vartheta (d(x_1,x_2))\\&\quad =\int _{\mathcal {K}\cup \{\star \}}\int _{\mathcal {K}\cup \{\star \}}[\phi (y_1)-\phi (x_1)]\nu ^+(t,x_1,\mu _1,dy_1)\mu _1(dx_1). \end{aligned} \end{aligned}$$

This expression and formulae (46), (47) yield the equality

$$\begin{aligned} & \int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\Lambda ^Q_t[\vartheta ]\phi (x_1)\vartheta (d(x_1,x_2))\\ & \quad =\int _{\mathcal {K}}\langle \nabla \phi (x_1),f_R(t,x_1,\mu _1)\rangle \mu _1(dx_1)\\ & \qquad +\int _{\mathcal {K}\cup \{\star \}}\int _{\{0,1\}}[\phi (x_1+u_1(\star -x_1))-\phi (x_1)]\nu ^-(t,x_1,\mu _1,du_1)\mu _1(dx_1)\\ & \qquad +\int _{\mathcal {K}\cup \{\star \}}\int _{\mathcal {K}\cup \{\star \}}[\phi (y_1)-\phi (x_1)]\nu ^+(t,x_1,\mu _1,dy_1)\mu _1(dx_1)\\ & \quad = \int _{\mathcal {K}\cup \{\star \}}L_t[\mu _1]\phi (x)\mu _1(dx). \end{aligned}$$
(48)

Now, recall that \(\vartheta (\cdot )\) satisfies (38), while \(\mu _1(\cdot )\) is defined by the rule: \(\mu _1(t)={\text {p}}^1\sharp \vartheta (t)\). This, (48) and the definition of the generator \(L_t\) [see (25)] yield that, if \(\phi \in C^1(\mathcal {K}\cup \{\star \})\), then

$$\begin{aligned}\frac{d}{dt}\int _{\mathcal {K}\cup \{\star \}}\phi (x)\mu _1(dx)=\int _{\mathcal {K}\cup \{\star \}}L_t[\mu _1]\phi (x)\mu _1(dx).\end{aligned}$$

This implies the first statement of the proposition.

The second statement is proved in the same way. \(\square \)

7.3 Regularized Distance Between Marginal Distributions

In this section, we give the proof of Theorem 4. As we mentioned above, the natural idea here is to study the evolution of the integral \(\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\Vert x_1-x_2\Vert \vartheta (t,d(x_1,x_2))\) using Eq. (42). However, the distance function is not smooth. Therefore, we regularize it and evaluate the action of the generator \(\Lambda ^Q_t[\vartheta ]\) on the function

$$\begin{aligned} \varphi _\varepsilon (x_1,x_2)\triangleq \sqrt{\Vert x_1-x_2\Vert ^2+\varepsilon ^2}. \end{aligned}$$
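Before stating the lemmas, one can sanity-check numerically the two elementary properties of \(\varphi _\varepsilon \) that drive the argument, namely \(0\le \varphi _\varepsilon (x_1,x_2)-\Vert x_1-x_2\Vert \le \varepsilon \) and \(\Vert \nabla _{x_1}\varphi _\varepsilon (x_1,x_2)\Vert \le 1\) [cf. (49)]; the sketch below is our own illustration.

```python
import numpy as np

# Checks, on random data, that phi_eps is within eps of the distance and that
# its gradient in the first variable has norm at most 1.

def phi_eps(x1, x2, eps):
    return np.sqrt(np.sum((x1 - x2) ** 2) + eps ** 2)

def grad_x1(x1, x2, eps):
    return (x1 - x2) / phi_eps(x1, x2, eps)

rng = np.random.default_rng(2)
for _ in range(1000):
    x1, x2 = rng.normal(size=3), rng.normal(size=3)
    eps = rng.random() + 1e-3
    d = np.linalg.norm(x1 - x2)
    assert 0.0 <= phi_eps(x1, x2, eps) - d <= eps
    assert np.linalg.norm(grad_x1(x1, x2, eps)) <= 1.0
```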

In the following lemmas, we gradually estimate the action of the parts of the generator \(\Lambda ^Q\) on this function. These results will imply Theorem 4. We start with the evaluation of the action of the first two terms.

Lemma 2

For each \(t\in [0,T]\), \(x_1\in \mathcal {K}\), \({x}_2\in \mathcal {S}\), \(\vartheta \in \mathcal {P}((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\), and \(\mu _1\triangleq {\text {p}}^1\sharp \vartheta \), \(\mu _2\triangleq {\text {p}}^2\sharp \vartheta \),

$$\begin{aligned} \begin{aligned}&\Bigl |\langle f_R(t,x_1,\mu _1),\nabla _{x_1}\varphi _\varepsilon (x_1,{x}_2)\rangle + \sum _{{y}\in \mathcal {S}}\varphi _\varepsilon (x_1,{y}) Q_{{x}_2,{y}}(t,{\mathscr {I}}^{-1}(R\mu _2|_{\mathcal {S}}))\Bigr |\\&\quad \le C_{Lf}(\varphi _\varepsilon (x_1,{x}_2)+RW_1(\mu _1,\mu _2))+2\varepsilon .\end{aligned} \end{aligned}$$

Proof

First, we have that

$$\begin{aligned} \nabla _{x_1}\varphi _\varepsilon (x_1,x_2) =\frac{x_1-x_2}{\sqrt{\Vert x_1-x_2\Vert ^2+\varepsilon ^2}}. \end{aligned}$$
(49)

Expanding the function \(y\mapsto \varphi _\varepsilon (x_1,y)\) using the Taylor series about the point \(x_2\), we have that

$$\begin{aligned} \varphi _\varepsilon (x_1,y)={} & \varphi _\varepsilon (x_1,x_2)- \frac{\langle x_1-x_2,y-x_2\rangle }{\sqrt{\Vert x_1-x_2\Vert ^2+\varepsilon ^2}}\\ & + \frac{\Vert y-x_2\Vert ^2(\Vert y'-x_1\Vert ^2+\varepsilon ^2)-(\langle y'-x_1,y-x_2\rangle )^2}{2(\Vert y'-x_1\Vert ^2+\varepsilon ^2)^{3/2}}. \end{aligned}$$
(50)

Here \(y'\) lies in the segment \([x_2,y]\). Since \(|\langle x_2-x_1,y-x_2\rangle |\le \Vert y-x_2\Vert \cdot \Vert x_2-x_1\Vert \), we conclude that, for each \(y\ne x_2\),

$$\begin{aligned} \begin{aligned}&\Bigg |\frac{\Vert y-x_2\Vert ^2(\Vert y'-x_1\Vert ^2+\varepsilon ^2)-(\langle y'-x_1,y-x_2\rangle )^2}{2(\Vert y'-x_1\Vert ^2+\varepsilon ^2)^{3/2}}\Bigg |\\&\quad \le \frac{\Vert y-x_2\Vert ^2(\Vert y'-x_1\Vert ^2+\varepsilon ^2)}{2(\Vert y'-x_1\Vert ^2 +\varepsilon ^2)^{3/2}}+\frac{|\langle y'-x_1,y-x_2\rangle |^2}{2(\Vert y'-x_1\Vert ^2+\varepsilon ^2)^{3/2}}\le \Vert y-x_2\Vert ^2\varepsilon ^{-1}. \end{aligned} \end{aligned}$$

Using this and (50), we conclude that

$$\begin{aligned} \Bigg |\varphi _\varepsilon (x_1,y)- \Bigg (\varphi _\varepsilon (x_1,x_2)- \frac{\langle x_1-x_2,y-x_2\rangle }{\sqrt{\Vert x_1-x_2\Vert ^2+\varepsilon ^2}}\Bigg )\Bigg |\le \varepsilon ^{-1}\Vert y-x_2\Vert ^2. \end{aligned}$$

Since \(Q(t,{\mathscr {I}}^{-1}(R\mu _2|_{\mathcal {S}}))\) is a Kolmogorov matrix, we derive that

$$\begin{aligned} \begin{aligned}&\Bigg |\sum _{{y}\in \mathcal {S}}\varphi _\varepsilon (x_1,{y}) Q_{x_2,{y}}(t,{\mathscr {I}}^{-1}(R\mu _2|_{\mathcal {S}}))\\&\qquad +\sum _{{y}\in \mathcal {S},{y}\ne x_2} \frac{\langle x_1-x_2,y-x_2\rangle }{\sqrt{\Vert x_1-x_2\Vert ^2+\varepsilon ^2}} Q_{x_2,{y}}(t,{\mathscr {I}}^{-1}(R\mu _2|_{\mathcal {S}}))\Bigg |\\&\quad \le \varepsilon ^{-1}\sum _{{y}\in \mathcal {S}}\Vert y-x_2\Vert ^2Q_{x_2,{y}} (t,{\mathscr {I}}^{-1}(R\mu _2|_{\mathcal {S}})). \end{aligned} \end{aligned}$$

Due to (QS3), the right-hand side of this inequality is bounded by \(\varepsilon \). Thus,

$$\begin{aligned} & \Bigg |\sum _{{y}\in \mathcal {S}} \varphi _\varepsilon (x_1,{y})Q_{x_2,{y}}(t,{\mathscr {I}}^{-1}(R\mu _2|_{\mathcal {S}}))\\ & \quad +\sum _{{y}\in \mathcal {S},{y}\ne x_2} \frac{\langle x_1-x_2,y-x_2\rangle }{\sqrt{\Vert x_1-x_2\Vert ^2+\varepsilon ^2}}Q_{x_2,{y}}(t,{\mathscr {I}}^{-1}(R\mu _2|_{\mathcal {S}}))\Bigg | \le \varepsilon . \end{aligned}$$
(51)

The Lipschitz continuity of the function f gives that

$$\begin{aligned} & \Bigl |\langle f_R(t,x_1,\mu _1),\nabla _{x_1}\varphi _\varepsilon (x_1,x_2)\rangle + \sum _{{y}\in \mathcal {S}}\varphi _\varepsilon (x_1,{y})Q_{x_2,{y}}(t,{\mathscr {I}}^{-1}(R\mu _2|_{\mathcal {S}}))\Bigr |\\ & \quad \le \Bigl |\langle f_R(t,x_2,\mu _2),\nabla _{x_1}\varphi _\varepsilon (x_1,x_2)\rangle + \sum _{{y}\in \mathcal {S}}\varphi _\varepsilon (x_1,{y})Q_{x_2,{y}}(t,{\mathscr {I}}^{-1}(R\mu _2|_{\mathcal {S}}))\Bigr |\\ & \qquad +C_{Lf}(\Vert x_1-x_2\Vert +RW_{1}(\mu _1,\mu _2)). \end{aligned}$$
(52)

Taking into account (49) and (51), we conclude that

$$\begin{aligned} \begin{aligned}&\Bigl |\langle f_R(t,x_1,\mu _1),\nabla _{x_1}\varphi _\varepsilon (x_1,x_2)\rangle + \sum _{{y}\in \mathcal {S}}\varphi _\varepsilon (x_1,{y})Q_{x_2,{y}}(t,{\mathscr {I}}^{-1}(R\mu _2|_{\mathcal {S}}))\Bigr |\\&\quad \le \frac{1}{\sqrt{\Vert x_1-x_2\Vert ^2+\varepsilon ^2}}\Bigg |\Bigl \langle f_R(t,x_2,\mu _2)-\sum _{{y}\in \mathcal {S}}(y-x_2)Q_{x_2,{y}}(t,{\mathscr {I}}^{-1}(R\mu _2|_{\mathcal {S}})),x_1-x_2\Bigr \rangle \Bigg |\\&\qquad + C_{Lf}(\Vert x_1-x_2\Vert +RW_{1}(\mu _1,\mu _2))+\varepsilon . \end{aligned} \end{aligned}$$

Condition (QS2) and the definition of the function \(f_R\) yield that

$$\begin{aligned} \Bigg \Vert f_R(t,x_2,\mu _2)-\sum _{{y}\in \mathcal {S}}(y-x_2)Q_{x_2,{y}}(t,{\mathscr {I}}^{-1}(R\mu _2|_{\mathcal {S}}))\Bigg \Vert \le \varepsilon . \end{aligned}$$

Therefore,

$$\begin{aligned} \begin{aligned}&\Bigg |\langle f_R(t,x_1,\mu _1),\nabla _{x_1}\varphi _\varepsilon (x_1,x_2)\rangle + \sum _{{y}\in \mathcal {S}}\varphi _\varepsilon (x_1,{y}) Q_{x_2,{y}}(t,{\mathscr {I}}^{-1}(R\mu _2|_{\mathcal {S}}))\Bigg |\\&\quad \le \varepsilon \frac{\Vert x_1-x_2\Vert }{\sqrt{\Vert x_1-x_2\Vert ^2+\varepsilon ^2}}+ C_{Lf}(\Vert x_1-x_2\Vert +RW_{1}(\mu _1,\mu _2))+\varepsilon . \end{aligned} \end{aligned}$$

This, together with the inequality \(\Vert x_1-x_2\Vert \le \sqrt{\Vert x_1-x_2\Vert ^2+\varepsilon ^2}=\varphi _\varepsilon (x_1,x_2)\), implies the statement of the lemma. \(\square \)

The following lemma gives an evaluation of the third term of the generator in the case when the test function is equal to \(\varphi _\varepsilon \).

Lemma 3

There exists a constant \(C_{13}\) depending only on f and g such that, for each \(t\in [0,T]\), \(x_1\in \mathcal {K}\cup \{\star \}\), \(x_2\in \mathcal {S}\cup \{\star \}\), \(\vartheta \in \mathcal {P}^1((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\), \(\mu _1\triangleq {\text {p}}^1\sharp \vartheta \), \(\mu _2\triangleq {\text {p}}^2\sharp \vartheta \), one has

$$\begin{aligned} \begin{aligned}&\Bigg |\int _{\{0,1\}\times \{0,1\}} [\varphi _\varepsilon (x_1+u_1(\star -x_1),x_2 +u_2(\star -x_2))\\&\qquad -\varphi _\varepsilon (x_1,x_2)]\chi ^ -(t,x_1,x_2,\vartheta ,d(u_1,u_2))\Bigg |\\&\quad \le \varepsilon C_g+C_{13}\varphi _\varepsilon (x_1,x_2)+C_{13} RW_1(\mu _1,\mu _2). \end{aligned} \end{aligned}$$

Proof

Direct computations yield that

$$\begin{aligned} \begin{aligned}&\int _{\{0,1\}\times \{0,1\}} [\varphi _\varepsilon (x_1+u_1(\star -x_1),x_2 +u_2(\star -x_2)) -\varphi _\varepsilon (x_1,x_2)]\\&\qquad \chi ^-(t,x_1,x_2,\vartheta ,d(u_1,u_2))\\&\quad =(\varphi _\varepsilon (\star ,\star ) -\varphi _\varepsilon (x_1,x_2))(g_R^-(t,x_1,\mu _1)\wedge g_R^ -(t,x_2,\mu _2))\\&\qquad +(\varphi _\varepsilon (\star ,x_2) -\varphi _\varepsilon (x_1,x_2))(g_R^-(t,x_1,\mu _1) -(g_R^-(t,x_1,\mu _1)\wedge g_R^-(t,x_2,\mu _2)))\\&\qquad +(\varphi _\varepsilon (x_1,\star ) -\varphi _\varepsilon (x_1,x_2))(g_R^-(t,x_2,\mu _2) -(g_R^-(t,x_1,\mu _1)\wedge g_R^-(t,x_2,\mu _2))). \end{aligned} \end{aligned}$$

Further, taking into account the facts that \(\varphi _\varepsilon (\star ,\star )=\varepsilon \), while \(\varphi _\varepsilon (x_1,\star )=\varphi _ \varepsilon (\star ,x_2)=\sqrt{b^2+\varepsilon ^2}\), we have that

$$\begin{aligned}\begin{aligned}&\int _{\{0,1\}\times \{0,1\}} [\varphi _\varepsilon (x_1+u_1(\star -x_1),x_2 +u_2(\star -x_2)) -\varphi _\varepsilon (x_1,x_2)]\\&\qquad \chi ^-(t,x_1,x_2,\vartheta ,d(u_1,u_2))\\&\quad =\varepsilon (g_R^-(t,x_1,\mu _1)\wedge g_R^-(t,x_2,\mu _2))\\&\qquad +\sqrt{b^2+\varepsilon ^2}((g_R^-(t,x_1,\mu _1)\vee g_R^-(t,x_2,\mu _2))-(g_R^-(t,x_1,\mu _1)\wedge g_R^-(t,x_2,\mu _2)))\\&\qquad -\varphi _\varepsilon (x_1,x_2)(g_R^-(t,x_1,\mu _1)\vee g_R^-(t,x_2,\mu _2)). \end{aligned} \end{aligned}$$

The deduced equality and the boundedness of the function \(g^-_R\), as well as its Lipschitz continuity w.r.t. the phase and measure variables, give that

$$\begin{aligned} \begin{aligned}&\Bigg |\int _{\{0,1\}\times \{0,1\}} [\varphi _\varepsilon (x_1+u_1(\star -x_1),x_2 +u_2(\star -x_2)) -\varphi _\varepsilon (x_1,x_2)]\\&\qquad \chi ^-(t,x_1,x_2,\vartheta ,d(u_1,u_2))\Bigg |\\&\quad \le \varepsilon C_g+\sqrt{b^2+\varepsilon ^2}C_{Lg}(\Vert x_1-x_2\Vert +RW_1(\mu _1,\mu _2)) +\varphi _\varepsilon (x_1,x_2)C_g. \end{aligned} \end{aligned}$$

This implies the conclusion of the lemma. \(\square \)

Now let us evaluate the action of the measure \(\chi ^+\) on the regularized distance function.

Lemma 4

There exists a constant \(C_{14}\) such that, for each \(t\in [0,T]\), \(\vartheta \in \mathcal {P}((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\), \(\mu _1\triangleq {\text {p}}^1\sharp \vartheta \), \(\mu _2\triangleq {\text {p}}^2\sharp \vartheta \), the following estimate holds true:

$$\begin{aligned} \begin{aligned}&\Bigg |\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}[\varphi _\varepsilon (y_1,y_2) -\varphi _\varepsilon (\star ,\star )]\chi ^+(t,\star ,\star ,\vartheta ,d(y_1,y_2))\Bigg |\\&\quad \le \vartheta ^{-1}(\{(\star ,\star )\})\Bigg [C_g\varepsilon +C_{14}\int _{\mathcal {K}\times \mathcal {S}}\varphi _\varepsilon (y_1,y_2) \vartheta (d(y_1,y_2))+C_{14}RW_1(\mu _1,\mu _2)\Bigg ]. \end{aligned} \end{aligned}$$

Proof

By the definition of the measure \(\chi ^+(t,\star ,\star ,\vartheta ,\cdot )\) [see (39)], we have that

$$\begin{aligned} \begin{aligned}&\vartheta (\{(\star ,\star )\})\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\varphi _\varepsilon (y_1,y_2)\chi ^+(t,\star ,\star ,\vartheta ,d(y_1,y_2))\\&\quad = \int _{\mathcal {K}\times \mathcal {S}}\big [\varphi _\varepsilon (y_1,y_2)(g_R^+(t,y_1,\mu _1)\wedge g_R^+(t,y_2,\mu _2))\\&\qquad +\sqrt{b^2+\varepsilon ^2}((g_R^+(t,y_1,\mu _1)\vee g_R^+(t,y_2,\mu _2))-(g_R^+(t,y_1,\mu _1)\wedge g_R^+(t,y_2,\mu _2)))\big ]\\&\qquad \vartheta (d(y_1,y_2)). \end{aligned} \end{aligned}$$

The boundedness and the Lipschitz continuity of the function g (and, thus, the function \(g^+_R\)) yield that

$$\begin{aligned}&\vartheta (\{(\star ,\star )\})\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\varphi _\varepsilon (y_1,y_2)\chi ^+(t,\star ,\star ,\vartheta ,d(y_1,y_2))\nonumber \\&\quad \le \int _{\mathcal {K}\times \mathcal {S}}[C_g\varphi _\varepsilon (y_1,y_2)+C_{Lg}\sqrt{b^2+\varepsilon ^2} (\Vert y_1-y_2\Vert +RW_1(\mu _1,\mu _2))]\vartheta (d(y_1,y_2)). \end{aligned}$$
(53)

Further, we have that

$$\begin{aligned}\begin{aligned}&\vartheta (\{(\star ,\star )\})\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\varphi _\varepsilon (\star ,\star )\chi ^+(t,\star ,\star ,\vartheta ,d(y_1,y_2))\\&\quad =\varepsilon \int _{\mathcal {K}\times \mathcal {S}}(g_R^+(t,y_1,\mu _1)\vee g_R^+(t,y_2,\mu _2))\vartheta (d(y_1,y_2))\\&\quad \le C_g\varepsilon . \end{aligned} \end{aligned}$$

This and (53) imply the statement of the lemma. \(\square \)

Proof of Theorem 4

Let \(\vartheta (\cdot ):[0,T]\rightarrow \mathcal {P}^1((\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \}))\) satisfy the following conditions:

  • \(\vartheta (0)\) is an optimal plan between \(\mu _{0,1}\triangleq R^{-1}(m_0\triangleright _{R})\) and \(\mu _{0,2}\triangleq R^{-1}(({\mathscr {I}}(\beta _0))\triangleright _{R})\);

  • \(\vartheta (\cdot )\) is a solution to (38).

Notice that

$$\begin{aligned} W_{1}({\text {p}}^1\sharp \vartheta (t),{\text {p}}^2\sharp \vartheta (t))\le \int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\varphi _\varepsilon (x_1,x_2)\vartheta (t,d(x_1,x_2)). \end{aligned}$$
(54)

Therefore, integrating the estimates proved in Lemmas 2–4, we conclude that

$$\begin{aligned}\begin{aligned}&\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\varphi _\varepsilon (x_1,x_2)\vartheta (t,d(x_1,x_2))\\&\quad \le \int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\varphi _\varepsilon (x_1,x_2)\vartheta (0,d(x_1,x_2)) +C_{15}\varepsilon \\&\qquad +C_{16}(1+R)\int _0^t\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\varphi _\varepsilon (x_1,x_2)\vartheta (s,d(x_1,x_2))ds. \end{aligned} \end{aligned}$$

Here \(C_{15}\) and \(C_{16}\) are constants determined only by the functions f and g. Using Gronwall’s inequality, we arrive at the estimate

$$\begin{aligned} \begin{aligned}&\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\varphi _\varepsilon (x_1,x_2)\vartheta (t,d(x_1,x_2))\\&\quad \le \Bigg [\int _{(\mathcal {K}\cup \{\star \})\times (\mathcal {S}\cup \{\star \})}\varphi _\varepsilon (x_1,x_2)\vartheta (0,d(x_1,x_2)) +C_{15}\varepsilon \Bigg ]e^{C_{16}(1+R)T}. \end{aligned} \end{aligned}$$

Further, recall that \(\varphi _\varepsilon (x_1,x_2)=\sqrt{\Vert x_1-x_2\Vert ^2+\varepsilon ^2}\le \Vert x_1-x_2\Vert +\varepsilon \). Using this and (54), we conclude that

$$\begin{aligned} W_1({\text {p}}^1\sharp \vartheta (t),{\text {p}}^2\sharp \vartheta (t))\le \big [W_1({\text {p}}^1\sharp \vartheta (0),{\text {p}}^2\sharp \vartheta (0)) +(C_{15}+1)\varepsilon \big ]e^{C_{16}(1+R)T}. \end{aligned}$$

The conclusion of the theorem follows from the definition of the flow \(\vartheta (\cdot )\), equality (6) and Propositions 3, 7 and 11. \(\square \)