Mathematics Subject Classification (2000)

1 Introduction

The recent book by T. Piketty (2013) shifted the attention of the general public, as well as that of economists, towards the important issue of wealth inequality. The question “Why is there wealth inequality?” has attracted a diverse set of researchers, including economists, physicists and mathematicians. In particular, during the last 20 years, physicists and mathematicians have developed models to derive the wealth distribution theoretically using tools of statistical physics and probability theory: discrete and continuous stochastic processes (random exchange models) as well as related Boltzmann-type kinetic equations. In this framework, the usual concept of equilibrium in economics is complemented or replaced by statistical equilibrium (Garibaldi and Scalas 2010).

The original work of Pareto concerned the distribution of income (Pareto 1897). Pareto observed a skewed distribution with power-law tail. However, he also dealt with the distribution of wealth, for which he wrote:

La répartition de la richesse peut dépendre de la nature des hommes dont se compose la societé, de l’organisation de celle-ci, et aussi, en partie, du hasard (les conjonctures de Lassalle), […]

The distribution of wealth can depend on the nature of those who make up society, on the social organization and, also, in part, on chance, (the conjunctures of Lassalle), […]

More recently, Champernowne (1952), Simon (1955), Wold and Whittle (1957) as well as Mandelbrot (1961) used random processes to derive distributions for income and wealth. Starting from the late 1980s and publishing in the sociological literature, Angle introduced the so-called inequality process, a continuous-space discrete-time Markov chain for the distribution of wealth based on the surplus theory of social stratification (Angle 1986). However, the interest of physicists and mathematicians was triggered by a paper by Drǎgulescu and Yakovenko (2000) explicitly relating random exchange models to statistical physics. Among other things, they discussed a simple random exchange model already published in Italian by Bennati (1988). An exact solution of that model was published in Scalas et al. (2006). Lux wrote an early review of the statistical physics literature up to 2005 (Lux 2005), and an extensive review was written by Chakrabarti and Chakrabarti in 2010. Boltzmann-like kinetic equations for the marginal distribution of wealth were studied in Cordier et al. (2005) and several other works; we refer to the review article by Düring et al. (2009), the book by Pareschi and Toscani (2014) and the references therein.

We will focus on the essentials of random modelling for wealth distributions, and we will explicitly show how continuous-space Markov chains can be derived from discrete-space (actually finite-space) chains. We will then focus on the stability properties of these chains, and, finally, we will review the mathematical literature on kinetic equations while studying the kinetic equation related to the Markov chains. In doing so, we will deal with a stylised model for the time evolution of wealth (a stock) and not of income (a flow).

Distributional problems in economics can be presented in a rather general form. Assume one has N economic agents, each one endowed with his/her stock (for instance, wealth) \(w_{i} \geq 0\). Let \(W =\sum _{i=1}^{N}w_{i}\) be the total wealth of the set of agents. Consider the random variable \(W_{i}\), i.e. the stock of agent i. One is interested in the distribution of the vector \((W_{1},\ldots,W_{N})\) as well as in the marginal distribution of \(W_{1}\) if all agents are on a par (exchangeable). The transformation

$$\displaystyle{ X_{i} = \frac{W_{i}} {W}, }$$
(7.1)

normalises the total wealth of the system to be equal to one since

$$\displaystyle{ \sum _{i=1}^{N}X_{ i} = 1 }$$
(7.2)

and the vector \((X_{1},\ldots,X_{N})\) is a finite random partition of the interval (0, 1). The \(X_{i}\)'s are called spacings of the partition.

The following remarks are useful and justify this simplified modelling of wealth distribution.

  1. If the stock \(w_{i}\) represents wealth, it can be negative due to indebtedness. In this case, one can always shift the wealths to non-negative values by subtracting from each stock the negative wealth with the largest absolute value.

  2. A mass partition is an infinite sequence \(s = (s_{1},s_{2},\ldots )\) such that \(s_{1} \geq s_{2} \geq \cdots \geq 0\) and \(\sum _{i=1}^{\infty }s_{i} \leq 1\).

  3. Finite random interval partitions can be mapped into mass partitions, simply by ranking the spacings and appending an infinite sequence of 0s.
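The normalisation (7.1) and the ranking of remark 3 are easy to make concrete. The following Python sketch (the function names are ours and merely illustrative) computes the spacings of a random wealth vector and maps them to a truncated mass partition:

```python
import random

def spacings(weights):
    """Normalise non-negative stocks w_i to a point on the simplex (Eq. 7.1)."""
    total = sum(weights)
    return [w / total for w in weights]

def to_mass_partition(x, n_zeros=5):
    """Rank the spacings in decreasing order and pad with zeros, giving a
    truncated version of the mass partition of remarks 2 and 3."""
    return sorted(x, reverse=True) + [0.0] * n_zeros

w = [random.expovariate(1.0) for _ in range(4)]  # random positive stocks
x = spacings(w)
s = to_mass_partition(x)
assert abs(sum(x) - 1.0) < 1e-12                 # spacings sum to one
assert all(a >= b for a, b in zip(s, s[1:]))     # ranked decreasingly
```

A true mass partition is an infinite sequence; the sketch keeps only a finite number of trailing zeros, which is all that matters computationally.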

The vector \(\mathbf{X} = (X_{1},\ldots,X_{N})\) lives on the (N − 1)-dimensional simplex \(\Delta _{N-1}\), defined by

Definition 7.1 (The simplex \(\Delta _{N-1}\))

$$\displaystyle{ \Delta _{N-1} = \left \{\mathbf{x} = (x_{1},\ldots,x_{N}):\, x_{i} \geq 0\,\mbox{ for all }i = 1,\ldots,N\mbox{ and }\sum _{i=1}^{N}x_{ i} = 1\right \}\!\!.\ }$$
(7.3)

There are two natural questions that immediately arise from defining such a model.

  1. What is the distribution of the vector \((X_{1},\ldots,X_{N})\), with \(X_{i}\) given by (7.1), at a given time?

  2. What is the distribution of the random variable \(X_{1}\), the proportion of the wealth of a single individual?

One well-studied probabilistic example is to take \((W_{1},\ldots,W_{N})\) to be independent random variables such that \(W_{i} \sim\) gamma(\(\alpha _{i},\lambda\)). Then \(W =\sum _{i=1}^{N}W_{i} \sim\) gamma\(\left (\sum _{i=1}^{N}\alpha _{i},\lambda \right )\). In this case the density of \((X_{1},\ldots,X_{N})\) is the Dirichlet distribution, given by

$$\displaystyle{ f_{\mathbf{X}}(\mathbf{x}) = \frac{\Gamma (\alpha _{1} + \cdots +\alpha _{N})} {\Gamma (\alpha _{1})\cdots \Gamma (\alpha _{N})} x_{1}^{\alpha _{1}-1}\cdots x_{ N}^{\alpha _{N}-1},\quad \mathbf{x} = (x_{ 1},\ldots,x_{N}) \in \Delta _{N-1}. }$$
(7.4)

We say \(\mathbf{X} \sim \mathrm{Dir}_{N-1}(\alpha _{1},\ldots,\alpha _{N})\); the parameters \(\alpha _{1},\ldots,\alpha _{N}\) are assumed strictly positive, as they can be interpreted as the shapes of gamma random variables. A particular case is \(\alpha _{1} = \cdots =\alpha _{N} =\alpha\); the Dirichlet distribution is then called symmetric. The symmetric Dirichlet distribution with α = 1 is uniform on the simplex \(\Delta _{N-1}\).

One can now answer the two questions above using the following proposition which we present in its simplest form.

Proposition 7.1

Let \((W_{1},\ldots,W_{N})\) be i.i.d. random variables such that \(W_{i} \sim \exp (1)\). Then \(W =\sum _{i=1}^{N}W_{i} \sim\) gamma(N, 1). Define \(X_{i} = W_{i}/W\); then the vector \(\mathbf{X} = (X_{1},\ldots,X_{N})\) has the uniform distribution on the simplex \(\Delta _{N-1}\) and one-dimensional marginals \(X_{1} \sim\) beta(1, N − 1), namely,

$$\displaystyle{ f_{X_{1}}(x) = \frac{(1 - x)^{N-2}} {B(1,N - 1)}, }$$
(7.5)

where, for a, b > 0,

$$\displaystyle{ B(a,b) = \frac{\Gamma (a)\Gamma (b)} {\Gamma (a + b)}. }$$
(7.6)

The proof of this proposition can be found in several textbooks of probability and statistics including Devroye’s book (2003). Specifically, the part of Proposition 7.1 concerning the uniform distribution is a corollary of Theorem 4.1 in Devroye (2003). Equation (7.5) is a direct consequence of the aggregation property of the Dirichlet distribution.
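Proposition 7.1 is also straightforward to check by simulation. The following Python sketch (a minimal illustration; the function name is ours) samples \(X_1\) as a normalised exponential and compares the empirical tail probability with the beta(1, N − 1) prediction \(P(X_{1}> x) = (1 - x)^{N-1}\):

```python
import random

def marginal_sample(N, rng=random):
    """Sample X_1 = W_1 / (W_1 + ... + W_N) with i.i.d. W_i ~ exp(1)."""
    w = [rng.expovariate(1.0) for _ in range(N)]
    return w[0] / sum(w)

# Proposition 7.1 gives X_1 ~ beta(1, N - 1), hence
# P(X_1 > x) = (1 - x)**(N - 1).
random.seed(0)
N, x0, trials = 5, 0.3, 200_000
hits = sum(marginal_sample(N) > x0 for _ in range(trials))
empirical = hits / trials
exact = (1 - x0) ** (N - 1)
assert abs(empirical - exact) < 0.01   # Monte Carlo agreement
```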

In this chapter, we define three related models that incorporate a stochastic time evolution for the agent wealth distribution. The models increase in mathematical complexity in the order they are presented.

The first one is a discrete-time discrete-space (DD) Markov chain with a Pólya limiting invariant distribution. We keep the dynamics as simple as possible, so in fact the invariant distribution will be uniform (not a generic Pólya distribution), but the ideas and techniques are the same for more complicated versions. The Markov chain of the DD model is then generalised to a discrete-time continuous-space (DC) Markov chain. The extension is natural in the sense that the dynamics, irreducibility and invariant distribution of the DC model can be viewed as limiting cases of those of the DD model. In the process, we effectively prove that Monte Carlo algorithms will approximate the DC model well. Finally, we present a continuous-time continuous-space (CC) model for which the temporal evolution of the (random) wealth of a single individual is governed by a Boltzmann-type equation.

2 Random Dynamics on the Simplex

In order to define our simple models, we first introduce two types of moves on the simplex.

Definition 7.2 (Coagulation)

By coagulation, we denote the aggregation of the stocks of two or more agents into a single stock. This can happen in mergers, acquisitions and so on.

Definition 7.3 (Fragmentation)

By fragmentation, we denote the division of the stock of one agent into two or more stocks. This can happen in inheritance, default and so on.

2.1 Discrete-Time Continuous-Space Model: Coagulation-Fragmentation Dynamics

Before introducing the DD model, let us define the main object of our study: the DC model.

At each event time, the state of the process \(\mathbf{X} \in \Delta _{N-1}\) changes according to a composition of one coagulation and one fragmentation step.

To be precise, let X = x be the current value of the random variable X. For an ordered pair of indices i, j, with \(1 \leq i,j \leq N\) and \(i\neq j\), chosen uniformly at random, define the coagulation map \(\text{coag}_{ij}: \Delta _{N-1} \rightarrow \Delta _{N-2}\) by creating a new agent with stock \(x^{{\ast}} = x_{i} + x_{j}\), while the proportions of wealth of all other agents remain unchanged. Next enforce a random fragmentation map \(\text{frag}: \Delta _{N-2} \rightarrow \Delta _{N-1}\) that takes the \(x^{{\ast}}\) defined above and splits it into two parts as follows. Given u drawn from the uniform distribution U[0, 1], set \(x_{i}^{{\prime}} = ux^{{\ast}}\) and \(x_{j}^{{\prime}} = (1 - u)x^{{\ast}}\).

The sequence of coagulation and fragmentation operators defines a time-homogeneous Markov chain on the simplex \(\Delta _{N-1}\). Let \(\mathbf{x}(t) = (x_{1}(t),\ldots,x_{i}(t),\ldots,x_{j}(t),\ldots,x_{N}(t))\) be the state of the chain at time t, with i and j denoting the selected indices. Then the state at time t + 1 is

$$\displaystyle{ \begin{array}{rl} \mathbf{x}(t + 1)& = (x_{1}(t),\ldots,x_{i}(t + 1) = u(x_{i}(t) + x_{j}(t)),\ldots,x_{j}(t + 1) \\ & = (1 - u)(x_{i}(t) + x_{j}(t)),\ldots,x_{N}(t)).\end{array} }$$

The Markov kernel for this process is, however, degenerate, because the states reachable in one step form a subset of the simplex of Lebesgue measure zero. To avoid this technical complication for the moment, we define the same dynamics on the discrete simplex, and we then analyse the DC model.
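Despite the degeneracy of the kernel, the dynamics itself is easy to simulate. A minimal Python sketch of one coagulation-fragmentation move of the DC chain (the function name is ours and merely illustrative) reads:

```python
import random

def coag_frag_step(x, rng=random):
    """One DC move: choose an ordered pair (i, j) uniformly at random,
    coagulate the two stocks and refragment at a uniform point u."""
    N = len(x)
    i, j = rng.sample(range(N), 2)   # distinct indices, uniformly ordered
    u = rng.random()                 # u ~ U[0, 1]
    pool = x[i] + x[j]
    y = list(x)
    y[i], y[j] = u * pool, (1 - u) * pool
    return y

random.seed(1)
x = [0.25, 0.25, 0.25, 0.25]
for _ in range(1000):
    x = coag_frag_step(x)
assert abs(sum(x) - 1.0) < 1e-9      # total wealth is conserved
assert all(xi >= 0 for xi in x)
```

Each move conserves the total wealth exactly (up to floating-point error), which is the defining feature of this closed-economy model.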

2.2 Discrete-Time Discrete-Space Model

Let N denote the number of categories (individuals) into which n objects (coins or tokens) are classified (Garibaldi and Scalas 2010). In the frequency or statistical description of this system, a state is a list \(\mathbf{n} = (n_{1},\ldots,n_{N})\) with \(\sum _{i=1}^{N}n_{i} = n\) which gives the number of objects belonging to each category. In this framework, a coagulation move is defined by picking an ordered pair of integers i, j at random without replacement from \(\{1,\ldots,N\}\) and creating a new category with \(n_{i} + n_{j}\) objects. A fragmentation move takes this category and splits it into two new categories, relabelled i and j, where \(n_{i}^{{\prime}}\) is a uniform random integer between 0 and \((n_{i} + n_{j} - 1) \vee 0\) and \(n_{j}^{{\prime}} = n_{i} + n_{j} - n_{i}^{{\prime}}\). The state of the process at time \(t \in \mathbb{N}_{0}\) is denoted by X(t), and its state space is the scaled integer simplex

$$\displaystyle{S_{N-1}^{(n)}\! =\! n\Delta _{ N-1}\cap \mathbb{Z}^{N}\! =\! \left \{\!\mathbf{n} = (n_{ 1},n_{2},\ldots,n_{N}): 0 \leq n_{i} \leq n,\ \ \ \sum _{i=1}^{N}n_{ i} = n,\ \ \ n_{i} \in \mathbb{N}_{0}\right \}\!.}$$
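A minimal Python sketch of one move of the DD chain (illustrative only; the function name is ours) makes the discrete coagulation-fragmentation rule concrete:

```python
import random

def dd_step(n_vec, rng=random):
    """One DD move on the scaled integer simplex: coagulate the tokens of
    two agents, then refragment; the agent picked first receives a uniform
    number of tokens between 0 and n_i + n_j - 1 (if the pool is non-empty)."""
    N = len(n_vec)
    i, j = rng.sample(range(N), 2)
    pool = n_vec[i] + n_vec[j]
    out = list(n_vec)
    if pool >= 1:
        out[i] = rng.randrange(pool)  # uniform on {0, ..., pool - 1}
        out[j] = pool - out[i]
    return out

random.seed(2)
state = [3, 3, 3, 3]                  # n = 12 tokens, N = 4 agents
for _ in range(500):
    state = dd_step(state)
assert sum(state) == 12               # the token number is conserved
assert min(state) >= 0
```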

Remark 7.1

Note that we have seemingly introduced a slight asymmetry: the agent picked first runs the risk of ending up with zero fortune. The dynamics are nevertheless symmetric overall, since we select i before j with the same probability as selecting j before i. The reason for introducing the model in this way is to simplify the presentation and the error estimate in the proof of the weak convergence of the finite-dimensional marginals from the DD to the DC model.

Formally, with coagulation, we move from the state space \(S_{N-1}^{(n)}\) to \(S_{N-2}^{(n)}\), and then, with fragmentation, we come back to \(S_{N-1}^{(n)}\). While it is interesting to actually study all stages of the procedure, we are only interested in the aggregated wealth, and therefore we can bypass the intermediate state space by defining the process only on \(S_{N-1}^{(n)}\); it is straightforward to write down the transition probabilities for X(t)

$$\displaystyle\begin{array}{rcl} & & \mathbb{P}\{\mathbf{X}(t + 1) = \mathbf{n}^{{\prime}}\vert \mathbf{X}(t) = \mathbf{n}\} \\ & & \quad \ =\sum _{i,j:i\neq j}\left \{ \frac{1} {N} \frac{1} {N - 1}\Bigg(\frac{1\!\!1\{n_{i} + n_{j} \geq 1,n_{j}^{{\prime}}\geq 1\}} {n_{i} + n_{j}} + 1\!\!1\{n_{i} + n_{j} = 0\}\Bigg)\right. \\ & & \qquad \ \left.\times \ \delta _{n_{i}+n_{j},n_{i}^{{\prime}}+n_{j}^{{\prime}}}\prod _{k\neq i,j}\delta _{n_{k}^{{\prime}},n_{k}}\right \}. {}\end{array}$$
(7.7)

The notation is shorthand and implies that we are adding over all ordered pairs (i, j), ij where the first coordinate indicates the index i that was selected first.

The chain is time-homogeneous, as the transition (7.7) is independent of the time parameter t. It is also aperiodic since, with positive probability, during each time step the chain may coagulate and then fragment back to the same state. To see this, consider any vector \((x_{1},\ldots,x_{N})\) on the simplex. It must have at least one non-zero entry, say \(x_{1}> 0\). Select index i = 1 first (with probability \(N^{-1}\)), then select any other index j, and fragment at precisely \(x_{1},x_{j}\) (an event of probability \(1/(x_{1} + x_{j})> 0\)). Finally, the chain is irreducible, since from any point \(\mathbf{x} = (x_{1},\ldots,x_{N})\) it can move with positive probability to any of the neighbouring points \(((x_{1},\ldots,x_{N}) \pm (\mathbf{e}_{i} -\mathbf{e}_{j})) \cap S_{N-1}^{(n)}\), i.e. to any point in the simplex at \(\ell ^{1}\)-distance 2 from the current state. Therefore, we can conclude that the chain \(\{\mathbf{X}(t)\}_{t\in \mathbb{N}_{0}}\) has a unique equilibrium distribution \(\boldsymbol{\pi }\), which we identify in the next proposition.

Proposition 7.2

The invariant distribution of this Markov chain X(t) is the uniform distribution on \(n\Delta _{N-1} \cap \mathbb{Z}^{N}\) .

Proof

Define

$$\displaystyle{A_{i,j}(\mathbf{n}^{{\prime}}) = \left \{\mathbf{n}: \mathbf{n}\stackrel{\text{coag-frag}_{ i,j}}{\longrightarrow }\mathbf{n}^{{\prime}}\right \}}$$

to be the set of all simplex elements n that map to \(\mathbf{n}^{{\prime}}\) via a coagulation-fragmentation procedure in the i, j indices (i selected before j). This set is empty only when \(n_{j}^{{\prime}} = 0\) while \(n_{i}^{{\prime}}\geq 1\), but otherwise it contains at least one vector. Assuming that \(A_{i,j}(\mathbf{n}^{{\prime}})\) is not empty, we have that its cardinality is

$$\displaystyle{ \text{card}(A_{i,j}(\mathbf{n}^{{\prime}})) = (n_{ i}^{{\prime}} + n_{ j}^{{\prime}}) \vee 1. }$$
(7.8)

Using this notation, we may rewrite the transition probability in (7.7) as

$$\displaystyle\begin{array}{rcl} & & \mathbb{P}\{\mathbf{X}(t + 1) = \mathbf{n}^{{\prime}}\vert \mathbf{X}(t) = \mathbf{n}\} \\ & & \quad =\!\sum _{i,j:i\neq j}\!\left \{\! \frac{1} {N} \frac{1} {N - 1}\Bigg(\!\frac{1\!\!1\{n_{i} + n_{j} \geq 1,n_{j}^{{\prime}}\geq 1\}} {n_{i} + n_{j}} +\! 1\!\!1\{n_{i}\! +\! n_{j} = 0\}\!\Bigg)\delta _{n_{i}+n_{j},n_{i}^{{\prime}}+n_{j}^{{\prime}}}\!\prod _{k\neq i,j}\!\delta _{n_{k}^{{\prime}},n_{k}}\!\right \} \\ & & \quad =\!\sum _{i,j:i\neq j}\!\left \{\! \frac{1} {N} \frac{1} {N - 1}\Bigg(\!\frac{1\!\!1\{n_{i}^{{\prime}} + n_{j}^{{\prime}}\geq 1,n_{j}^{{\prime}}\geq 1\}} {n_{i}^{{\prime}} + n_{j}^{{\prime}}} +\! 1\!\!1\{n_{i}^{{\prime}}\! +\! n_{ j}^{{\prime}} = 0\}\!\Bigg)\delta _{ n_{i}+n_{j},n_{i}^{{\prime}}+n_{j}^{{\prime}}}\!\prod _{k\neq i,j}\!\delta _{n_{k}^{{\prime}},n_{k}}\!\right \} \\ & & \quad \phantom{xxxxxxxxxxxxnxnxnxnxnxnxnxnxnxnxnxnxxxxxxxx}\ \mbox{ from the}\ \delta _{n_{i}+n_{j},n_{i}^{{\prime}}+n_{j}^{{\prime}}}\ \mbox{ factor,} \\ & & \quad =\sum _{i,j:i\neq j}\left \{ \frac{1} {N} \frac{1} {N - 1}\Bigg( \frac{1} {(n_{i}^{{\prime}} + n_{j}^{{\prime}}) \vee 1}\Bigg)\delta _{n_{i}+n_{j},n_{i}^{{\prime}}+n_{j}^{{\prime}}}\prod _{k\neq i,j}\delta _{n_{k}^{{\prime}},n_{k}}1\!\!1\left \{\mathbf{n} \in A_{i,j}(\mathbf{n}^{{\prime}})\right \}\right \} \\ & & \quad =\sum _{i,j:i\neq j}\left \{ \frac{1} {N} \frac{1} {N - 1}\Bigg( \frac{1} {\text{card}(A_{i,j}(\mathbf{n}^{{\prime}}))}\Bigg)\delta _{n_{i}+n_{j},n_{i}^{{\prime}}+n_{j}^{{\prime}}}\prod _{k\neq i,j}\delta _{n_{k}^{{\prime}},n_{k}}1\!\!1\left \{\mathbf{n} \in A_{i,j}(\mathbf{n}^{{\prime}})\right \}\right \} \\ & & \quad =\sum _{i,j:i\neq j}\left \{ \frac{1} {N} \frac{1} {N - 1}\Bigg( \frac{1} {\text{card}(A_{i,j}(\mathbf{n}^{{\prime}}))}\Bigg)1\!\!1\left \{\mathbf{n} \in A_{i,j}(\mathbf{n}^{{\prime}})\right \}\right \} \\ & & \quad \phantom{xxxxxxxxxxxxxxxnxnxnxnxxx}\ \mbox{ since the}\ \delta \ \mbox{ product is equivalent to the last 
indicator.}{}\end{array}$$
(7.9)

Now fix \(\mathbf{n}^{{\prime}}\) and add up all the transition probabilities in (7.9) over n. We get

$$\displaystyle\begin{array}{rcl} & & \sum _{\mathbf{n}}\sum _{i,j:i\neq j}\left \{ \frac{1} {N} \frac{1} {N - 1}\Bigg( \frac{1} {\text{card}(A_{i,j}(\mathbf{n}^{{\prime}}))}\Bigg)1\!\!1\{\mathbf{n} \in A_{i,j}(\mathbf{n}^{{\prime}})\}\right \} {}\\ & & \qquad \qquad \ \ =\sum _{i,j:i\neq j}\left \{ \frac{1} {N} \frac{1} {N - 1}\Bigg( \frac{1} {\text{card}(A_{i,j}(\mathbf{n}^{{\prime}}))}\Bigg)\sum _{\mathbf{n}}1\!\!1\{\mathbf{n} \in A_{i,j}(\mathbf{n}^{{\prime}})\}\right \} {}\\ & & \qquad \qquad \ \ =\sum _{i,j:i\neq j}\left \{ \frac{1} {N} \frac{1} {N - 1}\Bigg( \frac{1} {\text{card}(A_{i,j}(\mathbf{n}^{{\prime}}))}\Bigg)\text{card}(A_{i,j}(\mathbf{n}^{{\prime}}))\right \} {}\\ & & \qquad \qquad \ \ = 1. {}\\ \end{array}$$

Therefore, the transition matrix is doubly stochastic, and in particular the invariant distribution must be uniform.

2.3 Convergence of the Finite Markov Chain as the Overall Wealth Increases

Reaching a similar conclusion in the case of the DC model is slightly more complicated. The difficulty is related to the fact that time changes in discrete steps while the state space is a continuum, so the chain cannot visit every available state: the real numbers cannot be put in one-to-one correspondence with the integers. How can we then be sure that the Markov chain with continuous state space explores its state space uniformly? We begin our analysis by studying the convergence of the finite-state-space Markov chain to the continuous-state-space Markov chain.

Let \(\mathbf{X}^{(n)}\) be the DD Markov chain for wealth, when the total wealth of the system is n, and let \(\mathbf{X}^{(\infty )}\) be the chain for the DC model introduced in Sect. 7.2.1. We scale the state space of each process \(\mathbf{X}^{(n)}\) so that it is a subset of \(\Delta _{N-1}\) by defining a new, coupled process

$$\displaystyle{\mathbf{Y}^{(n)} = n^{-1}\mathbf{X}^{(n)}.}$$

The state space for the process Y (n) is the simplex

$$\displaystyle{\Delta _{N-1}(n) =\{ (q_{1},\ldots,q_{N}): 0 \leq q_{i} \leq 1,\,q_{1} +\ldots +q_{N} = 1,\,nq_{i} \in \mathbb{N}_{0}\} \subset \Delta _{N-1}.}$$

It can be considered as a partition of \(\Delta _{N-1}\) with mesh \(n^{-1}\), i.e. inversely proportional to the total wealth.

In this section we first prove weak convergence of the one-dimensional marginals

$$\displaystyle{\mathrm{Y}_{k}^{(n)}\stackrel{n \rightarrow \infty }{\Longrightarrow}\mathrm{X}_{ k}^{(\infty )},\mbox{ for all }k \in \mathbb{N}}$$

and then prove the existence of a unique invariant distribution for \(\boldsymbol{X}^{(\infty )}\) (the DC model) that we identify as the uniform distribution on \(\Delta _{N-1}\).

Let \(\mu _{0}^{(n)}\) be the initial distribution of \(\mathbf{Y}_{0}^{(n)}\) and \(\mu _{0}^{(\infty )}\) the initial distribution of \(\mathbf{X}_{0}^{(\infty )}\).

Proposition 7.3

Assume the weak convergence \(\mu _{0}^{(n)} \Rightarrow \mu _{0}^{(\infty )}\) as n →∞. Then for each \(k \in \mathbb{N}\) , we have weak convergence of the one-dimensional marginals

$$\displaystyle{Y _{k}^{(n)}\Longrightarrow X_{ k}^{(\infty )}\,\mathit{\mbox{ as }}n \rightarrow \infty.}$$

Proof

We first show this for k = 1 and then show it in general with an inductive argument. Let f be a bounded continuous function on \(\Delta _{N-1}(n)\). Let U be a uniform random variable on [0, 1], and define the bounded and continuous \(F_{i,j}: \Delta _{N-1} \rightarrow \mathbb{R}\) by

$$\displaystyle{F_{i,j}(x_{1},\ldots,x_{d}) = \mathbb{E}^{U}(f(x_{ 1},\ldots,U(x_{i} + x_{j}),x_{i+1},\ldots,(1 - U)(x_{i} + x_{j}),x_{j+1},\ldots,x_{d})).}$$

Pick an ɛ > 0. By compactness, we can find a δ = δ(ɛ) > 0 such that whenever \(\|x - y\|_{1} <\delta\) we have that

$$\displaystyle{\sup _{\{i,j\}}\vert F_{ij}(x) - F_{ij}(y)\vert + \vert f(x) - f(y)\vert <\varepsilon.}$$

From this relation, choose n large enough so that the discrete simplex \(\Delta _{N-1}(n)\) is fine enough, namely, any two neighbouring points \(x^{(n)},y^{(n)}\) satisfy \(\|x^{(n)} - y^{(n)}\|_{1} <\delta\). In particular, this implies that \(n> 2\delta ^{-1}\).

The function F i, j evaluated on the partition points is

$$\displaystyle\begin{array}{rcl} F_{i,j}(x^{(n)})& \!=\!& \int _{ 0}^{1}f(x_{ 1}^{(n)},\ldots,u(x_{ i}^{(n)} + x_{ j}^{(n)}),\ldots,(1 - u)(x_{ i}^{(n)} + x_{ j}^{(n)}),\ldots,x_{ d})\,du {}\\ & \!=\!& \left \{\begin{array}{@{}l@{\quad }l@{}} f(x^{(n)}),\quad \quad \quad x_{i}^{(n)} + x_{j}^{(n)} = 0 \quad \\ \! \frac{1} {x_{i}^{(n)}+x_{j}^{(n)}} \!\int _{0}^{x_{i}^{(n)}+x_{ j}^{(n)} }\!f(x_{1}^{(n)},\ldots,s,\ldots,x_{ i}^{(n)}\! +\! x_{ j}^{(n)}\! - s,\ldots,x_{ d})\,ds,\ \ \mbox{ otherwise.}\quad \end{array} \right.{}\\ \end{array}$$

Focus on the integral of the second branch for a moment. We discretise the integral on \(\Delta _{N-1}(n)\) with s-values \(0,1/n,\ldots,x_{i}^{(n)} + x_{j}^{(n)} - 1/n\). Then

$$\displaystyle\begin{array}{rcl} & & \Big\vert \int _{k/n}^{(k+1)/n}f(x_{ 1}^{(n)},\ldots,s,\ldots,x_{ i}^{(n)} + x_{ j}^{(n)} - s,\ldots,x_{ d})\,ds {}\\ & & \qquad \qquad \phantom{xxxxxxxxxxx} - n^{-1}f(x_{ 1}^{(n)},\ldots,k/n,\ldots,x_{ i}^{(n)} + x_{ j}^{(n)} - k/n,\ldots,x_{ d})\Big\vert <\varepsilon /n. {}\\ \end{array}$$

Therefore, the overall error satisfies

$$\displaystyle{ \left \vert F_{i,j}(x^{(n)}) -\sum _{ k=0}^{n(x_{i}^{(n)}+x_{ j}^{(n)})-1 }\frac{f(x_{1}^{(n)},\ldots,k/n,\ldots,x_{i}^{(n)} + x_{j}^{(n)} - k/n,\ldots,x_{d})} {n(x_{i}^{(n)} + x_{j}^{(n)})} \right \vert <\varepsilon. }$$
(7.10)

Now we turn to prove the weak convergence:

$$\displaystyle\begin{array}{rcl} & & \mathbb{E}(f(Y _{1}^{(n)})) =\sum _{ x\in \Delta _{N-1}(n)}f(x)\mathbb{P}\{Y _{1}^{(n)} = x\} {}\\ & & \qquad \qquad \quad =\sum _{x\in \Delta _{N-1}(n)}\sum _{y\in \Delta _{N-1}(n)}f(x)\mathbb{P}\{Y _{1}^{(n)} = x\vert Y _{ 0}^{(n)} = y\}\mu _{ 0}^{(n)}(y) {}\\ & & \qquad \qquad \quad =\sum _{y\in \Delta _{N-1}(n)}\mu _{0}^{(n)}(y)\sum _{ x\in \Delta _{N-1}(n)}f(x)\mathbb{P}\{X_{1}^{(n)} = x\vert X_{ 0}^{(n)} = y\} {}\\ & & \qquad \qquad \quad =\sum _{y\in \Delta _{N-1}(n)}\mu _{0}^{(n)}(y)\sum _{ i,j:,i\neq j}\quad \sum _{\begin{array}{c}x_{i}+x_{j}=y_{i}+y_{j} \\ x_{k}=y_{k} \end{array}}f(x)\mathbb{P}\{Y _{1}^{(n)} = x\vert Y _{ 0}^{(n)} = y\} {}\\ & & \qquad \qquad \quad = \frac{1} {N(N - 1)}\sum _{y\in \Delta _{N-1}(n)}\mu _{0}^{(n)}(y)\ \sum _{ i,j:i\neq j}\ \sum _{\begin{array}{c}x_{i}+x_{j}=y_{i}+y_{j} \\ x_{k}=y_{k} \end{array}}f(x_{1},\ldots x_{i},\ldots,x_{j},\ldots,x_{d}) {}\\ & & \qquad \qquad \quad \quad \, \times \Big (\frac{1\!\!1\{y_{i} + y_{j} \geq n^{-1},x_{j} \geq n^{-1}\}} {n(y_{i} + y_{j})} + 1\!\!1\{y_{i} + y_{j} = 0\}\Big) {}\\ & & \qquad \qquad \quad = \frac{1} {N(N - 1)}\sum _{y\in \Delta _{N-1}(n)}\mu _{0}^{(n)}(y) {}\\ & & \qquad \qquad \quad \quad \, \times \!\sum _{i,j:i\neq j}\Bigg\{\!\sum _{k=0}^{n(y_{i}\!+\!y_{j})-1}\!f(y_{ 1},\ldots,n^{-1}k,\ldots,y_{ i} + y_{j} - n^{-1}k,\ldots,y_{ d}) \frac{1} {n(y_{i} + y_{j})} {}\\ & & \qquad \quad \phantom{XXXXXXXXXx} + f(y_{1},\ldots,0,\ldots,0,\ldots,y_{d})1\!1\{y_{i} + y_{j} = 0\}\Bigg\} {}\\ & & \qquad \qquad \quad = \frac{1} {N(N - 1)}\sum _{y\in \Delta _{N-1}(n)}\mu _{0}^{(n)}(y) {}\\ & & \qquad \qquad \quad \quad \, \times \sum _{i,j:i\neq j}F_{i,j}(y)1\!\!1\{y_{i} + y_{j}> 0\} + O(\varepsilon ) + F_{i,j}(y)1\!\!1\{y_{i} + y_{j} = 0\} {}\\ & & \qquad \qquad \quad = \frac{1} {N(N - 1)}\sum _{y\in \Delta _{N-1}(n)}\mu _{0}^{(n)}(y)\sum _{ i,j:i\neq j}F_{i,j}(y) + O(\varepsilon ) {}\\ & & \qquad \qquad \quad = \frac{1} {N(N - 1)}\sum _{i,j:i\neq 
j}\mathbb{E}^{\mu _{0}^{(n)} }(F_{i,j}) + O(\varepsilon ). {}\\ \end{array}$$

Now let n and recall that F i, j is bounded and continuous to conclude that

$$\displaystyle\begin{array}{rcl} & & -C\varepsilon \leq \underline{\lim }_{n\rightarrow \infty }\mathbb{E}(f(Y _{1}^{(n)})) - \frac{1} {N(N - 1)}\sum _{i,j:i\neq j}\mathbb{E}^{\mu _{0}^{(\infty )} }(F_{i,j}) \leq \overline{\lim }_{n\rightarrow \infty }\mathbb{E}(f(Y _{1}^{(n)})) {}\\ & & \quad \ \ - \frac{1} {N(N - 1)}\sum _{i,j:i\neq j}\mathbb{E}^{\mu _{0}^{(\infty )} }(F_{i,j}) \leq C\varepsilon. {}\\ \end{array}$$

where C is a constant independent of n that comes from the error term. Let ɛ → 0 to conclude the limit exists and observe that the definition of F i, j and the disintegration theorem imply that

$$\displaystyle{\lim _{n\rightarrow \infty }\mathbb{E}(f(Y _{1}^{(n)})) = \frac{1} {N(N - 1)}\sum _{i,j:i\neq j}\mathbb{E}^{\mu _{0}^{(\infty )} }(F_{i,j}) = \mathbb{E}(f(X_{1}^{(\infty )})).}$$

Therefore, we have now shown that \(\mu _{1}^{(n)} \Rightarrow \mu _{1}^{(\infty )}\) whenever \(\mu _{0}^{(n)} \Rightarrow \mu _{0}^{(\infty )}\). An inductive construction and the Markov property are enough to guarantee that all one-dimensional marginals converge.
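Proposition 7.3 lends itself to a numerical illustration. The following Python sketch (function names are ours; tolerances chosen generously) starts both chains at the barycentre, runs k = 2 steps, and compares the empirical distribution functions of the first coordinate at a continuity point:

```python
import random

def dc_run(N, k, rng):
    """First coordinate of the DC chain after k steps from the barycentre."""
    x = [1.0 / N] * N
    for _ in range(k):
        i, j = rng.sample(range(N), 2)
        u, pool = rng.random(), x[i] + x[j]
        x[i], x[j] = u * pool, (1 - u) * pool
    return x[0]

def dd_run(N, n, k, rng):
    """First coordinate of the scaled DD chain Y^(n) after k steps."""
    v = [n // N] * N                   # barycentre; choose n divisible by N
    for _ in range(k):
        i, j = rng.sample(range(N), 2)
        pool = v[i] + v[j]
        if pool >= 1:
            v[i] = rng.randrange(pool)
            v[j] = pool - v[i]
    return v[0] / n

rng = random.Random(4)
N, n, k, runs = 3, 9999, 2, 40_000
p_dc = sum(dc_run(N, k, rng) <= 0.5 for _ in range(runs)) / runs
p_dd = sum(dd_run(N, n, k, rng) <= 0.5 for _ in range(runs)) / runs
assert abs(p_dc - p_dd) < 0.02         # the two marginals nearly agree
```

The discretisation error of the DD chain is of order \(n^{-1}\), so for large n the two empirical probabilities differ only by Monte Carlo noise.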

2.4 Irreducibility, Uniqueness of the Invariant Measure and Stability

We can now proceed to the study of the irreducibility, the uniqueness of the invariant measure and the stability of the continuous-space Markov chain.

We begin with a proposition that will simplify the mathematical technicalities associated with general state-space discrete-time Markov chains.

Proposition 7.4 (Duality of coagulation and fragmentation)

Let X(t) denote the coagulation-fragmentation Markov chain defined in Sect.  7.2.1 . If \(\mathbf{X}(t) \sim U[\Delta _{N-1}]\) then \(\mathbf{X}(t + 1) \sim U[\Delta _{N-1}]\) , as well.

Proof

See Bertoin (2006) chapter 2, corollary 2.1, page 77.

This proposition means that the uniform distribution on the simplex \(\Delta _{N-1}\) is an invariant distribution for the coagulation-fragmentation chain.

What we prove in the sequel is that this is the unique invariant measure and the transition kernels converge to it in the total variation norm. With this goal in mind, we begin with some definitions.

Definition 7.4 (Phi-irreducibility)

Let \((S,\mathcal{B}(S),\phi )\) be a measured Polish space. A discrete-time Markov chain X on S is ϕ-irreducible if and only if for any Borel set A, the following implication holds:

$$\displaystyle{\phi (A)> 0\Longrightarrow L(u,A)> 0,\quad \mbox{ for all }u \in S.}$$

Above, we used notation

$$\displaystyle{L(u,A) = P_{u}\{X_{n} \in A\mbox{ for some }n\} = P\{X_{n} \in A\,\mbox{ for some }n\vert \,X_{0} = u\}.}$$

This replaces the notion of irreducibility for discrete Markov chains and means that the chain visits any set of positive measure with positive probability.

The existence of a Foster-Lyapunov function V, defined as follows,

Definition 7.5 (Foster-Lyapunov function)

For a petite set C, we can find a function V ≥ 0 and a ρ > 0 so that for all \(x \in S\)

$$\displaystyle{ \int P(x,dy)V (y) \leq V (x) - 1 +\rho 1\!\!1_{C}(x), }$$
(7.11)

implies convergence of the kernel P of a ϕ-irreducible, aperiodic chain to a unique equilibrium measure π

$$\displaystyle{ \sup _{A\in \mathcal{B}(S)}\left \vert P^{n}(x,A) -\pi (A)\right \vert \rightarrow 0,\mbox{ as }n \rightarrow \infty. }$$
(7.12)

(see Meyn and Tweedie 1993) for all x for which \(V (x) <\infty\). If we define \(\tau _{C}\) to be the number of steps it takes the chain to return to the set C, then the existence of a Foster-Lyapunov function (and therefore convergence to a unique equilibrium) is equivalent to \(\tau _{C}\) having finite expectation, i.e.

$$\displaystyle{\sup _{x\in C}\mathbf{E}_{x}(\tau _{C}) <M_{C}}$$

which in turn is implied when τ C has geometric tails. This is in fact what we prove in the following.

In our case, ϕ will be the Lebesgue measure, and the role of the petite set C will be played by any set with positive Lebesgue measure. This useful simplification of the mathematical technicalities is an artefact of the compact state space (\(\Delta _{N-1}\)) and the fact that the uniform distribution on the simplex is invariant for the chain (Proposition 7.4).

Proposition 7.5

The discrete-time chain \(\mathbf{X} =\{ \mathbf{X}(t)\}_{t\in \mathbb{N}}\) as defined in Sect.  7.2.1 is ϕ-irreducible, where \(\phi \equiv \lambda _{N-1}\) is the Lebesgue measure on the simplex.

At this point it is useful to explain the idea of the proof of Proposition 7.5 in the case of deterministic dynamics. We do this in the (easy-to-visualise) case N = 3, while the proof is carried out in general, with Markov dynamics. For any pair \(u,v \in \Delta _{2}^{\circ }\), there is a deterministic way to move from \(u = (x_{u},y_{u},z_{u})\) to \(v = (x_{v},y_{v},z_{v})\) in precisely two steps. The same happens in higher dimensions: on \(\Delta _{N-1}^{\circ }\), we can move from any starting point to any target point using deterministic coagulation-fragmentation dynamics in precisely N − 1 steps.

Since the dynamics are symmetric with respect to the coordinates, we may assume without loss of generality that \(z_{u} \leq 2/3\), and therefore there exists an entry of v, say \(x_{v}\), such that \(m_{1} = \frac{x_{v}} {1 - z_{u}} \leq 1.\) Furthermore, \(m_{2} = \frac{y_{v}} {1 - x_{v}} \leq 1.\) Then consider the mapping

$$\displaystyle\begin{array}{rcl} & & u = (x_{u},y_{u},z_{u})\mapsto (m_{1}(x_{u} + y_{u}),(1 - m_{1})(x_{u} + y_{u}),z_{u}) \\ & & \qquad \ \mapsto \left (m_{1}(x_{u} + y_{u}),m_{2}[(1 - m_{1})(x_{u} + y_{u})\! +\! z_{u}],(1 - m_{2})[(1 - m_{1})(x_{u} + y_{u})\! +\! z_{u}]\right ) \\ & & \qquad \ = (m_{1}(1 - z_{u}),m_{2}[(1 - m_{1})(1 - z_{u}) + z_{u}],(1 - m_{2})[(1 - m_{1})(1 - z_{u}) + z_{u}]) \\ & & \qquad \ = (x_{v},y_{v},z_{v}) = v. {}\end{array}$$
(7.13)

This idea captures the proof of the Lebesgue irreducibility (see also Fig. 7.1).

Fig. 7.1
figure 1

Schematic of a possible coagulation-fragmentation route from u to v in two steps. Starting from point \(u \in \Delta _{2}\), fix \(z_{u}\). Then, on the line \(x + y = 1 - z_{u}\), pick the point \((x_{v},1 - z_{u} - x_{v},z_{u})\). From there, fix \(x_{v}\) and choose \((y_{v},z_{v})\) on the line \(1 - x_{v} = y_{v} + z_{v}\). The shaded region consists of all points v that can be reached from u with this procedure, first by fixing \(z_{u}\) and then \(x_{v}\). Points in the white region can be reached from u by first fixing \(z_{u}\) and then \(y_{v}\).
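The deterministic two-step route of calculation (7.13) can be verified numerically. The following Python sketch (function name ours; it assumes \(z_{u} \leq 2/3\), with the second fragmentation point chosen so that the split reproduces \(y_v\)) reconstructs v from u in exactly two coagulation-fragmentation steps:

```python
def two_step_route(u, v):
    """Deterministic two-step coagulation-fragmentation route (7.13) on the
    simplex: first fragment (x_u, y_u) keeping z_u fixed, then fragment the
    remaining mass 1 - x_v so as to reproduce y_v."""
    xu, yu, zu = u
    xv, yv, zv = v
    m1 = xv / (1 - zu)                # well defined for z_u < 1; m1 <= 1 here
    m2 = yv / (1 - xv)                # always <= 1 since y_v <= 1 - x_v
    # step 1: coagulate x_u and y_u, fragment at m1
    w = (m1 * (xu + yu), (1 - m1) * (xu + yu), zu)
    # step 2: coagulate the last two entries, fragment at m2
    pool = w[1] + w[2]
    return (w[0], m2 * pool, (1 - m2) * pool)

u = (0.2, 0.3, 0.5)                   # z_u = 0.5 <= 2/3
v = (0.4, 0.25, 0.35)
assert all(abs(a - b) < 1e-12 for a, b in zip(two_step_route(u, v), v))
```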

Proof (Proof of Proposition 7.5)

First observe that excluding one coordinate (say x 1) from the coagulation process in \(\Delta _{N-1}\), we are merely restricting the dynamics to \((1 - x_{1})\Delta _{N-2}\). This observation is what allows us to proceed by way of induction.

Base case N = 3. We choose the base case N = 3 for purposes of clarity, in a way that can be immediately generalised to higher dimensions. We are working on \((\Delta _{2},\mathcal{B}(\Delta _{2}),\lambda _{2} \equiv \lambda \otimes \lambda )\).

Let A be a Borel set and assume λ 2(A) = α > 0. We will show that starting from any x, the probability of hitting A in just two steps with the coagulation-fragmentation dynamics described above is strictly positive.

For any δ > 0 and point u, let \(\delta \Delta _{2}(u)\) denote the scaled simplex with side length \(\delta \sqrt{2}\) and barycentre u.

Since A has positive Lebesgue measure, for any ɛ > 0 we can find an open set \(G_{A,\varepsilon } \supseteq A\) so that \(\lambda _{2}(G_{A,\varepsilon }\setminus A) <\varepsilon\). Fix an ɛ > 0 and construct \(G_{A,\varepsilon }\). Enumerate all rationals in \(G_{A,\varepsilon }\) and find δ = δ(A, ɛ) > 0 so that

$$\displaystyle{A \subseteq \bigcup _{q\in G_{A,\varepsilon }}(\delta \Delta _{2}(q)),\quad \mbox{ and}\quad \lambda _{2}\Big(\bigcup _{q\in G_{A,\varepsilon }}(\delta \Delta _{2}(q))\Big) \leq \alpha +2\varepsilon.}$$

Without loss of generality, we may assume that \(\delta \Delta _{2}(q) \cap A\neq \varnothing\) for all q in the union; otherwise we remove the extraneous simplexes from the union. Since the union is countable, there must be a barycentre q 0 such that

$$\displaystyle{\beta:=\lambda _{2}(\delta \Delta _{2}(q_{0}) \cap A)> 0.}$$

Let u be an arbitrary starting point of the process. Without loss of generality, and by possibly decreasing our initial choice of δ, assume

  1. z u ≤ 2∕3,

  2. u can be deterministically mapped to any point \(v \in \delta \Delta _{2}(q_{0}) \cap A\) by first fixing z u and then x v , as in calculation (7.13).

We denote the three corners of \(\delta \Delta _{2}(q_{0})\) by a = (x a , y a , z a ), b = (x a − δ, y a + δ, z a ) and c = (x a − δ, y a , z a + δ). Then,

$$\displaystyle\begin{array}{rcl} 0 <\beta & =& \int _{A\cap \delta \Delta _{2}(q_{0})}\,d\lambda _{2} =\int \int _{A\cap \delta \Delta _{2}(q_{0})}\,d\lambda _{1}\,d\lambda _{1} {}\\ & =& \int d\lambda _{1}\Big(1\!\!1\{x_{a}-\delta <x <x_{a}\}\int d\lambda _{1}1\!\!1\{A \cap \Delta _{2}(q_{0}) \cap \{ z + y = 1 - x\}\}\Big) {}\\ & =& \int \!d\lambda _{1}\Big(\!1\!\!1\{x_{a}-\delta <x <x_{a}\}\!\!\int \!d\lambda _{1}1\!\!1\{(x,y,z)\! \in \! A \cap \Delta _{2}(q_{0})\!:\! y + z = 1\! -\! x\}\!\Big). {}\\ \end{array}$$

Thus, for a positive λ 1 measure of x ∈ (x a − δ, x a ), we can find a positive λ 1 measure of the intersection between the set A and the line z + y = 1 − x. Hence, we restrict to the measurable set F = {x ∈ [x a − δ, x a ]: γ x > n −1} where

$$\displaystyle{\gamma _{x} =\lambda _{1}\{A \cap \delta \Delta _{2}(q_{0}) \cap \{ y + z = 1 - x\}\}.}$$

The integer n is chosen large enough so that λ 1(F) > 0. We have established the existence of a set C so that

$$\displaystyle{C =\{ (x,y,z): x \in F,(x,y,z) \in A \cap \delta \Delta _{2}(q_{0})\},\quad \lambda _{2}(C)> 0.}$$

This is enough to finally complete the proof of the base case. Recall the starting point u = (x u , y u , z u ) of the Markov chain. Define the projection set

$$\displaystyle{F_{C,u} =\{ (x,y,z): z = z_{u},\exists (x_{0},y_{0},z_{0}) \in C\mbox{ s.t. }y+z_{u} = y_{0}+z_{0} = 1-x_{0},x = x_{0}\},}$$

which has positive measure as λ 1(F) = λ 1(F C, u ).

With strictly positive probability, we choose to coagulate the first and second coordinates. Then with strictly positive probability, we fragment into the set F C, u . This is because the coagulation-fragmentation process of two coordinates picks a uniformly distributed point on the line x + y = 1 − z u by construction. The uniform distribution is a scalar multiple of the Lebesgue measure, thus guaranteeing that the probability of selecting such a point is strictly positive. Then, given the chain’s current position, with strictly positive probability, we coagulate the last two coordinates. For the same reason as before, with strictly positive probability, we terminate in the set C ⊆ A, since for any fixed x ∈ F, the fragmentation has probability no less than 1∕n to pick a point (x, y 0, z 0) ∈ C.

To conclude, in just two steps, we have a positive probability of hitting A from any starting point u.
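The two-step construction can be illustrated in simulation. The following Python sketch (illustrative only; the function name cf_step and the chosen sets are ours, not from the text) implements one coagulation-fragmentation step — pick a pair of coordinates uniformly at random, pool their mass and refragment it uniformly — and checks empirically that a fixed starting point reaches a set of positive measure within two steps.

```python
import random

def cf_step(x, rng=random):
    """One coagulation-fragmentation step on the simplex:
    pick two coordinates, pool their mass, split it uniformly."""
    i, j = rng.sample(range(len(x)), 2)
    s = x[i] + x[j]
    u = rng.uniform(0.0, s)          # uniform fragmentation of the pooled mass
    y = list(x)
    y[i], y[j] = u, s - u
    return y

random.seed(0)
u0 = [0.2, 0.3, 0.5]
hits = 0
for _ in range(10_000):
    x = cf_step(cf_step(u0))         # two steps from the same starting point
    assert abs(sum(x) - 1.0) < 1e-9 and min(x) >= 0.0
    # an example target set A of positive measure: first coordinate in (0.3, 0.4)
    if 0.3 < x[0] < 0.4:
        hits += 1
print(hits > 0)   # A is hit within two steps with positive probability
```

The empirical hit frequency is of course only an illustration of the positivity claim, not a proof.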

Induction case: Now consider the simplex \(\Delta _{N-1}\), when N ≥ 4, and assume that the proposition is true for all k < N. Let \(A \subseteq \Delta _{N-1}\) be a Borel set of positive λ N−1 measure. As for the base case, we can find a simplex \(\delta \Delta _{N-1}(q_{0})\) such that \(\lambda _{N-1}(A \cap \delta \Delta _{N-1}(q_{0}))> 0\) and with the same Fubini-Tonelli argument conclude that there exists a positive integer n and a positive λ 1-measure set F of x values so that

$$\displaystyle{\lambda _{N-2}(A \cap \delta \Delta _{N-1}(q_{0}) \cap \{ x = x_{0} \in F\})> n^{-1}.}$$

Without loss of generality, assume that from the starting point u, we can coagulate and fragment two coordinates, say u 1 and u 2, so that λ 1{xF, x < u 1 + u 2} > 0. Then, for the Markov chain, this implies

$$\displaystyle{ P_{u}\{X_{1} \cdot e_{1} \in F\}> 0. }$$
(7.14)

Let

$$\displaystyle{B =\{ X_{2},X_{3},\ldots,X_{N-1}\;\mbox{ do not coagulate the first coordinate}\}.}$$

Again,

$$\displaystyle{ P_{u}\{B\vert X_{1} \cdot e_{1} \in F\} = P_{u}\{B\}> 0. }$$
(7.15)

Then a direct computation gives

$$\displaystyle\begin{array}{rcl} L(u,A)& =& P_{u}\{X_{\ell} \in A\mbox{ for some }\ell\} \geq P_{u}\{X_{N-1} \in A\} {}\\ & \geq & P_{u}\{X_{N-1} \in A,X_{1} \cdot e_{1} \in F,B\} {}\\ & \geq & P_{u}\{X_{N-1} \in A\vert B,X_{1} \cdot e_{1} \in F\}P_{u}\{B\vert X_{1} \cdot e_{1} \in F\}P_{u}\{X_{1} \cdot e_{1} \in F\}> 0. {}\\ \end{array}$$

Strict positivity of the last two factors follows from (7.14) and (7.15), while P u {X N−1 ∈ A | B, X 1 ⋅ e 1 ∈ F} equals the probability that the N − 2 dimensional coagulation-fragmentation process starting from a random point u 0 with \(x_{u_{0}} \in F\) hits the set \(A \cap \delta \Delta _{N-1}(q_{0}) \cap \{ x = x_{0}\}\) in N − 2 steps. By the induction hypothesis, this probability is strictly positive (given the starting point). By restricting the set F so that its measure remains positive, we may further assume that these probabilities are uniformly bounded away from 0, independently of the starting point. This concludes the proof.

Proposition 7.6 (Existence of a Foster-Lyapunov function)

The return times τ A to any set \(A \in \mathcal{ B}(\Delta _{N-1})\) of positive measure have at most geometric tails, under \(P_{x_{0}}\) . As a consequence, a Foster-Lyapunov function exists.

Proof (of Proposition 7.6)

Let A be a positive Lebesgue measure set. By repeating the construction in the proof of Proposition 7.5 for all coordinates, we can find α i > 0, n i > 0, 1 ≤ i ≤ N, δ > 0, a rational barycentre q 0 and a sequence of measurable sets

$$\displaystyle{A \supseteq A_{1} \supseteq A_{2} \supseteq \ldots \supseteq A_{N}}$$

of positive measure, λ N−1(A N ) = η > 0, and a collection of one-dimensional measurable sets F 1, …, F N with the following properties:

  1. \(\lambda _{N-2}\{A \cap \delta \Delta _{N-1}(q_{0}) \cap \{ x_{1} = x_{1}^{{\ast}}\in F_{1}\}\} =\gamma (x_{1}^{{\ast}})> n_{1}^{-1},\quad \lambda _{1}(F_{1}) \geq \alpha _{1}\), \(A_{1} =\{ A \cap \delta \Delta _{N-1}(q_{0}),x_{1} \in F_{1}\}\)

  2. \(\lambda _{N-2}\{A_{k-1} \cap \delta \Delta _{N-1}(q_{0}) \cap \{ x_{k} = x_{k}^{{\ast}}\in F_{k}\}\} =\gamma (x_{k}^{{\ast}})> n_{k}^{-1},\quad \lambda _{1}(F_{k}) \geq \alpha _{k}\), \(A_{k} =\{ A_{k-1} \cap \delta \Delta _{N-1}(q_{0}),x_{k} \in F_{k}\},\quad k \geq 2.\)

The basic property of A N is that it is accessible with positive probability (depending on A), uniformly bounded from below over all starting points \(u_{0} \in \Delta _{N-1}\). Let α = min{α 1, …, α N , 1} and n 0 = max{n 1, …, n N }. We bound from above the probability that we do not hit A N in the first N − 1 steps, i.e. \(P_{u_{0}}\{\tau _{A_{N}}> N - 1\}\). Suppose we hit in N − 1 steps or fewer. Then there is at least one sequence of N − 1 coagulation-fragmentation steps which, if we follow it, lands us in A N . We select the appropriate pair of indices at each step with probability 2∕(N(N − 1)), and, given this, we fragment at an appropriate point with probability at least α. Therefore

$$\displaystyle{ \inf _{x\in \Delta _{N-1}}P_{x}\{\tau _{A_{N}} \leq N - 1\} \geq \left ( \frac{2\alpha } {N(N - 1)}\right )^{N-1} =\rho _{ A}> 0. }$$

Pick an arbitrary starting point u 0 . Then \(P_{u_{0}}(\tau _{A}> M) \leq P_{u_{0}}(\tau _{A_{N}}> M)\), since A N ⊆ A. We will show that the larger tail is bounded above geometrically, by an expression independent of u 0 . We compute

$$\displaystyle\begin{array}{rcl} P_{x}\{\tau _{A_{N}}> (N - 1)M\}& =& P_{x}\{X_{1}\notin A_{N},\ldots,X_{(N-1)M}\notin A_{N}\} {}\\ & \leq & \left (\sup _{u\in \Delta _{N-1}\setminus A_{N}}P_{u}\{X_{1}\notin A_{N},\ldots,X_{N-1}\notin A_{N}\}\right )^{M} {}\\ & \leq & \left (\sup _{u\in \Delta _{N-1}\setminus A_{N}}P_{u}\{\tau _{A_{N}}> N - 1\}\right )^{M} {}\\ & \leq & (1 -\rho _{A})^{M}. {}\\ \end{array}$$

Finally, since for any 1 ≤ k ≤ N − 1 we have \(\{\tau _{A_{N}}> (N - 1)(M + 1)\} \subseteq \{\tau _{A_{N}}> (N - 1)M + k\} \subseteq \{\tau _{A_{N}}> (N - 1)M\}\), we conclude that \(\tau _{A_{N}}\) has geometric tails.
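The geometric decay of the return-time tails can also be observed numerically. Below is a small Python sketch (an illustration only; the helper cf_step and the target set are our own choices), simulating the pairwise uniform coagulation-fragmentation dynamics on \(\Delta_{2}\) and recording successive return times to a set of positive measure.

```python
import random

def cf_step(x, rng=random):
    """Pick a pair of coordinates, pool their mass, split it uniformly."""
    i, j = rng.sample(range(len(x)), 2)
    s = x[i] + x[j]
    u = rng.uniform(0.0, s)
    y = list(x)
    y[i], y[j] = u, s - u
    return y

random.seed(1)
in_A = lambda x: x[0] < 0.2          # an example target set of positive measure
x = [0.5, 0.3, 0.2]
returns, steps = [], 0
for _ in range(200_000):
    x = cf_step(x)
    steps += 1
    if in_A(x):                      # record the return time and restart the clock
        returns.append(steps)
        steps = 0

tail = lambda k: sum(1 for r in returns if r > k) / len(returns)
# geometric tails: the empirical tail probability decays with k
print(len(returns), round(tail(1), 2), round(tail(5), 2))
```

The empirical tail shrinks rapidly with k, consistent with the geometric bound of the proposition.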

We assemble these propositions in the following theorem.

Theorem 7.1

Let X(t) denote the coagulation-fragmentation Markov chain defined in Sect.  7.2.1 , with initial distribution μ 0 on \(\Delta _{N-1}\) . Let μ t denote the distribution of X(t) at time \(t \in \mathbb{N}_{0}\) . Then the uniform distribution on \(\Delta _{N-1}\) is the unique invariant distribution of the chain, and it is the weak limit of the sequence μ t .

Proof

From Proposition 7.4 we have that \(U[\Delta _{N-1}]\) is an invariant distribution for the process. Since the chain is ϕ-irreducible as shown in Proposition 7.5, uniqueness of the equilibrium follows from the existence of a Foster-Lyapunov function, proven in Proposition 7.6.
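As a numerical sanity check of Theorem 7.1, one can simulate the chain for N = 3 and compare long-run time averages with the uniform distribution on \(\Delta_{2}\), under which each coordinate has marginal density 2(1 − x), i.e. a Beta(1, 2) law. A sketch, with our own helper names:

```python
import random

def cf_step(x, rng=random):
    """One coagulation-fragmentation step: coagulate a random pair,
    then fragment the pooled mass uniformly."""
    i, j = rng.sample(range(len(x)), 2)
    s = x[i] + x[j]
    u = rng.uniform(0.0, s)
    y = list(x)
    y[i], y[j] = u, s - u
    return y

random.seed(2)
x = [0.9, 0.05, 0.05]                # start far from the barycentre
burn, n_samples = 1000, 100_000
samples = []
for t in range(burn + n_samples):
    x = cf_step(x)
    if t >= burn:
        samples.append(x[0])

mean = sum(samples) / len(samples)
below_half = sum(1 for s in samples if s <= 0.5) / len(samples)
# Under U[Delta_2] the marginal of one coordinate is Beta(1, 2):
# mean 1/3 and P(X <= 0.5) = 1 - (1 - 0.5)^2 = 0.75.
print(round(mean, 2), round(below_half, 2))   # close to 0.33 and 0.75
```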

2.5 Kinetic Equation as Limit of the Agent System

Kinetic equations for the one-agent distribution function with a bilinear interaction term can be derived using mathematical techniques from the kinetic theory of rarefied gases (Cercignani 1988; Cercignani et al. 1994). In this section we discuss how in a time-continuous setting, where the stock (or wealth) of each agent is a continuous variable \(w \in \mathcal{ I} = [0,\infty )\), the exchange mechanism discussed above constitutes a special case of a kinetic model for wealth distribution, proposed by Cordier, Pareschi and Toscani in 2005. In this setting the microscopic dynamics leads to a homogeneous Boltzmann-type equation for the distribution function of wealth f = f(w, t). One can study the moment evolution of the Boltzmann equation to obtain insight into the tail behaviour of the cumulative wealth distribution. We also discuss the grazing collision limit which yields a macroscopic Fokker-Planck-type equation.

Cordier, Pareschi and Toscani (2005) propose a kinetic model for wealth distribution where wealth is exchanged between individuals through pairwise (binary) interactions: when two individuals with pre-interaction wealths v and w meet, their post-trade wealths v ∗ and w ∗ are given by

$$\displaystyle{ v^{{\ast}} = (1-\lambda )v +\lambda w +\tilde{\eta } v,\quad w^{{\ast}} = (1-\lambda )w +\lambda v +\eta w. }$$
(7.16)

Herein, λ ∈ (0, 1) is a constant, the so-called propensity to invest. The quantities \(\tilde{\eta }\) and η are independent random variables with the same distribution, usually with mean zero and finite variance σ η 2. They model randomness in the outcome of the interaction in a diffusive fashion. Note that additional assumptions need to be made to ensure that post-interaction wealths remain in the interval \(\mathcal{I} = [0,\infty )\). The discrete exchange dynamics considered in the previous sections find their continuous kinetic analogue when setting \(\eta =\tilde{\eta }\equiv 0\) in (7.16).
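A minimal sketch of the interaction rule (7.16) in Python. The Gaussian choice for η, η̃ and the rejection of trades that would produce negative wealth are our own illustrative assumptions — one simple way to keep wealths in \(\mathcal{I} = [0,\infty)\), not the canonical one:

```python
import random

def trade(v, w, lam, sigma, rng=random):
    """One binary interaction, Eq. (7.16), with Gaussian eta, eta-tilde.
    Outcomes with negative wealth are resampled (an illustrative, ad hoc
    way to meet the positivity assumption mentioned in the text)."""
    while True:
        eta_t, eta = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
        v_new = (1 - lam) * v + lam * w + eta_t * v
        w_new = (1 - lam) * w + lam * v + eta * w
        if v_new >= 0.0 and w_new >= 0.0:
            return v_new, w_new

random.seed(3)
v2, w2 = trade(1.0, 2.0, lam=0.1, sigma=0.05)   # random outcome, both >= 0
v3, w3 = trade(1.0, 2.0, lam=0.25, sigma=0.0)   # deterministic case eta = eta-tilde = 0
print(v3, w3)   # 1.25 1.75: total wealth is conserved exactly when eta = eta-tilde = 0
```

With η ≡ 0 the rule reduces to the deterministic exchange of the previous sections, and total wealth is conserved pathwise; with noise it is conserved only in expectation.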

With a fixed number N of agents, the interaction (7.16) induces a discrete-time Markov process on \(\mathbb{R}_{+}^{N}\) with N-particle joint probability distribution P N (w 1, w 2, , w N , τ). One can write a kinetic equation for the one-marginal distribution function

$$\displaystyle{P_{1}(w,\tau ) =\int P_{N}(w,w_{2},\ldots,w_{N},\tau )\,dw_{2}\cdots dw_{N},}$$

using only one- and two-particle distribution functions (Cercignani 1988; Cercignani et al. 1994),

$$\displaystyle\begin{array}{rcl} & & P_{1}(w,\tau +1) - P_{1}(w,\tau ) = {}\\ & & \qquad \Bigg\langle \frac{1} {N}\Bigg[\!\int P_{2}(w_{i},w_{j},\tau )\big(\delta _{0}(w - ((1-\lambda )w_{i} +\lambda w_{j} +\tilde{\eta } w_{i})) {}\\ & & \qquad \qquad \ \ +\delta _{0}(w - ((1-\lambda )w_{j} +\lambda w_{i} +\eta w_{j}))\big)dw_{i}\,dw_{j} - 2P_{1}(w,\tau )\Bigg]\Bigg\rangle. {}\\ \end{array}$$

Here, 〈 ⋅ 〉 denotes the mean operation with respect to the random variables \(\eta,\tilde{\eta }\). This process can be continued to give a hierarchy of equations of so-called BBGKY type (Cercignani 1988; Cercignani et al. 1994), describing the dynamics of the system of a large number of interacting agents. A standard approximation is to neglect correlations between the wealth of agents and assume the factorisation

$$\displaystyle{P_{2}(w_{i},w_{j},\tau ) = P_{1}(w_{i},\tau )P_{1}(w_{j},\tau ).}$$

Standard methods of kinetic theory (Cercignani 1988; Cercignani et al. 1994) can be used to show that, scaling time as t = 2τ∕N and taking the thermodynamic limit N →∞, the time evolution of the one-agent distribution function is governed by a homogeneous Boltzmann-type equation of the form

$$\displaystyle\begin{array}{rcl} & & \frac{\partial } {\partial t}f(w,t) = \\ & & \qquad \ \frac{1} {2}\Bigg\langle \int f(w_{i},t)f(w_{j},t)\big(\delta _{0}(w - ((1-\lambda )w_{i} +\lambda w_{j} +\tilde{\eta } w_{i})) \\ & & \qquad \qquad \ +\delta _{0}(w - ((1-\lambda )w_{j} +\lambda w_{i} +\eta w_{j}))\big)\,dw_{i}\,dw_{j}\Bigg\rangle - f(w,t).{}\end{array}$$
(7.17)
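The large-N behaviour behind (7.17) can be explored by direct Monte Carlo simulation of the agent system: repeatedly pick a random pair and apply rule (7.16). The sketch below is our own illustration (with the same ad hoc rejection of negative-wealth outcomes as above); it checks that the mean wealth is conserved up to statistical fluctuations.

```python
import random

def simulate(N, steps, lam, sigma, seed=0):
    """Monte Carlo for the agent system behind Eq. (7.17): at each step a
    uniformly chosen pair trades according to Eq. (7.16)."""
    rng = random.Random(seed)
    w = [1.0] * N                        # unit initial wealth, mean M = 1
    for _ in range(steps):
        i, j = rng.sample(range(N), 2)
        eta_t, eta = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
        vi = (1 - lam) * w[i] + lam * w[j] + eta_t * w[i]
        vj = (1 - lam) * w[j] + lam * w[i] + eta * w[j]
        if vi >= 0.0 and vj >= 0.0:      # keep wealths in I = [0, inf)
            w[i], w[j] = vi, vj
    return w

w = simulate(N=1000, steps=50_000, lam=0.1, sigma=0.1, seed=4)
mean = sum(w) / len(w)
print(round(mean, 2))   # close to 1: mean wealth is conserved in expectation
```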

Recalling the results from Düring et al. (2008) and Matthes and Toscani (2008), we have the following proposition.

Proposition 7.7

The distribution f(w, t) tends to a steady-state distribution f ∞ (w) with an exponential tail.

Proof

The results in Düring et al. (2008) and Matthes and Toscani (2008) imply that f(w, t) tends to a steady-state distribution f ∞ (w) which depends on the initial distribution only through the conserved mean wealth \(M =\int _{0}^{\infty }wf(w,t)\,dw> 0\). As detailed in Düring et al. (2008) and Matthes and Toscani (2008), the long-time behaviour of the s-th moment \(\int _{0}^{\infty }w^{s}f(w,t)\,dw\) is characterised by the function \(\mathcal{S}(s) = (1-\lambda )^{s} +\lambda ^{s} - 1\), which is negative for all s > 1; hence all s-th moments for s > 1 are bounded, and the tail of the steady-state distribution is exponential. □
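The moment function \(\mathcal{S}(s)\) written out in the proof can be checked directly; the following small computation verifies \(\mathcal{S}(1) = 0\) (mean conservation) and \(\mathcal{S}(s) < 0\) for sample values s > 1, for several values of λ.

```python
# S(s) exactly as given in the proof of Proposition 7.7
def S(s, lam):
    return (1 - lam) ** s + lam ** s - 1

for lam in (0.1, 0.25, 0.5, 0.9):
    assert abs(S(1, lam)) < 1e-12             # mean wealth is conserved
    assert all(S(s, lam) < 0 for s in (1.5, 2, 3, 5, 10))
print("S(s) < 0 for all sampled s > 1: moments bounded, exponential tail")
```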

In general, equations like (7.17) are rather difficult to treat, and it is usual in kinetic theory to study certain asymptotic limits. In a suitable scaling limit, a partial differential equation of Fokker-Planck type can be derived for the distribution of wealth. Similar diffusion equations are also obtained in Slanina and Lavička (2003) as a mean-field limit of the Sznajd model (Sznajd-Weron and Sznajd 2000). Mathematically, the model is related to works in the kinetic theory of granular gases (Cercignani et al. 1994).

To this end, we study by formal asymptotics the so-called continuous trading limit (λ → 0 while keeping σ η 2∕λ = γ fixed).

Let us introduce some notation. First, consider test functions \(\phi \in \mathcal{ C}^{2,\delta }([0,\infty ))\) for some δ > 0. We use the usual Hölder norms

$$\displaystyle{ \|\phi \|_{\delta } =\sum _{\vert \alpha \vert \leq 2}\|D^{\alpha }\phi \|_{\mathcal{C}} +\sum _{\vert \alpha \vert =2}[D^{\alpha }\phi ]_{\mathcal{C}^{0,\delta }}, }$$

where \([h]_{\mathcal{C}^{0,\delta }} =\sup _{v\neq w}\vert h(v) - h(w)\vert /\vert v - w\vert ^{\delta }.\) Denoting by \(\mathcal{M}_{0}(A)\), \(A \subset \mathbb{R},\) the space of probability measures on A, we define by

$$\displaystyle{\mathcal{M}_{p}(A) =\bigg\{ \Theta \in \mathcal{ M}_{0}(A)\;\bigg\vert \;\int _{A}\vert \eta \vert ^{p}\,d\Theta (\eta ) <\infty \bigg\},\quad p \geq 0,}$$

the space of measures with finite pth moment. In the following all our probability densities belong to \(\mathcal{M}_{2+\delta }\), and we assume that the density \(\Theta\) is that of η = σ η Y, where Y is a random variable with zero mean and unit variance. We then obtain

$$\displaystyle{ \int _{\mathcal{I}}\vert \eta \vert ^{p}\Theta (\eta )\;d\eta = \mathrm{E}[\vert \sigma _{\eta }Y \vert ^{p}] =\sigma _{ \eta }^{p}\mathrm{E}[\vert Y \vert ^{p}], }$$
(7.18)

where E[ | Y |p] is finite. The weak form of (7.17) is given by

$$\displaystyle{ \frac{d} {dt}\int _{\mathcal{I}}f(w,t)\phi (w)\,dw =\int _{\mathcal{I}}\mathcal{Q}(f,f)(w)\phi (w)\,dw }$$
(7.19)

where

$$\displaystyle{ \begin{array}{rl} \int _{\mathcal{I}}&\mathcal{Q}(f,f)(w)\phi (w)\,dw \\ & = \frac{1} {2}\Big\langle \int _{\mathcal{I}^{2}}\big(\phi (w^{{\ast}}) +\phi (v^{{\ast}}) -\phi (w) -\phi (v)\big)f(v)f(w)\,dv\,dw\Big\rangle. \end{array} }$$

Here, 〈 ⋅ 〉 denotes the mean operation with respect to the random variables \(\eta,\tilde{\eta }\). To study the situation for large times, i.e. close to the steady state, we introduce for λ ≪ 1 the transformation \(\tilde{t} =\lambda t,\;g(w,\tilde{t}) = f(w,t).\) This implies f(w, 0) = g(w, 0), and the evolution of the scaled density \(g(w,\tilde{t})\) follows (we immediately drop the tilde in the following)

$$\displaystyle{ \frac{d} {dt}\int _{\mathcal{I}}g(w,t)\phi (w)\,dw = \frac{1} {\lambda } \int _{\mathcal{I}}\mathcal{Q}(g,g)(w)\phi (w)\,dw. }$$
(7.20)

Due to the interaction rule (7.16), it holds that

$$\displaystyle{ w^{{\ast}}- w =\lambda (v - w) +\eta w. }$$

Inserting a Taylor expansion of ϕ up to second order around w into the right-hand side of (7.20) leads to

$$\displaystyle\begin{array}{rcl} & & \Big\langle \frac{1} {\lambda } \int _{\mathcal{I}^{2}}\phi ^{{\prime}}(w)\left [\lambda (v - w) +\eta w\right ]g(w)g(v)\,dv\,dw\Big\rangle {}\\ & & \phantom{xxxxxxx} +\Big\langle \frac{1} {2\lambda }\int _{\mathcal{I}^{2}}\phi ^{{\prime\prime}}(\tilde{w})\left [\lambda (v - w) +\eta w\right ]^{2}g(w)g(v)\,dv\,dw\Big\rangle {}\\ & & =\Big\langle \frac{1} {\lambda } \int _{\mathcal{I}^{2}}\phi ^{{\prime}}(w)\left [\lambda (v - w) +\eta w\right ]g(w)g(v)\,dv\,dw\Big\rangle {}\\ & & \phantom{xxxxxxx} +\Big\langle \frac{1} {2\lambda }\int _{\mathcal{I}^{2}}\phi ^{{\prime\prime}}(w)\big[\lambda (v - w) +\eta w\big]^{2}g(w)g(v)\,dv\,dw\Big\rangle + R(\lambda,\sigma _{\eta }) {}\\ & & = -\int _{\mathcal{I}^{2}}\phi ^{{\prime}}(w)(w - v)g(w)g(v)\,dv\,dw {}\\ & & \phantom{xxxxxxx} + \frac{1} {2\lambda }\int _{\mathcal{I}^{2}}\phi ^{{\prime\prime}}(w)\big[\lambda ^{2}(v - w)^{2} +\lambda \gamma w^{2}\big]g(w)g(v)\,dv\,dw + R(\lambda,\sigma _{\eta }), {}\\ \end{array}$$

with \(\tilde{w} =\kappa w^{{\ast}} + (1-\kappa )w\) for some κ ∈ [0, 1] and

$$\displaystyle{ R(\lambda,\sigma _{\eta }) =\Big\langle \frac{1} {2\lambda }\int _{\mathcal{I}^{2}}(\phi ^{{\prime\prime}}(\tilde{w}) -\phi ^{{\prime\prime}}(w))\left [\lambda (v - w) +\eta w\right ]^{2}g(w)g(v)\,dv\,dw\Big\rangle. }$$

Now we consider the limit λ, σ η → 0 while keeping γ = σ η 2∕λ fixed. It can be seen that the remainder term R(λ, σ η ) vanishes in this limit (see Cordier et al. (2005) for details). In the same limit, the term on the right-hand side of (7.20) converges to

$$\displaystyle\begin{array}{rcl} & & -\int _{\mathcal{I}^{2}}\phi ^{{\prime}}(w)(w - v)g(w)g(v)\,dv\,dw + \frac{1} {2}\int _{\mathcal{I}^{2}}\phi ^{{\prime\prime}}(w)\gamma w^{2}g(w)g(v)\,dv\,dw {}\\ & & \qquad \qquad \qquad \qquad = -\int _{\mathcal{I}}\phi ^{{\prime}}(w)(w - m)g(w)\,dw + \frac{\gamma } {2}\int _{\mathcal{I}}\phi ^{{\prime\prime}}(w)w^{2}g(w)\,dw, {}\\ \end{array}$$

with \(m =\int _{\mathcal{I}}vg(v)\,dv\) being the mean wealth (the mass is set to one for simplicity; otherwise it would appear here as well). After integration by parts, we obtain the right-hand side of (the weak form of) the Fokker-Planck equation

$$\displaystyle{ \frac{\partial } {\partial t}g(w,t) = \frac{\partial } {\partial w}\Big((w - m)g(w,t)\Big) + \frac{\gamma } {2} \frac{\partial ^{2}} {\partial w^{2}}\Big(w^{2}g(w,t)\Big), }$$
(7.21)

subject to no-flux boundary conditions (which result from the integration by parts). The same equation has also been obtained by considering the mean-field limit in a trading model described by stochastic differential equations (Bouchaud and Mezard 2000).
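Equation (7.21) admits an explicit stationary solution of inverse-gamma type, \(g_{\infty }(w) \propto w^{-(2+2/\gamma )}\exp (-2m/(\gamma w))\), as discussed in Bouchaud and Mezard (2000). The sketch below (with our own illustrative parameter values and helper names) checks numerically that this density makes the probability flux in (7.21) vanish, which is the stationarity condition under no-flux boundary conditions.

```python
import math

gamma_, m = 0.5, 1.0    # illustrative values of gamma and the mean wealth m

def g(w):
    """Inverse-gamma stationary density (up to normalisation)."""
    return w ** (-(2 + 2 / gamma_)) * math.exp(-2 * m / (gamma_ * w))

def flux(w, h=1e-6):
    """Probability flux (w - m) g + (gamma/2) d/dw (w^2 g) of Eq. (7.21),
    with the derivative approximated by a central difference."""
    d = ((w + h) ** 2 * g(w + h) - (w - h) ** 2 * g(w - h)) / (2 * h)
    return (w - m) * g(w) + 0.5 * gamma_ * d

for w in (0.5, 1.0, 2.0, 5.0):
    assert abs(flux(w)) < 1e-5 * g(w)   # flux vanishes: g is stationary
print("inverse-gamma density is stationary for Eq. (7.21)")
```

The fat (Pareto-type) tail of this stationary state, with exponent 1 + 2∕γ, is what distinguishes the Fokker-Planck limit from the exponential tail of Proposition 7.7.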

3 Remembering Jun-ichi Inoue

One of us (Enrico Scalas) was expecting to meet Jun-ichi Inoue at the 2015 AMMCS-CAIMS Congress in Waterloo, Ontario, Canada. Together with Bertram Düring (also co-author of this paper), we organised a special session entitled Wealth distribution and statistical equilibrium in economics (see: http://www.ammcs-caims2015.wlu.ca/special-sessions/wdsee/). Although Enrico never collaborated with Jun-ichi on the specific problem discussed in this paper, they co-authored two research papers, one on the nonstationary behaviour of financial markets (Livan et al. 2012) and another one on durations and the distribution of first passage times in the FOREX market (Sazuka et al. 2009). The former was the outcome of a visit of Jun-ichi to the Basque Center for Applied Mathematics in Bilbao from 3 October 2011 to 7 October 2011. Enrico, Jun-ichi and Giacomo Livan met several times in front of blackboards and computers, and the main idea of the paper (nonstationarity of financial data) was suggested by Jun-ichi. The latter is the result of a collaboration with Naoya Sazuka who, among other things, provided the data from Sony Bank FOREX transactions. This paper is connected to Enrico’s activity on modelling high-frequency financial data with continuous-time random walks. A third, review paper was published on the role of the inspection paradox in finance (Inoue et al. 2010). Before leaving for Canada, Enrico received the sad news of Jun-ichi’s death. He had time to change his presentation in Waterloo to include a short commemoration of Jun-ichi. With Jun-ichi, Enrico lost not only a collaborator, but a friend with an inquisitive mind.