1 Introduction

In recent years, mean-field game theory has emerged as a powerful framework for modeling the behavior of large populations of interacting players in a stochastic environment. This interdisciplinary field lies at the intersection of mathematics, economics, and engineering, offering deep insights into complex systems characterized by strategic interactions. Mean-field game models have found applications in various domains, including finance, energy systems (Carmona 2021), and traffic management and social dynamics (Festa and Göttlich 2018). The contributions by Huang et al. (2006) and Lasry and Lions (2007) can be considered the two seminal papers in the field. The key observation in their proposals is that, under the assumption of a large number of identically interacting players, individual actions do not affect the mean state of the system. This means that an individual player faces an optimization problem against a synthetic player, resulting from the aggregation of a large number of players, which is referred to in this paper as the market. The success of this approach made it possible to solve various problems, many of which can be found in the two-volume monograph by Carmona and Delarue (2018), which has become a central reference in the field.

The first results of the present paper are in the framework of singular control of diffusions. Our point of departure is the work of Alvarez (2018), where the existence and uniqueness of optimal reflecting controls for a diffusion are established. Our contribution is to extend these results to show that the solution found by Alvarez (2018) is in fact the optimal control within the larger class of finite variation controls. To do this, we use the solution of the two-sided ergodic singular control problem, in the framework proposed by Alvarez (2018), thus extending the class of controls. To achieve our goals, we state a verification result in the form of a Hamilton–Jacobi–Bellman equation and use the ergodic properties of the controlled processes to obtain an analytic problem. The control problem has been studied extensively in the literature, see for example Alvarez and Shepp (1998), Hening et al. (2019), and Lande et al. (1994). With respect to applications of singular control results, we mention studies focusing on cash flow management that investigate optimal dividend distribution, recapitalization, or a combination of both, while considering risk neutrality. See, for example, Asmussen and Taksar (1998), Højgaard et al. (2001), Jeanblanc-Picqué and Shiryaev (1995), Paulsen (2008), Peura and Keppo (2006), Shreve et al. (1984).

Our second aim in the present paper is to incorporate a mean-field game dependence into the two-sided ergodic singular control problem for Itô diffusions just described. As a consequence, we obtain necessary and sufficient conditions for the existence of mean-field game equilibrium points, and, for more restricted families of cost functions, uniqueness within the class of reflecting strategies. Finally, we define an N-player problem and prove that a mean-field equilibrium is an approximate Nash equilibrium for the N-player game.

The mean-field game framework is less discussed in the literature. However, there has been increased activity in this area in the recent past. Here we mention the recent papers by Aïd et al. (2023), Cao et al. (2023), Dianetti et al. (2023), Kunwai et al. (2022), Christensen et al. (2021), and Cao and Guo (2022), on the explicit solution of stationary mean-field games with singular and impulsive controls.

The rest of the paper is organized as follows. In Sect. 2 we study the control problem. After introducing the necessary tools, we state and prove the main result of the section, i.e. the optimality of reflecting controls within the class of càdlàg controls. In Sect. 3 we consider the mean-field game problem. It adds the complexity of a two-variable cost function where the second variable represents the market. The main result consists of a set of conditions for the existence and uniqueness of equilibrium strategies, together with a particular analysis of the case in which the cost function is multiplicative. Section 4 presents three examples that illustrate these results. Section 5 contains approximation results. The equilibrium found for mean-field games becomes the limit of Nash equilibrium strategies when considering an individual player in the framework of a symmetric N-player game. A final appendix includes some auxiliary computations corresponding to the examples of Sect. 4.

2 Control problem

In this section we consider the one-player control problem. We first recall results obtained by Alvarez (2018) that play a fundamental role throughout the paper. These results consist in the determination of optimal control levels in an ergodic framework for a diffusion within the class of reflecting controls. We then prove that the optimal levels found in Alvarez (2018) in fact give the optimal controls within the broader class of finite variation càdlàg controls.

2.1 Diffusion

Let us consider a filtered probability space \((\Omega ,{\mathcal {F}},\lbrace {\mathcal {F}}_t:t\ge 0^-\rbrace ,{{\textbf{P}}})\) that satisfies the usual assumptions. In order to define the underlying diffusion, consider the functions \(\mu :\mathbb {R}\rightarrow \mathbb {R}\) and \(\sigma :\mathbb {R}\rightarrow \mathbb {R}\), assumed to be locally Lipschitz. Under these conditions the stochastic differential equation

$$\begin{aligned} dX_t=\mu (X_t)dt+ \sigma (X_t)dW_t, \ X_{0}=x_0 \end{aligned}$$
(1)

has a unique strong solution up to an explosion time, that we denote by \(X=\{X_t:t\ge 0^-\}\) (see (Protter 2005, Theorem V.38)). Observe that our framework includes quadratic coefficients.

As usual, we define the infinitesimal generator of the process X as

$$\begin{aligned} {\mathcal {L}}_X = \frac{1}{2} \sigma ^2 (x) \frac{d^2}{dx^2} + \mu (x)\frac{d}{dx}. \end{aligned}$$

We denote the density of the scale function S(x) w.r.t the Lebesgue measure as

$$\begin{aligned} S'(x)=\exp \left( -\int ^x \frac{2\mu (u)}{\sigma ^2(u)} du \right) , \end{aligned}$$

and the density of the speed measure m(x) w.r.t the Lebesgue measure as

$$\begin{aligned} m'(x)=\frac{2}{\sigma ^2(x)S'(x)}. \end{aligned}$$
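As a concrete illustration of these two densities, the following minimal sketch evaluates \(S'\) and \(m'\) numerically. The mean-reverting coefficients \(\mu (x)=-\theta x\), \(\sigma (x)=\sigma _0\), the choice of 0 as lower integration limit, and all function names are assumptions made for illustration only; they are not taken from the paper.

```python
import math

def scale_density(x, mu, sigma, x_ref=0.0, n=2000):
    """S'(x) = exp(-int_{x_ref}^{x} 2 mu(u) / sigma(u)^2 du), trapezoid rule."""
    h = (x - x_ref) / n
    g = lambda u: 2.0 * mu(u) / sigma(u) ** 2
    integral = 0.5 * (g(x_ref) + g(x))
    for k in range(1, n):
        integral += g(x_ref + k * h)
    return math.exp(-integral * h)

def speed_density(x, mu, sigma, x_ref=0.0):
    """m'(x) = 2 / (sigma(x)^2 S'(x))."""
    return 2.0 / (sigma(x) ** 2 * scale_density(x, mu, sigma, x_ref))

# Hypothetical coefficients (assumption): mu(x) = -theta * x, sigma(x) = sigma0.
theta, sigma0 = 0.5, 1.0
mu = lambda x: -theta * x
sigma = lambda x: sigma0

s1 = scale_density(1.0, mu, sigma)  # closed form here: exp(theta * x^2 / sigma0^2)
m1 = speed_density(1.0, mu, sigma)  # closed form here: 2 * exp(-theta * x^2) / sigma0^2
```

The lower limit of the integral only changes \(S'\) by a multiplicative constant, which cancels in ratios such as \(m(du)/m(a,b)\).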

As mentioned above, the underlying process is controlled by a pair of processes, the admissible controls, that drive it to a convenient region, defined below.

Definition 2.1

An admissible control is a pair of non-negative \(\lbrace {\mathcal {F}}_t \rbrace \)-adapted processes \(\eta =(U=\{U_t\}_{t\ge 0^-},D=\{D_t\}_{t\ge 0^-})\) such that:

  1. (i)

    Each of the processes U, D is right-continuous and non-decreasing almost surely.

  2. (ii)

    For each \(t\ge 0^-\) the random variables \(U_t\) and \(D_t\) have finite expectation.

  3. (iii)

    For every \(x \in \mathbb {R}\) the stochastic differential equation

    $$\begin{aligned} dX^\eta _t:= \mu (X^\eta _t)dt + \sigma (X^\eta _t)dW_t +dU_t-dD_t, \quad X_{0^-}=x \end{aligned}$$
    (2)

    has a unique strong solution with no explosion in finite time.

We denote by \({\mathcal {A}}\) the set of admissible controls.

Note that condition (iii) is satisfied, for instance, when the coefficients are globally Lipschitz (See the remark after Theorem V.38 in Protter (2005).) Observe also that condition (ii) is not a real restriction, as, for instance, the integral in the cost function G(x) in (5) that we aim to minimize, in case of having infinite expectations, is infinite. A relevant sub-class of admissible controls is the set of reflecting controls.

Definition 2.2

For \(a<b\) denote by \(X^{a,b}=\lbrace X^{a,b}_t:t \ge 0 \rbrace \) the strong solution of the stochastic differential equation with reflecting boundaries at a and b:

$$\begin{aligned} dX^{a,b}_t= \mu (X^{a,b}_t)dt + \sigma (X^{a,b}_t)dW_t+dU^a_t-dD^b_t, \qquad X_{0^-}=x. \end{aligned}$$

Here \(U^a=\{U^a_t\}\) and \(D^b=\{D^b_t\}\) are the local times of the reflected diffusion in the interval \([a,b]\). They are continuous non-decreasing processes that increase, respectively, only when the solution visits a or b, and make the controlled diffusion satisfy the condition \(a\le X^{a,b}_t\le b\) a.s. for all \(t\ge 0\). As the above equation has a strong solution (see Saisho (1987), Theorem 5.1), the pair \((U^a, D^b)\) belongs to \({\mathcal {A}}\); we call such pairs reflecting controls. If \(x \notin (a,b)\), we begin the policy by sending the process to the closest point of the interval \([a,b]\) at time \(t=0\). This is why we need to begin our evolution at \(t=0^-\), in order to have càdlàg controls.
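To make the reflecting controls more tangible, the sketch below simulates an approximation of \(X^{a,b}\) together with the cumulative pushes \(U^a\) and \(D^b\). It uses a projected Euler scheme, which is a standard discretization and an assumption on our side, not the construction of Saisho (1987); the function name, step count and seed are likewise hypothetical.

```python
import math
import random

def simulate_reflected(x0, a, b, mu, sigma, T=10.0, n_steps=10_000, seed=0):
    """Projected Euler approximation of the diffusion reflected at a < b.

    Returns the path and the accumulated controls (U, D). The initial jump
    at time 0 sends a starting point outside [a, b] to the closest endpoint.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    x, U, D = x0, 0.0, 0.0
    if x < a:            # jump of U at time 0
        U += a - x
        x = a
    elif x > b:          # jump of D at time 0
        D += x - b
        x = b
    path = [x]
    for _ in range(n_steps):
        x += mu(x) * dt + sigma(x) * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        if x < a:        # push up: U increases only at the lower level
            U += a - x
            x = a
        elif x > b:      # push down: D increases only at the upper level
            D += x - b
            x = b
        path.append(x)
    return path, U, D

# Assumed example: driftless unit-volatility diffusion started outside [a, b].
path, U, D = simulate_reflected(2.0, -1.0, 1.0, lambda x: 0.0, lambda x: 1.0)
```

In this discretization the initial jump of size \(x-b\) is charged to D, which mirrors the role of the starting time \(t=0^-\).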

We introduce below the cost function \(c(x,y)\) to be considered in the mean-field game formulation, satisfying some natural conditions.

Assumption 2.3

Assume that \(c :\mathbb {R}^2 \rightarrow \mathbb {R}_+\) is a continuous function, and the positive constants \(q_u, q_d\) are the unit costs of using the associated controls. Assume that, for each fixed \(y \in \mathbb {R}\), there exists a value \(x_y\) such that

$$\begin{aligned} c(x,y) \ge c(x_y,y)\ge 0, \qquad \hbox { for all}\ x \in \mathbb {R}, \end{aligned}$$

and positive constants \(K_y\) and \(\alpha _y\) such that

$$\begin{aligned} c(x,y)+K_y \ge \alpha _y \vert x \vert , \qquad \hbox { for all}\ x \in \mathbb {R}. \end{aligned}$$
(3)

Consider the maps

$$\begin{aligned} \pi _1(x,y)=c(x,y)+q_d\mu (x), \hspace{10mm} \pi _2(x,y)=c(x,y)-q_u \mu (x), \end{aligned}$$

and assume that for each fixed \(y \in \mathbb {R}\):

  1. (i)

    There exists a unique real number \(x^y_i = {{\,\mathrm{arg\,min}\,}}\lbrace \pi _i(x,y):x\in \mathbb {R}\rbrace \) so that \(\pi _i( \cdot ,y)\) is decreasing on \((-\infty ,x^y_i)\) and increasing on \((x^y_i, \infty )\), where \(i=1,2\).

  2. (ii)

    The following limits hold:

    $$\begin{aligned} \lim _{x\rightarrow \infty } \pi _1(x,y) = \lim _{x \rightarrow -\infty } \pi _2(x,y) = \infty . \end{aligned}$$
    (4)

Remark 2.4

In the control problem case, when there is no aggregate of players, the cost function depends only on the first variable. We then set the second variable above to \(y=0\), and denote \(c(x)=c(x,0)\). This function satisfies \(c(x)\ge c(x_0)\ge 0\) for some \(x_0\), \(c(x)+K\ge \alpha |x|\), for some positive constants K and \(\alpha \), and the functions \(\pi _1(x)=c(x)+q_d\mu (x)\) and \(\pi _2(x)=c(x)-q_u\mu (x)\) have their respective minima at \(x^0_1,x^0_2\), and satisfy (4).

Definition 2.5

We define the ergodic cost function as

$$\begin{aligned} G(x)= \inf _{\eta \in {\mathcal {A}}} \limsup _{T\rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x \left( \int _0^T c(X^\eta _s)ds +q_u U_T +q_d D_T \right) , \end{aligned}$$
(5)

where \(\eta =(U,D)\) is an admissible control in \({\mathcal {A}}\).

The existence of a unique pair of optimal controls within the class of reflecting controls was obtained by Alvarez (2018), from where we borrow the notation and assumptions. In the following result, we summarize (in a convenient way for our purposes) results of Lemma 2.1 and Theorem 2.3 from Alvarez (2018). Let us mention that condition (3) is not necessary for the results of Alvarez (2018); we will use it in the sequel to prove optimality within the class of admissible controls.

Theorem 2.6

(Alvarez (2018)) Under Assumption 2.3:

(a) If \(a<b\) then

$$\begin{aligned}{} & {} \lim _{T \rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x \left( \int _0^T c(X_s^{a,b}) ds +q_u U^a_T + q_d D^b_T \right) \nonumber \\{} & {} \qquad =\frac{1}{m(a,b)} \left[ \int _a^b c(u)m(du) +\frac{q_u}{S'(a)}+\frac{q_d}{S'(b)} \right] =:C(a,b). \end{aligned}$$
(6)

(b) There is a unique pair of points \(a^{*}<b^{*}\) that satisfy the equations:

  1. (i)

    \(\pi _1(b^{*}) = \pi _2(a^{*})\),

  2. (ii)

    \( \int _{a^{*}}^{b^{*}} \left( \pi _1(t)-\pi _1(b^{*}) \right) m(dt) + \frac{\displaystyle q_u+q_d}{\displaystyle S'(a^{*})} =0\).

Furthermore, the pair \((a^{*}, b^{*})\in (-\infty ,x_2^0) \times (x_1^0, \infty )\) minimizes the expected long-run average cost within the class of reflecting controls.

Remark 2.7

Condition (i) is obtained from the fact that \(X^{a,b}\) is stationary. Regarding equation (ii), it arises after differentiation in order to determine the minimum. The uniqueness of the solution is proved based on the properties of the cost function. Conditions (i) and (ii) here are equivalent to conditions (2.5) and (2.6) in Alvarez (2018) as they reduce to solving \(C(a,b)-\pi _2(a)=\pi _1(b)-C(a,b)=0\), as seen in the proof of Lemma 2.1 in Alvarez (2018). See more details in Alvarez (2018).
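The formula (6) and the characterization of the optimal levels can be checked numerically in a simple case. The sketch below assumes a standard Brownian motion (\(S'\equiv 1\), \(m'\equiv 2\)), the cost \(c(x)=x^2\) and \(q_u=q_d=1\); these choices, the grid search and all names are illustrative assumptions. In this symmetric case condition (i) reduces to \(a=-b\), and condition (ii) gives \(b^3=3/4\).

```python
def simpson(f, lo, hi, n=400):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0

def ergodic_cost(a, b, c, s_prime, m_prime, q_u, q_d):
    """C(a, b) as in (6): long-run average cost of reflecting at a < b."""
    mass = simpson(m_prime, a, b)                       # m(a, b)
    running = simpson(lambda u: c(u) * m_prime(u), a, b)
    return (running + q_u / s_prime(a) + q_d / s_prime(b)) / mass

# Assumed example: standard Brownian motion with quadratic running cost.
c = lambda x: x * x
s_prime = lambda x: 1.0
m_prime = lambda x: 2.0
q_u = q_d = 1.0

# By symmetry, search over levels (-b, b) with b on a grid in [0.5, 1.5].
grid = [0.5 + 0.0005 * k for k in range(2001)]
b_star = min(grid, key=lambda b: ergodic_cost(-b, b, c, s_prime, m_prime, q_u, q_d))
cost_star = ergodic_cost(-b_star, b_star, c, s_prime, m_prime, q_u, q_d)
```

Up to grid resolution, the minimizer agrees with \((3/4)^{1/3}\), and the minimal cost agrees with \(\pi _1(b^*)=(b^*)^2\), in line with the identity \(C(a,b)=\pi _1(b)\) mentioned in the remark above.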

2.2 Optimality within \({\mathcal {A}}\)

Optimality within the class \({\mathcal {A}}\) of càdlàg controls requires further analysis. As expected, and mentioned in Alvarez (2018), the optimal controls within class \({\mathcal {A}}\) are the same controls found in the class of reflecting controls. An analogous result to the one presented below was obtained for non-negative diffusions when considering a maximization problem in Cao et al. (2023) (see also Kunwai et al. 2022). More precisely, it is clear that

$$\begin{aligned} \inf _{a<b} \lim _{T \rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x \left( \int _0^T c(X_s^{a,b}) ds +q_u U^a_T+q_d D^b_T \right) \ge G(x). \end{aligned}$$

Then, to establish the optimality within \({\mathcal {A}}\) it is necessary to obtain the other inequality. This task is carried out with the help of the solution of the free boundary problem (13) below, similarly to Cao et al. (2023). The mentioned differences with this situation require different hypotheses and slightly different arguments.

Theorem 2.8

(Verification) Consider a diffusion defined by (1) and a cost function c(x) satisfying Assumption 2.3. Suppose that there exist a constant \(\lambda \ge 0\) and a function \(u \in C^2(\mathbb {R})\) such that

$$\begin{aligned} ({\mathcal {L}}_X u) (x)+c(x) \ge \lambda , \qquad -q_u \le u'(x) \le q_d,\quad \hbox { for all}\ x\in \mathbb {R}. \end{aligned}$$
(7)

Define the subset of admissible controls

$$\begin{aligned} {\mathcal {B}}= \left\{ \eta \in {\mathcal {A}}:\liminf _{T \rightarrow \infty } \frac{1}{T}\left| {{\textbf{E}}}_x(u(X^\eta _T))\right| =0 \right\} . \end{aligned}$$
(8)

Then,

$$\begin{aligned} \inf _{\eta \in {\mathcal {B}}} \limsup _{T\rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x \left( \int _0^T c(X^\eta _s)ds +q_u U_T +q_d D_T \right) \ge \lambda . \end{aligned}$$
(9)

Remark 2.9

The consideration of the subclass \({\mathcal {B}}\) is not a restriction, as will be seen below. More precisely, it will be proved (using condition (3)), that controls in \({\mathcal {A}}\setminus {\mathcal {B}}\) give infinite values of the long run costs, being then not relevant in the computation of G(x) in (5).

Proof

Fix \(T>0\). For each \(n\ge 1\) define the stopping times

$$\begin{aligned} T_n= \inf \lbrace t \ge 0 :\vert X^\eta _t \vert \ge n \rbrace \wedge T\nearrow T\quad a.s. \end{aligned}$$

Using Itô's formula for processes with jumps (observe that the diffusion X is continuous but the controls can have jumps, and in consequence the controlled processes \(X^\eta \) can have jumps),

$$\begin{aligned} u(X^\eta (T_n))&= u(x)+ \int _0^{T_n }u'(X^\eta _{s-} ) dX^\eta _s +\frac{1}{2} \int _0^{T_n}u''(X^\eta _{s-})d \langle (X^\eta )^c,(X^\eta )^c \rangle _s \nonumber \\&\quad +\sum _{s \le T_n} \left( u(X^\eta _s)-u(X^\eta _{s-}) -u'(X^\eta _{s-} ) \bigtriangleup X^\eta _s \right) . \end{aligned}$$
(10)

The r.h.s in (10) can be rewritten as

$$\begin{aligned} u(x)&+ \int _0^{T_n} ({\mathcal {L}}_X u)(X^\eta _{s-})ds -\int _0^{T_n}\mu (X^\eta _{s-})u'(X^\eta _{s-}) ds \nonumber \\&+ \int _0^{T_n}u'(X^\eta _{s-}) d X^\eta _s +\sum _{s \le T_n} \left( u(X^\eta _s)-u(X^\eta _{s-}) -u'(X^\eta _{s-} ) \bigtriangleup X^\eta _s \right) . \end{aligned}$$
(11)

Using the fact that \(u'(X^\eta _{s-})=u'(X^\eta _s)\) in a set of total Lebesgue measure in [0, T] almost surely, and that \(\bigtriangleup X^\eta _s= \bigtriangleup U_s-\bigtriangleup D_s\), we rewrite (11) as

$$\begin{aligned} u(x)&+ \int _0^{T_n} ({\mathcal {L}}_X u)(X^\eta _{s-})ds + \int _0^{T_n}u'(X^\eta _{s-})\sigma (X^\eta _{s-}) d W_s \nonumber \\&+ \int _0^{T_n} u'(X^\eta _{s-}) d (U_s-D_s)\nonumber \\&+\sum _{s \le T_n}\left( u(X^\eta _s)-u(X^\eta _{s-}) - u'(X^\eta _{s-})(\bigtriangleup U_s -\bigtriangleup D_s) \right) . \end{aligned}$$
(12)

Therefore, denoting by \(U_s^c\) and \(D_s^c\) the continuous parts of the processes \(U_s\) and \(D_s\) respectively, and using the inequalities (7) in the hypothesis, we obtain

$$\begin{aligned} u(X^\eta (T_n))&\ge u(x)+ \lambda T_n - \int _0^{T_n}c(X^\eta _{s-}) ds +\int _0^{T_n} u'(X^\eta _{s-}) \sigma (X^\eta _{s-}) dW_s\\ {}&\quad -\int _0^{T_n}q_u dU^c_s -\int _0^{T_n} q_d dD^c_s - \sum _{0 \le s \le T_n} ( \bigtriangleup U_s q_u+\bigtriangleup D_s q_d) \\&=u(x)+ \lambda T_n - \int _0^{T_n}c(X^\eta _{s-}) ds +\int _0^{T_n} u'(X^\eta _{s-}) \sigma (X^\eta _{s-}) dW_s \\&\quad -q_u U_{T_n}-q_d D_{T_n}. \end{aligned}$$

Rearranging the terms above and taking the expectation we obtain

$$\begin{aligned} {{\textbf{E}}}_x (u(X^\eta (T_n))) -u(x)+ {{\textbf{E}}}_x \left( \int _0^{T_n} c(X^\eta _{s-})ds + q_u U_{T_n}+ q_d D_{T_n} \right) \ge \lambda {{\textbf{E}}}_x (T_n). \end{aligned}$$

Taking first the limit as n tends to infinity, then dividing by T, and finally taking \(\liminf \) as T goes to infinity, we obtain (9), concluding the proof of the verification theorem. \(\square \)

Consideration of free boundary problems such as (7) in the framework of singular control problems can be found for example in Alvarez (2018), Cao et al. (2023), and Kunwai et al. (2022). In Alvarez (2018), the author studied the same problem of this section and used a free boundary problem to find some useful properties of optimal controls. More precisely, under the same assumptions as above, to study the ergodic optimal control problem in the class of reflecting controls, the author considered the free boundary problem consisting of finding \(a<b, \lambda \) and a function u in \(C^2(\mathbb {R})\) such that

(13)

For this problem, it is proved (see Remark 2.4 in Alvarez (2018)) that there exists a unique solution that satisfies (7). Furthermore,

$$\begin{aligned} \lambda =C(a,b), \end{aligned}$$
(14)

the ergodic cost defined by (6), as stated in equation (2.15) in Alvarez (2018). Here, similarly to Kunwai et al. (2022), we use the results obtained in Alvarez (2018) to get a suitable candidate to apply Theorem 2.8.

Theorem 2.10

Consider a diffusion defined by (1) and a cost function c(x) satisfying Assumption 2.3. Then, the reflecting controls with levels given in (b) in Theorem 2.6 minimize the ergodic cost G(x) in (5) within the set \({\mathcal {A}}\) of admissible controls.

Proof

Take u as the solution of the free boundary problem (13) defined above. In view of Theorem 2.8, we need to prove that the infimum of the ergodic cost defining G(x) is realized in the set \({\mathcal {B}}\) defined in (8). Take then \(\eta \in {\mathcal {A}}\setminus {\mathcal {B}}\). By definition of \({\mathcal {B}}\), there exist constants \(\epsilon >0\) and \(S>0\) such that

$$\begin{aligned} {{\textbf{E}}}_x u(X^\eta _s)>\epsilon s,\quad \hbox { for all}\ s \ge S. \end{aligned}$$
(15)

The second statement in (7) implies that \(|u(x)-u(0)|\le (q_u+q_d)|x|\). From this, it follows

$$\begin{aligned} c(x)\ge A u(x)-B, \end{aligned}$$

for \(A=\alpha /(q_u+q_d)\) and \(B=\alpha u(0)/(q_u+q_d)+K\), see (3). In view of (15), this implies

$$\begin{aligned} \limsup _{T\rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x \left( \int _0^T c(X^\eta _s)ds \right) \ge \limsup _{T\rightarrow \infty } \frac{1}{T} \int _S^T (A\epsilon s-B)ds =\infty . \end{aligned}$$

As a consequence, for any \(\eta \in {\mathcal {A}}\setminus {\mathcal {B}}\), we have

$$\begin{aligned} \limsup _{T \rightarrow \infty } \frac{1}{T}{{\textbf{E}}}_x \left( \int _0^T c(X^\eta _s)ds+q_uU_T+q_d D_T\right) =\infty . \end{aligned}$$

Finally, as the class of reflecting controls gives finite ergodic limits by Theorem 2.6, the infimum can be taken in the subclass \({\mathcal {B}}\). So Theorem 2.8 gives the equality \(G(x)=\lambda =C(a,b)\) (see (14)), concluding the proof. \(\square \)
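The bound \(c(x)\ge Au(x)-B\) used in the proof above can be unpacked as follows. This is only a sketch of the intermediate steps, as we read them: it uses the growth condition (3) in the form of Remark 2.4 and the fact that the gradient bound in (7) gives \(|u'|\le q_u+q_d\).

$$\begin{aligned} c(x)&\ge \alpha |x| - K \ge \frac{\alpha }{q_u+q_d}\,|u(x)-u(0)| - K \\&\ge \frac{\alpha }{q_u+q_d}\,\big (u(x)-u(0)\big ) - K = A\,u(x)-B. \end{aligned}$$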

3 Mean-field game problem

As mentioned above, in the mean-field game formulation, the cost function depends on two variables, respectively the state of the player and the state of an aggregate of players referred to as the market. The state of the market is the expectation of a continuous function of the diffusion process under some given controls.

The study of the existence and uniqueness of equilibrium points begins with the application of Theorem 2.6 when the state of the market is asymptotically constant. The cost function becomes one-dimensional and the results in Alvarez (2018) can be applied.

More precisely, assuming f(x) continuous, the expectation of the market diffusion \({{\textbf{E}}}_x(f(X^{c,d}_t))\) has an ergodic limit, denoted \(R(c,d)\), and applying the previous results, we can prove that the optimal controls for the player should be found in the class of reflecting controls, considering a one variable cost function of the form \(c(\cdot ,R(c,d))\). This is why we assume that the market is also controlled by reflections at some levels \(c<d\), and expect to obtain an equilibrium point when the optimal levels \(a<b\) that control the player’s diffusion coincide with \(c<d\) (see Definition 3.1). Note that the question of the existence of equilibrium strategies beyond the class of reflecting controls is not addressed here. The requirements to apply these results in the mean-field game formulation follow.

3.1 Conditions for optimality and equilibrium

In this setting, we can generalize the results of the previous section using some simple ergodic results for diffusions. Recall that the function f(x) is assumed to be continuous.

Definition 3.1

We say that a control \(\eta ^*\) is an equilibrium of the mean-field game if it belongs to the set

$$\begin{aligned} {{\,\mathrm{arg\,min}\,}}_{\eta =(U,D) \in {\mathcal {A}}} \left\{ \limsup _{T \rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x \left( \int _0^T c\big (X^\eta _s,{{\textbf{E}}}_x(f(X_s^{\eta ^*}))\big ) ds+q_u U_T+ q_d D_T \right) \right\} . \end{aligned}$$

In case the control is reflecting, i.e. \(\eta ^*=(U^{a^*},D^{b^*})\) we say that \((a^*,b^*)\) is an equilibrium point.

The idea of the above definition is to consider situations in which the individual player has no incentive to act differently to the market. Regarding the three-step proposal of (Carmona and Delarue 2013, Section 2.2), we would (i) choose a control \(\mu \in {\mathcal {A}}\) for the market, (ii) solve the standard stochastic problem

$$\begin{aligned} \inf _{\eta =(U,D) \in {\mathcal {A}}} \left\{ \limsup _{T \rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x \left( \int _0^T c\big (X^\eta _s,{{\textbf{E}}}_x(f(X_s^{\mu }))\big ) ds+q_u U_T+q_d D_T \right) \right\} . \end{aligned}$$

to obtain a control \(\eta \) (depending on \(\mu \)), and (iii) find a fixed point in \({\mathcal {A}}\) of the map \(\mu \mapsto \eta \). Compared to Definition 3.2 in Cao et al. (2023), which is closer to our formulation, Definition 3.1 admits a time dependent value representing the market state. More precisely, in Cao et al. (2023), the authors consider situations in which the controlled market process has a stationary distribution, whose mean has to coincide with the equilibrium value. If this is the case, as seen in Sect. 2, an equilibrium control should, in general terms, be a reflecting one. Nevertheless, as the following result shows, when considering reflecting controls, we can substitute the time dependent value by its limit in Definition 3.1.

Theorem 3.2

Consider the points \( a<b\), \(c<d\), and \(x \in \mathbb {R}\). Then

$$\begin{aligned} \limsup _{T \rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x \left( \int _0^T c\big (X_s^{a,b},{{\textbf{E}}}_x(f(X_s^{c,d}))\big ) ds+q_u U_T^a+q_d D^b_T \right) \nonumber \\ =\frac{1}{m(a,b)} \left[ \int _a^b c(u,R(c,d))m(du) + \frac{q_u}{S'(a)}+\frac{q_d}{S'(b)} \right] , \end{aligned}$$
(16)

where

$$\begin{aligned} R(c,d)= \int _c^d\frac{f(u)}{m(c,d)}m(du). \end{aligned}$$

Proof

Applying Theorem 2.6 with the cost function \(c(\cdot ,R(c,d))\) we obtain that

$$\begin{aligned} \lim _{T \rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x \left( \int _0^T c(X_s^{a,b},R(c,d))ds+q_u U_T^a+q_d D^b_T\right) \\ =\frac{1}{m(a,b)} \left[ \int _a^b c(u,R(c,d))m(du) + \frac{q_u}{S'(a)}+\frac{q_d}{S'(b)}\right] , \end{aligned}$$

i.e. the r.h.s. in (16). It remains then to verify that

$$\begin{aligned} \limsup _{T \rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x \left( \int _0^T \vert c(X_s^{a,b},{{\textbf{E}}}_x(f(X_s^{c,d})))-c(X_s^{a,b},R(c,d))\vert ds\right) =0. \end{aligned}$$
(17)

In order to do this, define the continuous function \(H:f([c,d]) \rightarrow \mathbb {R}^{+}\) by

$$\begin{aligned} H(y)= \max _{u \in [a,b]} \vert c(u,y)-c(u,R(c,d)) \vert , \end{aligned}$$

and observe that the limit in (17) can be bounded by

$$\begin{aligned}{} & {} \limsup _{T \rightarrow \infty } \frac{1}{T}\int _0^T H({{\textbf{E}}}_x\big (f(X_s^{{c},{d}})))ds \\{} & {} \qquad =\limsup _{T \rightarrow \infty } \frac{1}{T} \int _0^T H \left( \int _c^d f(y) {{\textbf{P}}}_s (x,dy)\right) ds, \end{aligned}$$

with \({{\textbf{P}}}_s(x, dy)= {{\textbf{P}}}_x(X^{{c},{d}}_s \in dy)\). This limit is zero because

$$\begin{aligned} H \Bigg ( \int _c^d f(y) {{\textbf{P}}}_s (x,dy)\Bigg )\rightarrow H(R({c},{d}))=0, \end{aligned}$$

as H is uniformly continuous, bounded and

$$\begin{aligned} \left\| {{\textbf{P}}}_s(x, \cdot )- \frac{1}{m({c},{d})}m(\cdot ) \right\| \rightarrow 0,\qquad \hbox { as}\ s \rightarrow \infty , \end{aligned}$$

with the norm of total variation (see Theorem 54.5 in Rogers and Williams (2000)). It follows that (17) holds, concluding the proof. \(\square \)
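The market state \(R(c,d)\) appearing in Theorem 3.2 is the mean of f under the normalized speed measure on \([c,d]\), and can be evaluated by quadrature. The following sketch does this for two assumed examples: a standard Brownian motion (\(m'\equiv 2\)) and a mean-reverting diffusion with \(\mu (x)=-x\), \(\sigma \equiv 1\) (so that \(m'(x)=2e^{-x^2}\)). Both examples and all names are illustrative assumptions.

```python
import math

def simpson(f, lo, hi, n=400):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0

def market_state(c, d, f, m_prime):
    """R(c, d) = int_c^d f(u) m(du) / m(c, d)."""
    mass = simpson(m_prime, c, d)
    return simpson(lambda u: f(u) * m_prime(u), c, d) / mass

f = lambda x: x                            # market functional f(x) = x
m_bm = lambda x: 2.0                       # speed density of Brownian motion
m_ou = lambda x: 2.0 * math.exp(-x * x)    # speed density for mu(x) = -x, sigma = 1

r_bm = market_state(0.0, 2.0, f, m_bm)     # uniform stationary law: midpoint of [0, 2]
r_ou = market_state(0.0, 1.0, f, m_ou)     # mass tilted toward 0 by the mean reversion
```

For the Brownian case the stationary law on \([c,d]\) is uniform, so \(R(c,d)=(c+d)/2\) when \(f(x)=x\).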

The existence and uniqueness of minimizers given in (b) in Theorem 2.6 can also be generalized, by noticing that in Theorem 3.2 the second variable in the cost function is fixed. The optimality of reflecting controls within the class of càdlàg controls corresponding to Definition 3.1 follows from Theorem 2.10.

Theorem 3.3

For a fixed \((a,b)\), the infimum of the ergodic problem is reached only at a pair \((a^{*},b^{*})\) such that

  1. (i)

    \(\pi _1(b^{*},R({a},{b}))= \pi _2(a^{*},R({a},{b})),\)

  2. (ii)

    \( \displaystyle \int _{a^{*}}^{b^{*}} \left( \pi _1(t,R({a},{b}))-\pi _1(b^{*},R({a},{b})) \right) m(dt) + \frac{\displaystyle q_u+q_d}{\displaystyle S'(a^{*})}=0. \)

Moreover \((a^{*},b^{*}) \in (-\infty ,x_2^{R(a,b)}) \times (x_1^{R(a,b)},\infty )\)

Based on this result we obtain a condition for equilibrium of the mean-field game (see Definition 3.1).

Theorem 3.4

A pair \(a<b\) is an equilibrium point if and only if

  1. (i)

    \(\pi _1(b,R(a,b))= \pi _2(a,R(a,b)),\)

  2. (ii)

    \( \displaystyle \int _{a}^b \left[ \pi _1(t,R(a,b))-\pi _1(b,R(a,b)) \right] m(dt) + \frac{\displaystyle q_u+q_d}{\displaystyle S'(a)}=0.\)

Moreover \((a,b) \in (-\infty ,x_2^{R(a,b)}) \times (x_1^{R(a,b)},\infty )\)
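The two equilibrium conditions of Theorem 3.4 can be evaluated numerically for a candidate pair \((a,b)\). The sketch below returns their residuals; a pair is an equilibrium point when both vanish. The example (driftless Brownian motion, \(c(x,y)=x^2(1+y^2)\), \(f(x)=x\), \(q_u=q_d=1\)) and all names are assumptions chosen so that the answer is known in closed form: the market state of a symmetric interval is 0 and the equilibrium levels are \(\pm (3/4)^{1/3}\).

```python
def simpson(f, lo, hi, n=400):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0

def equilibrium_residuals(a, b, c, f, mu, s_prime, m_prime, q_u, q_d):
    """Residuals of conditions (i) and (ii) of Theorem 3.4 at the pair (a, b)."""
    mass = simpson(m_prime, a, b)
    R = simpson(lambda u: f(u) * m_prime(u), a, b) / mass   # market state R(a, b)
    pi1 = lambda x: c(x, R) + q_d * mu(x)
    pi2 = lambda x: c(x, R) - q_u * mu(x)
    r1 = pi1(b) - pi2(a)
    r2 = simpson(lambda t: (pi1(t) - pi1(b)) * m_prime(t), a, b) \
        + (q_u + q_d) / s_prime(a)
    return r1, r2

# Assumed example: Brownian motion without drift and multiplicative cost.
c = lambda x, y: x * x * (1.0 + y * y)
f = lambda x: x
mu = lambda x: 0.0
s_prime = lambda x: 1.0
m_prime = lambda x: 2.0

b_eq = 0.75 ** (1.0 / 3.0)
res_eq = equilibrium_residuals(-b_eq, b_eq, c, f, mu, s_prime, m_prime, 1.0, 1.0)
res_off = equilibrium_residuals(-1.0, 1.0, c, f, mu, s_prime, m_prime, 1.0, 1.0)
```

Plotting the zero sets of the two residuals over a grid of pairs \((a,b)\) is one way to locate candidate equilibrium points before refining them with a root finder.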

3.2 The multiplicative case

In this subsection, we assume that the cost function has a multiplicative form.

Assumption 3.5

The cost function, satisfying Assumption 2.3, factorizes as

$$\begin{aligned} c(x,y)= g(x)h(y), \end{aligned}$$

where the factors satisfy

  1. (i)

    \(g:\mathbb {R}\rightarrow [0,\infty )\) is a convex function, with \(g(x)\ge g(0)\),

  2. (ii)

    \(h:\mathbb {R}\rightarrow (0,\infty )\) is continuous, with \(h(x)\ge h(0)\).

Note that such a multiplicative decomposition is particularly natural when g(x) is interpreted as a standardized representation of the units of a good corresponding to a state x and h(y) as the factor modeling the unit cost based on the market.

We give a first result that follows from Theorem 3.4 if the cost function is multiplicative. In this situation, using condition (i), one of the variables can be obtained as a function of the other. For this purpose, consider the set

$$\begin{aligned} C_a = \lbrace b \in \mathbb {R} :b> x_1^{R(a,b)} \vee a, \ x_2^{R(a,b)} > a, \ \pi _1 (b,R(a,b))=\pi _2(a,R(a,b)) \rbrace . \end{aligned}$$

Observe that if \(C_a=\emptyset \), there are no equilibrium points with lower level a. We then assume that \(C_a \ne \emptyset \) if and only if \(a \le 0\). This means that we search for the equilibrium points in a connected set. Furthermore, for a fixed \(a\le 0\) we denote

$$\begin{aligned} \rho (a)=\inf C_a, \end{aligned}$$
(18)

and

$$\begin{aligned} L(a)=R(a,\rho (a)). \end{aligned}$$

Proposition 3.6

Suppose that the cost function factorizes as in Assumption 3.5, and there exists a point \(a_0 \le 0\) such that the function \(\rho \) defined via (18) is continuous in \((-\infty , a_0]\). Then,

\(\mathrm (C_1)\):

if

$$\begin{aligned} \int _{a_0}^{\rho (a_0)} (\pi _1 (t,L(a_0))-\pi _1(\rho (a_0),L(a_0)))m(dt) +\frac{q_u+q_d}{S'(a_0)} \ge 0, \end{aligned}$$

then there is at least one equilibrium point.

\(\mathrm (C_2)\):

Furthermore, if in \((-\infty , a_0]\),

$$\begin{aligned} \pi _2 (t,L(a_2))- \pi _2(a_2,L(a_2))< \pi _2 (t,L(a_1))- \pi _2(a_1,L(a_1)) \nonumber \\ \forall (a_2,a_1,t) \ \text { s.t. } a_2<a_1<t \le a_0, \end{aligned}$$
$$\begin{aligned} \pi _1(t,L(a_2) )- \pi _1(\rho (a_2),L(a_2)) < \pi _1(t,L(a_1) )- \pi _1(\rho (a_1),L(a_1)) \nonumber \\ \forall (a_2,a_1,t) \ \text { s.t. } \rho (a_2)>\rho (a_1) > t \ge a_0, \end{aligned}$$

and

$$\begin{aligned} \int _r^{l} (\pi _1 (t,R(r,l))-\pi _1(l,R(r,l)))m(dt)+ \frac{q_u+q_d}{S'(r)}>0, \nonumber \hspace{15mm} \\ \forall r \in (a_0,\rho (a_0)), l >r , \ \pi _1 (l,R(r,l))=\pi _2(r,R(r,l)), \end{aligned}$$
(19)

then the equilibrium is unique.

Proof

For the existence of equilibrium points, we need to prove

$$\begin{aligned} \int _{A}^{\rho (A)} (\pi _1 (t,L(A))-\pi _1(\rho (A),L(A)))m(dt) +\frac{q_u+q_d}{S'(A)} < 0, \end{aligned}$$

for some \(A<a_0\). First, observe that the inequality can be rewritten as

$$\begin{aligned}{} & {} \int _{A}^{0} (\pi _2 (t,L(A))-\pi _2(A,L(A)))m(dt) \nonumber \\{} & {} +\int _{0}^{\rho (A)} (\pi _1 (t,L(A))-\pi _1(\rho (A),L(A)))m(dt) +\frac{q_u+q_d}{S'(0)}<0. \end{aligned}$$
(20)

Furthermore, due to the nature of the multiplicative cost, the points \(x_i^y,i=1,2\) defined in Assumption 2.3 can be taken all equal to \(x_i^0\) for each i respectively. Thus, for A negative enough, both integrands are always negative and tend to \(-\infty \) when \(A \rightarrow -\infty \).

Finally, for the uniqueness, condition \((C_2)\) implies that the map defined in \((-\infty ,a_0]\):

$$\begin{aligned} a \rightarrow \int _{a}^{\rho (a)} (\pi _1 (t,L(a))-\pi _1(\rho (a),L(a)))m(dt) +\frac{q_u+q_d}{S'(a)}, \end{aligned}$$

is monotone, thus concluding that the root of this map is unique. \(\square \)
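The rewriting (20) used in the proof above can be verified with a short computation. The following is a sketch under the assumptions \(A<0<\rho (A)\) and \(\pi _1(\rho (A),L(A))=\pi _2(A,L(A))\) (i.e. \(\rho (A)\in C_A\)); it uses \(\pi _1-\pi _2=(q_u+q_d)\mu \) and the identity \(\mu (t)m'(t)=\frac{d}{dt}\frac{1}{S'(t)}\), which follows from the definitions of \(S'\) and \(m'\).

$$\begin{aligned} \int _A^0 \big (\pi _1(t,L(A))-\pi _1(\rho (A),L(A))\big )m(dt)&=\int _A^0 \big (\pi _2(t,L(A))-\pi _2(A,L(A))\big )m(dt) +(q_u+q_d)\int _A^0 \mu (t)m(dt), \\ \int _A^0 \mu (t)m(dt)&=\int _A^0 \frac{d}{dt}\frac{1}{S'(t)}\,dt =\frac{1}{S'(0)}-\frac{1}{S'(A)}. \end{aligned}$$

Adding the term \((q_u+q_d)/S'(A)\) cancels the contribution at A and leaves \((q_u+q_d)/S'(0)\), which is the constant appearing in (20).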

Remark 3.7

Condition \(\mathrm (C_2)\) is a condition on differences of value functions. In particular, if we assume \(\pi _2 \in C^2((-\infty ,a_0) \times \mathbb {R}) \), f defined in the introduction of the section is increasing and L(a) is increasing, then the first inequality in condition \(\mathrm (C_2)\) holds if \(\pi _2\) has negative cross second derivative in \((-\infty , a_0) \times \mathbb {R}\) which is equivalent to the function

$$\begin{aligned} (a, \mu ) \rightarrow \pi _2(a, \langle f,\mu \rangle ), \quad a \in (-\infty ,a_0), \ \mu \text { a probability measure,} \end{aligned}$$

being submodular (see Example 2 of (Dianetti et al. 2021, Assumption 2.9)). A similar analysis can be made with the second inequality (the function in this case is supermodular).

In the particular case of a diffusion without drift, the conditions of the previous proposition are satisfied under the following simple conditions.

Corollary 3.8

Suppose that the cost function factorizes as in Assumption 3.5. Assume furthermore that g is unbounded, convex and with minimum at zero, and the diffusion process (1) has no drift. Then,

(a) the function \(\rho (a)\) is defined as the unique solution of the equation \(g(a)=g(b)\), with \(a \le 0 \le b\), and there exists an equilibrium point,

(b) if the function \(h(R(a,\rho (a)))\) is strictly decreasing for \(a\le 0\), the equilibrium is unique.

Proof

Take \(a_0= 0 \). The equality \(\pi _1(b,R(a,b))=\pi _2 (a,R(a,b))\) is equivalent to \(g(b)=g(a)\). Since g is convex with a minimum at zero, the restriction of g to \(x\ge 0\) is an invertible function, denote it by \(g_{\vert _{[0,\infty )}}\), and we can define

$$\begin{aligned} \rho (a)=\left( g_{\vert _{[0,\infty )}}\right) ^{-1}(g(a)). \end{aligned}$$

We conclude part (a) from the fact that \(\rho (0)=0\) and that condition \((C_1)\) is fulfilled. Condition \((C_2)\) is also verified: the first two statements follow from the monotonicity of h and of \(a \rightarrow g(a,R(a))\), because the inequalities can be rewritten as:

$$\begin{aligned} (g(t)-g(a_2))h(R(a_2,\rho (a_2)))< (g(t)-g(a_1))h(R(a_1,\rho (a_1))) \nonumber \\ \forall (a_2,a_1,t) \ \text { s.t. } a_2<a_1<t \le 0, \end{aligned}$$
$$\begin{aligned} (g(t) - g(\rho (a_2)) )h(R((a_2),\rho (a_2))) < (g(t)-g(a_1))h(R(a_1,\rho (a_1))) \nonumber \\ \forall (a_2,a_1,t) \ \text { s.t. } \rho (a_2)>\rho (a_1) > t \ge 0. \end{aligned}$$

The third condition in \((C_2)\), the integral condition (19), holds automatically, as \((a_0,\rho (a_0))=(0,0)\). \(\square \)

4 Examples

We present below several examples where the equations of Theorem 3.4 can be expressed more explicitly and solved numerically. To help the presentation, for each example, we plot in the \((a,b)\) plane the implicit curves defined by these equations. To this end, we write equation (i) in Theorem 3.4 as

$$\begin{aligned} F(a,b)=\pi _1(b,R(a,b))-\pi _2(a,R(a,b))=0, \end{aligned}$$

and draw first the set of its solutions. We then draw the set determined by condition (ii). Note that in some cases the two curves intersect outside the set \(\lbrace a < b \rbrace \); these points are of no interest for our problem. In all examples the function affecting the market expectation is \(f(x)=x\). Furthermore, for ease of exposition, we present the conclusions and the plots and defer the computations to the Appendix (see Sect. A.1).

4.1 Examples with multiplicative cost

The cost function now has the form

$$\begin{aligned} c(x,y)= \max (-\lambda x,x) (1+ \vert y \vert ^{\beta }),\quad \lambda >0,\quad \beta \ge 1, \end{aligned}$$
(21)

and \(q_d \lambda = q_u\).

Remark 4.1

In this scenario the value \(\max (-\lambda x, x)\) could represent the cost of maintaining a certain property, with the maintenance carried out by a third party. This third party changes the price of its services depending on the demand of the market.

We consider a mean reverting process \(X=\{X_t\}\) that follows the stochastic differential equation

$$\begin{aligned} dX_t= -\theta X_t dt+ \sigma (X_t) dW_t, \end{aligned}$$
(22)

such that \(\sigma \) is a function that satisfies the conditions of Sect. 2 and \(q_d \theta <1\). Under these conditions the function \(c(x,y)\) satisfies Assumption 2.3. First, observe that if we take \(x^y =0\) for all \(y \in \mathbb {R}\), then \(c(x,y ) \ge c(x^y,y)=0\). Second, by taking \(K_y=0, \ \alpha _y= \lambda \wedge 1\) for all \(y \in \mathbb {R}\), condition (3) is satisfied. Finally, observe that for every \(y \in \mathbb {R}\) the maps \( \pi _1(x,y), \ \pi _2(x,y) \) are decreasing in x on \((-\infty ,0)\) and increasing in x on \((0,\infty )\), so both conditions \(\mathrm {(i)}\) and \(\mathrm {(ii)}\) in Assumption 2.3 are satisfied.

In the particular case when \(\sigma \) is constant, we can compute

$$\begin{aligned} R(a,b)= \sqrt{\frac{\sigma ^2 }{ \theta \pi }} \left( \frac{\displaystyle e^{-a^2 \frac{\theta }{\sigma ^2}} -e^{-b^2 \frac{\theta }{\sigma ^2}} }{\textrm{erf}\left( \sqrt{\frac{\theta }{\sigma ^2}} b\right) - \textrm{ erf} \left( \sqrt{\frac{\theta }{\sigma ^2}} a \right) } \right) , \end{aligned}$$

where \(\textrm{erf}(x)=\frac{2}{\sqrt{\pi }}\int _{0}^xe^{-y^2}\,dy\) is the error function. By Proposition 3.6, equilibrium points exist. Furthermore, if \(\sigma \) is even then uniqueness also holds. Again, the calculations are in Appendix 1. In the graphical examples below \(\sigma \) is constant (Figs. 1, 2).
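
The closed form for \(R(a,b)\) can be checked numerically. The following is a minimal sketch, assuming that \(R(a,b)\) is the mean of \(f(x)=x\) under the density proportional to \(e^{-\theta x^2/\sigma ^2}\) restricted to \([a,b]\), and using the convention \(\textrm{erf}(z)=\frac{2}{\sqrt{\pi }}\int _0^z e^{-t^2}dt\) implemented by Python's `math.erf`. The function names and the asymmetric test barriers are illustrative choices, not taken from the paper.

```python
import math

def R_closed_form(a, b, theta, sigma):
    """Closed-form R(a, b) for constant sigma, with the standard erf."""
    k = theta / sigma**2
    num = math.exp(-k * a * a) - math.exp(-k * b * b)
    den = math.erf(math.sqrt(k) * b) - math.erf(math.sqrt(k) * a)
    return math.sqrt(sigma**2 / (theta * math.pi)) * num / den

def R_quadrature(a, b, theta, sigma, n=20000):
    """Midpoint-rule mean of x under the weight exp(-theta x^2 / sigma^2) on [a, b].

    Assumption: this weighted mean is what R(a, b) denotes when f(x) = x.
    """
    k = theta / sigma**2
    h = (b - a) / n
    num = den = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        w = math.exp(-k * x * x)
        num += x * w
        den += w
    return num / den

theta, sigma = 0.4, 2.0                      # parameters of Fig. 1
print(R_closed_form(-0.646, 0.646, theta, sigma))   # symmetric barriers give 0.0
print(R_closed_form(-0.5, 1.5, theta, sigma),
      R_quadrature(-0.5, 1.5, theta, sigma))        # hypothetical asymmetric barriers
```

For symmetric barriers the numerator vanishes, which is consistent with the symmetric equilibrium point reported in Fig. 1.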

Fig. 1

Mean reverting process (22) with multiplicative cost and parameters \(\theta =0.4,q_d=0.1, \lambda =1, \sigma =2, \beta =1\). The equilibrium point (EP) is \((-0.646,0.646)\) with value 0.617

4.2 “Follow the market” examples

The idea is to introduce a cost function in such a way that the player has incentives to follow the market evolution. The cost function is then

$$\begin{aligned} c(x,y)= \vert x - y \vert . \end{aligned}$$

4.2.1 Brownian motion with negative drift

In this case, the driving process \(X=\{X_t\}\) is

$$\begin{aligned} X_t=\mu t+W_t, \end{aligned}$$

where \(\mu <0\). We proceed to prove that Assumption 2.3 is satisfied. First, by taking \(x^y =y\) for all \(y \in \mathbb {R}\), we have \(c(x,y ) \ge c(x^y,y)=0\). Second, by taking \(K_y=\vert y \vert , \ \alpha _y= 1\) for all \(y \in \mathbb {R}\), condition (3) is satisfied. Finally, observe that for every \(y \in \mathbb {R}\) the maps \( \pi _1(x,y), \ \pi _2(x,y) \) are decreasing in x on \((-\infty ,y)\) and increasing in x on \((y,\infty )\), so both conditions \(\mathrm {(i)}\) and \(\mathrm {(ii)}\) in Assumption 2.3 are satisfied.

The problem can be reduced to a one variable problem. The conclusions are:

  • If there is a positive constant C such that

    $$\begin{aligned} C(1+e^{2 \mu C})(1-e^{2 \mu C})^{-1}+(q_u+q_d)\mu + \mu ^{-1}&=0,\\ \Big (\frac{\displaystyle C}{\displaystyle e^{2 \mu C}-1 } \Big ) \frac{\displaystyle 2 e^{2 \mu C}}{\displaystyle \mu } + \frac{\displaystyle -2e^{2 \mu C}+2C \mu +1}{\displaystyle 2 \mu ^2} +q_d +q_u&=0, \end{aligned}$$

    then every point of the set \(\lbrace (a,a+C), a \in \mathbb {R} \rbrace \) is an equilibrium point.

  • Otherwise there are no equilibrium points.

The details can be found in Appendix A.1.1.

Fig. 2

Brownian motion with drift and cost function \(c(x,y)=|x-y|\). On the left (\(q_u+q_d=0.1, \mu =-0.89 \)) the value at equilibrium points is constant 0.848. On the right (\(q_u+q_d=2, \mu =-1\) ) there are no equilibrium points

4.2.2 Ornstein–Uhlenbeck process

In this case, the process \(X=\{X_t\}\) follows the stochastic differential equation

$$\begin{aligned} dX_t= -\theta X_t dt+ \sigma dW_t. \end{aligned}$$

We analyze the symmetric case when \(q:=q_d=q_u\) and \(q \theta <1\). In this situation, taking the same parameters as in the previous example, \(c(x,y)\) satisfies Assumption 2.3. Existence of equilibrium points holds, but uniqueness does not necessarily hold. Essentially, the equation \(\pi _1(b,R(a,b))=\pi _2(a,R(a,b))\) is satisfied when \(a=-b\) by symmetry, so arguments similar to those in the multiplicative case apply. However, the line \(a+b=0\) is not the only set where \(\pi _1(b,R(a,b))=\pi _2(a,R(a,b))\), and uniqueness does not always hold, see Fig. 3.

Fig. 3

Mean reverting process with \(q=0.1\), \(\theta =3\), \(s=2\), \(EP1\sim (-4.26,-1.86)\), \(EP2\sim (-0.78,0.78)\), \(EP3 \sim (1.87,4.27)\) with the values 0.839, 0.55 and 0.84 at each equilibrium point respectively

5 Approximation of Nash equilibria in symmetric N-player games with mean-field interaction

In this section, we present an approximation result for Nash equilibria in the N-player game corresponding to the ergodic mean-field game considered above, when the number of players N tends to infinity. More precisely, we establish that an equilibrium point of the mean-field game of Definition 3.1 is an \(\epsilon \)-Nash equilibrium of the corresponding N-player game of Definition 5.1, for N large enough. These approximation results have been studied for instance in Cao and Guo (2022) and Cao et al. (2023) and the references therein. In order to formulate the approximation result, consider:

  1. (i)

    A filtered probability space \((\Omega ,{\mathcal {F}},\lbrace {\mathcal {F}}_t:t\ge 0^-\rbrace ,{{\textbf{P}}})\) that satisfies the usual conditions, where all the processes are defined.

  2. (ii)

    Diffusion processes \(X,\{X^i\}_{i=1,2,\dots }\), each of which satisfies Eq. (1), driven by respective adapted independent Brownian motions \(W,\{W^i\}_{i=1,2,\dots }\).

  3. (iii)

    The set of admissible controls \({\mathcal {A}}\) of Definition 2.1, that in particular assumes, given an admissible control \(\eta ^i=(U^i,D^i)\), the existence of the controlled process as a solution of

    $$\begin{aligned} dX^{i,\eta ^i}_t= \mu (X^{i,\eta ^i}_t)dt + \sigma (X^{i,\eta ^i}_t)dW^i_t+dU^i_t-dD^i_t, \quad X^i_{0^-}=x^i, \end{aligned}$$
    (23)

    for each \(i=1,2,\dots \)
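
To make the controlled dynamics (23) concrete, the sketch below simulates one player who keeps the state inside an interval \([a,b]\) by reflection, as in the strategies used in the rest of this section. It is only an illustration under stated assumptions: the coefficients are those of the Ornstein–Uhlenbeck example of Sect. 4, the discretization is a projected Euler scheme (a standard approximation, not the construction used in the paper), and the function name, barriers and step sizes are hypothetical.

```python
import math
import random

def simulate_reflected(a, b, theta, sigma, x0, T=500.0, dt=0.002, seed=0):
    """Projected Euler scheme for dX = -theta X dt + sigma dW + dU - dD on [a, b].

    U accumulates the upward pushes at a, D the downward pushes at b.
    Returns the time average of X and the average rates U_T / T and D_T / T.
    """
    rng = random.Random(seed)
    steps = round(T / dt)
    sqdt = math.sqrt(dt)
    x = min(max(x0, a), b)          # start inside the band
    U = D = 0.0
    integral = 0.0
    for _ in range(steps):
        integral += x * dt
        y = x - theta * x * dt + sigma * sqdt * rng.gauss(0.0, 1.0)
        if y < a:                   # push up to the lower barrier
            U += a - y
            y = a
        elif y > b:                 # push down to the upper barrier
            D += y - b
            y = b
        x = y
    horizon = steps * dt
    return integral / horizon, U / horizon, D / horizon

avg, u_rate, d_rate = simulate_reflected(-0.5, 1.5, theta=0.4, sigma=2.0, x0=0.0)
print(avg, u_rate, d_rate)
```

The time average approximates the ergodic mean of the reflected state, and the two rates approximate the long-run control effort per unit of time that enters the ergodic cost through \(q_u\) and \(q_d\).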

For simplicity and coherence we denote by \(X^{i,a,b}\) the solution to (23) when the i-th player chooses reflecting strategies within \(a<b\), denoted respectively by \(U^{i,a}\) and \(D^{i,b}\). As usual, we define a vector of admissible controls by

$$\begin{aligned} \Lambda =(\eta ^1, \dots ,\eta ^N) \end{aligned}$$

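such that \(\eta ^i=(U^i,D^i)\) is an admissible control selected by the player i in the N-player game. Furthermore, we define

$$\begin{aligned} \Lambda ^{-i}=(\eta ^1,\dots ,\eta ^{i-1},\eta ^{i+1},\dots ,\eta ^N), \end{aligned}$$

and, given a real continuous function f(x), denote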

$$\begin{aligned} {\bar{f}}^{-i}_s=\frac{1}{N-1} \sum _{j \ne i}^N f( X_s^{j,\eta ^j}), \quad {\bar{f}}^{a,b,-i}_s=\frac{1}{N-1} \sum _{j \ne i}^N f( X_s^{j,a,b}), \end{aligned}$$
(24)

and, given \(\mu =(U,D)\in {\mathcal {A}}\), for \((\mu , \Lambda ^{-i})\), consider

$$\begin{aligned} V_N^{i}(\mu , \Lambda ^{-i})(x)= \limsup _{T \rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x \Bigg ( \int _0^T c \left( X_s^{i,\mu },{\bar{f}}^{-i}_s\right) ds +q_u U^i_T+q_d D^i_T \Bigg ), \end{aligned}$$
(25)

for a cost function \(c(x,y)\) satisfying Assumption 2.3.

Definition 5.1

For fixed \(\epsilon >0\) and \(N\in \mathbb {N}\), a vector of admissible controls \(\Lambda =(\eta ^1,\dots ,\eta ^N)\) is called an \(\epsilon \)-Nash equilibrium if for all i and all \(x\in \mathbb {R}\),

$$\begin{aligned} V_N^{i}(\eta ^i, \Lambda ^{-i})(x) \le V^i_N(\mu ,\Lambda ^{-i} )(x)+\epsilon ,\quad \hbox { for all}\ \mu \in {\mathcal {A}}. \end{aligned}$$

We are ready to prove that the equilibrium points of the mean-field game are \(\epsilon \)-Nash equilibria for the N-player game in two different situations: (i) with reflecting controls for the players and a cost function that is convex in the second variable, (ii) with general controls in \({\mathcal {A}}\), and the cost function \(c(x,y)=|x-y|\).

Theorem 5.2

Consider a cost function \(c(x,y)\) that satisfies Assumption 2.3, and suppose that the function f(x) in Definition 3.1 is continuous. Assume also that one of the following conditions holds:

  1. (i)

    For every fixed x the function \(y\mapsto c(x,y)\) is convex, and the set of admissible controls for each process \(X^i, \ i=1, \dots ,N\), is the set of reflecting controls instead of \( {\mathcal {A}}\).

  2. (ii)

    We have \(f(x)=x\) and the cost function is \(c(x,y)=\vert x-y \vert \).

Then, if \((a,b)\) is an equilibrium point for the mean-field game driven by X, given \(\epsilon >0\), the vector of controls

$$\begin{aligned} \Lambda ^{a,b}=((U^{1,a},D^{1,b}),\dots ,(U^{N,a},D^{N,b})), \end{aligned}$$
(26)

is an \(\epsilon \)-Nash equilibrium for the N-player game, for N large enough.

In the proof of (i) we will use the following result.

Lemma 5.3

Let \(c(x,y)\) be a positive measurable function such that \(y\mapsto c(x,y)\) is convex for each fixed x, and \((X,Y)\) a random vector. Then

(a) If X and Y are independent,

$$\begin{aligned} {{\textbf{E}}}c(X,{{\textbf{E}}}Y)\le {{\textbf{E}}}c(X,Y). \end{aligned}$$
(27)

(b) In the general case, statement (27) is not true.

Proof of Lemma 5.3

(a) With \(F_X\) and \(F_Y\) the respective distributions of X and Y, we have

$$\begin{aligned} {{\textbf{E}}}c(X,Y)&=\int \left[ \int c(x,y)F_Y(dy)\right] F_X(dx)\\&\ge \int c\left( x,\int yF_Y(dy)\right) F_X(dx)={{\textbf{E}}}c(X,{{\textbf{E}}}Y). \end{aligned}$$

To see (b), consider \(c(x,y)=|x-y|\), a standard normal random variable \(X\sim {\mathcal {N}}(0,1)\), and the random vector \((X,Y)=(X,X)\). We have

$$\begin{aligned} {{\textbf{E}}}c(X,Y)={{\textbf{E}}}|X-X|=0<\sqrt{2\over \pi }={{\textbf{E}}}c(X,{{\textbf{E}}}Y)={{\textbf{E}}}|X|, \end{aligned}$$

giving the counter-example that concludes the proof. \(\square \)
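
Both parts of Lemma 5.3 can be illustrated by simulation. The following Monte Carlo sketch uses \(c(x,y)=|x-y|\) and standard normal samples; the sample size, the seed and the function name are arbitrary choices made for illustration.

```python
import random

def check_lemma(n=200_000, seed=0):
    """Estimate the three expectations appearing in Lemma 5.3 for c(x, y) = |x - y|."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [rng.gauss(0.0, 1.0) for _ in range(n)]   # drawn independently of xs

    def c(x, y):
        return abs(x - y)

    def mean(values):
        return sum(values) / len(values)

    # E c(X, EY) with EY = 0; this equals E|X| in both parts of the lemma.
    lhs = mean([c(x, 0.0) for x in xs])
    # (a) independent pair (X, Y): E c(X, Y) should dominate lhs.
    rhs_independent = mean([c(x, y) for x, y in zip(xs, ys)])
    # (b) dependent pair (X, X): E c(X, X) = 0, so (27) fails.
    rhs_dependent = mean([c(x, x) for x in xs])
    return lhs, rhs_independent, rhs_dependent

lhs, rhs_independent, rhs_dependent = check_lemma()
print(lhs, rhs_independent, rhs_dependent)
```

In the independent case the estimate of \({{\textbf{E}}}c(X,Y)\) should be close to \(2/\sqrt{\pi }\), above the estimate of \({{\textbf{E}}}|X|\), which should be close to \(\sqrt{2/\pi }\); in the dependent case the right-hand side is exactly zero.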

Proof of (i) in Theorem 5.2

Define the function

$$\begin{aligned} V:{\mathcal {A}} \times \lbrace (a,b):a<b \rbrace \rightarrow \mathbb {R} \end{aligned}$$
(28)

by the formula

$$\begin{aligned} V(\mu ,(a,b))= \limsup _{T \rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x \left( \int _0^T c\big (X_s^\mu ,{{\textbf{E}}}_x(f(X^{a,b}_s))\big ) ds +q_u U_T+q_d D_T \right) , \end{aligned}$$

where \(\mu =(U,D)\). Take \(\Lambda ^{a,b}\) as in (26). The starting point is the inequality provided by the equilibrium definition:

$$\begin{aligned} V((U^{a},D^{b}),(a,b))\le V(\mu ,(a,b)),\quad \hbox { for any}\ \mu \in {\mathcal {A}}. \end{aligned}$$
(29)

Second, by equidistribution of the players’ driving processes,

$$\begin{aligned} {{\textbf{E}}}_xc(X^\mu _s,{{\textbf{E}}}_x(f(X^{a,b}_s)))={{\textbf{E}}}_xc(X^\mu _s,{{\textbf{E}}}_x({{\bar{f}}}^{a,b,-i}_s)). \end{aligned}$$

Now, taking \(c<d\) and \(\mu =(U^c,D^d)\), by convexity and independence between the coordinates, we apply (a) in Lemma 5.3:

$$\begin{aligned} {{\textbf{E}}}_xc(X^{c,d}_s,{{\textbf{E}}}_x({{\bar{f}}}^{a,b,-i}_s))\le {{\textbf{E}}}_xc(X^{c,d}_s,{{\bar{f}}}^{a,b,-i}_s). \end{aligned}$$

Integrating in time, taking expectation and ergodic limits, and combining with (29), it follows that

$$\begin{aligned} V((U^{a},D^{b}),(a,b))\le V((U^{c},D^{d}),(a,b))\le V^i_N((U^c,D^d),\Lambda ^{a,b,-i}_N). \end{aligned}$$
(30)

Now, as f(x) is continuous, the set \(f([a,b])\) is a closed interval, denote it by \([m,M]\), and observe that

$$\begin{aligned} (X^{i,a,b}_s,{{\bar{f}}}^{a,b,-i}_s)\in [a,b]\times [m,M], \end{aligned}$$

which is a product of closed intervals. Then, as \(c(x,y)\) is uniformly continuous in this compact domain, given \(\epsilon \) there exists \(\delta >0\) such that

$$\begin{aligned} |c(X^\mu _s,{{\bar{f}}}^{a,b,-i}_s)-c(X^\mu _s,{{\textbf{E}}}_x(f(X^{a,b}_s)))|\le \frac{\epsilon }{2}, \end{aligned}$$

whenever \(|{{\bar{f}}}^{a,b,-i}_s-{{\textbf{E}}}_x(f(X^{a,b}_s))|\le \delta \). Now we apply Hoeffding’s inequality for bounded random variables \(m\le f(X^{j,a,b})\le M\), obtaining,

$$\begin{aligned} {{\textbf{P}}}\left( |{{\bar{f}}}^{a,b,-i}_s-{{\textbf{E}}}_x(f(X^{a,b}_s))|\ge \delta \right) \le 2e^{-{2\delta ^2(N-1)\over (M-m)^2}}. \end{aligned}$$

Finally, denoting \(\Vert c\Vert _\infty =\max \{|c(x,y)|:a\le x\le b, m\le y\le M\}\), we have

$$\begin{aligned}&\left| \frac{1}{T}{{\textbf{E}}}_x\int _0^T\left( c(X^{i,a,b}_s,{{\bar{f}}}^{a,b,-i}_s)-c(X^{i,a,b}_s,{{\textbf{E}}}_x(f(X^{a,b}_s)))\right) \,ds\right| \\&\quad \le \frac{\epsilon }{2}+\frac{2\Vert c\Vert _\infty }{T}\int _0^T{{\textbf{P}}}_x\left( |{{\bar{f}}}^{a,b,-i}_s-{{\textbf{E}}}_x(f(X^{a,b}_s))|\ge \delta \right) \,ds\\&\quad \le \frac{\epsilon }{2}+4\Vert c\Vert _\infty e^{-{2\delta ^2(N-1)\over (M-m)^2}}\le \epsilon , \end{aligned}$$

for N large enough. From this it follows that, for these values of N,

$$\begin{aligned} \left| V((U^{a},D^{b}),(a,b))-V^i_N((U^{a},D^{b}),\Lambda ^{a,b,-i}_N)\right| \le \epsilon , \end{aligned}$$

concluding, in view of (30), the proof of (i). \(\square \)
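
The last step of the proof gives an explicit, if conservative, threshold for N: the requirement \(4\Vert c\Vert _\infty e^{-2\delta ^2(N-1)/(M-m)^2}\le \epsilon /2\) is equivalent to \(N-1\ge (M-m)^2\log (8\Vert c\Vert _\infty /\epsilon )/(2\delta ^2)\). The sketch below evaluates this threshold. Here \(\delta \) is treated as a given input (in the proof it comes from the uniform continuity of c), and all numerical values and function names are hypothetical.

```python
import math

def hoeffding_tail(n_players, delta, m, M):
    """Hoeffding bound 2 exp(-2 delta^2 (N-1) / (M-m)^2) for an average of N-1 terms in [m, M]."""
    return 2.0 * math.exp(-2.0 * delta**2 * (n_players - 1) / (M - m) ** 2)

def min_players(eps, delta, c_sup, m, M):
    """Smallest N >= 2 such that 4 * c_sup * exp(-2 delta^2 (N-1) / (M-m)^2) <= eps / 2.

    c_sup stands for the sup norm of c on [a, b] x [m, M]; delta is the
    continuity modulus associated with eps / 2 (an input, not computed here).
    """
    ratio = 8.0 * c_sup / eps
    if ratio <= 1.0:
        return 2                     # the bound already holds with two players
    n_minus_1 = math.ceil((M - m) ** 2 * math.log(ratio) / (2.0 * delta**2))
    return max(2, n_minus_1 + 1)

# Hypothetical values, chosen only to show the order of magnitude.
N = min_players(eps=0.1, delta=0.05, c_sup=2.0, m=-1.0, M=1.0)
print(N, 2.0 * 2.0 * hoeffding_tail(N, 0.05, -1.0, 1.0))
```

The threshold grows like \(\delta ^{-2}\log (1/\epsilon )\), so in practice the size of \(\delta \) dominates the number of players required by this argument.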

Proof of (ii) in Theorem 5.2

As \(f(x)=x\), we denote

$$\begin{aligned} {\bar{X}}^{a,b,-i}_{s,N}=\frac{1}{N-1} \sum _{j \ne i}^N X_s^{j,a,b}. \end{aligned}$$

As \((a,b)\) is an equilibrium point of the mean-field game, given \(\epsilon >0\), we have to prove that

$$\begin{aligned} V_N^{i}((U^{i,a},D^{i,b}), \Lambda ^{a,b,-i})\le V_N^{i}(\mu ,\Lambda ^{a,b,-i})+\epsilon , \end{aligned}$$
(31)

for any strategy \(\mu \in {\mathcal {A}}\), for N large enough. Observe now that, given a strategy \(\eta \), if for some \(N_0\) and some \(i_0\), we have \(V_{N_0}^{i_0}(\eta ,\Lambda ^{a,b,-i_0})<\infty \), then

$$\begin{aligned} \limsup _{T \rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x \int _0^T\left| X_s^\eta -{\bar{X}}^{a,b,-i_0}_{s,N_0} \right| ds=:I_0<\infty ,\\ \limsup _{T \rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x (q_u U_T)=:J_0<\infty ,\\ \limsup _{T \rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x (q_d D_T)=:K_0<\infty . \end{aligned}$$

By adding and subtracting \({\bar{X}}^{a,b,-i_0}_{s,N_0}\) and using the triangle inequality, it follows that

$$\begin{aligned} \limsup _{T \rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x \int _0^T\left| X_s^\eta \right| ds\le I_0+\max (|a|,|b|), \end{aligned}$$

and in consequence

$$\begin{aligned} \max (V(\eta ,(a,b)),V_N^{i}(\eta ,\Lambda ^{a,b,-i}))\le I_0+J_0+K_0+2\max (|a|,|b|), \end{aligned}$$

for all N and i. Then, in order to prove (31), it is enough to consider these strategies \(\eta \). Now, as \((a,b)\) is an equilibrium point, we have

$$\begin{aligned}&V_N^{i}((U^{i,a},D^{i,b}),\Lambda ^{a,b,-i})- V_N^{i}(\eta ,\Lambda ^{a,b,-i})\\&\quad =V_N^{i}((U^{i,a},D^{i,b}),\Lambda ^{a,b,-i})- V(\eta ,(a,b)) +V(\eta ,(a,b))- V_N^{i}(\eta ,\Lambda ^{a,b,-i})\\&\quad \le V_N^{i}((U^{i,a},D^{i,b}),\Lambda ^{a,b,-i})- V((U^a,D^b),(a,b))\\&\qquad +V(\eta ,(a,b))- V_N^{i}(\eta ,\Lambda ^{a,b,-i})\\&\quad \le 2\sup _{\eta }\left| V(\eta ,(a,b))- V_N^{i}(\eta ,\Lambda ^{a,b,-i}) \right| . \end{aligned}$$

By the triangle inequality, given \(\eta \), we have

$$\begin{aligned}&\left| V(\eta ,(a,b))- V_N^{i}(\eta ,\Lambda ^{a,b,-i})\right| \\&\quad \le \limsup _{T \rightarrow \infty } \frac{1}{T} {{\textbf{E}}}_x \int _0^T\left| |X_s^\eta -{\bar{X}}^{a,b,-i}_s|-|X_s^\eta -{{\textbf{E}}}_x(X_s^{a,b})| \right| ds\\&\quad \le \limsup _{T\rightarrow \infty } \frac{1}{T} \int _0^T{{\textbf{E}}}_x|{\bar{X}}^{a,b,-i}_s-{{\textbf{E}}}_x(X_s^{a,b})| ds\le {b-a\over \sqrt{N-1}}, \end{aligned}$$

because

$$\begin{aligned} {{\textbf{E}}}_x|{\bar{X}}^{a,b,-i}_s-{{\textbf{E}}}_x(X_s^{a,b})|\le \sqrt{{1\over N-1}{\textbf{var}}_x(X_s^{a,b})}\le {b-a\over \sqrt{N-1}}, \end{aligned}$$

concluding the proof. \(\square \)
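
The estimates above show that \(\Lambda ^{a,b}\) is an \(\epsilon \)-Nash equilibrium as soon as \(2(b-a)/\sqrt{N-1}\le \epsilon \). As a rough illustration, the sketch below computes the smallest such N and checks the bound \({{\textbf{E}}}|{\bar{X}}-{{\textbf{E}}}X|\le (b-a)/\sqrt{N-1}\) by simulation. The simulation replaces the reflected states by independent uniform draws on \([a,b]\), which is a simplifying stand-in and not the law of the reflected diffusion; the function names and the value of \(\epsilon \) are illustrative.

```python
import math
import random

def players_needed(a, b, eps):
    """Smallest N >= 2 such that 2 * (b - a) / sqrt(N - 1) <= eps."""
    return max(2, math.ceil(4.0 * (b - a) ** 2 / eps**2) + 1)

def mean_abs_deviation(a, b, n_players, trials=2000, seed=0):
    """Monte Carlo estimate of E|average - mean| for N-1 i.i.d. draws on [a, b].

    Uniform draws are an assumption made for illustration only; any law
    supported on [a, b] satisfies the same bound (b - a) / sqrt(N - 1).
    """
    rng = random.Random(seed)
    n = n_players - 1
    center = 0.5 * (a + b)           # mean of the uniform law on [a, b]
    total = 0.0
    for _ in range(trials):
        avg = sum(rng.uniform(a, b) for _ in range(n)) / n
        total += abs(avg - center)
    return total / trials

a, b = -0.78, 0.78                   # equilibrium point EP2 reported in Fig. 3
print(players_needed(a, b, eps=0.1))
print(mean_abs_deviation(a, b, 50), (b - a) / math.sqrt(49))
```

The required N grows like \(\epsilon ^{-2}\), which reflects the \(1/\sqrt{N-1}\) rate of the variance bound rather than any property of the particular cost function.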

5.1 Statements and Declarations

None of the authors have any conflicts of interest. The second and third authors are supported by CSIC - Proyecto Grupos nr. 22620220100043UD, Universidad de la República, Uruguay.

The authors are indebted to the two reviewers of the paper for a careful reading of the original manuscript and many useful suggestions, and in particular for detecting a flaw that led to the presentation of Lemma 5.3 and the subsequent reformulation of Theorem 5.2.