In this chapter we use the abstract sufficient conditions from Chap. 9 to prove large and moderate deviation principles for small noise finite dimensional jump-diffusions. We will consider only Laplace principles rather than uniform Laplace principles, since, as was noted in Chap. 9, the extension from the nonuniform to the uniform case is straightforward. The first general results on large deviation principles for jump-diffusions of the form considered in this chapter are due to Wentzell [245–248] and Freidlin and Wentzell [140]. The conditions for an LDP identified in the current chapter relax some of the assumptions made in these works. Results on moderate deviation principles in this chapter are based on the recent work [41]. We do not aim for maximal generality, and from the proofs it is clear that many other models (e.g., time inhomogeneous jump diffusions, SDEs with delay) can be treated in an analogous fashion.

The perspective here is different from that of the discrete time model of Chap. 4. In particular, the emphasis is on viewing the stochastic model as a relatively well behaved mapping on a fixed noise space consisting of a Brownian motion and one or more Poisson random measures. This has important consequences for models with degeneracy, i.e., systems for which the noise does not push the state in all directions. In the general discrete time setting, processes of this sort required complicated assumptions such as Condition 4.8 and a delicate mollification argument as in Sect. 4.8. In contrast, the degeneracy is essentially irrelevant when the process of interest can be viewed as a nice mapping (at least asymptotically) on a fixed noise space. This distinction becomes even more significant for the infinite dimensional models of Chap. 11, where the analogous degeneracy is ubiquitous. In this chapter we use Lipschitz continuity assumptions on the coefficients to guarantee that the mapping is well behaved. However, this is not necessary, especially with regard to Poisson noise, and for one such weakening we refer to Sect. 13.3.

The chapter is organized as follows. Section 10.1 introduces the basic stochastic process model that will be considered. Conditions under which the stochastic equation and its deterministic analogue have unique solutions are given. In Sect. 10.2 we use Theorem 9.2 to establish an LDP for the solution under additional integrability conditions. Finally, in Sect. 10.3 we apply Theorem 9.9 to prove a moderate deviations result. For this result the integrability conditions we require are somewhat weaker, though additional smoothness conditions on the coefficients are assumed so that one can easily expand around the LLN limit. Also, as with the discrete time model of Chap. 5, the proof of tightness is more involved than for the large deviation counterpart. We use the notation from Chap. 9, except that in this chapter, W is a finite dimensional standard Brownian motion, i.e., \(\mathscr {H}_{0}=\mathscr {H}=\mathbb {R}^{d}\), \(\mathbb {W}=\mathscr {C} ([0,T]:\mathbb {R}^{d})\), and \(\varLambda \) is the identity operator.

1 Small Noise Jump-Diffusion

We consider small noise stochastic differential equations (SDEs) of the form

$$\begin{aligned} X^{\varepsilon }(t)&=x_{0}+\int _{0}^{t}b(X^{\varepsilon }(s))ds+\sqrt{\varepsilon }\int _{0}^{t}\sigma (X^{\varepsilon }(s))dW(s)\nonumber \\&\quad \qquad +\varepsilon \int _{\mathscr {X}_{t}}G(X^{\varepsilon } (s-), y)N^{1/\varepsilon }(ds\times dy), \end{aligned}$$
(10.1)

where W is a standard d-dimensional Wiener process, \(N^{1/\varepsilon }\) is a PRM with intensity measure \(\lambda _{T}\times \nu \) (see Definition 8.11) constructed from \(\bar{N}\) as in (8.16) with \(\varphi =1/\varepsilon \), and \(W, \bar{N}\) satisfy (a)–(c) in Sect. 9.1. The coefficients are assumed to satisfy the following condition.
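
Although the analysis to follow is purely theoretical, it may help to keep a concrete simulation of (10.1) in mind. The following is a minimal sketch, under assumptions that are ours and made only for illustration: \(d=1\), \(\nu \) a finite measure (so that \(N^{1/\varepsilon }\) has almost surely finitely many points and can be generated as a compound Poisson process of rate \(\nu (\mathscr {X})/\varepsilon \)), and simple hypothetical coefficients b, \(\sigma \), G.

```python
import numpy as np

def simulate(eps, T=1.0, x0=0.0, dt=1e-3, nu_mass=1.0, seed=0):
    """Euler-type scheme for the small noise jump-diffusion (10.1) with d = 1.

    Illustrative coefficients (not from the text): b(x) = -x, sigma(x) = 1,
    G(x, y) = y, and nu = nu_mass * N(0, 1), so that jump marks are Gaussian
    and jump times form a Poisson process of rate nu_mass / eps.
    """
    rng = np.random.default_rng(seed)
    b = lambda x: -x
    sigma = lambda x: 1.0
    G = lambda x, y: y

    # Jump times and marks of the PRM N^{1/eps} on [0, T] x R.
    num_jumps = rng.poisson(nu_mass * T / eps)
    jump_times = np.sort(rng.uniform(0.0, T, size=num_jumps))
    marks = rng.standard_normal(num_jumps)  # marks y distributed as nu/nu_mass

    x, path, j = x0, [x0], 0
    for k in range(int(T / dt)):
        # Drift plus Brownian increment, with the sqrt(eps) noise scaling.
        x += b(x) * dt + np.sqrt(eps) * sigma(x) * np.sqrt(dt) * rng.standard_normal()
        # Each jump in (t, t + dt] contributes eps * G(x-, y), as in (10.1).
        while j < num_jumps and jump_times[j] <= (k + 1) * dt:
            x += eps * G(x, marks[j])
            j += 1
        path.append(x)
    return np.array(path)

# Jumps of size O(eps) arriving at rate O(1/eps): as eps -> 0 the paths
# concentrate around a deterministic limit (the process X^0 of Sect. 10.3).
print(simulate(0.01)[-1])
```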

Condition 10.1

The functions \(b:\mathbb {R}^{d}\rightarrow \mathbb {R}^{d}\), \(\sigma :\mathbb {R}^{d}\rightarrow \mathbb {R}^{d\times d}\), and \(G:\mathbb {R} ^{d}\times \mathscr {X}\rightarrow \mathbb {R}^{d}\) satisfy

(a) for some \(L_{b}\in (0,\infty )\),

$$ \Vert b(x)-b(\bar{x})\Vert \le L_{b}\Vert x-\bar{x}\Vert ,\ x,\bar{x} \in \mathbb {R}^{d}; $$

(b) for some \(L_{\sigma }\in (0,\infty )\),

$$ \Vert \sigma (x)-\sigma (\bar{x})\Vert \le L_{\sigma }\Vert x-\bar{x} \Vert ,\ x,\bar{x}\in \mathbb {R}^{d}; $$

(c) for some \(L_{G}\in \mathscr {L}^{1}(\nu )\),

$$ \Vert G(x,y)-G(\bar{x}, y)\Vert \le L_{G}(y)\Vert x-\bar{x}\Vert ,\ x,\bar{x} \in \mathbb {R}^{d},\ y\in \mathscr {X}; $$

(d) for some \(M_{G}\in \mathscr {L}^{1}(\nu )\),

$$ \Vert G(x, y)\Vert \le M_{G}(y)(1+\Vert x\Vert ),\ x\in \mathbb {R}^{d} ,\ y\in \mathscr {X}. $$

The following result follows by standard arguments (see Theorem IV.9.1 of [159]). It says that under Condition 10.1, Eq. (10.1) has a unique pathwise solution. In applying results from Chap. 9, we take \(\mathbb {U}=\mathscr {D}([0,T]:\mathbb {R}^{d})\), i.e., the space of \(\mathbb {R}^{d}\)-valued right-continuous functions with left limits and the usual Skorokhod topology [24, Chap. 3, Sect. 12].

Theorem 10.2

Fix \(x_{0}\in \mathbb {R}^{d}\), and assume Condition 10.1. Then for each \(\varepsilon >0\), there is a measurable map \({\mathscr {G}}^{\varepsilon }:\mathbb {V}\rightarrow \mathscr {D}([0,T]:\mathbb {R} ^{d})\) such that for every probability space \((\tilde{\varOmega },\tilde{\mathscr {F}},\tilde{P})\) on which are given a d-dimensional Brownian motion \(\tilde{W}\) and an independent Poisson random measure \(\tilde{N}_{\varepsilon }\) on \(\mathscr {X}_{T}\) with intensity measure \(\varepsilon ^{-1}\nu _{T}\), \(\tilde{X}^{\varepsilon }\doteq {\mathscr {G}}^{\varepsilon }(\sqrt{\varepsilon }\tilde{W},\varepsilon \tilde{N}_{\varepsilon })\) is an \(\tilde{\mathscr {F}} _{t}\doteq \sigma \{\tilde{W}(s),\tilde{N}_{\varepsilon }([0,s]\times B),s\le t, B\in \mathscr {B}(\mathscr {X}),\nu (B)<\infty \}\)-adapted process that is the unique solution of the stochastic integral equation

$$\begin{aligned} \tilde{X}^{\varepsilon }(t)&=x_{0}+\int _{0}^{t}b(\tilde{X}^{\varepsilon }(s))ds+\sqrt{\varepsilon }\int _{0}^{t}\sigma (\tilde{X}^{\varepsilon }(s))d\tilde{W}(s)\nonumber \\&\quad \qquad +\varepsilon \int _{\mathscr {X}_{t}}G(\tilde{X}^{\varepsilon }(s-), y)\tilde{N}_{\varepsilon }(ds\times dy), \end{aligned}$$
(10.2)

for \(t\in [0,T]\). In particular, \(X^{\varepsilon }={\mathscr {G} }^{\varepsilon }(\sqrt{\varepsilon }W,\varepsilon N^{1/\varepsilon })\) is the unique solution of (10.1).

2 An LDP for Small Noise Jump-Diffusions

The solution \(X^{\varepsilon }\) of (10.1) is a \(\mathscr {D}([0,T]:\mathbb {R}^{d})\)-valued random variable. To prove a large deviation principle for \(\left\{ X^{\varepsilon }\right\} _{\varepsilon >0}\) as \(\varepsilon \rightarrow 0\), we will assume the following additional condition on the coefficient function G. For \(\rho \in (0,\infty )\), let \(\mathscr {L} _{\text{ exp }}^{\rho }\) be the collection of all measurable \(\theta :\mathscr {X}\rightarrow \mathbb {R}_{+}\) such that whenever \(A\in \mathscr {B}(\mathscr {X})\) satisfies \(\nu (A)<\infty \),

$$\begin{aligned} \int _{A}e^{\rho \theta (y)}\nu (dy)<\infty . \end{aligned}$$
(10.3)

Let \(\mathscr {L}_{\text{ exp }}\doteq \cap _{\rho \in (0,\infty )} \mathscr {L}_{\text{ exp }}^{\rho }\).
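
To illustrate the definition (the example is ours), suppose that \(\mathscr {X}=\mathbb {R}\), \(\nu (dy)=e^{-\lambda |y|}dy\) for some \(\lambda \in (0,\infty )\), and \(\theta (y)=|y|\). Then for every \(A\in \mathscr {B}(\mathbb {R})\) and \(\rho <\lambda \),

$$ \int _{A}e^{\rho \theta (y)}\nu (dy)\le \int _{\mathbb {R}}e^{-(\lambda -\rho )|y|}dy<\infty , $$

so \(\theta \in \mathscr {L}_{\text{ exp }}^{\rho }\) for every \(\rho <\lambda \), while taking \(A=\mathbb {R}\) (for which \(\nu (\mathbb {R})<\infty \)) shows that (10.3) fails for \(\rho \ge \lambda \). In particular \(\theta \notin \mathscr {L}_{\text{ exp }}\) for such \(\nu \), whereas if \(\nu \) has Gaussian tails, e.g., \(\nu (dy)=e^{-y^{2}}dy\), then \(\theta \in \mathscr {L}_{\text{ exp }}\).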

Condition 10.3

\(M_{G}\in \mathscr {L}_{\text{ exp }}\) and \(L_{G} \in \mathscr {L}_{\text{ exp }}^{\rho }\) for some \(\rho >0\).

Remark 10.4

This exponential integrability condition on jump distributions is a natural requirement for the model; it should be compared with Condition 4.3, assumed in the study of small noise discrete time Markov recursive systems. Consider, for example, the case in which \(\mathscr {X}=\mathbb {R}^{d}\), \(\nu \in \mathscr {P}(\mathbb {R}^{d})\) satisfies \(\int _{\mathbb {R}^{d}} e^{\left\langle \alpha , y\right\rangle }\nu (dy)<\infty \) for all \(\alpha \in \mathbb {R}^{d}\), and for some \(d\times d\) matrix A and a vector \(w\in \mathbb {R}^{d}\),

$$ G(x,y)=Ay+y\left\langle x, w\right\rangle . $$

Then G satisfies parts (c) and (d) of Condition 10.1, as well as Condition 10.3. In the general case (in which \(\nu \) need not be a probability measure), note that the local rate function corresponding to just the jump part of the SDE (10.2) would be

$$\begin{aligned} L(x,\beta )=\inf \left[ \int _{\mathscr {X}}\ell (g(y))\nu (dy):\int _{\mathscr {X} }G(x, y)g(y)\nu (dy)=\beta \right] . \end{aligned}$$
(10.4)

Using convex duality and that \(\ell (b)\) is dual to \((e^{a}-1)\), for this to be superlinear in \(\beta \) [a condition needed for the rate function on path space to have compact level sets in the usual topology of \(\mathscr {D} ([0,T]:\mathbb {R}^{d})\)], one needs

$$\begin{aligned} \int _{\mathscr {X}}\left[ e^{\left\langle \alpha ,G(x, y)\right\rangle }-1\right] \nu (dy)<\infty \text { for all }\alpha \in \mathbb {R}^{d}. \end{aligned}$$
(10.5)

However, this follows under Conditions 10.1 and 10.3. One can break the integral in (10.5) according to \(\{y:M_{G} (y)>1\}\) and \(\{y:M_{G}(y)\le 1\}\). Then \(M_{G}\in \mathscr {L}^{1}(\nu )\) implies \(\nu (\{y:M_{G}(y)>1\})<\infty \), and the integral over \(\{y:M_{G} (y)>1\}\) is finite due to Condition 10.3. The mean value theorem gives the bound \(\left\| \alpha \right\| e^{\left\| \alpha \right\| }M_{G}(y)\) for the integrand on \(\{y:M_{G}(y)\le 1\}\), and finiteness for the corresponding integral follows from \(M_{G}\in \mathscr {L}^{1}(\nu )\).
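
In more detail, since \(|e^{u}-1|\le |u|e^{|u|}\) for all \(u\in \mathbb {R}\), for fixed x and \(y\in \{M_{G}\le 1\}\) we have

$$ \left| e^{\left\langle \alpha ,G(x, y)\right\rangle }-1\right| \le \Vert \alpha \Vert \Vert G(x, y)\Vert e^{\Vert \alpha \Vert \Vert G(x, y)\Vert }\le \Vert \alpha \Vert (1+\Vert x\Vert )e^{\Vert \alpha \Vert (1+\Vert x\Vert )}M_{G}(y), $$

and the right side is \(\nu \)-integrable since \(M_{G}\in \mathscr {L}^{1}(\nu )\); the bound stated above corresponds to absorbing the factors involving \(1+\Vert x\Vert \).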

We recall that \(S\doteq \cup _{n\in \mathbb {N}}S_{n}\), with each \(S_{n}\doteq S_{n}^{W}\times S_{n}^{N}\) compact in the appropriate topology. The proof of the following theorem proceeds using a standard argument and is given after Lemma 10.8.

Theorem 10.5

Fix \(x_{0}\in \mathbb {R}^{d}\), and assume Conditions 10.1 and 10.3. Then for each \(q=(f, g)\in S\), there is a unique \(\xi =\xi _{q}\in \mathscr {C}([0,T]:\mathbb {R}^{d})\) such that for all \(t\in [0,T]\),

$$\begin{aligned} \xi (t)&=x_{0}+\int _{0}^{t}b(\xi (s))ds+\int _{0}^{t}\sigma (\xi (s))f(s)ds\nonumber \\&\quad +\int _{\mathscr {X}_{t}}G(\xi (s),y)g(s, y)\nu (dy)ds. \end{aligned}$$
(10.6)

For \(q=(f, g)\in S\), let \(\xi =\xi _{q}\) denote the solution of (10.6). Let \({I}:\mathscr {D}([0,T]:\mathbb {R}^{d})\rightarrow [0,\infty ]\) be defined by

$$\begin{aligned} I(\phi )\doteq \inf _{q\in S:\phi =\xi _{q}}\bar{L}_{T}(q)\text {,} \end{aligned}$$
(10.7)

where \(\bar{L}_{T}(q)\doteq L_{T}^{W}(f)+L_{T}^{N}(g)\), with the individual costs defined as in (9.1).
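
As an illustration of (10.7) (this special case is not used below), consider \(G\equiv 0\) with \(\sigma (x)\) invertible for every x. Then g does not appear in (10.6), the infimum over g is attained at \(g\equiv 1\) with \(L_{T}^{N}(1)=0\), and (10.7) reduces to the classical Freidlin–Wentzell rate function

$$ I(\phi )=\frac{1}{2}\int _{0}^{T}\left\| \sigma ^{-1}(\phi (s))\left( \dot{\phi }(s)-b(\phi (s))\right) \right\| ^{2}ds $$

for absolutely continuous \(\phi \) with \(\phi (0)=x_{0}\), and \(I(\phi )=\infty \) otherwise.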

Theorem 10.6

Assume Conditions 10.1 and 10.3. Then I is a rate function on \(\mathscr {D}([0,T]:\mathbb {R}^{d})\) and \(\left\{ X^{\varepsilon }\right\} _{\varepsilon >0}\) satisfies a large deviation principle on \(\mathscr {D}([0,T]:\mathbb {R}^{d})\) with rate function I.

Following our standard convention, the proof is given for \(T=1\). Before proceeding with the proof, we present two lemmas. The first will be used to prove tightness. Recall that \(g\in S_{n}^{N}\) means that \(\int _{[0,T]\times \mathscr {X}}\ell (g(u, y))\nu (dy)du\le n\). For a function \(f:[0,1]\rightarrow \mathbb {R}^{k}\), define \(\Vert f\Vert _{\infty , t}\doteq \sup _{0\le s\le t}\Vert f(s)\Vert \) for \(t\in [0,1]\). Note that the constant \(c(\delta , n)\) appearing in the lemma may also depend on the function \(\theta \). However, when the lemma is used, this will be a fixed quantity, such as \(M_G(y)\), that is associated with a particular process model under consideration.

Lemma 10.7

Let \(\theta \in \mathscr {L}_{\text{ exp }}\) and suppose that \(\nu (\{\theta >1\})<\infty \). Then for every \(\delta >0\) and \(n\in \mathbb {N}\), there exists \(c(\delta , n)\in (1,\infty )\) such that for all \(\tilde{\theta }:\mathscr {X}\rightarrow \mathbb {R}_{+}\) satisfying \(\tilde{\theta }\le \theta \), every measurable map \(f:[0,1]\rightarrow \mathbb {R}_{+}\), and all \(0\le s\le t\le 1\),

$$\begin{aligned}&\sup _{g\in S_{n}^{N}}\int _{(s,t]\times \mathscr {X}}f(u)\tilde{\theta }(y)g(u,y)\nu (dy)du\nonumber \\&\quad \le c(\delta ,n)\left( \int _{\mathscr {X}}\tilde{\theta } (y)\nu (dy)\right) \left( \int _{s}^{t}f(u)du\right) +\delta \left\| f\right\| _{\infty , 1}. \end{aligned}$$
(10.8)

Proof

Let \(f:[0,1]\rightarrow \mathbb {R}_{+}\), \(g\in S_{n}^{N}\), and \(\delta >0\) be given. Then for each \(m\in (0,\infty )\),

$$\begin{aligned} \int _{(s,t]\times \mathscr {X}}f(u)\tilde{\theta }(y)g(u,y)\nu (dy)\, du=T_{1} (m)+T_{2}(m), \end{aligned}$$
(10.9)

where

$$ T_{1}(m)\doteq \int _{(s,t]\times \{\theta \le m\}}f(u)\tilde{\theta } (y)g(u, y)\nu (dy)du, $$

and

$$ T_{2}(m)\doteq \int _{(s,t]\times \{\theta >m\}}f(u)\tilde{\theta }(y)g(u, y)\nu (dy)du. $$

Using part (a) of Lemma 9.6 with \(\sigma =k, a=\theta (y)\) and \(b=g(u, y)\), for each \(k\in \mathbb {N}\) we have the bound

$$ T_{2}(m)\le \left\| f\right\| _{\infty , 1}\left( \int _{\{\theta >m\}}e^{k\theta (y)}\nu (dy)+\frac{n}{k}\right) . $$

Also, for each \(\beta \in (1,\infty )\), \(T_{1}(m)\) can be bounded by

$$ T_{1}(m)\le T_{3}(m,\beta )+T_{4}(m,\beta ), $$

where

$$\begin{aligned} T_{3}(m,\beta )&\doteq \int _{E_{1}(m,\beta )}f(u)\tilde{\theta } (y)g(u, y)\nu (dy)du,\\ T_{4}(m,\beta )&\doteq \int _{E_{2}(m,\beta )}f(u)\tilde{\theta } (y)g(u, y)\nu (dy)du, \end{aligned}$$

and

$$\begin{aligned} E_{1}(m,\beta )&\doteq \{(u,y)\in (s,t]\times \mathscr {X}:\theta (y)\le m \text{ and } g(u, y)\le \beta \},\\ E_{2}(m,\beta )&\doteq \{(u,y)\in (s,t]\times \mathscr {X}:\theta (y)\le m \text{ and } g(u, y)>\beta \}. \end{aligned}$$

Using part (b) of Lemma 9.6 and that \(g \in S^N_n\), we obtain

$$\begin{aligned} T_{3}(m,\beta )+T_{4}(m,\beta )\le \beta \left( \int _{\mathscr {X}}\tilde{\theta }(y)\nu (dy)\right) \left( \int _{s}^{t}f(u)du\right) +\bar{\kappa }_{1} (\beta )mn\left\| f\right\| _{\infty , 1}. \end{aligned}$$
(10.10)

Thus the left side of (10.8) can be bounded by

$$\begin{aligned}&\beta \left( \int _{\mathscr {X}}\tilde{\theta }(y)\nu (dy)\right) \left( \int _{s}^{t}f(u)du\right) \\&\quad +\left\| f\right\| _{\infty , 1}\left( \bar{\kappa }_{1} (\beta )mn+\int _{\{\theta >m\}}e^{k\theta (y)}\nu (dy)+\frac{n}{k}\right) . \end{aligned}$$

Given \(\delta >0\), choose \(k\in \mathbb {N}\) such that \(n/k<\delta /3\). Then use that \(\theta \in \mathscr {L}_{\text{ exp }}\) to choose \(m\in (0,\infty )\) such that \(\int _{\{\theta >m\}}e^{k\theta (y)}\nu (dy)<\delta /3\). This is possible, since \(\nu (\{\theta >1\})<\infty \). Finally, choose \(\beta \in (1,\infty )\) such that \(\bar{\kappa }_{1}(\beta )mn<\delta /3\). The result now follows by taking \(c(\delta , n)=\beta \).    \(\square \)

The following lemma is proved similarly to Lemma 10.7 and therefore only a sketch is provided. Recall that \(\mathscr {X}_{1} \doteq [0,1]\times \mathscr {X}\) and that \(\nu _{1}\) is the measure on \(\mathscr {X}_{1}\) given by the product \(\nu _{1}(ds\times dy)=\nu (dy)ds\).

Lemma 10.8

Let \(\theta \in \mathscr {L}_{\text{ exp }}^{\rho } \cap \mathscr {L}^{1}(\nu )\) for some \(\rho \in (0,\infty )\). Then for every \(n\in \mathbb {N}\),

$$ \sup _{g\in S_{n}^{N}}\int _{\mathscr {X}_{1}}\theta (y)g(u, y)\nu (dy)du<\infty . $$

Proof

Consider the equality in (10.9) with \(m=1\), \(s=0\), \(t=1\), \(f=1\), and \(\tilde{\theta }=\theta \). Then as in the proof of Lemma 10.7,

$$ T_{2}(1)\le \int _{\{\theta >1\}}e^{\rho \theta (y)}\nu (dy)+\frac{n}{\rho }. $$

Also, as with the proof of (10.10),

$$ T_{1}(1)\le \int _{\mathscr {X}}{\theta }(y)\nu (dy)+\bar{\kappa }_{1}(1)n. $$

The result follows by combining the two estimates.    \(\square \)

Proof of Theorem 10.5. Fix \(q=(f, g)\in S\) and let \(k\in \mathbb {N}\) be such that \(q\in S_{k}\). We first prove the existence of a solution to (10.6). Consider a sequence \(\{\phi _n\}\) in \(\mathscr {C}([0,1]: \mathbb {R}^d)\) constructed recursively as follows. Define \(\phi _1(t) \doteq x_0\) for all \(t \in [0,1]\), and for \(n \in \mathbb {N}\), let

$$\begin{aligned} \phi _{n+1}(t)&\doteq x_0 + \int _0^t b(\phi _n(s))ds+\int _{0}^{t}\sigma (\phi _n(s))f(s)ds\nonumber \\&\quad +\int _{\mathscr {X}_{t}}G(\phi _n(s),y)g(s, y)\nu (dy)ds,\;t\in [0,1]. \end{aligned}$$
(10.11)

Using the growth conditions on b, \(\sigma \), and G, we have that there is a \(c_1 \in (0,\infty )\) such that for all \(n \in \mathbb {N}\) and \(t \in [0,1]\),

$$\begin{aligned} \Vert \phi _{n+1}\Vert _{\infty , t}&\le \Vert x_0\Vert + c_1 \int _0^t (1+ \Vert \phi _{n}\Vert _{\infty , s}) (1+|f(s)|) ds \\&\quad + c_1 \int _{\mathscr {X}_{t}} M_G(y) (1+ \Vert \phi _{n}\Vert _{\infty ,s}) g(s, y) \nu (dy) ds. \end{aligned}$$

Thus, with \(h(s) \doteq (1+ |f(s)| + \int _{\mathscr {X}} M_G(y) g(s, y) \nu (dy))\), for some \(c_2 \in (0,\infty )\), we have

$$\Vert \phi _{n+1}\Vert _{\infty , t} \le c_2 \left( 1+ \int _0^t \Vert \phi _{n}\Vert _{\infty , s} h(s) ds\right) , \; t\in [0,1], n \in \mathbb {N}.$$

From Lemma 10.8, we obtain \(\int _{[0,1]} h(s) ds <\infty \). A standard recursive argument now shows that for all n, \(\Vert \phi _n\Vert _{\infty , 1} \le c_2 \exp \int _0^1 h(s)ds <\infty \).

Using Lemma 10.7, it is easily seen that for each \(n \in \mathbb {N}\), \(\phi _n \in \mathscr {C}([0,1]: \mathbb {R}^d)\). Indeed, the continuity of the last term in (10.11) follows on observing that for every \(\delta >0\) and \(0\le s \le t \le 1\),

$$\begin{aligned}&\int _{(s,t]\times \mathscr {X}}\Vert G(\phi _n(u),y)\Vert g(u, y)\nu (dy)du\\&\quad \le (1+\Vert \phi _n\Vert _{\infty , 1})\left( c(\delta , k)(t-s)\int _{\mathscr {X}}M_{G}(y)\nu (dy)+\delta \right) . \end{aligned}$$

For \(n \in \mathbb {N}\) and \(t \in [0,1]\), let \(a_n(t)\doteq \Vert \phi _{n+1}-\phi _n\Vert _{\infty , t}\). Then there exists \(c_3 \in (0,\infty )\) such that for all \(n\ge 2\) and \(t \in [0,1]\),

$$\begin{aligned} a_n(t)&\le c_3 \int _0^t a_{n-1}(s) ds + c_3 \int _0^t a_{n-1}(s) |f(s)| ds\\&\quad + \int _0^t a_{n-1}(s) \left( \int _{\mathscr {X}} L_G(y) g(s, y) \nu (dy)\right) ds. \end{aligned}$$

Thus, with \(m(s) \doteq c_3(1 +|f(s)|) + \int _{\mathscr {X}} L_G(y) g(s, y) \nu (dy)\), we have for all \(t\in [0,1]\) and \(n\ge 2\) that

$$a_n(t) \le \int _0^t a_{n-1}(s) m(s) ds.$$

This shows that for all \(n\in \mathbb {N}\),

$$a_{n+1}(1) \le a_1(1) \frac{\left( \int _{0}^1 m(s) ds\right) ^n}{n!}.$$

Lemma 10.8 implies \(\int _{\mathscr {X}_{1}}L_{G}(y)g(s, y)\nu (dy)ds<\infty \), and thus \(\int _{0}^1 m(s) ds<\infty \). From this it follows that \(\{\phi _n\}\) is a Cauchy sequence in \(\mathscr {C}([0,1]: \mathbb {R}^d)\) and therefore must converge to some \(\phi \in \mathscr {C}([0,1]: \mathbb {R}^d)\). From the continuity of b and \(\sigma \) it follows that for every \(t\in [0,1]\),

$$\int _0^t b(\phi _n(s))ds+\int _{0}^{t}\sigma (\phi _n(s))f(s)ds \rightarrow \int _0^t b(\phi (s))ds+\int _{0}^{t}\sigma (\phi (s))f(s)ds.$$

Also,

$$\begin{aligned}&\int _{\mathscr {X}_{1}}\Vert G(\phi _n(s),y)-G(\phi (s),y)\Vert g(s,y)\nu (dy)ds\\&\quad \le \Vert \phi _n-\phi \Vert _{\infty , 1} \int _{\mathscr {X}_{1}}L_{G}(y)g(s, y)\nu (dy)ds. \end{aligned}$$

Since \(\int _{\mathscr {X}_{1}}L_{G}(y)g(s, y)\nu (dy)ds<\infty \), the right-hand side in the last display converges to 0 as \(n\rightarrow \infty \). Combining these observations, we have that \(\phi \) solves (10.6), proving the existence of solutions.

We now consider uniqueness. Suppose that \(\phi _1\), \(\phi _2\) are two solutions of (10.6) in \(\mathscr {C}([0,1]: \mathbb {R}^d)\). Then using the Lipschitz property of b, \(\sigma \), and G, we have that for some \(c_4 \in (0,\infty )\) and all \(t \in [0,1]\),

$$\begin{aligned} \Vert \phi _1-\phi _2\Vert _{\infty , t}&\le c_4 \int _0^t \Vert \phi _1-\phi _2\Vert _{\infty , s} ds + c_4 \int _0^t \Vert \phi _1-\phi _2\Vert _{\infty , s} |f(s)| ds\\&\quad + \int _0^t \Vert \phi _1-\phi _2\Vert _{\infty ,s} \left( \int _{\mathscr {X}} L_G(y) g(s, y) \nu (dy)\right) ds. \end{aligned}$$

Thus

$$ \Vert \phi _1-\phi _2\Vert _{\infty , t} \le \int _0^t \Vert \phi _1-\phi _2\Vert _{\infty , s} \left( c_4 + c_4|f(s)|+\int _{\mathscr {X}} L_G(y) g(s, y) \nu (dy)\right) ds. $$

Recalling that \(\int _{\mathscr {X}_1} L_G(y) g(s, y) \nu (dy) <\infty \), an application of Gronwall’s lemma implies \(\phi _1 = \phi _2\).    \(\square \)
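
The Picard scheme (10.11) used in the existence part of the proof is also a practical way to compute \(\xi _{q}\). The following numerical sketch is ours and purely illustrative: it takes \(d=1\), a constant control f, \(g\equiv 1\), and a coefficient G whose \(\nu \)-integral is available in closed form, and iterates (10.11) on a time grid until successive iterates agree.

```python
import numpy as np

def picard_solve(b, sig, f, jump_term, x0=0.0, T=1.0, n_grid=1000,
                 tol=1e-10, max_iter=100):
    """Solve the controlled equation (10.6) by the Picard iteration (10.11).

    jump_term(x, s) stands for the integral of G(x, y) g(s, y) over y against
    nu, assumed available in closed form for this illustration.
    """
    t = np.linspace(0.0, T, n_grid + 1)
    dt = T / n_grid
    phi = np.full(n_grid + 1, x0)  # phi_1(t) = x0
    for _ in range(max_iter):
        integrand = b(phi) + sig(phi) * f(t) + jump_term(phi, t)
        # phi_{n+1}(t) = x0 + int_0^t [...] ds  (left-endpoint quadrature)
        phi_new = x0 + dt * np.concatenate(([0.0], np.cumsum(integrand[:-1])))
        if np.max(np.abs(phi_new - phi)) < tol:  # a_n(1) decays like c^n / n!
            return t, phi_new
        phi = phi_new
    return t, phi

# Hypothetical data: b(x) = -x, sigma = 1, f = 0.5, and with g = 1 and
# G(x, y) = y + x * y^2, nu = N(0, 1), the nu-integral of G(x, .) equals x.
# Drift and jump terms then cancel, so xi solves xi' = 0.5, i.e. xi(t) = 0.5 t.
t, xi = picard_solve(lambda x: -x, lambda x: 1.0, lambda s: 0.5 + 0.0 * s,
                     lambda x, s: x)
print(xi[-1])  # approximately 0.5
```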

The following lemma is useful in characterizing the limit points of weakly converging controlled jump processes. It will be used in the proof of Theorem 10.6. Recall that as discussed in Sect. 9.1, \(g_{k}\rightarrow g\) in the topology of \(S_{n}^{N}\) if \(\int _{\mathscr {X}_{1}}fg_{k}d\nu _{1}\rightarrow \int _{\mathscr {X}_{1}} fgd\nu _{1}\) for bounded continuous f with compact support.

Lemma 10.9

Fix \(n\in \mathbb {N}\) and let K be a compact subset of \(\mathscr {X}_{1}\). Let \(g, g_{k}\in S_{n}^{N}\), \(k\in \mathbb {N}\) be such that \(g_{k}\rightarrow g\). Also, let \(\gamma :\mathscr {X}_{1}\rightarrow \mathbb {R}\) be a bounded measurable function. Then as \(k\rightarrow \infty \),

$$\begin{aligned} \int _{K}\gamma (s,y)g_{k}(s,y)\nu (dy)ds\rightarrow \int _{K}\gamma (s,y)g(s, y)\nu (dy)ds. \end{aligned}$$
(10.12)

Proof

We assume that \(\nu _{1}(K)\ne 0\), since otherwise, the result is trivially true. By replacing, if needed, \(g_{k}\) and g with \(g_{k}+1_{K}\) and \(g+1_{K}\) respectively, we can assume without loss of generality that \(\int _{K} g(s, y)\nu _{1}(ds\times dy)\ne 0\) and \(\int _{K}g_{k}(s, y)\nu _{1}(ds\times dy)\ne 0\) for all \(k\in \mathbb {N}\). If (10.12) holds with K replaced by \(K_{1}\), where \(K\subset K_{1}\) for some compact \(K_{1}\), then by taking \(\tilde{\gamma }=\gamma 1_{K}\), we see that (10.12) also holds with the compact set K. Also, since \(\nu \) is finite on every compact set, we can always find a compact set \(K_{1}\supset K\) such that \(\nu _{1}(\partial K_{1})=0\). Hence in proving the lemma, we can assume without loss of generality that \(\nu _{1}(\partial K)=0\). Recall from Chap. 9 that for \(g\in S_{n}^{N}\), \(\nu _{1}^{g}\) is defined by setting \(\nu _{1}^{g}(A)=\int _{A}g(s, y)\nu _{1}(ds\times dy)\) for \(A\in \mathscr {B} (\mathscr {X}_{1})\). Define probability measures \(\tilde{\nu }^{k}\) and \(\tilde{\nu }\) as follows:

$$ \tilde{\nu }^{k}(\cdot )\doteq \frac{\nu _{1}^{g_{k}}(\cdot \cap K)}{m_{k}} ,\quad \tilde{\nu }(\cdot )\doteq \frac{\nu _{1}^{g}(\cdot \cap K)}{m}, $$

where \(m\doteq \nu _{1}^{g}(K)\) and \(m_{k}\doteq \nu _{1}^{g_{k}}(K)\). Let \(\theta (\cdot )\doteq \nu _{1}(\cdot \cap K)/\nu _{1}(K)\). Then

$$\begin{aligned} R(\tilde{\nu }^{k}\left\| \theta \right. )&=\int _{K}\log \left( \frac{\nu _{1}(K)}{m_{k}}g_{k}(s, y)\right) \frac{1}{m_{k}}g_{k}(s, y)\nu _{1}(ds\times dy)\\&=\frac{1}{m_{k}}\int _{K}\left[ \ell (g_{k}(s,y))+g_{k}(s, y)-1\right] \nu _{1}(ds\times dy)+\log \frac{\nu _{1}(K)}{m_{k}}\\&\le \frac{n}{m_{k}}+1-\frac{\nu _{1}(K)}{m_{k}}+\log \frac{\nu _{1}(K)}{m_{k} }. \end{aligned}$$

Since \(g_{k}\rightarrow g\) and \(\nu _{1}(\partial K)=0\), it follows that \(m_{k}\rightarrow m\), and therefore the last display implies \(\sup _{k\in \mathbb {N}}R(\tilde{\nu }^{k}\left\| \theta \right. )<\infty \). Also note that \(\tilde{\nu }^{k}\) converges weakly to \(\tilde{\nu }\). From Lemma 2.5, it follows that

$$ \frac{1}{m_{k}}\int _{K}\gamma (s,y)g_{k}(s, y)\nu _{1}(ds\times dy)\rightarrow \frac{1}{m}\int _{K}\gamma (s,y)g(s, y)\nu _{1}(ds\times dy), $$

which proves (10.12).    \(\square \)

2.1 Proof of the Large Deviation Principle

In this section we prove Theorem 10.6. Theorem 10.2 shows that the solution to the SDE (10.2) can be expressed as a measurable mapping on the input noises: \(X^{\varepsilon }=\mathscr {G}^{\varepsilon }(\sqrt{\varepsilon }W,\varepsilon N^{1/\varepsilon })\). We now verify that \(\mathscr {G}^{\varepsilon }\) satisfies Condition 9.1, which by Theorem 9.2 will complete the proof of the LDP. The notation used is that of Sect. 9.2. In particular, the spaces \(\mathbb {V}\) and \(\bar{\mathbb {V}}\) are the canonical spaces for a Brownian motion and PRM on \(\mathscr {X}_{1}\) and \(\mathscr {Y} _{1}\), respectively.

Define \(\mathscr {G}^{0}:\mathbb {V}\rightarrow \mathscr {D}([0,T]:\mathbb {R} ^{d})\) as follows. If \((w, m)\in \mathbb {V}\) is of the form \((w, m)=(\int _{0}^{\cdot }f(s)ds,\nu _{1}^{g})\) for some \(q=(f, g)\in S\), set

$$ \mathscr {G}^{0}(w, m)=\mathscr {G}^{0}\left( \int _{0}^{\cdot }f(s)ds,\nu _{1} ^{g}\right) =\xi _{q}, $$

where \(\xi _{q}\) is the unique solution of (10.6). For all other \((w, m)\in \mathbb {V}\) set \(\mathscr {G}^{0}(w, m)=0\). Since \(\bar{L} _{T}(q)=\infty \) for such q, with this definition, I defined in (10.7) is the same as the function I defined in (9.4).

We will show that part (b) of Condition 9.1, which is the weak convergence of the controlled processes as \(\varepsilon \rightarrow 0\), holds with this choice of \(\mathscr {G}^{0}\). Part (a) of the condition follows if we prove continuity of \(q\mapsto \mathscr {G}^{0}(q)\) for q such that \(\bar{L}_{T}(q)\le n\) (recall that the initial condition has been assumed fixed). This is in fact an easier deterministic analogue of the proof of part (b), and hence omitted (see, for example, the proof of Theorem 11.25). Fix \(n\in \mathbb {N}\) and let \(u^{\varepsilon }=(\psi ^{\varepsilon },\varphi ^{\varepsilon })\in \bar{\mathscr {A}}_{b, n}\), \(u=(\psi ,\varphi )\in \bar{\mathscr {A}}_{b, n}\) be such that \(u^{\varepsilon }\) converges in distribution to u as \(\varepsilon \rightarrow 0\). We recall that this implies the a.s. bounds

$$\begin{aligned} \int _{0}^{1}\left\| \psi ^{\varepsilon }(s)\right\| ^{2}ds\le n\text { and }\int _{\mathscr {X}_{1}}\ell (\varphi ^{\varepsilon }(s, x)){\nu }_{1}(ds\times dx)\le n. \end{aligned}$$
(10.13)

Furthermore, almost surely \(\varphi ^{\varepsilon }(s, x)\) has upper and lower bounds of the form \(1/\delta \) and \(\delta \) for all x in some compact set K, and \(\varphi ^{\varepsilon }(s, x)=1\) for \(x\notin K\) (where \(\delta >0\) and K depend on \(\varphi ^{\varepsilon }\)). The analogous statements also hold for \((\psi ,\varphi )\), and as we will see, the ability to restrict to controls with such nice properties greatly simplifies the arguments. Almost all of the difficulties in the proof are due to the jump term [for comparison, one can consider the proof of the analogous diffusion model in Theorem 3.19].

Let \(\tilde{\varphi }^{\varepsilon }=1/\varphi ^{\varepsilon }\) and for \(t\in [0,1]\) define

$$\begin{aligned} \mathscr {E}_{1}^{\varepsilon }(t)&\doteq \exp \left[ \int _{\mathscr {X} _{t}\times [0,\infty )}1_{[0,\varphi ^{\varepsilon }(s,y)/\varepsilon ]}(r)\log (\tilde{\varphi }^{\varepsilon }(s, y))\bar{N}(ds\times dy\times dr)\right. \\&\quad \quad \quad \left. +\int _{\mathscr {X}_{t}\times [0,\infty )}1_{[0,\varphi ^{\varepsilon }(s,y)/\varepsilon ]}(r)(-\tilde{\varphi }^{\varepsilon }(s, y)+1)\bar{\nu }_{1}(ds\times dy\times dr)\right] \end{aligned}$$

and

$$ \mathscr {E}_{2}^{\varepsilon }(t)\doteq \exp \left[ -\frac{1}{\sqrt{\varepsilon }}\int _{0}^{t}\psi ^{\varepsilon }(s)dW(s)-\frac{1}{2\varepsilon }\int _{0} ^{t}\Vert \psi ^{\varepsilon }(s)\Vert ^{2}ds\right] . $$

Let \(\mathscr {E}^{\varepsilon }(t)=\mathscr {E}_{1}^{\varepsilon }(t)\mathscr {E} _{2}^{\varepsilon }(t)\). Then using the independence between the Brownian and Poisson noises, it follows that \(\{\mathscr {E}^{\varepsilon }(t)\}_{0\le t\le 1}\) is an \(\mathscr {F}_{t}\)-martingale, and consequently

$$ \bar{Q}^{\varepsilon }(A)=\int _{A}\mathscr {E}^{\varepsilon }(1)dP,\;A\in \mathscr {B}(\bar{\mathbb {V}}) $$

defines a probability measure on \(\bar{\mathbb {V}}\). The bounds (10.13) on \(\psi ^{\varepsilon }\) and \(\varphi ^{\varepsilon }\) along with the properties of \(\varphi ^{\varepsilon }\) in terms of some compact set K noted below (10.13) imply that P and \(\bar{Q}^{\varepsilon }\) are mutually absolutely continuous, and by Girsanov’s theorem [see Theorem D.3],

$$ \left( \sqrt{\varepsilon }W^{\psi ^{\varepsilon }/\sqrt{\varepsilon }},\varepsilon N^{\varphi ^{\varepsilon }/\varepsilon }\right) $$

under \(\bar{Q}^{\varepsilon }\) has the same probability law as \((\sqrt{\varepsilon }W,\varepsilon N^{1/\varepsilon })\) under P, where we recall \(W^{\psi /\sqrt{\varepsilon }}\doteq \) \(W+\int _{0}^{\cdot }\psi (s)ds/\sqrt{\varepsilon }\). Thus it follows that \(\bar{X}^{\varepsilon }=\mathscr {G}^{\varepsilon }(\sqrt{\varepsilon }W^{\psi ^{\varepsilon }/\sqrt{\varepsilon } },\,\varepsilon N^{\varphi ^{\varepsilon }/\varepsilon })\) is the unique solution, under both \(\bar{Q}^{\varepsilon }\) and P, of the controlled SDE given by \(\bar{X}^{\varepsilon }(0)=x_{0}\) and

$$\begin{aligned} d\bar{X}^{\varepsilon }(t)&=\left[ b(\bar{X}^{\varepsilon }(t))+\sigma (\bar{X}^{\varepsilon }(t))\psi ^{\varepsilon }(t)\right] dt+\sqrt{\varepsilon }\sigma (\bar{X}^{\varepsilon }(t))dW(t)\nonumber \\&\quad +\varepsilon \int _{\mathscr {X}}G(\bar{X}^{\varepsilon }(t-), x)N^{\varphi ^{\varepsilon }/\varepsilon }(dt\times dx). \end{aligned}$$
(10.14)

We end this section by stating martingale bounds that will be useful in the sequel. Recall the definition of \(\bar{\mathscr {A}}^{N}\) from Sect. 8.3. Also recall that for \(\varphi \in \bar{\mathscr {A}}^{N}\), \(N^{\varphi }_{c}\) denotes the compensated form of \(N^{\varphi }\).

Lemma 10.10

Let \(\varphi \in \bar{\mathscr {A}}^{N}\), and assume that \(\psi :[0,1]\times \varOmega \times \mathscr {X}\rightarrow \mathbb {R}\) is \(\mathscr {P}\mathscr {F}\otimes \mathscr {B}(\mathscr {X})/\mathscr {B} (\mathbb {R})\)-measurable and

$$ E\int _{\mathscr {X}_{1}}(|\psi (s,x)|\vee |\psi (s, x)|^{2})\varphi (s, x)\nu _{1}(ds\times dx)<\infty . $$

Then there exists \(C\in (0,\infty )\) such that for all \(t\in [0,T]\)

$$ E\left[ \sup _{0\le s\le t}\left| \int _{\mathscr {X}_{t}}\psi (s,x)N_{c}^{\varphi }(ds\times dx)\right| \right] \le CE\left[ \int _{\mathscr {X}_{t}}\psi (s, x)^{2}\varphi (s, x)\nu _{1}(ds\times dx)\right] ^{\frac{1}{2}}, $$

and there is also the bound

$$ E\left[ \sup _{0\le s\le t}\left| \int _{\mathscr {X}_{t}}\psi (s, x)N_{c}^{\varphi }(ds\times dx)\right| ^{2}\right] \le 4E\left[ \int _{\mathscr {X}_{t}}\psi (s, x)^{2}\varphi (s, x)\nu _{1}(ds\times dx)\right] . $$

Proof

The bounds follow from Doob’s maximal inequality and the Lenglart–Lépingle–Pratelli inequality [(D.2) and (D.4) of Appendix D], and from expressions for the quantities on the right-hand sides that are stated in Sect. D.2.2. In particular, in applying (D.2), we use that the expected quadratic variation of the stochastic integral in the last display is

$$ E\int _{\mathscr {X}_{t}}\psi (s, x)^{2}N^{\varphi }(ds\times dx)=E\int _{\mathscr {X}_{t}}\psi (s, x)^{2}\varphi (s, x)\nu _{1}(ds\times dx). $$

   \(\square \)

2.1.1 Tightness

Lemma 10.11

Assume Conditions 10.1 and 10.3. Given controls \((\psi ^{\varepsilon },\varphi ^{\varepsilon })\in \bar{\mathscr {A}}_{b, n}\), let \(\bar{X}^{\varepsilon }\) be the corresponding unique solution to (10.14). Then \(\{\bar{X} ^{\varepsilon }\}\) is a tight family of \(\mathscr {D}([0,1]:\mathbb {R}^{d} )\)-valued random variables.

Proof

We begin with an estimate on the supremum of \(\bar{X}^{\varepsilon }(t)\). Recalling \(\Vert x\Vert _{\infty , t}\doteq \sup _{0\le s\le t}\Vert x(s)\Vert \), by Condition 10.1 we have that for suitable \(c_{1}<\infty \),

$$\begin{aligned} \Vert \bar{X}^{\varepsilon }\Vert _{\infty , t}&\le \Vert x_{0}\Vert +c_{1} \int _{0}^{t}(1+\Vert \bar{X}^{\varepsilon }\Vert _{\infty , s})(1+\Vert \psi ^{\varepsilon }(s)\Vert )ds\\&\quad +\sqrt{\varepsilon }\left\| \int _{0}^{\cdot }\sigma (\bar{X}^{\varepsilon }(s))dW(s)\right\| _{\infty , t}\\&\quad +\varepsilon \left\| \int _{\mathscr {X}_{\cdot }}M_{G}(y)(1+\Vert \bar{X}^{\varepsilon }(s-)\Vert )N_{c}^{\varphi ^{\varepsilon }/\varepsilon }(ds\times dy)\right\| _{\infty , t}\\&\quad +\int _{\mathscr {X}_{t}}M_{G}(y)(1+\Vert \bar{X}^{\varepsilon } \Vert _{\infty ,s})\varphi ^{\varepsilon }(s, y)\nu (dy)ds, \end{aligned}$$

where \(N_{c}^{\varphi ^{\varepsilon }/\varepsilon }(ds\times dy)=N^{\varphi ^{\varepsilon }/\varepsilon }(ds\times dy)-\varepsilon ^{-1}\varphi ^{\varepsilon }(s, y)\nu _{1}(ds\times dy)\). Let

$$\begin{aligned} \mathscr {R}_{t}^{\varepsilon }\doteq&\sqrt{\varepsilon }\left\| \int _{0}^{\cdot }\sigma (\bar{X}^{\varepsilon }(s))dW(s)\right\| _{\infty , t}\\&+\varepsilon \left\| \int _{\mathscr {X}_{\cdot }}M_{G}(y)(1+\Vert \bar{X}^{\varepsilon }(s-)\Vert )N_{c}^{\varphi ^{\varepsilon }/\varepsilon }(ds\times dy)\right\| _{\infty , t}.\nonumber \end{aligned}$$
(10.15)

Using the bound (10.13) for \(\psi ^{\varepsilon }\) and Hölder’s inequality, Gronwall’s inequality [Lemma E.2] gives

$$\begin{aligned} 1+\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1}&\le (1+\Vert x_{0} \Vert +\mathscr {R}_{1}^{\varepsilon })\exp \left\{ c_{1}(1+\sqrt{n} )+\int _{\mathscr {X}_{1}}\!\!M_{G}(y)\varphi ^{\varepsilon }(s, y)\nu (dy)ds\right\} \nonumber \\&\le c_{2}(1+\Vert x_{0}\Vert +\mathscr {R}_{1}^{\varepsilon }), \end{aligned}$$
(10.16)

where, using Lemma 10.7 with \(\delta =1,f\equiv 1\) and \(g=\varphi ^{\varepsilon }\) and the fact that \(M_{G}\in \mathscr {L}^{1}(\nu )\cap \mathscr {L}_{\text{ exp }}\), we obtain

$$ c_{2}\doteq \exp \left\{ c_{1}(1+\sqrt{n})+c(1,n)\int _{\mathscr {X}}M_{G} (y)\nu (dy)+1\right\} <\infty . $$

Also, Condition 10.1 implies that for some \(c_{3}<\infty \),

$$ \sqrt{\varepsilon }E\left\| \int _{0}^{\cdot }\sigma (\bar{X}^{\varepsilon }(s))dW(s)\right\| _{\infty , 1}\le c_{3}\sqrt{\varepsilon }\left( E\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1}+1\right) . $$

Let \(m\in (0,\infty )\). Then the expectation of the second term in the definition of \(\mathscr {R}_{t}^{\varepsilon }\) can be bounded by the sum of

$$\begin{aligned} T_{1}^{\varepsilon }\doteq \varepsilon E\left\| \int _{\mathscr {X}_{\cdot } }M_{G}(y)1_{\{M_{G}\le m\}}(1+\Vert \bar{X}^{\varepsilon }(s-)\Vert )N_{c}^{\varphi ^{\varepsilon }/\varepsilon }(ds\times dy)\right\| _{\infty , 1} \end{aligned}$$
(10.17)

and

$$\begin{aligned} T_{2}^{\varepsilon }\doteq 2E\int _{\mathscr {X}_{t}}M_{G}(y)1_{\{M_{G}\ge m\}}(1+\Vert \bar{X}^{\varepsilon }(s)\Vert )\varphi ^{\varepsilon }(s, y)\nu (dy)ds. \end{aligned}$$
(10.18)

For (10.18), we use the representation \(N_{c}^{\varphi ^{\varepsilon }/\varepsilon }=N^{\varphi ^{\varepsilon }/\varepsilon }-\varepsilon ^{-1}\varphi ^{\varepsilon }\nu _{1}\), that the corresponding integrals are almost surely nondecreasing in t, and the identity

$$\begin{aligned}&\varepsilon E \int _{\mathscr {X}_{t} }M_{G}(y)1_{\{M_{G}\ge m\}}(1+\Vert \bar{X}^{\varepsilon }(s-)\Vert )N^{\varphi ^{\varepsilon }/\varepsilon }(ds\times dy)\\&\quad = E \int _{\mathscr {X}_{t} }M_{G}(y)1_{\{M_{G}\ge m\}}(1+\Vert \bar{X}^{\varepsilon }(s)\Vert )\varphi ^{\varepsilon }(s, y)\nu (dy)ds. \end{aligned}$$

An application of Lemmas 10.7 and 10.10, as used before, yields that for some \(c_{4}<\infty \),

$$\begin{aligned} T_{1}^{\varepsilon }&\le c_{4}\sqrt{\varepsilon m}E\left[ (1+\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1})\int _{\mathscr {X}_{1}}M_{G}(y)\varphi ^{\varepsilon }(s, y)\nu (dy)ds\right] ^{1/2}\\&\le c_{4}\sqrt{\varepsilon m}(1+E\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1})\left( c(1,n)\int _{\mathscr {X}}M_{G}(y)\nu (dy)+1\right) ^{1/2}. \end{aligned}$$

Also, for every \(\delta >0\),

$$ T_{2}^{\varepsilon }\le 2(1+{E}\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1})\left( c(\delta , n)\int _{\mathscr {X}}M_{G}(y)1_{\{M_{G}\ge m\}} \nu (dy)+\delta \right) . $$

Choosing \(\delta >0\) sufficiently small and then \(m<\infty \) sufficiently large, there is \(\varepsilon _{0}>0\) such that for all \(\varepsilon \le \varepsilon _{0}\),

$$ E\mathscr {R}_{1}^{\varepsilon }\le c_{3}\sqrt{\varepsilon }\left( E\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1}+1\right) +T_{1}^{\varepsilon } +T_{2}^{\varepsilon }\le \frac{1}{2c_{2}}\left( E\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1}+1\right) . $$

Using this in (10.16) then gives that for \(\varepsilon \le \varepsilon _{0}\),

$$ E\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1}\le 2c_{2}(\Vert x_{0} \Vert +1), $$

and therefore

$$\begin{aligned} \sup _{\varepsilon \le \varepsilon _{0}}E\Vert \bar{X}^{\varepsilon } \Vert _{\infty , 1}<\infty . \end{aligned}$$
(10.19)

Henceforth we consider only \(\varepsilon <\varepsilon _{0}\). We next argue that \(\mathscr {R}_{1}^{\varepsilon }\) defined in (10.15) converges to 0 in probability. The term with the Brownian motion is easy. Using Condition 10.1, the estimate in (10.19), and the Burkholder–Davis–Gundy inequality, it follows that

$$\begin{aligned} \sqrt{\varepsilon }\left\| \int _{[0,\cdot ]}\sigma (\bar{X}^{\varepsilon }(s))dW(s)\right\| _{\infty , 1}\rightarrow 0\text { in probability as }\varepsilon \rightarrow 0. \end{aligned}$$
(10.20)

Next we consider the Poisson term, and write

$$\begin{aligned}&\varepsilon \int _{\mathscr {X}_{t}}G(\bar{X}^{\varepsilon }(s-),y)N^{\varphi ^{\varepsilon }/\varepsilon }(ds\times dy)\\&\quad =\varepsilon \int _{\mathscr {X}_{t}}G(\bar{X}^{\varepsilon } (s-),y)N_{c}^{\varphi ^{\varepsilon }/\varepsilon }(ds\times dy)+\int _{\mathscr {X}_{t}}G(\bar{X}^{\varepsilon }(s),y)\varphi ^{\varepsilon } (s, y)\nu (dy)ds.\nonumber \end{aligned}$$
(10.21)

Consider the first term on the right side of (10.21). For \(\alpha \in (0,\infty )\), define the stopping time \(\tau _{\alpha }^{\varepsilon }\doteq \inf \{s:\Vert \bar{X}^{\varepsilon }(s)\Vert >\alpha \}\). We first show that

$$ T^{\alpha ,\varepsilon }\doteq \varepsilon \left\| \int _{(0,\tau _{\alpha }^{\varepsilon }\wedge \cdot ]\times \mathscr {X}}G(\bar{X}^{\varepsilon }(s-),y)N_{c}^{\varphi ^{\varepsilon }/\varepsilon }(ds\times dy)\right\| _{\infty , 1} $$

converges to zero in probability as \(\varepsilon \rightarrow 0\). To do this, note that for every \(r\in (0,\infty )\),

$$ T^{\alpha ,\varepsilon }\le T_{\le r}^{\alpha ,\varepsilon } +T_{>r}^{\alpha ,\varepsilon }, $$

where

$$\begin{aligned} T_{\le r}^{\alpha ,\varepsilon }&\doteq \varepsilon \left\| \int _{(0,\tau _{\alpha }^{\varepsilon }\wedge \cdot ]\times \{M_{G}(y)\le r\}}G(\bar{X}^{\varepsilon }(s-),y)N_{c}^{\varphi ^{\varepsilon }/\varepsilon }(ds\times dy)\right\| _{\infty , 1},\\ T_{>r}^{\alpha ,\varepsilon }&\doteq \varepsilon \left\| \int _{(0,\tau _{\alpha }^{\varepsilon }\wedge \cdot ]\times \{M_{G}(y)>r\}}G(\bar{X}^{\varepsilon }(s-),y)N_{c}^{\varphi ^{\varepsilon }/\varepsilon }(ds\times dy)\right\| _{\infty , 1}. \end{aligned}$$

By Lemma 10.10, there is \(c_{5}\in (0,\infty )\) such that

$$ E(T_{\le r}^{\alpha ,\varepsilon })^{2}\le \varepsilon c_{5}E\int _{(0,\tau _{\alpha }^{\varepsilon }\wedge 1]\times \{M_{G}(y)\le r\}}M_{G} ^{2}(y)(1+\Vert \bar{X}^{\varepsilon }(s)\Vert ^{2})\varphi ^{\varepsilon } (s, y)\nu (dy)ds. $$

We then use that \(\Vert \bar{X}^{\varepsilon }(s)\Vert \le \alpha \) for \(s\in (0,\tau _{\alpha }^{\varepsilon }\wedge 1)\) and write \((0,\tau _{\alpha }^{\varepsilon }\wedge 1]\times \{M_{G}(y)\le r\}\) as the disjoint union of two sets, the first for which \(\varphi ^{\varepsilon }(s, y)<\beta \) and the second for which \(\varphi ^{\varepsilon }(s, y)\ge \beta \), and finally apply part (b) of Lemma 9.6 and use (10.13) to get

$$ E(T_{\le r}^{\alpha ,\varepsilon })^{2}\le \varepsilon c_{5}(1+\alpha ^{2})(\beta r\Vert M_{G}\Vert _{1}+r^{2}n\bar{\kappa }_{1}(\beta )). $$

To deal with the term \(T_{>r}^{\alpha ,\varepsilon }\), note that for every \(k\in \mathbb {N}\),

$$\begin{aligned} ET_{>r}^{\alpha ,\varepsilon }&\le 2 E \int _{(0,\tau _{\alpha }^{\varepsilon }\wedge \cdot ]\times \{M_{G}(y)>r\}} \Vert G(\bar{X}^{\varepsilon }(s-), y)\Vert \varphi ^{\varepsilon }(s, y) \nu (dy)ds\\&\le 2(1+\alpha )E\int _{[0,1]\times \{M_{G}(y)>r\}}M_{G}(y)\varphi ^{\varepsilon }(s, y)\nu (dy)ds\\&\le 2(1+\alpha )\left( \int _{\{M_{G}(y)>r\}}e^{kM_{G}(y)}\nu (dy)+\frac{n}{k}\right) , \end{aligned}$$

where the second inequality follows on using the growth bound stated in part (d) of Condition 10.1 and recalling the definition of \(\tau _{\alpha }^{\varepsilon }\), and the last inequality is a consequence of part (a) of Lemma 9.6 with \(\sigma =k, a=M_{G}(y)\) and \(b=\varphi ^{\varepsilon }(s, y)\) and again the fact that \(\varphi ^{\varepsilon }\) takes values in \(S_{n}^{N}\). Since \(M_{G} \in \mathscr {L}^{1}(\nu )\cap \mathscr {L}_{\text{ exp }}\), for every \(k\in \mathbb {N}\), \(\int _{\{M_{G}(y)>r\}}e^{kM_{G}(y)}\nu (dy)\rightarrow 0\) as \(r\rightarrow \infty \). Combining these two bounds and sending \(\varepsilon \rightarrow 0\), \(r\rightarrow \infty \), \(k\rightarrow \infty \) in that order shows that for each \(\alpha \in (0,\infty )\),

$$\begin{aligned} T^{\alpha ,\varepsilon }\rightarrow 0 \text{ in } \text{ probability } \text{ as } \varepsilon \rightarrow 0. \end{aligned}$$
(10.22)

Next let

$$ T^{\varepsilon }\doteq \varepsilon \left\| \int _{(0,\cdot ]\times \mathscr {X} }G(\bar{X}^{\varepsilon }(s-),y)N_{c}^{\varphi ^{\varepsilon }/\varepsilon }(ds\times dy)\right\| _{\infty , 1}, $$

where the restriction on the time variable in \(T^{\alpha ,\varepsilon }\) has been dropped. Defining \(A_{\alpha }\doteq \{\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1} <\alpha \}\), for all \(\eta >0\),

$$\begin{aligned} P(T^{\varepsilon }>\eta )&=P(\{T^{\varepsilon }>\eta \}\cap A_{\alpha })+P(\{T^{\varepsilon }>\eta \}\cap A_{\alpha }^{c})\\&\le P(T^{\alpha ,\varepsilon }>\eta )+P(A_{\alpha }^{c}). \end{aligned}$$

Combining this last bound with (10.22) and (10.19), we see that \(T^{\varepsilon }\rightarrow 0\) in probability as \(\varepsilon \rightarrow 0\). Together with (10.20), this shows that

$$ \mathscr {R}_{1}^{\varepsilon }\rightarrow 0 \text{ in } \text{ probability } \text{ as } \varepsilon \rightarrow 0. $$

Thus

$$\begin{aligned} \bar{X}^{\varepsilon }(t)&=x_{0}+\int _{0}^{t}b(\bar{X}^{\varepsilon }(s))ds+\int _{0}^{t}\sigma (\bar{X}^{\varepsilon }(s))\psi ^{\varepsilon }(s)ds\nonumber \\&\quad +\int _{\mathscr {X}_{t}}G(\bar{X}^{\varepsilon }(s),y)\varphi ^{\varepsilon }(s,y)\nu (dy)\, ds+\mathscr {\bar{R}}^{\varepsilon }(t), \end{aligned}$$
(10.23)

where \(\Vert \mathscr {\bar{R}}^{\varepsilon }\Vert _{\infty , 1}\le \mathscr {R}^{\varepsilon }_{1}\) converges to 0 in probability as \(\varepsilon \rightarrow 0\). Tightness of the terms in (10.23) that involve b or \(\sigma \) follows from standard estimates, which are the same as the calculations used for the small noise diffusion model in Chap. 3. Thus in order to prove tightness of \(\{\bar{X}^{\varepsilon }\}\), it suffices to argue that

$$ \xi ^{\varepsilon }(t)=\int _{\mathscr {X}_{t}}G(\bar{X}^{\varepsilon }(s),y)\varphi ^{\varepsilon }(s,y)\nu (dy)ds,\, t\in [0,1] $$

is tight in \(\mathbb {U}=\mathscr {D}([0,1]:\mathbb {R}^{d})\). By Lemma 10.7, for every \(\delta >0\) and \(0\le s\le t\le 1\),

$$\begin{aligned} \Vert \xi ^{\varepsilon }(t)-\xi ^{\varepsilon }(s)\Vert&\le \int _{[s, t]\times \mathscr {X}}(1+\Vert \bar{X}^{\varepsilon }\Vert _{\infty ,u} )M_{G}(y)\varphi ^{\varepsilon }(u, y)\nu (dy)du\\&\le (1+\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1})c(\delta , n)(t-s)\Vert M_{G}\Vert _{1}+\delta (1+\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1}), \end{aligned}$$

where as before, \(\Vert \cdot \Vert _{1}\) is the norm in \(\mathscr {L}^{1}(\nu )\). Tightness of \(\xi ^{\varepsilon }\) is now a consequence of (10.19). Thus we have shown that \(\{\bar{X}^{\varepsilon }\}\) is tight in \(\mathscr {D} ([0,1]:\mathbb {R}^{d})\).    \(\square \)

2.1.2 Identification of Limits

The following lemma completes the verification of part (b) of Condition 9.1, and hence the proof of Theorem 10.6.

Lemma 10.12

Assume Conditions 10.1 and 10.3. Given controls \((\psi ^{\varepsilon },\varphi ^{\varepsilon })\in \bar{\mathscr {A} }_{b, n}\), let \(\bar{X}^{\varepsilon }\) be the corresponding unique solution to (10.14). Assume that \((\psi ^{\varepsilon },\varphi ^{\varepsilon })\) converges in distribution to \((\psi ,\varphi )\). Then \(\bar{X}^{\varepsilon }\) converges in distribution to the unique solution to (10.6) with \((f, g)=(\psi ,\varphi )\).

Proof

From Lemma 10.11 it follows that if for some fixed n, the controls \((\psi ^{\varepsilon },\varphi ^{\varepsilon })\) are in \(\bar{\mathscr {A} }_{b, n}\) for every \(\varepsilon >0\), then \(\{\bar{X}^{\varepsilon }\}_{\varepsilon >0}\) is a tight collection of \(\mathscr {D}([0,1]:\mathbb {R} ^{d})\)-valued random variables. It was also shown in the proof of that lemma that \(\Vert \mathscr {\bar{R}}^{\varepsilon }\Vert _{\infty , 1}\) appearing in (10.23) converges to 0 in probability as \(\varepsilon \rightarrow 0\). By appealing to the Skorokhod representation theorem, we assume without loss of generality the almost sure convergence \((\bar{X}^{\varepsilon },\psi ^{\varepsilon },\varphi ^{\varepsilon },\mathscr {\bar{R}}^{\varepsilon })\rightarrow (\bar{X},\psi ,\varphi , 0)\). It then follows from (10.23) and the convergence \(\mathscr {\bar{R}}^{\varepsilon }\rightarrow 0\) that \(\bar{X}\) has continuous sample paths a.s. Using the assumed conditions on b and \(\sigma \), it is straightforward [see, for example, the proof of Lemma 3.21] using Hölder's inequality and the dominated convergence theorem to show that for every t, the sum of the first three terms on the right side of (10.23) converges a.s. to

$$ x_{0}+\int _{0}^{t}b(\bar{X}(s))ds+\int _{0}^{t}\sigma (\bar{X}(s))\psi (s)ds. $$

In view of the unique solvability of (10.6), to complete the verification that \(\mathscr {G}^{\varepsilon }\) satisfies part (b) of Condition 9.1, it then suffices to show that for all \(t\in [0,1]\),

$$\begin{aligned} \int _{\mathscr {X}_{t}}G(\bar{X}^{\varepsilon }(s),y)\varphi ^{\varepsilon }(s,y)\nu (dy)ds-\int _{\mathscr {X}_{t}}G(\bar{X}(s),y)\varphi (s, y)\nu (dy)ds\rightarrow 0 \end{aligned}$$
(10.24)

as \(\varepsilon \rightarrow 0\).

We write the expression in (10.24) as \(T_{3}^{\varepsilon }(t)+T_{4}^{\varepsilon }(t)\), where

$$\begin{aligned} T_{3}^{\varepsilon }(t)&\doteq \int _{\mathscr {X}_{t}}(G(\bar{X} ^{\varepsilon }(s),y)-G(\bar{X}(s),y))\varphi ^{\varepsilon }(s, y)\nu (dy)ds,\\ T_{4}^{\varepsilon }(t)&\doteq \int _{\mathscr {X}_{t}}G(\bar{X} (s),y)(\varphi ^{\varepsilon }(s,y)-\varphi (s, y))\nu (dy)ds. \end{aligned}$$

Using Condition 10.1, we obtain

$$ T_{3}^{\varepsilon }(t)\le \Vert \bar{X}^{\varepsilon }-\bar{X}\Vert _{\infty , 1}\int _{\mathscr {X}_{1}}L_{G}(y)\varphi ^{\varepsilon }(s, y)\nu (dy)ds. $$

Since \(L_{G}\in \mathscr {L}^{1}(\nu )\cap \mathscr {L}_{\text{ exp }}^{\rho } \), we see from Lemma 10.8 that \(T_{3}^{\varepsilon }(t)\rightarrow 0\) a.s. as \(\varepsilon \rightarrow 0\). Let \(\{K_{r} \}_{r\in \mathbb {N}}\) be a sequence of compact subsets of \(\mathscr {X}\) such that \(K_{r}\uparrow \mathscr {X}\) as \(r\rightarrow \infty \), and let \(E_{r}\doteq K_{r}\cap \{M_{G}\le r\}\). Write \(T_{4}^{\varepsilon }(t)=T_{4,r\le }^{\varepsilon }(t)+T_{4,r>}^{\varepsilon }(t)\), where

$$\begin{aligned} T_{4,r\le }^{\varepsilon }(t)&\doteq \int _{\mathscr {X}_{t}}G(\bar{X}(s), y)1_{E_{r}}(y)(\varphi ^{\varepsilon }(s,y)-\varphi (s, y))\nu (dy)ds\\ T_{4,r>}^{\varepsilon }(t)&\doteq \int _{\mathscr {X}_{t}}G(\bar{X}(s), y)1_{E_{r}^{c}}(y)(\varphi ^{\varepsilon }(s,y)-\varphi (s, y))\nu (dy)ds. \end{aligned}$$

Using Lemma 10.9, for every \(r\in (0,\infty )\), \(T_{4,r\le }^{\varepsilon }(t)\rightarrow 0\) as \(\varepsilon \rightarrow 0\). Also, using Lemma 10.7 again, for every \(\delta >0\),

$$\begin{aligned} T_{4,r>}^{\varepsilon }(t)&\le (1+\Vert \bar{X}\Vert _{\infty , 1} )\int _{\mathscr {X}_{t}}M_{G}(y)1_{E_{r}^{c}}(y)(\varphi ^{\varepsilon }(s,y)+\varphi (s, y))\nu (dy)ds\\&\le 2(1+\Vert \bar{X}\Vert _{\infty , 1})\left( c(\delta , n)t\int _{\mathscr {X} }M_{G}(y)1_{E_{r}^{c}}(y)\nu (dy)+\delta \right) . \end{aligned}$$

Since \(M_{G}\in L^{1}(\nu )\), it follows that \(\sup _{\varepsilon \in (0,\varepsilon _{0})} T_{4,r>}^{\varepsilon }(t)\rightarrow 0\) if we first send \(r\rightarrow \infty \) and then \(\delta \rightarrow 0\). Combining these two estimates, we have that \(T_{4}^{\varepsilon }(t)\) converges to 0 as \(\varepsilon \rightarrow 0\). Thus we have proved (10.24), which completes the proof of the lemma and therefore, as noted previously, also the proof of Theorem 10.6.    \(\square \)

3 An MDP for Small Noise Jump-Diffusions

Throughout this section we assume Condition 10.1, which implies Lipschitz continuity and linear growth conditions on b, \(\sigma \), and G in the x variable. Let \(X^{0}\in \mathscr {C} ([0,T]:\mathbb {R}^{d})\) be the unique solution of the equation

$$ X^{0}(t)=x_{0}+\int _{0}^{t}b(X^{0}(s))ds+\int _{\mathscr {X}_{t}}G(X^{0} (s),y)\nu (dy)\,ds,\, t\in [0,T]. $$
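
For instance (an illustration only), if \(b(x)=-x\) and \(G(x, y)=y\) with \(\nu \in \mathscr {P}(\mathbb {R}^{d})\) and \(m\doteq \int _{\mathbb {R}^{d}}y\,\nu (dy)\), then \(X^{0}\) solves \(\dot{X}^{0}=-X^{0}+m\), so that

$$ X^{0}(t)=x_{0}e^{-t}+m(1-e^{-t}),\quad t\in [0,T]. $$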

We now establish a Laplace principle for \(\{Y^{\varepsilon }\}\) with scaling function \(\varkappa (\varepsilon )\), where

$$ Y^{\varepsilon }=\frac{1}{a(\varepsilon )}(X^{\varepsilon }-X^{0}), $$

and as in (9.6), \(a(\varepsilon )\) satisfies \(a(\varepsilon )\rightarrow 0\) and \(\varkappa (\varepsilon )\doteq \varepsilon /a^{2}(\varepsilon )\rightarrow 0\). For the MDP we assume some additional smoothness on the coefficients. For a differentiable function \(f:\mathbb {R}^{d}\rightarrow \mathbb {R}^{d}\) let \(Df(x)=\left( \partial f_{i}(x)/\partial x_{j}\right) _{i, j}\). Following our convention, for matrices we use the operator norm, so that \(\left\| Df(x)\right\| \doteq \sup _{w\in \mathbb {R}^{d}:\left\| w\right\| =1}\left\| Df(x)w\right\| \). Similarly, if \(g:\mathbb {R}^{d} \times \mathscr {X}\rightarrow \mathbb {R}^{d}\) is differentiable in x for each fixed \(y\in \mathscr {X}\) and \(D_{x}g(x,y)=\left( \partial g_{i}(x,y)/\partial x_{j}\right) _{i, j}\), then \(\left\| D_{x}g(x, y)\right\| \) denotes the norm of this matrix.
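
A canonical family of scalings satisfying (9.6) is \(a(\varepsilon )=\varepsilon ^{\theta }\) with \(\theta \in (0,1/2)\), for which

$$ a(\varepsilon )=\varepsilon ^{\theta }\rightarrow 0\quad \text{ and }\quad \varkappa (\varepsilon )=\frac{\varepsilon }{a^{2}(\varepsilon )}=\varepsilon ^{1-2\theta }\rightarrow 0, $$

so that the moderate deviation regime interpolates between the large deviation scaling (formally \(\theta =0\)) and the central limit scaling (formally \(\theta =1/2\)).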

For the MDP, the integrability assumption on \(M_{G}\) in Condition 10.3 can be weakened, analogous to the corresponding weakening in going from the LDP to MDP in the setting of discrete time models (Chaps. 4 and 5). The following is the only assumption besides Condition 10.1 needed for the MDP.

Condition 10.13

(a) The functions \(L_{G}\) and \(M_{G}\) are in \(\mathscr {L}^{1}(\nu )\cap \mathscr {L}_{\text{ exp }}^{\rho }\) for some \(\rho \in (0,\infty )\).

(b) For every \(y\in \mathscr {X}\), the maps \(x\mapsto b(x)\) and \(x\mapsto G(x, y)\) are differentiable. For some \(L_{Db}\in (0,\infty )\),

$$ \left\| Db(x)-Db(\bar{x})\right\| \le L_{Db}\left\| x-\bar{x}\right\| ,\, x,\bar{x}\in \mathbb {R}^{d}; $$

for some \(L_{DG}\in \mathscr {L}^{1}(\nu )\),

$$ \left\| D_{x}G(x,y)-D_{x}G(\bar{x},y)\right\| \le L_{DG}(y)\left\| x-\bar{x}\right\| ,\,x,\bar{x}\in \mathbb {R}^{d},\, y\in \mathscr {X}; $$

and lastly,

$$ \sup _{\{x\in \mathbb {R}^{d}:\Vert x\Vert \le \Vert X^{0} \Vert _{\infty ,T}\}}\int _{\mathscr {X}}\Vert D_{x}G(x, y)\Vert \nu (dy)<\infty . $$

Recall from Chap. 9 that we define \(\mathscr {L}^{2}\doteq \mathscr {L}^{2}([0,T]:\mathscr {H}_{0})\times \mathscr {L}^{2}(\nu _{T})\), and that in this chapter, \(\mathscr {H}_{0}=\mathbb {R}^{d}\). For \(q=(f_{1}, f_{2} )\in \mathscr {L}^{2}\), consider the equation

$$\begin{aligned} \eta (t)&=\int _{0}^{t}[Db(X^{0}(s))]\eta (s)ds+\int _{\mathscr {X}_{t}} [D_{x}G(X^{0}(s), y)]\eta (s)\nu (dy)ds\nonumber \\&+\int _{0}^{t}\sigma (X^{0}(s))f_{1}(s)ds+\int _{\mathscr {X} _{t}}G(X^{0}(s), y)f_{2}(s, y)\nu (dy)ds. \end{aligned}$$
(10.25)

Since \(M_{G}\in \mathscr {L}^{1}(\nu )\cap \mathscr {L}_{\text{ exp }}^{\rho }\subset \mathscr {L}^{2}(\nu )\), the last integral on the right side is finite by Hölder’s inequality, and so under Condition 10.13, (10.25) has a unique solution \(\eta _{q}\in \mathscr {C} ([0,T]:\mathbb {R}^{d})\). For \(\eta \in \mathscr {D}([0,T]:\mathbb {R}^{d})\), let

$$ \bar{I}(\eta )\doteq \inf _{q=(f_{1}, f_{2})\in \mathscr {L}^{2}:\eta =\eta _{q} }\left[ \frac{1}{2}\left( \Vert f_{1}\Vert _{W, 2}^{2}+\Vert f_{2}\Vert _{N, 2}^{2}\right) \right] . $$

In particular, \(\bar{I}(\eta )=\infty \) for all \(\eta \in \mathscr {D} ([0,T]:\mathbb {R}^{d})\setminus \mathscr {C}([0,T]:\mathbb {R}^{d})\).

Theorem 10.14

Assume Conditions 10.1 and 10.13. Then \(\{Y^{\varepsilon }\}_{\varepsilon >0}\) satisfies the Laplace principle in \(\mathscr {D}([0,T]:\mathbb {R}^{d})\) with scaling function \(\varkappa (\varepsilon )\) and rate function \(\bar{I}\).

The following theorem gives an alternative expression for the rate function. From part (d) of Condition 10.1 and part (a) of Condition 10.13, it follows that \(y\mapsto G_{i}(X^{0}(s), y)\) is in \(\mathscr {L}^{2}(\nu )\) for all \(s\in [0,T]\) and \(i=1,\ldots , d\), where \(G=(G_{1},\ldots , G_{d})^{T}\). For \(i=1,\ldots , d\), let \(e_{i} :\mathscr {X}_{T}\rightarrow \mathbb {R}\) be measurable functions such that for each \(s\in [0,T]\), \(\{e_{i}(s,\cdot )\}_{i=1}^{d}\) is an orthonormal collection in \(\mathscr {L}^{2}(\nu )\) and the linear span of the collection contains that of \(\{G_{i}(X^{0}(s),\cdot )\}_{i=1}^{d}\). Define \(\bar{b}(x)\doteq \int _{\mathscr {X}} D_{x}G(x, y)\nu (dy)\), \(x\in \mathbb {R}^{d}\), and define also \(A:[0,T]\rightarrow \mathbb {R}^{d\times d}\) by

$$\begin{aligned} A_{ij}(s)\doteq \langle G_{i}(X^{0}(s),\cdot ), e_{j}(s,\cdot )\rangle _{\mathscr {L}^{2}(\nu )},\;i, j=1,\ldots , d,\;s\in [0,T], \end{aligned}$$
(10.26)

where \(\langle \cdot ,\cdot \rangle _{\mathscr {L}^{2}(\nu )}\) is the inner product in \(\mathscr {L}^{2}(\nu )\).
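
For example, when \(d=1\) one can take \(e_{1}(s,\cdot )\doteq G(X^{0}(s),\cdot )/\Vert G(X^{0}(s),\cdot )\Vert _{\mathscr {L}^{2}(\nu )}\) whenever the norm is nonzero (and an arbitrary unit vector in \(\mathscr {L}^{2}(\nu )\) otherwise), in which case

$$ A_{11}(s)=\Vert G(X^{0}(s),\cdot )\Vert _{\mathscr {L}^{2}(\nu )}=\left( \int _{\mathscr {X}}G(X^{0}(s), y)^{2}\nu (dy)\right) ^{1/2}, $$

so \(A(s)\tilde{f}_{2}(s)\) in (10.27) rescales the control by the \(\mathscr {L}^{2}(\nu )\) size of the jump coefficient along the LLN trajectory.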

For \(\eta \in \mathscr {D}([0,T]:\mathbb {R}^{d})\), let

$$ I(\eta )=\inf _{\tilde{q}=(\tilde{f}_{1},\tilde{f}_{2})}\left[ \frac{1}{2}\left( \Vert \tilde{f}_{1}\Vert _{2}^{2}+\Vert \tilde{f}_{2}\Vert _{2} ^{2}\right) \right] , $$

where the infimum is taken over all \(\tilde{q}=(\tilde{f}_{1},\tilde{f}_{2})\), \(\tilde{f}_{1},\tilde{f}_{2}\in \mathscr {L}^{2}([0,T]:\mathbb {R}^{d})\) such that for \(t\in [0,T]\),

$$\begin{aligned} \eta (t)&=\int _{0}^{t}[Db(X^{0}(s))+\bar{b}(X^{0}(s))]\eta (s)ds+\int _{0}^{t}\sigma (X^{0}(s))\tilde{f}_{1}(s)ds\nonumber \\&\quad +\int _{0}^{t}A(s)\tilde{f}_{2}(s)ds. \end{aligned}$$
(10.27)

Here \(\Vert \cdot \Vert _{2}\) is the usual norm on \(\mathscr {L}^{2} ([0,T]:\mathbb {R}^{d})\), and thus the same as \(\Vert \cdot \Vert _{W, 2}\). The proof of the following theorem is given in Sect. 10.3.3.

Theorem 10.15

Under the conditions of Theorem 10.14, \(I = \bar{I}\).

Remark 10.16

Theorem 10.15 in particular says that the rate function for \(\{Y^{\varepsilon }\}\) is the same as that appearing in the large deviation principle with scaling function \(\varepsilon \) for the Gaussian process

$$ dZ^{\varepsilon }(t)=B(t)Z^{\varepsilon }(t)dt+\sqrt{\varepsilon }A(t)dW_{1} (t)+\sqrt{\varepsilon }\sigma (X^{0}(t))dW_{2}(t),\;Z^{\varepsilon }(0)=0, $$

where \(W_{1}\), \(W_{2}\) are independent standard d-dimensional Brownian motions and \(B(t)=Db(X^{0}(t))+\bar{b}(X^{0}(t))\).

3.1 Some Preparatory Results

Following our standard convention, the proof is given for \(T=1\), and thus \(\mathbb {U}=\mathscr {D}([0,1]:\mathbb {R}^{d})\). From Theorem 10.2, it follows that there exists a measurable map \(\mathscr {G}^{\varepsilon }:\mathbb {V}\rightarrow \mathbb {U}\) such that \(X^{\varepsilon }=\mathscr {G}^{\varepsilon }(\sqrt{\varepsilon }W,\varepsilon N^{1/\varepsilon })\). Using \(Y^{\varepsilon }=(X^{\varepsilon }-X^{0} )/a(\varepsilon )\), there is a measurable \(\mathscr {K}^{\varepsilon }\) such that \(Y^{\varepsilon }=\mathscr {K}^{\varepsilon }(\sqrt{\varepsilon }W,\varepsilon N^{1/\varepsilon })\). Define \(\mathscr {K}^{0}:\mathscr {L}^{2}\rightarrow \mathbb {U}\) by \(\mathscr {K}^{0}(q)=\eta \) if \(\eta \) solves (10.25) for \(q=(f_{1}, f_{2})\in \mathscr {L}^{2}\). In order to prove Theorem 10.14, we will verify that Condition 9.8 holds with these choices of \(\mathscr {K}^{\varepsilon }\) and \(\mathscr {K}^{0}\).

The following lemma verifies a continuity property of \(\mathscr {K}^{0}\). Recall the space \(\hat{S}_{n}\doteq \{(f_{1}, f_{2})\in \mathscr {L}^{2}:\Vert f_{1}\Vert _{W, 2} ^{2}+\Vert f_{2}\Vert _{N, 2}^{2}\le n\}\) introduced above Condition 9.8. This is viewed as a subset of the Hilbert space \(\mathscr {L} ^{2}\) defined there, and with respect to the topology of weak convergence in \(\mathscr {L}^{2}\) it is a compact Polish space. Together with the continuity established in Lemma 10.17, the compactness of \(\hat{S}_{n}\) implies part (a) of Condition 9.8.

Lemma 10.17

Suppose Condition 10.1 holds and \(M_{G} \in \mathscr {L}^{2}(\nu )\). Fix \(n\in (0,\infty )\) and let \(q^{k}, q\in \hat{S}_{n} \), \(k\in \mathbb {N}\) be such that \(q^{k}\rightarrow q\). Let \(\mathscr {K} ^{0}(q)=\eta \), where \(\eta \) solves (10.25). Then \(\mathscr {K}^{0}(q^{k})\rightarrow \mathscr {K}^{0}(q)\).

Proof

Note that from part (d) of Condition 10.1 and since \(M_{G} \in \mathscr {L}^{2}(\nu )\), the map \((s, y)\mapsto G(X^{0}(s), y)1_{[0,t]}(s)\) is in \(\mathscr {L}^{2}(\nu _{1})\). Let \(q^{k}=(f_{1}^{k}, f_{2}^{k})\) and \(q=(f_{1}, f_{2})\). Since \(f_{2}^{k}\rightarrow f_{2}\), we have for every \(t\in [0,1]\) that

$$\begin{aligned} \int _{\mathscr {X}_{t}}f_{2}^{k}(s, y)G(X^{0}(s), y)\nu (dy)ds\rightarrow \int _{\mathscr {X}_{t}}f_{2}(s, y)G(X^{0}(s), y)\nu (dy)ds. \end{aligned}$$
(10.28)

We argue that the convergence is in fact uniform in t. Note that for \(0\le s\le t\le 1\),

$$\begin{aligned}&\left\| \int _{[s, t]\times \mathscr {X}}f_{2}^{k}(u, y)G(X^{0} (u), y)\nu (dy)du\right\| \nonumber \\&\quad \le \left( 1+\Vert X^{0}\Vert _{\infty , 1}\right) \int _{[s, t]\times \mathscr {X}}M_{G}(y)|f_{2}^{k}(u, y)|\nu (dy)du\nonumber \\&\quad \le \left( 1+\Vert X^{0}\Vert _{\infty , 1}\right) |t-s|^{1/2}\sqrt{n}\Vert M_{G}\Vert _{\mathscr {L}^{2}(\nu )}, \end{aligned}$$
(10.29)

where \(\Vert \cdot \Vert _{\mathscr {L}^{2}(\nu )}\) denotes the norm in \(\mathscr {L}^{2}(\nu )\). This bound gives equicontinuity in t, uniformly in k, and together with the pointwise convergence it shows that the convergence in (10.28) is uniform in \(t\in [0,1]\).

Next, since \(f_{1}^{k}\rightarrow f_{1}\), and since Condition 10.1 implies that \(\sigma (\cdot )\) is continuous, it follows that for every \(t\in [0,1]\),

$$ \int _{0}^{t}\sigma (X^{0}(s))f_{1}^{k}(s)ds\rightarrow \int _{0}^{t}\sigma (X^{0}(s))f_{1}(s)ds. $$

Once again an equicontinuity estimate similar to (10.29) shows that the convergence is uniform. The conclusion of the lemma now follows from Gronwall’s inequality. In more detail, write \(\eta ^{k}\doteq \mathscr {K}^{0}(q^{k})\) and let \(c\doteq \sup _{s\in [0,1]}\Vert Db(X^{0}(s))+\bar{b}(X^{0}(s))\Vert \) (finite under the standing assumptions of this section, since \(X^{0}\) is bounded). Subtracting the two copies of (10.25) satisfied by \(\eta ^{k}\) and \(\eta \) gives, for \(t\in [0,1]\),
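$$ \Vert \eta ^{k}(t)-\eta (t)\Vert \le R^{k}+c\int _{0}^{t}\Vert \eta ^{k}(s)-\eta (s)\Vert ds, $$

where

$$ R^{k}\doteq \sup _{t\in [0,1]}\left\| \int _{0}^{t}\sigma (X^{0}(s))\left( f_{1}^{k}(s)-f_{1}(s)\right) ds+\int _{\mathscr {X}_{t}}G(X^{0}(s), y)\left( f_{2}^{k}(s, y)-f_{2}(s, y)\right) \nu (dy)ds\right\| . $$

The uniform convergence established above says that \(R^{k}\rightarrow 0\), and Gronwall’s inequality yields \(\Vert \eta ^{k}-\eta \Vert _{\infty , 1}\le R^{k}e^{c}\rightarrow 0\).    \(\square \)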

In order to verify part (b) of Condition 9.8, we first prove some a priori estimates. Recall the spaces \(\mathscr {L}_{\text{ exp }}^{\rho }\) introduced in (10.3) and \(S_{n,+}^{N,\varepsilon }\) and \(S_{n}^{N,\varepsilon }\) in (9.7). Here \(S_{n,+}^{N,\varepsilon }\) are controls for the Poisson noise with cost bounded by \(na^{2}(\varepsilon )\), the scaling that is appropriate for an MDP, and \(S_{n}^{N,\varepsilon }\) are the centered and rescaled versions of elements of \(S_{n,+}^{N,\varepsilon }\).

Lemma 10.18

Let \(h\in \mathscr {L}^{1}(\nu )\cap \mathscr {L}_{\text{ exp }} ^{\rho }\) for some \(\rho >0\) and let I be a measurable subset of [0, 1]. Let \(n\in (0,\infty )\). Then there exist maps \(\vartheta ,\xi ,\zeta \) from \((0,\infty )\) to \((0,\infty )\) such that \(\vartheta (u)\rightarrow 0\) as \(u\rightarrow \infty \) and \(\xi (u)\rightarrow 0\) as \(u\rightarrow 0\), and for all \(\varepsilon ,\beta \in (0,\infty )\),

$$ \sup _{f\in S_{n}^{N,\varepsilon }}\int _{I\times \mathscr {X}} h(y)|f(s, y)|1_{\{|f|\ge \beta /a(\varepsilon )\}}\nu (dy)ds\le \sqrt{a(\varepsilon )}\vartheta (\beta )+(1+\lambda _{1}(I))\xi (\varepsilon ) $$

and

$$ \sup _{f\in S_{n}^{N,\varepsilon }}\int _{I\times \mathscr {X}}h(y)|f(s, y)|\nu (dy)\, ds\le \zeta (\beta )\lambda _{1}(I)^{1/2}+\sqrt{a(\varepsilon )}\vartheta (\beta )+(1+\lambda _{1}(I))\xi (\varepsilon ), $$

where \(\lambda _{1}(I)\) denotes the Lebesgue measure of I.

Proof

Let \(f\in S_{n}^{N,\varepsilon }\) and \(\beta \in (0,\infty )\). Then

$$\begin{aligned} \int _{I\times \mathscr {X}}h(y)|f(s,y)|\nu (dy)ds&\le \int _{I\times \mathscr {X}}h(y)|f(s, y)|1_{\{|f|\le \beta /a(\varepsilon )\}}\nu (dy)ds\\&\quad +\int _{I\times \mathscr {X}}h(y)|f(s, y)|1_{\{|f|\ge \beta /a(\varepsilon )\}}\nu (dy)ds. \nonumber \end{aligned}$$
(10.30)

Recall that \(\mathscr {L}^{1}(\nu )\cap \mathscr {L}_{\text{ exp }}^{\rho }\subset \mathscr {L}^{p}(\nu )\) for all \(p\ge 1\). By the Cauchy–Schwarz inequality and part (c) of Lemma 9.7, we have

$$\begin{aligned}&\int _{I\times \mathscr {X}}h(y)|f(s, y)|1_{\{|f|\le \beta /a(\varepsilon )\}} \nu (dy)ds \\&\quad \le \left( \lambda _{1}(I)\Vert h\Vert _{2}^{2}\int _{\mathscr {X}_{1} }f(s, y)^{2}1_{\{|f|\le \beta /a(\varepsilon )\}}\nu (dy)ds\right) ^{1/2} \nonumber \\&\quad \le \Vert h\Vert _{2}(n\kappa _{2}(\beta ))^{1/2}\lambda _{1}(I)^{1/2}.\nonumber \end{aligned}$$
(10.31)

We now consider the second term on the right side of (10.30). We decompose h as \(h1_{\{h\le 1/a(\varepsilon )^{1/2}\}}+h1_{\{h>1/a(\varepsilon )^{1/2}\}}\). Then using part (a) of Lemma 9.7, we obtain

$$\begin{aligned} \int _{I\times \mathscr {X}}h(y)1_{\{h\le 1/a(\varepsilon )^{1/2}\}} |f(s, y)|1_{\{|f|\ge \beta /a(\varepsilon )\}} \nu (dy)\, ds&\le \frac{1}{\sqrt{a(\varepsilon )}}na(\varepsilon )\kappa _{1}(\beta )\\&=n\sqrt{a(\varepsilon )}\kappa _{1}(\beta ). \end{aligned}$$

Also, letting \(g=a(\varepsilon )f+1\) and noting that the definition of \(S_{n}^{N,\varepsilon }\) implies \(g\ge 0\), we have

$$\begin{aligned}&\int _{I\times \mathscr {X}}h(y)1_{\{h>1/a(\varepsilon )^{1/2}\}} |f(s, y)|1_{\{|f|\ge \beta /a(\varepsilon )\}} \nu (dy)\, ds\nonumber \\&\quad \le \frac{\lambda _{1}(I)}{a(\varepsilon )}\int _{\mathscr {X} }h(y)1_{\{h>1/a(\varepsilon )^{1/2}\}}\nu (dy)\nonumber \\&\quad \quad +\frac{1}{a(\varepsilon )}\int _{I\times \mathscr {X}} h(y)1_{\{h>1/a(\varepsilon )^{1/2}\}}g(s,y)\nu (dy)\, ds. \end{aligned}$$
(10.32)

The first term on the right side can be bounded by

$$ \lambda _{1}(I)C_{1}(\varepsilon )\doteq \lambda _{1}(I)\int _{\mathscr {X}}h(y)^{3}1_{\{h>1/a(\varepsilon )^{1/2}\}} \nu (dy), $$

where \(h\in \mathscr {L}_{\text{ exp }}^{\rho }\cap \mathscr {L}^{1}(\nu )\) implies \(C_{1}(\varepsilon )\rightarrow 0\) as \(\varepsilon \rightarrow 0\). The second term on the right side of (10.32) can be bounded, using part (a) of Lemma 9.6 with \(a=\rho h(y)/2\), \(b=g(s, y)\), \(\sigma =1\), and also using that \(g\in S_{n,+}^{N,\varepsilon }\), by

$$\begin{aligned}&\frac{2\lambda _{1}(I)}{\rho a(\varepsilon )}\int _{\mathscr {X}}e^{\rho h(y)/2}1_{\{h>1/a(\varepsilon )^{1/2}\}}\nu (dy)+\frac{2}{\rho a(\varepsilon )}na^{2}(\varepsilon )\\&\quad \le \frac{2\lambda _{1}(I)}{\rho }\int _{\mathscr {X}}h(y)^{2}e^{\rho h(y)/2}1_{\{h>1/a(\varepsilon )^{1/2}\}}\nu (dy)+\frac{2na(\varepsilon )}{\rho }\\&\quad =\lambda _{1}(I)C_{2}(\varepsilon )+\frac{2na(\varepsilon )}{\rho }, \end{aligned}$$

where \(C_{2}(\varepsilon )\) converges to 0 as \(\varepsilon \rightarrow 0\). Thus the second term on the right side of (10.30) can be bounded by

$$ \sqrt{a(\varepsilon )}\vartheta (\beta )+(1+\lambda _{1}(I))\xi (\varepsilon ), $$

where

$$ \vartheta (\beta )=n\kappa _{1}(\beta ),\;\xi (\varepsilon )=C_{1}(\varepsilon )+C_{2}(\varepsilon )+\frac{2na(\varepsilon )}{\rho }. $$

This gives the first bound of the lemma. The second bound also follows with these choices of \(\xi \), \(\vartheta \), and \(\zeta (\beta )=\Vert h\Vert _{2}(n\kappa _{2}(\beta ))^{1/2}\) using (10.31).    \(\square \)
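We record, for the reader’s convenience, the elementary inequality from part (a) of Lemma 9.6 that was used twice in the preceding proof (stated here in the form in which it was applied, with \(\ell (b)\doteq b\log b-b+1\)):

$$ ab\le e^{\sigma a}+\frac{1}{\sigma }\ell (b),\qquad a, b\in [0,\infty ),\;\sigma \in [1,\infty ). $$

Since the cost bound defining \(S_{n,+}^{N,\varepsilon }\) controls \(\int _{\mathscr {X}_{1}}\ell (g(s, y))\nu (dy)ds\) by \(na^{2}(\varepsilon )\), this inequality converts exponential integrability of h into bounds of order \(a^{2}(\varepsilon )\); this is the source of the term \(2na(\varepsilon )/\rho \) in \(\xi \).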

The following lemma is proved in a fashion similar to that of Lemma 10.7, and so only a sketch is given.

Lemma 10.19

Let \(\theta \in \mathscr {L}_{\text{ exp }}^{\rho }\) for some \(\rho >0\) and suppose that \(\nu (\{\theta >1\})<\infty \). Then for every \(\delta >0\) and \(n\in \mathbb {N}\), there exists \(\tilde{c} (\delta , n)\in (0,\infty )\) such that for all measurable maps \(\tilde{\theta }:\mathscr {X}\rightarrow \mathbb {R}_{+}\) satisfying \(\tilde{\theta }\le \theta \), any measurable \(f:[0,1]\rightarrow \mathbb {R}_{+}\), and all \(0\le s\le t\le 1\),

$$\begin{aligned}&\sup _{g\in S_{n,+}^{N,\varepsilon }}\int _{(s,t]\times \mathscr {X}} f(u)\tilde{\theta }(y)g(u,y)\nu (dy)du\\&\quad \le \tilde{c}(\delta , n)\left( \int _{\mathscr {X}}\tilde{\theta } (y)\nu (dy)\right) \left( \int _{s}^{t}f(u)du\right) +(\delta +n\rho ^{-1} a^{2}(\varepsilon ))\Vert f\Vert _{\infty , 1}. \end{aligned}$$

Proof

Let \(f:[0,1]\rightarrow \mathbb {R}_{+}\) and \(g\in S_{n,+}^{N,\varepsilon }\). For \(m\in (0,\infty )\), let \(T_{i}(m)\), \(i=1,2\), be as in the proof of Lemma 10.7. Then using part (a) of Lemma 9.6 with \(a=\rho \theta (y)\), \(b=g(u, y)\), and \(\sigma =1\), we can bound \(T_{2}(m)\) as

$$ T_{2}(m)\le \frac{\left\| f\right\| _{\infty , 1}}{\rho }\left( \int _{\{\theta >m\}}e^{\rho \theta (y)}\nu (dy)+na^{2}(\varepsilon )\right) . $$

Also, as in the proof of Lemma 10.7,

$$ T_{1}(m)\le \beta \left( \int _{\mathscr {X}}\tilde{\theta }(y)\nu (dy)\right) \left( \int _{s}^{t}f(u)du\right) +\bar{\kappa }_{1}(\beta )mna^{2} (\varepsilon )\Vert f\Vert _{\infty , 1}, $$

where \(\bar{\kappa }_{1}(\beta )\rightarrow 0\) as \(\beta \rightarrow \infty \). The result now follows on recalling (10.9) and choosing first m sufficiently large, so that \(\rho ^{-1}\int _{\{\theta >m\}}e^{\rho \theta (y)}\nu (dy)\le \delta /2\), and then \(\beta \) sufficiently large, so that \(\bar{\kappa }_{1}(\beta )mn\sup _{\varepsilon \in (0,1)}a^{2}(\varepsilon )\le \delta /2\); one then takes \(\tilde{c}(\delta , n)=\beta \).    \(\square \)

Recall the map \(\mathscr {K}^{\varepsilon }\) introduced at the beginning of Sect. 10.3.1 and the definition of \(\mathscr {U} _{n,+}^{\varepsilon }\) in (9.8), and recall that by (9.7), \({\mathscr {U}}_{n,+}^{\varepsilon }\) is the class of controls for both types of noise for which the cost scales proportionally with \(a^{2}(\varepsilon )\). Since \(\mathscr {U}_{n,+}^{\varepsilon }\) is contained in \(\mathscr {\bar{A}}_{b,\bar{n}}\) for some \(\bar{n}\in \mathbb {N}\), it follows from Sect. 10.2.1 that for \(n\in \mathbb {N}\) and \(u=(\zeta ,\varphi )\in \mathscr {U}_{n,+}^{\varepsilon }\), the equation

$$\begin{aligned} d\bar{X}^{\varepsilon }(t)&=\left[ b(\bar{X}^{\varepsilon }(t))+\sigma (\bar{X}^{\varepsilon }(t))\zeta (t)\right] dt+\sqrt{\varepsilon }\sigma (\bar{X}^{\varepsilon }(t))dW(t)\\&\quad +\varepsilon \int _{\mathscr {X}}G(\bar{X}^{\varepsilon }(t-), x)N^{\varphi /\varepsilon }(dt\times dx), \end{aligned}$$

\(\bar{X}^{\varepsilon }(0)=x_{0}\) has a unique solution.

Define \(\bar{Y}^{\varepsilon }={\mathscr {K}}^{\varepsilon }(\sqrt{\varepsilon }W^{\zeta /\sqrt{\varepsilon }},\varepsilon N^{\varphi /\varepsilon })\), and note that using Girsanov’s theorem [Theorem D.3] as in the proof of Theorem 3.19 yields

$$\begin{aligned} \bar{Y}^{\varepsilon }=\frac{1}{a(\varepsilon )}(\bar{X}^{\varepsilon } -X^{0}). \end{aligned}$$
(10.33)

The following moment bound on \(\bar{X}^{\varepsilon }\) follows along the lines of the proof of (10.19).

Lemma 10.20

Assume Conditions 10.1 and 10.13. For every \(n\in \mathbb {N}\), there exists an \(\varepsilon _{0}\in (0,1)\) such that

$$ \sup _{\varepsilon \in (0,\varepsilon _{0})}\sup _{u=(\zeta ,\varphi )\in \mathscr {U}_{n,+}^{\varepsilon }}{E}\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1}<\infty . $$

Proof

Using the same argument as that used to establish (10.16), for all \(\varepsilon \in (0,1)\), we have

$$\begin{aligned} \Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1}&\le (1+\Vert x_{0}\Vert +\tilde{\mathscr {R}}_{1}^{\varepsilon })\exp \left\{ c_{1}(1+\sqrt{n} a_{1})+\int _{\mathscr {X}_{1}}M_{G}(y)\varphi (s,y)\nu (dy)\, ds\right\} , \end{aligned}$$
(10.34)

where \(a_{1}\doteq \sup _{\varepsilon \in (0,1)}a(\varepsilon )\) and

$$\begin{aligned}&\tilde{\mathscr {R}}_{1}^{\varepsilon }\doteq \sqrt{\varepsilon }\left\| \int _{0}^{\cdot }\sigma (\bar{X}^{\varepsilon }(s))dW(s)\right\| _{\infty , 1}\\&\quad \quad \quad +\varepsilon \left\| \int _{\mathscr {X}_{\cdot }}M_{G}(y)(1+\Vert \bar{X}^{\varepsilon }(s-)\Vert )N_{c}^{\varphi /\varepsilon }(ds\times dy)\right\| _{\infty , 1}. \end{aligned}$$

Using Lemma 10.19 with \(\delta =1\) and \(f\equiv 1\), and recalling that \(\varphi \in S_{n,+}^{N,\varepsilon }\) and that \(M_{G}\in \mathscr {L}^{1}(\nu )\cap \mathscr {L}_{\text{ exp }}^{\rho }\), we obtain

$$\begin{aligned} \Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1}&\le c_{2}(1+\Vert x_{0}\Vert +\tilde{\mathscr {R}}_{1}^{\varepsilon }), \end{aligned}$$
(10.35)

where

$$ c_{2}\doteq \exp \left\{ c_{1}(1+\sqrt{n}a_{1})+\tilde{c}(1,n)\int _{\mathscr {X}}M_{G}(y)\nu (dy)+1+\frac{n}{\rho }a_{1}^{2}\right\} <\infty . $$

We split the expected value of the second term in the definition of \(\tilde{\mathscr {R}}_{1}^{\varepsilon }\) as \(T_{1}^{\varepsilon }+T_{2}^{\varepsilon }\), where \(T_{i}^{\varepsilon }\) are defined just as in (10.17)–(10.18) in the proof of the corresponding LDP, and then bound the two terms by following the argument given below (10.17)–(10.18). In this case, however, we use Lemma 10.19 rather than Lemma 10.7, and find that for some \(c_{3}\in (0,\infty )\),

$$ T_{1}^{\varepsilon }\le c_{3}\sqrt{\varepsilon }m(1+{E}\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1})\left( \tilde{c}(1,n)\int _{\mathscr {X} }M_{G}(y)\nu (dy)+1+\frac{n}{\rho }a_{1}^{2}\right) ^{1/2}, $$

and for every \(\delta >0\) and \(\varepsilon \in (0,1)\),

$$ T_{2}^{\varepsilon }\le 2(1+{E}\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1})\left( \tilde{c}(\delta , n)\int _{\mathscr {X}}M_{G}(y)1_{\{M_{G}\ge m\}}\nu (dy)+\delta +\frac{n}{\rho }a^{2}(\varepsilon )\right) . $$

Choosing first \(\delta \) sufficiently small, next m sufficiently large, and finally \(\varepsilon _{0}\) sufficiently small, we have for all \(\varepsilon \le \varepsilon _{0}\) that

$$ {E}\tilde{\mathscr {R}}_{1}^{\varepsilon }\le \frac{1}{2c_{2}}\left( {E}\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1}+1\right) . $$

The result now follows on using this estimate in (10.35): indeed (after a standard localization to ensure that the expectations are finite),
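$$ {E}\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1}\le c_{2}\left( 1+\Vert x_{0}\Vert \right) +c_{2}{E}\tilde{\mathscr {R}}_{1}^{\varepsilon }\le c_{2}\left( 1+\Vert x_{0}\Vert \right) +\frac{1}{2}\left( {E}\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1}+1\right) , $$

so that \({E}\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1}\le 2c_{2}(1+\Vert x_{0}\Vert )+1\) for all \(\varepsilon \in (0,\varepsilon _{0})\) and \(u\in \mathscr {U}_{n,+}^{\varepsilon }\).    \(\square \)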

The following tightness property plays a key role in the proof of Theorem 10.14.

Lemma 10.21

Suppose Conditions 10.1 and 10.13 hold, and define \(\bar{Y}^{\varepsilon }\) by (10.33). For every \(n\in \mathbb {N}\), there exists an \(\varepsilon _{1}\in (0,1)\) such that

$$ \left\{ \Vert \bar{Y}^{\varepsilon }\Vert _{\infty , 1},\;u\in \mathscr {U} _{n,+}^{\varepsilon },\;\varepsilon \in (0,\varepsilon _{1})\right\} $$

is a tight collection of \(\mathbb {R}_{+}\)-valued random variables.

Proof

Let \(u=(\zeta ,\varphi )\in \mathscr {U}_{n,+}^{\varepsilon }\) and let \(\psi \doteq (\varphi -1)/a(\varepsilon )\). Then

$$\begin{aligned} \bar{X}^{\varepsilon }(t)-X^{0}(t)&=\int _{0}^{t}\left( b(\bar{X}^{\varepsilon }(s))-b(X^{0}(s))\right) ds+\sqrt{\varepsilon }\int _{0} ^{t}\sigma (\bar{X}^{\varepsilon }(s))dW(s)\\&\quad +\int _{\mathscr {X}_{t}}\varepsilon G(\bar{X}^{\varepsilon } (s-),y)N_{c}^{\varphi /\varepsilon }(ds\times dy)\\&\quad +\int _{\mathscr {X}_{t}}\left( G(\bar{X}^{\varepsilon }(s), y)-G(X^{0} (s),y)\right) \varphi (s, y)\nu (dy)ds\\&\quad +\int _{\mathscr {X}_{t}}G(X^{0}(s),y)(\varphi (s, y)-1)\nu (dy)ds\\&\quad +\int _{0}^{t}\sigma (\bar{X}^{\varepsilon }(s))\zeta (s)ds. \end{aligned}$$

Write \(\bar{Y}^{\varepsilon }=(\bar{X}^{\varepsilon }-X^{0})/a(\varepsilon )\) as

$$\begin{aligned} \bar{Y}^{\varepsilon }=M^{\varepsilon }+A^{\varepsilon }+B^{\varepsilon }+\mathscr {E}^{\varepsilon }+C^{\varepsilon }, \end{aligned}$$
(10.36)

where for \(t\in [0,1]\),

$$\begin{aligned} M^{\varepsilon }(t)&\doteq \frac{\varepsilon }{a(\varepsilon )}\int _{\mathscr {X}_{t}}G(\bar{X}^{\varepsilon }(s-), y)N_{c}^{\varphi /\varepsilon }(ds\times dy)+\left( \frac{\varepsilon }{a(\varepsilon )}\right) ^{1/2} \int _{0}^{t}\sigma (\bar{X}^{\varepsilon }(s))dW(s)\\ A^{\varepsilon }(t)&\doteq \frac{1}{a(\varepsilon )}\int _{0}^{t}\left( b(\bar{X}^{\varepsilon }(s))-b(X^{0}(s))\right) ds,\\ B^{\varepsilon }(t)&\doteq \frac{1}{a(\varepsilon )}\int _{\mathscr {X}_{t} }\left( G(\bar{X}^{\varepsilon }(s), y)-G(X^{0}(s),y)\right) \nu (dy)ds,\\ \mathscr {E}^{\varepsilon }(t)&\doteq \int _{\mathscr {X}_{t}}\left( G(\bar{X}^{\varepsilon }(s), y)-G(X^{0}(s),y)\right) \psi (s, y)\nu (dy)ds,\\ C^{\varepsilon }(t)&\doteq \int _{\mathscr {X}_{t}}G(X^{0}(s),y)\psi (s, y)\nu (dy)ds+\frac{1}{a(\varepsilon )}\int _{0}^{t}\sigma (\bar{X} ^{\varepsilon }(s))\zeta (s)ds. \end{aligned}$$

With \(\varepsilon _{0}\) as in Lemma 10.20, we have from the Burkholder–Davis–Gundy inequality (see Sect. D.1) that

$$ \left\{ \left\| \int _{0}^{\cdot }\sigma (\bar{X}^{\varepsilon } (s))dW(s)\right\| _{\infty , 1}\,\right\} _{\varepsilon \le \varepsilon _{0}} $$

is tight. Also, as in the proof of Lemma 10.20, for some \(c_{1} \in (0,\infty )\), we have

$$\begin{aligned}&{E}\left\| \int _{\mathscr {X}_{\cdot }}G(\bar{X}^{\varepsilon }(s-),y)N_{c}^{\varphi /\varepsilon }(ds\times dy)\right\| _{\infty , 1}\le (1+{E}\Vert \bar{X}^{\varepsilon }\Vert _{\infty , 1})\\&\qquad \qquad \quad ~ \times \left( c_{1} \sqrt{\varepsilon }m+2\tilde{c}(\delta , n)\int _{\mathscr {X}}M_{G}(y)1_{\{M_{G} \ge m\}}\nu (dy)+2\delta +\frac{2n}{\rho }a^{2}(\varepsilon )\right) \end{aligned}$$

for every \(\delta >0\), \(\varepsilon \in (0,\varepsilon _{0})\), and \(u\in \mathscr {U}_{n,+}^{\varepsilon }\). Combining these two estimates, we see that \(\left\{ \Vert M^{\varepsilon }\Vert _{\infty , 1}\right\} _{\varepsilon \le \varepsilon _{0}}\) is tight, and since \(\varepsilon /a(\varepsilon )\rightarrow 0\),

$$\begin{aligned} \Vert M^{\varepsilon }\Vert _{\infty , 1}\rightarrow 0 \text{ in probability} \end{aligned}$$
(10.37)

as \(\varepsilon \rightarrow 0\).

In the rest of the proof, we will show upper bounds of the form \(c\), \(c\int _{0}^{t}\Vert \bar{Y}^{\varepsilon }\Vert _{\infty , s}ds\), or \(ca(\varepsilon )\Vert \bar{Y}^{\varepsilon }\Vert _{\infty , 1}\) for each of the remaining terms in (10.36). By the Lipschitz condition on G [part (c) of Condition 10.1 and part (a) of Condition 10.13], there is \(c_{2}\in (0,\infty )\) such that for all \(t\in [0,1]\), \(u\in \mathscr {U}_{n,+}^{\varepsilon }\), we have

$$\begin{aligned} \Vert \mathscr {E}^{\varepsilon }\Vert _{\infty ,t}&\le a(\varepsilon )\int _{\mathscr {X}_{t}}L_{G}(y)\Vert \bar{Y}^{\varepsilon }(s)\Vert \,|\psi (s,y)|\nu (dy)ds\nonumber \\&\le a(\varepsilon )\Vert \bar{Y}^{\varepsilon }\Vert _{\infty ,t} \int _{\mathscr {X}_{t}}L_{G}(y)|\psi (s, y)|\nu (dy)ds\nonumber \\&\le c_{2}a(\varepsilon )\Vert \bar{Y}^{\varepsilon }\Vert _{\infty , t}, \end{aligned}$$
(10.38)

where the last inequality follows from the second bound in Lemma 10.18. Again using the Lipschitz condition on G, we have for all \(t\in [0,1]\) that

$$ \Vert B^{\varepsilon }\Vert _{\infty , t}\le \Vert L_{G}\Vert _{1}\int _{0}^{t} \Vert \bar{Y}^{\varepsilon }(s)\Vert ds. $$

Similarly, the Lipschitz condition on b gives

$$ \Vert A^{\varepsilon }\Vert _{\infty , t}\le L_{b}\int _{0}^{t}\Vert \bar{Y}^{\varepsilon }(s)\Vert ds. $$

Finally, we come to the term \(C^{\varepsilon }\). Again using the second bound in Lemma 10.18, we have that for some \(c_{3}\in (0,\infty )\) and all \(u=(\zeta ,\varphi )\in \mathscr {U}_{n,+}^{\varepsilon }\), with \(\psi =(\varphi -1)/a(\varepsilon )\),

$$ \left\| \int _{\mathscr {X}_{\cdot }}G(X^{0}(s),y)\psi (s,y)\nu (dy)ds\right\| _{\infty , 1}\le c_{3}. $$

Since \((\zeta ,\varphi )\in \mathscr {U}_{n,+}^{\varepsilon }\), the bound \(\int _{0}^{1}\left\| \zeta (s)\right\| ^{2}ds\le na^{2}(\varepsilon )\) applies. Thus for \(t\in [0,1]\),

$$ \frac{1}{a(\varepsilon )}\int _{0}^{t}\sigma (\bar{X}^{\varepsilon } (s))\zeta (s)ds=\frac{1}{a(\varepsilon )}\int _{0}^{t}\sigma (X^{0}(s))\zeta (s)ds+\frac{1}{a(\varepsilon )}\mathscr {R}_{1}^{\varepsilon }(t), $$

where

$$\begin{aligned} \Vert \mathscr {R}_{1}^{\varepsilon }\Vert _{\infty , 1}\le a^{2}(\varepsilon )L_{\sigma }\Vert \bar{Y}^{\varepsilon }\Vert _{\infty , 1}\sqrt{n}, \end{aligned}$$
(10.39)

and

$$ \left\| \frac{1}{a(\varepsilon )}\int _{0}^{t}\sigma (X^{0}(s))\zeta (s)ds\right\| _{\infty , 1}\le (\Vert X^{0}\Vert _{\infty , 1}L_{\sigma }+\Vert \sigma (0)\Vert )\sqrt{n}. $$

Bringing the terms in (10.36) that are of the form \(ca(\varepsilon )\Vert \bar{Y}^{\varepsilon }\Vert _{\infty , 1}\) to the left side and dividing by the resulting coefficient (which is bounded below by 1/2 for all sufficiently small \(\varepsilon \), since \(a(\varepsilon )\rightarrow 0\)), we have, for some \(c_{4}\in (0,\infty )\), \(\tilde{\varepsilon }_{0}\in (0,{\varepsilon }_{0})\), and all \(u\in \mathscr {U}_{n,+}^{\varepsilon }\), \(t\in [0,1]\), and \(\varepsilon \le \tilde{\varepsilon }_{0}\), that

$$ \Vert \bar{Y}^{\varepsilon }\Vert _{\infty , t}\le c_{4}\left( 1+\int _{0} ^{t}\Vert \bar{Y}^{\varepsilon }\Vert _{\infty , s}ds\right) +Z^{\varepsilon }, $$

where \(\{Z^{\varepsilon }\}_{\varepsilon \le \tilde{\varepsilon }_{0}}\) is tight. The result now follows by an application of Gronwall’s inequality: since \(t\mapsto \Vert \bar{Y}^{\varepsilon }\Vert _{\infty , t}\) is nondecreasing, the last display gives
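$$ \Vert \bar{Y}^{\varepsilon }\Vert _{\infty , 1}\le \left( c_{4}+Z^{\varepsilon }\right) e^{c_{4}}, $$

and the tightness of \(\{Z^{\varepsilon }\}_{\varepsilon \le \tilde{\varepsilon }_{0}}\) carries over to \(\{\Vert \bar{Y}^{\varepsilon }\Vert _{\infty , 1}\}_{\varepsilon \le \tilde{\varepsilon }_{0}}\).    \(\square \)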

The next two lemmas will be needed in the weak convergence arguments of Lemma 10.24. The first is an immediate consequence of Lemma 10.18.

Lemma 10.22

Let \(h\in \mathscr {L}^{1}(\nu )\cap \mathscr {L}_{\text{ exp }}^{\rho }\) for some \(\rho >0\). Then for all \(n\in \mathbb {N}\) and \(\beta \in (0,\infty )\),

$$ \sup _{f\in S_{n}^{N,\varepsilon }}\int _{\mathscr {X}_{1}} h(y)\,|f(s, y)|1_{\{|f|\ge \beta /a(\varepsilon )\}}\nu (dy)ds\rightarrow 0\quad \text{ as } \varepsilon \rightarrow 0. $$

For \(n\in (0,\infty )\), let \(\hat{S}_{n}^{N}\doteq \{f\in \mathscr {L}^{2}(\nu _{1}):\Vert f\Vert _{N, 2}^{2}\le n\}\).

Lemma 10.23

Let \(n\in \mathbb {N}\), \(\varepsilon >0\), and \(f^{\varepsilon }\in S_{n}^{N,\varepsilon }\). Let \(\eta :\mathscr {X}_{1}\rightarrow \mathbb {R}^{d}\) be a measurable function such that

$$ |\eta (s,y)|\le h(y)\text { for }y\in \mathscr {X}, s\in [0,1], $$

where \(h\in \mathscr {L}^{1}(\nu )\cap \mathscr {L}_{\text{ exp }}^{\rho }\) for some \(\rho >0\). Suppose there is \(\beta \in (0,1]\) such that \(f^{\varepsilon }1_{\{|f^{\varepsilon }|\le \beta /a(\varepsilon )\}}\) converges in \(\hat{S}_{n\kappa _{2}(1)}^{N}\) to f. Then for all \(t\in [0,1]\),

$$ \int _{\mathscr {X}_{t}}\eta (s,y)f^{\varepsilon }(s, y)\nu _{1}(ds\times dy)\rightarrow \int _{\mathscr {X}_{t}}\eta (s,y)f(s, y)\nu _{1}(ds\times dy). $$

Proof

It follows from Lemma 10.22 that

$$ \int _{\mathscr {X}_{1}}|\eta (s,y)f^{\varepsilon }(s, y)|1_{\{|f^{\varepsilon }|\ge \beta /a(\varepsilon )\}}\nu _{1}(ds\times dy)\rightarrow 0\; \text{ as } \varepsilon \rightarrow 0. $$

Also, since \(\eta 1_{[0,t]}\in \mathscr {L}^{2}(\nu _{1})\) for all \(t\in [0,1]\) and \(f^{\varepsilon }1_{\{|f^{\varepsilon }|\le \beta /a(\varepsilon )\}}\rightarrow f\), we have

$$ \int _{\mathscr {X}_{t}}\eta (s,y)f^{\varepsilon }(s, y)1_{\{|f^{\varepsilon } |\le \beta /a(\varepsilon )\}}\nu _{1}(ds\times dy)\rightarrow \int _{\mathscr {X} _{t}}\eta (s,y)f(s, y)\nu _{1}(ds\times dy). $$

The result follows on combining the last two displays.    \(\square \)

3.2 Proof of the Moderate Deviation Principle

The following is the key result needed in the proof of Theorem 10.14. It gives tightness of the joint distribution of controls and controlled processes, and indicates how limits of these two quantities are related. Recall \(\hat{S}_{n}^{N}\doteq \{f\in \mathscr {L}^{2}(\nu _{1}):\Vert f\Vert _{N, 2}^{2}\le n\}\).

Lemma 10.24

Suppose Conditions 10.1 and 10.13 hold. Let \(n\in \mathbb {N}\), \(\varepsilon >0\), and \(u^{\varepsilon }=(\zeta ^{\varepsilon },\varphi ^{\varepsilon })\in \mathscr {U}_{n,+}^{\varepsilon }\). Let \(\psi ^{\varepsilon }\doteq (\varphi ^{\varepsilon }-1)/a(\varepsilon )\), suppose \(\beta \in (0,1]\), and set \(\bar{Y}^{\varepsilon }\doteq {\mathscr {K} }^{\varepsilon }(\sqrt{\varepsilon }W^{\zeta ^{\varepsilon }/\sqrt{\varepsilon } },\varepsilon N^{\varphi ^{\varepsilon }/\varepsilon })\). Then \(\{(\bar{Y}^{\varepsilon },\psi ^{\varepsilon }1_{\{|\psi ^{\varepsilon }|\le \beta /a(\varepsilon )\}},\zeta ^{\varepsilon }/a(\varepsilon ))\}_{\varepsilon >0}\) is tight in

$$ \mathscr {D}([0,1]:\mathbb {R}^{d})\times \hat{S}_{n(\kappa _{2}(1)+1)}, $$

and any limit point \((\bar{Y},\psi ,\zeta )\) satisfies (10.25) a.s., with \(\eta \) replaced by \(\bar{Y}\) and \((f_{1}, f_{2})\) replaced by \((\zeta ,\psi )\).

Proof

We use the notation from the proof of Lemma 10.21 but replace \((\zeta ,\varphi )\) throughout by \((\zeta ^{\varepsilon },\varphi ^{\varepsilon })\). Assume without loss of generality that \(\varepsilon \le \varepsilon _{0}\). From (10.37) we have that \(\Vert M^{\varepsilon }\Vert _{\infty , 1}\rightarrow 0\) in probability as \(\varepsilon \rightarrow 0\). Also, since from Lemma 10.21 \(\{\Vert \bar{Y}^{\varepsilon }\Vert _{\infty , 1}\}_{\varepsilon \le \varepsilon _{0}}\) is tight, (10.38) implies that \(\Vert \mathscr {E}^{\varepsilon }\Vert _{\infty , 1}\rightarrow 0\) in probability.

Noting that \(\bar{X}^{\varepsilon }(t)=X^{0}(t)+a(\varepsilon )\bar{Y}^{\varepsilon }(t)\), we have by Taylor’s formula,

$$ G(\bar{X}^{\varepsilon }(s), y)-G(X^{0}(s), y)=a(\varepsilon )D_{x}G(X^{0}(s),y)\bar{Y}^{\varepsilon }(s)+R^{\varepsilon }(s, y), $$

where

$$ \Vert R^{\varepsilon }(s, y)\Vert \le L_{DG}(y)a^{2}(\varepsilon )\Vert \bar{Y}^{\varepsilon }(s)\Vert ^{2}. $$

Hence

$$ B^{\varepsilon }(t)=\int _{\mathscr {X}_{t}}D_{x}G(X^{0} (s),y)\bar{Y}^{\varepsilon }(s)\nu (dy)\, ds+T_{1}^{\varepsilon }(t), $$

where

$$ \Vert T_{1}^{\varepsilon }\Vert _{\infty , 1}\le \Vert L_{DG}\Vert _{1}\, a(\varepsilon )\int _{0}^{1}\Vert \bar{Y}^{\varepsilon }(s)\Vert ^{2}ds. $$

Thus using Lemma 10.21 again, we have that \(\Vert T_{1}^{\varepsilon } \Vert _{\infty , 1}\rightarrow 0\) in probability. Similarly,

$$ A^{\varepsilon }(t)=\int _{0}^{t}Db(X^{0}(s))\bar{Y}^{\varepsilon } (s)ds+T_{2}^{\varepsilon }(t), $$

where \(\Vert T_{2}^{\varepsilon }\Vert _{\infty , 1}\rightarrow 0\) in probability. Also, from (10.39) and the tightness of \(\{\Vert \bar{Y}^{\varepsilon }\Vert _{\infty , 1}\}\), \(a(\varepsilon )^{-1}\Vert \mathscr {R}_{1}^{\varepsilon }\Vert _{\infty , 1}\rightarrow 0\) in probability. Putting these estimates together, we have from (10.36) that

$$\begin{aligned} \bar{Y}^{\varepsilon }(t)&=T_{3}^{\varepsilon }(t)+\int _{0}^{t} Db(X^{0}(s))\bar{Y}^{\varepsilon }(s)ds+\frac{1}{a(\varepsilon )}\int _{0} ^{t}\sigma (X^{0}(s))\zeta ^{\varepsilon }(s)ds\\&\quad +\int _{\mathscr {X}_{t}}D_{x}G(X^{0}(s), y)\bar{Y}^{\varepsilon } (s)\nu (dy)ds+\int _{\mathscr {X}_{t}}G(X^{0}(s),y)\psi ^{\varepsilon } (s, y)\nu (dy)ds,\nonumber \end{aligned}$$
(10.40)

where

$$ T_{3}^{\varepsilon }\doteq M^{\varepsilon }+\mathscr {E}^{\varepsilon }+T_{1}^{\varepsilon }+T_{2}^{\varepsilon }+a(\varepsilon )^{-1}\mathscr {R}_{1}^{\varepsilon }\Rightarrow 0. $$

We now prove tightness of

$$\begin{aligned} \tilde{A}^{\varepsilon }(\cdot )&\doteq \int _{0}^{\cdot }Db(X^{0}(s))\bar{Y}^{\varepsilon }(s)ds,\;\tilde{B}^{\varepsilon }(\cdot )\doteq \int _{\mathscr {X}_{\cdot }}D_{x}G(X^{0}(s), y) \bar{Y}^{\varepsilon }(s)\nu (dy)ds,\\ \tilde{C}^{\varepsilon }(\cdot )&\doteq \int _{\mathscr {X}_{\cdot }} G(X^{0}(s),y)\psi ^{\varepsilon }(s, y)\nu (dy)ds,\;\tilde{D}^{\varepsilon } (\cdot )\doteq \frac{1}{a(\varepsilon )}\int _{0}^{\cdot }\sigma (X^{0} (s))\zeta ^{\varepsilon }(s)ds. \end{aligned}$$

Applying Lemma 10.18 with \(h=M_{G}\), for every \(\beta \in (0,1]\) and \(\delta \in (0,1)\), we have

$$\begin{aligned} \Vert \tilde{C}^{\varepsilon }(t+\delta )-\tilde{C}^{\varepsilon }(t)\Vert&\le \int _{[t, t+\delta ]\times \mathscr {X}}\Vert G(X^{0}(s),y)\Vert \,|\psi ^{\varepsilon }(s, y)|\nu (dy)ds\\&\le \left( 1+\Vert X^{0}\Vert _{\infty , 1}\right) \int _{[t,t+\delta ]\times \mathscr {X}} M_{G}(y)|\psi ^{\varepsilon }(s, y)|\nu (dy)ds\\&\le \left( 1+\Vert X^{0}\Vert _{\infty , 1}\right) (\zeta (\beta )\delta ^{1/2}+\sqrt{a(\varepsilon )}\vartheta (\beta )+2\xi (\varepsilon )). \end{aligned}$$

Tightness of \(\{\tilde{C}^{\varepsilon }\}_{\varepsilon >0}\) in \(\mathscr {C} ([0,1]:\mathbb {R}^{d})\) is now immediate from the properties of \(\vartheta \) and \(\xi \).

Next we argue the tightness of \(\tilde{B}^{\varepsilon }\). For \(0\le t\le t+\delta \le 1\), we have

$$\begin{aligned} \Vert \tilde{B}^{\varepsilon }(t+\delta )-\tilde{B}^{\varepsilon }(t)\Vert&\le \int _{[t, t+\delta ]\times \mathscr {X}}\Vert D_{x}G(X^{0}(s), y)\bar{Y}^{\varepsilon }(s)\Vert \nu (dy)ds\\&\le \left( \sup _{\Vert x\Vert \le \Vert X^{0}\Vert _{\infty , 1}}\int _{\mathscr {X}}\Vert D_{x}G(x,y)\Vert \nu (dy)\right) \int _{[t, t+\delta ]}\Vert \bar{Y}^{\varepsilon }(s)\Vert ds\\&\le c_{1}\Vert \bar{Y}^{\varepsilon }\Vert _{\infty , 1}\delta , \end{aligned}$$

where \(c_{1}\doteq \sup _{\Vert x\Vert \le \Vert X^{0}\Vert _{\infty , 1}}\int _{\mathscr {X}}\Vert D_{x} G(x, y)\Vert \nu (dy)\) is finite by part (b) of Condition 10.13. Tightness of \(\{\tilde{B}^{\varepsilon }\}_{\varepsilon >0}\) in \(\mathscr {C} ([0,1]:\mathbb {R}^{d})\) now follows as a consequence of Lemma 10.21. Similarly, it can be seen that \(\tilde{A}^{\varepsilon }\) is tight in \(\mathscr {C}([0,1]:\mathbb {R}^{d})\). Finally, since \(\zeta ^{\varepsilon }\in S_{na^{2}(\varepsilon )}^{W}\), we have \(\int _{0}^{1}\Vert \zeta ^{\varepsilon }(s)\Vert ^{2}ds\le na^{2}(\varepsilon )\), and it follows that for \(0\le t\le t+\delta \le 1\),

$$\begin{aligned} \Vert \tilde{D}^{\varepsilon }(t+\delta )-\tilde{D}^{\varepsilon }(t)\Vert&\le \frac{1}{a(\varepsilon )}\int _{t}^{t+\delta }\Vert \sigma (X^{0} (s))\zeta ^{\varepsilon }(s)\Vert ds\\&\le \sqrt{\delta }(\Vert X^{0}\Vert _{\infty , 1}L_{\sigma }+\Vert \sigma (0)\Vert )\sqrt{n}. \end{aligned}$$

Tightness of \(\{\tilde{D}^{\varepsilon }\}_{\varepsilon >0}\) in \(\mathscr {C} ([0,1]:\mathbb {R}^{d})\) is now immediate. Since each of these terms is tight, \(\{\bar{Y}^{\varepsilon }\}_{\varepsilon >0}\) is tight in \(\mathscr {D} ([0,1]:\mathbb {R}^{d})\). Also, from part (c) of Lemma 9.7,

$$ \left( \psi ^{\varepsilon }1_{\{|\psi ^{\varepsilon }|\le \beta /a(\varepsilon )\}},\frac{1}{a(\varepsilon )}\zeta ^{\varepsilon }\right) $$

takes values in the compact space \(\hat{S}_{n(\kappa _{2}(1)+1)}\) for all \(\varepsilon >0\) and is therefore automatically tight. This completes the proof of the first part of the lemma.

Suppose now that

$$ \left( \bar{Y}^{\varepsilon },\psi ^{\varepsilon }1_{\{|\psi ^{\varepsilon } |\le \beta /a(\varepsilon )\}},\frac{1}{a(\varepsilon )}\zeta ^{\varepsilon }\right) $$

converges in distribution along a subsequence to \((\bar{Y},\psi ,\zeta )\). From Lemma 10.23 and the tightness of \(\tilde{C}^{\varepsilon },\tilde{D}^{\varepsilon }\) established above, we have that

$$ \left( \bar{Y}^{\varepsilon },\int _{\mathscr {X}_{\cdot }}G(X^{0}(s),y)\psi ^{\varepsilon }(s, y)\nu (dy)ds,\frac{1}{a(\varepsilon )}\int _{0}^{\cdot } \sigma (X^{0}(s))\zeta ^{\varepsilon }(s)ds\right) $$

converges in distribution, in \(\mathscr {D}([0,1]:\mathbb {R}^{3d})\), to

$$ \left( \bar{Y},\int _{\mathscr {X}_{\cdot }}G(X^{0}(s),y)\psi (s, y)\nu (dy)ds,\int _{0}^{\cdot }\sigma (X^{0}(s))\zeta (s)ds\right) . $$

The result now follows on using this convergence in (10.40) and recalling that \(T_{3}^{\varepsilon }\Rightarrow 0\).    \(\square \)

We now complete the proof of the moderate deviation principle.

Proof

(of Theorem 10.14) It suffices to show that Condition 9.8 holds with \({\mathscr {K}}^{\varepsilon }\) and \({\mathscr {K}}^{0}\) defined as at the beginning of Sect. 10.3.1. Part (a) of the condition was verified in Lemma 10.17. Consider now part (b). Fix \(n\in \mathbb {N}\) and \(\beta \in (0,1]\). Let \((\zeta ^{\varepsilon },\varphi ^{\varepsilon })\in \mathscr {U}_{n,+}^{\varepsilon }\), let \(\psi ^{\varepsilon }=(\varphi ^{\varepsilon }-1)/a(\varepsilon )\), and suppose that \((\psi ^{\varepsilon }1_{\{|\psi ^{\varepsilon }|\le \beta /a(\varepsilon )\}},\zeta ^{\varepsilon }/a(\varepsilon ))\Rightarrow (\psi ,\zeta )\). To complete the proof, we need to show that

$$\begin{aligned} {\mathscr {K}}^{\varepsilon }\left( \sqrt{\varepsilon }W^{\zeta ^{\varepsilon }/\sqrt{\varepsilon }},\;\varepsilon N^{\varphi ^{\varepsilon }/\varepsilon }\right) \Rightarrow {\mathscr {K}}^{0}(\zeta ,\psi ). \end{aligned}$$
(10.41)

Recall that the left side of (10.41) equals \(\bar{Y}^{\varepsilon }\) defined in (10.33). From Lemma 10.24, \(\{(\bar{Y}^{\varepsilon },\psi ^{\varepsilon }1_{\{|\psi ^{\varepsilon }|\le \beta /a(\varepsilon )\}},\zeta ^{\varepsilon }/a(\varepsilon ))\}\) is tight in \(\mathscr {D}([0,1]:\mathbb {R}^{d})\times \hat{S}_{n(\kappa _{2}(1)+1)}\), and every limit point \((\bar{Y},\bar{\psi },\bar{\zeta })\) satisfies (10.25) a.s., with \(\eta \) replaced by \(\bar{Y}\) and \((f_{1}, f_{2})\) replaced by \((\bar{\zeta },\bar{\psi })\). Since (10.25) has a unique solution \(\mathscr {K}^{0}(f_{1}, f_{2})\) for every \(f=(f_{1}, f_{2})\in \mathscr {L}^{2}\) [recall \(\mathscr {L}^{2}\doteq \mathscr {L}^{2}([0,1]:\mathbb {R}^{d})\times \mathscr {L}^{2}(\nu _{1})\)], and since \((\zeta ,\psi )\) has the same law as \((\bar{\zeta },\bar{\psi })\), every limit point of \(\bar{Y}^{\varepsilon }\) must have the same distribution as \({\mathscr {K}}^{0}(\zeta ,\psi )\). The result follows.    \(\square \)

3.3 Equivalence of Two Rate Functions

In this section we present the proof of Theorem 10.15. To simplify notation, suppose without loss of generality that \(T=1\). Fix \(\eta \in \mathscr {C}([0,1]:\mathbb {R}^{d})\) and \(\delta >0\); we may assume \(I(\eta )<\infty \), since otherwise the inequality \(\bar{I}(\eta )\le I(\eta )\) established first is trivial. Let \(\tilde{f}=(\tilde{f}_{1},\tilde{f}_{2})\), with \(\tilde{f}_{i}\in \mathscr {L}^{2}([0,1]:\mathbb {R}^{d})\) for \(i=1,2\), be such that

$$ \frac{1}{2}\int _{0}^{1}\left( \Vert \tilde{f}_{1}(s)\Vert ^{2}+\Vert \tilde{f}_{2}(s)\Vert ^{2}\right) ds\le I(\eta )+\delta $$

and \((\eta ,\tilde{f})\) satisfy (10.27). Define \(f_{2}:\mathscr {X}_{1}\rightarrow \mathbb {R}\) by

$$\begin{aligned} f_{2}(s, y)\doteq \sum _{i=1}^{d}\tilde{f}_{2,i}(s)e_{i}(s,y),\;(s, y)\in \mathscr {X}_{1}, \end{aligned}$$
(10.42)

where the \(e_{i}\) were introduced just before the statement of Theorem 10.15. From the orthonormality of \(e_{i}(s,\cdot )\), it follows that

$$\begin{aligned} \frac{1}{2}\int _{\mathscr {X}_{1}}|f_{2}(s, y)|^{2}\nu _{1}(ds\times dy)=\frac{1}{2}\int _{0}^{1}\Vert \tilde{f}_{2}(s)\Vert ^{2}ds. \end{aligned}$$
(10.43)

Also, with A(s) defined as in (10.26), we have

$$\begin{aligned}{}[A(s)\tilde{f}_{2}(s)]_{i}&=\sum _{j=1}^{d}\langle G_{i} (X^{0}(s),\cdot ), e_{j}(s,\cdot )\rangle _{\mathscr {L}^{2}(\nu )}\tilde{f} _{2,j}(s)\\&=\left\langle G_{i}(X^{0}(s),\cdot ),\sum _{j=1}^{d}e_{j}(s,\cdot )\tilde{f}_{2,j}(s)\right\rangle _{\mathscr {L}^{2}(\nu )}\\&=\langle G_{i}(X^{0}(s),\cdot ), f_{2}(s,\cdot )\rangle _{\mathscr {L}^{2}(\nu )}, \end{aligned}$$

so that \(A(s)\tilde{f}_{2}(s)=\int _{\mathscr {X}}f_{2}(s, y)G(X^{0} (s), y)\nu (dy)\). Consequently, \(\eta \) satisfies (10.25) with \(f_{2}\) as in (10.42) and \(f_{1}=\tilde{f}_{1}\). Combining this with (10.43), we have

$$\begin{aligned} \bar{I}(\eta )&\le \frac{1}{2}\int _{0}^{1}\Vert \tilde{f}_{1}(s)\Vert ^{2}ds+\frac{1}{2}\int _{\mathscr {X}_{1}}|f_{2}(s, y)|^{2}\nu _{1}(ds\times dy)\\&=\frac{1}{2}\int _{0}^{1}\left( \Vert \tilde{f}_{1}(s)\Vert ^{2}+\Vert \tilde{f}_{2}(s)\Vert ^{2}\right) ds\\&\le I(\eta )+\delta . \end{aligned}$$

Since \(\delta >0\) is arbitrary, we have \(\bar{I}(\eta )\le I(\eta )\).

Conversely, assume \(\bar{I}(\eta )<\infty \) (otherwise \(I(\eta )\le \bar{I}(\eta )\) holds trivially), and suppose \(\delta >0\) and \(q=(f_{1}, f_{2})\in \mathscr {L}^{2}\) is such that

$$ \frac{1}{2}\int _{\mathscr {X}_{1}}|f_{2}(s, y)|^{2}\nu _{1}(ds\times dy)+\frac{1}{2}\int _{0}^{1}\Vert f_{1}(s)\Vert ^{2}ds\le \bar{I}(\eta )+\delta $$

and (10.25) holds. For \(i=1,\ldots , d\), define \(\tilde{f}_{2,i}:[0,1]\rightarrow \mathbb {R}\) by

$$ \tilde{f}_{2,i}(s)=\langle f_{2}(s,\cdot ), e_{i}(s,\cdot )\rangle _{\mathscr {L} ^{2}(\nu )}. $$

For \(s\in [0,1]\), let \(\{e_{j}(s,\cdot )\}_{j=d+1}^{\infty }\) be defined in such a manner that \(\{e_{j}(s,\cdot )\}_{j=1}^{\infty }\) is a complete orthonormal system in \(\mathscr {L}^{2}(\nu )\). Then for every \(s\in [0,1]\), \(i=1,\ldots , d\),

$$\begin{aligned}{}[A(s)\tilde{f}_{2}(s)]_{i}&=\sum _{j=1}^{d}\langle G_{i}(X^{0}(s),\cdot ), e_{j}(s,\cdot )\rangle _{\mathscr {L}^{2}(\nu )}\langle f_{2}(s,\cdot ), e_{j}(s,\cdot )\rangle _{\mathscr {L}^{2}(\nu )}\\&=\sum _{j=1}^{\infty }\langle G_{i}(X^{0}(s),\cdot ), e_{j}(s,\cdot )\rangle _{\mathscr {L}^{2}(\nu )}\langle f_{2}(s,\cdot ), e_{j}(s,\cdot )\rangle _{\mathscr {L}^{2}(\nu )}\\&=\langle G_{i}(X^{0}(s),\cdot ), f_{2}(s,\cdot )\rangle _{\mathscr {L}^{2}(\nu )}, \end{aligned}$$

where the second equality follows on observing that \(G_{i}(X^{0}(s),\cdot )\) is in the linear span of \(\{e_{j}(s,\cdot )\}_{j=1}^{d}\) for \(i=1,\ldots , d\). Thus \(A(s)\tilde{f}_{2}(s)=\int _{\mathscr {X}}f_{2}(s, y)G(X^{0}(s), y)\nu (dy)\), and therefore \((\eta ,\tilde{f})\) satisfy (10.27) with \(\tilde{f}_{2}\) defined as above and \(\tilde{f}_{1}=f_{1}\). Note that with \(\tilde{f}_{2}=(\tilde{f}_{2,1},\ldots ,\tilde{f}_{2,d})\),

$$\begin{aligned} \frac{1}{2}\int _{0}^{1}\Vert \tilde{f}_{2}(s)\Vert ^{2}ds&=\frac{1}{2} \int _{0}^{1}\sum _{j=1}^{d}\langle f_{2}(s,\cdot ), e_{j}(s,\cdot )\rangle _{\mathscr {L}^{2}(\nu )}^{2}ds\\&\le \frac{1}{2}\int _{0}^{1}\int _{\mathscr {X}}f_{2}^{2}(s, y)\nu (dy)ds. \end{aligned}$$

The inequality in the last display is Bessel’s inequality. Thus

$$\begin{aligned} {I}(\eta )&\le \frac{1}{2}\int _{0}^{1}\left( \Vert \tilde{f}_{1} (s)\Vert ^{2}+\Vert \tilde{f}_{2}(s)\Vert ^{2}\right) ds\\&\le \frac{1}{2}\int _{0}^{1}\Vert f_{1}(s)\Vert ^{2}ds+\frac{1}{2}\int _{0}^{1}\int _{\mathscr {X}}f_{2}^{2}(s, y)\nu (dy)ds\\&\le \bar{I}(\eta )+\delta . \end{aligned}$$

Since \(\delta >0\) is arbitrary, \(I(\eta )\le \bar{I}(\eta )\), which completes the proof.    \(\square \)

4 Notes

The first results for a general class of continuous time small noise Markov processes appear to be those of Wentzell [245–248]. The class covered includes both Gaussian and Poisson driving noises, and the proof uses approximation by discrete time processes.

Large deviation principles for small noise infinite dimensional stochastic differential equations driven by PRMs are considered in [38]. Although that paper treats a considerably more complex setting than the one studied in the current chapter, it assumes a somewhat stronger condition on the coefficient function that plays the role of G in this chapter. Specifically, it requires the functions \(M_{G}\) and \(L_{G}\) to satisfy a more stringent integrability condition than the one used here (Condition 10.3). Some of the results used to deal with these weaker integrability conditions are from [47]. Another distinction between this chapter and [38] is that here we consider systems driven by both Gaussian and Poisson noise, and treat both large and moderate deviations. A moderate deviation principle of the form given in Sect. 10.3, applicable to both finite and infinite dimensional SDEs, was presented in [41] for the setting in which the driving noise is only Poisson. It is worth noting that most of the technical details in this chapter arise from the treatment of the PRM term.

There is an important distinction between the types of processes one can represent using PRMs and their large deviation analysis. For one class, which includes the models considered in this chapter as well as the example in Sect. 3.3, different points in the point space of the underlying PRM are used to model different types of jumps in the solution to the SDE. The conditions placed on the coefficient that modulates the impact of the noise on the state (G in this chapter) tend to be continuity-type conditions, analogous to those one places on the diffusion coefficient of an SDE driven by Brownian motion, though stated in terms of integration over the space \(\mathscr {X}\). These continuity properties are used to establish uniqueness of the map that takes the controls into the state for the limit dynamics, under the assumption that the cost of the controls is bounded. With the second class, there are only finitely many different types of jumps of the state, and the role of G is simply to “thin” the PRM to produce state-dependent jump rates. Examples of this type include the process models of Chap. 13 as well as those in [23, 42]. Owing to its role in thinning, G is typically not continuous, and one does not expect uniqueness of the limiting deterministic map that takes controls to the state process. However, as we will see, it is in fact sufficient to prove the following restricted uniqueness: given a state trajectory for which there is a corresponding control with finite cost, find a control that produces the same state trajectory with nearly the same cost and for which there is uniqueness. These points are illustrated in the analysis carried out in Chap. 13.