Abstract
In the present paper, we consider a family of continuous time symmetric random walks indexed by \(k\in \mathbb {N}\), \(\{X_k(t),\,t\ge 0\}\). For each \(k\in \mathbb {N}\) the corresponding random walk takes values in the finite set of states \(\Gamma _k=\frac{1}{k}(\mathbb {Z}/k\mathbb {Z})\); notice that \(\Gamma _k\) is a subset of \(\mathbb {S}^1\), where \(\mathbb {S}^1\) is the unit circle. The infinitesimal generator of such a chain is denoted by \(L_k\). The stationary probability for such a process converges to the uniform distribution on the circle as \(k\rightarrow \infty \). Here we want to study other natural measures, obtained via a limit as \(k\rightarrow \infty \), that are concentrated on some points of \(\mathbb {S}^1\). We will disturb this process by a potential and study, for each \(k\), the perturbed stationary measures of this new process as \(k\rightarrow \infty \). We disturb the system by considering a fixed \(C^2\) potential \(V: \mathbb {S}^1 \rightarrow \mathbb {R}\), and we denote by \(V_k\) the restriction of \(V\) to \(\Gamma _k\). Then, we define a non-stochastic semigroup generated by the matrix \(k\,\, L_k + k\,\, V_k\), where \(k\,\, L_k \) is the infinitesimal generator of \(\{X_k(t),\,t\ge 0\}\). By the continuous time Perron's Theorem one can normalize such a semigroup, and we then get another stochastic semigroup, which generates a continuous time Markov Chain taking values in \(\Gamma _k\). This new chain is called the continuous time Gibbs state associated to the potential \(k\,V_k\), see (Lopes et al. in J Stat Phys 152:894–933, 2013). The stationary probability vector for such a Markov Chain is denoted by \(\pi _{k,V}\). We assume that the maximum of \(V\) is attained at a unique point \(x_0\) of \(\mathbb {S}^1\), and from this it will follow that \(\pi _{k,V}\rightarrow \delta _{x_0}\). Thus, our main goal here is to analyze the large deviation principle for the family \(\pi _{k,V}\), as \(k \rightarrow \infty \).
The deviation function \(I^V\), which is defined on \( \mathbb {S}^1\), will be obtained from a procedure based on fixed points of the Lax–Oleinik operator and on Aubry–Mather theory. In order to obtain the associated Lax–Oleinik operator we use Varadhan's Lemma for the process \(\{X_k(t),\,t\ge 0\}\). For a careful analysis of the problem we present full details of the proof of the Large Deviation Principle, in the Skorohod space, for this family of Markov Chains, as \(k\rightarrow \infty \). Finally, we compute the entropy of the invariant probabilities on the Skorohod space associated to the Markov Chains we analyze.
1 Introduction
We will study a family of continuous time Markov Chains indexed by \(k\in \mathbb {N}\); for each \(k\in \mathbb {N}\) the corresponding Markov Chain takes values in the finite set of states \(\Gamma _k=\frac{1}{k}(\mathbb {Z}/k\mathbb {Z})\). Let \(\mathbb {S}^1\) be the unit circle, which can be identified with the interval \([0,1)\). In this way we identify \(\Gamma _k\) with \(\{0,\,1/k,\, 2/k,\ldots ,\,(k-1)/k\}\) in order to simplify the notation. We will analyse below a limit procedure as \(k\rightarrow \infty \), and this is the reason why we consider the states of the chain as points of the unit circle. The continuous time Markov Chain with index \(k\) has the following behaviour: if the particle is at \(j/k\) it waits an exponential time of parameter \(2\) and then jumps either to \((j-1)/k\) or to \((j+1)/k\), each with probability \(1/2\). In order to simplify the notation, we omit the indication that the sum \(j+1\) is mod \(k\), and the same for the subtraction \(j-1\); we will do this without further comment in the rest of the text. The skeleton of this continuous time Markov Chain has transition matrix \(\mathcal {P}_k=(p_{i,j})_{i,j}\), where the element \(p_{i,j}\) describes the probability of transition from \(i/k\) to \(j/k\): \(p_{i,i+1}=p_{i,i-1}=1/2\) and \(p_{i,j}=0\) for all \(j\notin \{i-1,i+1\}\). The infinitesimal generator is the matrix \(L_k=2(\mathcal {P}_k-I_k)\), where \(I_k\) is the identity matrix; in words, \(L_k\) is the matrix that is equal to \(-2\) on the diagonal, equal to \(1\) just above and below the diagonal (and in the corner entries, by the cyclic identification), and zero elsewhere. Notice that \(L_k\) is a symmetric matrix. For instance, take \(k=4\),
We can write this infinitesimal generator as an operator acting on functions \(f: \Gamma _k\rightarrow \mathbb {R}\) as
Notice that this expression describes the infinitesimal generator of a continuous time random walk. For each \(k\in \mathbb {N}\), we denote by \(P_k(t)=e^{t\, L_k}\) the semigroup associated to this infinitesimal generator. We also denote by \(\pi _k\) the uniform probability on \(\Gamma _k\). This is the invariant probability for the continuous time Markov Chain defined above. The probability \(\pi _k\) converges to the Lebesgue measure on \( \mathbb {S}^1\), as \(k \rightarrow \infty \).
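The construction above can be checked numerically; the following is a minimal sketch (our own illustration, not part of the paper), assuming NumPy and SciPy are available.

```python
# Numerical sketch: the generator L_k of the symmetric random walk on
# Gamma_k = {0, 1/k, ..., (k-1)/k}, built as L_k = 2(P_k - I_k).
import numpy as np
from scipy.linalg import expm

def generator(k):
    """L_k: -2 on the diagonal, 1 on the (cyclic) super/sub-diagonals."""
    P = np.zeros((k, k))
    for i in range(k):
        P[i, (i + 1) % k] = 0.5
        P[i, (i - 1) % k] = 0.5
    return 2.0 * (P - np.eye(k))

k = 4
L = generator(k)
assert np.allclose(L, L.T)               # L_k is symmetric
assert np.allclose(L.sum(axis=1), 0.0)   # rows sum to zero

# The semigroup P_k(t) = e^{t L_k} is stochastic, and the uniform
# probability pi_k on Gamma_k is invariant for it.
Pt = expm(1.0 * L)
assert np.all(Pt >= 0) and np.allclose(Pt.sum(axis=1), 1.0)
pi = np.full(k, 1.0 / k)
assert np.allclose(pi @ Pt, pi)
```

The same function can of course be used for any \(k\); for \(k=4\) the matrix printed by `generator(4)` is exactly the one displayed above.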
Fix \(T>0\) and \(x_0\in \mathbb {S}^1\), and let \(\mathbb {P}_k\) be the probability on the Skorohod space \(D[0,T]\), the space of càdlàg trajectories taking values in \(\mathbb {S}^1\), which is induced by the infinitesimal generator \(k\mathcal L_k\) and the initial probability \(\delta _{x_k(x_0)}\), the Dirac delta at \(x_k(x_0):=\lfloor k x_0\rfloor /k\in \Gamma _k\); here \(x_k(x_0)\) is the closest point to \(x_0\) from the left in the set \(\Gamma _k\). Denote by \(\mathbb {E}_k\) the expectation with respect to \(\mathbb {P}_k\) and by \(\{X_k(t)\}_{t\in [0,T]}\) the continuous time Markov chain with infinitesimal generator \(k\mathcal L_k\). One of our goals, described in Sect. 2, is to establish a Large Deviation Principle for \(\{\mathbb {P}_k\}_k\) in \(D[0,T]\). This will be used later, in Sect. 3.1, to define the Lax–Oleinik semigroup. One can ask: why do we use this time scale? Since the continuous time symmetric random walk converges only when time is re-scaled with speed \(k^2\), taking speed \(k\) instead makes the symmetric random walk converge to a constant trajectory. Here the setting follows ideas similar to the ones in the papers [1, 2], where N. Anantharaman used Schilder's Theorem. Schilder's Theorem says that for \(\{B_t\}_t\) (the standard Brownian Motion) the sequence \(\{\sqrt{\varepsilon }B_t\}_t\), which converges to the trajectory constantly equal to zero as \(\varepsilon \rightarrow 0\), satisfies a large deviation principle with rate function \(I(\gamma )=\int _0^T\frac{(\gamma '(s))^2}{2}\,ds\), if \(\gamma :[0,T]\rightarrow \mathbb {R}\) is absolutely continuous, and \(I(\gamma )=\infty \), otherwise.
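The concentration on constant trajectories at speed \(k\) can be seen in a quick simulation; the following Monte Carlo sketch is our own illustration (the parameter choices are arbitrary), not part of the paper.

```python
# Illustration: with generator k*L_k the walk jumps at rate 2k, each jump
# of size 1/k and mean zero, so the displacement up to time T has standard
# deviation sqrt(2T/k) -> 0; deviations from the constant path gamma(t) = x0
# are rare events on this time scale.
import random

def final_displacement(k, T, rng):
    """Simulate X_k(T) - X_k(0) for the walk with generator k*L_k."""
    pos, t = 0.0, 0.0
    while True:
        t += rng.expovariate(2.0 * k)        # holding times of rate 2k
        if t > T:
            return pos
        pos += rng.choice((-1.0, 1.0)) / k   # symmetric jumps of size 1/k

rng = random.Random(0)
samples = [abs(final_displacement(k=2500, T=1.0, rng=rng)) for _ in range(200)]
# expected magnitude ~ sqrt(2/2500) ~ 0.028, far below any fixed deviation
assert sum(samples) / len(samples) < 0.1
```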
We prove that the sequence of measures \(\{\mathbb {P}_k\}_k\) satisfies the large deviation principle with rate function \(I_T: D[0,T]\rightarrow \mathbb {R}\) such that
if \(\gamma \in \mathcal {AC}[0,T]\) and \(I_{T}(\gamma )=\infty \), otherwise.
Finally, in Sect. 3, we consider this system disturbed by a \(C^2\) potential \(V:\mathbb S^1 \rightarrow \mathbb {R}\). The restriction of \(V\) to \( \Gamma _k\) is denoted by \(V_k\). From the continuous time Perron's Theorem we get an eigenvalue and an eigenfunction for the operator \(k\,L_k + k \, V_k\). Then, normalizing the semigroup associated to \(k\,L_k + k \, V_k\) via the eigenvalue and eigenfunction of this operator, we obtain a new continuous time Markov Chain, which is called the Gibbs Markov Chain associated to \(k\, V_k\) (see [4, 19]). Denote by \(\pi _{k,V}\) the stationary probability vector of this family of continuous time Markov Chains indexed by \(k\), which take values in \( \Gamma _k\subset \mathbb S^1\). We investigate the large deviation properties of this family of stationary vectors, which are probabilities on \(\mathbb S^1\), as \(k\rightarrow \infty \). More explicitly, roughly speaking, the deviation function \(I^V\) should satisfy the following property: given an interval \([a,b]\),
If \(V:\mathbb S^1 \rightarrow \mathbb {R}\) attains its maximal value at just one point \(x_0\), then \(\pi _{k,V}\) converges weakly, as \(k\rightarrow \infty \), to the Dirac delta at \(x_0\). We will use results of Aubry–Mather theory (see [6, 8, 10] or [11]) in order to exhibit the deviation function \(I^V\), as \(k \rightarrow \infty \).
It will be natural to consider the Lagrangian defined on \(\mathbb S^1\) given by
which is convex and superlinear. It is easy to get the explicit expression of the associated Hamiltonian,
As we will see, the deviation function is obtained from certain weak KAM solutions of the associated Hamilton–Jacobi equation (see Sects. 4 and 7 in [11]). In the one-dimensional case \(\mathbb S^1\) the weak KAM solution can in some cases be explicitly obtained (for instance when \(V\) has a unique point of maximum). From the conservation of energy (see [7]), in this case, one can get a (periodic) solution with just one point of lack of differentiability.
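The explicit formulas, which appear here only as displays, can be written out; assuming the computation of Lemma 3 below (where the Legendre transform of \(H(\lambda )=e^{\lambda }+e^{-\lambda }-2\) is obtained), the Lagrangian and associated Hamiltonian should read

```latex
L^V(x,v)\;=\;v\,\log \Big (\tfrac{1}{2}\big (v+\sqrt{v^2+4}\,\big )\Big )\;-\;\sqrt{v^2+4}\;+\;2\;-\;V(x),
\qquad
H(x,p)\;=\;e^{p}+e^{-p}-2+V(x).
```

In particular \(L^V(x,0)=-V(x)\), which is consistent with the critical value \(c(L)=V(x_0)\) used in Sect. 3.2.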
It follows from the continuous time Perron's Theorem that the probability vector \(\pi _{k,V}\) depends, for each \(k\), on a left eigenvector and on a right eigenvector. In this way, the limit procedure will require in our reasoning the use of the positive time and negative time Lax–Oleinik operators (see [11]).
From a theoretical perspective, following our reasoning, one can think that we are looking for the maximum of a function \(V:\mathbb S^1 \rightarrow \mathbb {R}\) via a stochastic procedure based on continuous time Markov Chains taking values in the finite lattice \(\Gamma _k\), \(k \in \mathbb {N}\), which is a discretization of the circle \(\mathbb S^1\). Maybe this can be explored as an alternative approach to the Metropolis algorithm, which is based on freezing arguments. In our setting the deviation function \(I^V\) gives bounds for the decay of the probability that the stochastic procedure corresponding to a certain \(k\) does not localize the maximal value.
Moreover, in Sect. 4 we compute explicitly the entropy of the Gibbs state on the Skorohod space associated to the potential \(k\,V_k\). At this point we need to generalize a result which was obtained in [19]. After that, we take the limit as \(k\rightarrow \infty \), and we obtain the entropy for the limit process, which in this case is shown to be zero.
In [15] the use of Aubry–Mather Theory in the study of large deviation properties is also considered.
2 Large Deviations on the Skorohod Space for the Unperturbed System
The goal of this section is to prove the Large Deviation Principle for the sequence of measures \(\{\mathbb {P}_k\}_k\) on \(D[0,T]\), defined in Sect. 1. We recall that \(\mathbb {P}_k\) is induced by the continuous time random walk, which has infinitesimal generator \(k\mathcal L_k\), see (1), and initial measure \(\delta _{x_k(x_0)}\), which is the Dirac delta at \(x_k(x_0)=\lfloor k x_0\rfloor /k\in \Gamma _k\).
Theorem 1
The sequence of probabilities \(\{\mathbb {P}_k\}_k\) satisfies:
-
Upper Bound: For every closed set \(\mathcal {C}\subset D[0,T]\),
$$\begin{aligned} \varlimsup _{k\rightarrow \infty }\frac{1}{k}\log \mathbb {P}_k\Big [X_k\in \mathcal C\Big ]\le -\inf _{\gamma \in \mathcal C}I_{T}(\gamma ) . \end{aligned}$$ -
Lower Bound: For every open set \(\mathcal {O}\subset D[0,T]\),
$$\begin{aligned} \varliminf _{k\rightarrow \infty }\frac{1}{k}\log \mathbb {P}_k\Big [X_k\in \mathcal O\Big ]\ge -\inf _{\gamma \in \mathcal O}I_{T}(\gamma ) . \end{aligned}$$
The rate function \(I_{T}: D[0,T]\rightarrow \mathbb {R}\) is
if \(\gamma \in \mathcal {AC}[0,T]\) and \(I_{T}(\gamma )=\infty \), otherwise.
The set \(\mathcal {AC}[0,T]\) is the set of all absolutely continuous functions \(\gamma :[0,T] \rightarrow \mathbb {S}^1\). Saying that a function \(\gamma :[0,T] \rightarrow \mathbb {S}^1\) is absolutely continuous means that for all \(\varepsilon >0\) there is \(\delta >0\) such that, for every family of disjoint intervals \(\{(s_i,t_i)\}_{i=1}^{n}\) in \([0,T]\) with \(\mathop \sum \limits _{i=1}^{n} (t_i-s_i)<\delta \), we have \(\mathop \sum \limits _{i=1}^{n} |\gamma (t_i)-\gamma (s_i)|<\varepsilon \).
Proof
This proof is divided in two parts: upper bound and lower bound. The proof of the upper bound is in Sects. 2.2 and 2.3, and the proof of the lower bound is in Sect. 2.4. In Sect. 2.1, we prove some useful tools for this proof, related to the perturbation of the system and also to the computation of the Legendre transform. \(\square \)
2.1 Useful Tools
In this subsection we will prove some results that are important for both the upper bound and the lower bound. More specifically, we will study a typical perturbation of the original system and the Radon–Nikodym derivative of the corresponding process. Moreover, we will compute the Fenchel–Legendre transform of a function \(H\) that appears in a natural way in the Radon–Nikodym derivative.
For a time partition \(0=t_0<t_1<t_2<\dots <t_n=T\) and affine functions \(\lambda _i:[t_{i-1},t_i]\rightarrow \mathbb {R}\), \(i\in \{1,\dots ,n\}\), consider the polygonal function \(\lambda :[0,T]\rightarrow \mathbb R\) given by \(\lambda (s)=\lambda _i(s)\) on \([t_{i-1},t_i]\), for all \(i\in \{1,\dots ,n\}\).
For each \(k\in \mathbb N\) and for the polygonal function \(\lambda :[0,T]\rightarrow \mathbb R\), defined above, consider the martingale
notice that \(M^k_t\) is positive and \(\mathbb {E}_k[M^k_t]=1\), for all \(t\ge 0\); see Appendix 1.7 of [17]. Making a simple calculation, the part of the expression inside the integral can be rewritten as
where \(H(\lambda ):=e^\lambda +e^{-\lambda }-2\). Since \(\lambda \) is a polygonal function, the other part of the expression inside the integral is equal to
Using a telescopic sum, we have
The last equality follows from the fact that \(\lambda \) is a polygonal function \((\lambda _{i}(t_{i})=\lambda _{i+1}(t_{i}))\). Thus, the martingale \(M^k_T\) becomes
Remark 2
If \(\lambda :[0,T]\rightarrow \mathbb R\) is an absolutely continuous function, the expression for the martingale \(M^k_T\) can be rewritten as
Define a measure on \(D[0,T]\) as
for every Borel set \(A\) in \(D[0,T]\). Here \(\mathbf {1}_A\) is the indicator function of the set \(A\); that is, \(\mathbf {1}_A(x)=1\) if \(x\in A\) and \(\mathbf {1}_A(x)=0\) if \(x\notin A\).
One can observe that this measure is associated to a time-inhomogeneous process, which has infinitesimal generator acting on functions \(f: \Gamma _k\rightarrow \mathbb {R}\) as
By Proposition 7.3 in Appendix 1.7 of [17], \(M^k_T\) is the Radon–Nikodym derivative \(\frac{d\mathbb {P}_k^\lambda }{d\mathbb {P}_k}\).
To finish this subsection, we analyse the properties of the function \(H\), which appeared in the definition of the martingale \(M_T^k\).
Lemma 3
Consider the function
the Fenchel-Legendre transform of \(H\) is
Moreover, the supremum above is attained at \(\lambda _v=\log \Big (\genfrac{}{}{}1{1}{2}\Big (v+\sqrt{v^2+4}\Big )\Big )\).
Proof
Maximizing \(\lambda v -(e^\lambda +e^{-\lambda }-2)\) over \(\lambda \): differentiating in \(\lambda \) gives \(v-e^\lambda +e^{-\lambda }=0\); writing \(w=e^{\lambda }\), this becomes \(w^2-v\,w-1=0\), whose positive root is \(w=\genfrac{}{}{}1{1}{2}\big (v+\sqrt{v^2+4}\big )\). Substituting \(\lambda _v=\log w\) back, we obtain the expression in (5). \(\square \)
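As a sanity check, the closed form in Lemma 3 can be compared against a brute-force maximization; this is our own numerical illustration, not part of the paper.

```python
# Numerical check of Lemma 3: the supremum of lam*v - H(lam), with
# H(lam) = exp(lam) + exp(-lam) - 2, is attained at
# lam_v = log((v + sqrt(v^2+4))/2), giving L(v) = v*lam_v - sqrt(v^2+4) + 2.
import math

def H(lam):
    return math.exp(lam) + math.exp(-lam) - 2.0

def L_closed_form(v):
    lam_v = math.log((v + math.sqrt(v * v + 4.0)) / 2.0)
    return v * lam_v - math.sqrt(v * v + 4.0) + 2.0

def L_by_search(v, lo=-10.0, hi=10.0, n=200_000):
    # brute-force maximization of lam*v - H(lam) over a fine grid
    best = -float("inf")
    for i in range(n + 1):
        lam = lo + (hi - lo) * i / n
        best = max(best, lam * v - H(lam))
    return best

for v in (-2.0, -0.5, 0.0, 1.0, 3.0):
    assert abs(L_closed_form(v) - L_by_search(v)) < 1e-6
```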
Then, we can rewrite the rate function \(I_{T}: D[0,T]\rightarrow \mathbb {R}\), defined in (2), as
2.2 Upper Bound for Compact Sets
Let \(\mathcal C\) be an open set of \(D[0,T]\). For every polygonal function \(\lambda :[0,T]\rightarrow \mathbb R\) as in Sect. 2.1, we have
for every affine function \(\lambda _{i+1}:[t_i,t_{i+1}]\rightarrow \mathbb {R}\), where \(J^{i+1}_{\lambda _{i+1}}(\gamma )\) is equal to
Then, for every open set \(\mathcal C\) in \(D[0,T]\), minimizing over the time partition and over the functions \(\lambda _1,\dots ,\lambda _n\), we have
Since \(J^{i+1}_{\lambda _{i+1}}(\gamma )\) is continuous in \(\gamma \), using Lemma 3.3 (Minimax Lemma) in Appendix 2 of [17], we can interchange the supremum and the infimum above. Then, we obtain, for every compact set \(\mathcal K\),
where \(I_{\{t_i\}}(\gamma )=\sup _{\lambda _1}\cdots \sup _{\lambda _n}\,\mathop \sum \limits _{i=0}^{n-1}J^{i+1}_{\lambda _{i+1}}(\gamma ).\) Define \(I(\gamma )=\sup _{\{t_i\}_i}\,I_{\{t_i\}}(\gamma )\). Notice that
If \(\gamma \in \mathcal {AC}[0,T]\), then
Thus,
The last equality is true because \(L(v)=\sup _{\lambda \in \mathbb {R}}\{v\lambda -H(\lambda )\}\), see (6). Plugging this into the definition of \(I(\gamma )\), we have
Now, consider the case where \(\gamma \notin \mathcal {AC}[0,T]\); then there is \(\varepsilon >0\) such that for all \(\delta >0\) there is a family of intervals \(\{(s_i,t_i)\}_{i=1}^{n}\) in \([0,T]\), with \(\mathop \sum \limits _{i=1}^{n} (t_i-s_i)<\delta \), but \(\mathop \sum \limits _{i=1}^{n} |\gamma (t_i)-\gamma (s_i)|>\varepsilon \). Thus, taking the time partition of \([0,T]\) as \(t'_0=0<t'_1<\dots <t'_{2n}<t'_{2n+1}=T\), over the points \(s_i, t_i\), we get
Then,
for all \(\delta >0\) and for all \(\lambda \in \mathbb {R}\). Thus, \(I(\gamma )\ge \lambda \varepsilon \), for all \(\lambda \in \mathbb {R}\). Remember that \(\varepsilon \) is fixed, so letting \(\lambda \rightarrow \infty \) gives \(I(\gamma )=\infty \), for \(\gamma \notin \mathcal {AC}[0,T]\). Then \(I(\gamma )=I_{T}(\gamma )\) as in (2) or in (6).
In conclusion, we have obtained, by inequalities (7), (8) and the definition of \(I(\gamma )\), that
where \(I_T\) was defined in (2) or in (6).
2.3 Upper Bound for Closed Sets
To extend the upper bound to closed sets we use a standard argument, which is to prove that the sequence of measures \(\{\mathbb {P}_k\}_k\) is exponentially tight; see Proposition 4.3.2 in [20] or Sect. 1.2 of [21]. By exponentially tight we mean that there is a sequence of compact sets \(\{\mathcal {K}_j\}_j\) in \(D[0,T]\) such that
for all \(j\in \mathbb N\).
This section is thus concerned with exponential tightness. First of all, as in Sect. 4.3 of [20] or in Sect. 10.4 of [17], we claim that the exponential tightness is a consequence of the lemma below.
Lemma 4
For every \(\varepsilon >0\),
Proof
Firstly, notice that
We have here \(\genfrac{}{}{}1{\varepsilon }{4}\) instead of \(\genfrac{}{}{}1{\varepsilon }{3}\) due to the presence of jumps. We will use the following elementary fact: for any sequences of real numbers \(a_N,b_N\), we have
Thus, in order to prove this lemma, it is enough to show that
for every \(\varepsilon >0\) and for all \(t_0\ge 0\). Let \( M^{k}_t\) be the martingale defined in (3) with the function \(\lambda \) constant; using the expression (4) for \( M^{k}_t\) and the fact that \(\lambda \) is constant, we have that
is a positive martingale equal to \(1\) at time \(0\). The constant \(c\) above will be chosen a posteriori large enough. In order to obtain (10) it is sufficient to get the limits
and
The second probability concerns a deterministic event, and by boundedness we conclude that for \(\delta \) small enough the probability in (12) vanishes.
On the other hand, to prove (11), we observe that we can neglect the absolute value, since
and using again (9). Because \(\{M^{k}_t/M^{k}_{t_0};\,t\ge t_0\}\) is a mean one positive martingale, we can apply Doob’s Inequality, which yields
Taking logarithms and dividing by \(k\), we get
for all \(c>0\). To treat the second term in (13), we just need to observe that \(\{M^{k}_{t_0}/M^{k}_{t};\,t\ge t_0\}\) is also a martingale, and to rewrite
as
Then, we get the same bound for this probability as in (14), which finishes the proof. \(\square \)
2.4 Lower Bound
Let \(\gamma :[0,T]\rightarrow \mathbb {S}^{1}\) be a function such that \(\gamma (0)=x_0\) and, for \(\delta >0\), consider the following set
Let \(\mathcal O\) be an open set of \(D[0,T]\). For each \(\gamma \in \mathcal O\), our goal is to prove that
For that, we can suppose \(\gamma \in \mathcal {AC}[0,T]\), because if \(\gamma \notin \mathcal {AC}[0,T]\), then \(I_T(\gamma )=\infty \) and (15) is trivial. Since \(\gamma \in \mathcal O\), there is a \(\delta >0\) such that
We need to consider the measure \(\mathbb {P}_k^\lambda \) with \(\lambda :[0,T]\rightarrow \mathbb R\) given by \(\lambda (s)=\lambda _\gamma (s)=\log \Big (\genfrac{}{}{}1{1}{2}\Big (\gamma '(s)+\sqrt{(\gamma '(s))^2+4}\Big )\Big )\), which we obtained in Lemma 3 as the function attaining the supremum \(\sup _\lambda [ \lambda \, \gamma '(s)-H(\lambda )] \) for each \(s\). Thus,
The last equality follows from Remark 2. Define the measure \(\mathbb {P}_{k,\delta }^{\lambda ,\gamma }\) as
for every bounded function \(f:D[0,T]\rightarrow \mathbb R\). Then,
Then, using Jensen’s inequality
Since \(\gamma :[0,T]\rightarrow \mathbb {R}\) is an absolutely continuous function, we can write
Since \(\lambda (s)=\lambda _\gamma (s)=\log \Big (\genfrac{}{}{}1{1}{2}\Big (\gamma '(s)+\sqrt{(\gamma '(s))^2+4}\Big )\Big )\), by Lemma 3, we obtain
and, by (6), the last expression is equal to \(I_{T}(\gamma )\). Thus,
The last inequality follows from the above and from Lemmas 5 and 6 below.
Lemma 5
With respect to the measure defined in (16), there exists a constant \(C>0\) such that
Lemma 6
There is a \(k_0=k_0(\gamma ,\delta )\) such that \(\mathbb {P}_k^{\lambda }[X_k^{\lambda }\in B_\infty (\gamma ,\delta )]>\frac{3}{4}\), for all \(k\ge k_0\).
The proofs of Lemmas 5 and 6 are at the end of this subsection.
Continuing with the analysis of (17), we mention that, since for each \(\gamma \in \mathcal O\) there exists \(\delta =\delta (\gamma )\) such that \(B_\infty (\gamma ,\delta )\subset \mathcal O\), for all \(\varepsilon <\delta \) we have
Thus, for all \(\gamma \in \mathcal O\), we have (15). Therefore,
We now present the proofs of Lemmas 5 and 6.
Proof of Lemma 5
Recalling the definition of the probability measure \(\mathbb {P}_{k,\delta }^{\lambda ,\gamma }\), we can write
\(\square \)
Proof of Lemma 6
Consider the martingale
remember that \(\mathbb {P}_k\) has initial measure \(\delta _{x_k(x_0)}\), where \(x_k(x_0)=\frac{\lfloor kx_0\rfloor }{k}\). Notice that, by the choice of \( \lambda (s)\) as \(\log \Big (\genfrac{}{}{}1{1}{2}\Big (\gamma '(s)+\sqrt{(\gamma '(s))^2+4}\Big )\Big )\) and the hypothesis on \(\gamma \), we have that
Then, \(X_k^{\lambda }(t)-\gamma (t)=\mathcal {M}^k_t+r_k\), where \(r_k=\frac{\lfloor kx_0\rfloor }{k}-x_0\). Using Doob's martingale inequality,
for \(k\) large enough. Using the fact that
and, making some more calculations, we get that the expectation above is bounded from above by
Then there is \(k_0\), such that, \(\mathbb {P}_k^{\lambda }[\sup _{0\le t\le T}|X_k^{\lambda }(t)-\gamma (t)|>\delta ]<1/4\), for all \(k>k_0\).
\(\square \)
This is the end of the first part of the paper, where we investigate the deviation function on the Skorohod space, as \(k\rightarrow \infty \), for the trajectories of the unperturbed system.
3 Disturbing the System by a Potential \(V\)
Now, we introduce a fixed \(C^2\) function \(V: \mathbb {S}^1 \rightarrow \mathbb {R}\). We want to analyse large deviation properties of the system disturbed by the potential \(V\). Several of the properties we consider only require \(V\) to be Lipschitz, but we need some more regularity for Aubry–Mather theory. Given \(V: \mathbb {S}^1 \rightarrow \mathbb {R}\) we denote by \(V_k\) the restriction of \(V\) to \(\Gamma _k\). It is known that if \(kL_k\) is a \(k\) by \(k\) matrix whose rows sum to zero, with strictly negative elements on the diagonal and non-negative elements outside the diagonal, then for any \(t>0\) the matrix \(e^{t\,kL_k}\) is stochastic. The infinitesimal generator \(kL_k\) generates a continuous time Markov Chain with values in \(\Gamma _k=\{0,1/k, 2/k,\ldots ,\frac{k-1}{k}\}\subset \mathbb S^1\). We are going to disturb this stochastic semigroup by a potential \(k\,V_k:\Gamma _k\rightarrow \mathbb {R}\), and we will derive another continuous time Markov Chain (see [4, 19]) with values in \(\Gamma _k\). This will be described below. We will identify the function \(k\,V_k\) with the \(k\) by \(k\) diagonal matrix, also denoted by \(k\,V_k\), with elements \(k\,V_k(j/k)\), \(j=0,1,2,\ldots ,k-1\), on the diagonal.
The continuous time Perron's Theorem (see [23, p. 111]) claims the following: given the matrix \( k\,L_k\) as above and the diagonal matrix \(k\,V_k\), there exist
-
(a)
a unique positive function \(u_{V_k}=u_k : \{0,1/k,2/k,\ldots ,(k-1)/k\}\rightarrow \mathbb {R}\),
-
(b)
a unique probability vector \(\mu _{V_k}=\mu _k\) over the set \( \{0,1/k,2/k,\ldots ,(k-1)/k\}\), such that
$$\begin{aligned} \mathop \sum \limits _{j=1}^k u_k^j \,\mu _k^j = 1 ,\end{aligned}$$where \(u_k=(u_k^1,\ldots ,u_k^k)\), \(\mu _k=(\mu _k^1,\ldots ,\mu _k^k)\)
-
(c)
a real value \(\lambda (V_k)=\lambda _k\),
such that
-
(i)
for any \(v \in \mathbb {R}^k\), if we denote \( P^t_{k,V} =e^{t\,(k\,L_k + k\,V_k)}\), then
$$\begin{aligned}\lim _{t\rightarrow \infty } e^{-t \lambda (k)} P^t_{k,V} (v) = \,\mathop \sum \limits _{j=1}^k v_j \,\mu _k^j\, u_k^j\,, \end{aligned}$$ -
(ii)
for any positive \(s\)
$$\begin{aligned} e^{-s \lambda (k)}P^s_{k,V}(u_k)= u_k. \end{aligned}$$
From (ii) it follows that
The semigroup \(e^{t\, (k\, L_k + k\,V_k - \lambda (k))}\) defines a continuous time Markov chain with values in \( \Gamma _k\), for which the vector \(\pi _{k,V}=(\pi _{k,V}^1,\ldots ,\pi _{k,V}^k)\), with \(\pi _{k,V}^j=\,u_k^j\, \mu _k^j\), \(j=1,2,\ldots ,k\), is stationary. Notice that \(\pi _k=\pi _{k,V}\) when \(V=0\). Remember that \(V_k\) was obtained by discretization of the initial \(V:\mathbb S^1\rightarrow \mathbb {R}\).
Example 7
When \(k=4\) and \(V_4\) is defined by the values \(V_4^j\), \(j=1,2,3,4\), we first have to find the left eigenvector \(u_{V_4}\) associated to the eigenvalue \(\lambda (V_4)\), that is, to solve the equation
Suppose \(\mu _{V_4}\) is the corresponding normalized right eigenvector. In this way, by the theorem above, we get the stationary vector \(\pi _{4,V}\) of the stationary Gibbs probability associated to the potential \(V_4\). We point out that by numerical methods one can get good approximations of the solution of the above problem.
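Such a numerical computation can be sketched as follows; the potential below is made up for illustration (it is not from the paper), and the stochastic semigroup is obtained by the standard Doob \(h\)-transform normalization, which is our reading of the construction above.

```python
# Hypothetical numerical version of Example 7, with a sample potential.
import numpy as np
from scipy.linalg import expm

k = 4
P = np.zeros((k, k))
for i in range(k):
    P[i, (i + 1) % k] = 0.5
    P[i, (i - 1) % k] = 0.5
L = 2.0 * (P - np.eye(k))                 # L_k = 2(P_k - I_k)
V = np.array([0.3, 1.0, 0.2, -0.5])       # made-up values V_4^j
A = k * L + k * np.diag(V)

# Perron eigenvalue lambda_k with positive eigenvector u_k; since A is
# symmetric here, its left and right eigenvectors coincide.
vals, vecs = np.linalg.eigh(A)
lam_k, u = vals[-1], np.abs(vecs[:, -1])
mu = u / u.sum()                          # probability vector mu_k
u = u / (u @ mu)                          # normalization sum_j u_k^j mu_k^j = 1

# Normalized (Gibbs) semigroup S_t = D_u^{-1} e^{t(A - lam_k)} D_u:
# it is stochastic and pi_{k,V}^j = u_k^j mu_k^j is its stationary vector.
pi = u * mu
Du = np.diag(u)
S = np.linalg.inv(Du) @ expm(A - lam_k * np.eye(k)) @ Du
assert np.allclose(S.sum(axis=1), 1.0) and np.all(S > 0)
assert np.allclose(pi @ S, pi) and abs(pi.sum() - 1.0) < 1e-10
```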
From the end of Sect. 5 in [23], we have that
where \(\psi :\Gamma _k \rightarrow \mathbb {R}\),
and \(\pi _{k}\) is uniform on \(\Gamma _k\). Notice that for any \(\psi \), we have
Moreover,
In this way
Observe that for any \(\psi \in \mathbb { L}^2\), with \(||\psi ||_2=1\), the expression inside the braces is bounded from above by
Notice that, for each fixed \(k\), the vector \(\psi ^k=\psi \) that attains the maximal value \(\lambda _k\) is such that \(\psi _k^i= \sqrt{u_{k,V}^i}\), with \(i\in \{0,\ldots ,k-1\}\).
When \(k\) is large, the above \(\psi _k\) tends to become more and more sharply peaked near the maximum of \(V_k\). Then, we have that
converges to
when \(k\) increases to \(\infty \).
Summarizing, we get the proposition below:
Proposition 8
where the last infimum is taken over all probability measures \(\mu \) that are invariant for the Euler–Lagrange flow of \( L( x,v)\).
The last equality follows from Aubry–Mather theory (see [8, 10]). Notice that this Lagrangian is convex and superlinear.
3.1 Lax–Oleinik Semigroup
By the Feynman–Kac formula (see [17]), the semigroup associated to the infinitesimal generator \(k\,\mathcal L_k+kV_k\) has the following expression
for every bounded measurable function \(f:\mathbb S^1\rightarrow \mathbb R\) and all \(t\ge 0\).
Now, consider
for a fixed Lipschitz function \(u:\mathbb S^1\rightarrow \mathbb R\). Now, we want to use the results of Sect. 2 together with Varadhan's Lemma, which we state below.
Lemma 9
(Varadhan’s Lemma (see [9])) Let \(\mathcal E\) be a regular topological space; let \((Z_\varepsilon )_{\varepsilon >0}\) be a family of random variables taking values in \(\mathcal E\); let \(\mu _\varepsilon \) be the law (probability measure) of \(Z_\varepsilon \). Suppose that \(\{\mu _\varepsilon \}_{\varepsilon >0}\) satisfies the large deviation principle with good rate function \(I : \mathcal E\rightarrow [0, +\infty ]\). Let \(\phi : \mathcal E \rightarrow \mathbb R\) be any continuous function. Suppose that at least one of the following two conditions holds: either the tail condition
where \(\mathbf 1(A)\) denotes the indicator function of the event \(A\); or, for some \(\gamma > 1\), the moment condition
Then,
We will consider here the above \(\varepsilon \) as \(\frac{1}{k}.\) By Theorem 1 and Varadhan’s Lemma, for each Lipschitz function \(u:\mathbb S^1\rightarrow \mathbb R\), we have
When \(\gamma \notin AC[0,T]\), \(I_T(\gamma )=\infty \) and if \(\gamma \in AC[0,T]\), \(I_T(\gamma )=\int _0^TL(\gamma '(s))\,ds\). Thus,
For a fixed \(T>0\), define the operator \(\mathcal {T}_T \) acting on Lipschitz functions \(u:\mathbb S^1\rightarrow \mathbb R\) by the expression \(\mathcal {T}_T(u)(x)=\lim _{k\rightarrow \infty }\genfrac{}{}{}1{1}{k}\log \,P^{T}_{k,V}(e^{ku})(x)\); then, we have just shown that
This family of operators, parametrized by \(T>0\) and acting on functions \(u:\mathbb S^1 \rightarrow \mathbb {R}\), is called the Lax–Oleinik semigroup.
3.2 The Aubry–Mather Theory
We will now use Aubry–Mather theory (see [8, 10]) to obtain a fixed point \(u\) of this operator. This will be necessary later, in the next section. We will elaborate on that. Consider Mather measures, see [8, 10], on the circle \(\mathbb {S}^1\) for the Lagrangian
\(x\in \mathbb S^1, v \in T_x \mathbb S^1\), where \(V: \mathbb S^1\rightarrow \mathbb {R}\) is a \(C^2\) function. Such a measure will be a Dirac delta at any of the points of \(\mathbb S^1\) where \(V\) attains its maximum (or a convex combination of them). In order to avoid technical problems we will assume that the point \(x_0\) where the maximum is attained is unique. This is generic among \(C^2\) potentials \(V\).
This Lagrangian appeared in a natural way when we analysed the asymptotic deviations, as \(k\rightarrow \infty \), of the discrete state space continuous time Markov Chains \(\{X_k(t),t\ge 0\}\), indexed by \(k\), described above in Sect. 2. We denote by \(H(x,p)\) the associated Hamiltonian obtained via Legendre transform.
Suppose \(u_+\) is a fixed point for the positive Lax–Oleinik semigroup and \(u_{-}\) is a fixed point for the negative Lax–Oleinik semigroup (see the next section for precise definitions). We will show that the function \(I^V= u_+ + u_{-}\), defined on \( \mathbb {S}^1\), is the deviation function for \(\pi _{k,V} \), as \(k\rightarrow \infty \).
Fixed points \(u\) of the Lax–Oleinik operator are weak KAM solutions of the Hamilton–Jacobi equation for the corresponding Hamiltonian \(H\) (see [11, Sects. 4, 7]).
The so-called critical value in Aubry–Mather theory is
where the infimum above is taken over all probability measures \(\mu \) invariant for the Euler–Lagrange flow of \(L^V\). Notice that
This will play an important role in what follows. A Mather measure is any \(\mu \) which attains the above infimum. This minimizing probability is defined on the tangent bundle of \(\mathbb {S}^1\), but since it is supported on a graph (see [8]) it can be seen as a probability on \(\mathbb {S}^1\). This will be our point of view.
In the case where the potential \(V\) has a unique point \(x_0\) of maximum on \( \mathbb {S}^1\), we have that \(c(L)=V(x_0)\). The Mather measure in this case is a Dirac delta at the point \(x_0\).
Suppose there exist two points \(x_1\) and \(x_2\) in \( \mathbb {S}^1\) where the supremum of the potential \(V\) is attained. For the Lagrangian \(L\) defined above, the static points are \((x_1,0)\) and \((x_2,0)\) (see [8, 11] for definitions and general references on Mather theory). This case requires a more complex analysis, because some hypotheses are needed in order to know at which of the points \(x_1\) or \(x_2\) the larger part of the mass of \(\pi _{k,V}\) will concentrate. We will not analyse such a problem here. In this case the critical value is \(c(L)=-\, L^V(x_1,0)= V(x_1)= -\, L^V(x_2,0)=V(x_2)\).
In the appendix of [1] and also in [2], N. Anantharaman shows, for fixed \(t\), an interesting result relating the time re-scaling of the Brownian motion, \(B(\varepsilon t)\) as \(\varepsilon \rightarrow 0\), and Large Deviations. The large deviation principle is obtained via Aubry–Mather theory. The convex part of the mechanical Lagrangian in this case is \(\frac{1}{2}\, |v|^2\). When there are two points \(x_1\) and \(x_2\) of maximum for \(V\), the same problem as we mentioned before appears in this other setting: as \(\varepsilon \rightarrow 0\), which Mather measure is selected? In this setting partial answers to this problem are obtained in [3].
In the present paper we want to obtain similar results for \(t\) fixed, but for the re-scaled semigroup \(P_{k}(ks)=e^{skL_k}\), \(s \ge 0 \), obtained by speeding up by \(k\) the time of the continuous time symmetric random walk (with the compactness assumption) as described above.
In other words we are considering that the unitary circle (the interval \([0,1)\)) is being approximated by a discretization by \(k\) equally spaced points, namely, \(\Gamma _k= \{0,1/k,2/k,\ldots ,(k-1)/k\}\).
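The sped-up generator on the discretization \(\Gamma_k\) can be assembled as a finite matrix. The sketch below is our own illustration (not code from the paper), under the assumption that \(L_k\) jumps to each of the two neighbours (mod \(k\)) at rate \(1\), so that \(k\,L_k\) jumps at rate \(k\); the function name is ours:

```python
import numpy as np

def generator_kLk(k):
    """Matrix of k*L_k: sped-up symmetric walk on Gamma_k = {0, 1/k, ..., (k-1)/k}.

    Assumes L_k jumps to each of the two neighbours (mod k) at rate 1,
    so k*L_k jumps at rate k; rows of a generator sum to zero.
    """
    G = np.zeros((k, k))
    for i in range(k):
        G[i, (i + 1) % k] = k    # jump to the right neighbour
        G[i, (i - 1) % k] = k    # jump to the left neighbour
        G[i, i] = -2.0 * k
    return G

k = 50
G = generator_kLk(k)
pi = np.full(k, 1.0 / k)          # uniform measure on Gamma_k
print(np.allclose(pi @ G, 0.0))   # True: the uniform measure is stationary
```

By symmetry each column of \(G\) also sums to zero, which is exactly why the uniform vector is stationary.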
Let \( \mathbb { X}_{t,x}\) be the set of absolutely continuous paths \(\gamma :[0,t]\rightarrow [0,1]\), such that \(\gamma (0)=x\).
Consider the positive Lax–Oleinik operator acting on continuous functions \(u\) on the circle: for all \(t>0\),
\[ \mathcal T_t^+(u)(x)\,=\,\sup_{\gamma\in\mathbb X_{t,x}}\,\Big\{\, u(\gamma(t))\,-\,\int_0^t L^V(\gamma(s),\gamma'(s))\,ds\,\Big\}. \]
It is well known (see [8, 10]) that there exist a Lipschitz function \(u_+\) and a constant \(c=c(L)\) such that for all \(t>0\)
\[ \mathcal T_t^+(u_+)\,=\,u_+\,+\,c\,t. \]
We say that \(u_+\) is a \((+)\)-solution of the Lax–Oleinik equation. This function \(u_+\) is not always unique. If we add a constant to \(u_+\) we get another fixed point. To say that the fixed point \(u_+\) is unique means that it is unique up to an additive constant. If there exists just one Mather probability, then \(u_+\) is unique (in this sense). In the case when there exist two points \(x_1\) and \(x_2\) in \( \mathbb {S}^1\) where the supremum of the potential \(V\) is attained, the fixed point \(u_+\) may not be unique.
Now we define the negative Lax–Oleinik operator: for all \(t>0\) and all continuous functions \(u\) on the circle, we have
\[ \mathcal T_t^-(u)(x)\,=\,\inf_{\gamma}\,\Big\{\, u(\gamma(0))\,-\,\int_0^t L^V(\gamma(s),\gamma'(s))\,ds\,\Big\}, \]
where the infimum is taken over the absolutely continuous paths \(\gamma:[0,t]\rightarrow[0,1]\) such that \(\gamma(t)=x\). Note the difference from \(+\) to \(-\) in this new definition; the space of curves we consider now is also different. It is also known that there exists a Lipschitz function \(u_-\) such that, for the same constant \(c\) as above, we have for all \(t>0\)
\[ \mathcal T_t^-(u_-)\,=\,u_-\,+\,c\,t. \]
We say that \(u_-\) is a \((-)\)-solution of the Lax–Oleinik equation.
The \(u_{+}\) solution will help to estimate the asymptotics of the left eigenvector, and the \(u_{-}\) solution will help to estimate the asymptotics of the right eigenvector of \(k\,L_k+ k V_k\).
We point out that for \(t\) fixed the above operator is a weak contraction. Via the discounted method it is possible to approximate the scheme used to obtain \(u\) by a procedure which takes advantage of another transformation, which is a contraction in a complete metric space (see [12]). This is more practical for numerical applications of the theory. Another approximation scheme is given by the entropy penalized method (see [13, 14]).
For \(k\in \mathbb {N}\) fixed the operator \( k \, L_k\) is symmetric when acting on \(\mathcal { L}^2 \) functions defined on the set \(\Gamma _k\subset \mathbb S^1\). The stationary probability of the associated Markov Chain is the uniform measure \(\pi _k\) (each point has mass \(1/k\)). When \(k\) goes to infinity, \(\pi _k\) converges to the Lebesgue measure on \( \mathbb {S}^1\). When the system is disturbed by \(k\, V_k\) we get new stationary probabilities \(\pi _{k,V}\) with support on \(\Gamma _k\), and we want to use results of Aubry–Mather theory to estimate the large deviation properties of this family of probabilities on \(\mathbb S^1\), when \(k\rightarrow \infty \).
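To make the Perron normalization concrete, the following sketch (our own illustration; the function name, the rate convention for \(L_k\), and the sample potential are assumptions) builds \(k\,L_k + k\,V_k\) for a potential with a unique maximum and extracts the Perron eigenvalue together with the right and left eigenvectors, whose product gives \(\pi_{k,V}\):

```python
import numpy as np

def gibbs_stationary(k, V):
    """Perron data of M = k L_k + k V_k and the stationary vector pi_{k,V}.

    Assumes L_k jumps to each neighbour (mod k) at rate 1.  Returns the grid,
    the probability with entries proportional to u_k[j] * mu_k[j]
    (right times left Perron eigenvector), and the Perron eigenvalue lambda_k.
    """
    x = np.arange(k) / k
    M = np.zeros((k, k))
    for i in range(k):
        M[i, (i + 1) % k] += k
        M[i, (i - 1) % k] += k
        M[i, i] += -2.0 * k + k * V(x[i])   # k L_k + k V_k
    w, Q = np.linalg.eig(M)
    i0 = np.argmax(w.real)
    lam = w[i0].real                        # Perron eigenvalue lambda_k
    u = np.abs(Q[:, i0].real)               # right Perron eigenvector u_k
    wl, Ql = np.linalg.eig(M.T)
    mu = np.abs(Ql[:, np.argmax(wl.real)].real)   # left eigenvector mu_k
    pi = u * mu                             # pi_{k,V} proportional to u_k * mu_k
    return x, pi / pi.sum(), lam

V = lambda s: np.cos(2 * np.pi * (s - 0.25))    # unique maximum at x0 = 0.25
x, pi, lam = gibbs_stationary(200, V)
print(x[np.argmax(pi)])    # the mass concentrates near x0 = 0.25
```

Here \(M\) is symmetric, so left and right eigenvectors coincide, but the recipe is written for the general case; one also observes \(\lambda_k/k\) close to \(\max V = V(x_0)\), in line with \(\lim_k \lambda(k)/k = c(L)\).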
As we saw before, any weak limit of a subsequence of the probabilities \(\pi _{k,V}\) on \( \mathbb {S}^1=[0,1)\) is supported on the points which attain the maximal value of \(V:[0,1)\rightarrow \mathbb {R}\). Notice that the supremum of \(\int V(x)\,\varphi(x)\,dx\), over probability densities \(\varphi\), is not attained on \(\mathbb { L}^2(d\, x)\). Considering the more general problem on \( \mathbb { M} ( \mathbb {S}^1)\), the set of probabilities on \( \mathbb {S}^1\), we have
\[ \sup_{\nu\in\mathbb M(\mathbb S^1)}\,\int V\, d\nu\,=\,\max_{x\in\mathbb S^1} V(x), \]
and the supremum is attained, for example, at a Dirac delta on a point \(x_0\) where the supremum of \(V\) is attained. Any measure \(\nu \) which realizes the supremum on \(\mathbb { M} ( \mathbb {S}^1)\) has support in the set of points which attain the maximal value of \(V\). In this way the Lagrangian \(L\) described before appears naturally.
3.3 Large Deviations for the Stationary Measures \(\pi _{k,V}\).
We start this subsection with some definitions. For each \(k\) and \(x\in \mathbb S^1\) we denote by \(x_k(x)\) the closest element to \(x\) on the left of \(x\) in the set \( \Gamma _k\); in fact, \(x_k(x)=\frac{\lfloor kx\rfloor }{k}\). Given \(k\) and a function \(\varphi _k\) defined on \( \Gamma _k\), we consider the extension \(g_k\) of \(\varphi _k\) to \(\mathbb S^1\). This is a piecewise constant function which on the interval \([j/k,(j+1)/k)\) is equal to \(\varphi _k(j/k)\). Finally, we call \(h_k\) the continuous function obtained from \(g_k\) in the following way: \(h_k\) is equal to \(g_k\) outside the intervals of the form \([\frac{j}{k} - \frac{1}{k^2} , \frac{j}{k}]\), \(j=1,2,\ldots ,k\), and interpolates \(g_k\) linearly on these small intervals.
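The two-step extension (plateaus \(g_k\), then the continuous gluing \(h_k\)) can be sketched as follows; this is our own illustration, assuming the gluing interval of length \(1/k^2\) sits immediately to the left of each jump point of \(g_k\):

```python
import numpy as np

def step_extension(phi_k):
    """Piecewise-constant extension g_k: equals phi_k(j/k) on [j/k, (j+1)/k)."""
    k = len(phi_k)
    def g(x):
        return phi_k[int(np.floor(x * k)) % k]
    return g

def continuous_extension(phi_k):
    """Continuous h_k: equal to g_k away from a gluing interval of length 1/k^2
    placed just before each jump of g_k, where it interpolates linearly."""
    k = len(phi_k)
    def h(x):
        j = int(np.floor(x * k)) % k
        a = (j + 1) / k - 1.0 / k**2    # left end of the gluing interval
        if x < a:
            return phi_k[j]             # still on the plateau of g_k
        t = (x - a) * k**2              # runs from 0 to 1 across the gluing
        return (1 - t) * phi_k[j] + t * phi_k[(j + 1) % k]
    return h

phi = np.array([0.0, 1.0, 0.5, 2.0])    # a function on Gamma_4
g, h = step_extension(phi), continuous_extension(phi)
print(g(0.3), h(0.1), h(0.25))           # 1.0 0.0 1.0
```

The wrap-around index `(j + 1) % k` makes \(h_k\) continuous as a function on the circle.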
When we apply the above to \(\varphi _k=u_k\) the resulting \(h_k\) is denoted by \(z_k=z_k^V\), and when we do the same for \(\varphi _k=\mu _k\), the resulting \(h_k\) is called \(p_{\mu _k}^V\). In order to control the asymptotics in \(k\) of \(\pi _{k,V}= u_k\, \mu _k\) we have to control the asymptotics of \(z_k^V\). We claim that \( (1/k) \, \log z_k \) is an equicontinuous family of functions, where \(z_k\) is the continuous extension to \([0,1]\) described above. We now consider limits of convergent subsequences of \(z_k=z_k^V\).
Lemma 10
Suppose that \(u\) is a limit point of a convergent subsequence \((1/k_j) \, \log z_{k_j} \), \(j \rightarrow \infty \), of \((1/k) \, \log z_{k} \). Then, \(u\) is a \((+)\)-solution of the Lax–Oleinik equation.
Proof
We assume that \(z_{k_j} \sim e^{ u \,k_j}\). In more precise terms, for any \(x\), we have \(z_{k_j}(x_{k_j}(x)) \sim e^{ u (x)\,k_j}\). Therefore, for \(t\) positive and \(x\) fixed, from (21), we have
By the definitions at the beginning of this subsection, the expression above becomes
Using again that \(z_{k_j}(x_{k_j}(x)) \sim e^{ u (x)\,k_j}\), we have
Therefore, \(u\) is a \((+)\)-solution of the Lax–Oleinik equation above. \(\square \)
We point out that, from the classical Aubry–Mather theory, the fixed point \(u\) of the Lax–Oleinik operator is unique up to an additive constant in the case the point of maximum of \(V\) is unique. It follows that in this case any convergent subsequence \((1/k_j) \, \log z_{k_j}^V \), \(j \rightarrow \infty \), will converge to a unique \(u_{+}\). We point out that the normalization we assume for \(\mu _k\) and \(u_k\) (which determine \(z_k\)) will produce a \(u_{+}\) without the ambiguity of an additive constant.
In the general case (more than one point of maximum for the potential \(V\)) the problem of convergence of \((1/k) \, \log z_{k}^V \), \(k \rightarrow \infty \), is complex and is related to what is called selection of subaction. This kind of problem in other settings is analysed in [3, 5].
One can show in a similar way that:
Lemma 11
Suppose that \(u^*\) is a limit point of a convergent subsequence \((1/k_j) \, \log p_{k_j}^V \), \(j \rightarrow \infty \), of \((1/k) \, \log p_{k}^V \). Then, \(u^*\) is a \((-)\)-solution of the Lax–Oleinik equation.
In the case the point of maximum for \(V\) is unique one can show that any convergent subsequence \((1/k_j) \, \log p_{k_j}^V \), \(j \rightarrow \infty \), will converge to a unique \(u^*\).
Now, we will show that \((1/k) \, \log z_{k}^V\), \(k \in \mathbb {N}\), is an equicontinuous family.
Consider now any points \(x_0,x_1\in [0,1)\) and a fixed positive \(t\in \mathbb {R}\), and define \(\mathbb { X}_{t,x_0,x_1}= \{\gamma \in \mathcal {AC} [0,t]\, | \, \gamma (0) = x_0,\, \gamma (t)=x_1\}\).
For any \(x_0,x_1\in [0,1)\) and a fixed positive \(t\in \mathbb {R}\) consider the continuous functional \(\phi _{t,x_0,x_1,V} : D[0,t]\rightarrow {\mathbb R}\), given by
\[ \phi_{t,x_0,x_1,V}(\gamma)\,=\,-\int_0^t L^V(\gamma(s),\gamma'(s))\,ds, \]
if \(\gamma \in \mathbb { X}_{t,x_0,x_1} \), and \(\phi _{t,x_0,x_1,V} (\gamma )=-\infty \) otherwise. Recall that \(x_k(a)=\frac{\lfloor ak\rfloor }{k}\), for \(a\in [0,1]\). For a fixed \(k\), when we write \(\phi _{t,x_k(x_0),x_k(x_1),V} (\gamma )\) we mean the corresponding functional on \(\mathbb X_{t,x_k(x_0),x_k(x_1)}\), equal to \(-\infty\) outside this set. Denote by
\[ \Phi_t(x_0,x_1)\,=\,\sup_{\gamma\in\mathbb X_{t,x_0,x_1}}\,\phi_{t,x_0,x_1,V}(\gamma). \]
From [8, Sects. 3, 4] it is known that \(\Phi _t (x_0,x_1)\) is Lipschitz in \( \mathbb {S}^1\times \mathbb {S}^1\).
Given \(x\) and \(k\), we denote by \(i(x,k)\) the natural number such that \(x_k (x) = \frac{i(x,k)}{k}.\) An important piece of information in our reasoning is
The last equality follows from Varadhan's Integral Lemma and the definition of the functional \(\phi _{t,x_0,x_1, V}\), see (22).
Using the definition of \(\phi _{t,x_0,x_1,V}\) and of \(I_t\), see (2), we get
The convergence is uniform in \(k\), for any \(x_0,x_1\). The definition of \(L^V\) is in (20).
Lemma 12
The family \(\frac{1}{k} \log z_k^V\) is equicontinuous in \(k\in \mathbb {N}\). Therefore, there exists a subsequence of \(\frac{1}{k} \log z_k^V\) converging to a certain Lipschitz function \(u\). In the case the maximum of \(V\) is attained at a unique point, \(u\) is unique up to an additive constant.
Proof
Given \(x\) and \(y\), and a positive fixed \(t\) we have
For each \(k\) the above supremum is attained at a certain \(j_k\). Consider a subsequence of \(\frac{j_k}{k}\) converging to a certain \(z\), as \(k\rightarrow \infty \). That is, there exists \(z\) such that \(i(z,k)=j_k\) for all \(k\).
Therefore, for each \(k\) and \(t\) fixed
Taking \(k\) large, we have, for \(t\) fixed that
The Peierls barrier is defined as
\[ h(x,y)\,=\,\varliminf_{t\rightarrow\infty}\,\Phi_t(x,y). \]
Taking a subsequence \(t_r\rightarrow \infty \) such that \(h(y,z)=\varliminf _{r\rightarrow \infty } \Phi _{t_r} (y,z)\), one can easily show that for large \(k\)
The Peierls barrier satisfies \( h(y,z)-h(x,z)\le \Phi (y,x)\le A \, |x-y|\), where \(A\) is a constant and \(\Phi \) is the Mañé potential (see [8, 3–7.1, Item 1]). Therefore, the family is equicontinuous. For each fixed \(k\) there is always a value of \(z_k\) above \(1\) and one below \(1\).
The conclusion is that there exists a subsequence of \(\frac{1}{k} \log z_k \) converging to a certain \(u\). The uniqueness of the limit follows from the uniqueness of \(u\).
\(\square \)
A similar result is true for the family \(\frac{1}{k} \, \log p_{\mu _k}^V\); remember that \(p_{\mu _k}^V\) is obtained from \(\mu _k\). Taking a convergent subsequence, we denote by \(u^*\) the limit. This subsequence can be taken as a subsequence of the one along which we already obtained convergence of \(\frac{1}{k}\, \log z_k^V\). In this way we get functions \(u: \mathbb {S}^1 \rightarrow \mathbb {R}\) and \(u^*: \mathbb {S}^1 \rightarrow \mathbb {R}\), which are the limits of the corresponding subsequences.
Now we want to analyse large deviations of the measure \(\pi _{k,V}\).
Theorem 13
A large deviation principle holds for the sequence of measures \(\{\pi _{k,V}\}_k\), and the deviation rate function \(I^V\) is \(I^V(x)= u (x) + u^{*} (x)\). In other words, given an interval \(F=[c,d]\),
\[ \lim_{k\rightarrow\infty}\,\frac{1}{k}\,\log\,\pi_{k,V}(F)\,=\,\sup_{x\in F}\, I^V(x). \]
Proof
Suppose the point of maximum of \(V\) is unique. Then, we get \(z_{k}(x_k(x)) \sim e^{ u_{+} (x)\,k}\) and \(p_{\mu _k}^V(x_k(x)) \sim e^{ u_{-} (x)\,k}\). What is the explicit expression for \(I^V\)? Remember that \(u_{+}\) satisfies \(\mathcal {T}^+_t (u_+) = u_+ + c \, t \) and \(u_{-}\) satisfies \(\mathcal {T}^-_t (u_{-}) = u_{-} + c \, t \). Here, \(u\) is one of the \(u_{+}\) and \(u^*\) is one of the \(u_{-}\). As we said before, they were determined by the normalization. The functions \(u_{+}\) and \(u_{-}\) are weak KAM solutions.
We denote \(I^V(x)= u (x) + u^{*} (x)\). The function \(I^V\) is continuous (not necessarily differentiable on all of \(\mathbb {S}^1\)) and well defined. Notice that \(\pi _{k,V}(j/k) = (z_k^V)_j \, (p_{\mu _k}^V)_j \). We have to estimate
\[ \pi_{k,V}(F)\,=\,\sum_{j\,:\,j/k\in F}\,(z_k^V)_j\,(p_{\mu_k}^V)_j. \]
Then, from the Laplace method it follows that \(I^V(x)\) is the deviation function. \(\square \)
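As a numerical illustration of the theorem (ours, not from the paper), one can compute the profile \(\frac{1}{k}\log \pi_{k,V}\) from the Perron eigenvector of \(k\,L_k + k\,V_k\) (same assumed rate convention as in the earlier sketch): it is close to \(0\) at the maximizer of \(V\) and clearly negative away from it, consistent with exponential decay at rate \(I^V = u + u^*\):

```python
import numpy as np

def log_profile(k, V):
    """(1/k) log pi_{k,V} on the grid, from the Perron eigenvector of
    k L_k + k V_k (neighbour jump rate k, as in the earlier sketches)."""
    x = np.arange(k) / k
    M = np.zeros((k, k))
    for i in range(k):
        M[i, (i + 1) % k] += k
        M[i, (i - 1) % k] += k
        M[i, i] += -2.0 * k + k * V(x[i])
    w, Q = np.linalg.eigh(M)      # M is symmetric here, so u_k = mu_k
    u = np.abs(Q[:, -1])          # Perron (top) eigenvector
    pi = u * u                    # pi_{k,V} proportional to u_k * mu_k
    pi /= pi.sum()
    return x, np.log(pi) / k

k = 40
V = lambda s: np.cos(2 * np.pi * s)   # unique maximum at x0 = 0
x, prof = log_profile(k, V)
# the profile is near 0 at x0 = 0 and clearly negative at the antipode x = 1/2
print(prof[0], prof[k // 2])
```

The modest value of \(k\) is deliberate: for large \(k\) the exponentially small eigenvector entries fall below floating-point eigen-solver accuracy.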
4 Entropy of \(V\)
4.1 Review of the Basic Properties of the Entropy for Continuous Time Gibbs States
In [19] the Thermodynamic Formalism for continuous time Markov Chains taking values in the Bernoulli space is considered. The authors consider a certain a priori potential \(A:\{1,2,\ldots,k\}^{\mathbb N}\rightarrow\mathbb R\), depending on the first two coordinates, and an associated discrete Ruelle operator \({\mathcal L}_A\).
Via the infinitesimal generator \(L = {\mathcal L}_A-I\), an a priori probability over the Skorohod space is defined.
In [19] a potential \(V:\{1,2,\ldots ,k\}^\mathbb {N}\rightarrow \mathbb {R}\) and the continuous time Gibbs state associated to \(V\) are considered. This generalizes what is known for the discrete time setting of Thermodynamic Formalism (see [22]). In this formalism the properties of the Ruelle operator \({\mathcal L}_A\) are used to assure the existence of eigenfunctions, eigenprobabilities, etc. The eigenfunction is used to normalize the continuous time semigroup in order to get a stochastic semigroup (and a new continuous time Markov chain, which is called the Gibbs state for \(V\)). The main technical difficulties arise from the fact that the state space of this continuous time Markov Chain is not finite (not even countable). [16] is a nice reference for the general setting of Large Deviations in continuous time.
On the other hand, in [4] the authors considered continuous time Gibbs states in a much simpler situation, where the state space is finite. They consider an infinitesimal generator which is a \(k\) by \(k\) matrix \(L\) and a potential \(V\) of the form \(V:\{1,2,\ldots ,k\}\rightarrow \mathbb {R}\). This is closer to the setting we consider here with \(k\) fixed.
In the present setting, and according to the notation of the last section, the semigroup \(e^{t\, (k\, L_k + k\,V_k - \lambda (k))}\), \(t>0\), defines what we call the continuous time Markov chain associated to \(k\, V_k\). The vector \(\pi _{k,V}=(\pi _{k,V}^1,\ldots ,\pi _{k,V}^k)\), such that \(\pi _{k,V}^j=\,u_k^j\, \mu _k^j\), \(j=1,2,\ldots ,k\), is stationary for such Markov Chain.
Notice that the semigroup \(e^{t\, (k\, L_k + k\,V_k)}\), \(t>0\), is not stochastic, and the procedure of getting a stochastic semigroup from it requires a normalization via the eigenfunction and the eigenvalue.
If one considers a potential \(A:\{1,2,\ldots ,k\}^\mathbb {N}\rightarrow \mathbb {R}\) which depends on the first two coordinates and a potential \(V:\{1,2,\ldots ,k\}^\mathbb {N}\rightarrow \mathbb {R}\) which depends on the first coordinate, one can see that "basically" the results of [19] are an extension of the ones in [4].
In Sect. 4 of [19] a potential \(V:\{1,2,\ldots ,k\}^\mathbb {N}\rightarrow \mathbb {R}\) is considered, and the concept of entropy \(H_T\) is introduced for the associated Gibbs continuous time Markov Chain, for each \(T>0\). Finally, one can take the limit in \(T\) in order to obtain an entropy \(H\) for the continuous time Gibbs state associated to such \(V\). We would like here to compute for each \(k\) the expression of the entropy \(H(k)\) of the Gibbs state for \(k V_k\). Later we want to estimate the limit of \(H(k)\), when \(k\rightarrow \infty \).
Notice that for fixed \(k\) our setting here is a particular (much simpler) case of the one where the continuous time Markov Chain has state space \(\{1,2,\ldots ,k\}^\mathbb {N}\). However, the matrix \(L_k\) we consider here assumes some zero values, and this was not explicitly considered in [19]. This is not a real problem, because the discrete time Ruelle operator was used in [19] mainly for showing the existence of eigenfunctions and eigenvalues. Here the existence of eigenfunctions and eigenvalues follows from elementary arguments, due to the fact that the operators are defined on finite dimensional vector spaces.
A different approach to entropy in the continuous time Gibbs setting (not using the Ruelle operator) is presented in [18]. We point out that [4] does not consider the concept of entropy. We will show below that, for the purpose of computing the entropy in the present setting, the reasoning of [19] can be described in more general terms without mentioning the Ruelle operator \({\mathcal L}_A\).
Now we briefly describe for the reader the computation of the entropy in [19]. Given a certain a priori Lipschitz potential \(A_k:\{1,2,\ldots,k\}^{\mathbb N}\rightarrow\mathbb R\), consider the associated discrete Ruelle operator \({\mathcal L}_{A_k}\).
Via the infinitesimal generator \(\tilde{L}_k = {\mathcal L}_{A_k}-I\), for each \(k\), we define an a priori Markov Chain. Consider now a potential \(\tilde{V}_k:\{1,2,\ldots ,k\}^\mathbb {N}\rightarrow \mathbb {R}\) and the associated Gibbs continuous time Markov Chain. We denote by \(\mu ^{k}\) the stationary vector for such chain. We denote by \(P_{\mu ^{k}}\) the probability over the Skorohod space \(D\) obtained from the initial probability \(\mu ^{k}\) and the a priori Markov Chain (which defines a Markov process which is not stationary). We also consider \(\tilde{P}^{\tilde{V}_k}_{\mu ^k}\), the probability on \(D\) induced by the continuous time Gibbs state associated to \(\tilde V_k\) and the initial measure \(\mu ^k\).
According to [19, Sect. 4], for a fixed \(T\ge 0\), the relative entropy is
\[ H_T\big(\tilde P^{\tilde V_k}_{\mu^k}\,\big|\,P_{\mu^k}\big)\,=\,\int_D\,\log\,\frac{d\tilde P^{\tilde V_k}_{\mu^k}}{d\,P_{\mu^k}}\Big|_{\mathcal F_T}\;d\tilde P^{\tilde V_k}_{\mu^k}. \]
In the above, \(\mu ^k\) is a fixed probability on the state space and \(\mathcal {F}_T\) is the usual sigma-algebra up to time \(T\). Moreover, \(D\) is the Skorohod space.
The entropy of the stationary Gibbs state \(\tilde{ P}^{\tilde{V}_k}_{\mu ^k}\) is
\[ H\big(\tilde P^{\tilde V_k}_{\mu^k}\big)\,=\,\lim_{T\rightarrow\infty}\,\frac{1}{T}\,H_T\big(\tilde P^{\tilde V_k}_{\mu^k}\,\big|\,P_{\mu^k}\big). \]
The main issue here is to apply the above to \(k\, V_k\) and not \(\tilde{V}_k\). In order to compute the entropy in our setting we have to show that the expression above can be generalized and described without mentioning the a priori potential \(A\). This will be explained in the next section.
4.2 Gibbs State in a General Setting
The goal of this subsection is to improve the results of Sects. 3 and 4 of [19]. In order to do this we will consider a continuous time Markov Chain \(\{X_t,\, t\ge 0\}\) with state space \(E\) and with infinitesimal generator given by
\[ L(f)(x)\,=\,\sum_{y\in E}\, p(x,y)\,[\,f(y)-f(x)\,], \]
where \(p(x,y)\) is the jump rate from \(x\) to \(y\). Notice that possibly \(\sum _{y\in E}p(x,y)\ne 1\). For example, if the state space \(E\) is \(\{1,\ldots ,k\}^{\mathbb {N}}\) and \(L=\mathcal L_A-I\), as in [19], we have that \(p(x,y)=\mathbf {1}_{\sigma (y)=x}e^{A(y)}\); or, if \(L=L^V\), also in [19], \(p(x,y)\) is equal to \(\gamma _V(x)\mathbf {1}_{\sigma (y)=x}e^{B_{V}(y)}\).
As we will see, by considering this general \(p\) one can get more general results.
Proposition 14
Suppose \(L\) is an infinitesimal generator as above and \(V:E\rightarrow \mathbb {R}\) is a function such that there exist an associated eigenfunction \(F_V:E\rightarrow (0,\infty )\) and eigenvalue \(\lambda _V\) for \(L+V\). That is, we have that \((L+V)F_V=\lambda _V\, F_V\). Then, by a normalization procedure, we can get a new continuous time Markov Chain, called the continuous time Gibbs state for \(V\), which is the process \(\{Y^V_T,\,T\ge 0\}\) having the infinitesimal generator, acting on bounded measurable functions \(f:E\rightarrow \mathbb R\), given by
\[ L^V(f)(x)\,=\,\sum_{y\in E}\, p(x,y)\,\frac{F_V(y)}{F_V(x)}\,[\,f(y)-f(x)\,]. \tag{24} \]
Proof
To obtain this infinitesimal generator we can follow, without any change, the proof of Proposition 7 in Sect. 3 of [19] from the beginning until equality (11). After equation (11) the fact that \(p(x,y)\) is equal to \(\mathbf {1}_{\sigma (y)=x}e^{A(y)}\) is used there; in the present setting we just have to start from equation (11). Notice that the infinitesimal generator \(L^V(f)(x)\) can be written as
\[ L^V(f)(x)\,=\,\frac{1}{F_V(x)}\,\big[\,L(F_V\, f)(x)\,+\,V(x)\,F_V(x)\,f(x)\,-\,\lambda_V\,F_V(x)\,f(x)\,\big]. \]
Using the fact that \(F_V\) and \(\lambda _V\) are, respectively, the eigenfunction and the eigenvalue, we get that the expression (24) defines an infinitesimal generator for a continuous time Markov Chain. \(\square \)
Now, rewriting (24) as
\[ L^V(f)(x)\,=\,\sum_{y\in E}\, p(x,y)\, e^{\log F_V(y)-\log F_V(x)}\,[\,f(y)-f(x)\,], \]
we can see that the process \(\{Y_T^V,\, T\ge 0\}\) is a perturbation of the original process \(\{X_t,\, t\ge 0\}\). This perturbation is given by the function \(\log F_V\), where \(F_V\) is the eigenfunction of \(L+V\), in the sense of Appendix 1.7 of [17, p. 337].
Now we will introduce a natural concept of entropy for this more general setting described by the general function \(p\).
Denote by \(\mathbb P_\mu \) the probability on the Skorohod space \(D:=D([0, T], E)\) induced by \(\{X_t,\, t\ge 0\}\) and the initial measure \(\mu \). And denote by \(\mathbb P^V_\mu \) the probability on \(D\) induced by \(\{Y_T^V,\, T\ge 0\}\) and the initial measure \(\mu \). By [17, p. 336], the Radon–Nikodym derivative \(\frac{d\mathbb P^V_\mu }{d \mathbb P_\mu }\) is
\[ \frac{d\mathbb P^V_\mu}{d\mathbb P_\mu}\Big|_{\mathcal F_T}\,=\,\frac{F_V(X_T)}{F_V(X_0)}\;e^{\int_0^T (V(X_s)-\lambda_V)\,ds}. \]
Thus, we obtain the expression
\[ \log\,\frac{d\mathbb P^V_\mu}{d\mathbb P_\mu}\Big|_{\mathcal F_T}\,=\,\int_0^T\big(V(X_s)-\lambda_V\big)\,ds\,+\,\log F_V(X_T)\,-\,\log F_V(X_0), \]
which is sharper than expression (17) on page 13 of [19]. To compare them, take \(\tilde{\gamma }=1-V+\lambda _V\) in (17); then we obtain the first term. To obtain the second one, we need to observe that the second term in (17), in [19], can be written as a telescopic sum.
Now for a fixed \(k\) we will explain how to get the value of the entropy of the corresponding Gibbs state for \(k\, V_k: \Gamma _k \rightarrow \mathbb {R}\).
In the general setting of the last theorem consider \(E = \Gamma _k=\{0,1/k,2/k,\ldots ,(k-1)/k\}\), and, for \(i/k,j/k\in \Gamma _k\):

(a) \(p(i/k,j/k)= k\), if \(j=i+1\) or \(j=i-1\);

(b) \(p(i/k,j/k)=0\), in the other cases.
The existence of the eigenfunction \(F_k\) and the eigenvalue \(\lambda _k\) for \(k L_k + k V_k\) follows from the continuous time Perron's Theorem described before. The associated continuous time Gibbs Markov Chain has an initial stationary vector, which is the probability \(\pi _{k,V}\) defined before.
Now we have to integrate, with respect to \(\mathbb P_{\pi _{k,V}}^{kV_k}\) and for \(T\) fixed, the function
\[ \int_0^T\big(k\,V_k(X_s)-\lambda_k\big)\,ds\,+\,\log F_k(X_T)\,-\,\log F_k(X_0). \]
As the probability that we consider on the Skorohod space is stationary and ergodic, this integration results in
\[ T\,\Big(\int k\,V_k\;d\pi_{k,V}\,-\,\lambda_k\Big). \]
Thus, the entropy is \(H(\mathbb P_{\pi _{k,V} }^{kV_k}\vert \mathbb P_{\pi _{k,V} })=\int k V_k \,d \pi _{k,V} - \lambda _k\). We point out that for a fixed \(k\) this number is computable from the linear problem associated to the continuous time Perron operator. Now, in order to find the limit entropy associated to \(V\), we need to take the limit in \(k\) of the above expression.
Here, we assume that the Mather measure is a Dirac delta probability at \(x_0\). Remember that \(\lim _{k\rightarrow \infty }\frac{1}{k}\lambda (k) =c(L)= V(x_0)\). Moreover, \(\pi _{k,V} \rightarrow \delta _{x_0}\), when \(k\rightarrow \infty \). Therefore,
\[ \lim_{k\rightarrow\infty}\,\frac{1}{k}\,\Big(\int k\,V_k\,d\pi_{k,V}\,-\,\lambda_k\Big)\,=\,V(x_0)\,-\,V(x_0)\,=\,0. \]
The limit entropy in this case is zero.
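For each fixed \(k\) the number \(H(k)=\int kV_k\,d\pi_{k,V}-\lambda_k\) is indeed computable from the linear Perron problem. The sketch below (our own, with the same assumed rate convention as the earlier blocks) computes \(H(k)\) and exhibits \(\frac{1}{k}H(k)\) tending to \(0\):

```python
import numpy as np

def entropy_H(k, V):
    """Entropy H(k) = int k V_k d pi_{k,V} - lambda_k of the Gibbs state for
    k V_k, via the Perron data of the finite matrix k L_k + k V_k."""
    x = np.arange(k) / k
    M = np.zeros((k, k))
    for i in range(k):
        M[i, (i + 1) % k] += k
        M[i, (i - 1) % k] += k
        M[i, i] += -2.0 * k + k * V(x[i])
    w, Q = np.linalg.eigh(M)       # M is symmetric, so u_k = mu_k
    lam = w[-1]                    # Perron eigenvalue lambda_k
    u = np.abs(Q[:, -1])
    pi = u * u                     # stationary vector pi_{k,V}
    pi /= pi.sum()
    return np.dot(k * V(x), pi) - lam

V = lambda s: np.cos(2 * np.pi * s)    # unique maximum at x0 = 0, V(x0) = 1
for k in [25, 50, 100]:
    print(k, entropy_H(k, V) / k)      # (1/k) H(k) is small and tends to 0
```

Since \(\pi_{k,V}=u_k^2/\sum u_k^2\) for the symmetric walk, one has \(H(k)=-\langle u_k, k L_k u_k\rangle/\langle u_k,u_k\rangle\ge 0\), so the computed entropies are nonnegative.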
References
Anantharaman, N.: Counting geodesics which are optimal in homology. Ergod. Theory Dyn. Syst. 23(2), 353–388 (2003)
Anantharaman, N.: On the zero-temperature or vanishing viscosity limit for Markov processes arising from Lagrangian dynamics. J. Eur. Math. Soc. 6(2), 207–276 (2004)
Anantharaman, N., Iturriaga, R., Padilla, P., Sánchez-Morgado, H.: Physical solutions of the Hamilton–Jacobi equation. Discret. Contin. Dyn. Syst. Ser. B 5(3), 513–528 (2005)
Baraviera, A., Exel, R., Lopes, A.: A Ruelle Operator for continuous time Markov chains. São Paulo J. Math. Sci. 4(1), 1–16 (2010)
Baraviera, A., Leplaideur, R., Lopes, A.O.: Selection of ground states in the zero temperature limit for a one-parameter family of potentials. SIAM J. Appl. Dyn. Syst. 11(1), 243–260 (2012)
Biryuk, A., Gomes, D.A.: An introduction to Aubry–Mather theory. Sao Paulo J. Math. Sci. 4(1), 17–63 (2010)
Carneiro, M.J.: On minimizing measures of the action of autonomous Lagrangians. Nonlinearity 8(6), 1077–1085 (1995)
Contreras, G., Iturriaga, R.: Global minimizers of autonomous Lagrangians, CIMAT (2000) (see homepage of G. Contreras at CIMAT)
Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Springer, New York (1998)
Fathi, A.: Théorème KAM faible et théorie de Mather sur les systèmes lagrangiens. C.R. Acad. Sci. Ser. I Math. 324, 1043–1046 (1997)
Fathi, A.: Weak KAM Theorem in Lagrangian Dynamics. Lecture Notes, Pisa (2005)
Gomes, D.A.: Viscosity solution methods and discrete Aubry Mather problem. Discret. Contin. Dyn. Syst. 13(1), 103–116 (2005)
Gomes, D.A., Valdinoci, E.: Entropy penalization methods for Hamilton–Jacobi equations. Adv. Math. 215(1), 94–152 (2007)
Gomes, D., Lopes, A., Mohr, J.: The Mather measure and a large deviation principle for the entropy penalized method. Commun. Contemp. Math. 13(2), 235–268 (2011)
Ioffe, D., Levit, A.: Ground states for mean field models with a transverse component. J. Stat. Phys. 151(6), 1140–1161 (2013)
Kifer, Y.: Large deviations in dynamical systems and stochastic processes. TAMS 321(2), 505–524 (1990)
Kipnis, C., Landim, C.: Scaling Limits of Interacting Particle Systems. Grundlehren der Mathematischen Wissenschaften, vol. 320. Springer, Berlin (1999)
Lecomte, V., Appert-Rolland, C., van Wijland, F.: Thermodynamic formalism for systems with Markov dynamics. J. Stat. Phys. 127(1), 51–106 (2007)
Lopes, A.O., Neumann, A., Thieullen, P.: A thermodynamic formalism for continuous time Markov chains with values on the Bernoulli Space: entropy, pressure and large deviations. J. Stat. Phys. 152(5), 894–933 (2013)
Neumann, A.: Large deviations principle for the exclusion process with slow bonds, Ph.D. thesis, IMPA (2011)
Olivieri, E., Vares, M.E.: Large Deviations and Metastability. Cambridge University Press, Cambridge (1998)
Parry, W., Pollicott, M.: Zeta functions and the periodic orbit structure of hyperbolic dynamics. Astérisque 187–188, 1–268 (1990)
Stroock, D.W.: An Introduction to Large Deviations. Springer, New York (1984)
Lopes, A.O., Neumann, A. Large Deviations for Stationary Probabilities of a Family of Continuous Time Markov Chains via Aubry–Mather Theory. J Stat Phys 159, 797–822 (2015). https://doi.org/10.1007/s10955-015-1205-1