1 Introduction and main results

In this work we study Brownian motion on the unitary group \({\mathbb {U}}(N)\) of dimension N. One can define Brownian motion on \({\mathbb {U}}(N)\) by considering the left-invariant Riemannian metric induced by the inner product \(\langle A, B \rangle = N {\textrm{tr}}(A B^*)\) on the Lie algebra of skew-Hermitian matrices. Unitary Brownian motion is then the Markov diffusion process starting from the identity matrix with generator given by the Laplacian on \({\mathbb {U}}(N)\) associated to this metric.

It will be more convenient for us to consider the following equivalent definition of \(U_t\) as the solution of the Itô stochastic differential equation,

$$\begin{aligned} {\textrm{d}}U_t = {\textrm{i}}U_t {\textrm{d}}W_t - \frac{1}{2} U_t {\textrm{d}}t, \qquad U_0 = {\varvec{1}} \end{aligned}$$
(1.1)

where \(W_t\) is a standard complex Hermitian Brownian motion. That is, if \(X_t\) and \(X'_t\) are \(N \times N\) matrices of independent standard Brownian motions, then,

$$\begin{aligned} W_t = \frac{1}{ \sqrt{4N}} \left( X_t +X_t^T + {\textrm{i}}(X'_t - (X'_t)^T ) \right) . \end{aligned}$$
(1.2)

The process (1.1) admits strong solutions by standard results (see, e.g., Theorem 8.3 of [35]).
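For readers who wish to experiment numerically, the following Python sketch (ours, purely illustrative and not used in any argument below) implements a naive Euler discretization of (1.1) with the normalization (1.2). The Itô drift \(-\frac{1}{2} U_t {\textrm{d}}t\) is what keeps the scheme close to the unitary group, although the discretization does not preserve unitarity exactly.

```python
import numpy as np

def simulate_ubm(N=100, t=0.2, dt=5e-4, seed=0):
    """Naive Euler scheme for dU = i U dW - (1/2) U dt, U_0 = identity,
    with W the Hermitian Brownian motion normalized as in (1.2)."""
    rng = np.random.default_rng(seed)
    U = np.eye(N, dtype=complex)
    for _ in range(int(round(t / dt))):
        X = rng.normal(0.0, np.sqrt(dt), (N, N))
        Y = rng.normal(0.0, np.sqrt(dt), (N, N))
        dW = (X + X.T + 1j * (Y - Y.T)) / np.sqrt(4 * N)  # increment of (1.2)
        U = U + 1j * U @ dW - 0.5 * U * dt                # Euler step for (1.1)
    return U

U = simulate_ubm()
# Deviation from unitarity: O(dt) systematic error plus mean-zero fluctuations.
unitarity_err = np.linalg.norm(U @ U.conj().T - np.eye(U.shape[0]), ord=2)
```

The eigenvalues of the resulting \(U\) cluster on an arc of the unit circle around \(1\), consistent with the limiting support described below.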

Unitary Brownian motion is well-studied in random matrix theory as well as in the context of free probability due to its connection with an object called free unitary Brownian motion. Of particular interest is the empirical spectral measure of \(U_t\), which is a random, time-dependent measure on the unit circle defined by,

$$\begin{aligned} {\textrm{d}}\nu _{N, t} (x):= \frac{1}{N} \sum _{i=1}^N \delta _{\lambda _i (t) } (x) {\textrm{d}}x. \end{aligned}$$
(1.3)

In the work [10], Biane showed that for fixed t, the measure \(\nu _{N, t}\) converges almost surely to a measure on the unit circle which we will denote by \(\nu _t\). Identifying the unit circle with the angular coordinates \(\theta \in (-\pi , \pi ]\), the measure \(\nu _t\) has a density \(\rho _t ( \theta ) \) for any \( t>0\). For \(t <4\) the support of \(\rho _t\) is given by,

$$\begin{aligned} I_t:= [ - \Theta _t, \Theta _t], \qquad \Theta _t:= \frac{1}{2} \sqrt{ (4-t)t} +2 \arcsin \left( \sqrt{\frac{t}{4}}\right) , \end{aligned}$$
(1.4)

whereas for \( t \ge 4\), the support is the entire unit circle. Moreover, for \(t \ge 4\) the density is everywhere non-zero, except when \(t=4\), in which case \(\rho _t\) vanishes only at \(\pi \). In fact, \(\rho _t\) can be described as the spectral measure of free unitary Brownian motion, an object appearing in free probability. The limit of \(\nu _{N, t}\) was also derived independently by Rains in [43].
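The edge function \(\Theta _t\) in (1.4) is elementary to evaluate. The following sketch (ours, purely illustrative) confirms three features used repeatedly below: \(\Theta _t\) is increasing in \(t\), \(\Theta _t \approx 2 \sqrt{t}\) for small \(t\), and \(\Theta _4 = \pi \), so that the support closes up exactly at \(t = 4\).

```python
import math

def theta_edge(t):
    """The edge Theta_t of the support I_t, from (1.4), for 0 < t <= 4."""
    return 0.5 * math.sqrt((4 - t) * t) + 2 * math.asin(math.sqrt(t / 4))

# Differentiating (1.4) gives d(Theta_t)/dt = sqrt(4 - t) / (2 sqrt(t)) > 0,
# so the support grows monotonically until it covers the circle at t = 4.
edges = [theta_edge(t) for t in (0.5, 1.0, 2.0, 3.0, 3.99)]
```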

Since Biane’s paper, there have been many works studying the convergence of \(\nu _{N, t}\) to \(\rho _t\). Concentration estimates and convergence for the empirical averages \(\int f {\textrm{d}}\nu _{N, t}\) were established by Kemp [31] for various classes of f of low regularity. Meckes and Melcher [42] established explicit convergence rates in terms of the \(L^1\)-Wasserstein metric. For \(0< t < 4\), the convergence of the spectral edge of \(U_t\) to \(\pm \Theta _t\) was established by Collins, Dahlqvist and Kemp [17]. This work also established a multi-time, multi-matrix version of this result. The asymptotic Gaussian fluctuations of the empirical averages of \(\int f {\textrm{d}}\nu _{N, t}\) were established by Lévy and Maida [38]. Multivariate fluctuations for trace polynomials of a two parameter family of diffusion processes (including unitary Brownian motion as a special case) were studied by Cébron and Kemp [13].

The main contribution of the present work is to establish almost-optimal rates (i.e., up to polynomial \(N^{\varepsilon }\) factors) of convergence of \(\nu _{N, t}\) to the limiting distribution \(\rho _t\) on the almost-shortest possible scales, as well as almost-optimal estimates on the eigenvalue locations. In the random matrix literature, these estimates are known as local laws and rigidity estimates, respectively. Our local laws are stated in Theorem 1.2 and in Corollary 1.3 below, showing that the number of eigenvalues in any sub-interval I of the unit circle is given by \(N \rho _t (I) + {\mathcal {O}}(N^{\varepsilon } )\) for any \(\varepsilon >0\).

For times \( t< 4\) we establish almost-optimal rates of convergence of the spectral edge of \(U_t\) to \(\pm \Theta _t\). That is, the extremal eigenvalues are within distance \({\mathcal {O}}(N^{-2/3+\varepsilon } )\) of \(\pm \Theta _t\) (what is usually termed edge rigidity in the literature). Given the square-root behavior of the spectral measure at the edges, this is expected to be optimal up to the \(N^{\varepsilon }\) factor. We also derive almost-optimal edge rigidity results up to \(t = 4 - N^{-1/2+\varepsilon }\) at which point the measure \(\rho _t\) forms a cusp (i.e., vanishes like a cube root) near \(\theta = \pm \pi \). This is expected to be optimal as at later times, the natural inter-particle distance at the spectral edges exceeds the distance between \(+ \Theta _t\) and \(-\Theta _t\). Our rigidity estimates are formulated in Corollaries 1.5 and 1.7 below. The scaling behaviors in the various parameter regimes will be given as the results are introduced.

To our knowledge, our results are the strongest available estimates on eigenvalue locations for unitary Brownian motion. Our methods are completely different from prior works on rates of convergence to \(\rho _t\), relying on the method of characteristics from PDEs. Previous works were based on moment calculations and/or concentration estimates for heat kernels on Lie groups.

1.1 Discussion of methodology

There are many approaches in the literature for proving local laws and rigidity in random matrix theory. For Wigner matrices and related mean-field random matrices W, a multi-scale approach to analyzing the resolvent \((W-z)^{-1}\) giving estimates down to the optimal scale was developed by Erdős, Schlein and Yau [21,22,23]. For pedagogical overviews of this strategy see [9, 26]. Earlier local laws for random matrices on short scales appeared in [8, 28].

While powerful, the resolvent method makes heavy use of the matrix structure and independence between different entries; while \(U_t\) is associated with a matrix process, the correlation structure between entries is complicated and renders this approach intractable. Instead, the process \(U_t\) somewhat resembles what is known as (Hermitian) Dyson Brownian motion, whose definition we now recall. In the work [19], Dyson showed that the eigenvalues of \(V+W_t\) for any Hermitian V (and \(W_t\) as in (1.2)) obey the system of SDEs,

$$\begin{aligned} {\textrm{d}}\mu _i = \sqrt{ \frac{2}{N \beta } } {\textrm{d}}B_i + \frac{1}{N} \sum _{j \ne i } \frac{1}{ \mu _i - \mu _j} {\textrm{d}}t - \frac{1}{2} \mu _i {\textrm{d}}t, \end{aligned}$$
(1.5)

where \(\beta =2\). Moreover, at the level of formal calculation, Dyson showed that the eigenvalues \( \lambda _i (t)\) of the matrix \(U_t\) satisfy the closed system of SDEs,

$$\begin{aligned} {\textrm{d}}\lambda _i = \frac{1}{ \sqrt{N}} {\textrm{i}}\lambda _i {\textrm{d}}B_i - \frac{1}{N} \sum _{j \ne i } \frac{ \lambda _j \lambda _i}{ \lambda _i - \lambda _j } {\textrm{d}}t - \frac{1}{2} \lambda _i {\textrm{d}}t, \end{aligned}$$
(1.6)

where the \(\{ B_i (t) \}_{i=1}^N\) are a family of independent standard Brownian motions. We will not make use of this SDE, but mention that well-definedness of this process was studied in [14].

Given the similar form of these two processes, it is therefore useful to recall how rigidity has been established in the study of DBM (1.5) for general \(\beta \). Given Dyson’s original derivation of (1.5) we see that for the special values \(\beta =1, 2, 4\), this eigenvalue process in fact comes from a matrix-valued Brownian motion that can be thought of as an additively deformed Gaussian Orthogonal/Unitary/Symplectic ensemble. In the case that the initial data is the 0 matrix, this is just a scaled Gaussian matrix, and so the local laws and rigidity follow from those for Wigner matrices. For general initial data, local laws and rigidity were developed by Lee and Schnelli [36, 37] as well as the second author with Yau [33, 34] still using resolvent methods, as the additive structure inherent in DBM allows this approach to work. However, for general \(\beta \), the process (1.5) is no longer naturally associated with any matrix process and so no resolvent methods can work.

Nonetheless, the second author with Huang [30] showed that local laws and rigidity in fact hold for DBM with general \(\beta \) as well as a general potential (the equation (1.5) being associated with quadratic \(V' ( \mu ) = \frac{1}{2} \mu \)). This was further developed by the first author with Huang to study the spectral edges [1]. The method in these works is our starting point and so we review it here.

The works [1, 30] were based on a PDE-style approach to studying the Stieltjes transform,

$$\begin{aligned} m (z, t):= \frac{1}{N} \sum _{i=1}^N \frac{1}{ \mu _i (t) -z }, \end{aligned}$$
(1.7)

of the empirical eigenvalue measure associated to the Hermitian DBM (1.5), on short scales \({\textrm{Im}}[z] \sim N^{-1}\). For \(\beta =2\), this quantity satisfies,

$$\begin{aligned} {\textrm{d}}m (z, t) = \left( m (z, t) + \frac{z}{2} \right) \partial _z m(z, t) {\textrm{d}}t+ \frac{1}{2} m (z, t) {\textrm{d}}t + {\textrm{d}}N_t, \end{aligned}$$
(1.8)

for some martingale \(N_t\) which turns out to be lower-order. This leads to the limiting complex Burgers equation,

$$\begin{aligned} \partial _t {\tilde{m}}(z, t) = \left( {\tilde{m}}(z, t) + \frac{z}{2} \right) \partial _z {\tilde{m}}(z, t) + \frac{1}{2} {\tilde{m}}(z, t). \end{aligned}$$
(1.9)

These equations were considered by Rogers and Shi [45], and general potential analogues by Li, Li and Xie [39, 40]. The complex Burgers equation may be solved by the elementary method of characteristics. The works [1, 30] were based on tracking the difference between \({\tilde{m}}(z_t, t)\) and \(m(z_t, t)\) along characteristics \(z_t\). Note that for any fixed z, it is relatively straightforward to see that the limit of \(m(z, t)\) must satisfy (1.9). The main challenge in [1, 30] is to extend this to short scales and obtain optimal error estimates.

The natural analogue of the Stieltjes transform \(m(z, t)\) above for measures supported on the unit circle is the following transform,

$$\begin{aligned} f(z, t):= \frac{1}{N} {\textrm{tr}}\left( \frac{ U_t + z}{ U_t-z} \right) = \int \frac{ {\textrm{e}}^{ {\textrm{i}}\theta } + z}{ {\textrm{e}}^{ {\textrm{i}}\theta } - z} {\textrm{d}}\nu _{N, t} ( \theta ). \end{aligned}$$
(1.10)

We will refer to this as the “Cauchy transform” although the Cauchy transform is usually defined slightly differently [15] (the usual definition differs from ours only by an affine transformation). Our choice of f is to match the work of Biane [10, 11] discussed in more detail below.

An application of Itô’s formula starting from (1.1) shows that \(f(z, t)\) obeys the SDE,

$$\begin{aligned} {\textrm{d}}f(z, t) = - \frac{z f(z, t)}{2} \partial _z f(z, t) {\textrm{d}}t + {\textrm{d}}M_t (z) \end{aligned}$$
(1.11)

where \(M_t(z)\) is a complex-valued martingale defined via,

$$\begin{aligned} {\textrm{d}}M_t(z) = - \frac{2 {\textrm{i}}z}{N} {\textrm{tr}}\left( \frac{ U_t}{ (U_t-z)^2} {\textrm{d}}W_t \right) = - \frac{2 {\textrm{i}}z}{N} \sum _{i, j=1}^N \left[ \frac{ U_t}{ (U_t-z)^2} \right] _{ij} {\textrm{d}}(W_t)_{ij}, \end{aligned}$$
(1.12)

where the second equality explicitly writes out the trace in terms of the matrix elements of \(U_t (U_t -z)^{-2}\) and the matrix of stochastic differentials of the Hermitian Brownian motion \(W_t\). The covariation process of \(M_t\) is easily calculated,

$$\begin{aligned} \langle {\textrm{d}}M , {\textrm{d}}M \rangle&= - \frac{4 z^2}{N^3} {\textrm{tr}}\left( \frac{ U_t^2}{ ( U_t -z )^4} \right) {\textrm{d}}t \nonumber \\ \langle {\textrm{d}}M , {\textrm{d}}{\bar{M}} \rangle&= \frac{4 |z|^2}{N^3} {\textrm{tr}}\left( \frac{ |U_t|^2}{ |U_t - z|^4 } \right) {\textrm{d}}t. \end{aligned}$$
(1.13)
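The second line of (1.13) can be checked by direct Monte Carlo sampling of the increment (1.12). The sketch below (ours, for illustration only) does this in the simplest case \(U_t = {\varvec{1}}\), where \(|U_t|^2 = 1\) and the trace reduces to \(N |1-z|^{-4}\).

```python
import numpy as np

rng = np.random.default_rng(0)
N, dt, z, samples = 40, 1e-3, 1.5 + 0.5j, 10000

A = np.eye(N, dtype=complex) / (1 - z) ** 2        # U (U - z)^{-2} at U = identity
vals = np.empty(samples, dtype=complex)
for s in range(samples):
    X = rng.normal(0.0, np.sqrt(dt), (N, N))
    Y = rng.normal(0.0, np.sqrt(dt), (N, N))
    dW = (X + X.T + 1j * (Y - Y.T)) / np.sqrt(4 * N)   # increment of (1.2)
    vals[s] = -2j * z / N * np.trace(A @ dW)           # increment of (1.12)

empirical = np.mean(np.abs(vals) ** 2)
# <dM, dMbar> from (1.13) at U = identity: (4 |z|^2 / N^3) * N |1 - z|^{-4} dt.
predicted = 4 * abs(z) ** 2 / N ** 2 / abs(1 - z) ** 4 * dt
```

The sample mean `empirical` agrees with `predicted` up to Monte Carlo error of order \(\texttt{samples}^{-1/2}\).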

Based on this, one sees that the limiting Cauchy transform should be the solution to,

$$\begin{aligned} \partial _t {\tilde{f}}(z, t) = - \frac{z {\tilde{f}}(z, t)}{2} \partial _z {\tilde{f}}(z, t), \qquad {\tilde{f}}(z, 0) = \frac{ 1+z}{1-z}, \end{aligned}$$
(1.14)

the analogue of the complex Burgers equation (1.9). Indeed, these equations were found by Biane [10, 11], and the Cauchy transform \({\tilde{f}}(z, t)\) characterizes the limiting spectral measure \(\nu _t\). We collect properties of this measure and its density \(\rho _t ( \theta )\) in Appendix A.

In the present work, we will analyze the equation (1.11) through the characteristics associated to (1.14). That is, if z(t) is a time-dependent curve in \({\mathbb {C}}\) satisfying,

$$\begin{aligned} \frac{ {\textrm{d}}}{{\textrm{d}}t} z(t) = \frac{ z(t) {\tilde{f}}( z(t), t) }{2}, \end{aligned}$$
(1.15)

then,

$$\begin{aligned} \frac{ {\textrm{d}}}{ {\textrm{d}}t} {\tilde{f}}( z(t), t) = 0. \end{aligned}$$
(1.16)

In fact, Biane’s work [11] shows that for any t the map \(z \rightarrow z(t)\) is a conformal map of some domain onto the complement of the unit circle in \({\mathbb {C}}\), which allows one to construct \({\tilde{f}}(z, t)\) from the characteristics.
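The relations (1.15)–(1.16) can be checked by direct simulation: since \({\tilde{f}}\) is constant along a characteristic, (1.15) integrates explicitly to \(z(t) = z_0 \exp ( t {\tilde{f}}(z_0, 0)/2 )\) with \({\tilde{f}}(z_0, 0) = (1+z_0)/(1-z_0)\) from (1.14). The following Python sketch (a crude Euler discretization of (1.1); ours, and not part of our arguments) illustrates that the empirical transform \(f(z(t), t)\) indeed stays close to its initial value along such a curve.

```python
import numpy as np

N, dt, T = 150, 5e-4, 0.3
rng = np.random.default_rng(1)

z0 = 2.0 + 0.0j              # starting point outside the closed unit disc
c = (1 + z0) / (1 - z0)      # ftilde(z0, 0) from (1.14); constant by (1.16)

def f_emp(U, z):
    """Empirical Cauchy transform f(z, t) = (1/N) tr((U + z)(U - z)^{-1}) of (1.10)."""
    lam = np.linalg.eigvals(U)
    return np.mean((lam + z) / (lam - z))

U = np.eye(N, dtype=complex)
f_start = f_emp(U, z0)       # equals c exactly at t = 0 (all eigenvalues at 1)

for _ in range(int(round(T / dt))):   # Euler steps for dU = i U dW - U dt / 2
    X = rng.normal(0.0, np.sqrt(dt), (N, N))
    Y = rng.normal(0.0, np.sqrt(dt), (N, N))
    dW = (X + X.T + 1j * (Y - Y.T)) / np.sqrt(4 * N)
    U = U + 1j * U @ dW - 0.5 * U * dt

zT = z0 * np.exp(c * T / 2)  # characteristic position at time T, from (1.15)
f_end = f_emp(U, zT)
```

At \(t = 0\) the agreement is exact since all eigenvalues sit at \(1\); at later times the deviation of `f_end` from `c` is of the size predicted by Theorem 1.2, plus discretization error.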

At a superficial level, we are translating the methods of [1, 30] from the real line to the unit circle. However, the translation is not at all straightforward, and there are serious obstacles to be overcome once one moves to our new setting. We briefly mention a few now; they will be presented in more detail below as we discuss our results. First, it is not a priori clear that this method could even work. In particular, it is crucial that the martingale term \(M_t (z)\) above can be controlled by the empirical Cauchy transform \(f(z, t)\) itself. The precise form of the quadratic variation of \(M_t\) is therefore important. Similar considerations hold for controlling the term \(\partial _z f (z, t)\) by \(f(z, t)\). In fact, obtaining precise constants here is crucial due to the use of Gronwall’s inequality in our proof.

One of the novelties of our work is the treatment of times close to the critical time \(t = 4\), where the spectral measure \(\rho _t( \theta )\) forms a cusp singularity near \(\theta = \pm \pi \). Local laws near a cusp are in general delicate (see e.g., [6, 16, 20]) and in the random matrix setting have not been dealt with via the characteristics approach before our work (but see the work [29] which studies non-intersecting random walks via characteristics in a different context). Second, the works [1, 30] dealt only with short times \(t = o(1)\), whereas we are interested in times of order 1. Coupled with the curvature of the unit circle, this requires a more detailed understanding of the behavior of the characteristics, especially near the spectral edges and cusps, than was required before. Here, we partially rely on the semi-explicit form of the spectral measure and Cauchy transform \({\tilde{f}}(z, t)\). Finally, the results of [1] make somewhat strong assumptions on the initial data. This was primarily due to the treatment of general potentials in that work, which allowed for an analysis of the movement of the spectral edge. Here, our initial data is a delta function, falling outside the assumptions of [1]. In particular, our short-time analysis is more involved.

The characteristic approach has appeared in a few other works on the short-scale behavior of eigenvalues. Bourgade used characteristics to analyze a “stochastic advection equation” derived from a certain coupling between DBMs and obtained fine estimates on the local eigenvalue behavior of general Wigner matrices, including universality of the extreme gaps [12]. Von Soosten and Warzel used random characteristics to prove local laws for Wigner matrices [49] and to study delocalization in the Rosenzweig–Porter model [48].

1.2 Statement of main results

The Cauchy transform \(f_\mu (z)\) of a measure \(\mu \) on the unit circle is traditionally studied for z in the open unit disc, \(\{ |z| < 1 \}\). Due to the identity,

$$\begin{aligned} f_\mu (r {\textrm{e}}^{ {\textrm{i}}\theta } ) = - \bar{ f_\mu }( r^{-1} {\textrm{e}}^{ {\textrm{i}}\theta } ), \end{aligned}$$
(1.17)

this is equivalent to studying the behavior for \(|z| >1\), and it is somewhat conceptually simpler for our techniques to treat \(|z| >1\). Based on this, our first main result is the following, identifying the rate of convergence of the empirical spectral measure of unitary Brownian motion down to the optimal scale (up to polynomial factors). In order to state our results we introduce the notion of overwhelming probability.

Definition 1.1

If \({\mathcal {A}}_i\) are events indexed by some set \(i \in {\mathcal {I}}\) (and may depend on N) then we say that the family of events \({\mathcal {A}}_i\) hold with overwhelming probability if for all \(D>0\) there is a \(C>0\) so that,

$$\begin{aligned} \sup _{i \in {\mathcal {I}}} {\mathbb {P}}[ {\mathcal {A}}_i^c] \le N^{-D}, \end{aligned}$$
(1.18)

for all \(N \ge C\).

Theorem 1.2

Let \(T>0\) and \(\varepsilon , \delta , {\mathfrak {c}}>0\). For any \(0< t< T\), we define the domain,

$$\begin{aligned} {\mathcal {B}}_t:= \left\{ z \in {\mathbb {C}}: 5> \log |z| > \frac{ N^{\delta }}{N | {\textrm{Re}}[ {\tilde{f}}(z, t) ] |} \vee N^{-{\mathfrak {c}}} \right\} . \end{aligned}$$
(1.19)

Then, with overwhelming probability we have uniformly for all \(0< t <T\) and \(z \in {\mathcal {B}}_t\) that,

$$\begin{aligned} | f(z, t) - {\tilde{f}}(z, t)| \le \frac{N^{\varepsilon }}{N \log |z|}. \end{aligned}$$
(1.20)

Let us now explain the connection between the Cauchy transform and the spectral measure. For a general probability measure \(\mu \) on the unit circle, one may recover \(\mu \) via the weak limits,

$$\begin{aligned} (2 \pi ){\textrm{d}}\mu (\theta )= & {} \lim _{ r \uparrow 1} \left( {\textrm{Re}}\left[ \int \frac{ {\textrm{e}}^{ {\textrm{i}}\theta '} + r {\textrm{e}}^{ {\textrm{i}}\theta }}{ {\textrm{e}}^{ {\textrm{i}}\theta '} - r {\textrm{e}}^{ {\textrm{i}}\theta }} {\textrm{d}}\mu (\theta ') \right] \right) {\textrm{d}}\theta \nonumber \\= & {} -\lim _{ r \downarrow 1} \left( {\textrm{Re}}\left[ \int \frac{ {\textrm{e}}^{ {\textrm{i}}\theta '} + r {\textrm{e}}^{ {\textrm{i}}\theta }}{ {\textrm{e}}^{ {\textrm{i}}\theta '} - r {\textrm{e}}^{ {\textrm{i}}\theta }} {\textrm{d}}\mu (\theta ') \right] \right) {\textrm{d}}\theta . \end{aligned}$$
(1.21)
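The real part of the integrand above is precisely the Poisson kernel of the disc, which is why these limits recover \(\mu \). A small numerical sketch (ours; a toy two-atom measure, unrelated to \(\nu _{N,t}\)) makes this concrete for r slightly below 1.

```python
import numpy as np

# Toy measure mu: equal point masses at angles +pi/3 and -pi/3.
atoms = np.array([np.pi / 3, -np.pi / 3])
r = 0.99
theta = np.linspace(-np.pi, np.pi, 20001)
h = theta[1] - theta[0]

w = np.exp(1j * atoms)[None, :]          # e^{i theta'} at the atoms
z = (r * np.exp(1j * theta))[:, None]    # r e^{i theta} on the grid
# Re[(w + z)/(w - z)] = (1 - r^2)/|w - z|^2, the Poisson kernel:
density = np.real((w + z) / (w - z)).mean(axis=1) / (2 * np.pi)

mass = density.sum() * h                             # total mass, close to 1
near = np.abs(np.abs(theta) - np.pi / 3) < 0.2
local_mass = density[near].sum() * h                 # mass concentrates at the atoms
```

As \(r \uparrow 1\) the kernel narrows, and the recovered density concentrates ever more sharply at the atoms.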

However, this is not useful in order to obtain effective estimates on, e.g., the number of eigenvalues in an interval. The Helffer–Sjöstrand formula [18] for measures on \({\mathbb {R}}\) allows one to relate empirical averages of test functions to integrals of the Stieltjes transform over \({\mathbb {C}}\), turning estimates on the Stieltjes transform into effective estimates on the eigenvalues [9]. In Sect. 6.1 we quickly develop a version of the Helffer–Sjöstrand formula for measures on the unit circle (like the usual HS formula, it is a consequence of Green’s theorem). Similar formulas have appeared before in the literature [41], but the form given here is well-adapted to our purposes. Using this and the above theorem as input, we obtain the following.

Corollary 1.3

Let \(T>0\) and \(\varepsilon >0\). With overwhelming probability, the following holds uniformly over all intervals \(I \subset (-\pi , \pi ]\), and all \(0< t < T\),

$$\begin{aligned} \left| \left| \{ \lambda _i (t) = {\textrm{e}}^{{\textrm{i}}\theta _i (t) }: \theta _i (t) \in I \} \right| - N \int _I \rho _t (\theta ) {\textrm{d}}\theta \right| \le N^{\varepsilon } + t^{-1/2} N^{-1/\varepsilon }, \end{aligned}$$
(1.22)

for N large enough.

The above corollary shows that as long as \(t \ge N^{-C}\) for some \(C>0\) then the number of eigenvalues in any interval is given by the limiting spectral measure up to an arbitrarily small polynomial error, with very high probability. This is the optimal scaling up to perhaps replacing the \(N^{\varepsilon }\) error by some sort of logarithmic error.

Theorem 1.2 is proven in Sect. 2, and Corollary 1.3 is derived in Sect. 6.2. This is the most straightforward of our results as it is the most literal translation of the methods of [1, 30] to the unitary setting. Nonetheless, estimates as sharp as those of Corollary 1.3 were not known before this work.

Our method relies on an application of Gronwall’s inequality to the function \( t\rightarrow f (z (t), t) - {\tilde{f}}(z (t), t)\), where z(t) is a characteristic as above. The most delicate estimates are associated with estimating the martingale term \(M_t (z(t))\) defined above, as well as the argument around (2.23) and (2.24). The integral form of Gronwall’s inequality we apply involves an exponential term, which this latter argument estimates. Here, we cannot lose any constants, or else the error term would be far too large to close our argument.

Recall that for \( t<4\) the support of \(\rho _t\) is not yet the entire unit circle and is instead the interval \(I_t:= [ - \Theta _t, \Theta _t]\) where \(\Theta _t\) is as in (1.4). The estimates of Theorem 1.2 are insufficient to address the natural question of whether or not there are eigenvalues outside of \(I_t\), or their typical distance from the edges of \(I_t\).

As mentioned above, Collins, Dahlqvist and Kemp [17] showed that with high probability there are no eigenvalues outside any open set containing \(I_t\). However, such a statement does not yield the correct order of fluctuations of the extremal eigenvalues of \(U_t\).

Before stating our results regarding the extremal eigenvalues, let us first ascertain what we expect for the order of magnitude of the fluctuations. For \(\delta<t < 4 -\delta \), we show in Appendix A that for \(E>0\) sufficiently small,

$$\begin{aligned} \rho _t (\Theta _t - E) = c_t E^{1/2} (1 + {\mathcal {O}}(E) ). \end{aligned}$$
(1.23)

That is, the spectral measure vanishes like a square-root at the edges of its spectrum. The square-root behavior near spectral edges is generic in random matrix theory and is associated with limiting Tracy-Widom fluctuations on the order of \({\mathcal {O}}(N^{-2/3})\). While not explicitly formulated, we expect that the estimates of [17] in fact show that the extremal eigenvalues of \(U_t\) are no more than \({\mathcal {O}}(N^{-c})\) from the edges of \(I_t\) for some small, explicit \(c>0\). However, the interparticle distance associated with the square-root behavior is \({\mathcal {O}}(N^{-2/3} )\) and so such an estimate nonetheless falls short.
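The interparticle distance quoted here follows from the square-root vanishing (1.23) by a quantile computation: if \(\rho _t (\Theta _t - E) \approx c_t E^{1/2}\), then the k-th quantile from the edge sits at distance \(\eta _k\) solving \(\frac{2 c_t}{3} \eta _k^{3/2} = k/N\). A short sketch (ours, with an arbitrary illustrative constant \(c = 1\)):

```python
# Quantiles of a density rho(E) = c sqrt(E) near its edge: solving
# (2c/3) eta^{3/2} = k/N gives eta_k = (3k / (2cN))^(2/3).
def edge_quantile(k, N, c=1.0):
    return (3 * k / (2 * c * N)) ** (2 / 3)

# The extremal quantile sits at distance ~ N^(-2/3) from the edge, uniformly in N:
ratios = [edge_quantile(1, N) * N ** (2 / 3) for N in (10**3, 10**5, 10**7)]
# Successive gaps eta_{k+1} - eta_k shrink like k^{-1/3}, matching the index
# dependence in the rigidity estimates of Corollary 1.5 below.
gaps = [edge_quantile(k + 1, 10**6) - edge_quantile(k, 10**6) for k in range(1, 6)]
```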

Our next result shows that with overwhelming probability, the extremal eigenvalues are in fact within \({\mathcal {O}}(N^{-2/3+\varepsilon } )\) of the edges of the support \(I_t\). These are analogues of the well-known rigidity results in random matrix theory at the edge (see, e.g., [27] for the first such estimates for generalized Wigner matrices). This is also the analogue of the results of [1] on the Hermitian DBM in the unitary setting.

Theorem 1.4

Let \(\delta >0\) and \(\varepsilon >0\) be sufficiently small. With overwhelming probability, the following holds uniformly for all t satisfying \(\delta< t < 4- \delta \),

$$\begin{aligned} \left| \left\{ i: \lambda _i (t) = {\textrm{e}}^{ {\textrm{i}}\theta }, \theta \in [-\pi , \pi ] \backslash [ - \Theta _t - N^{-2/3+\varepsilon }, \Theta _t + N^{-2/3+\varepsilon } ] \right\} \right| = 0, \end{aligned}$$
(1.24)

and for all \(0 \le t \le \delta \),

$$\begin{aligned} \left| \left\{ i: \lambda _i (t) = {\textrm{e}}^{ {\textrm{i}}\theta }, \theta \in [-\pi , \pi ] \backslash [ - \Theta _t - N^{-\varepsilon /6}, \Theta _t + N^{-\varepsilon /6} ] \right\} \right| = 0. \end{aligned}$$
(1.25)

For short times \(t \ll 1\), the measure \(\rho _t\) looks like an approximate semicircle centered at \(\theta =0\) of width \(\sqrt{t}\) and \(\rho _t(0) \asymp t^{-1/2}\). One therefore expects a different scaling of the interparticle distance for short times t. The error we obtain is not optimal for short times t, but this regime is not the main focus of our work and so we do not try to optimize our approach here.

Together with Corollary 1.3, we can then deduce the following rigidity estimates. To introduce them, we require some further notation. Note that for all times \(t>0\) the joint law of the eigenvalues of \(U_t\) has a density with respect to Haar measure, and so for each fixed time \(t_0 > 0 \), the eigenvalues are almost surely distinct. On the other hand, it follows from [14] that for distinct initial data, the solution to (1.6) exists as a strong solution for all times \(t> t_0\) and moreover the eigenvalues do not intersect. Taking \(t_0 \rightarrow 0\) it follows that with probability 1, the eigenvalues are distinct for all times \(t>0\). Since the eigenvalue locations are continuous functions of time and all start at the location \(z=1\) at \(t=0\) it follows that there is a labelling \(\{ \lambda _i (t) \}_{i=1}^N\) so that we can write \(\lambda _i = {\textrm{e}}^{ {\textrm{i}}\theta _i (t)}\) for continuous \(\theta _i (t)\) starting at 0 and for all \(t>0\) satisfying,

$$\begin{aligned} \theta _1 (t)< \theta _2 (t)< \dots< \theta _N (t) < \theta _1 (t) + 2 \pi . \end{aligned}$$
(1.26)

Note that it is possible for \(\lambda _i (t) = {\textrm{e}}^{ {\textrm{i}}\theta _i (t)}\) to wrap many times around the unit circle; that is, each \(\theta _i (t) \) can take any value in \({\mathbb {R}}\) such that the above ordering is respected. However, since the statement of Theorem 1.4 holds on an event of overwhelming probability for all t simultaneously, we can rule out any eigenvalues passing through the point \(\theta = \pi \) on this event (this could also be concluded by passing from estimates holding on a sufficiently finely-spaced grid of times to all t using the Hoffman–Wielandt inequality). This observation allows us to formulate the following rigidity estimates.

Denote by \(\gamma _i(t)\) the quantiles of \(\rho _t\),

$$\begin{aligned} \frac{i}{N} = \int _{-\pi }^{\gamma _i(t)} \rho _t (\theta ) {\textrm{d}}\theta , \end{aligned}$$
(1.27)

with the convention that \(\gamma _N\) is the right spectral edge for \(t < 4\) and \(\pi \) for \(t \ge 4\). We have the following.

Corollary 1.5

Let \(\delta >0\) and \(\varepsilon >0\). The following holds with overwhelming probability uniformly for all t satisfying \(\delta< t < 4- \delta \) and all \( 1 \le i \le N\). We have,

$$\begin{aligned} | \theta _i (t) - \gamma _i (t) | \le N^{\varepsilon } \frac{1}{N^{2/3} \min \{ i^{1/3}, (N+1 - i )^{1/3} \} }. \end{aligned}$$
(1.28)

Theorem 1.4 is proven in Sect. 3. Corollary 1.5 follows in a straightforward manner from Corollary 1.3 and Theorem 1.4 and so we omit the proof (see, e.g., Section 3.3 of [30]).

Compared to the work [1], we encounter several new difficulties in our unitary setting in establishing these edge rigidity results. These mostly have to do with the fact that establishing the above results depends heavily on a detailed analysis of the behavior of characteristics near the spectral edge. This behavior depends especially on the distance along the unit circle of the characteristic from the locations \(\pm \Theta _t\) (the angular coordinate), as well as the distance from the unit circle. However, these coordinates introduce curvature, whereas in the real-line setting these are flat Cartesian coordinates given by the real and imaginary parts of the characteristic. This is further complicated by the fact that for short times t, the spectral measure is very peaked, and that for long times t, the characteristics will leave the small neighbourhoods of the spectral edges for which we can develop expansions of \({\tilde{f}}(z, t)\).

We overcome the short-time difficulties mainly by sacrificing optimal estimates for short times; i.e., for short times we consider only characteristics that are somewhat far from the spectral edge. To overcome the difficulties associated with curvature and long times, we use a second fact: the monotonicity in time of the radial coordinate of the characteristics. This monotonicity is lacking for the general potential processes considered in [1, 30], and is one of the sources of the short-time restrictions in those works (“shocks” can develop for these processes). Essentially, monotonicity of the characteristics allows us to split the paths into “short” and “long” time regimes. In the short-time regime, we can use a combination of analytic approaches to square-root measures and the semi-explicit formulas for \({\tilde{f}}(z, t)\) to control the behavior of characteristics close to spectral edges. In the long-time regime, the characteristics are far away from the spectrum, and so the relevant quantities can usually be bounded by constants.

We now turn our attention to times \(t \sim 4\). This is a critical time for unitary Brownian motion, as at \(t=4\), the two spectral edges \(\pm \Theta _t\) merge, and the support of the density of states becomes the entire unit circle for later times t. In fact, as we show in Appendix A, the spectral measure at \(t=4\) has a cusp singularity,

$$\begin{aligned} \rho _4 (\pi + E) = c |E|^{1/3} (1 + {\mathcal {O}}( |E|^{1/3} ) ), \end{aligned}$$
(1.29)

for some constant \(c>0\). Moreover, for times t near 4, the spectral measure undergoes a transition where the two edges gradually form a “near-cusp”, become a cusp, and then form a small local minimum as t ranges from slightly less than 4 to slightly larger than 4.

This behavior is identical to that found in the theory of the so-called quadratic vector equation. The quadratic vector equation is a generic equation characterizing the density of states of certain classes of mean-field self-adjoint random matrix models. In a series of works [3,4,5], Ajanki, Erdős and Krüger carried out a systematic study of the solutions to the quadratic vector equation. In particular, they found that the only possible singularities that may occur are cusps, where the density of states vanishes like a cube root, and square roots, occurring at either external or internal edges. Moreover, they characterized transitional regimes where intervals of the density of states merge or split. In such regimes, they showed that the leading order of the density of states is always given by universal shape functions arising from Cardano’s formula for the roots of third-degree polynomials. There are two such functions: the first, \(\Psi _e\), corresponds to the case when two separate intervals merge and describes a transition from square-root to cubic behavior; the second, \(\Psi _m\), describes what occurs after the cusp forms, when the density of states has a small minimum.

In fact, in Appendix A we show that for times \(t <4\) the density of states \(\rho _t\) of unitary Brownian motion is described by \(\Psi _e\), and for times \(t > 4\) by \(\Psi _m\), the universal shape functions of [3]. On the one hand, this is remarkable, as there is no quadratic equation describing the density of states of unitary Brownian motion and, moreover, the eigenvalues lie on the unit circle instead of the real line. On the other hand, the universal shape functions arise from expansions of the Cauchy or Stieltjes transform near critical points (i.e., the spectral edges or minima) as soon as one is guaranteed that the coefficient of the quadratic or cubic term is non-degenerate, and so this behavior is somewhat expected.

Theorems 1.2 and 1.4 above do not capture the behavior of the extremal eigenvalues in the case of the near-cusp, when times t are very close to 4. Using the formula for \(\Theta _t\) above, we have for \(t < 4\) that the gap between the two spectral edges scales like,

$$\begin{aligned} \Delta _t:= 2 ( \pi - \Theta _t ) = \frac{1}{3} (4 -t )^{3/2} (1 + {\mathcal {O}}(4-t) ). \end{aligned}$$
(1.30)

By our calculations of \(\rho _t\) and the asymptotics of the shape function \(\Psi _e\), we have that in a vicinity of the edge the density of states behaves like a re-scaled square root,

$$\begin{aligned} \rho _t ( \Theta _t - E) \asymp \frac{ E^{1/2}}{ (4-t)^{1/4} } \asymp \frac{E^{1/2}}{\Delta _t^{1/6}}, \qquad 0 \le E \le \Delta _t. \end{aligned}$$
(1.31)

From the behavior of \(\rho _t\) it follows that the natural fluctuation scale of the extremal eigenvalues is \(\Delta _t^{1/9} N^{-2/3}\). This is of the same order of magnitude as \(\Delta _t\) when \(4-t = N^{-1/2}\). It follows that for \(t \ll 4 - N^{-1/2}\) one expects that the extremal eigenvalues are still located near their respective edges. For larger t, the fluctuations of the extremal eigenvalues are larger than \(\Delta _t\), and so no such rigidity estimate is expected. The first statement is the content of the following theorem.
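The scale \(\Delta _t^{1/9} N^{-2/3}\) follows from the usual eigenvalue counting heuristic; a sketch, using only the square-root profile (1.31): the scale \(\eta \) at which an order-one number of eigenvalues lies within distance \(\eta \) of the edge satisfies

$$\begin{aligned} 1 \asymp N \int _0^{\eta } \rho _t ( \Theta _t - E) {\textrm{d}}E \asymp N \frac{ \eta ^{3/2}}{ \Delta _t^{1/6}}, \qquad \text{ i.e., } \qquad \eta \asymp \Delta _t^{1/9} N^{-2/3}. \end{aligned}$$

Setting \(\eta \asymp \Delta _t\) gives \(\Delta _t^{8/9} \asymp N^{-2/3}\), i.e., \(\Delta _t \asymp N^{-3/4}\), which by (1.30) corresponds to \(4-t \asymp N^{-1/2}\).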

Theorem 1.6

Let \(\delta >0\) and let \(\varepsilon >0\). With overwhelming probability the following holds. Uniformly for all t satisfying \(2< t < 4 - N^{-1/2+\delta }\) we have that,

$$\begin{aligned} \left| \left\{ i: \lambda _i (t) = {\textrm{e}}^{ {\textrm{i}}\theta }, \theta \in [-\pi , \pi ] \backslash [ - \Theta _t - \Delta _t^{1/9} N^{-2/3+\varepsilon }, \Theta _t + \Delta _t^{1/9} N^{-2/3+\varepsilon } ] \right\} \right| = 0. \end{aligned}$$
(1.32)

The above theorem is proven in Sect. 4. In principle, its proof could be absorbed into the proof of Theorem 1.4. However, handling the three separate scaling regimes, when t is small, of intermediate size, and close to 4, would require significant additional notation and burden the reader by overly complicating the proofs. We have chosen instead to treat the “short” and “long” time regimes separately; in fact we will use the result of Theorem 1.4 in our proof of Theorem 1.6, initializing the dynamics at an intermediate time \(0 \ll t \ll 4\), conditional on the results of Theorem 1.4 holding.

Local laws and rigidity for random matrices exhibiting cusps were established in [20] using resolvent methods. The work [16] also establishes rigidity results for certain interpolating ensembles using a dynamical, PDE-based approach unrelated to ours, although both works study the formation of cusps under eigenvalue dynamics.

The main obstacle in proving the above theorem is to understand how the cusp scaling affects the behavior of the characteristics. In particular, we must understand how the angular and radial coordinates of the characteristics are affected by this new scaling. Luckily, in the regime where we expect to prove edge rigidity, there is still a small interval where the density of states behaves like a square root, albeit rescaled by a factor involving \(\Delta _t\). The characteristics relevant to edge rigidity start close to the spectral edge, and some of our calculations of square-root behavior in the earlier short-time regime of Theorem 1.4 are applicable here, after finding appropriate re-scalings by \(\Delta _t\) of the angular and radial characteristic coordinates. Nonetheless, we still need to handle the behavior of the characteristics for times of order 1, and so more analysis of the characteristics is required. This is further complicated by the curvature of our coordinate system as well as the fact that the scaling factor \(\Delta _t\) is itself time-dependent and will in general differ by several orders of magnitude over the time intervals we consider.

We can use Theorem 1.6 together with Corollary 1.3 to deduce the following rigidity estimates, in a similar manner to Corollary 1.5. We omit the proof.

Corollary 1.7

Let \(\delta >0\) and \(\varepsilon >0\). For \(N^{-1/2+\delta } \le 4-t \le 10^{-1}\) we have that the following estimates hold with overwhelming probability. Uniformly for all i satisfying \(1 \le i \le N (4-t)^2\) we have,

$$\begin{aligned} | \theta _i (t) - \gamma _i (t) | \le N^{\varepsilon } (4-t)^{1/6} \frac{1}{N^{2/3} i^{1/3} } \end{aligned}$$
(1.33)

and for \(N (4-t)^2 \le i \le N/2\) we have,

$$\begin{aligned} | \theta _i (t) - \gamma _i (t) | \le N^{\varepsilon } \frac{1}{N^{3/4} i^{1/4}}. \end{aligned}$$
(1.34)

Analogous estimates hold for indices i near N.

Remark

The reason for the two regimes of indices, less than or greater than \(N(4-t)^2\), is the behavior of the limiting spectral measure near \(-1 = {\textrm{e}}^{ {\textrm{i}}\pi }\) for times t close to 4. With \(s = 4-t\), we have that \(\rho _t (\Theta _t -E) \asymp E^{1/2} /s^{1/4}\) for \(E \le s^{3/2}\) and \(\rho _t(\Theta _t - E) \asymp E^{1/3}\) for \(0.1 \ge E \ge s^{3/2}\) (see Proposition A.5 and (A.28)). The forms of the RHS of the estimates (1.33) and (1.34) reflect the different interparticle distances in each of these regimes. \(\square \)
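A sketch of how the two estimates reflect these interparticle distances: writing \(i/N \asymp \int _0^{E_i} \rho _t ( \Theta _t - E) {\textrm{d}}E\) for the distance \(E_i\) of the i-th quantile from the edge, the square-root profile gives, for \(i \lesssim N s^2\),

$$\begin{aligned} \frac{i}{N} \asymp \frac{E_i^{3/2}}{s^{1/4}}, \qquad E_i \asymp \left( \frac{i}{N} \right) ^{2/3} s^{1/6}, \qquad E_{i+1} - E_i \asymp \frac{ s^{1/6}}{ N^{2/3} i^{1/3}}, \end{aligned}$$

matching the RHS of (1.33), while the cube-root profile gives, for \(N s^2 \lesssim i \le N/2\),

$$\begin{aligned} \frac{i}{N} \asymp E_i^{4/3}, \qquad E_i \asymp \left( \frac{i}{N} \right) ^{3/4}, \qquad E_{i+1} - E_i \asymp \frac{1}{ N^{3/4} i^{1/4}}, \end{aligned}$$

matching (1.34). The crossover \(E_i \asymp s^{3/2}\) indeed occurs at \(i \asymp N s^2\).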

For larger times \(t \gtrsim 4 - N^{-1/2}\), the optimal estimates for the Cauchy transform are in fact included in Theorem 1.2; compare with, e.g., the local laws of [20]. Note that Theorem 1.2 alone is insufficient to conclude rigidity estimates similar to Corollaries 1.5 or 1.7. In particular, the above results cannot rule out the possibility that after time \(t \sim 4\) all of the eigenvalues wrap around the unit circle many times.

However, because \(\theta _N (t)\) and \(\theta _1 (t)\) cannot cross, the “winding number” (we use this term loosely) can be determined from the behavior of the center of mass,

$$\begin{aligned} {\bar{\theta }} (t):= \frac{1}{N} \sum _{i=1}^N \theta _i (t). \end{aligned}$$
(1.35)

From either (1.1) or (1.6) one can check that formally \( {\textrm{d}}{\bar{\theta }} = N^{-1} {\textrm{d}}B\) for a Brownian motion B. In Sect. 5 we justify this using a careful application of the analytic functional calculus.

Proposition 1.8

For any \(t >0\) we have almost surely that,

$$\begin{aligned} {\bar{\theta }} (t) = \frac{1}{N} {\textrm{tr}}(W_t) \end{aligned}$$
(1.36)

where \(W_t\) is the standard complex Hermitian Brownian motion in (1.1).

We expect that a version of the above statement could also be deduced from Lemma 5 of [42], but we provide a direct proof, aspects of which may be useful in other settings.
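Proposition 1.8 is consistent with the entrywise structure of (1.2): the skew part \({\textrm{i}}(X'_t - (X'_t)^T)\) has zero diagonal, so \({\textrm{tr}}(W_t) = N^{-1/2} \sum _i (X_t)_{ii}\) is a real Brownian motion of variance t, and hence \({\bar{\theta }}(t) = N^{-1} {\textrm{tr}}(W_t)\) has variance \(t/N^2\), matching \({\textrm{d}}{\bar{\theta }} = N^{-1} {\textrm{d}}B\). A minimal numerical sketch of our own at a single fixed time (the names and parameters are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
N, t = 50, 2.0

# Entries of X_t and X'_t are independent standard Brownian motions
# evaluated at time t, i.e. N(0, t) variables.
X = rng.normal(0.0, np.sqrt(t), size=(N, N))
Xp = rng.normal(0.0, np.sqrt(t), size=(N, N))

# Standard complex Hermitian Brownian motion, as in (1.2).
W = (X + X.T + 1j * (Xp - Xp.T)) / np.sqrt(4 * N)

assert np.allclose(W, W.conj().T)                          # W_t is Hermitian
assert np.allclose(np.trace(W), np.trace(X) / np.sqrt(N))  # tr(W_t) = sum_i X_ii / sqrt(N)
```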

This allows us to deduce the following. Let us define the extended quantiles \({\tilde{\gamma }}_i (t)\) of \(\rho _t\) as follows. For \(1 \le i \le N\) we let \({\tilde{\gamma }}_i (t) = \gamma _i (t)\). For \(i > N\) we let

$$\begin{aligned} {\tilde{\gamma }}_i (t) = 2 \pi + \gamma _{i-N} (t) \end{aligned}$$
(1.37)

and for \(i < 1\) we let

$$\begin{aligned} {\tilde{\gamma }}_i (t) = - 2 \pi + \gamma _{i+N} (t). \end{aligned}$$
(1.38)

Corollary 1.9

Let \(\varepsilon >0\) and \(\delta >0\). With overwhelming probability we have uniformly for all t satisfying \(\delta< t < \delta ^{-1}\) that,

$$\begin{aligned} {\tilde{\gamma }}_{i - N^{\varepsilon } } (t) \le \theta _i (t) \le {\tilde{\gamma }}_{i + N^{\varepsilon }} (t). \end{aligned}$$
(1.39)

Note that for the edge eigenvalues at times \(t < 4 - N^{-1/2+\varepsilon }\) this is weaker than Corollaries 1.5 and 1.7; it is only useful in the regime where we can no longer rule out the existence of eigenvalues in the gap between the spectral edges, or where there is no longer a gap. Corollary 1.9 is proven in Sect. 6.3.

1.3 Further discussion and motivation

In addition to being an intrinsic question about the nature of the process \(U_t\), these local law and rigidity estimates have been well studied in the context of Hermitian random matrix theory. There, a primary motivation is the study of the universality of the local eigenvalue statistics: that is, whether or not the limiting local eigenvalue statistics coincide with those of the Gaussian ensembles, which admit exact formulas.

The short-scale behaviors of the repulsive interaction terms of the eigenvalue processes of the Hermitian and unitary Brownian motions (1.5) and (1.6) are similar. Based on this, and the general belief in the universality of large correlated systems, it is natural to conjecture that the local eigenvalue statistics of (1.6) are given by the same statistics as the GUE in the limit \(N \rightarrow \infty \).

Indeed, there have been many developments in the universality theory of local eigenvalue statistics, both within the larger context of random matrix theory and in that of Hermitian Dyson Brownian motion started from general initial data. Note that if the initial data of (1.5) is not the 0 matrix, then the joint eigenvalue distribution of \(X_t\) is no longer that of a re-scaled GUE. Nonetheless, local scaling limits of DBM with general initial data have been obtained in [25, 32,33,34].

Universality has been established for wide classes of Hermitian random matrices. We refer the interested reader to, e.g., the book [26] for an overview of these developments as well as to the seminal papers of Tao and Vu [46, 47], and Erdős, Schlein and Yau [24].

Given these advances in the theory of Hermitian random matrices, it is natural to turn to the question of universality of unitary Brownian motion. Important tools in many of the proofs of Hermitian universality are the aforementioned rigidity and local law estimates.

The main contribution of the present work is to establish these results. It is then a subject of current investigation to use these estimates to prove the local universality of unitary Brownian motion: that the local eigenvalue statistics of \(U_t\) are given by the Tracy–Widom, Pearcey and Sine kernels in the various scaling regimes of interest.

1.4 Notation

The notion of overwhelming probability was defined above in Definition 1.1. We let \(c >0\) and \(C>0\) denote small and large constants, respectively. In general, we allow them to increase or decrease from line to line. For two positive N-dependent quantities \(a_N\) and \(b_N\) (or quantities depending on some auxiliary parameters, usually time t), the notation \(a_N \asymp b_N\) means that there is a constant \(C>0\) so that \(C^{-1} a_N \le b_N \le C a_N\). The notation \(a_N \ll b_N\) means \(a_N / b_N \rightarrow 0\) as \(N \rightarrow \infty \). We will use this notation sparingly, but somewhat informally. When used, we always have an explicit estimate, e.g., \(a_N \le b_N / \log (N)\). For complex \(c_N\) and \(d_N\), the notation \(c_N = {\mathcal {O}}(d_N)\) means \(|c_N| \le C | d_N|\) for some \(C>0\).

1.5 Organization of paper

Sections 2, 3 and 4 are meant to be read in a relatively linear fashion. These sections prove Theorems 1.2, 1.4 and 1.6, respectively. The analysis in each section directly builds off that of the previous section. Section 2 treats the “bulk” local law (i.e., a local law with an error that is optimal only in the bulk), and the treatment of the characteristics is relatively straightforward. Section 3 treats the case \(t < 4 - \delta \), where \(\rho _t\) has a regular square-root edge. Here, the treatment of the characteristics (and the resulting estimates of quantities such as the martingale term in the evolution equation of f(z, t)) is more complicated. Finally, Sect. 4 deals with the formation of the cusp.

In the short Sect. 5 we prove Proposition 1.8, that the center of mass, or averaged winding number, is described by a Brownian motion. In Sect. 6 we establish an analog of the Helffer–Sjöstrand formula and use it to deduce Corollary 1.3. The latter is very similar to what has appeared in [30], and so not all details are provided.

In Appendix A we establish various properties of the limiting spectral measure \(\rho _t\). We use as input its characterization in terms of conformal maps of Biane [11], as well as arguments of Ajanki, Erdős and Krüger [3] involving solutions of Cauchy/Stieltjes transforms of approximate cubic equations, Cardano’s formula and the universal shape functions \(\Psi _m\) and \(\Psi _e\). Finally, Appendix B collects various calculus-type inequalities used in the proof.

This is a shortened version, prepared for publication in PTRF, of the original manuscript. A longer version, containing all of the proofs omitted here, appears on the arXiv as arXiv:2202.06714v3 (i.e., the third version, v3) and is referenced in the current work as [2]. Whenever we omit a proof in the present manuscript, we reference the precise location of that proof in [2].

2 Bulk estimates

In this section we prove Theorem 1.2. Fix a final time \(T>0\). This time may depend on N, but stays bounded above. We introduce the characteristic maps via

$$\begin{aligned} {\mathcal {C}}_t (z) = z \exp \left[ - \frac{ (T-t) {\tilde{f}}(z, T) }{2} \right] . \end{aligned}$$
(2.1)

That is, the function \(t \rightarrow {\mathcal {C}}_t (z)\) satisfies the characteristic equation (1.15) and has final condition \({\mathcal {C}}_T (z) = z\).
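Both defining properties of (2.1) can be checked symbolically; a sketch of our own, in which the final-time value \({\tilde{f}}(z, T)\), which is constant along the characteristic, is frozen as a symbol w:

```python
import sympy as sp

z, w, t, T = sp.symbols('z w t T')

# The characteristic map (2.1) with f~(z, T) frozen as the constant w.
C = z * sp.exp(-(T - t) * w / 2)

# Final condition: C_T(z) = z.
assert sp.simplify(C.subs(t, T) - z) == 0

# Along the flow, dC_t/dt = (w/2) C_t: the characteristic ODE, with the
# (constant along characteristics) value of f~ in place of f~(z_t, t).
assert sp.simplify(sp.diff(C, t) - (w / 2) * C) == 0
```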

From the above, it is clear that the real parts of f(z, t) and \({\tilde{f}}(z, t)\) will play important roles. We record here the identity,

$$\begin{aligned} {\textrm{Re}}[ f(z, t) ] = \frac{1}{N} \sum _{i=1}^N \frac{1- |z|^2}{ | \lambda _i - z|^2} . \end{aligned}$$
(2.2)

We will also have use for,

$$\begin{aligned} \partial _z f (z, t) = \frac{2}{N} \sum _{i=1}^N \frac{ \lambda _i (t) }{ ( \lambda _i (t) - z)^2}, \end{aligned}$$
(2.3)

and the inequality

$$\begin{aligned} | \partial _z f(z, t) | \le \frac{2}{ 1- |z|^2} {\textrm{Re}}[ f (z, t) ]. \end{aligned}$$
(2.4)
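Identities (2.2) and (2.3) follow from \({\textrm{Re}} \frac{\lambda +z}{\lambda -z} = \frac{1-|z|^2}{|\lambda -z|^2}\) for \(|\lambda |=1\) and from differentiating the kernel in z; a quick numerical sketch of our own for a sampled empirical measure (the test points are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 20
lam = np.exp(1j * rng.uniform(-np.pi, np.pi, N))  # eigenvalues on the unit circle
z = 1.3 * np.exp(0.4j)                            # a test point with |z| > 1

f = np.mean((lam + z) / (lam - z))  # Cauchy transform f(z) of the empirical measure

# (2.2): Re f(z) = (1/N) sum_i (1 - |z|^2) / |lam_i - z|^2.
assert np.allclose(f.real, np.mean((1 - abs(z) ** 2) / abs(lam - z) ** 2))

# (2.3): d/dz f(z) = (2/N) sum_i lam_i / (lam_i - z)^2, checked by finite differences.
df = np.mean(2 * lam / (lam - z) ** 2)
h = 1e-7
assert abs(df - (np.mean((lam + z + h) / (lam - z - h)) - f) / h) < 1e-4

# (2.4): |d/dz f| <= 2/(1 - |z|^2) * Re f  (both sides are positive for |z| > 1).
assert abs(df) <= 2 / (1 - abs(z) ** 2) * f.real + 1e-12
```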

In the remainder of the section we will also make use of the spectral domains \({\mathcal {B}}_t\) that were defined above in (1.19). We collect some elementary properties of the characteristics.

Lemma 2.1

For any \(|z| >1\), the map,

$$\begin{aligned} t \rightarrow \log | {\mathcal {C}}_t (z) | \end{aligned}$$
(2.5)

is decreasing in time. Secondly, there is a constant \(C>0\) so that for any \(z \in {\mathcal {B}}_T\) we have,

$$\begin{aligned} | {\mathcal {C}}_t(z) | \le C{\textrm{e}}^{CT}. \end{aligned}$$
(2.6)

Proof

The first claim follows from the fact that \({\textrm{Re}}[ {\tilde{f}}(z, t) ] <0\) for \(|z| >1\). For the second, let \(z_t = {\mathcal {C}}_t (z)\). Note that \({\tilde{f}}(z_t, t) = {\tilde{f}}(z, T)\) for all t. If at any time t we have \(|z_t| > 2\), then \(| {\tilde{f}}(z_t, t) | \le 4\) and so \(| {\mathcal {C}}_t (z) | \le |z|{\textrm{e}}^{2 T}\). So either \(|z_t| < 2\) for all t or \(|z_t| \le |z|{\textrm{e}}^{2 T}\) for all t. This yields the claim. \(\square \)

We fix a final point \(z \in {\mathcal {B}}_T\). We will first prove that Theorem 1.2 holds at a single z by tracking the evolution of f(zt) along the characteristic,

$$\begin{aligned} z_t:= {\mathcal {C}}_t (z). \end{aligned}$$
(2.7)

The extension to all z and all \(0< t < T\) will be detailed later. We now introduce the stopping time \(\tau \) via,

$$\begin{aligned} \tau := \inf \left\{ s \in [0, T]: | f (z_s, s) - {\tilde{f}}(z_s, s) | > \frac{ N^{\varepsilon }}{N \log |z_s | } \right\} \wedge T \end{aligned}$$
(2.8)

where we choose \(\varepsilon < \delta /10\), with \(\delta \) as in the definition of \({\mathcal {B}}_T\). We first note that,

$$\begin{aligned} {\textrm{d}}\left( f (z_t, t) - {\tilde{f}}(z_t, t) \right)&= - \frac{ z_t \partial _z f(z_t, t) }{2} \left( f (z_t, t) - {\tilde{f}}(z_t, t) \right) {\textrm{d}}t + {\textrm{d}}M_t (z_t) , \end{aligned}$$
(2.9)

with the martingale term defined above in (1.12). Hence,

$$\begin{aligned} f ( z_\tau , \tau ) - {\tilde{f}}(z_\tau , \tau ) = {\mathcal {E}}_1 ( \tau ) + {\mathcal {E}}_2 ( \tau ) \end{aligned}$$
(2.10)

where

$$\begin{aligned} {\mathcal {E}}_1(t):= - \int _0^{t} \frac{ z_s \partial _z f (z_s, s) }{2} ( f (z_s, s) - {\tilde{f}}(z_s, s) ) {\textrm{d}}s \end{aligned}$$
(2.11)

and

$$\begin{aligned} {\mathcal {E}}_2 (t):= \int _0^{t} {\textrm{d}}M_s (z_s ). \end{aligned}$$
(2.12)

We first prove the following estimate on the martingale term.

Proposition 2.2

For all \(\varepsilon _1 >0\) we have,

$$\begin{aligned} {\mathbb {P}}\left[ \exists t \in [0, \tau ]: | {\mathcal {E}}_2 (t) | > \frac{ N^{\varepsilon _1}}{N \log |z_t | } \right] \le C \log (N) {\textrm{e}}^{ - c N^{\varepsilon _1}}. \end{aligned}$$
(2.13)

Proof

We fix a sequence of intermediate times \(t_k\) with \(k=1, \dots , M\) in [0, T] such that \(\log |z_{t_k} | \le 2 \log |z_{t_{k+1} } |\) and \(t_M = T\). Then \(M \le C \log (N)\) for some \(C>0\) since \(\log |z_0|\) is bounded by Lemma 2.1. Let \(\tau _k = \tau \wedge t_k\). The quadratic variation of \( {\mathcal {E}}_2 ( \tau _k )\) satisfies,

$$\begin{aligned}&\langle {\mathcal {E}}_2 ( \tau _k), {\bar{{\mathcal {E}}}}_2 ( \tau _k) \rangle = \frac{4}{N^2} \int _0^{\tau _k} |z_s|^2 \frac{1}{N} \sum _{i=1}^N \frac{1}{ | \lambda _i (s) - z_s |^4} {\textrm{d}}s \nonumber \\&\quad \le \frac{4}{N^2} \int _0^{ \tau _k } \frac{ |z_s|^2}{ (|z_s| -1)^2 } \frac{1}{N} \sum _{i=1}^N \frac{1}{ | \lambda _i (s) - z_s |^2} {\textrm{d}}s\nonumber \\&\quad \le \frac{C}{N^2} \int _0^{ \tau _k } \frac{ | {\textrm{Re}}[ f (z_s, s) ] |}{ ( \log |z_s | )^3} {\textrm{d}}s. \end{aligned}$$
(2.14)

In the first inequality we used the trivial estimate \(| \lambda _i (s) - z_s | \ge |z_s| -1\). In the last inequality we used Lemma 2.1 to bound \(|z_s|\), as well as the representation (2.2). We also used that \(\log (r) \le r-1\) for \(r>1\). For \(s < \tau \) we have

$$\begin{aligned} | {\textrm{Re}}[ f(z_s, s) ] - {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] | \le \frac{N^{\varepsilon }}{N \log |z_s| } \le \frac{ N^{\varepsilon }}{N \log |z_T|}. \end{aligned}$$
(2.15)

By definition of \({\mathcal {B}}_T\) we see that

$$\begin{aligned} \frac{ N^{\varepsilon }}{N \log |z_T|} \le N^{\varepsilon -\delta } | {\textrm{Re}}[ {\tilde{f}}(z_T, T) ] | = N^{\varepsilon -\delta } | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] | \end{aligned}$$
(2.16)

where we used that \({\tilde{f}}\) is constant along characteristics. Since \(\varepsilon < \delta \) we therefore see that,

$$\begin{aligned} | {\textrm{Re}}[ f (z_s, s) ] | \le (1 + N^{\varepsilon -\delta } ) | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] | \le \left( 1 + \frac{1}{ \log (N) } \right) | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] | \end{aligned}$$
(2.17)

for \(s < \tau \). Therefore,

$$\begin{aligned} \int _0^{\tau _k} \frac{ | {\textrm{Re}}[ f (z_s, s) ] |}{ ( \log |z_s | )^3} {\textrm{d}}s&\le 2 \int _0^{\tau _k} \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] |}{ ( \log |z_s | )^3} {\textrm{d}}s \le 2 \int _0^{t_k} \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] |}{ ( \log |z_s | )^3} {\textrm{d}}s \nonumber \\&\le \frac{2}{ (\log |z_{t_k} | )^2}. \end{aligned}$$
(2.18)

In the last inequality we used that \(\partial _u \log |z_u | = - \frac{1}{2}| {\textrm{Re}}[{\tilde{f}}(z_u, u) ] |\). By the Burkholder-Davis-Gundy (BDG) inequality (see, e.g., [44]) and a union bound, we conclude that,

$$\begin{aligned} {\mathbb {P}}\left[ \exists k: \sup _{0 \le t \le \tau _k } | {\mathcal {E}}_2 ( t) | > \frac{ N^{\varepsilon _1}}{ 2 N \log |z_{t_k } | } \right] \le C \log (N) {\textrm{e}}^{ - c N^{\varepsilon _1} }. \end{aligned}$$
(2.19)

Let \(0< t < \tau \) and let k be such that \(\tau _k \le t < \tau _{k+1}\). On the complement of the event on the LHS of (2.19) we have,

$$\begin{aligned} | {\mathcal {E}}_2 (t) | \le \frac{N^{\varepsilon _1}}{2 N \log |z_{t_k } | } \le \frac{ N^{\varepsilon _1}}{ N \log |z_t | } \end{aligned}$$
(2.20)

by the choice of the \(t_k\)’s. This completes the proof. \(\square \)

Proof of Theorem 1.2

Let \(z \in {\mathcal {B}}_T\) with characteristic \(z_t\) as above. Let \(\tau \) be the stopping time as defined above. Let \(\Upsilon \) be the complement of the event estimated in Proposition 2.2, and define,

$$\begin{aligned} g(s):= \frac{ |z_s \partial _z f (z_s, s) |}{2}. \end{aligned}$$
(2.21)

By Gronwall’s inequality we have for all \(0<t < \tau \) on the event \(\Upsilon \),

$$\begin{aligned} | f (z_t, t) - {\tilde{f}}(z_t, t) | \le \int _0^t g (s) \exp \left[ \int _s^t g(u) {\textrm{d}}u \right] \frac{ N^{\varepsilon _1}}{N \log |z_s| } {\textrm{d}}s + \frac{N^{\varepsilon _1}}{N \log |z_t| }. \end{aligned}$$
(2.22)

We now estimate the integral of the function g that appears above in the argument of the exponential function. Using first (2.4) we have,

$$\begin{aligned} \int _s^t g(u) {\textrm{d}}u&= \int _s^t \frac{ |z_u | | \partial _z f(z_u, u) |}{2} {\textrm{d}}u \le \int _s^t \frac{ |z_u| | {\textrm{Re}}[ f(z_u, u) ] | }{ |z_u|^2 -1 } {\textrm{d}}u \nonumber \\&\le \int _s^t \frac{ | {\textrm{Re}}[ f (z_u, u) ] |}{ 2 \log |z_u | } {\textrm{d}}u. \end{aligned}$$
(2.23)

In the last inequality we used the elementary inequality,

$$\begin{aligned} \frac{x}{x^2-1} \le \frac{1}{ 2 \log (x) } \end{aligned}$$
(2.24)

which holds for \(x >1\) (see Lemma B.1). Using now (2.17) we have,

$$\begin{aligned} \int _s^t \frac{ | {\textrm{Re}}[ f (z_u, u) ] |}{ 2 \log |z_u | } {\textrm{d}}u \le&\left( 1+ \frac{1}{ \log (N) } \right) \int _s^t \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_u, u) ]|}{2 \log |z_u | } {\textrm{d}}u \nonumber \\ =&\left( 1 + \frac{1}{ \log (N) } \right) \log \left( \frac{ \log |z_s|}{ \log |z_t| } \right) \end{aligned}$$
(2.25)

where we used in the last line that \(\partial _u \log |z_u | = - \frac{1}{2}| {\textrm{Re}}[ {\tilde{f}}(z_u, u) ] |\). Therefore,

$$\begin{aligned} \exp \left[ \int _s^t g(u) {\textrm{d}}u \right] \le C \frac{ \log |z_s|}{ \log |z_t| }. \end{aligned}$$
(2.26)

We used the fact that the assumption \( C\ge \log |z_u| \ge N^{-{\mathfrak {c}}}\) from the definition of \({\mathcal {B}}_T\) implies that

$$\begin{aligned} \left( \frac{ \log |z_s|}{ \log |z_t| } \right) ^{1/\log (N) } \le C. \end{aligned}$$
(2.27)

Substituting (2.26) into (2.22) yields,

$$\begin{aligned} |f (z_t, t) - {\tilde{f}}(z_t, t) | \le \frac{ C N^{\varepsilon _1}}{N \log |z_t|} \int _0^t g(s) {\textrm{d}}s + \frac{ N^{\varepsilon _1}}{N \log |z_t| }. \end{aligned}$$
(2.28)

From the definition of g and (2.4) and (2.17) we see that,

$$\begin{aligned} g(s) \le C \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] |}{ \log |z_s |} \end{aligned}$$
(2.29)

for \(s < \tau \). Therefore,

$$\begin{aligned} \int _0^t g(s) {\textrm{d}}s \le C \log (N) \end{aligned}$$
(2.30)

using that \(\log |z_t| \ge N^{-{\mathfrak {c}}}\). We conclude that for all \( 0< t < \tau \),

$$\begin{aligned} |f (z_t, t) - {\tilde{f}}(z_t, t) | \le C \frac{ \log (N) N^{\varepsilon _1}}{N \log |z_t|}. \end{aligned}$$
(2.31)

Taking \(\varepsilon _1 < \varepsilon /2\), where \(\varepsilon >0\) is as in the definition of the stopping time, we see that on the event \(\Upsilon \) we must have \(\tau = T\) for all N large enough. This proves the estimate of Theorem 1.2 at the final time T and at the point z. The extension to all points \(z \in {\mathcal {B}}_T\) may be done by first taking a union bound over \({\mathcal {O}}(N^C)\) points and then using

$$\begin{aligned} | f_\mu (z) - f_\mu (w) | \le C|z-w| \left( \frac{1}{ \log |z| } + \frac{1}{ \log |w|} \right) ^2 \end{aligned}$$
(2.32)

valid for any Cauchy transform and |z|, |w| in bounded regions of \({\mathbb {C}}\). The above estimate is elementary and can be proven directly from the definition

$$\begin{aligned} f_\mu (z):= \int _0^{2 \pi } \frac{ {\textrm{e}}^{ {\textrm{i}}\theta } + z}{ {\textrm{e}}^{ {\textrm{i}}\theta }- z }{\textrm{d}}\mu ( \theta ) \end{aligned}$$

and the fact that for \(|z| >1\) the denominator satisfies \(|{\textrm{e}}^{ {\textrm{i}}\theta } -z | \ge |z| -1 \ge c \log |z|\).
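The elementary inequalities used in this proof, (2.24) and \(x - 1 \ge \log x\) for \(x>1\), admit quick numeric spot-checks (a sketch of our own, not a substitute for Lemma B.1):

```python
import numpy as np

x = 1.0 + np.geomspace(1e-3, 10.0, 1000)  # sample points x > 1

# (2.24): x/(x^2 - 1) <= 1/(2 log x), i.e. 2 x log x <= x^2 - 1;
# h(x) = x^2 - 1 - 2 x log x has h(1) = h'(1) = 0 and h''(x) = 2 - 2/x > 0.
assert np.all(2 * x * np.log(x) <= x ** 2 - 1)

# x - 1 >= log x, used to bound the denominator |e^{i theta} - z| from below.
assert np.all(x - 1 >= np.log(x))
```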

The extension to all times \(t < T\) is done in a similar manner; one first proves the estimate for all \(z \in {\mathcal {B}}_t\) for all times t in a well-spaced grid of [0, T] of size at most \({\mathcal {O}}(N^A)\), for some \(A>0\) to be determined. Then one notes that the proof also gives that the estimate holds along the entire characteristic, and along each characteristic we have, e.g., \(|z_t - z_s | \le C N^{C} |t-s|\), for some \(C>0\) depending on \({\mathfrak {c}}>0\). We need only take A large enough depending on \(C >0\). This completes the proof of Theorem 1.2. \(\square \)

3 Edge estimates

Fix \(T < 4\). We will need to consider characteristics ending at many times and so we introduce the characteristic map,

$$\begin{aligned} {\mathcal {C}}_{s, t} (z) = z \exp \left[ - \frac{ (t-s) {\tilde{f}}(z, t) }{2} \right] \end{aligned}$$
(3.1)

for \(0 \le s \le t\). We will need to establish several properties of the behavior of characteristics near the edge \(\Theta _t\). The proof of the following lemma is an exercise in calculus and can be found in Appendix B of [2].

Lemma 3.1

Let \(\rho \) be a measure on \([-\pi , \pi ]\) such that \(\rho ( \theta ) = \rho (-\theta )\), supported in \([-E, E]\) for some \(0< E < \pi \). Assume either that \(\rho ( \theta ) \le M\) or that \(E < \pi /8\). For any \(\varepsilon >0\) there is a \(c_\varepsilon >0\), depending only on E, M and \(\varepsilon \), so that for \(0< r-1 < c_\varepsilon \) and \(E< \theta < \pi - \varepsilon \) we have,

$$\begin{aligned} {\textrm{Im}}[f ( r {\textrm{e}}^{ {\textrm{i}}\theta } ) ]< {\textrm{Im}}[ f ( {\textrm{e}}^{ {\textrm{i}}\theta } ) ] < {\textrm{Im}}[ f ( {\textrm{e}}^{ {\textrm{i}}E } ) ] \end{aligned}$$
(3.2)

where f(z) is the Cauchy transform of \(\rho \). We also have for \(E < \theta \le \varphi \le \pi \),

$$\begin{aligned} 0 \le {\textrm{Im}}[ f ( {\textrm{e}}^{ {\textrm{i}}\varphi } ) ] \le {\textrm{Im}}[ f ( {\textrm{e}}^{ {\textrm{i}}\theta } ) ], \end{aligned}$$
(3.3)

with equality occurring only in trivial cases.

We now prove the following.

Lemma 3.2

Let \(T<4\) and \( z= r {\textrm{e}}^{ {\textrm{i}}(\Theta _t + \kappa )}\) for \(\kappa >0\). There are small constants \(c>0\) and \(d>0\) so that,

$$\begin{aligned} {\textrm{Im}}[ {\tilde{f}}(z, t) ] - {\textrm{Im}}[ {\tilde{f}}({\textrm{e}}^{ {\textrm{i}}\Theta _t }, t ) ] \le - d \sqrt{ \kappa } \end{aligned}$$
(3.4)

for \(0< t < T\) and all \(0< r-1 < c\) and \(\kappa < \pi - \Theta _t\).

Proof

By Lemma B.3, the desired estimate holds for all \(\kappa < \varepsilon \) and \(0< r-1 < \varepsilon \), for some \(\varepsilon >0\). In particular,

$$\begin{aligned} {\textrm{Im}}[ {\tilde{f}}({\textrm{e}}^{ {\textrm{i}}( \Theta _t + \varepsilon /2)}, t) ] - {\textrm{Im}}[ {\tilde{f}}({\textrm{e}}^{ {\textrm{i}}\Theta _t }, t ) ] \le - c_1 \end{aligned}$$
(3.5)

for some \(c_1>0\). Since for \(t < T\) the measures are supported away from \(\pi \) we see that there is a \(\delta >0\) so that

$$\begin{aligned} | {\textrm{Im}}[ {\tilde{f}}( z, t) ] - {\textrm{Im}}[ {\tilde{f}}(-1, t ) ] | \le \frac{c_1}{2}, \end{aligned}$$
(3.6)

for \(|z+1| < \delta \). It follows that for all \(|z+1| < \delta \),

$$\begin{aligned} {\textrm{Im}}[ {\tilde{f}}(z, t) ] - {\textrm{Im}}[ {\tilde{f}}({\textrm{e}}^{ {\textrm{i}}\Theta _t}, t ) ] \le \frac{c_1}{2} + {\textrm{Im}}[ {\tilde{f}}({\textrm{e}}^{ {\textrm{i}}( \Theta _t + \varepsilon /2)}, t) ] - {\textrm{Im}}[ {\tilde{f}}({\textrm{e}}^{ {\textrm{i}}\Theta _t}, t ) ] \le - \frac{c_1}{2} \end{aligned}$$
(3.7)

where we used the second estimate of Lemma 3.1. This proves the desired estimate for \(|z+1| < \delta \) after possibly decreasing the value of \(d>0\).

On the other hand, by Lemma 3.1 we conclude that there is a \(c_2 >0\) so that for \(0< r-1 < c_2\) and \(\varepsilon /2< \kappa < \pi - \Theta _t - \delta /2\) that,

$$\begin{aligned} {\textrm{Im}}[ {\tilde{f}}(r {\textrm{e}}^{ {\textrm{i}}( \Theta _t + \kappa )}, t) ] - {\textrm{Im}}[ {\tilde{f}}({\textrm{e}}^{ {\textrm{i}}\Theta _t}, t ) ] \le {\textrm{Im}}[ {\tilde{f}}( {\textrm{e}}^{ {\textrm{i}}( \Theta _t + \varepsilon /2)}, t) ] - {\textrm{Im}}[ {\tilde{f}}({\textrm{e}}^{ {\textrm{i}}\Theta _t}, t ) ] \le -c_1. \end{aligned}$$
(3.8)

This concludes the proof. \(\square \)

The following contains the properties of the characteristics that we will need.

Proposition 3.3

Let \(T <4\). There are \({\mathfrak {a}}>0\) and \({\mathfrak {b}}>0\), depending on T, so that the following holds. Let \(z_s = {\mathcal {C}}_{s, t} ( z)\) for any \(t< T\), where \(z = r {\textrm{e}}^{ {\textrm{i}}\theta }\) satisfies \(0< r-1 < {\mathfrak {a}}\) and \( \Theta _t< \theta < \pi \). Denote \(z_s = r_s {\textrm{e}}^{ {\textrm{i}}(\Theta _s + \kappa _s )}\). Let \(s_*\) be,

$$\begin{aligned} s_* = \inf \{ s< t: (r_s-1) < {\mathfrak {a}}\}. \end{aligned}$$
(3.9)

Then for \(s_*< s < t\) we have,

$$\begin{aligned} \sqrt{ \kappa _s} \ge \sqrt{ \kappa _t} + {\mathfrak {b}}(t-s) \end{aligned}$$
(3.10)

and for \(s \le s_*\) we have \(r_s \ge 1 + {\mathfrak {a}}\). Furthermore, let D(s) be a function that obeys,

$$\begin{aligned} \sqrt{ D(s)} \le \sqrt{D(t)} + \frac{{\mathfrak {b}}}{2} (t-s), \qquad s < t. \end{aligned}$$
(3.11)

If \(\kappa _t \ge D (t)\) then for \(s_*< s < t\) we have,

$$\begin{aligned} \kappa _s \ge D(s) +\frac{{\mathfrak {b}}}{2} \sqrt{ \kappa _t} (t-s). \end{aligned}$$
(3.12)

Finally, characteristics do not cross the real axis in the complex plane.

Proof

We take \({\mathfrak {a}}\) so small that the conclusion of Lemma 3.2 holds for \(z = r {\textrm{e}}^{ {\textrm{i}}\theta }\) with \(r-1 < 10 {\mathfrak {a}}\). For \(s \le s_*\) we have \(r_s \ge 1 + {\mathfrak {a}}\) because the radial coordinate is decreasing in time. Let \(s_1\) be,

$$\begin{aligned} s_1 = \inf \{ s < t: \kappa _s > 0 \}. \end{aligned}$$
(3.13)

Note that \(s_1 < t\) as we assume \(\kappa _t >0\). We claim that \(s_1 \le s_*\). For \(t> s> s_1 \vee s_*\) we have the following calculation,

$$\begin{aligned} \partial _s \kappa _s&= \frac{1}{2}{\textrm{Im}}[{\tilde{f}}(z_s,s)] - \frac{1}{2}{\textrm{Im}}[{\tilde{f}}({\textrm{e}}^{ {\textrm{i}}\Theta _s},s)] \nonumber \\&= \frac{1}{2}{\textrm{Im}}[{\tilde{f}}(z_t,t)] - \frac{1}{2}{\textrm{Im}}[{\tilde{f}}({\textrm{e}}^{ {\textrm{i}}\Theta _t},t)] - \frac{1}{2}\left( \sqrt{\frac{4 -s}{s}} - \sqrt{\frac{4-t}{t}} \right) \nonumber \\&\le \frac{1}{2}\left( {\textrm{Im}}[{\tilde{f}}(z_t,t)] - {\textrm{Im}}[{\tilde{f}}({\textrm{e}}^{ {\textrm{i}}\Theta _t},t)] \right) - c(t-s) \nonumber \\&\le -d\sqrt{\kappa _t} -c(t-s) \end{aligned}$$
(3.14)

The second line follows from the fact that \({\tilde{f}}({\textrm{e}}^{ {\textrm{i}}\Theta _t}, t) = {\textrm{i}}\sqrt{ 4t^{-1} -1 }\) for all t and that \({\tilde{f}}\) is constant along characteristics. The third line is straightforward. The last line follows from Lemma 3.2. In particular, we see that \(\kappa _s\) is a decreasing function of s for \(s > s_* \vee s_1\). Since \(\kappa _t >0\), it follows that \(s_1 \le s_*\), and so the final inequality in (3.14) holds for all \(s_*< s < t\). Integrating, we obtain,

$$\begin{aligned} \kappa _s \ge \kappa _t + 2c_2 (t-s) \sqrt{ \kappa _t} + c_2 (t-s)^2 \end{aligned}$$
(3.15)

for some small \(c_2 >0\) for all \(s_*< s < t\). This is equivalent to the desired inequality. For the last inequality of the proposition, we have that

$$\begin{aligned} \sqrt{\kappa _s} \ge \sqrt{\kappa _t} + {\mathfrak {b}}(t-s) \ge \sqrt{D(t)} + {\mathfrak {b}}(t-s) \ge \sqrt{D(s)} + \frac{{\mathfrak {b}}}{2}(t-s). \end{aligned}$$
(3.16)

We can square this to get \(\kappa _s \ge D(s) + {\mathfrak {b}}(t-s) \sqrt{D(s)}\) as well as \(\kappa _s \ge \kappa _t + 2 {\mathfrak {b}}(t-s) \sqrt{\kappa _t}\). If \(D(s) \ge \kappa _t\), we can use the first inequality. Otherwise, we can use the second.

The final claim, that characteristics do not cross the real axis, follows from the fact that the imaginary part of \({\tilde{f}}(z, t)\) vanishes for purely real z, due to the symmetry of \(\rho _t\). \(\square \)

The real part of the Cauchy transform f(z, t) can be used to detect the presence of outlying eigenvalues. In order to use it, we first need the following estimates on the behavior of \({\tilde{f}}(z, t)\). The proof is for the most part elementary and is deferred to Appendix B of the longer version [2].

Lemma 3.4

Let \(\delta >0\). Let \(z = (1+\eta ) {\textrm{e}}^{ {\textrm{i}}(\Theta _t + \kappa ) }\). Uniformly in the region,

$$\begin{aligned} 0< \eta< 5, \qquad 0< \kappa < \pi - \Theta _t \end{aligned}$$
(3.17)

we have for \(\delta< t < 4 - \delta \) that,

$$\begin{aligned} c \frac{ \eta }{ \sqrt{ \kappa +\eta } } \le | {\textrm{Re}}[ {\tilde{f}}(z, t) ] | \le C \frac{ \eta }{ \sqrt{ \kappa +\eta } } \end{aligned}$$
(3.18)

for some \(c, C>0\). For \( 0< t < \frac{1}{2}\) we also have,

$$\begin{aligned} \eta c \le | {\textrm{Re}}[ {\tilde{f}}(z,t) ] | \le C \frac{ \eta }{ \kappa ^2}. \end{aligned}$$
(3.19)

As advertised, the following lemma allows us to use estimates for the Cauchy transform outside the spectrum to rule out the presence of outlying eigenvalues.

Lemma 3.5

Let \(\delta >0\) and assume \(\delta< t < 4 - \delta \). Let \(\varepsilon >0\) and \(0 \le k \le \frac{2}{3}\). Suppose that the estimate,

$$\begin{aligned} \left| f(z, t) - {\tilde{f}}(z, t) \right| \le \frac{N^{\varepsilon }}{N \sqrt{ \eta + \kappa } \sqrt{ \eta } } \end{aligned}$$
(3.20)

holds for all \( z = r {\textrm{e}}^{ {\textrm{i}}\theta }\) for any r and \(\theta \) satisfying,

$$\begin{aligned} \Theta _t + N^{-2/3+k + 5 \varepsilon } \le | \theta | \le \pi , \qquad N^{-2/3+k/ 4+ \varepsilon } \le r -1 \le c \end{aligned}$$
(3.21)

where \(c >0\) is any positive constant. Then there are no eigenvalues \(\lambda = {\textrm{e}}^{ {\textrm{i}}\theta }\) for \(\Theta _t + N^{-2/3+k + 5 \varepsilon } \le | \theta | \le \pi \). The same conclusion holds for \(t < \delta \) if we take \(k = 2/3 - 5 \varepsilon - \varepsilon /6\).

Proof

Let \(\lambda \) be an eigenvalue appearing in the empirical measure defining f(z, t). We will show that if z is of the form \(r {\textrm{e}}^{{\textrm{i}}\theta }\) with \(r>1\) and \(|\theta - \arg \lambda | < r-1\), then \({\textrm{Re}}[f(z,t)] \ge \frac{1}{N (r-1)}\). By rotational invariance along the unit circle, it suffices to consider \(\lambda = 1\).

We see by direct calculation that

$$\begin{aligned} {\textrm{Re}}\left[ \frac{z+1}{z-1} \right] = \frac{r^2-1}{(r-1)^2 + 2 r(1- \cos \theta )} \ge \frac{r^2-1}{(r-1)^2 + r \theta ^2} \ge \frac{r^2-1}{(r-1)^2(1+r)} = \frac{1}{r-1}. \end{aligned}$$
(3.22)

To get the second inequality, we used \(1- \cos \theta \le \frac{\theta ^2}{2}\). The third inequality comes from \(\theta < r-1\).

Thus, if z is of the form \(r {\textrm{e}}^{{\textrm{i}}\theta }\) with \(r>1\) and \(|\theta - \arg \lambda | < r-1\), then \({\textrm{Re}}[f(z,t)] \ge \frac{1}{N(r-1)}\) whenever \(\lambda \) is an eigenvalue appearing in the empirical measure defining f(z, t).

Now, consider the point \(z=(1+ N^{-2/3 + k/4+ \varepsilon } ) {\textrm{e}}^{{\textrm{i}}\theta }\), where \(\theta \) is an angle in the region \( |\theta | \in [\Theta _t + N^{-2/3 + k+ 5 \varepsilon }, \pi ]\).

Applying the triangle inequality and the estimates (3.20) and (3.18), we see that for large enough N,

$$\begin{aligned} |{\textrm{Re}}[f(z,t)] | \le | f(z,t) - {\tilde{f}}(z,t)| + |{\textrm{Re}}[{\tilde{f}}(z,t)] | < C \frac{\eta }{\sqrt{\kappa + \eta }} + \frac{N^\varepsilon }{N \sqrt{\eta } \sqrt{\kappa + \eta }} < \frac{1}{N \eta }. \end{aligned}$$
(3.23)

Indeed, for the final inequality, we use the hypotheses on \(\eta \) and \(\theta \) to estimate,

$$\begin{aligned} \frac{\eta }{\sqrt{\kappa + \eta }} \le N^{-1/3- k/4 -3 \varepsilon /2}, \qquad \frac{N^{\varepsilon }}{N \sqrt{\eta } \sqrt{\kappa + \eta }} \le N^{-1/3 -5k/8- 2 \varepsilon }, \qquad \frac{1}{N \eta } = N^{-1/3 -k/4 - \varepsilon }. \end{aligned}$$
(3.24)

This shows \(|{\textrm{Re}}[f(z,t)] | < \frac{1}{N \eta }\) and therefore \({\textrm{e}}^{{\textrm{i}}\theta }\) cannot be an eigenvalue of the empirical measure associated to f(z, t).
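For instance, the first exponent in (3.24) is obtained directly from the choices \(\eta = N^{-2/3+k/4+\varepsilon }\) and \(\kappa \ge N^{-2/3+k+5\varepsilon }\) in (3.21):

```latex
\frac{\eta}{\sqrt{\kappa+\eta}} \le \frac{\eta}{\sqrt{\kappa}}
  \le \frac{N^{-2/3+k/4+\varepsilon}}{N^{-1/3+k/2+5\varepsilon/2}}
  = N^{-1/3-k/4-3\varepsilon/2},
```

which is smaller than \(1/(N\eta ) = N^{-1/3-k/4-\varepsilon }\) by a factor \(N^{-\varepsilon /2}\); the middle exponent in (3.24) is checked in the same way.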

We now consider the case \(0< t < \delta \) and \(k = 2/3 - 5\varepsilon - \varepsilon /6\). We can instead apply the trivial bound \(| {\textrm{Re}}[{\tilde{f}}(z,t)] | \le C \frac{\eta }{\kappa ^2}\) from (3.19). With the same choice of \(\eta \) and \(\kappa \), we have

$$\begin{aligned} \frac{\eta }{\kappa ^2} \le N^{-1/2 + \varepsilon /24 }, \qquad \frac{N^{\varepsilon }}{N \sqrt{\eta } \sqrt{\kappa + \eta }} \le N^{-3/4 +59 \varepsilon /48}, \qquad \frac{1}{N \eta } = N^{-1/2 +7 \varepsilon /24}. \end{aligned}$$
(3.25)

Therefore, for \(z = r {\textrm{e}}^{ {\textrm{i}}\theta }\) as in the statement of the lemma we see that \(| {\textrm{Re}}[f (z, t) ] | < \frac{1}{N \eta }\). This completes the proof. \(\square \)

With the above results in hand, we can begin the proof of Theorem 1.4. Before doing so, we need to introduce further notation. Let \(T<4\); we consider times \(t < T\). Let \({\mathfrak {a}},{\mathfrak {b}}>0\) be the constants of Proposition 3.3. Let \(\delta >0\) and \(\varepsilon >0\), and assume \(\varepsilon < 10^{-6}\). These parameters are fixed until the end of the proof of Theorem 1.4.

Introduce D(t) to be the function

$$\begin{aligned} D(t) = N^{-\varepsilon /6}, \quad 0 \le t \le \delta , \qquad D(t) = \max \left\{ \left( N^{-\varepsilon /12} - \frac{{\mathfrak {b}}}{10}(t-\delta ) \right) ^2, N^{-2/3 + 5 \varepsilon } \right\} , \quad t > \delta . \end{aligned}$$
(3.26)

The function D(t) satisfies the hypotheses of Proposition 3.3. In fact, with this choice of D(t) we have for any choices of \( s< t\) that

$$\begin{aligned} \sqrt{ D(s) } \le \sqrt{ D(t) } + \frac{{\mathfrak {b}}}{10} (t-s). \end{aligned}$$
(3.27)
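This is a routine check of the definition (3.26): \(\sqrt{D}\) is constant on \([0,\delta ]\), decreases with slope \(-{\mathfrak {b}}/10\) on the first branch of the maximum, is constant on the second branch, and is continuous at \(t = \delta \) since \((N^{-\varepsilon /12})^2 = N^{-\varepsilon /6}\). Schematically, for \(s < t\),

```latex
\sqrt{D(s)} - \sqrt{D(t)}
  \le \int_s^t \left| \partial_u \sqrt{D(u)} \right| \mathrm{d}u
  \le \frac{\mathfrak{b}}{10}(t-s),
% where the piecewise derivative is either 0 or -\mathfrak{b}/10.
```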

Additionally, define k(t) to be the solution of \(N^{-2/3 + k(t) + 5 \varepsilon }= D(t)\). Define the spectral domains \({\mathcal {G}}_t\) by,

$$\begin{aligned} {\mathcal {G}}_t:= \{ z = r {\textrm{e}}^{ {\textrm{i}}\theta }: \Theta _t + N^{-2/3+k(t)+5 \varepsilon } \le | \theta | \le \pi , N^{-2/3+k(t)/4+ \varepsilon } \le r -1 \le {\mathfrak {a}}/2 \}.\nonumber \\ \end{aligned}$$
(3.28)

For any characteristic \(z_t = r {\textrm{e}}^{ {\textrm{i}}\theta }\) with \( r>1\) define \(\kappa (z_t) = |\theta | - \Theta _t\) and \(\eta (z_t) = r-1\). Consider the control parameter,

$$\begin{aligned} B(z_t):= \frac{1}{N \sqrt{ \kappa (z_t) +\eta (z_t) }\sqrt{\eta (z_t)} } {\varvec{1}}_{ \kappa (z_t) > 0 } + \frac{1}{N \eta (z_t) } {\varvec{1}}_{\kappa (z_t) < 0}. \end{aligned}$$
(3.29)

Lemma 3.6

Let \(0< t < T\) and let \(z_s = C_{s, t} (z)\) where \(z \in {\mathcal {G}}_t\). Then, for all \(0< s < t\) we have,

$$\begin{aligned} N^{\varepsilon } B(z_s) \le \frac{1}{ \log (N) } | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] |. \end{aligned}$$
(3.30)

Proof

Let \(s_*\) be as in Proposition 3.3. For \(s \le s_*\) we have that

$$\begin{aligned} N^{\varepsilon } B (z_s) \le C N^{\varepsilon -1}. \end{aligned}$$
(3.31)

On the other hand, by Lemma 3.4, \(| {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] | \ge c\) for some \(c>0\) for such \(z_s\). We now consider \(s_*< s < t\). By Proposition 3.3 it follows that,

$$\begin{aligned} \kappa (z_s ) \ge \kappa (z_t) \ge N^{-2/3+k(t)+5 \varepsilon } \end{aligned}$$
(3.32)

and \(\eta (z_s) \ge N^{-2/3+k(t)/4 + \varepsilon }\). First consider \(t\le \delta \). Then, \(k(t) = 2/3-5\varepsilon - \varepsilon /6\) and so

$$\begin{aligned} N^{\varepsilon } B( z_s) \le N^{-3/4+10 \varepsilon }. \end{aligned}$$
(3.33)

Moreover, \(|{\textrm{Re}}[ {\tilde{f}}(z_s, s) ]| \ge c \eta (z_s) \ge N^{-1/2-5 \varepsilon }\). This proves the estimate for such t. Consider now \(t > \delta \). In this case, the desired inequality can be rewritten as

$$\begin{aligned} \log (N) N^{\varepsilon } B(z_s) \le |{\textrm{Re}}[{\tilde{f}}(z_s, s) ] | = | {\textrm{Re}}[ {\tilde{f}}(z_t, t) ] |, \end{aligned}$$
(3.34)

using the fact that \({\tilde{f}}\) is constant along characteristics. Since \(B (z_s) \le B (z_t)\) for \(s > s_*\), this reduces to whether,

$$\begin{aligned} \eta (z_t)^{3/2} \ge \log (N) N^{\varepsilon -1}. \end{aligned}$$
(3.35)

But this holds since \(\eta (z_t) \ge N^{-2/3+ \varepsilon }\). \(\square \)
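The reduction to (3.35) in the proof above can be made explicit: with \(B(z_t) = (N \sqrt{ \kappa (z_t) + \eta (z_t)} \sqrt{\eta (z_t)})^{-1}\) and the lower bound in (3.18), the inequality (3.34) follows (up to constants) once

```latex
\frac{\log(N)\, N^{\varepsilon}}{N \sqrt{\kappa(z_t)+\eta(z_t)}\,\sqrt{\eta(z_t)}}
  \le \frac{c\, \eta(z_t)}{\sqrt{\kappa(z_t)+\eta(z_t)}}
\quad \Longleftrightarrow \quad
\eta(z_t)^{3/2} \ge c^{-1} \log(N)\, N^{\varepsilon-1}.
```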

We introduce a grid of times \(t_i = i TN^{-10}\) for \(i = 1, \dots , N^{10}\). For each such \(t_i\) we let \(\{ u^i_j \}_{j=1}^{N^{10}}\) be a well-spaced grid of \({\mathcal {G}}_{t_i}\). We then define the characteristics,

$$\begin{aligned} z^i_j (s):= {\mathcal {C}}_{s, t_i} (u^i_j ). \end{aligned}$$
(3.36)

We define each characteristic \(z^i_j(s)\) only for \(0< s < t_i\). We introduce the stopping times,

$$\begin{aligned} \tau _{ij}:= \inf \{ s \in (0, t_i ]: | f( z^i_j (s),s) - {\tilde{f}}(z^i_j (s), s) | > N^{\varepsilon /2} B(z^i_j (s) ) \} \end{aligned}$$
(3.37)

where the infimum of the empty set is \(+ \infty \). Then, we introduce

$$\begin{aligned} \tau := \min _{i,j} \tau _{ij} \wedge T. \end{aligned}$$
(3.38)

We want to prove that \(\tau =T\). First observe that since \(|{\tilde{f}}(z, t)| \le N\) for any \(z \in {\mathcal {G}}_t\), we have \(|z_{s_1} - z_{s_2} | \le C N |s_1-s_2|\) for any characteristic \(z_s\) ending in some \({\mathcal {G}}_t\). It follows that for each i and s satisfying \(t_i - TN^{-10} < s \le t_i\) that the points \(\{ z_j^i (s) \}_{j=1}^{N^{10}}\) are a well-spaced grid of \({\mathcal {G}}_s\) in the sense that every \(z \in {\mathcal {G}}_s\) is no further than \(N^{-8}\) away from the closest of these points. Since f(zt) and \({\tilde{f}}(z, t)\) are Lipschitz functions with Lipschitz constant less than CN for \(|z| > 1 + N^{-1}\) we see that for every \( s< \tau \) that,

$$\begin{aligned} | f (z, s) - {\tilde{f}}(z, s) | \le N^{\varepsilon } B(z) \end{aligned}$$
(3.39)

for any \(z \in {\mathcal {G}}_s\). From Lemma 3.5 we conclude that at any time \(s < \tau \) there are no eigenvalues of the form \(\lambda = {\textrm{e}}^{ {\textrm{i}}\theta }\) for \(\Theta _s + N^{-2/3+k(s)+5 \varepsilon } < | \theta | \le \pi \).

As in the proof of Theorem 1.2 we write, for any \( t < t_i \wedge \tau \),

$$\begin{aligned} f ( z^i_j (t), t ) - {\tilde{f}}( z^i_j (t), t ) = {\mathcal {E}}_1 ( t )^i_j + {\mathcal {E}}_2 (t)^i_j \end{aligned}$$
(3.40)

where

$$\begin{aligned} {\mathcal {E}}_1 (t)^i_j = - \frac{1}{2} \int _0^t z^i_j (s) (\partial _z f) ( z^i_j (s), s) ( f ( z^i_j (s), s) - {\tilde{f}}( z^i_j (s), s) ) {\textrm{d}}s \end{aligned}$$
(3.41)

and

$$\begin{aligned} {\mathcal {E}}_2(t)^i_j = \int _0^t {\textrm{d}}M_s ( z^i_j (s) ). \end{aligned}$$
(3.42)

We now prove the following estimate on the stochastic term.

Proposition 3.7

Let \(\varepsilon _1 >0\). Then for all ij as above,

$$\begin{aligned} {\mathbb {P}}\left[ \exists t \in (0, t_i \wedge \tau ): \left| {\mathcal {E}}_2 (t)^i_j \right| > N^{\varepsilon _1+\varepsilon /8} B ( z^i_j (t) ) \right] \le C \log (N) {\textrm{e}}^{ - c N^{\varepsilon _1} }. \end{aligned}$$
(3.43)

Proof

For simplicity of notation let us denote \(z_t:= z^i_j (t)\). We fix a sequence of intermediate times \(0< s_1< s_2< \cdots < s_M = t_i\) such that,

$$\begin{aligned} \frac{1}{2} B ( z_{s_k} ) \le B ( z_{s_{k+1} } ) \le 2 B ( z_{s_{k} } ), \end{aligned}$$
(3.44)

for every k. We can take \(M \le C \log (N)\). Recall also the time \(s_* < t_i\) defined in Proposition 3.3. Define also \(\kappa _t = \kappa ( z_t)\) and \(\eta _t = \eta (z_t)\). We calculate the quadratic variation,

$$\begin{aligned} \langle {\bar{{\mathcal {E}}}}_2, {\mathcal {E}}_2 \rangle ( s_k \wedge \tau )&\le \frac{C}{N^2} \int _0^{s_k \wedge \tau } \frac{1}{N} \sum _{n=1}^N \frac{1}{ | \lambda _n (t) - z_t |^4} {\textrm{d}}t \nonumber \\&\le \frac{C}{N^2} \int _{s_* \wedge \tau }^{s_k \wedge \tau } \frac{1}{N} \sum _{n=1}^N \frac{1}{ | \lambda _n (t) - z_t|^4} {\textrm{d}}t + \frac{C}{N^2} \end{aligned}$$
(3.45)

where the integral in the last line is interpreted as 0 if \(s_k \le s_*\). When \(s_k> s_*\) we continue to estimate the integral. For \(s_*< t < \tau \) we have,

$$\begin{aligned} | \lambda _n (t) - z_t |^2 \ge c \left( ( \kappa _t -N^{-2/3+k(t) + 5\varepsilon } )^2 + \eta _t^2 \right) \end{aligned}$$
(3.46)

Note that in particular, we also used that \(\kappa _t \ge N^{-2/3+k(t) + 5\varepsilon } = D(t)\) for \(s_*< t < s_k\) due to (3.12) and the fact that \(\kappa _{s_k} \ge D(s_k)\). Therefore,

$$\begin{aligned} \frac{1}{N^2} \int _{s_* \wedge \tau }^{s_k \wedge \tau } \frac{1}{N} \sum _{n=1}^N \frac{1}{ | \lambda _n - z_t|^4} {\textrm{d}}t&\le \frac{C}{N^2} \int _{s_* \wedge \tau }^{s_k \wedge \tau } \frac{ | {\textrm{Re}}[ f (z_t, t) ] |}{ \eta _t \left( ( \kappa _t -N^{-2/3+k(t) + 5\varepsilon } )^2 + \eta _t^2 \right) } {\textrm{d}}t \nonumber \\&\le \frac{C}{N^2} \int _{s_* \wedge \tau }^{s_k \wedge \tau } \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_t, t) ] |}{ \eta _t \left( ( \kappa _t -N^{-2/3+k(t) + 5\varepsilon } )^2 + \eta _t^2 \right) } {\textrm{d}}t , \end{aligned}$$
(3.47)

where in the second line we used that Lemma 3.6 together with the definition of the stopping time \(\tau \) implies that for \(t < \tau \) we have,

$$\begin{aligned} | {\textrm{Re}}[ f(z_t, t) ] | \le | {\textrm{Re}}[ {\tilde{f}}(z_t, t) ] | + N^{\varepsilon /2} B (z_t) \le 2 | {\textrm{Re}}[ {\tilde{f}}(z_t, t) ] |. \end{aligned}$$

We use now,

$$\begin{aligned} \kappa _t - N^{-2/3+k(t) + 5\varepsilon }&= \kappa _t - D(t) \nonumber \\&\ge \kappa _{s_k} + 2{\mathfrak {b}}\sqrt{ \kappa _{s_k}} (s_k - t) + {\mathfrak {b}}^2 (s_k - t)^2 - D(t) \nonumber \\&\ge \kappa _{s_k} + 2 {\mathfrak {b}}\sqrt{ \kappa _{s_k} } (s_k-t)+{\mathfrak {b}}^2 (s_k - t)^2 \nonumber \\&\quad - D(s_k) - {\mathfrak {b}}\sqrt{ D(s_k ) } (s_k - t) - \frac{{\mathfrak {b}}^2}{4} (s_k - t)^2 \nonumber \\&\ge {\mathfrak {b}}\sqrt{ \kappa _{s_k} } (s_k - t). \end{aligned}$$
(3.48)

In the first inequality we used (3.10). In the second inequality we used the square of the inequality \(\sqrt{D (t) } \le \sqrt{ D(s_k ) } + \frac{{\mathfrak {b}}}{2} (s_k - t)\). In the last inequality we used \(\kappa _{s_k} \ge D(s_k)\).

Therefore,

$$\begin{aligned} \frac{1}{N^2} \int _{s_* \wedge \tau }^{s_k \wedge \tau } \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_t, t) ] |}{ \eta _t \left( ( \kappa _t -N^{-2/3+k(t) + 5\varepsilon } )^2 + \eta _t^2 \right) } {\textrm{d}}t \le \frac{C}{N^2} \int _{s_*}^{s_k } \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_t, t) ] |}{ \eta _t \left( \kappa _{s_k} (t- s_k)^2 + \eta _t^2 \right) } {\textrm{d}}t. \end{aligned}$$
(3.49)

We need to consider two cases. First, consider \(\eta _{s_k} \le \kappa _{s_k}\). From the fact that \(\kappa _{s_k} \ge D (s_k)\) and that \(D(s_k) = N^{-\varepsilon /6}\) if \(s_k \le \delta \), we see from (3.18) and (3.19) that,

$$\begin{aligned} | {\textrm{Re}}[ {\tilde{f}}(z_{s_k}, s_k ) ] | \le C \frac{ \eta _{s_k}}{ \sqrt{ \kappa _{s_k} } } N^{\varepsilon /4}. \end{aligned}$$
(3.50)

Then using this, as well as that \(\eta _t\) is decreasing and \({\tilde{f}}(z_t, t)\) is constant along characteristics we have,

$$\begin{aligned} \frac{1}{N^2} \int _{s_*}^{s_k } \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_t, t) ] |}{ \eta _t \left( \kappa _{s_k} (t- s_k)^2 + \eta _t^2 \right) } {\textrm{d}}t&\le \frac{C N^{\varepsilon /4}}{N^2} \frac{1}{ \sqrt{ \kappa _{s_k} + \eta _{s_k}}} \int _{s_*}^{s_k} \frac{1}{ \kappa _{s_k} (t-s_k)^2 + \eta _{s_k }^2 } {\textrm{d}}t \nonumber \\&\le \frac{C N^{\varepsilon /4}}{N^2} \frac{1}{ \sqrt{ \kappa _{s_k} + \eta _{s_k} } } \frac{1}{ \sqrt{ \kappa _{s_k}} \eta _{s_k} } \nonumber \\&\le \frac{C N^{\varepsilon /4}}{N^2} \frac{1}{ ( \kappa _{s_k} + \eta _{s_k} ) \eta _{s_k} }. \end{aligned}$$
(3.51)

The second estimate follows via direct integration and in the last inequality we used the assumption \(\kappa _{s_k} \ge \eta _{s_k}\). We now consider the case \(\eta _{s_k} \ge \kappa _{s_k}\). In this case we proceed similarly to Proposition 2.2,

$$\begin{aligned} \frac{1}{N^2} \int _{s_*}^{s_k } \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_t, t) ] |}{ \eta _t \left( \kappa _{s_k} (t- s_k)^2 + \eta _t^2 \right) } {\textrm{d}}t&\le \frac{1}{N^2} \int _{s_*}^{s_k} \frac{| {\textrm{Re}}[{\tilde{f}}(z_t, t) ] |}{ \eta _t^3} {\textrm{d}}t \le \frac{C}{N^2 \eta _{s_k}^2}\nonumber \\&\le \frac{C N^{\varepsilon /4}}{N^2 \eta _{s_k} ( \kappa _{s_k} + \eta _{s_k} )}. \end{aligned}$$
(3.52)

By the BDG inequality,

$$\begin{aligned} {\mathbb {P}}\left[ \sup _{ s \in (0, s_k )} \left| {\mathcal {E}}_2 ( s \wedge \tau ) \right| > N^{\varepsilon _1+\varepsilon /8} B (z_{s_k} ) \right] \le C {\textrm{e}}^{ - c N^{\varepsilon _1} }. \end{aligned}$$
(3.53)

Taking a union bound over the \(O ( \log (N) )\) choices of \(s_k\) and using (3.44), we conclude the proof, similarly to the proof of Proposition 2.2. \(\square \)
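The direct integration used in (3.51) above is the elementary bound obtained by extending the integral to the whole real line,

```latex
\int_{s_*}^{s_k} \frac{\mathrm{d}t}{\kappa_{s_k}(t-s_k)^2 + \eta_{s_k}^2}
  \le \int_{-\infty}^{\infty} \frac{\mathrm{d}t}{\kappa_{s_k} t^2 + \eta_{s_k}^2}
  = \frac{\pi}{\sqrt{\kappa_{s_k}}\, \eta_{s_k}}.
```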

Proof of Theorem 1.4

With Proposition 3.7 in hand, the proof of Theorem 1.4 is similar to the proof of Theorem 1.2. Let \(\Upsilon \) be the event of Proposition 3.7, after taking an intersection over all choices of \(i, j \le N^{10}\). Let \(z_t:= z^i_j (t)\) for notational simplicity. On the event \(\Upsilon \) we have, for \(0< t < \tau \), the inequality

$$\begin{aligned} | f (z_t, t) - {\tilde{f}}(z_t, t) | \le \int _0^t g(s) | f(z_s, s) - {\tilde{f}}(z_s, s) | {\textrm{d}}s + N^{\varepsilon _1+\varepsilon /8} B ( z_t ), \end{aligned}$$
(3.54)

where we denoted,

$$\begin{aligned} g(s) = \frac{ | z_s (\partial _z f ) (z_s, s) |}{2}.\nonumber \\ \end{aligned}$$
(3.55)

Hence, via Gronwall’s inequality we obtain for \(0< t < \tau \),

$$\begin{aligned} | f (z_t, t) - {\tilde{f}}(z_t, t) | \le \int _0^t g(s) \exp \left[ \int _s^t g(u) {\textrm{d}}u \right] N^{\varepsilon _1+\varepsilon /8} B ( z_s ) {\textrm{d}}s + N^{\varepsilon _1+\varepsilon /8} B (z_t).\nonumber \\ \end{aligned}$$
(3.56)

Similar to the proof of Theorem 1.2, now using Lemma 3.6, we have

$$\begin{aligned} \exp \left[ \int _s^t g(u) {\textrm{d}}u \right] \le C \frac{ \log |z_s |}{ \log |z_t|} \end{aligned}$$
(3.57)

as well as

$$\begin{aligned} g(s) \le C \frac{ |{\textrm{Re}}[ {\tilde{f}}(z_s, s) ]|}{ \eta _s}. \end{aligned}$$
(3.58)

Therefore,

$$\begin{aligned} | f (z_t, t) - {\tilde{f}}(z_t, t) | \le C \frac{ N^{\varepsilon _1+\varepsilon /8}}{\eta _t} \int _0^t | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] | B ( z_s ) {\textrm{d}}s + N^{\varepsilon _1+\varepsilon /8} B(z_t).\nonumber \\ \end{aligned}$$
(3.59)

We must consider a few different cases. First, let us consider the case that \(t < s_*\). Then, we see that

$$\begin{aligned} \frac{ N^{\varepsilon _1+\varepsilon /8}}{\eta _t} \int _0^t | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] | B ( z_s ) {\textrm{d}}s + N^{\varepsilon _1+\varepsilon /8} B(z_t) \le C \frac{N^{\varepsilon _1+\varepsilon /8}}{N}. \end{aligned}$$
(3.60)

So in the remainder of the proof we consider the case \(t \ge s_*\). We now consider a few different cases depending on whether \(t_i\) (the end-time of the characteristic \(z_t\)) is small or large. First, assume that \(t_i < \delta \). Then, using (3.19),

$$\begin{aligned} \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] | }{ \eta _t } = \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_t, t) ] | }{ \eta _t } \le C \kappa _t^{-2} \le C N^{\varepsilon /3} \end{aligned}$$
(3.61)

we have,

$$\begin{aligned} \frac{ N^{\varepsilon _1+\varepsilon /8}}{\eta _t} \int _0^t | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] | B ( z_s ) {\textrm{d}}s \le N^{11 \varepsilon /24+\varepsilon _1} B(z_t) \end{aligned}$$
(3.62)

because \(B(z_t)\) is effectively decreasing along characteristics (it may not be decreasing for \(t < s_*\) but for such t it is \({\mathcal {O}}(N^{-1})\)). We take \(\varepsilon _1 < \varepsilon /100\).

Now, we assume that \(t_i > \delta \). For \( s < \delta /2\) we have that either \(\eta _s \ge c\) or \(\kappa _s \ge c\) by Proposition 3.3, depending on whether s is smaller or larger than \(s_*\). For such s we then have \(B (z_s) \le CN^{-1}\eta _t^{-1/2}\) and so,

$$\begin{aligned} \frac{N^{\varepsilon _1+\varepsilon /8}}{\eta _t} \int _0^{t\wedge \delta /2} | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] | B(z_s) {\textrm{d}}s \le C N^{\varepsilon _1+\varepsilon /8} \frac{1}{N \eta _t^{1/2}} \frac{ | {\textrm{Re}}[{\tilde{f}}(z_t, t) ] |}{\eta _t } \le C N^{\varepsilon _1+\varepsilon /8} B (z_t). \end{aligned}$$
(3.63)

In the final inequality we used \({\textrm{Re}}[{\tilde{f}}(z_t, t) ] \eta _t^{-1} \le C ( \kappa _t + \eta _t)^{-1/2}\) (recall that \(t \ge s_*\)) if \( t> \delta /2\) and (3.18). If \(t < \delta /2\) then since \(t \ge s_*\) we have \(\kappa _t \ge c\) since \(t < t_i - \delta /2\) and so \(| {\textrm{Re}}[ {\tilde{f}}(z_t, t) ]| \eta _t^{-1}\) is bounded.

It still remains to estimate the integral over \([t\wedge \delta /2, t]\) in the case that \(t_i > \delta \). Since this integral is 0 if \(t < \delta /2\) we may assume that \(t > \delta /2\). Then, using (3.18) freely, we have,

$$\begin{aligned} \frac{N^{\varepsilon _1+\varepsilon /8}}{\eta _t}&\int _{\delta /2 \vee s_*}^t | {\textrm{Re}}[{\tilde{f}}(z_s, s) ] | B(z_s) {\textrm{d}}s \le \frac{C}{N} \frac{ N^{\varepsilon _1+\varepsilon /8} }{ \sqrt{ \kappa _t + \eta _t } } \int _{\delta /2 \vee s_*}^t \frac{ \eta _s}{ \sqrt{ \kappa _s + \eta _s } \eta _s^{3/2}}{\textrm{d}}s \nonumber \\&\le \frac{C}{N} \frac{N^{\varepsilon _1+\varepsilon /8}}{\sqrt{ \kappa _t + \eta _t } } \int _{\delta /2 \vee s_*}^t \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ]| }{\eta _s^{3/2}}{\textrm{d}}s \le \frac{C N^{\varepsilon _1+\varepsilon /8}}{N \sqrt{ \kappa _t + \eta _t} \sqrt{\eta _t}} \end{aligned}$$
(3.64)

The integral over \([\delta /2, \delta /2 \vee s_*]\) contributes \(N^{\varepsilon _1+\varepsilon /8-1} ( \kappa _t + \eta _t)^{-1/2}\) because \(\eta _s \ge c\) there. Therefore, taking \(\varepsilon _1 < \varepsilon /100\), we have proven that,

$$\begin{aligned} | f (z_t, t) - {\tilde{f}}(z_t, t) | \le C N^{23 \varepsilon /48} B(z_t) \ll N^{\varepsilon /2} B(z_t) \end{aligned}$$
(3.65)

on the event \(\Upsilon \). It follows that \(\tau = T\). \(\square \)

4 Cusp estimates

In this section we prove Theorem 1.6. We will use the following reversed time parameterization. We denote by the uppercase letters T and S the usual forward time parameterization, so that we will consider S and T close to 4. We then introduce,

$$\begin{aligned} T = 4 - t, \qquad S = 4 - s \end{aligned}$$
(4.1)

so that t and s will usually obey,

$$\begin{aligned} N^{-1/2+\delta } \le s, t \le \frac{1}{10}. \end{aligned}$$
(4.2)

We will often substitute T or t into functions of time. When we write t, it is understood that the function is evaluated at \(T = 4-t\). For example, the gap between the edges is,

$$\begin{aligned} \Delta _t = \Delta _T = 2( \pi - \Theta _T) = \frac{1}{3} t^{3/2} ( 1 + {\mathcal {O}}(t) ). \end{aligned}$$
(4.3)

The proof of Theorem 1.6 is similar in structure to the proof of Theorem 1.4. We first establish analogues of the estimates proven there before proceeding to the main body of the proof.

Lemma 4.1

There is a constant \(C>0\) so that the following holds. For \(0< t < \frac{1}{10}\) and \(z = (1+ \eta ) {\textrm{e}}^{ {\textrm{i}}(\Theta _t + \kappa ) }\) with \(\eta \) and \(\kappa \) satisfying,

$$\begin{aligned} 0< \eta< \Delta _t, \qquad 0< \kappa < \pi - \Theta _t \end{aligned}$$
(4.4)

we have that

$$\begin{aligned} \frac{1}{C} \frac{ \eta }{ \sqrt{ \eta + \kappa } } \le \Delta _t^{1/6} | {\textrm{Re}}[ {\tilde{f}}(z, t) ] | \le C \frac{ \eta }{ \sqrt{ \eta + \kappa } } \end{aligned}$$
(4.5)

Proof

Denoting \(\theta = \kappa + \Theta _t\) we have,

$$\begin{aligned} - {\textrm{Re}}[ {\tilde{f}}(z,t) ] \asymp \eta \int \frac{ \rho _t (x) }{ \eta ^2 + \sin ^2 (\frac{ \theta - x}{2}) } {\textrm{d}}x. \end{aligned}$$
(4.6)

Due to symmetry of the density,

$$\begin{aligned} \int \frac{ \rho _t (x) }{ \eta ^2 + \sin ^2 (\frac{ \theta - x}{2}) } {\textrm{d}}x \asymp \int _0^\pi \frac{ \rho _t (x) }{ \eta ^2 + \sin ^2 (\frac{ \theta - x}{2}) } {\textrm{d}}x. \end{aligned}$$
(4.7)

Changing coordinates and using the fact that \(\sin ^2(x) \asymp x^2\) for \(|x| \le \pi /2\) we have,

$$\begin{aligned} \int _0^\pi \frac{ \rho _t (x) }{ \eta ^2 + \sin ^2 (\frac{ \theta - x}{2}) } {\textrm{d}}x \asymp \int _{0}^{\Theta _t} \frac{ \rho _t (\Theta _t - x ) }{ \eta ^2 + ( \kappa + x)^2} {\textrm{d}}x. \end{aligned}$$
(4.8)

For an upper bound we split the integral into the regions \([0, \Delta _t]\) and \([\Delta _t, \Theta _t]\). For the latter region we can bound \(\rho _t(\Theta _t -x ) \le C x^{1/3}\),

$$\begin{aligned} \int _{\Delta _t}^{\Theta _t} \frac{ \rho _t (\Theta _t - x ) }{ \eta ^2 + ( \kappa + x)^2} {\textrm{d}}x \le C \int _{\Delta _t}^{\Theta _t} ( \Delta _t + x)^{-5/3} {\textrm{d}}x \le C \Delta _t^{-2/3} \le C \Delta _t^{-1/6} \frac{1}{ \sqrt{ \kappa + \eta } } , \end{aligned}$$
(4.9)

where in the last inequality we used the assumption \(\kappa + \eta \le 2 \Delta _t\). In the region \([0, \Delta _t ]\) we use \(\rho _t (\Theta _t - x) \Delta _t^{1/6} \le C x^{1/2}\),

$$\begin{aligned}&\int _0^{\Delta _t} \frac{ \rho _t (\Theta _t - x ) }{ \eta ^2 + ( \kappa + x)^2} {\textrm{d}}x \le C \Delta _t^{-1/6} \int _0^{\Delta _t} \frac{ x^{1/2}}{ \eta ^2 + ( \kappa + x)^2} {\textrm{d}}x \nonumber \\&\quad \le C \Delta _t^{-1/6} \int _0^{\Delta _t} ( x + \kappa + \eta )^{-3/2} {\textrm{d}}x \le C \Delta _t^{-1/6} \frac{1}{ \sqrt{ \kappa + \eta } }. \end{aligned}$$
(4.10)

This completes the proof of the upper bound. For the lower bound,

$$\begin{aligned}&\int _{0}^{\Theta _t} \frac{ \rho _t (\Theta _t - x ) }{ \eta ^2 + ( \kappa + x)^2} {\textrm{d}}x \ge \int _0^{ 2 \Delta _t } \frac{ \rho _t (\Theta _t - x ) }{ \eta ^2 + ( \kappa + x)^2} {\textrm{d}}x \ge c \Delta _t^{-1/6} \int _0^{2 \Delta _t} \frac{ \sqrt{x}}{ \eta ^2 + ( \kappa + x)^2} {\textrm{d}}x \nonumber \\&\quad \ge c \Delta _t^{-1/6} \int _{ \frac{\kappa +\eta }{2}}^{ \kappa +\eta } \frac{ \sqrt{x}}{ \eta ^2 + ( \kappa + x)^2} {\textrm{d}}x \ge c \Delta _t^{-1/6} \int _{ \frac{\kappa +\eta }{2}}^{ \kappa +\eta } (\kappa +\eta )^{-3/2} {\textrm{d}}x \ge c \Delta _t^{-1/6} \frac{1}{ \sqrt{ \kappa + \eta }}. \end{aligned}$$
(4.11)

We used above the assumption that \(\kappa +\eta \le 2 \Delta _t\). This completes the proof. \(\square \)

Lemma 4.2

Recall the convention \(T= 4-t\). There are \({\mathfrak {a}}, {\mathfrak {b}}>0\) so that the following holds. Let \(z_S = {\mathcal {C}}_{S, T} (z)\) denote a characteristic ending at a point \( z = (1+ \eta ) {\textrm{e}}^{ {\textrm{i}}(\Theta _T + \kappa ) }\) where \(0< \eta < {\mathfrak {a}}\) and \(0< \kappa < \Delta _T/2\). Let \(S_*\) be defined as,

$$\begin{aligned} S_* = \sup \{ S: |z_S| \ge 1 + {\mathfrak {a}}\Delta _S \}. \end{aligned}$$
(4.12)

Then for \(S_* \vee (4- 10^{-1} )< S < T\) we have,

$$\begin{aligned} \Delta _S^{1/6} \sqrt{ \kappa _S } > \Delta _T^{1/6} \sqrt{ \kappa _T} + {\mathfrak {b}}(T-S), \end{aligned}$$
(4.13)

and also that \(\kappa _S\) is decreasing. For \(S<S_*\) we have \(\eta _S \ge \eta _{S_*} \ge {\mathfrak {a}}\Delta _{S_*} \ge {\mathfrak {a}}\Delta _T\).

Proof

We let \(100 {\mathfrak {a}}\) be the constant from Proposition B.4. For \(S \in (S_*, T)\) we can apply the estimate from this proposition to obtain,

$$\begin{aligned} \frac{ {\textrm{d}}}{ {\textrm{d}}S} \kappa _S&= \frac{1}{2} \left( {\textrm{Im}}[ {\tilde{f}}(z_S, S) ] - {\textrm{Im}}[ {\tilde{f}}({\textrm{e}}^{ {\textrm{i}}\Theta _S}, S) ] \right) \le - c \Delta _S^{-1/6} \sqrt{ \kappa _S}, \end{aligned}$$
(4.14)

as long as \(\kappa _S \ge 0\). We see that \(\kappa _S\) is decreasing so if \(\kappa _T >0\) then \(\kappa _S > 0\) for \(S > S_* \vee (4- 10^{-1})\). Moreover, since \(\Delta _S\) is decreasing, we see that

$$\begin{aligned} \frac{ {\textrm{d}}}{ {\textrm{d}}S} \left( \Delta ^{1/3}_S \kappa _S \right) \le - c \left( \Delta _S^{1/3} \kappa _S \right) ^{1/2}. \end{aligned}$$
(4.15)

The differential inequality \(\partial _S g \le -c g^{1/2}\) is solved by considering \(\partial _S g^{1/2} \le -c/2\), and so we see that for \(S \in (S_* \vee (4- 10^{-1} ), T)\),

$$\begin{aligned} \Delta _S^{1/6} \sqrt{ \kappa _S} \ge \Delta _T^{1/6} \sqrt{ \kappa _T} + c (T-S) \end{aligned}$$
(4.16)

for some \(c>0\) as desired. This completes the proof. \(\square \)
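The integration step can be written out for \(g_S := \Delta _S^{1/3} \kappa _S\) (recall that S increases toward T; (4.16) then holds with \(c/2\) in place of a generic constant):

```latex
\frac{\mathrm{d}}{\mathrm{d}S}\sqrt{g_S}
  = \frac{1}{2\sqrt{g_S}} \frac{\mathrm{d}g_S}{\mathrm{d}S} \le -\frac{c}{2}
\;\Longrightarrow\;
\sqrt{g_T} - \sqrt{g_S} \le -\frac{c}{2}(T-S)
\;\Longrightarrow\;
\Delta_S^{1/6}\sqrt{\kappa_S} \ge \Delta_T^{1/6}\sqrt{\kappa_T} + \frac{c}{2}(T-S).
```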

The following establishes estimates on the behavior of the characteristics near the spectral edge.

Proposition 4.3

Let \(z_s = {\mathcal {C}}_{s, t} (z)\) be a characteristic ending at a point z at final time \(t < 10^{-1}\) (recall the reversed time convention \(T = 4-t\)). Assume at the final time t the estimates,

$$\begin{aligned} \kappa _t > N^{-2/3+5 \varepsilon } \Delta _t^{1/9}, \qquad \Delta _t \ge N^{-3/4+9\varepsilon } \end{aligned}$$
(4.17)

hold. Moreover assume \(0< \eta _t < {\mathfrak {a}}\Delta _t\). Let,

$$\begin{aligned} s_* = \inf \{ s: \eta _s > {\mathfrak {a}}\Delta _s \}. \end{aligned}$$
(4.18)

Note \(S_* = 4 - s_*\) where \(S_*\) is as in Lemma 4.2. For \(t< s < s_* \wedge 10^{-1}\) we have,

$$\begin{aligned} \partial _s ( \kappa _s - N^{-2/3+5 \varepsilon } \Delta _s^{1/9} ) \ge 0. \end{aligned}$$
(4.19)

For the next two estimates, continue to assume \(t< s < s_* \wedge 10^{-1}\). There is furthermore a constant \({\mathfrak {d}}>0\) so that the following holds. If \(s-t \le \Delta ^{1/6}_t \sqrt{ \kappa _t} \) then,

$$\begin{aligned} (\kappa _s - N^{-2/3 + 5 \varepsilon } \Delta ^{1/9}_s)^2 \ge {\mathfrak {d}}(s-t)^2 \Delta ^{-1/3}_t \kappa _t. \end{aligned}$$
(4.20)

If \(s-t \ge \Delta ^{1/6}_t \sqrt{ \kappa _t}\) then,

$$\begin{aligned} (\kappa _s - N^{-2/3 + 5\varepsilon } \Delta _s^{1/9})^2 \ge {\mathfrak {d}}\kappa _t^2. \end{aligned}$$
(4.21)

Proof

By direct calculation, one can see that \(| \partial _t \Theta _t | \le C t^{1/2} \le C \Delta _t^{1/3}\). Therefore, proceeding as in the proof of Lemma 4.2 we have for \(t< s < s_*\),

$$\begin{aligned} \frac{ {\textrm{d}}}{ {\textrm{d}}s} ( \kappa _s - N^{-2/3+5 \varepsilon } \Delta _s^{1/9} ) \ge c \Delta ^{-1/6}_s \sqrt{ \kappa _s} - C N^{-2/3+5\varepsilon } \Delta ^{-5/9}_s, \end{aligned}$$
(4.22)

for some \(c, C>0\). Consider the set

$$\begin{aligned} {\mathcal {A}}:= \{ s: t \le s< s_*, \; \kappa _s < N^{-2/3+5\varepsilon } \Delta _s^{1/9} \}. \end{aligned}$$
(4.23)

Note that by assumption there is a small interval around t not contained in \({\mathcal {A}}\). Assume that \({\mathcal {A}}\) is not empty and let \(F = \inf {\mathcal {A}}\), so that \(s_*> F>t\). At F we clearly must have that

$$\begin{aligned} \kappa _F = N^{-2/3+5\varepsilon } \Delta _F^{1/9}. \end{aligned}$$
(4.24)

We now show that the RHS of (4.22) is strictly positive at \(s= F\). Indeed, we evaluate

$$\begin{aligned} \Delta _F^{10/9-1/3}\kappa _F = N^{-2/3+5\varepsilon } \Delta _F^{8/9} \ge N^{-2/3+5\varepsilon } \Delta _t^{8/9} \ge N^{3 \varepsilon } ( N^{-2/3+5\varepsilon } )^2 \end{aligned}$$
(4.25)

which shows that the RHS of (4.22) is strictly positive for N large enough. In particular, the derivative on the LHS of (4.22) is strictly positive at \(s=F\), showing that \(\kappa _s \ge N^{-2/3+5\varepsilon } \Delta _s^{1/9}\) in an open interval containing F. This contradicts the assumption that F was an infimum, proving that \({\mathcal {A}}\) is empty.

Substituting the lower bound \(\kappa _s \ge N^{-2/3+5 \varepsilon } \Delta _s^{1/9}\) into the RHS of (4.22) we then find that,

$$\begin{aligned} \frac{ {\textrm{d}}}{ {\textrm{d}}s} ( \kappa _s - N^{-2/3+5 \varepsilon } \Delta _s^{1/9} ) \ge c \Delta _s^{-1/9} N^{-1/3+5\varepsilon /2} - C N^{-2/3+5 \varepsilon } \Delta _s^{-5/9}. \end{aligned}$$
(4.26)

Positivity of the RHS is equivalent to

$$\begin{aligned} \Delta _s^{4/9} \ge \frac{C}{c} N^{-1/3+5\varepsilon /2}. \end{aligned}$$
(4.27)

However, by assumption \(\Delta _t^{4/9} \ge N^{-1/3+4 \varepsilon }\). We conclude the proof of (4.19).
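Indeed, the exponent arithmetic is immediate from the assumption \(\Delta _t \ge N^{-3/4+9\varepsilon }\) in (4.17):

```latex
\Delta_t^{4/9} \ge \left( N^{-3/4+9\varepsilon} \right)^{4/9}
  = N^{-1/3+4\varepsilon} \ge N^{-1/3+5\varepsilon/2},
% since 4\varepsilon > 5\varepsilon/2, the threshold in (4.27) is exceeded for large N.
```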

We turn now to the proof of the remaining inequalities. From (4.13) we have

$$\begin{aligned} \Delta ^{1/3}_s \kappa _s - N^{-2/3 + 5 \varepsilon } \Delta ^{4/9}_s&\ge [\Delta ^{1/3}_t \kappa _t - N^{-2/3 + 5 \varepsilon } \Delta ^{4/9}_t] \nonumber \\&\quad + [N^{-2/3 + 5 \varepsilon } \Delta ^{4/9}_t - N^{-2/3 + 5 \varepsilon } \Delta ^{4/9}_s] \nonumber \\&\quad + 2 c(s-t) \Delta ^{1/6}_t \sqrt{\kappa _t}, \end{aligned}$$
(4.28)

for some \(c >0\). Note that the first term on the RHS is positive. First, assume that \(s-t \le \Delta _t^{1/6} \sqrt{ \kappa _t}\). From the fact that \(\kappa _t \le \Delta _t\) and \(\Delta _t \le C t^{3/2}\) we see that \(s \le C t\) for some \(C>0\). It follows from the mean value theorem that,

$$\begin{aligned} N^{-2/3+5\varepsilon } | \Delta ^{4/9}_t - \Delta _s^{4/9} |&\le C N^{-2/3+5\varepsilon } (s-t) \Delta ^{-2/9}_t \nonumber \\&\le C (s-t) \Delta _t^{1/6} \sqrt{ \kappa _t} N^{-3 \varepsilon /2}. \end{aligned}$$
(4.29)

The estimate (4.20) easily follows. Consider now the case \(s-t \ge \Delta _t^{1/6} \sqrt{ \kappa _t}\). Let \(t_*\) be the time such that

$$\begin{aligned} t_* = t + \Delta _t^{1/6} \sqrt{ \kappa _t}. \end{aligned}$$
(4.30)

Applying (4.19) we see that

$$\begin{aligned} \kappa _s - N^{-2/3+5 \varepsilon } \Delta _s \ge \kappa _{t_*} - N^{-2/3+5\varepsilon } \Delta _{t_*}. \end{aligned}$$
(4.31)

Since the estimate (4.20) holds at \(s = t_*\) we see that

$$\begin{aligned} ( \kappa _{t_*} - N^{-2/3+5\varepsilon } \Delta _{t_*} )^2 \ge c \kappa _t^2 \end{aligned}$$
(4.32)

for some \(c>0\). This completes the proof of the proposition. \(\square \)

The following is similar to Lemma 3.5 and the proof is deferred to Appendix B of [2].

Lemma 4.4

Let \(\varepsilon >0\). Consider the domain,

$$\begin{aligned} {\mathcal {D}} := \left\{ z = (1+\eta ) {\textrm{e}}^{ {\textrm{i}}\theta }: \Delta _t^{1/9} N^{-2/3+\varepsilon }< \eta< 2 \Delta _t^{1/9} N^{-2/3+\varepsilon }, \ \Theta _t + N^{-2/3+5\varepsilon } \Delta _t^{1/9} < | \theta | \le \pi \right\} . \end{aligned}$$
(4.33)

Assume for all \( z \in {\mathcal {D}}\) we have the estimate,

$$\begin{aligned} | {\tilde{f}}(z, t) - f(z, t) | \le \frac{ N^{\varepsilon }}{ N \sqrt{ \eta } \sqrt{ \kappa +\eta } }. \end{aligned}$$
(4.34)

Then there are no eigenvalues of the form \(\lambda = {\textrm{e}}^{ {\textrm{i}}\theta }\) with \(\Theta _t + N^{-2/3+5\varepsilon } \Delta _t^{1/9} < | \theta | \le \pi \).

Lemma 4.5

Let \(\varepsilon >0\) and let \(t < N^{-1/10}\) with \(\Delta _t \ge N^{-3/4+9\varepsilon }\). Let \(z_s\) be a characteristic that ends at time t in the region,

$$\begin{aligned} \{ z= (1+\eta ) {\textrm{e}}^{ {\textrm{i}}\theta }: N^{-2/3+\varepsilon } \Delta _t^{1/9} \le \eta \le 2N^{-2/3+\varepsilon } \Delta _t^{1/9}, \ \Theta _t + N^{-2/3+5\varepsilon } \Delta _t^{1/9} \le | \theta | \le \pi \}. \end{aligned}$$
(4.35)

Then for \(t< s < 10^{-1}\) we have,

$$\begin{aligned} \eta _s \le C N^{-\varepsilon /2} \Delta _s. \end{aligned}$$
(4.36)

Proof

Define the functions

$$\begin{aligned} h_1 (u) = \log |z_t| + u N^{\varepsilon /2} \frac{ \eta _t}{ \Delta _t^{1/6} \sqrt{ \kappa _t} } \end{aligned}$$
(4.37)

and

$$\begin{aligned} h_2 (u) = N^{-\varepsilon /2} \Delta _t + u t^{1/2} N^{-\varepsilon }. \end{aligned}$$
(4.38)

By the definition of the characteristics,

$$\begin{aligned} \log |z_s| = \log |z_t| + (s-t) \frac{ | {\textrm{Re}}[ {\tilde{f}}(z, t) ] |}{2}, \end{aligned}$$
(4.39)

and so for \(0 \le u < 10^{-1}\) we have,

$$\begin{aligned} \eta _{t+u} \le C h_1 (u) \end{aligned}$$
(4.40)

for some large C, using Lemma 4.1. Since \(\partial _u \Delta _u \ge c u^{1/2}\) we have that

$$\begin{aligned} h_2 (u) \le N^{-\varepsilon /2} \Delta _{t+u}, \end{aligned}$$
(4.41)

for N large enough. Note that \(h_2(0) > h_1 (0)\). We claim for \(u < 10^{-1}\) that

$$\begin{aligned} h_2'(u) \ge h_1' (u) \end{aligned}$$
(4.42)

which will yield the claim. This is equivalent to,

$$\begin{aligned} N^{3\varepsilon /2} \eta _t \le t^{1/2} \Delta _t^{1/6} \sqrt{ \kappa _t}. \end{aligned}$$
(4.43)
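Explicitly, differentiating the definitions (4.37) and (4.38) in u gives

$$\begin{aligned} h_1' (u) = N^{\varepsilon /2} \frac{ \eta _t}{ \Delta _t^{1/6} \sqrt{ \kappa _t} }, \qquad h_2' (u) = t^{1/2} N^{-\varepsilon }, \end{aligned}$$

so that (4.42) holds if and only if \(t^{1/2} N^{-\varepsilon } \ge N^{\varepsilon /2} \eta _t / ( \Delta _t^{1/6} \sqrt{ \kappa _t})\), which rearranges to (4.43).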

The LHS is less than \(2 N^{5\varepsilon /2-2/3} \Delta _t^{1/9}\). The RHS is larger than \(c N^{5\varepsilon /2-1/3} \Delta _t^{1/2} \Delta _t^{1/18}\). But,

$$\begin{aligned} \Delta _t^{1/2-1/9+1/18} = \Delta _t^{4/9} \ge N^{-1/3+4 \varepsilon } \end{aligned}$$
(4.44)

by assumption. \(\square \)

Lemma 4.6

Fix a time t. Let \(z_s = {\mathcal {C}}_{s, t} (z)\) be a characteristic terminating at a point \(z = (1+\eta ) {\textrm{e}}^{ {\textrm{i}}( \Theta _t + \kappa )}\) where

$$\begin{aligned} N^{-2/3+\varepsilon } \Delta _t^{1/9}< \eta< 2 N^{-2/3+\varepsilon } \Delta _t^{1/9}, \qquad N^{-2/3+5 \varepsilon } \Delta _t^{1/9} \le \kappa < \pi - \Theta _t. \end{aligned}$$
(4.45)

Let \(t_1\) and \(t_2\) be two times \(t \le t_1< t_2 < 10^{-1}\) and assume that for \(t_1< s < t_2\) the estimate

$$\begin{aligned} | {\tilde{f}}(z_s, s) - f (z_s, s) | \le 2 | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] | \end{aligned}$$
(4.46)

holds and that at each time s there are no eigenvalues of the form \(\lambda = {\textrm{e}}^{ {\textrm{i}}\theta }\) for \(\Theta _s + N^{-2/3+5\varepsilon } \Delta _s^{1/9} < | \theta | \le \pi \). Then,

$$\begin{aligned} \int _{t_1}^{t_2} \frac{1}{N} \sum _{i=1}^N \frac{1}{ | \lambda _i (s) - z_s |^4} {\textrm{d}}s \le C \log (N) B (z_{t_1} )^2 \end{aligned}$$
(4.47)

Proof

First we consider the case that \( {\mathfrak {a}}\kappa _{t_1} \le 10 \eta _{t_1}\). In this case, it suffices to estimate the integrand by \(C \log (N) \eta _{t_1}^{-2}\). This argument is similar to that appearing in the proof of Proposition 2.2 and so we omit it.

For the remainder of the proof we therefore assume that \({\mathfrak {a}}\kappa _{t_1} \ge 10 \eta _{t_1}\). In particular, this implies that \(\eta _{t_1} \le {\mathfrak {a}}\Delta _{t_1} /10\).

Let \(s_*\) be as in Lemma 4.2, that is

$$\begin{aligned} s_*:= \inf \{ s> t: \eta _s > {\mathfrak {a}}\Delta _s \}. \end{aligned}$$
(4.48)

Note that by Lemma 4.5 we have \(s_* > t_2\). For \(t_1< s <t_2\) we know from (4.19) that \(\kappa _s \ge N^{-2/3+5\varepsilon } \Delta _s^{1/9}\) because this is satisfied at \(s=t\). In particular, define

$$\begin{aligned} {\tilde{t}} = t_1 + \Delta _{t_1}^{1/6} \sqrt{ \kappa _{t_1} }. \end{aligned}$$
(4.49)

We assume \( t_1< {\tilde{t}} < t_2\). The other cases are easy to deal with, as one has to treat only one of the regions of integration described below. Then for \(t_1< s < {\tilde{t}} \) we have,

$$\begin{aligned} ( \kappa _s - N^{-2/3+5\varepsilon } \Delta ^{1/9}_s )^2 \ge {\mathfrak {d}}(s-t_1)^2 \Delta ^{-1/3}_{t_1} \kappa _{t_1} \end{aligned}$$
(4.50)

and for \({\tilde{t}}< s < t_2\),

$$\begin{aligned} ( \kappa _s - N^{-2/3+5 \varepsilon } \Delta _s^{1/9} )^2 \ge {\mathfrak {d}}\kappa _{t_1}^2, \end{aligned}$$
(4.51)

by applying Proposition 4.3. We will use the estimate,

$$\begin{aligned} \int _{t_1}^{ t_2} \frac{1}{N} \sum _{i=1}^N \frac{1}{ | \lambda _i (s) - z_s|^4} {\textrm{d}}s&\le \int _{t_1}^{ t_2} \frac{1}{ ( \kappa _s - N^{-2/3+5\varepsilon } \Delta _s^{1/9} )^2 + \eta _s^2} \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] | }{ \eta _s} {\textrm{d}}s . \end{aligned}$$
(4.52)

We start with the region \( s > {\tilde{t}}\). For such s we can apply (4.51) and estimate,

$$\begin{aligned} \int _{{\tilde{t}}}^{ t_2} \frac{1}{ ( \kappa _s - N^{-2/3+5\varepsilon } \Delta _s^{1/9} )^2 + \eta _s^2} \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] | }{ \eta _s} {\textrm{d}}s&\le \frac{C}{ \kappa _{t_1}^2} \int _{{\tilde{t}}}^{ t_2} \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] | }{ \eta _s} {\textrm{d}}s \nonumber \\&\le \frac{C}{ \kappa _{t_1}^2} \log (N) \nonumber \\&\le \frac{C}{ \eta _{t_1} ( \kappa _{t_1} + \eta _{t_1})} \log (N) \end{aligned}$$
(4.53)

We consider now the region \(s < {\tilde{t}}\). By applying (4.50) and Lemma 4.1 (as well as the constancy of \({\tilde{f}}\) along characteristics, and the fact that \(\eta _s\) is increasing in s) we obtain,

$$\begin{aligned}&\int ^{{\tilde{t}}}_{ t_1} \frac{1}{ ( \kappa _s - N^{-2/3+5\varepsilon } \Delta _s^{1/9} )^2 + \eta _s^2} \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_s, s) ] | }{ \eta _s} {\textrm{d}}s \nonumber \\&\quad \le \frac{C}{\Delta _{t_1}^{1/6} \sqrt{ \kappa _{t_1}+\eta _{t_1}} } \int _{t_1}^{{\tilde{t}}} \frac{1}{(s-t)^2 \Delta _{t_1}^{-1/3} \kappa _{t_1} + \eta _{t_1}^2} {\textrm{d}}s \nonumber \\&\quad \le \frac{C}{ \eta _{t_1} \sqrt{ \kappa _{t_1}} \sqrt{ \kappa _{t_1} + \eta _{t_1}}}. \end{aligned}$$
(4.54)

This completes the proof. \(\square \)

Define the domain,

$$\begin{aligned} {\mathfrak {B}}_s:=\{ z = (1+\eta ) {\textrm{e}}^{ {\textrm{i}}\theta }: N^{-2/3+\varepsilon } \Delta ^{1/9}_s< \eta< 2 N^{-2/3+\varepsilon } \Delta ^{1/9}_s, \ \Theta _s + N^{-2/3+5\varepsilon } \Delta _s^{1/9} < | \theta | \le \pi \}. \end{aligned}$$
(4.55)

We will assume \(s > N^{-1/2+9\varepsilon }\) so that \(\Delta _s > c N^{-3/4+10\varepsilon }\). We fix \(S_0 = 4 - s_0\) with \(s_0 = 10^{-1}\), and the final time,

$$\begin{aligned} S_f:= 4 - s_f, \qquad s_f = N^{-1/2+9\varepsilon } \end{aligned}$$
(4.56)

Similar to the proof of Theorem 1.4, we introduce a polynomial number of characteristics as follows. We introduce times \(T_i\) by

$$\begin{aligned} T_0 = S_0, \qquad T_i = T_{i-1} + \frac{S_f-S_0}{N^{10}}, \end{aligned}$$
(4.57)

for \(i=1, \dots , N^{10}\). At each \(T_i\) we introduce a well-spaced mesh of \({\mathfrak {B}}_{T_i}\) of size \(N^{10}\), denoted by \(\{ u_j^i \}_{j=1}^{N^{10}}\). We introduce the characteristics,

$$\begin{aligned} z_j^i (S):= {\mathcal {C}}_{S, T_i} ( u_j^i ). \end{aligned}$$
(4.58)

Note that each characteristic is defined only for \( 0 \le S \le T_i\). For \(i \ge 1\) we introduce the stopping times,

$$\begin{aligned} \tau _{ij} = \inf \{ S \in (S_0, T_i ): | f (z_j^i (S), S) - {\tilde{f}}(z_j^i (S), S) | > N^{\varepsilon /2} B (z_j^i (S) ) \}, \end{aligned}$$
(4.59)

with the infimum of the empty set being \(+ \infty \). Let \({\mathcal {A}}_0\) be the event,

$$\begin{aligned} {\mathcal {A}}_0 = \{ \exists (i, j): | f ( z_j^i (S_0), S_0 ) - {\tilde{f}}( z_j^i (S_0), S_0 ) | > N^{\varepsilon _1} \inf _{ S_0< S < T_i } B ( z_j^i (S) ) \}^c . \end{aligned}$$
(4.60)

Lemma 4.7

The event \({\mathcal {A}}_0\) holds with overwhelming probability.

Proof

We will show that the desired estimates hold on the event of Proposition B.5, with \(\varepsilon , \delta \) in that proposition statement chosen sufficiently small.

Fix a single characteristic \(z_S:= z_j^i (S)\) and let \(S_0< S < T_i\) be fixed. By (4.19) and Lemma 4.5 it follows that \(\kappa _S >0\) for all S and that \(\kappa _{S_0} \ge c N^{-2/3+5 \varepsilon }\). From (B.11) we see that

$$\begin{aligned} | f ( z_{S_0}, S_0 ) - {\tilde{f}}( z_{S_0}, S_0 ) | \le N^{\varepsilon _1} B( z_{S_0} ). \end{aligned}$$
(4.61)

Since \(\kappa _S >0\), and \(\kappa _S\) and \(\eta _S\) are both decreasing in S, it follows that

$$\begin{aligned} B(z_S) \ge B( z_{S_0} ) \end{aligned}$$
(4.62)

for \(S > S_0\). This yields the claim. \(\square \)

We also let \(\tau _0\) be the stopping time that equals \(+\infty \) on \({\mathcal {A}}_0\) and is \(S_0\) on \({\mathcal {A}}_0^c\). We now introduce the stopping time,

$$\begin{aligned} \tau = \left( \min _{i, j} \tau _{ij} \right) \wedge \tau _0 \wedge S_f. \end{aligned}$$
(4.63)

Using the Lipschitz continuity of \({\tilde{f}}(z, S)\) and f(z, S) on the domains \({\mathfrak {B}}_S\) we have from Lemma 4.4 that for any \(S_0< S < \tau \) there are no eigenvalues of the form \(\lambda = {\textrm{e}}^{ {\textrm{i}}\theta }\) with \(\Theta _S + N^{-2/3+5 \varepsilon } \Delta _S^{1/9} < | \theta | \le \pi \).

Lemma 4.8

Let \(z_S\) be a characteristic terminating at time \(T_i\) in the domain,

$$\begin{aligned} \{ z = (1+ \eta ) {\textrm{e}}^{ {\textrm{i}}\theta }: N^{-2/3+\varepsilon } \Delta _{T_i}^{1/9} \le \eta \le {\mathfrak {a}}\Delta _{T_i} / 10, \ \Theta _{T_i} + N^{-2/3+5\varepsilon } \Delta _{T_i}^{1/9} \le |\theta | \le \pi \}. \end{aligned}$$
(4.64)

Then for \(S_0< S < T_i\) we have,

$$\begin{aligned} N^{\varepsilon } B(z_S) \le \frac{1}{ \log (N)} | {\textrm{Re}}[ {\tilde{f}}(z_S, S) ]|. \end{aligned}$$
(4.65)

Proof

Note that \({\tilde{f}}(z_S, S)\) is constant. At \(S = T_i\) we have,

$$\begin{aligned} N^{\varepsilon } B(z_{T_i} ) = N^{\varepsilon } \frac{1}{ N \sqrt{ \eta _{T_i} (\eta _{T_i} + \kappa _{T_i}) }} \le C \frac{N^{\varepsilon } \Delta ^{1/6}_{T_i} }{ N \eta _{T_i}^{3/2}} | {\textrm{Re}}[ {\tilde{f}}(z_{T_i}, T_i ) ] | \le C N^{-\varepsilon /2} | {\textrm{Re}}[ {\tilde{f}}(z_{T_i}, T_i ) ] | \end{aligned}$$
(4.66)

where we used \(\eta _{T_i} \ge \Delta _{T_i}^{1/9} N^{-2/3+\varepsilon }\) which holds by assumption. Let \(S_*\) be as in Lemma 4.2. By Lemma 4.2 we have

$$\begin{aligned} B( z_S) \le B (z_{T_i} ) \end{aligned}$$
(4.67)

for all \(S > S_*\). If \(S_* > S_0\) then for \(S < S_*\) we have \(\eta _S \ge \eta _{S_*} \ge c \Delta _{S_*} \ge c \Delta _{T_i} \ge c \kappa _{T_i}\) and so

$$\begin{aligned} B(z_S) \le (N \eta _S)^{-1} \le C B (z_{T_i} ). \end{aligned}$$
(4.68)

This completes the proof. \(\square \)

With similar notation as in the other sections we have for any \(S_0 \le S \le T_i \wedge \tau \),

$$\begin{aligned} f ( z_j^i (S) , S) - {\tilde{f}}(z_j^i (S), S)&= f ( z_j^i (S_0) , S_0) - {\tilde{f}}(z_j^i (S_0), S_0) + {\mathcal {E}}_1 (S)_j^i + {\mathcal {E}}_2 (S)_j^i , \end{aligned}$$
(4.69)

where

$$\begin{aligned} {\mathcal {E}}_1 (S)_j^i = - \frac{1}{2} \int _{S_0}^S z_j^i (U) ( \partial _z f) ( z_j^i (U), U) ( f ( z_j^i (U), U) - {\tilde{f}}( z_j^i (U), U) ) {\textrm{d}}U\nonumber \\ \end{aligned}$$
(4.70)

and

$$\begin{aligned} {\mathcal {E}}_2(S)_j^i = \int _{S_0}^S {\textrm{d}}M_U ( z_j^i (U) ). \end{aligned}$$
(4.71)

For the martingale term we have the following.

Proposition 4.9

For any ij and \(\varepsilon _1 >0\) we have,

$$\begin{aligned} {\mathbb {P}}\left[ \exists S \in (S_0, T_i \wedge \tau ): | {\mathcal {E}}_2(S)_j^i | > N^{\varepsilon _1} B(z_j^i (S) ) \right] \le C \log (N) {\textrm{e}}^{ - c N^{\varepsilon _1} }. \end{aligned}$$
(4.72)

Proof

This is proven in an almost identical manner to Proposition 3.7. The quadratic variation is bounded using Lemma 4.6. Note that the condition (4.46) is a consequence of Lemma 4.8. \(\square \)

Proof of Theorem 1.6

Let \(\Upsilon \) be the intersection of the event of Proposition 4.9 and \({\mathcal {A}}_0\) so that \(\Upsilon \) holds with overwhelming probability. Let \(z_S = z_j^i(S)\) be a characteristic. On the event \(\Upsilon \) we have for \(S_0< S < T_i \wedge \tau \) that,

$$\begin{aligned} | f ( z_S, S) - {\tilde{f}}(z_S, S) | \le \int _{S_0}^S g (U) | f(z_U, U) - {\tilde{f}}(z_U, U) | {\textrm{d}}U + 2 N^{\varepsilon _1} B (z_S) \end{aligned}$$
(4.73)

by the definition of \({\mathcal {A}}_0\) and (4.69), where

$$\begin{aligned} g(U) = \frac{ \left| z_U ( \partial _z f ) ( z_U, U ) \right| }{2}. \end{aligned}$$
(4.74)

Hence, via Gronwall’s inequality we obtain,

$$\begin{aligned} | f (z_S,S) - {\tilde{f}}(z_S, S) | \le 2 \int _{S_0}^S g(U) \exp \left[ \int _U^S g(w) {\textrm{d}}w \right] N^{\varepsilon _1} B(z_U) {\textrm{d}}U + 2 N^{\varepsilon _1} B ( z_S). \end{aligned}$$
(4.75)

As in the proofs of Theorems 1.2 and 1.4 we have, using Lemma 4.8,

$$\begin{aligned} \exp \left[ \int _U^S g(w) {\textrm{d}}w \right] \le C \frac{ \eta _U}{\eta _S}, \end{aligned}$$
(4.76)

as well as

$$\begin{aligned} g(U) \le C \frac{ | {\textrm{Re}}[ {\tilde{f}}(z_U, U) ] | }{\eta _U}. \end{aligned}$$
(4.77)

Hence,

$$\begin{aligned} | f (z_S, S) - {\tilde{f}}(z_S, S) | \le C \frac{ N^{\varepsilon _1}}{\eta _S} \int _{S_0}^S | {\textrm{Re}}[ {\tilde{f}}(z_U, U) ] | B (z_U) {\textrm{d}}U + 2 N^{\varepsilon _1} B(z_S). \end{aligned}$$
(4.78)

We now split into a few different cases. Suppose first that \(\eta _S \ge \kappa _S\). Then,

$$\begin{aligned} \int _{S_0}^S | {\textrm{Re}}[ {\tilde{f}}(z_U, U) ] | B (z_U) {\textrm{d}}U \le \frac{1}{N} \int _{S_0}^S | {\textrm{Re}}[ {\tilde{f}}(z_U, U) ] | \eta _U^{-1} {\textrm{d}}U \le C N^{-1} \log (N) \end{aligned}$$
(4.79)

and so,

$$\begin{aligned} | f ( z_S, S) - {\tilde{f}}(z_S, S) | \le C N^{\varepsilon _1} B(z_S) + C \log (N) \frac{N^{\varepsilon _1}}{N \eta _S} \le C \log (N) N^{\varepsilon _1} B ( z_S) \end{aligned}$$
(4.80)

where we used the assumption that \(\eta _S \ge \kappa _S\) in the last inequality. Now assume that \(\eta _S \le \kappa _S\). Similar to the convention of Lemma 4.2 we define,

$$\begin{aligned} S_* = \sup \{ S_0< U < S: |z_U| \ge 1 + {\mathfrak {a}}\Delta _U \}, \end{aligned}$$
(4.81)

with the convention that the supremum of the empty set is \(-\infty \). By Lemma 4.5 it follows that \(S_* = - \infty \). Therefore, for all \(S_0< U < S\) we have,

$$\begin{aligned} | {\textrm{Re}}[ {\tilde{f}}(z_U, U) ] | \asymp \frac{1}{ \Delta _U^{1/6}} \frac{ \eta _U}{ \sqrt{ \kappa _U + \eta _U}}, \end{aligned}$$
(4.82)

as \(\kappa _U >0\) for all \( S_0< U < S\).

Let us define \(S_1 = 4 - 2s\). Let us assume that \(S_0 < S_1\). The other case is easier, requiring consideration of only one of the regions of integration below. We consider first the region \(U \in (S_1, S)\). Then, for such U we have \(\Delta _U \le C \Delta _S\), and so

$$\begin{aligned}&\frac{1}{ \eta _S} \int _{S_1}^{S} | {\textrm{Re}}[ {\tilde{f}}(z_U, U) ] | N B (z_U) {\textrm{d}}U \le \frac{1}{\sqrt{ \kappa _S + \eta _S} \Delta _S^{1/6} } \int _{S_1}^S \frac{1}{ \sqrt{ \kappa _U + \eta _U} \sqrt{ \eta _U} } {\textrm{d}}U \nonumber \\&\quad \le C \frac{1}{\sqrt{ \kappa _S + \eta _S} \Delta _S^{1/6} } \int _{S_1}^S \Delta _U^{1/6} | {\textrm{Re}}[ {\tilde{f}}(z_U, U) ] | \eta _U^{-3/2} {\textrm{d}}U\nonumber \\&\quad \le C \frac{1}{\sqrt{ \kappa _S + \eta _S} } \int _{S_1}^S | {\textrm{Re}}[ {\tilde{f}}(z_U, U) ] | \eta _U^{-3/2} {\textrm{d}}U \nonumber \\&\quad \le C \frac{1}{ \sqrt{ \kappa _S + \eta _S} \eta _S^{1/2} }. \end{aligned}$$
(4.83)

Now we consider \(U \in (S_0, S_1)\). For such U, let \(U = 4 - u\). Note \(u \ge 2 s\). In particular, from (4.13) we conclude that

$$\begin{aligned} \Delta _U^{1/6} \sqrt{ \kappa _U} \ge c u \end{aligned}$$
(4.84)

which implies that \(C \Delta _U \ge \kappa _U \ge c \Delta _U\). From Lemma 4.1 we see that for \(U \in (S_0, S_1)\),

$$\begin{aligned} c \frac{ \eta _U}{ \Delta _U^{2/3} } \le | {\textrm{Re}}[ {\tilde{f}}(z_U, U) ] | \le C \frac{ \eta _U}{ \Delta _U^{2/3}}. \end{aligned}$$
(4.85)

Since \({\tilde{f}}(z_U, U)\) is constant along characteristics we deduce from this that

$$\begin{aligned} \Delta _U^{1/6} \le C \frac{ \eta _U^{1/4} \Delta _{S_1}^{1/6}}{\eta _{S_1}^{1/4}} \le C \frac{ \eta _U^{1/4} \Delta _{S}^{1/6}}{\eta _{S}^{1/4}} . \end{aligned}$$
(4.86)

The second inequality uses that \(\eta _{S_1} \ge \eta _S\) and that \(\Delta _{S_1} \asymp \Delta _S\) by the choice of \(S_1\). Hence,

$$\begin{aligned}&\frac{1}{ \eta _S} \int _{S_0}^{S_1} | {\textrm{Re}}[ {\tilde{f}}(z_U, U) ] | N B (z_U ) {\textrm{d}}U \le \frac{C}{\sqrt{ \kappa _S + \eta _S} \Delta _S^{1/6} } \int _{S_0}^{S_1} \frac{1}{ \sqrt{ \kappa _U + \eta _U} \sqrt{ \eta _U} } {\textrm{d}}U \nonumber \\&\quad \le \frac{C}{\sqrt{ \kappa _S + \eta _S} \Delta _S^{1/6} } \int _{S_0}^{S_1} \Delta _U^{1/6} | {\textrm{Re}}[{\tilde{f}}(z_U, U) ] | \eta _U^{-3/2} {\textrm{d}}U \nonumber \\&\quad \le \frac{C}{ \sqrt{ \kappa _S + \eta _S} \eta _S^{1/4} }\int _{S_0}^{S} | {\textrm{Re}}[ {\tilde{f}}(z_U, U) ] | \eta _U^{-5/4} {\textrm{d}}U \nonumber \\&\quad \le C \frac{1}{ \sqrt{ \kappa _S + \eta _S} \eta _S^{1/2} }. \end{aligned}$$
(4.87)

In the first inequality we used constancy of \({\tilde{f}}\) along characteristics and Lemma 4.1. In the second inequality we used Lemma 4.1 again. In the third inequality we used (4.86).

Summarizing, we see that on the event \(\Upsilon \) we have for any \(S_0< S < \tau \wedge T_i\) that

$$\begin{aligned} | f (z_S, S) - {\tilde{f}}(z_S, S) | \le C \log (N) N^{\varepsilon _1} B( z_S). \end{aligned}$$
(4.88)

Taking \(\varepsilon _1 < \varepsilon /100\) we see that we must have \(\tau > T_i\) for every i. Therefore, \(\tau = S_f\) and we conclude the theorem. \(\square \)

5 Center of mass evolution

In this section we prove Proposition 1.8. Since Proposition 1.8 is a statement in soft analysis, we will not track the dependence of constants on the dimension. The reader should note that we therefore change notation and consider unitary matrices of size n instead of N evolving according to (1.1), in order to emphasize the ineffectiveness of the estimates in the dimension of the system.

Fix some time \(t_0 > 0\) and assume that there are no eigenvalues in the set \(\{ z = {\textrm{e}}^{ {\textrm{i}}\theta }: \theta \in I_0\}\) for \(I_0 = [ \theta _0 -L/2, \theta _0 + L/2]\) at time \(t_0\) and \(\theta _0 \in [0, 2 \pi )\). Let \(\tau > t_0\) be the first time an eigenvalue enters this set. Let \(\Gamma \) be the contour,

$$\begin{aligned} \Gamma :=&\, \{ z = r {\textrm{e}}^{ {\textrm{i}}\theta } : 1/2< r < 3/2, \theta = \theta _0 \pm L/4 \} \nonumber \\ \cup&\,\{ z = r {\textrm{e}}^{ {\textrm{i}}\theta } : r = 1/2, 3/2, \theta \notin [\theta _0 - L/4, \theta _0+L/4] \}. \end{aligned}$$
(5.1)

That is, it encloses all of the eigenvalues on the unit circle but avoids the ray \(\{ r {\textrm{e}}^{ {\textrm{i}}\theta _0 }: r \ge 0 \}\). A schematic diagram of the contour \(\Gamma \) is given in Fig. 1.

We use this contour as we will take a logarithm with branch cut being this ray. Then for any time \(t_0< t_1 < \tau \) we have,

$$\begin{aligned} {\textrm{i}}\theta _i (t_1) - {\textrm{i}}\theta _i (t_0) = \frac{1}{ 2\pi {\textrm{i}}} \int _{\Gamma } g_{\theta _0} ( z) \left( \frac{1}{ z - \lambda _i (t_1)} - \frac{1}{ z - \lambda _i (t_0) } \right) {\textrm{d}}z \end{aligned}$$
(5.2)

where \(g_{\theta _0}(z)\) is the branch of the logarithm holomorphic in \({\mathbb {C}}\backslash \{ z = r {\textrm{e}}^{ {\textrm{i}}\theta _0}, r \ge 0 \}\). This formula holds due to the fact that no eigenvalue crosses the angle \(\theta _0\) in this time interval, and so for this entire time interval, \(\theta _i (t) = 2 \pi k + {\tilde{\theta }}_i (t)\) where k is a constant integer and \(\theta _0< {\tilde{\theta }}_i (t) < \theta _0 + 2 \pi \).

Introducing,

$$\begin{aligned} m(z, t) = \frac{1}{n} \sum _{i=1}^n \frac{1}{ \lambda _i (t) -z } = \frac{1}{n} {\textrm{tr}}\frac{1}{ U_t - z} \end{aligned}$$
(5.3)

we therefore have,

$$\begin{aligned} {\bar{\theta }} (t_1 \wedge \tau ) - {\bar{\theta }} (t_0) = \frac{1}{ 2 \pi {\textrm{i}}} \int _\Gamma \frac{ - g_{\theta _0} ( z) }{ {\textrm{i}}} ( m (z, t_1 \wedge \tau ) - m (z, t_0) ) {\textrm{d}}z. \end{aligned}$$
(5.4)

For \(t_0< t < \tau \), \(m(z, t)\) is well-defined for \(z \in \Gamma \) and so we may apply the Itô formula. Since,

$$\begin{aligned} f(z, t) = 1 + 2 z m(z, t) \end{aligned}$$
(5.5)

we have from (1.11) that (abbreviating \(g = g_{\theta _0}\) and \(m = m(z, t)\)),

$$\begin{aligned} {\textrm{d}}\frac{1}{ 2 \pi {\textrm{i}}} \int _{\Gamma } {\textrm{i}}g(z) m(z, t) {\textrm{d}}z&= \frac{1}{ 2 \pi {\textrm{i}}} \int _{\Gamma } \frac{- {\textrm{i}}g}{2} (1+ 2 z m)(m + zm') {\textrm{d}}z {\textrm{d}}t \nonumber \\&\quad + \frac{1}{ 2 \pi {\textrm{i}}} \int _{\Gamma } g(z) \frac{1}{n} {\textrm{tr}}\left( \frac{ U_t}{(U_t -z )^2} {\textrm{d}}W \right) {\textrm{d}}z. \end{aligned}$$
(5.6)
Fig. 1 Schematic diagram of the contour \(\Gamma \) defined in (5.1). Contour in red; unit circle in black. Eigenvalues are the thick black dots enclosed by the red contour (color figure online)

Observe that by the holomorphic functional calculus,

$$\begin{aligned} \frac{1}{ 2 \pi {\textrm{i}}} \int _{\Gamma } g(z) \frac{1}{ (U_t - z)^2} {\textrm{d}}z = U_t^{-1}, \end{aligned}$$
(5.7)

and so

$$\begin{aligned} \frac{1}{ 2 \pi {\textrm{i}}} \int _{\Gamma } g(z) \frac{1}{n} {\textrm{tr}}\left( \frac{ U_t}{(U_t -z )^2} {\textrm{d}}W \right) {\textrm{d}}z = \frac{1}{n} {\textrm{tr}}\left( {\textrm{d}}W \right) . \end{aligned}$$
(5.8)
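The branch cut of g makes (5.7) awkward to test directly on a circle, but the underlying functional-calculus mechanism, namely that \((2 \pi {\textrm{i}})^{-1} \oint F(z) (U-z)^{-2} {\textrm{d}}z = F'(U)\) for F analytic inside the contour (with \(g'(z) = z^{-1}\) producing \(U_t^{-1}\) in (5.7)), can be sanity-checked numerically with an entire test function in place of g. A minimal sketch, not part of the proof; all names are ours:

```python
import numpy as np

rng = np.random.default_rng(1)
# a random 4x4 unitary via QR decomposition
U, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))

M = 4000
theta = np.linspace(0.0, 2.0 * np.pi, M, endpoint=False)
zs = 2.0 * np.exp(1j * theta)  # circle |z| = 2 encloses the unit-modulus spectrum

# (1/2πi) ∮ F(z) (U - z)^{-2} dz with the entire test function F(z) = z^3;
# with dz = iz dθ the trapezoidal rule reduces to (1/M) Σ_k F(z_k) z_k R(z_k)^2
I = np.zeros((4, 4), dtype=complex)
for z in zs:
    R = np.linalg.inv(U - z * np.eye(4))
    I += (z**3) * (R @ R) * z / M

err = np.max(np.abs(I - 3.0 * U @ U))  # functional calculus predicts F'(U) = 3U^2
print(err)
```

The trapezoidal rule converges geometrically here since the integrand is analytic in an annulus around the contour.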

The first line of (5.6) turns out to vanish identically. To see this, we will evaluate all of the integrals explicitly using the Cauchy integral formula as well as the variants

$$\begin{aligned} \frac{1}{ 2 \pi {\textrm{i}}} \int _{\Gamma } \frac{F(z)}{ (z-a)(z-b) } {\textrm{d}}z = \frac{ F(b) - F(a)}{b-a} \end{aligned}$$
(5.9)

and

$$\begin{aligned} \frac{1}{ 2 \pi {\textrm{i}}} \int _{\Gamma } \frac{ F(z) }{ (z-a)^2 (z-b)} {\textrm{d}}z = - \frac{F'(a)}{b-a} + \frac{ F(b) - F(a)}{(b-a)^2} \end{aligned}$$
(5.10)

valid for F analytic in the appropriate domain and \(\Gamma \) encircling ab. We have,

$$\begin{aligned}&\frac{1}{ 2 \pi {\textrm{i}}} \int g(z) (1 + 2 z m )( m + z m' ) {\textrm{d}}z = \frac{1}{ 2 \pi {\textrm{i}}} \int g(z) (m + 2 zm^2 + z m' + 2 z^2 m m' ) {\textrm{d}}z \nonumber \\&\quad = \frac{1}{ 2 \pi {\textrm{i}}} \int {\textrm{d}}z g(z) \bigg \{ \frac{1}{n} \sum _i \frac{1}{ \lambda _i - z} + 2 \frac{z}{n^2} \sum _{i \ne j } \frac{1}{( \lambda _i - z) ( \lambda _j - z ) } \nonumber \\&\qquad + z ( 2n^{-1} +1) \frac{1}{n} \sum _i \frac{1}{ ( \lambda _i - z )^2} \nonumber \\&\qquad + 2 \frac{z^2}{n^2} \sum _{i \ne j } \frac{1}{ ( \lambda _i -z)( \lambda _j - z)^2} + 2\frac{z^2}{n^2} \sum _i \frac{1}{ ( \lambda _i -z)^3} \bigg \} \nonumber \\&\quad = - \frac{1}{n} \sum _i g ( \lambda _i ) + \frac{2}{n^2} \sum _{i \ne j } \frac{ \lambda _i g ( \lambda _i ) - \lambda _j g ( \lambda _j ) }{ \lambda _i - \lambda _j } +\frac{ 2n^{-1}+1}{n} \sum _i ( g ( \lambda _i ) + 1 ) \nonumber \\&\qquad + \frac{1}{n^2} \sum _{i \ne j } \frac{ 4 \lambda _j g ( \lambda _j ) +2 \lambda _j }{ \lambda _i - \lambda _j } - \frac{2}{n^2} \sum _{i \ne j } \frac{ \lambda _i^2 g ( \lambda _i ) - \lambda _j^2 g ( \lambda _j ) }{ ( \lambda _i - \lambda _j )^2} - \frac{1}{n^2} \sum _i ( 2 g ( \lambda _i ) + 3). \end{aligned}$$
(5.11)

Now, note that

$$\begin{aligned} \frac{2}{n^2} \sum _{i \ne j } \frac{ \lambda _i^2 g ( \lambda _i ) - \lambda _j^2 g ( \lambda _j ) }{ ( \lambda _i - \lambda _j )^2} = 0 \end{aligned}$$
(5.12)

by symmetry as well as,

$$\begin{aligned} - \frac{1}{n} \sum _i g ( \lambda _i )+\frac{ 2n^{-1}+1}{n} \sum _i ( g ( \lambda _i ) + 1 )- \frac{1}{n^2} \sum _i ( 2 g ( \lambda _i ) + 3) =1 - \frac{1}{n}. \end{aligned}$$
(5.13)

Finally,

$$\begin{aligned}&\frac{2}{n^2} \sum _{i \ne j } \frac{ \lambda _i g ( \lambda _i ) - \lambda _j g ( \lambda _j ) }{ \lambda _i - \lambda _j } + \frac{1}{n^2} \sum _{i \ne j } \frac{ 4 \lambda _j g ( \lambda _j ) +2 \lambda _j }{ \lambda _i - \lambda _j } = \frac{2}{n^2} \sum _{i \ne j } \frac{ \lambda _j }{ \lambda _i - \lambda _j } \nonumber \\&\quad = - \frac{n(n-1)}{n^2} \end{aligned}$$
(5.14)

and so indeed the term on the first line of (5.6) vanishes. It follows that for any \(t_0 < t_1 \le \tau \) we have

$$\begin{aligned} {\bar{\theta }} (t_1) - {\bar{\theta }} (t_0) = \frac{1}{n} \left( {\textrm{tr}}W_{t_1} - {\textrm{tr}}W_{t_0} \right) . \end{aligned}$$
(5.15)
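The cancellation behind (5.15) is purely algebraic: the final expression in (5.11), after the residue evaluations, vanishes identically for arbitrary distinct \(\lambda _i\) and arbitrary values \(g(\lambda _i)\), as the computations (5.12)–(5.14) show. This can be sanity-checked numerically; a minimal sketch, not part of the proof, with all names ours:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 7
lam = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, n))  # distinct points on the circle
g = rng.normal(size=n) + 1j * rng.normal(size=n)     # arbitrary "g-values"

# the final expression of (5.11), term by term
E = (-(1.0 / n) * g.sum()
     + ((2.0 / n + 1.0) / n) * (g + 1.0).sum()
     - (1.0 / n**2) * (2.0 * g + 3.0).sum())
for i in range(n):
    for j in range(n):
        if i == j:
            continue
        d = lam[i] - lam[j]
        E += (2.0 / n**2) * (lam[i] * g[i] - lam[j] * g[j]) / d
        E += (1.0 / n**2) * (4.0 * lam[j] * g[j] + 2.0 * lam[j]) / d
        E -= (2.0 / n**2) * (lam[i]**2 * g[i] - lam[j]**2 * g[j]) / d**2

print(abs(E))  # vanishes up to rounding
```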

Fix a final time \(T >1\) and a large integer \(m>0\), and intermediate times \(t_i = i T / m\). Let \(\theta _k\), \(k=1, 2, \dots , 4n\), be 4n equally spaced points along the unit circle. For any \(\ell >0\) define the arcs,

$$\begin{aligned} Z_{k, \ell } = \{ z= {\textrm{e}}^{ {\textrm{i}}\theta }: | \theta - \theta _k | < \ell / ( 100 n) \}. \end{aligned}$$
(5.16)

Throughout the remainder of the proof we will make use of the arcs \(\{ Z_{k, 1}\}_k\) and \(\{ Z_{k, 1/2} \}_k\), i.e., the choices \(\ell =1, 1/2\).

For each \(0 \le i \le m\) and \(1 \le k \le 4 n\) define the stopping time \(\tau _{i, k}\) as follows. If at time \(t_i\) there is an eigenvalue in \(Z_{k, 1}\) let \(\tau _{i, k} = t_i\). Otherwise, let \(\tau _{i, k}\) be the first time \(t_i < t\le t_{i+1}\) that an eigenvalue hits the set \(Z_{k, 1/2}\). If this does not occur, let \(\tau _{i, k} = \infty \). Then, let \(\tau _i = \max _{k} \tau _{i, k}\). By the pigeonhole principle we have \(\tau _i > t_i\) for all i. Finally, let \(\tau ^{(m)} = \min _i \tau _i\).

From the above discussion, on the event \(\tau ^{(m)} = \infty \), telescoping over the intervals \([t_i, t_{i+1}]\) yields that for any \(0< t < T\) we have

$$\begin{aligned} {\bar{\theta }} (t) = \frac{1}{n} {\textrm{tr}}\left( W_t \right) . \end{aligned}$$
(5.17)

The proof of Proposition 1.8 will be complete once we prove that

$$\begin{aligned} \lim _{m \rightarrow \infty } {\mathbb {P}}\left[ \tau ^{(m)} = \infty \right] = 1. \end{aligned}$$
(5.18)

The remainder of this section is devoted to this. Consider the points,

$$\begin{aligned} z_{\pm , k } = (1+\eta ) \exp \left[ {\textrm{i}}\left( \theta _k \pm \frac{3}{4} \frac{1}{ 100 n } \right) \right] . \end{aligned}$$
(5.19)

Note that for any k that if \(Z_{k, 1}\) contains no eigenvalues at time \(t_i\), then

$$\begin{aligned} | {\textrm{Re}}[ f (z_{\pm , k}, t_i ) ] | \le C_1 n^2 \eta \end{aligned}$$
(5.20)

for some \(C_1 >0\). On the other hand, if for this k we have that \( \tau _{i, k} < \infty \) then there is some \(t \in [t_i, t_{i+1}]\) such that,

$$\begin{aligned} | {\textrm{Re}}[ f (z_{+, k}, t) ] | + | {\textrm{Re}}[ f (z_{-, k}, t) ] | > \frac{c_1}{n \eta }. \end{aligned}$$
(5.21)

Choosing \(\eta \) small enough so that

$$\begin{aligned} \frac{c_1}{ n \eta } > 10 C_1 n^2 \eta \end{aligned}$$
(5.22)

we have that

$$\begin{aligned} \{ \tau _{i} < \infty \} \subseteq \bigcup _{k=1}^{4n} \left\{ \exists t \in [t_i, t_{i+1} ]: | f ( z_{+, k}, t) - f(z_{+, k}, t_i ) | + | f ( z_{-, k}, t) - f(z_{-, k}, t_i ) |> \frac{c_2}{ 10 n \eta } \right\} . \end{aligned}$$
(5.23)

The parameter \(\eta >0\) is fixed for the remainder of the proof. We see by (1.11) that (abbreviating \( z= z_{\pm , k}\))

$$\begin{aligned} | f ( z, t) - f(z, t_i ) | \le CT \frac{1}{m \eta ^3} + | M_t - M_{t_i} |. \end{aligned}$$
(5.24)

By the BDG inequality,

$$\begin{aligned} {\mathbb {P}}\left[ \sup _{t_i< t < t_{i+1} } | M_{t_i} - M_t | > s \right] \le C \exp \left[ - c s m^{1/2} n^{-2} T^{-1} \right] . \end{aligned}$$
(5.25)

So taking \(s = c_2 / (100 n \eta )\) we see that for all m large enough, there are constants depending on n and \(\eta \) such that

$$\begin{aligned} {\mathbb {P}}\left[ \tau _i < \infty \right] \le C {\textrm{e}}^{- m^{1/2} c}. \end{aligned}$$
(5.26)

This completes the proof of Proposition 1.8. \(\square \)

6 Rigidity

6.1 Helffer–Sjöstrand formula

This section establishes an analog of the Helffer–Sjöstrand formula for measures on the unit circle. Recall,

$$\begin{aligned} \partial _{{\bar{z}}} = \frac{1}{2} ( \partial _x + {\textrm{i}}\partial _y) \end{aligned}$$
(6.1)

as well as Green’s theorem,

$$\begin{aligned} F( \lambda ) = \frac{1}{ \pi } \int _{{\mathbb {R}}^2}\frac{ (\partial _{{\bar{z}}} F) (x, y) }{ \lambda - (x+ {\textrm{i}}y) } {\textrm{d}}x {\textrm{d}}y \end{aligned}$$
(6.2)

for any \(F \in C^2\) of compact support. In polar coordinates we recall,

$$\begin{aligned} \partial _{{\bar{z}}} F (r, \theta ) = \frac{ {\textrm{e}}^{ {\textrm{i}}\theta }}{2} \left( \partial _r F + \frac{ {\textrm{i}}}{r} \partial _\theta F \right) . \end{aligned}$$
(6.3)
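Indeed, with \(\partial _r = \cos \theta \, \partial _x + \sin \theta \, \partial _y\) and \(\partial _\theta = -r \sin \theta \, \partial _x + r \cos \theta \, \partial _y\),

$$\begin{aligned} \frac{ {\textrm{e}}^{ {\textrm{i}}\theta }}{2} \left( \partial _r + \frac{ {\textrm{i}}}{r} \partial _\theta \right) = \frac{ {\textrm{e}}^{ {\textrm{i}}\theta }}{2} \left( {\textrm{e}}^{ - {\textrm{i}}\theta } \partial _x + {\textrm{i}}\, {\textrm{e}}^{ - {\textrm{i}}\theta } \partial _y \right) = \frac{1}{2} ( \partial _x + {\textrm{i}}\partial _y ). \end{aligned}$$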

For any F supported in an annulus,

$$\begin{aligned} F( \lambda ) = \frac{1}{ \pi } \int _{{\mathbb {R}}^2} \frac{ \partial _{{\bar{z}}} F }{ 2z} \frac{ \lambda +z}{ \lambda - z} {\textrm{d}}x {\textrm{d}}y. \end{aligned}$$
(6.4)
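To verify (6.4), note the partial fraction identity

$$\begin{aligned} \frac{1}{2z} \frac{ \lambda +z}{ \lambda - z} = \frac{1}{2z} + \frac{1}{ \lambda - z}, \end{aligned}$$

so the RHS of (6.4) splits as \(\frac{1}{2 \pi } \int \partial _{{\bar{z}}} F \, z^{-1} {\textrm{d}}x {\textrm{d}}y + \frac{1}{\pi } \int \partial _{{\bar{z}}} F \, (\lambda - z)^{-1} {\textrm{d}}x {\textrm{d}}y\). The second integral equals \(F(\lambda )\) by (6.2), while the first equals \(-F(0)/2 = 0\) by (6.2) applied at \(\lambda = 0\), since 0 lies outside the annulus supporting F.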

Let \(\varphi : [0, 2\pi ] \rightarrow {\mathbb {R}}\) be a function on the unit circle that extends to a smooth \(2\pi \)-periodic function on \({\mathbb {R}}\). We define the quasi-analytic extension of \(\varphi \) by,

$$\begin{aligned} {\tilde{\varphi }}(r, \theta ):= ( \varphi ( \theta ) - {\textrm{i}}\log (r) \varphi ' ( \theta ) ) \chi (r) \end{aligned}$$
(6.5)

where \(\chi (r)\) is a function that is 1 on [3/4, 4/3] and 0 outside of [1/2, 2]. We may also assume that,

$$\begin{aligned} \chi (r) = \chi (r^{-1} ). \end{aligned}$$
(6.6)

If \(\mu \) is any measure on the unit circle with Cauchy transform \(f_\mu (z)\) we see that,

$$\begin{aligned} \int \varphi ( \theta ) {\textrm{d}}\mu ( \theta )&= \frac{1}{ \pi } \int _{{\mathbb {R}}^2} \frac{ \partial _{{\bar{z}}} {\tilde{\varphi }}}{2 z } f_\mu (z) {\textrm{d}}x {\textrm{d}}y \nonumber \\&= \frac{ 1}{ 4 \pi } \int ( \varphi ( \theta ) - {\textrm{i}}\log (r) \varphi ' ( \theta ) ) \chi ' (r) f_\mu (z) {\textrm{d}}r {\textrm{d}}\theta \nonumber \\&\quad + \frac{1}{4 \pi } \int \varphi '' ( \theta ) \log (r) \chi (r) f_\mu (z) r^{-1} {\textrm{d}}r {\textrm{d}}\theta . \end{aligned}$$
(6.7)

This is the Helffer–Sjöstrand formula we require to establish our eigenvalue estimates; it will be used in the following subsections.
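
As a concrete sanity check of (6.7), one can take the test measure \({\textrm{d}}\mu (\theta ) = (1 + \cos \theta ) {\textrm{d}}\theta / (2\pi )\) and \(\varphi (\theta ) = \cos \theta \), so that \(\int \varphi \, {\textrm{d}}\mu = 1/2\). Assuming the Cauchy transform convention \(f_\mu (z) = \int \frac{\lambda + z}{\lambda - z} {\textrm{d}}\mu (\lambda )\) matching the kernel in (6.4), one has \(f_\mu (z) = 1 + z\) for \(|z| < 1\) and \(f_\mu (z) = -1 - 1/z\) for \(|z| > 1\). The sketch below (the smoothstep cutoff and the kernel convention are our assumptions) evaluates the right-hand side of (6.7) by a midpoint rule:

```python
import cmath
import math

# Cauchy transform of dmu = (1 + cos a) da / (2*pi) with the kernel
# (lambda + z)/(lambda - z); assumed convention, inferred from (6.4).
def f_mu(z):
    return 1 + z if abs(z) < 1 else -1 - 1 / z

phi = math.cos
dphi = lambda t: -math.sin(t)
d2phi = lambda t: -math.cos(t)

# Smooth cutoff chi: 1 on [3/4, 4/3], 0 outside [1/2, 2], chi(r) = chi(1/r).
def smoothstep(u):
    u = min(max(u, 0.0), 1.0)
    return u * u * u * (10 - 15 * u + 6 * u * u)

A, B = math.log(4 / 3), math.log(2)
chi = lambda r: 1.0 - smoothstep((abs(math.log(r)) - A) / (B - A))
dchi = lambda r, h=1e-6: (chi(r + h) - chi(r - h)) / (2 * h)

# Midpoint rule for the two integrals on the right-hand side of (6.7):
# r over [1/2, 2] (the support of chi), theta over [0, 2*pi).
nr, nt = 1000, 200
hr, ht = 1.5 / nr, 2 * math.pi / nt
total = 0j
for i in range(nr):
    r = 0.5 + (i + 0.5) * hr
    for j in range(nt):
        t = (j + 0.5) * ht
        fz = f_mu(cmath.rect(r, t))
        total += ((phi(t) - 1j * math.log(r) * dphi(t)) * dchi(r) * fz
                  + d2phi(t) * math.log(r) * chi(r) * fz / r) * hr * ht
total /= 4 * math.pi
# total should approximate the left-hand side, int phi dmu = 1/2.
```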

6.2 Proof of Corollary 1.3

We follow Section 3.3 of [30] very closely, as the estimates of Theorem 1.2 and the formula (6.7) combine in exactly the same manner here as they do there to prove Corollary 1.3. Introduce,

$$\begin{aligned} \eta ( \theta ) = \inf \{ \eta > 1 : (\eta -1) | {\textrm{Re}}[ {\tilde{f}}( \eta {\textrm{e}}^{ {\textrm{i}}\theta } ) ] | \ge N^{\varepsilon -1} \}. \end{aligned}$$
(6.8)

Note that from Lemma B.2,

$$\begin{aligned} \eta \rightarrow (\eta -1) | {\textrm{Re}}[ {\tilde{f}}( \eta {\textrm{e}}^{ {\textrm{i}}\theta } ) ] | \end{aligned}$$
(6.9)

is an increasing function for \(\eta >1\). We first assume \(t < 2\). We consider \(I = [ \theta _0, \pi ]\) for some \( 0< \theta _0 < \pi \). The case \(- \pi< \theta _0 < 0\) requires only notational changes. Define,

$$\begin{aligned} {\tilde{\eta }}:= \inf _{ \eta : \log \eta \ge N^{1-{\mathfrak {c}}} } \{ \eta : \max _{ \theta _0 \le x \le \theta _0 + \eta } \eta (x) \le \eta \}. \end{aligned}$$
(6.10)

Let \({\tilde{\theta }}\) be

$$\begin{aligned} {\tilde{\theta }}:= \text{ argmax}_{ \theta _0 - {\tilde{\eta }}\le x \le \theta _0 } \eta (x) \end{aligned}$$
(6.11)

so that

$$\begin{aligned} \eta ( {\tilde{\theta }}) = {\tilde{\eta }}. \end{aligned}$$
(6.12)

We let \(\varphi \) be a function that is 1 on \([\theta _0, \pi ]\) and 0 outside of \([ \theta _0 - \log {\tilde{\eta }}\wedge \frac{1}{10}, \pi +N^{2\varepsilon -1} ]\), with \(| \varphi ^{(k)} (x) | \le C_k (N^{1-2\varepsilon } )^k\) for x near \(\pi \) and \(| \varphi ^{(k)} (x) | \le C_k (\log {\tilde{\eta }})^{-k}\) for x near \(\theta _0\).

By (6.7) we have,

$$\begin{aligned} \left| \frac{1}{N} \sum _i \varphi ( \lambda _i (t) ) - \int \varphi (\theta ) \rho _t ( \theta ) {\textrm{d}}\theta \right|&\le C \int ( | \varphi ( \theta ) | + | \varphi ' ( \theta ) | ) | \chi ' ( r) | | S (z)| {\textrm{d}}r {\textrm{d}}\theta \nonumber \\&\quad + \left| \int \varphi '' ( \theta ) \log (r) \chi (r) S (z) r^{-1} {\textrm{d}}r {\textrm{d}}\theta \right| , \end{aligned}$$
(6.13)

where

$$\begin{aligned} S (z) = f (z, t) - {\tilde{f}}(z, t). \end{aligned}$$
(6.14)

On the event that the estimates of Theorem 1.2 hold we see that,

$$\begin{aligned} \int ( | \varphi ( \theta ) | + | \varphi ' ( \theta ) | ) | \chi ' ( r) | | S (z)| {\textrm{d}}r {\textrm{d}}\theta \le C \frac{N^{\varepsilon }}{N}. \end{aligned}$$
(6.15)

For the second term, note that the measure \(r^{-1} {\textrm{d}}r\) is invariant under the transformation \(r \rightarrow r^{-1}\) so that,

$$\begin{aligned}{} & {} \left| \int \varphi '' ( \theta ) \log (r) \chi (r) S (z) r^{-1} {\textrm{d}}r {\textrm{d}}\theta \right| \nonumber \\{} & {} \quad = 2 \left| \int _{0}^{2 \pi } \int _{r>1} \varphi ''( \theta ) \log (r) | {\textrm{Re}}[S(z) ] | \chi (r) r^{-1} {\textrm{d}}r {\textrm{d}}\theta \right| . \end{aligned}$$
(6.16)
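
The invariance claim follows from the substitution \(u = r^{-1}\), which maps [1/2, 2] to itself and satisfies \(u^{-1} {\textrm{d}}u = - r^{-1} {\textrm{d}}r\). A quick numerical check of this (with an arbitrary illustrative integrand \(g\) of our choosing):

```python
import math

# Invariance of the measure r^{-1} dr under r -> 1/r on the
# inversion-symmetric interval [1/2, 2]:
#   int g(r) dr/r  ==  int g(1/r) dr/r   for any integrable g.
def g(r):
    # Arbitrary test integrand (illustrative choice).
    return math.log(r) ** 2 * math.exp(-r)

# Midpoint rule on [1/2, 2].
n = 20000
h = 1.5 / n
mids = [0.5 + (i + 0.5) * h for i in range(n)]
lhs = sum(g(r) / r for r in mids) * h
rhs = sum(g(1 / r) / r for r in mids) * h
# lhs and rhs agree up to quadrature error.
```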

Recall that \(\varphi ''(x) = 0\) unless \(x \in [\pi , \pi +N^{2 \varepsilon -1}]\) or \(x \in [ \theta _0 - \log {\tilde{\eta }}\wedge \frac{1}{10}, \theta _0 ]\). Note that for \(r \ge \eta _{\mathfrak {c}}\), where \(\log \eta _{\mathfrak {c}}:= N^{-{\mathfrak {c}}}\), we have,

$$\begin{aligned} | {\textrm{Re}}[ S (z)] | \le C \frac{ N^{\varepsilon }}{N \log |z|} \end{aligned}$$
(6.17)

using Lemma B.2 (see Remark 3.3 of [30] for a similar argument). Therefore,

$$\begin{aligned}&\int _{\theta \in [\pi , \pi +N^{2 \varepsilon -1}] } \int _{r>1} | \varphi ''( \theta ) | \log (r) | {\textrm{Re}}[S(z) ] | \chi (r) r^{-1} {\textrm{d}}r {\textrm{d}}\theta \nonumber \\&\quad \le \int _{\theta \in [\pi , \pi +N^{2 \varepsilon -1}] } \int _{r>\eta _{\mathfrak {c}}} | \varphi ''( \theta ) | \log (r) | {\textrm{Re}}[S(z) ] | \chi (r) r^{-1} {\textrm{d}}r {\textrm{d}}\theta \nonumber \\&\qquad + C \int _{\theta \in [\pi , \pi +N^{2 \varepsilon -1}] } \int _{1< r< \eta _{\mathfrak {c}}} | \varphi ''( \theta ) | \chi (r) {\textrm{d}}r {\textrm{d}}\theta \le C \frac{N^{3 \varepsilon }}{N} + \frac{C}{ N^{\mathfrak {c}}} \le C \frac{N^{3 \varepsilon }}{N}. \end{aligned}$$
(6.18)

We used (6.17) in the first integral and the estimate \(|(r-1) {\textrm{Re}}[ S ( r {\textrm{e}}^{ {\textrm{i}}\theta } ) ] | \le C\) for the second. For the region where \(\theta \in [ \theta _0 - \log {\tilde{\eta }}\wedge \frac{1}{10}, \theta _0 ] =: J\) we first bound,

$$\begin{aligned}&\int _{ \theta \in J} \int _{1< r< {\tilde{\eta }}} | \log (r) | | {\textrm{Re}}[S(z) ] \varphi '' ( \theta ) \chi (r) | {\textrm{d}}r {\textrm{d}}\theta \nonumber \\&\quad \le \int _{ \theta \in J} \int _{1< r< \eta _{\mathfrak {c}}} | \log (r) | | {\textrm{Re}}[S(z) ] \varphi '' ( \theta ) \chi (r) | {\textrm{d}}r {\textrm{d}}\theta \nonumber \\&\qquad + \int _{ \theta \in J} \int _{\eta _{\mathfrak {c}}< r < {\tilde{\eta }}} | \log (r) | | {\textrm{Re}}[S(z) ] \varphi '' ( \theta ) \chi (r) | {\textrm{d}}r {\textrm{d}}\theta \nonumber \\&\quad \le C \frac{ \eta _{\mathfrak {c}}}{{\tilde{\eta }}} + \frac{ C N^{\varepsilon }}{N} \le C N^{\varepsilon -1}. \end{aligned}$$
(6.19)

For the first integral we used \(|(r-1) {\textrm{Re}}[S] | \le 2\) and that \({\tilde{\eta }}\ge N^{1-{\mathfrak {c}}}\). For the second region we again used (6.17). For the contribution of \(r > {\tilde{\eta }}\) we have, by integration by parts,

$$\begin{aligned}&\int _{ \theta \in J} \int _{ r> {\tilde{\eta }}} \log (r) \varphi ''(\theta ) \chi (r) {\textrm{Re}}[S] r^{-1} {\textrm{d}}r {\textrm{d}}\theta \nonumber \\&\quad = \int _{ \theta \in J} \varphi ' ( \theta ) \log ( {\tilde{\eta }}) {\textrm{Im}}[ S] {\tilde{\eta }}^{-1} {\textrm{d}}\theta \nonumber \\&\qquad - \int _{ \theta \in J } \int _{r > {\tilde{\eta }}} \varphi ' ( \theta ) \partial _r ( \log (r) \chi (r) r^{-1} ){\textrm{Im}}[S] {\textrm{d}}r {\textrm{d}}\theta . \end{aligned}$$
(6.20)

By definition of \({\tilde{\eta }}\), all z appearing in the above integration lie in \({\mathcal {B}}_t\). Therefore, we may apply the estimate on S of Theorem 1.2 and obtain that both of these integrals are bounded above by \(C \log (N) N^{\varepsilon -1}\).

When these estimates hold we therefore conclude that,

$$\begin{aligned} \left| \left\{ i: \theta _i (t) \in I \right\} \right| \le N \int \varphi ( \theta ) \rho _t ( \theta ) {\textrm{d}}\theta + C N^{3 \varepsilon }. \end{aligned}$$
(6.21)

For \(t < 2\), the density \(\rho _t\) vanishes on \(\{ \theta : | \theta - \pi | < \delta \}\) for some \(\delta >0\). Therefore,

$$\begin{aligned} N \int \varphi ( \theta ) \rho _t ( \theta ) {\textrm{d}}\theta \le N \int _I \rho _t ( \theta ) {\textrm{d}}\theta + N \int _{\theta _0 - \log {\tilde{\eta }}\wedge \frac{1}{10} }^{\theta _0} \rho _t ( \theta ) {\textrm{d}}\theta . \end{aligned}$$
(6.22)

There are two cases. If \(\log {\tilde{\eta }}= N^{-{\mathfrak {c}}}\) then the second integral is bounded by \(C \log ( {\tilde{\eta }}) / \sqrt{t} = C N^{-{\mathfrak {c}}} t^{-1/2}\). Otherwise,

$$\begin{aligned} \int _{\theta _0 - \log {\tilde{\eta }}\wedge \frac{1}{10} }^{\theta _0} \rho _t ( \theta ) {\textrm{d}}\theta \le C ({\tilde{\eta }}-1) | {\textrm{Re}}[ {\tilde{f}}( {\tilde{\eta }}{\textrm{e}}^{ {\textrm{i}}{\tilde{\theta }}} ) ] | \le C \frac{N^{ 3 \varepsilon }}{N}. \end{aligned}$$
(6.23)

The lower bound for the number of \(\theta _i \in I\) follows similarly. For \(t >2\), the densities \(\rho _t\) satisfy \(\inf _{|\theta | < c} \rho _t (\theta ) > c\) for some \(c>0\). In this case, one uses intervals with one endpoint at \(\theta =0\). In (6.22) there is then a second term on the RHS,

$$\begin{aligned} \int _{-N^{2\varepsilon -1}}^0 \rho _t (\theta ) {\textrm{d}}\theta \le C N^{2 \varepsilon -1}. \end{aligned}$$
(6.24)

Everything else is identical. This proves the corollary. \(\square \)

6.3 Proof of Corollary 1.9

It follows from Proposition 1.8 that with overwhelming probability,

$$\begin{aligned} \sup _{0 \le t \le T} | {\bar{\theta }} (t) | \le \frac{N^{\varepsilon }}{N} \end{aligned}$$
(6.25)

for any \(\varepsilon >0\). Next, for each i, we may write,

$$\begin{aligned} \theta _i (t) = 2 \pi n_i (t) + \varphi _i (t) \end{aligned}$$
(6.26)

for \(n_i(t)\) an integer and \(- \pi \le \varphi _i (t) \le \pi \). For any fixed \(t >0\), it follows from the fact that the eigenvalues never cross that the set \(\{ n_1 (t), n_2 (t), \dots , n_N (t) \}\) contains at most two consecutive integers, whose absolute values we denote by \(m (t)\) and \(m(t) + 1\). Let \(N_1(t)\) be the number of eigenvalues such that \(|n_i (t)| = m(t)\) and \(N_2 (t)\) the number of eigenvalues such that \(| n_i (t) | = m(t) +1\). Let \(\mu \) be a probability measure on \([- \pi , \pi ]\) with a density. We have,

$$\begin{aligned} \int _0^\pi \theta {\textrm{d}}\mu ( \theta ) = \int _0^\pi \mu ( [s, \pi ] ) {\textrm{d}}s, \end{aligned}$$
(6.27)

and since \(\int _{-\pi }^{\pi } \theta \rho _t ( \theta ) {\textrm{d}}\theta = 0\) we have,

$$\begin{aligned} \left| \frac{1}{N} \sum _i \varphi _i (t) \right| \le \frac{ N^{\varepsilon }}{N} \end{aligned}$$
(6.28)

by Corollary 1.3. From this it follows that,

$$\begin{aligned} (1+m(t) ) N_2 (t) + m(t) N_1 (t) \le N^{\varepsilon } \end{aligned}$$
(6.29)

with overwhelming probability. In particular, \(m(t) = 0\) and the number of eigenvalues such that \(| \theta _i (t) | > \pi \) is at most \(N^{\varepsilon }\) with overwhelming probability. Using this, the remainder of Corollary 1.9 follows from Corollary 1.3 in a straightforward manner similar to the proof of Corollary 3.2 of [30]. \(\square \)