
1 Introduction

The identification of the limiting distribution of the length of the longest increasing subsequence of a random permutation or of a random word has attracted a lot of interest in the past decade, in particular in light of its connections with random matrices (see [1–4, 6, 8, 12–15, 17, 18]). For random words, both the iid uniform and nonuniform settings are understood, leading respectively to the maximal eigenvalue of a traceless (or generalized traceless) element of the Gaussian unitary ensemble (GUE) as the limiting law of \(LI_{n}\). In a dependent framework, Kuperberg [16] conjectured that if the word is generated by an irreducible, doubly stochastic, cyclic Markov chain with state space an ordered m-letter alphabet, then the limiting distribution of the length \(LI_{n}\) is still that of the maximal eigenvalue of a traceless m × m element of the GUE. More generally, the conjecture asserts that the limiting shape of the Robinson–Schensted–Knuth (RSK) Young diagrams associated with the Markovian random word is governed by the joint distribution of the eigenvalues of a traceless m × m element of the GUE. For m = 2, Chistyakov and Götze [7] positively answered this conjecture, and in the present paper this result is rederived in an elementary way.

The precise class of homogeneous Markov chains with which Kuperberg’s conjecture is concerned is more specific than the ones we shall study. The irreducibility of the chain is a basic property we certainly must demand: each letter has to occur at some point following the occurrence of any given letter. The cyclic (also called circulant) condition, i.e., that the Markov transition matrix P has entries satisfying \(p_{i,j} = p_{i+1,j+1}\), for 1 ≤ i, j ≤ m (with the indices taken modulo m, so that \(m + 1\equiv 1\)), ensures a uniform stationary distribution.

Let us also note that Kuperberg implicitly assumes the Markov chain to be aperiodic. Indeed, the simple 2-state Markov chain for the letters α1 and α2 described by \(\mathbb{P}(X_{n+1} = \alpha _{i}\vert X_{n} = \alpha _{j}) = 1\), for \(i\neq j\), produces a sequence of alternating letters, so that \(LI_{n}\) is always either n ∕ 2 or \(n/2 + 1\), for n even, and \((n + 1)/2\), for n odd, and so has a degenerate limiting distribution. Even though this Markov chain is irreducible and cyclic, it is periodic.

By the end of this introduction, the reader may well have wondered how the binary results get modified for ordered alphabets of arbitrary fixed size m. As shown in [10], for m = 3, Kuperberg’s conjecture is indeed true. However, for m ≥ 4, this is no longer the case; some, but not all, cyclic Markov chains lead to a limiting law as in the iid uniform case.

2 Combinatorics

As in [9], one can express \(LI_{n}\) in a combinatorial manner. For convenience, this short section recapitulates that development.

Let \((X_{n})_{n\geq 1}\) be a sequence of values taken from an m-letter ordered alphabet, \(\mathcal{A}_{m} =\{ \alpha _{1} < \alpha _{2} < \cdots < \alpha _{m}\}\). Let \(a_{k}^{r}\) be the number of occurrences of \(\alpha _{r}\) among \((X_{i})_{1\leq i\leq k}\). Each increasing subsequence of \((X_{i})_{1\leq i\leq n}\) consists of blocks of consecutive identical values, the values of successive blocks increasing through \(\mathcal{A}_{m}\). Moreover, the number of occurrences of \(\alpha _{r} \in \{ \alpha _{1},\ldots,\alpha _{m}\}\) among \((X_{i})_{k+1\leq i\leq \ell}\), where \(1 \leq k < \ell\leq n\), is simply \(a_{\ell}^{r} - a_{k}^{r}\). The length of the longest increasing subsequence of \(X_{1},X_{2},\ldots,X_{n}\) is thus given by

$$LI_{n} =\max\limits_{\stackrel{0\leq k_{1}\leq \cdots }{\leq k_{m-1}\leq n}}\left [\left (a_{k_{1}}^{1} - a_{ 0}^{1}\right ) + \left (a_{ k_{2}}^{2} - a_{ k_{1}}^{2}\right ) + \cdots + \left (a_{ n}^{m} - a_{ k_{m-1}}^{m}\right )\right ],$$
(23.1)

i.e.,

$$LI_{n} =\max\limits_{\stackrel{0\leq k_{1}\leq \cdots }{\leq k_{m-1}\leq n}}[(a_{k_{1}}^{1}-a_{ k_{1}}^{2})+(a_{ k_{2}}^{2}-a_{ k_{2}}^{3})+\cdots +(a_{ k_{m-1}}^{m-1}-a_{ k_{m-1}}^{m})+a_{ n}^{m}],$$
(23.2)

where \(a_{0}^{r} = 0\).

For \(i = 1,\ldots,n\) and \(r = 1,\ldots,m - 1\), let

$$Z_{i}^{r} = \left \{\begin{array}{@{}l@{\quad }l@{}} 1, \quad &\text{if }X_{i} = \alpha _{r}, \\ -1,\quad &\text{if }X_{i} = \alpha _{r+1}, \\ 0, \quad &\text{otherwise,} \end{array} \right.$$
(23.3)

and let \(S_{k}^{r} =\sum _{ i=1}^{k}Z_{i}^{r}\), \(k = 1,\ldots,n\), with also \(S_{0}^{r} = 0\). Then clearly \(S_{k}^{r} = a_{k}^{r} - a_{k}^{r+1}\). Hence,

$$LI_{n} =\max\limits_{\stackrel{0\leq k_{1}\leq \cdots }{\leq k_{m-1}\leq n}}\left \{S_{k_{1}}^{1} + S_{ k_{2}}^{2} + \cdots + S_{ k_{m-1}}^{m-1} + a_{ n}^{m}\right \}.$$
(23.4)

By the telescoping nature of the sum \(\sum _{k=r}^{m-1}S_{n}^{k} =\sum _{ k=r}^{m-1}(a_{n}^{k} - a_{n}^{k+1})\), we find that, for each 1 ≤ r ≤ m − 1, \(a_{n}^{r} = a_{n}^{m} +\sum _{ k=r}^{m-1}S_{n}^{k}\). Since \(a_{k}^{1},\ldots,a_{k}^{m}\) must evidently sum up to k, we have

$$\displaystyle\begin{array}{lll} n&=&\displaystyle\sum\limits_{ r=1}^{m}a_{n}^{r}\\ & =&\displaystyle\sum\limits_{ r=1}^{m-1}\left (a_{n}^{m}+\displaystyle\sum\limits_{ k=r}^{m-1}S_{n}^{k}\right ) + a_{n}^{m}\\ & =&\displaystyle\sum\limits_{ r=1}^{m-1}rS_{n}^{r} + ma_{n}^{m}.\end{array}$$

Solving for \(a_{n}^{m}\) gives us

$$a_{n}^{m} = \frac{n} {m} - \frac{1} {m}\displaystyle\sum\limits_{r=1}^{m-1}rS_{ n}^{r}.$$

Substituting into Eq. (23.4), we finally obtain

$$LI_{n} = \frac{n} {m} - \frac{1} {m}\displaystyle\sum\limits_{r=1}^{m-1}rS_{ n}^{r} +\max\limits_{\stackrel{ 0\leq k_{1}\leq \cdots }{\leq k_{m-1}\leq n}}\left \{S_{k_{1}}^{1} + S_{ k_{2}}^{2} + \cdots + S_{ k_{m-1}}^{m-1}\right \}.$$
(23.5)

As emphasized in [9], Eq. (23.5) is of a purely combinatorial nature or, in more probabilistic terms, is of a pathwise nature. We now proceed to analyze Eq. (23.5) for a binary Markovian sequence.
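
As an elementary illustration (a sketch that is not part of the development above, with hypothetical helper names, and with “increasing” read as nondecreasing, as is customary for words), the following Python snippet evaluates Eq. (23.5) by brute force over all admissible \(0\leq k_{1}\leq \cdots \leq k_{m-1}\leq n\) and compares it with a direct dynamic-programming computation of \(LI_{n}\) on small random words.

```python
import itertools
import random
from fractions import Fraction

def li_direct(word):
    # Longest nondecreasing subsequence, O(n^2) dynamic programming.
    best = [1] * len(word)
    for i in range(len(word)):
        for j in range(i):
            if word[j] <= word[i]:
                best[i] = max(best[i], best[j] + 1)
    return max(best, default=0)

def li_formula(word, m):
    # Eq. (23.5): n/m - (1/m) sum_r r S_n^r + max over 0 <= k_1 <= ... <= k_{m-1} <= n.
    n = len(word)
    a = [[0] * m]                      # a[k][r-1] = occurrences of alpha_r among X_1,...,X_k
    for x in word:
        row = a[-1][:]
        row[x - 1] += 1
        a.append(row)
    S = [[a[k][r] - a[k][r + 1] for r in range(m - 1)] for k in range(n + 1)]
    best = max(sum(S[ks[r]][r] for r in range(m - 1))
               for ks in itertools.combinations_with_replacement(range(n + 1), m - 1))
    return Fraction(n - sum((r + 1) * S[n][r] for r in range(m - 1)), m) + best

m, n = 3, 8
for _ in range(200):
    w = [random.randint(1, m) for _ in range(n)]
    assert li_direct(w) == li_formula(w, m), w
print("Eq. (23.5) matches the direct computation on all samples.")
```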

3 Binary Markovian Alphabet

In the context of binary Markovian alphabets, \((X_{n})_{n\geq 0}\) is described by the following transition probabilities between the two states (which we identify with the two letters α1 and α2): \(\mathbb{P}(X_{n+1} = \alpha _{2}\vert X_{n} = \alpha _{1}) = a\) and \(\mathbb{P}(X_{n+1} = \alpha _{1}\vert X_{n} = \alpha _{2}) = b\), where 0 < a + b < 2. We later examine the degenerate cases \(a = b = 0\) and \(a = b = 1\). In keeping with the common usage within the Markov chain literature, we begin our sequence at n = 0, although our focus will be on n ≥ 1. Denoting by \((p_{n}^{1},p_{n}^{2})\) the vector describing the probability distribution on {α1, α2} at time n, we have

$$\left (\begin{array}{*{10}c} p_{n+1}^{1},p_{n+1}^{2} \end{array} \right ) = \left (\begin{array}{*{10}c} p_{n}^{1},p_{n}^{2} \end{array} \right )\left (\begin{array}{*{10}c} 1 - a& a\\ b &1 - b \end{array} \right ).$$
(23.6)

The eigenvalues of the matrix in Eq. (23.6) are \(\lambda _{1} = 1\) and \(-1 < \lambda _{2} = 1 - a - b < 1\), with respective left eigenvectors \((\pi _{1},\pi _{2}) = (b/(a + b),a/(a + b))\) and (1, − 1). Moreover, (π1, π2) is also the stationary distribution. Given any initial distribution \((p_{0}^{1},p_{0}^{2})\), we find that

$$\left (\begin{array}{*{10}c} p_{n}^{1},p_{n}^{2} \end{array} \right ) = \left (\begin{array}{*{10}c} \pi _{1},\pi _{2} \end{array} \right )+\lambda _{2}^{n}\frac{ap_{0}^{1} - bp_{ 0}^{2}} {a + b} \left (\begin{array}{*{10}c} 1,-1 \end{array} \right ) \rightarrow \left (\begin{array}{*{10}c} \pi _{1},\pi _{2} \end{array} \right ),$$
(23.7)

as \(n \rightarrow \infty \), since \(\vert \lambda _{2}\vert < 1\).
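
As a quick numerical illustration (a sketch only, with hypothetical function names and illustrative parameter values), the following Python snippet iterates Eq. (23.6) directly and compares the result with the closed form of Eq. (23.7), exhibiting the geometric convergence to \((\pi _{1},\pi _{2})\) at rate \(\vert \lambda _{2}{\vert }^{n}\).

```python
def exact_dist(p0, a, b, n):
    # Eq. (23.7): (pi_1, pi_2) + lambda_2^n * (a p0^1 - b p0^2)/(a+b) * (1, -1).
    pi1, pi2 = b / (a + b), a / (a + b)
    lam2 = 1 - a - b
    c = lam2 ** n * (a * p0[0] - b * p0[1]) / (a + b)
    return (pi1 + c, pi2 - c)

def iterated_dist(p0, a, b, n):
    # Direct iteration of Eq. (23.6).
    p = list(p0)
    for _ in range(n):
        p = [p[0] * (1 - a) + p[1] * b, p[0] * a + p[1] * (1 - b)]
    return tuple(p)

a, b, p0 = 0.3, 0.5, (1.0, 0.0)
for n in (1, 5, 25):
    print(n, exact_dist(p0, a, b, n), iterated_dist(p0, a, b, n))
```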

Our goal is now to use these probabilistic expressions to describe the random variables \(Z_{k}^{1}\) and \(S_{k}^{1}\) defined in the previous section. (We retain the redundant superscript “1” in \(Z_{k}^{1}\) and \(S_{k}^{1}\) in the interest of uniformity.)

Setting \(\beta = ap_{0}^{1} - bp_{0}^{2}\), we easily find that

$$\displaystyle\begin{array}{rcl} \mathbb{E}Z_{k}^{1}& = (+1)\left (\pi _{ 1} + \frac{\beta } {a+b}\lambda _{2}^{k}\right ) + (-1)\left (\pi _{ 2} - \frac{\beta } {a+b}\lambda _{2}^{k}\right )& \\ & = \frac{b-a} {a+b} + 2 \frac{\beta } {a+b}\lambda _{2}^{k}, &\end{array}$$
(23.8)

for each 1 ≤ k ≤ n. Thus,

$$\mathbb{E}S_{k}^{1} = \frac{b - a} {a + b}k + 2\left ( \frac{\beta \lambda _{2}} {a + b}\right )\left (\frac{1 - \lambda _{2}^{k}} {1 - \lambda _{2}} \right ),$$
(23.9)

and so \(\mathbb{E}S_{k}^{1}/k \rightarrow (b - a)/(a + b)\), as \(k \rightarrow \infty \).
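
The geometric sums behind Eqs. (23.8) and (23.9) can be checked numerically; the short sketch below (hypothetical helper names, illustrative parameter values) accumulates Eq. (23.8) term by term and compares the result with the closed form of Eq. (23.9).

```python
def EZ(k, a, b, p0):
    # Eq. (23.8): E Z_k^1 = (b-a)/(a+b) + 2*beta/(a+b)*lambda_2^k, with beta = a*p0^1 - b*p0^2.
    lam2, beta = 1 - a - b, a * p0[0] - b * p0[1]
    return (b - a) / (a + b) + 2 * beta / (a + b) * lam2 ** k

def ES_closed(k, a, b, p0):
    # Eq. (23.9): E S_k^1 = (b-a)/(a+b)*k + 2*beta*lambda_2/(a+b)*(1-lambda_2^k)/(1-lambda_2).
    lam2, beta = 1 - a - b, a * p0[0] - b * p0[1]
    return (b - a) / (a + b) * k + 2 * beta * lam2 / (a + b) * (1 - lam2 ** k) / (1 - lam2)

a, b, p0 = 0.3, 0.5, (1.0, 0.0)
for k in (1, 10, 100):
    print(k, sum(EZ(j, a, b, p0) for j in range(1, k + 1)), ES_closed(k, a, b, p0))
```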

Turning to the second moments of \(Z_{k}^{1}\) and \(S_{k}^{1}\), first note that \(\mathbb{E}{(Z_{k}^{1})}^{2} = 1\), since \({(Z_{k}^{1})}^{2} = 1\) a.s. Next, we consider \(\mathbb{E}Z_{k}^{1}Z_{\ell}^{1}\), for \(k < \ell\). Using the Markovian structure of \((X_{n})_{n\geq 0}\), it quickly follows that

$$\displaystyle\begin{array}{rcl} & \mathbb{P}((X_{k},X_{\ell}) = (x_{k},x_{\ell})) & \\ & \qquad = \left \{\begin{array}{@{}l@{\quad }l@{}} \left (\pi _{1} + \lambda _{2}^{\ell-k} \frac{a} {a+b}\right )\left (\pi _{1} + \lambda _{2}^{k} \frac{\beta } {a+b}\right )\!,\quad &\text{if }(x_{k},x_{\ell}) = (\alpha _{1},\alpha _{1}), \\ \left (\pi _{2} - \lambda _{2}^{\ell-k} \frac{a} {a+b}\right )\left (\pi _{1} + \lambda _{2}^{k} \frac{\beta } {a+b}\right )\!,\quad &\text{if }(x_{k},x_{\ell}) = (\alpha _{1},\alpha _{2}), \\ \left (\pi _{1} - \lambda _{2}^{\ell-k} \frac{b} {a+b}\right )\left (\pi _{2} - \lambda _{2}^{k} \frac{\beta } {a+b}\right )\!,\quad &\text{if }(x_{k},x_{\ell}) = (\alpha _{2},\alpha _{1}), \\ \left (\pi _{2} + \lambda _{2}^{\ell-k} \frac{b} {a+b}\right )\left (\pi _{2} - \lambda _{2}^{k} \frac{\beta } {a+b}\right )\!,\quad &\text{if }(x_{k},x_{\ell}) = (\alpha _{2},\alpha _{2}). \end{array} \right.&\end{array}$$
(23.10)

For simplicity, we will henceforth assume that our initial distribution is the stationary one, i.e., \((p_{0}^{1},p_{0}^{2}) = (\pi _{1},\pi _{2})\). (This assumption can be dropped as explained in the Concluding Remarks of [10].) Under this assumption, β = 0, \(\mathbb{E}S_{k}^{1} = k\mu \), where \(\mu = \mathbb{E}Z_{k}^{1} = (b - a)/(a + b)\), and Eq. (23.10) simplifies to

$$\displaystyle\begin{array}{rcl} & \mathbb{P}((X_{k},X_{\ell}) = (x_{k},x_{\ell})) & \\ & \qquad = \left \{\begin{array}{@{}l@{\quad }l@{}} \left (\pi _{1} + \lambda _{2}^{\ell-k} \frac{a} {a+b}\right )\pi _{1},\quad &\text{if }(x_{k},x_{\ell}) = (\alpha _{1},\alpha _{1}), \\ \left (\pi _{2} - \lambda _{2}^{\ell-k} \frac{a} {a+b}\right )\pi _{1},\quad &\text{if }(x_{k},x_{\ell}) = (\alpha _{1},\alpha _{2}), \\ \left (\pi _{1} - \lambda _{2}^{\ell-k} \frac{b} {a+b}\right )\pi _{2},\quad &\text{if }(x_{k},x_{\ell}) = (\alpha _{2},\alpha _{1}), \\ \left (\pi _{2} + \lambda _{2}^{\ell-k} \frac{b} {a+b}\right )\pi _{2},\quad &\text{if }(x_{k},x_{\ell}) = (\alpha _{2},\alpha _{2}). \end{array} \right.&\end{array}$$
(23.11)

We can now compute \(\mathbb{E}Z_{k}^{1}Z_{\ell}^{1}\):

$$\displaystyle\begin{array}{lll}\mathbb{E}Z_{k}^{1}Z_{\ell}^{1}& = \mathbb{P}(Z_{ k}^{1}Z_{\ell}^{1} = +1) - \mathbb{P}(Z_{ k}^{1}Z_{\ell}^{1} = -1)\\ & = \mathbb{P}((X_{k},X_{\ell}) \in \{ (\alpha _{1},\alpha _{1}),(\alpha _{2},\alpha _{2})\})\\ & \quad -\mathbb{P}((X_{k},X_{\ell}) \in \{ (\alpha _{1},\alpha _{2}),(\alpha _{2},\alpha _{1})\})\\ & = \left (\pi _{1}^{2} + \lambda_{2}^{\ell-k} \frac{a} {a+b}\pi _{1} + \pi _{2}^{2} + \lambda _{2}^{\ell-k} \frac{b} {a+b}\pi _{2}\right )\\ & \quad -\left (\pi_{1}\pi _{2} - \lambda _{2}^{\ell-k} \frac{b} {a+b}\pi _{2} + \pi_{1}\pi _{2} - \lambda _{2}^{\ell-k} \frac{a} {a+b}\pi _{1}\right ) \\ & = \left (\pi _{1}^{2} + \pi _{2}^{2} + \frac{2ab}{{(a+b)}^{2}} \lambda _{2}^{\ell-k}\right ) -\left (2\pi _{1}\pi_{2} - \frac{2ab} {{(a+b)}^{2}} \lambda _{2}^{\ell-k}\right ) \\ & =\frac{{(b-a)}^{2}} {{(a+b)}^{2}} + \frac{4ab} {{(a+b)}^{2}} \lambda_{2}^{\ell-k}. \end{array}$$
(23.12)

Hence, recalling that β = 0,

$$\displaystyle\begin{array}{rcl}{ \sigma }^{2} := \mbox{ Var}Z_{ k}^{1}& = 1 -{\left (\frac{b-a} {a+b}\right )}^{2}& \\ & = \frac{4ab} {{(a+b)}^{2}}, &\end{array}$$
(23.13)

for all \(k \geq 1\) and, for \(k < \ell\), the covariance of \(Z_{k}^{1}\) and \(Z_{\ell}^{1}\) is

$$\displaystyle\begin{array}{rcl} \mbox{ Cov}(Z_{k}^{1},Z_{\ell}^{1})& = \frac{{(b-a)}^{2}} {{(a+b)}^{2}} + {\sigma }^{2}\lambda _{2}^{\ell-k} -{\left (\frac{b-a} {a+b}\right )}^{2} = {\sigma }^{2}\lambda _{ 2}^{\ell-k}.&\end{array}$$
(23.14)

Proceeding to the covariance structure of \(S_{k}^{1}\), we first find that

$$\displaystyle\begin{array}{rcl} \mbox{ Var}S_{k}^{1}& =\displaystyle\sum\limits_{ j=1}^{k}\mbox{ Var}Z_{ j}^{1} + 2\displaystyle\sum\limits_{ j<\ell}\mbox{ Cov}(Z_{j}^{1},Z_{ l}^{1})& \\ & = {\sigma }^{2}k + 2{\sigma }^{2}\displaystyle\sum\limits_{j<\ell}\lambda _{2}^{\ell-j} & \\ & = {\sigma }^{2}k + 2{\sigma }^{2}\left (\frac{\lambda _{2}^{k+1}-k\lambda _{ 2}^{2}+(k-1)\lambda _{ 2}} {{(1-\lambda _{2})}^{2}} \right ) & \\ & = {\sigma }^{2}\left (\frac{1+\lambda _{2}} {1-\lambda _{2}} \right )k + 2{\sigma }^{2}\left (\frac{\lambda _{2}(\lambda _{2}^{k}-1)} {{(1-\lambda _{2})}^{2}} \right ). &\end{array}$$
(23.15)

Next, for \(k < \ell\), and using Eqs. (23.14) and (23.15), the covariance of \(S_{k}^{1}\) and \(S_{\ell}^{1}\) is given by

$$\displaystyle\begin{array}{rcl} \mbox{ Cov}(S_{k}^{1},S_{\ell}^{1})& =\displaystyle\sum\limits_{ i=1}^{k}\displaystyle\sum\limits_{ j=1}^{\ell}\mbox{ Cov}(Z_{ i}^{1},Z_{ j}^{1}) & \\ & =\displaystyle\sum\limits_{ i=1}^{k}\mbox{ Var}Z_{i}^{1} + 2\displaystyle\sum\limits_{i<j\leq k}\mbox{ Cov}(Z_{i}^{1},Z_{j}^{1}) +\displaystyle\sum\limits_{ i=1}^{k}\displaystyle\sum\limits_{j=k+1}^{\ell}\mbox{ Cov}(Z_{i}^{1},Z_{j}^{1})& \\ & = \mbox{ Var}S_{k}^{1} +\displaystyle\sum\limits_{ i=1}^{k}\displaystyle\sum\limits_{j=k+1}^{\ell}\mbox{ Cov}(Z_{i}^{1},Z_{j}^{1}) & \\ & = \mbox{ Var}S_{k}^{1} + {\sigma }^{2}\left (\frac{\lambda _{2}(1-\lambda _{2}^{k})(1-\lambda _{ 2}^{\ell-k})} {{(1-\lambda _{2})}^{2}} \right ) & \\ & = {\sigma }^{2}\left (\left (\frac{1+\lambda _{2}} {1-\lambda _{2}} \right )k -\frac{\lambda _{2}(1-\lambda _{2}^{k})(1+\lambda _{ 2}^{\ell-k})} {{(1-\lambda _{2})}^{2}} \right ). &\end{array}$$
(23.16)

From Eqs. (23.15) and (23.16) we see that, as \(k \rightarrow \infty \),

$$\frac{\mbox{ Var}S_{k}^{1}} {k} \rightarrow {\sigma }^{2}\left (\frac{1 + \lambda _{2}} {1 - \lambda _{2}}\right ),$$
(23.17)

and, moreover, as \(k \wedge \ell \rightarrow \infty \),

$$\frac{\mbox{ Cov}(S_{k}^{1},S_{\ell}^{1})} {(k\wedge \ell)} \rightarrow {\sigma }^{2}\left (\frac{1 + \lambda _{2}} {1 - \lambda _{2}}\right ).$$
(23.18)

When a = b, \(\mathbb{E}S_{k}^{1} = 0\), and in Eq. (23.17) the asymptotic variance becomes

$$\displaystyle\begin{array}{rcl} \frac{\mbox{ Var}S_{k}^{1}} {k} & \rightarrow \frac{4{a}^{2}} {{(2a)}^{2}} \left (\frac{1+(1-2a)} {1-(1-2a)}\right )& \\ & = \frac{1} {a} - 1. & \\ \end{array}$$

For small a, we have a “lazy” Markov chain, i.e., a Markov chain which tends to remain in a given state for long periods of time. In this regime, the random variable \(S_{k}^{1}\) has long periods of increase followed by long periods of decrease, and its variance grows linearly with a large constant. If, on the other hand, a is close to 1, the Markov chain rapidly shifts back and forth between α1 and α2, and so the constant associated with the linearly growing variance of \(S_{k}^{1}\) is small.
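
The closed forms of Eqs. (23.15) and (23.16) can also be checked directly against the covariance structure of Eqs. (23.13) and (23.14), since \(\mbox{ Cov}(Z_{i}^{1},Z_{j}^{1}) = {\sigma }^{2}\lambda _{2}^{\vert i-j\vert }\). The following Python sketch (hypothetical helper names, illustrative parameter values) performs this comparison and also displays the asymptotic constant 1∕a − 1 in the case a = b.

```python
def sigma2(a, b):
    # Eq. (23.13): sigma^2 = 4ab/(a+b)^2.
    return 4 * a * b / (a + b) ** 2

def cov_sum(k, l, a, b):
    # Double sum of Eqs. (23.13)-(23.14): Cov(Z_i^1, Z_j^1) = sigma^2 * lambda_2^{|i-j|}.
    lam2 = 1 - a - b
    return sum(sigma2(a, b) * lam2 ** abs(i - j)
               for i in range(1, k + 1) for j in range(1, l + 1))

def var_closed(k, a, b):
    # Last line of Eq. (23.15).
    lam2 = 1 - a - b
    return sigma2(a, b) * ((1 + lam2) / (1 - lam2) * k
                           + 2 * lam2 * (lam2 ** k - 1) / (1 - lam2) ** 2)

def cov_closed(k, l, a, b):
    # Eq. (23.16), for k <= l.
    lam2 = 1 - a - b
    return sigma2(a, b) * ((1 + lam2) / (1 - lam2) * k
                           - lam2 * (1 - lam2 ** k) * (1 + lam2 ** (l - k)) / (1 - lam2) ** 2)

a, b = 0.2, 0.6
for k, l in [(5, 5), (10, 40), (100, 250)]:
    print(k, l, round(cov_sum(k, l, a, b), 10),
          round(var_closed(k, a, b) if k == l else cov_closed(k, l, a, b), 10))
print("a = b = 0.1:", var_closed(2000, 0.1, 0.1) / 2000, "vs 1/a - 1 =", 1 / 0.1 - 1)
```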

As in [9], Brownian functionals play a central rôle in describing the limiting distribution of \(LI_{n}\). To move towards a Brownian functional expression for the limiting law of \(LI_{n}\), define the polygonal function

$$\hat{B}_{n}(t) = \frac{S_{[nt]}^{1} - [nt]\mu } {\sigma \sqrt{n(1 + \lambda _{2 } )/(1 - \lambda _{2 } )}} + \frac{(nt - [nt])(Z_{[nt]+1}^{1} - \mu )} {\sigma \sqrt{n(1 + \lambda _{2 } )/(1 - \lambda _{2 } )}},$$
(23.19)

for 0 ≤ t ≤ 1. In our finite-state, irreducible, aperiodic, stationary Markov chain setting, we may conclude that \(\hat{B}_{n} \Rightarrow B\), where B is a standard Brownian motion on [0, 1], as desired. (See, e.g., the more general settings for Gordin’s martingale approach to dependent invariance principles and the stationary ergodic invariance principle found in Theorem 19.1 of [5].)

Turning now to \(LI_{n}\), we see that for the present 2-letter situation, Eq. (23.5) simply becomes

$$LI_{n} = \frac{n} {2} -\frac{1} {2}S_{n}^{1} +\max\limits_{ 0\leq k\leq n}S_{k}^{1}.$$

To find the limiting distribution of \(LI_{n}\) from this expression, recall that \(\pi _{1} = b/(a + b)\), \(\pi _{2} = a/(a + b)\), \(\mu = \pi _{1} - \pi _{2} = (b - a)/(a + b)\), \({\sigma }^{2} = 4ab/{(a + b)}^{2}\), and that \(\lambda _{2} = 1 - a - b\). Define \(\pi _{\mathrm{max}} =\max \{\pi _{1},\pi _{2}\}\) and \(\tilde{{\sigma }}^{2} = {\sigma }^{2}(1 + \lambda _{2})/(1 - \lambda _{2})\). Rewriting Eq. (23.19) as

$$\hat{B}_{n}(t) = \frac{S_{[nt]}^{1} - [nt]\mu } {\tilde{\sigma }\sqrt{n}} + \frac{(nt - [nt])(Z_{[nt]+1}^{1} - \mu )} {\tilde{\sigma }\sqrt{n}},$$

\(LI_{n}\) becomes

$$\displaystyle\begin{array}{rcl} LI_{n}& = \frac{n} {2} -\frac{1} {2}\left (\tilde{\sigma }\sqrt{n}\hat{B}_{n}(1) + \mu n\right ) +\max\limits_{0\leq t\leq 1}\left (\tilde{\sigma }\sqrt{n}\hat{B}_{n}(t) + \mu nt\right ) & \\ & = n\pi _{2} -\frac{1} {2}\left (\tilde{\sigma }\sqrt{n}\hat{B}_{n}(1)\right ) +\max\limits_{0\leq t\leq 1}\left (\tilde{\sigma }\sqrt{n}\hat{B}_{n}(t) + (\pi _{1} - \pi _{2})nt\right )& \\ & = n\pi _{\mathrm{max}} -\frac{1} {2}\left (\tilde{\sigma }\sqrt{n}\hat{B}_{n}(1)\right ) & \\ & \quad +\max\limits_{0\leq t\leq 1}\left (\tilde{\sigma }\sqrt{n}\hat{B}_{n}(t) + (\pi _{1} - \pi _{2})nt - (\pi _{\mathrm{max}} - \pi _{2})n\right ). &\end{array}$$
(23.20)

This immediately gives

$$\displaystyle\begin{array}{rcl} \frac{LI_{n} - n\pi _{\mathrm{max}}} {\tilde{\sigma }\sqrt{n}} & = -\frac{1} {2}\hat{B}_{n}(1) & \\ & \quad +\max\limits_{0\leq t\leq 1}\left (\hat{B}_{n}(t) + \frac{\sqrt{n}} {\tilde{\sigma }} ((\pi _{1} - \pi _{2})t - (\pi _{\mathrm{max}} - \pi _{2}))\right ).&\end{array}$$
(23.21)

Let us examine Eq. (23.21) on a case-by-case basis. First, if \(\pi _{\mathrm{max}} = \pi _{1} = \pi _{2} = 1/2\), i.e., if a = b, then σ = 1 and \(\tilde{{\sigma }}^{2} = (1 - a)/a\), and so Eq. (23.21) becomes

$$\displaystyle\begin{array}{rcl} \frac{LI_{n} - n/2} {\sqrt{(1 - a)n/a}}& = -\frac{1} {2}\hat{B}_{n}(1) +\max\limits_{0\leq t\leq 1}\hat{B}_{n}(t).&\end{array}$$
(23.22)

Then, by the invariance principle and the continuous mapping theorem,

$$\frac{LI_{n} - n/2} {\sqrt{(1 - a)n/a}} \Rightarrow -\frac{1} {2}B(1) +\max\limits_{0\leq t\leq 1}B(t).$$
(23.23)
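
A Monte Carlo sketch of Eq. (23.23) (illustrative parameter choices, not taken from the paper): for a = b, simulate the chain from its uniform stationary distribution, compute \(LI_{n}\) through the two-letter identity \(LI_{n} = n/2 - S_{n}^{1}/2 +\max _{0\leq k\leq n}S_{k}^{1}\), and compare the scaled statistic with the Brownian functional \(-B(1)/2 +\max _{0\leq t\leq 1}B(t)\), the latter approximated by a fine Gaussian random walk.

```python
import math
import random

def scaled_li_samples(n, a, reps):
    # For a = b: LI_n = n/2 - S_n^1/2 + max_{0<=k<=n} S_k^1, scaled as in Eq. (23.22).
    out = []
    for _ in range(reps):
        x = random.randint(0, 1)              # stationary start: uniform on {alpha_1, alpha_2}
        s = smax = 0
        for _ in range(n):
            if random.random() < a:           # switch states with probability a (= b)
                x = 1 - x
            s += 1 if x == 0 else -1          # Z_k^1 = +1 for alpha_1, -1 for alpha_2
            smax = max(smax, s)
        li = n / 2 - s / 2 + smax
        out.append((li - n / 2) / math.sqrt((1 - a) * n / a))
    return out

def limit_samples(reps, steps=2000):
    # The functional -B(1)/2 + max_{0<=t<=1} B(t), via a Gaussian random walk.
    out = []
    for _ in range(reps):
        b = bmax = 0.0
        for _ in range(steps):
            b += random.gauss(0.0, 1.0) / math.sqrt(steps)
            bmax = max(bmax, b)
        out.append(-b / 2 + bmax)
    return out

def mean_std(v):
    m = sum(v) / len(v)
    return m, math.sqrt(sum((x - m) ** 2 for x in v) / len(v))

lhs, rhs = scaled_li_samples(2000, 0.3, 1000), limit_samples(1000)
print("scaled LI_n : mean %.3f, std %.3f" % mean_std(lhs))
print("limit law   : mean %.3f, std %.3f" % mean_std(rhs))
```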

Next, if πmax = π2 > π1, Eq. (23.21) becomes

$$\displaystyle\begin{array}{rcl} \frac{LI_{n} - n\pi _{\mathrm{max}}} {\tilde{\sigma }\sqrt{n}} & = -\frac{1} {2}\hat{B}_{n}(1) & \\ & \quad +\max\limits_{0\leq t\leq 1}\left (\hat{B}_{n}(t) -\frac{\sqrt{n}} {\tilde{\sigma }} (\pi _{\mathrm{max}} - \pi _{1})t\right ).&\end{array}$$
(23.24)

On the other hand, if πmax = π1 > π2, Eq. (23.21) becomes

$$\displaystyle\begin{array}{rcl} \frac{LI_{n} - n\pi _{\mathrm{max}}} {\tilde{\sigma }\sqrt{n}} & = -\frac{1} {2}\hat{B}_{n}(1) & \\ & \quad +\max\limits_{0\leq t\leq 1}\left (\hat{B}_{n}(t) -\frac{\sqrt{n}} {\tilde{\sigma }} (\pi _{\mathrm{max}} - \pi _{2})(1 - t)\right ) & \\ & = \frac{1} {2}\hat{B}_{n}(1) & \\ & \quad +\max\limits_{0\leq t\leq 1}\left (\hat{B}_{n}(t) -\hat{B}_{n}(1) -\frac{\sqrt{n}} {\tilde{\sigma }} (\pi _{\mathrm{max}} - \pi _{2})(1 - t)\right ).&\end{array}$$
(23.25)

In both Eqs. (23.24) and (23.25) we have a term in our maximal functional which is linear in t or in 1 − t, with a negative slope. We now show, in an elementary fashion, that in both cases, as \(n \rightarrow \infty \), the maximal functional goes to zero in probability.

Consider first Eq. (23.24). Let \(c_{n} = \sqrt{n}(\pi _{\mathrm{max}} - \pi _{1})/\tilde{\sigma } > 0,\) and for any c > 0, let \(M_{c} =\max\limits_{0\leq t\leq 1}(B(t) - ct)\), where B(t) is a standard Brownian motion. Now for n large enough,

$$\hat{B}_{n}(t) - ct \geq \hat{B}_{n}(t) - c_{n}t$$

a.s., for all 0 ≤ t ≤ 1. Then for any z > 0 and n large enough,

$$\displaystyle\begin{array}{rcl} \mathbb{P}(\max\limits_{0\leq t\leq 1}(\hat{B}_{n}(t) - c_{n}t) > z)& \leq \mathbb{P}(\max\limits_{0\leq t\leq 1}(\hat{B}_{n}(t) - ct) > z),&\end{array}$$
(23.26)

and so by the invariance principle and the continuous mapping theorem,

$$\displaystyle\begin{array}{rcl} \mathop{\lim\sup}\limits_{n\rightarrow \infty }\mathbb{P}(\max\limits_{0\leq t\leq 1}(\hat{B}_{n}(t) - c_{n}t) > z)& \leq \lim _{n\rightarrow \infty }\mathbb{P}(\max\limits_{0\leq t\leq 1}(\hat{B}_{n}(t) - ct) > z)& \\ & = \mathbb{P}(M_{c} > z). &\end{array}$$
(23.27)

Now, as is well known, \(\mathbb{P}(M_{c} > z) \rightarrow 0\) as \(c \rightarrow \infty \). One can confirm this intuitive fact with the following simple argument. For z > 0, c > 0, and 0 < ε < 1, we have that

$$\displaystyle\begin{array}{rcl} \mathbb{P}(M_{c} > z)& \leq \mathbb{P}(\max\limits_{0\leq t\leq \varepsilon }(B(t) - ct) > z) + \mathbb{P}(\max\limits_{\varepsilon <t\leq 1}(B(t) - ct) > z)& \\ & \leq \mathbb{P}(\max\limits_{0\leq t\leq \varepsilon }B(t) > z) + \mathbb{P}(\max\limits_{\varepsilon <t\leq 1}(B(t) - c\varepsilon ) > z) & \\ & \leq \mathbb{P}(\max\limits_{0\leq t\leq \varepsilon }B(t) > z) + \mathbb{P}(\max\limits_{0<t\leq 1}B(t) > c\varepsilon + z) & \\ & = 2\left (1 - \Phi \left ( \frac{z} {\sqrt{\varepsilon }}\right )\right ) + 2\left (1 - \Phi (c\varepsilon + z)\right ). &\end{array}$$
(23.28)

But, as c and ε are arbitrary, we can first take the limsup of Eq. (23.28) as \(c \rightarrow \infty \), and then let ε → 0, proving the claim.

We have thus shown that

$$\mathop{\lim\sup}\limits_{n\rightarrow \infty }\mathbb{P}(\max\limits_{0\leq t\leq 1}(\hat{B}_{n}(t) - c_{n}t) > z) \leq 0,$$

and since the expression inside the maximum vanishes at t = 0, so that the maximum is nonnegative, we have

$$\max\limits_{0\leq t\leq 1}(\hat{B}_{n}(t) - c_{n}t)\stackrel{\mathbb{P}}{ \rightarrow }0,$$
(23.29)

as \(n \rightarrow \infty \). Thus, by the continuous mapping theorem and the converging together lemma, we obtain the weak convergence result

$$\frac{LI_{n} - n\pi _{\mathrm{max}}} {\tilde{\sigma }\sqrt{n}} \Rightarrow -\frac{1} {2}B(1).$$
(23.30)

Lastly, consider Eq. (23.25). Here we need simply note the following equality in law, which follows from the stationary and Markovian nature of the underlying sequence \((X_{n})_{n\geq 0}\):

$$\displaystyle\begin{array}{rcl} \hat{B}_{n}(t) -\hat{B}_{n}(1)& -\frac{\sqrt{n}} {\tilde{\sigma }} (\pi _{\mathrm{max}} - \pi _{2})(1 - t) & \\ & \stackrel{\mathcal{L}}{=} -\hat{B}_{n}(1 - t) -\frac{\sqrt{n}} {\tilde{\sigma }} (\pi _{\mathrm{max}} - \pi _{2})(1 - t),&\end{array}$$
(23.31)

for \(t = 0,1/n,\ldots,(n - 1)/n,1\). With a change of variables \((u = 1 - t)\) and noting that B(t) and − B(t) are equal in law, our previous convergence result Eq. (23.29) implies that

$$\displaystyle\begin{array}{rcl} \max\limits_{0\leq t\leq 1}(\hat{B}_{n}(t) -\hat{B}_{n}(1) - c_{n}(1 - t))\stackrel{\mathcal{L}}{=}\max\limits_{0\leq u\leq 1}(-\hat{B}_{n}(u) - c_{n}u)\stackrel{\mathbb{P}}{ \rightarrow }0,& &\end{array}$$
(23.32)

as \(n \rightarrow \infty \). The corresponding weak convergence result is thus

$$\frac{LI_{n} - n\pi _{\mathrm{max}}} {\tilde{\sigma }\sqrt{n}} \Rightarrow \frac{1} {2}B(1).$$
(23.33)

Since B(1) is simply a standard normal random variable, the different signs in Eqs. (23.30) and (23.33) are inconsequential.

Finally, consider the degenerate cases. If either a = 0 or b = 0, then, regardless of the starting state, the sequence \((X_{n})_{n\geq 0}\) is a.s. eventually constant, and so \(LI_{n} \sim n\). On the other hand, if \(a = b = 1\), then the sequence oscillates back and forth between α1 and α2, so that \(LI_{n} \sim n/2\). Combining these trivial cases with the previous development gives

Theorem 3.1.

Let \((X_{n})_{n\geq 0}\) be a 2-state Markov chain, with \(\mathbb{P}(X_{n+1} = \alpha _{2}\vert X_{n} = \alpha _{1}) = a\) and \(\mathbb{P}(X_{n+1} = \alpha _{1}\vert X_{n} = \alpha _{2}) = b\). Let the law of \(X_{0}\) be the invariant distribution \((\pi _{1},\pi _{2}) = (b/(a + b),a/(a + b))\), for 0 < a + b ≤ 2, and be \((\pi _{1},\pi _{2}) = (1,0)\), for \(a = b = 0\). Then, for a = b > 0,

$$\frac{LI_{n} - n/2} {\sqrt{n}} \Rightarrow \sqrt{\frac{1 - a} {a}} \left (-\frac{1} {2}B(1) +\max\limits_{0\leq t\leq 1}B(t)\right )\!,$$
(23.34)

where \((B(t))_{0\leq t\leq 1}\) is a standard Brownian motion. For \(a\neq b\) or \(a = b = 0\),

$$\frac{LI_{n} - n\pi _{\mathrm{max}}} {\sqrt{n}} \Rightarrow N(0,\tilde{{\sigma }}^{2}/4),$$
(23.35)

with \(\pi _{\mathrm{max}} =\max \{\pi _{1},\pi _{2}\}\) and where \(N(0,\tilde{{\sigma }}^{2}/4)\) is a centered normal random variable with variance \(\tilde{{\sigma }}^{2}/4 = ab(2 - a - b)/{(a + b)}^{3}\), for \(a\neq b\), and \(\tilde{{\sigma }}^{2} = 0\), for \(a = b = 0\). (If \(a = b = 1\) or \(\tilde{{\sigma }}^{2} = 0\), then the distributions in Eqs. (23.34) and (23.35), respectively, are understood to be degenerate at the origin.)
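
A Monte Carlo sanity check of Eq. (23.35) (a sketch with illustrative parameter choices): for a ≠ b, sample the stationary chain, compute \(LI_{n}\) through the two-letter identity, and compare the sample standard deviation of \((LI_{n} - n\pi _{\mathrm{max}})/\sqrt{n}\) with \(\tilde{\sigma }/2 = \sqrt{ab(2 - a - b)/{(a + b)}^{3}}\).

```python
import math
import random

def li_sample(n, a, b):
    # One realization of LI_n for the stationary 2-letter chain, via the identity
    # LI_n = n/2 - S_n^1/2 + max_{0<=k<=n} S_k^1.
    pi1 = b / (a + b)
    x = 0 if random.random() < pi1 else 1      # 0 <-> alpha_1, 1 <-> alpha_2
    s = smax = 0
    for _ in range(n):
        if x == 0:
            x = 1 if random.random() < a else 0
        else:
            x = 0 if random.random() < b else 1
        s += 1 if x == 0 else -1
        smax = max(smax, s)
    return n / 2 - s / 2 + smax

a, b, n, reps = 0.2, 0.6, 4000, 1000
pi_max = max(a, b) / (a + b)
vals = [(li_sample(n, a, b) - n * pi_max) / math.sqrt(n) for _ in range(reps)]
mean = sum(vals) / reps
std = math.sqrt(sum((v - mean) ** 2 for v in vals) / reps)
print("sample mean %.3f, sample std %.3f, sigma_tilde/2 = %.3f"
      % (mean, std, math.sqrt(a * b * (2 - a - b) / (a + b) ** 3)))
```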

To extend this result to the entire RSK Young diagrams, let us introduce the following notation. By

$$(Y _{n}^{(1)},Y _{ n}^{(2)},\ldots,Y _{ n}^{(k)}) \Rightarrow (Y _{ \infty }^{(1)},Y _{ \infty }^{(2)},\ldots,Y _{ \infty }^{(k)}),$$
(23.36)

we shall indicate the weak convergence of the joint law of the k-vector \((Y _{n}^{(1)},Y _{n}^{(2)},\ldots,Y _{n}^{(k)})\) to that of \((Y _{\infty }^{(1)},Y _{\infty }^{(2)},\ldots,Y _{\infty }^{(k)})\), as \(n \rightarrow \infty \). Since \(LI_{n}\) is the length of the top row of the associated Young diagrams, the length of the second row is simply \(n - LI_{n}\). Denoting the length of the ith row by \(R_{n}^{i}\), in the notation of Eq. (23.36), an application of the Cramér–Wold theorem recovers the result of Chistyakov and Götze [7] as part of the following easy corollary, which is in fact equivalent to Theorem 3.1:

Corollary 3.1.

For the sequence in Theorem 3.1, if a = b > 0, then

$$\left (\frac{R_{n}^{1} - n/2} {\sqrt{n}}, \frac{R_{n}^{2} - n/2} {\sqrt{n}} \right ) \Rightarrow Y _{\infty } := (R_{\infty }^{1},R_{ \infty }^{2}),$$
(23.37)

where the law of \(Y _{\infty }\) is supported on the second main diagonal of \({\mathbb{R}}^{2}\), with

$$R_{\infty }^{1}\stackrel{\mathcal{L}}{=}\sqrt{\frac{1 - a} {a}} \left (-\frac{1} {2}B(1) +\max\limits_{0\leq t\leq 1}B(t)\right ).$$

If \(a\neq b\) or \(a = b = 0\), then setting \(\pi _{\mathrm{min}} =\min \{\pi _{1},\pi _{2}\}\), we have

$$\left (\frac{R_{n}^{1} - n\pi _{\mathrm{max}}} {\sqrt{n}}, \frac{R_{n}^{2} - n\pi _{\mathrm{min}}} {\sqrt{n}} \right ) \Rightarrow N((0,0),\tilde{\Sigma }),$$
(23.38)

where \(\tilde{\Sigma }\) is the covariance matrix

$$(\tilde{{\sigma }}^{2}/4)\left (\begin{array}{*{10}c} 1 &-1 \\ -1& 1 \end{array} \right ),$$

where \(\tilde{{\sigma }}^{2} = 4ab(2 - a - b)/{(a + b)}^{3}\) , for a≠b, and \(\tilde{{\sigma }}^{2} = 0\) , for \(a = b = 0\) .

Remark 3.1.

The joint distributions in Eqs. (23.37) and (23.38) are of course degenerate, in that the sum of the two components is a.s. identically zero in each case. In Eq. (23.37), the density of the first component \(R_{\infty }^{1}\) of \(Y _{\infty }\) is easy to find and is given by (see, e.g., [11])

$$f(y) = \frac{16} {\sqrt{2\pi }}{\left ( \frac{a} {1 - a}\right )}^{3/2}{y}^{2}{\mathrm{e}}^{-2a{y}^{2}/(1-a) },\qquad y \geq 0.$$
(23.39)

As in Chistyakov and Götze [7], Eq. (23.37) can then be stated as follows: for any bounded, continuous function \(g : {\mathbb{R}}^{2} \rightarrow \mathbb{R}\),

$$\displaystyle\begin{array}{rcl} & \lim _{n\rightarrow \infty }\mathbb{E}\left (g\left ( \frac{R_{n}^{1}-n/2} {\sqrt{(1-a)n/a}}, \frac{R_{n}^{2}-n/2} {\sqrt{(1-a)n/a}}\right )\right ) & \\ & \qquad = 2\sqrt{2\pi }\displaystyle\int _{0}^{\infty }g(x,-x)\phi _{\mathrm{GUE},2}(x,-x)\mathrm{d}x,& \\ \end{array}$$

where ϕGUE, 2 is the density of the eigenvalues of the 2 ×2 GUE and is given by

$$\phi _{\mathrm{GUE},2}(x_{1},x_{2}) = \frac{1} {\pi }{(x_{1} - x_{2})}^{2}{\mathrm{e}}^{-(x_{1}^{2}+x_{ 2}^{2}) }.$$

To see the GUE connection more explicitly, consider the 2 ×2 traceless GUE matrix

$$M_{0} = \left (\begin{array}{*{10}c} X_{1} & Y + iZ \\ Y - iZ & X_{2} \end{array} \right ),$$

where \(X_{1}\), \(X_{2}\), Y, and Z are centered normal random variables. Since \(\mbox{ Corr}(X_{1},X_{2}) = -1\), the largest eigenvalue of \(M_{0}\) is

$$\lambda _{1,0} = \sqrt{X_{1 }^{2 } + {Y }^{2 } + {Z}^{2}},$$

almost surely, so that \(\lambda _{1,0}^{2} \sim \chi _{3}^{2}\) if \(\mbox{ Var}X_{1} = \mbox{ Var}Y = \mbox{ Var}Z = 1\). Hence, up to a scaling factor, the density of \(\lambda _{1,0}\) is given by Eq. (23.39). Next, let us perturb \(M_{0}\) to

$$M = \alpha GI + \beta M_{0},$$

where α and β are constants, G is a standard normal random variable independent of \(M_{0}\), and I is the identity matrix. The covariance of the diagonal elements of M is then computed to be \(\rho := {\alpha }^{2} - {\beta }^{2}\). Hence, to obtain a desired value of ρ, we may take \(\alpha = \sqrt{(1 + \rho )/2}\) and \(\beta = \sqrt{(1 - \rho )/2}\). Clearly, the largest eigenvalue of M can then be expressed as

$$\lambda _{1} = \sqrt{\frac{1 + \rho } {2}} G + \sqrt{\frac{1 - \rho } {2}} \lambda _{1,0}.$$
(23.40)

At one extreme, \(\rho = -1\), we recover \(\lambda _{1} = \lambda _{1,0}\). At the other extreme, ρ = 1, we obtain \(\lambda _{1} = G\). Midway between these two extremes, at ρ = 0, we have a standard GUE matrix, so that

$$\lambda _{1} = \sqrt{\frac{1} {2}}\left (G + \lambda _{1,0}\right ).$$
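
As a final numerical sanity check of Eq. (23.40) (a sketch only, assuming unit variances for \(X_{1}\), Y, and Z, and using NumPy’s eigvalsh), one can build M for a few draws of \((X_{1},Y,Z,G)\) and compare its numerically computed largest eigenvalue with \(\alpha G + \beta \lambda _{1,0}\).

```python
import numpy as np

rng = np.random.default_rng(0)

def check(rho, reps=3):
    alpha, beta = np.sqrt((1 + rho) / 2), np.sqrt((1 - rho) / 2)
    for _ in range(reps):
        x1, y, z, g = rng.standard_normal(4)
        m0 = np.array([[x1, y + 1j * z],
                       [y - 1j * z, -x1]])        # traceless: X_2 = -X_1
        m = alpha * g * np.eye(2) + beta * m0     # M = alpha*G*I + beta*M_0
        lam_num = np.linalg.eigvalsh(m)[-1]       # largest eigenvalue, computed numerically
        lam_formula = alpha * g + beta * np.sqrt(x1 ** 2 + y ** 2 + z ** 2)   # Eq. (23.40)
        print(f"rho={rho:+.1f}  numeric={lam_num:.6f}  formula={lam_formula:.6f}")

for rho in (-1.0, 0.0, 0.5):
    check(rho)
```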