1 Introduction

Donsker-type functional limit theorems represent one of the key developments in probability theory. They express invariance principles for rescaled random walks of the form

$$\begin{aligned} S_{\lfloor nt \rfloor } = X_1 + \cdots + X_{\lfloor nt \rfloor }, \quad t\in [0,1]\,. \end{aligned}$$
(1.1)

Many extensions of the original invariance principle exist, most notably allowing dependence between the steps \(X_i\), or showing, as Skorohod did, that non-Gaussian limits are possible if the steps \(X_i\) have infinite variance. For a survey of invariance principles for dependent variables in the domain of attraction of the Gaussian law we refer to [24]; see also [10] for a thorough survey of mixing conditions. In the case of a non-Gaussian limit, the limit of the processes \((S_{\lfloor nt \rfloor })_{t\in [0,1]}\) is in general not a continuous process. Hence, limit theorems of this type are placed in the space of càdlàg functions, denoted by D([0, 1]), under one of the Skorohod topologies. The topology denoted by \(J_1\) is the most widely used (often implicitly) and suitable for i.i.d. steps, but over the years many theorems involving dependent steps have been proved in other Skorohod topologies. Even in the case of a simple m-dependent linear process with a regularly varying marginal distribution, it is known that the limit theorem cannot hold in the standard \(J_1\) topology, see Avram and Taqqu [2]. Moreover, there are examples of such processes for which none of the Skorohod topologies work, see Sect. 4.

However, as we found out, for all those processes and many other stochastic models relevant in applications, random walks do converge, but their limit exists in an entirely different space. To describe the elements of such a space we use the concept of decorated càdlàg functions and denote the corresponding space by E([0, 1]), following Whitt [34]; see Sect. 4. The presentation of this new type of limit theorem is the main goal of our article. For the statement of our main result see Theorem 4.5 in Sect. 4. As a related goal, we also study the running maximum of the random walk \(S_{\lfloor nt \rfloor }\), for which, due to monotonicity, the limit theorem can still be expressed in the familiar space D([0, 1]).

Our main analytical tool is the limit theory for point processes in a certain non-locally compact space which is designed to preserve the order of the observations as we rescale time to the interval [0, 1] as in (1.1). Observe that due to this scaling, successive observations collapse in the limit to the same time instance. As the first result in this context, we prove in Sect. 2 a limit theorem related to large deviations results of the type shown recently in [28] (cf. also [19]) and offer an alternative probabilistic interpretation of these results. Using our setup, we can group successive observations in the sequence \(\{X_i,\, i =1,\ldots , n\}\) into nonoverlapping clusters of increasing size to define a point process which completely preserves the information about the order among the observations. This allows us to show in a rather straightforward manner that the empirical point processes constructed in this way converge in distribution towards a Poisson point process on an appropriate state space. The corresponding theorem can arguably be considered the key result of the paper. It motivates all the theorems in the later sections and extends the point process limit theorems in [5, 14], see Sect. 3.

Additionally, our method allows for the analysis of records and record times in a sequence of dependent stationary observations \(X_i\). By a classical result of Rényi, the number of records among the first n i.i.d. observations from a continuous distribution grows logarithmically in n. Moreover, it is known (see e.g. [30]) that the record times rescaled by n tend to the so-called scale invariant Poisson process, which plays a fundamental role in several areas of probability, see [1]. For a stationary sequence with an arbitrary continuous marginal distribution, we show that the record times converge to a relatively simple compound Poisson process under certain restrictions on the dependence. This form of the limit reflects the fact that for dependent sequences records tend to come in clusters, as one typically observes in many natural situations. This is the content of Sect. 5. Finally, proofs of certain technical auxiliary results are postponed to Sect. 6. In the rest of the introduction, we formally present the main ingredients of our model.

We now introduce our main assumptions and notation. Let \(\Vert \cdot \Vert \) denote an arbitrary norm on \(\mathbb {R}^d\) and let \(\mathbb {S}^{d-1}\) be the corresponding unit sphere. Recall that a d-dimensional random vector \(\varvec{X}\) is regularly varying with index \(\alpha > 0\) if there exists a random vector \(\varvec{\varTheta } \in \mathbb {S}^{d-1}\) such that

$$\begin{aligned} \frac{1}{\mathbb {P}(\Vert \varvec{X}\Vert> x)} \mathbb {P}(\Vert \varvec{X}\Vert > ux, \varvec{X}/ \Vert \varvec{X}\Vert \in \cdot )\Rightarrow u^{-\alpha } \mathbb {P}(\varvec{\varTheta }\in \cdot ), \end{aligned}$$
(1.2)

for every \(u>0\) as \(x \rightarrow \infty \), where \(\Rightarrow \) denotes the weak convergence of measures, here on \(\mathbb {S}^{d-1}\). An \(\mathbb {R}^d\)-valued time series is regularly varying if all the finite-dimensional vectors \((X_k,\ldots ,X_l)\), \(k\le l \in \mathbb {Z}\), are regularly varying, see [14] for instance. We will consider a stationary regularly varying process \(\{X_t,\, t\in \mathbb {Z}\}\). The regular variation of the marginal distribution implies that there exists a sequence \(\{a_n\}\) which for all \(x>0\) satisfies

$$\begin{aligned} n\mathbb {P}(\Vert X_0\Vert > a_n x)\rightarrow x^{-\alpha }. \end{aligned}$$
(1.3)

If \(d=1\), it is known that

$$\begin{aligned} n\mathbb {P}(X_0 /a_n \in \cdot ) {\mathop {\longrightarrow }\limits ^{v}} \mu , \end{aligned}$$
(1.4)

where \({\mathop {\longrightarrow }\limits ^{v}}\) denotes vague convergence on \(\mathbb {R}{\setminus } \{0\}\) with the measure \(\mu \) on \(\mathbb {R}{\setminus }\{0\}\) given by

$$\begin{aligned} \mu (dy)=p\alpha y^{-\alpha -1} \mathbb {1}_{(0,\infty )}(y) \mathrm {d}y + (1-p)\alpha (-y)^{-\alpha -1} \mathbb {1}_{(-\infty ,0)}(y) \mathrm {d}y \end{aligned}$$
(1.5)

for some \(p\in [0,1]\).
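Relations (1.3)–(1.5) are easy to probe numerically. The following Python sketch (our own illustration, not part of the original argument) simulates an i.i.d. two-sided Pareto sample, for which \(a_n=n^{1/\alpha }\) is a valid normalization, and checks (1.3) with n replaced by a smaller sample size m so that the empirical counts are stable; the values of \(\alpha \), p, n and m are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, p, n, m = 1.5, 0.7, 10**6, 10**3

# two-sided Pareto: P(X > x) = p x^{-alpha}, P(X < -x) = (1-p) x^{-alpha}, x >= 1
sign = np.where(rng.random(n) < p, 1.0, -1.0)
X = sign * (rng.pareto(alpha, n) + 1)      # |X| is Pareto(alpha) on [1, inf)

# check (1.3) for sample size m, estimating the tail from the n simulated values
a_m = m ** (1 / alpha)
for x in (0.5, 1.0, 2.0):
    print(x, m * np.mean(np.abs(X) > a_m * x), x ** (-alpha))
```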

According to [6], the regular variation of the stationary sequence \(\{X_t\}\) is equivalent to the existence of an \(\mathbb {R}^d\)-valued time series \(\{Y_t,\, t\in \mathbb {Z}\}\), called the tail process, which satisfies \(\mathbb {P}(\Vert Y_0\Vert > y)=y^{-\alpha }\) for \(y \ge 1\) and, as \(x\rightarrow \infty \),

$$\begin{aligned} \mathcal {L}\left( \{x^{-1}X_t\}_{t\in \mathbb {Z}} \,\big |\, \Vert X_0\Vert > x \right) {\mathop {\longrightarrow }\limits ^{\mathrm {fi.di.}}} \mathcal {L}\left( \{Y_t\}_{t\in \mathbb {Z}} \right) , \end{aligned}$$
(1.6)

where \({\mathop {\longrightarrow }\limits ^{\mathrm {fi.di.}}}\) denotes convergence of finite-dimensional distributions. Moreover, the so-called spectral tail process \(\{\varTheta _t,\, t\in \mathbb {Z}\}\), defined by \(\varTheta _t = Y_t/\Vert Y_0\Vert \), \(t \in \mathbb {Z}\), turns out to be independent of \(\Vert Y_0\Vert \) and satisfies

$$\begin{aligned} \mathcal {L}\left( \{X_t/\Vert X_0\Vert \}_{t\in \mathbb {Z}} \,\big |\, \Vert X_0\Vert > x \right) {\mathop {\longrightarrow }\limits ^{\mathrm {fi.di.}}} \mathcal {L}\left( \{\varTheta _t\}_{t\in \mathbb {Z}} \right) \end{aligned}$$
(1.7)

as \(x\rightarrow \infty \). If \(d=1\), it follows that p from (1.5) satisfies \(p=\mathbb {P}(\varTheta _0=1)=1-\mathbb {P}(\varTheta _0=-1)\).

We will often assume in addition that the following condition, referred to as the anticlustering or finite mean cluster length condition, holds.

Assumption 1.1

There exists a sequence of integers \((r_n)_{n\in \mathbb {N}}\) such that \(\lim _{n\rightarrow \infty } r_n =\lim _{n\rightarrow \infty } n/r_n =\infty \) and for every \(u > 0\),

$$\begin{aligned} \lim _{m \rightarrow \infty } \limsup _{n \rightarrow \infty } \mathbb {P}\biggl ( \max _{m \le |i| \le r_{n}} \Vert X_{i}\Vert> a_n u\,\bigg |\,\Vert X_{0}\Vert >a_n u \biggr ) = 0. \end{aligned}$$
(1.8)

There are many time series satisfying the conditions above, including several nonlinear models like stochastic volatility or GARCH (see [26, Section 4.4]).

In the sequel, an important role will be played by the quantity \(\theta \) defined by

$$\begin{aligned} \theta = \mathbb {P}\left( \sup _{t\ge 1} \Vert Y_t\Vert \le 1\right) . \end{aligned}$$
(1.9)

It was shown in [6, Proposition 4.2] that Assumption 1.1 implies that \(\theta >0\).

2 Asymptotics of clusters

Let \(l_0\) be the space of double-sided \(\mathbb {R}^d\)-valued sequences converging to zero at both ends, i.e. \(l_0=\{\varvec{x}=(x_i)_{i\in \mathbb {Z}} \in (\mathbb {R}^d)^{\mathbb {Z}} : \lim _{|i|\rightarrow \infty } \Vert x_i\Vert =0\}\). On \(l_0\) consider the uniform norm

$$\begin{aligned} \Vert \varvec{x}\Vert _\infty = \sup _{i\in \mathbb {Z}} \Vert x_i\Vert , \end{aligned}$$

which makes \(l_0\) a separable Banach space. Indeed, \(l_0\) is the closure of the set of all double-sided rational sequences with finitely many nonzero terms in the Banach space of all bounded double-sided sequences. Define the shift operator \(B\) on \(l_0\) by \((B\varvec{x})_i = x_{i+1}\) and introduce an equivalence relation \(\sim \) on \(l_0\) by letting \(\varvec{x}\sim \varvec{y}\) if \(\varvec{y}=B^k\varvec{x}\) for some \(k\in \mathbb {Z}\). In the sequel, we consider the quotient space

$$\begin{aligned} \tilde{l}_0= l_0/\sim , \end{aligned}$$

and define a function \(\tilde{d}:\tilde{l}_0\times \tilde{l}_0\longrightarrow [0,\infty )\) by

$$\begin{aligned} \tilde{d}(\tilde{\varvec{x}},\tilde{\varvec{y}}) = \inf \{\Vert \varvec{x}'-\varvec{y}'\Vert _\infty :\varvec{x}'\in \tilde{\varvec{x}},\varvec{y}'\in \tilde{\varvec{y}}\} = \inf \{\Vert B^k\varvec{x}-B^l\varvec{y}\Vert _\infty :k,l\in \mathbb {Z}\}, \end{aligned}$$

for all \(\tilde{\varvec{x}},\tilde{\varvec{y}}\in \tilde{l}_0\), and all \(\varvec{x}\in \tilde{\varvec{x}},\varvec{y}\in \tilde{\varvec{y}}\). The proof of the following result can be found in Sect. 6.

Lemma 2.1

The function \(\tilde{d}\) is a metric which makes \(\tilde{l}_0\) a separable and complete metric space.

One can naturally embed the set \(\bigcup _{n\ge 1} (\mathbb {R}^d)^n \cup l_0\) into \(\tilde{l}_0\) by mapping \(\varvec{x}\in l_0\) to its equivalence class and an arbitrary finite sequence \(\varvec{x}=(x_1,\ldots , x_n) \in (\mathbb {R}^d)^n\) to the equivalence class of the sequence

$$\begin{aligned} (\ldots ,0,0,\varvec{x},0,0,\ldots ), \end{aligned}$$

obtained by padding \(\varvec{x}\) with zeros in front and after it.
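For finitely supported sequences (say with \(d=1\)), the infimum in the definition of \(\tilde{d}\) is attained among the finitely many relative shifts at which the two supports overlap; every other shift gives the maximum of the two sup-norms. A minimal Python sketch of this computation (the helper dist_tilde and its input convention are ours, purely for illustration):

```python
import numpy as np

def dist_tilde(x, y):
    """Shift-invariant sup-norm distance between the classes in l~_0 of two
    finitely supported real sequences, given by their (finite) blocks."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    # shifts with disjoint supports give the maximum of the two sup-norms
    best = max(np.abs(x).max(initial=0.0), np.abs(y).max(initial=0.0))
    for s in range(-len(y), len(x) + 1):   # shifts of y with possible overlap
        lo, hi = min(0, s), max(len(x), s + len(y))
        xx, yy = np.zeros(hi - lo), np.zeros(hi - lo)
        xx[-lo:-lo + len(x)] = x
        yy[s - lo:s - lo + len(y)] = y
        best = min(best, np.abs(xx - yy).max())
    return best

print(dist_tilde([5, 1, 0, 2], [0, 5, 1, 0, 2]))  # 0.0: same class in l~_0
print(dist_tilde([3, 1], [1, 3]))                 # 1.0: best shift aligns the 3s
```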

Let \(\{Z_t,\, t\in \mathbb {Z}\}\) be a sequence distributed as the tail process \(\{Y_t\}\) conditionally on the event \(\{\sup _{i\le -1} \Vert Y_i\Vert \le 1\}\) which, under Assumption 1.1, has a strictly positive probability (cf. [6, Proposition 4.2]). More precisely,

$$\begin{aligned} \mathcal {L}\left( \{Z_t\}_{t\in \mathbb {Z}} \right) = \mathcal {L}\left( \{Y_t\}_{t\in \mathbb {Z}} \,\big |\, \sup _{i\le -1} \Vert Y_i\Vert \le 1 \right) . \end{aligned}$$
(2.1)

Since (1.8) implies that \(\mathbb {P}(\lim _{|t|\rightarrow \infty }\Vert Y_t\Vert = 0)=1\), see [6, Proposition 4.2], the sequence \(\{Z_t\}\) in (2.1) can be viewed as a random element in \(l_0\) and \(\tilde{l}_0\) in a natural way. In particular, the random variable

$$\begin{aligned} L_Z=\sup _{j\in \mathbb {Z}} \Vert Z_j\Vert , \end{aligned}$$

is a.s. finite and not smaller than 1 since \(\mathbb {P}(\Vert Y_0\Vert > 1)=1\). Due to regular variation and (1.8) one can show (see [7]) that for \(v\ge 1\)

$$\begin{aligned} \mathbb {P}(L_Z>v)= v^{-\alpha }. \end{aligned}$$

One can also define a new random element \(\varvec{Q}=\{Q_t,\, t\in \mathbb {Z}\}\) of \(\tilde{l}_0\) as the equivalence class of

$$\begin{aligned} Q_t = Z_t / L_Z,\quad t\in \mathbb {Z}. \end{aligned}$$
(2.2)

Consider now a block of observations \((X_1,\ldots , X_{r_n})\) and define \(M_{r_n} = \max _{1\le i \le r_n} \Vert X_i\Vert \). It turns out that, conditionally on the event \(\{M_{r_n} > a_n u\}\), the law of such a block (rescaled by \(a_nu\)) has a limit, and that \(L_Z\) and \(\{Q_t\}\) are independent.

Theorem 2.2

Under Assumption 1.1, for every \(u >0\),

$$\begin{aligned} \mathcal {L}\left( \frac{(X_1,\ldots , X_{r_n})}{a_nu} \,\bigg |\, M_{r_n} > a_nu \right) {\mathop {\longrightarrow }\limits ^{d}}\mathcal {L}\left( \{Z_t\} \right) \end{aligned}$$

as \(n \rightarrow \infty \) in \(\tilde{l}_0\). Moreover, \(\{Q_t\}\) and \(L_Z\) in (2.2) are independent random elements with values in \(\tilde{l}_0\) and \([0,\infty )\) respectively.

Proof

Step 1. We write \(\varvec{X}_n(i,j) =(X_{i},\ldots ,X_j)/a_nu\), \(\varvec{Y}(i,j)=(Y_{i},\ldots ,Y_{j})\) and \(M_{k,l}=\max _{k\le i\le l} \Vert X_i\Vert \), \(M_{k,l}^Y=\max _{k\le i\le l}\Vert Y_i\Vert \). By the Portmanteau theorem [8, Theorem 2.1], it suffices to prove that

$$\begin{aligned} \lim _{n\rightarrow \infty } \mathbb {E}\left[ g(\varvec{X}_n(1,r_n)) \mid M_{1,r_n} > a_nu \right] = \mathbb {E}\left[ g(\{Z_t\}) \right] , \end{aligned}$$
(2.3)

for every nonnegative, bounded and uniformly continuous function g on \((\tilde{l}_0,\tilde{d})\).

Define the truncation \(\tilde{\varvec{x}}_\zeta \) at level \(\zeta \) of \(\tilde{\varvec{x}}\in \tilde{l}_0\) by setting to zero all the coordinates of \(\tilde{\varvec{x}}\) whose norm does not exceed \(\zeta \); that is, \(\tilde{\varvec{x}}_\zeta \) is the equivalence class of \((x_i\mathbb {1}_{\Vert x_i\Vert >\zeta })_{i\in \mathbb {Z}}\), where \(\varvec{x}\) is a representative of \(\tilde{\varvec{x}}\). Note that by definition, \(\tilde{d}(\tilde{\varvec{x}},\tilde{\varvec{x}}_\zeta )\le \zeta \).

For a function g on \(\tilde{l}_0\), define \(g_\zeta \) by \(g_\zeta (\tilde{\varvec{x}})=g(\tilde{\varvec{x}}_\zeta )\). If g is uniformly continuous, then for each \(\eta >0\) there exists \(\zeta >0\) such that \(|g(\tilde{\varvec{x}})-g(\tilde{\varvec{y}})|\le \eta \) whenever \(\tilde{d}(\tilde{\varvec{x}},\tilde{\varvec{y}})\le \zeta \), that is, \(\Vert g-g_\zeta \Vert _\infty \le \eta \). Thus it is sufficient to prove (2.3) for \(g_\zeta \) with \(\zeta \in (0,1)\).

One can now follow the steps of the proof of [6, Theorem 4.3]. Decompose the event \(\{M_{1,r_n} > a_nu\}\) according to the smallest \(j\in \{1,\ldots ,r_n\}\) such that \(\Vert X_j\Vert >a_nu.\) We have

$$\begin{aligned}&\mathbb {E}[g_\zeta (\varvec{X}_n(1,r_n)) ; M_{1,r_n} > a_nu ]\nonumber \\&\quad =\sum _{j=1}^{r_n} \mathbb {E}\left[ g_\zeta (\varvec{X}_n(1,r_n)) ; M_{1,j-1}\le a_nu<\Vert X_j\Vert \right] . \end{aligned}$$
(2.4)

Fix a positive integer m and let n be large enough so that \(r_n\ge 2m+1.\) By the definition of \(g_\zeta \), for all \(j\in \{m+1,\ldots ,r_n-m\}\) we have that

$$\begin{aligned} M_{1,j-m-1} \vee M_{j+m+1,r_n} \le a_nu \zeta \Rightarrow g_\zeta (\varvec{X}_n(1,r_n)) = g_\zeta (\varvec{X}_n(j-m,j+m)). \end{aligned}$$
(2.5)

The proof is now exactly along the same lines as the proof of [6, Theorem 4.3] and we omit some details. Using stationarity, the decomposition (2.4), the relation (2.5) and the boundedness of g, we have,

$$\begin{aligned}&\left| \mathbb {E}\left[ g_\zeta (\varvec{X}_n(1,r_n)) \mathbb {1}{\left\{ M_{1,r_n}> a_nu\right\} } \right] \right. \nonumber \\&\qquad \left. - r_n\mathbb {E}\left[ g_\zeta (\varvec{X}_n(-m,m)) \mathbb {1}{\left\{ M_{-m,-1} \le a_nu\right\} }\mathbb {1}{\left\{ \Vert X_0\Vert> a_nu\right\} } \right] \right| \nonumber \\&\quad \le 2m \Vert g\Vert _\infty \mathbb {P}(\Vert X_0\Vert>a_nu) + r_n\Vert g\Vert _\infty \mathbb {P}(M_{-r_n,-m-1} \nonumber \\&\qquad \vee M_{m+1,r_n}> a_nu; \Vert X_0\Vert >a_nu). \end{aligned}$$
(2.6)

Next define \(\theta _n = \mathbb {P}(M_{1,r_n}>a_nu) / \{r_n\mathbb {P}(\Vert X_0\Vert >a_nu)\}\). Under Assumption 1.1,

$$\begin{aligned} \lim _{n\rightarrow \infty } \theta _n = \mathbb {P}(\sup _{i\ge 1} \Vert Y_i\Vert \le 1) = \theta , \end{aligned}$$
(2.7)

where \(\theta \) was defined in (1.9). See [6, Proposition 4.2]. Therefore, by Assumption 1.1, (2.6) and (2.7) we conclude that

$$\begin{aligned}&\lim _{m\rightarrow \infty }\limsup _{n\rightarrow \infty }\bigg |\mathbb {E}[g_\zeta (\varvec{X}_n(1,r_n)) \mid M_{1,r_n}> a_nu ]\nonumber \\&\quad -\frac{1}{\theta _n}\mathbb {E}[g_\zeta (\varvec{X}_n(-m,m)) ; M_{-m,-1} \le a_nu \mid \Vert X_0\Vert >a_nu ] \bigg |=0. \end{aligned}$$
(2.8)

We now argue that, for every \(m\ge 1\),

$$\begin{aligned}&\lim _{n\rightarrow \infty } \mathbb {E}[g_\zeta (\varvec{X}_n(-m,m)); M_{-m,-1} \le a_nu \mid \Vert X_0\Vert > a_nu] \nonumber \\&\quad = \mathbb {E}[g_\zeta (\varvec{Y}(-m,m)); M_{-m,-1}^Y \le 1]. \end{aligned}$$
(2.9)

First observe that \(g_\zeta \), as a function on \((\mathbb {R}^d)^{2m+1}\), is continuous except possibly on the set \(D_\zeta ^{2m+1}=\{(x_1,\ldots ,x_{2m+1})\in (\mathbb {R}^d)^{2m+1}:\Vert x_i\Vert =\zeta \text { for some } i\in \{1,\ldots , 2m+1\}\}\), and we have \(\mathbb {P}(\varvec{Y}(-m,m)\in D_\zeta ^{2m+1})=0\) since \(\varvec{Y}=\Vert Y_0\Vert \varvec{\varTheta }\), where \(\Vert Y_0\Vert \) and \(\varvec{\varTheta }\) are independent and the distribution of \(\Vert Y_0\Vert \) is Pareto, hence atomless. Observe similarly that the distribution of \(M_{k,l}^{Y}=\Vert Y_0\Vert \max _{k\le j\le l} \Vert \varTheta _j\Vert \) has no atoms except possibly at zero. Therefore, since \(g_\zeta \) is bounded, (2.9) follows from the definition of the tail process and the continuous mapping theorem.

Finally, since \(\varvec{Y}(-m,m)\longrightarrow \varvec{Y}\) a.s. in \(\tilde{l}_0\) and since \(\varvec{Y}\) has only finitely many coordinates greater than \(\zeta \), \(g_\zeta (\varvec{Y}(-m,m))=g_\zeta (\varvec{Y})\) for large enough m, almost surely.

Thus, applying (2.8) and (2.9), we obtain by bounded convergence

$$\begin{aligned} \lim _{n\rightarrow \infty }&\mathbb {E}[g_\zeta (\varvec{X}_n(1,r_n)) \mid M_{1,r_n}> a_nu] \\&= \lim _{m\rightarrow \infty }\lim _{n\rightarrow \infty }\frac{1}{\theta _n} \mathbb {E}[g_\zeta (\varvec{X}_n(-m,m)); M_{-m,-1} \le a_nu \mid \Vert X_0\Vert > a_nu] \\&= \frac{1}{\theta } \lim _{m\rightarrow \infty } \mathbb {E}[g_\zeta (\varvec{Y}(-m,m)); M_{-m,-1}^Y\le 1 ] \\&= \frac{1}{\theta } \mathbb {E}[g_\zeta (\varvec{Y}); M_{-\infty ,-1}^Y \le 1]. \end{aligned}$$

Applying this to \(g\equiv 1\) we obtain \(\theta =\mathbb {P}(M_{-\infty ,-1}^Y \le 1)\). Hence (2.3) holds for \(g_\zeta \) as we wanted to show.

Step 2. Observing that the mapping \(\tilde{\varvec{x}} \mapsto (\tilde{\varvec{x}}, \Vert \tilde{\varvec{x}}\Vert _\infty )\) is continuous on \(\tilde{l}_0\), we obtain for every \(u>0\),

$$\begin{aligned} \mathcal {L}\left( \frac{X_1,\ldots , X_{r_n}}{a_nu}, \, \frac{M_{r_n}}{a_nu} \Big |\, M_{r_n} > a_nu\right) {\mathop {\longrightarrow }\limits ^{d}}\mathcal {L}\left( \{Z_t\}, \, L_Z \right) . \end{aligned}$$
(2.10)

Similarly, the mapping defined on \(\tilde{l}_0\times (0,\infty )\) by \((\tilde{x},b) \mapsto \tilde{x}/b\) is again continuous. Hence, (2.10) implies

$$\begin{aligned} \mathcal {L}\left( \frac{X_1,\ldots , X_{r_n}}{M_{r_n}}, \, \frac{M_{r_n}}{a_nu} \Big |\, M_{r_n} > a_nu\right) {\mathop {\longrightarrow }\limits ^{d}}\mathcal {L}\left( \{Q_t\}, \, L_Z \right) \end{aligned}$$
(2.11)

by the continuous mapping theorem. To show the independence between \(L_Z\) and \(\{Q_t\}\), it suffices to show

$$\begin{aligned} \mathbb {E}\left[ g\left( \{Q_t\}\right) \mathbb {1}_{\{L_Z>v\}} \right] =\mathbb {E}\left[ g\left( \{Q_t\}\right) \right] \mathbb {P}(L_Z >v), \end{aligned}$$
(2.12)

for an arbitrary uniformly continuous function g on \(\tilde{l}_0\) and \(v \ge 1\).

By (2.11), the left-hand side of (2.12) is the limit of

$$\begin{aligned} \mathbb {E}\left[ g \left( \frac{X_1,\ldots , X_{r_n}}{M_{r_n}} \right) \,\mathbb {1}_{\{(a_n)^{-1}M_{r_n}>v\}}\bigg \vert M_{r_n}>a_n\right] , \end{aligned}$$

which further equals

$$\begin{aligned} \mathbb {E}\left[ g \left( \frac{X_1,\ldots , X_{r_n}}{M_{r_n}} \right) \, \bigg \vert M_{r_n}>a_n v\right] \frac{\mathbb {P}(M_{r_n}>a_n v)}{\mathbb {P}(M_{r_n}>a_n )}. \end{aligned}$$

By (2.11), the first term in the product above tends to \(\mathbb {E}\left[ g\left( \{Q_t\}\right) \right] \) as \(n\rightarrow \infty \). On the other hand, by (2.7) and regular variation of \(\Vert X_0\Vert \), the second term tends to \(v^{-\alpha }=\mathbb {P}(L_Z>v)\). \(\square \)

3 The point process of clusters

In this section we prove our main result on the point process asymptotics for the sequence \(\{X_t\}\). Prior to that, we discuss the topology of \(w^\#\)-convergence.

3.1 Preliminaries on \(w^\#\)-convergence

To study convergence in distribution of point processes on the non-locally compact space \(\tilde{l}_0\) we use \(w^{\#}\)-convergence and refer to [11, Section A2.6.] and [12, Section 11.1.] for details. Let \(\mathbb {X}\) be a complete and separable metric space and let \(\mathcal {M}(\mathbb {X})\) denote the space of boundedly finite nonnegative Borel measures \(\mu \) on \(\mathbb {X}\), i.e. measures such that \(\mu (B)<\infty \) for all bounded Borel sets B. The subset of \(\mathcal {M}(\mathbb {X})\) of all point measures (that is, measures \(\mu \) such that \(\mu (B)\) is a nonnegative integer for all bounded Borel sets B) is denoted by \(\mathcal {M}_p(\mathbb {X})\). A sequence \(\{\mu _n\}\) in \(\mathcal {M}(\mathbb {X})\) is said to converge to \(\mu \) in the \(w^\#\)-topology, denoted \(\mu _n\rightarrow _{w^\#}\mu \), if

$$\begin{aligned} \mu _n(f)=\int f d\mu _n \rightarrow \int f d\mu =\mu (f), \end{aligned}$$

for every bounded and continuous function \(f:\mathbb {X}\rightarrow \mathbb {R}\) with bounded support. Equivalently ([11, Proposition A2.6.II.]), \(\mu _n\rightarrow _{w^\#}\mu \) if and only if

$$\begin{aligned} \mu _n(B)\rightarrow \mu (B) \end{aligned}$$

for every bounded Borel set B with \(\mu (\partial B)=0.\) We note that when \(\mathbb {X}\) is locally compact, an equivalent metric can be chosen in which a set is relatively compact if and only if it is bounded, and \(w^\#\)-convergence coincides with vague convergence. We refer to [20] or [30] for details on vague convergence. The notion of \(w^\#\)-convergence is metrizable in such a way that \(\mathcal {M}(\mathbb {X})\) is Polish ([11, Theorem A2.6.III.(i)]). Denote by \(\mathcal {B}(\mathcal {M}(\mathbb {X}))\) the corresponding Borel sigma-field.

It is known, see [12, Theorem 11.1.VII], that a sequence \(\{N_n\}\) of random elements in \((\mathcal {M}(\mathbb {X}),\mathcal {B}(\mathcal {M}(\mathbb {X})))\), converges in distribution to N, denoted by \(N_n{\mathop {\longrightarrow }\limits ^{d}}N\), if and only if

$$\begin{aligned} (N_n(A_1),\ldots ,N_n(A_k)){\mathop {\longrightarrow }\limits ^{d}}(N(A_1),\ldots ,N(A_k))\quad \text {in}~\mathbb {R}^k, \end{aligned}$$

for all \(k\in \mathbb {N}\) and all bounded Borel sets \(A_1,\ldots ,A_k\) in \(\mathbb {X}\) such that \(N(\partial A_i)=0\) a.s. for all \(i=1,\ldots ,k\).

Remark 3.1

As shown in [12, Proposition 11.1.VIII], this is equivalent to the pointwise convergence of the Laplace functionals, that is, \(\lim _{n\rightarrow \infty }\mathbb {E}[e^{-N_n(f)}]=\mathbb {E}[e^{-N(f)}]\) for all bounded and continuous functions f on \(\mathbb {X}\) with bounded support. It turns out that it is sufficient (and more convenient in our context) to verify the convergence of Laplace functionals for a smaller convergence determining family.

See the comments before Assumption 3.5.

3.2 Point process convergence

Consider now the space \(\tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}\) with the subspace topology. Following [21], we metrize the space \(\tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}\) with the complete metric

$$\begin{aligned} \tilde{d}'(\tilde{\varvec{x}},\tilde{\varvec{y}})=\left( \tilde{d}(\tilde{\varvec{x}},\tilde{\varvec{y}})\wedge 1\right) \vee \left| 1/\Vert \tilde{\varvec{x}}\Vert _\infty - 1/\Vert \tilde{\varvec{y}}\Vert _\infty \right| \end{aligned}$$

which is topologically equivalent to \(\tilde{d}\), i.e. it generates the same (separable) topology on \(\tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}\). However, a subset A of \(\tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}\) is bounded for \(\tilde{d}'\) if and only if there exists an \(\epsilon >0\) such that \(\tilde{\varvec{x}}\in A\) implies \(\Vert \tilde{\varvec{x}}\Vert _\infty >\epsilon \). Therefore, for measures \(\mu _n,\mu \in \mathcal {M}(\tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\})\), \(\mu _n\rightarrow _{w^\#}\mu \) if \(\mu _n(f)\rightarrow \mu (f)\) for every bounded and continuous function f on \(\tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}\) such that for some \(\epsilon >0\), \(\Vert \tilde{\varvec{x}}\Vert _\infty \le \epsilon \) implies \(f(\tilde{\varvec{x}})=0\).

Remark 3.2

We note that under the metric \(\tilde{d}'\), \(w^\#\)-convergence coincides with the notion of \(M_0\)-convergence introduced in [18] and further developed in [22], and with the corresponding point process convergence recently studied in [35].

Take now a sequence \(\{r_n\}\) as in Assumption 1.1, set \(k_{n} = \lfloor n / r_{n} \rfloor \) and define

$$\begin{aligned} \varvec{X}_{n,i} =(X_{(i-1)r_n+1},\ldots ,X_{ir_n})/a_n \end{aligned}$$

for \(i=1,\ldots ,k_n\). As the main result of this section we show that, under certain conditions, the point process of clusters \(N''_n\) defined by

$$\begin{aligned} N''_n = \sum _{i=1}^{k_n} \delta _{(i/k_n,\varvec{X}_{n,i})} \; \end{aligned}$$

restricted to \([0,1] \times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}\) (i.e. we ignore indices i with \(\varvec{X}_{n,i}=\varvec{0}\)), converges in distribution in \(\mathcal {M}_p([0,1] \times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\})\) to a suitable Poisson point process.
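In concrete terms, \(N''_n\) is obtained by cutting the first \(k_nr_n\) observations into \(k_n\) consecutive blocks of length \(r_n\) and recording each rescaled block, viewed as an element of \(\tilde{l}_0\), together with its time coordinate \(i/k_n\). A short Python sketch of this construction (the function and the representation of blocks by finite arrays are our own choices):

```python
import numpy as np

def cluster_points(X, r_n, a_n):
    """Points (i/k_n, X_{n,i}) of the point process N''_n: the sample is cut
    into k_n blocks of length r_n, each block is rescaled by a_n and kept as
    a finite array, i.e. a representative of its class in l~_0."""
    X = np.asarray(X, float)
    k_n = len(X) // r_n
    points = []
    for i in range(1, k_n + 1):
        block = X[(i - 1) * r_n : i * r_n] / a_n
        if np.any(block != 0):          # restriction: ignore blocks equal to 0~
            points.append((i / k_n, block))
    return points
```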

We first prove a technical lemma which is also of independent interest, see Remark 3.4. Denote by \(\mathbb {S}=\{\tilde{\varvec{x}}\in \tilde{l}_0:\Vert \tilde{\varvec{x}}\Vert _\infty =1\}\) the unit sphere in \(\tilde{l}_0\) and define the polar decomposition \(\psi :\tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}\rightarrow (0,+\infty )\times \mathbb {S}\) by \(\psi (\tilde{\varvec{x}})=(\Vert \tilde{\varvec{x}}\Vert _\infty ,\tilde{\varvec{x}}/\Vert \tilde{\varvec{x}}\Vert _\infty )\).

Lemma 3.3

Under Assumption 1.1, the sequence \(\nu _n=k_n \mathbb {P}(\varvec{X}_{n,1} \in \cdot )\) in \(\mathcal {M}(\tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\})\) converges in the \(w^\#\)-topology to \(\nu =\theta \left( d(- y^{-\alpha })\times \mathbb {P}_{\varvec{Q}}\right) \circ \psi \), where \(\mathbb {P}_{\varvec{Q}}\) is the distribution of \(\varvec{Q}=\{Q_j\}\) defined in (2.2).

Proof

Let f be a bounded and continuous function on \(\tilde{l}_0\setminus \{\tilde{\varvec{0}}\}\) and \(\epsilon >0\) such that \(f(\tilde{\varvec{x}})=0\) if \(\Vert \tilde{\varvec{x}}\Vert _\infty \le \epsilon .\) Then \(\mathbb {E}[f(\varvec{X}_{n,1})] = \mathbb {E}[f(\varvec{X}_{n,1})\mathbb {1}_{\{M_{1,r_n}>\epsilon a_n\}}],\) so by (1.3), (2.7) and Theorem 2.2 we get

$$\begin{aligned} \lim _{n\rightarrow \infty } \nu _n(f)&=\lim _{n\rightarrow \infty } k_n\mathbb {E}[f(\varvec{X}_{n,1})] \\&= \lim _{n\rightarrow \infty } n \mathbb {P}(\Vert X_0\Vert>\epsilon a_n) \frac{\mathbb {P}(M_{1,r_n}>\epsilon a_n)}{r_n\mathbb {P}(\Vert X_0\Vert>\epsilon a_n)} \mathbb {E}[f(\varvec{X}_{n,1}) \mid M_{1,r_n}>\epsilon a_n] \\&= \epsilon ^{-\alpha }\theta \mathbb {E}[f(\epsilon \varvec{Z})]. \end{aligned}$$

Applying Theorem 2.2, the last expression is equal to

$$\begin{aligned} \epsilon ^{-\alpha }\theta \int _1^\infty \mathbb {E}[f(\epsilon y\varvec{Q})]\alpha y^{-\alpha -1}dy=\theta \int _\epsilon ^\infty \mathbb {E}[f(y\varvec{Q})]\alpha y^{-\alpha -1} dy \; . \end{aligned}$$

Finally, since \(\Vert \varvec{Q}\Vert _\infty =1\) a.s. and \(f(\tilde{\varvec{x}})=0\) if \(\Vert \tilde{\varvec{x}}\Vert _\infty \le \epsilon \) we have that

$$\begin{aligned} \lim _{n\rightarrow \infty } \nu _n(f)=\theta \int _0^\infty \mathbb {E}[f(y\varvec{Q})]\alpha y^{-\alpha -1}dy=\nu (f), \end{aligned}$$

by definition of \(\nu \). \(\square \)

Remark 3.4

The previous lemma is closely related to the large deviations result obtained in Mikosch and Wintenberger [28, Theorem 3.1]. For a class of functions f called cluster functionals, which can be directly linked to the functions we used in the proof of Lemma 3.3, they showed that

$$\begin{aligned}&\lim _{n\rightarrow \infty }\frac{\mathbb {E}[f(a_n^{-1}X_1,\dots ,a_n^{-1}X_{r_n})]}{r_n\mathbb {P}(\Vert X_0\Vert >a_n)}\nonumber \\&\quad =\int _0^\infty \mathbb {E}[f(y\{\varTheta _t, t\ge 0\})-f(y\{\varTheta _t, t\ge 1\})] \alpha y^{-\alpha -1} dy. \end{aligned}$$
(3.1)

However, for an arbitrary bounded measurable function \(f:\tilde{l}_0\rightarrow \mathbb {R}\) which is a.e. continuous with respect to \(\nu \) and such that for some \(\epsilon >0\), \(\Vert \tilde{\varvec{x}}\Vert _\infty \le \epsilon \) implies \(f(\tilde{\varvec{x}})=0\), Lemma 3.3 together with a continuous mapping argument and the fact that \(k_n^{-1} \sim r_n\mathbb {P}(\Vert X_0\Vert >a_n)\) yields

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{\mathbb {E}[f(a_n^{-1}X_1,\ldots ,a_n^{-1}X_{r_n})]}{r_n\mathbb {P}(\Vert X_0\Vert >a_n)}=\lim _{n\rightarrow \infty } \nu _n(f)=\theta \int _0^\infty \mathbb {E}[f(y\varvec{Q})]\alpha y^{-\alpha -1}dy, \end{aligned}$$

which gives an alternative and arguably more interpretable expression for the limit in (3.1).

To show convergence of \(N''_n\) in \(\mathcal {M}_p([0,1]\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}),\) we will need to assume that, intuitively speaking, one can break the dependent series \(\{X_n, n\ge 1\}\) into asymptotically independent blocks.

Recall the notion of truncation of an element \(\tilde{\varvec{x}}\in \tilde{l}_0\) at level \(\epsilon >0\), denoted by \(\tilde{\varvec{x}}_\epsilon \); see the paragraph following (2.3). We denote by \(\mathcal {F}_+\) the class of nonnegative functions f on \([0,1]\times \tilde{l}_0\) which satisfy \( f(t,\tilde{\varvec{0}}) = 0\) and \( f(t,\tilde{\varvec{x}}) = f(t, \tilde{\varvec{x}}_\epsilon )\) for some \(\epsilon >0\), and are continuous except possibly on the set \(\{\tilde{\varvec{x}}\in \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\} : \Vert x_j\Vert =\epsilon \; \text {for some}\; j\in \mathbb {Z}\;\text {where} \;(x_i)_{i\in \mathbb {Z}}\in \tilde{\varvec{x}}\}\). Thus, every \(f \in \mathcal {F}_+\) depends only on the coordinates greater in norm than the corresponding \(\epsilon \). Using arguments similar to those in the proof of Theorem 2.2, one can show that \(\mathcal {F}_+\) is convergence determining (in the sense of Remark 3.1).

Assumption 3.5

There exists a sequence of integers \(\{r_n,n\in \mathbb {N}\}\) such that \(\lim _{n\rightarrow \infty }r_n=\lim _{n\rightarrow \infty }n/r_n =\infty \) and

$$\begin{aligned} \lim _{n\rightarrow \infty } \left( \mathbb {E}[\mathrm {e}^{-N''_n(f)}] - \prod _{i=1}^{k_{n}} \mathbb {E}\biggl [ \exp \biggl \{ - f \biggl (\frac{i}{k_n},\varvec{X}_{n,i} \biggr ) \biggr \} \biggr ] \right) = 0 \; , \end{aligned}$$

for all \(f\in \mathcal {F}_+\).

This assumption is somewhat stronger than the related conditions in [14] or [5] (cf. Condition 2.2 in the latter paper), since we consider functions of the whole cluster. Still, as we show in Lemma 6.2, \(\beta \)-mixing implies Assumption 3.5. Since sufficient conditions for \(\beta \)-mixing are well studied and hold for many standard time series models (linear processes being a notable exception), one can avoid the cumbersome task of checking the assumption above directly. Linear processes are considered separately in Sect. 3.3.

It turns out that the choice of \(\tilde{l}_0\) as the state space for clusters, together with the results above, not only preserves the order of the observations within a cluster, but also makes the statement and the proof of the following theorem remarkably tidy and short.

Theorem 3.6

Let \(\{X_t,\, t\in \mathbb {Z}\}\) be a stationary \(\mathbb {R}^d\)-valued regularly varying sequence with tail index \(\alpha >0\), satisfying Assumptions 1.1 and 3.5 with the same sequence \(\{r_n\}\). Then \(N''_n {\mathop {\longrightarrow }\limits ^{d}}N''\) in \(\mathcal {M}_p([0,1]\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\})\), where \(N''\) is a Poisson point process with intensity measure \(Leb \times \nu \) which can be expressed as

$$\begin{aligned} N''= \sum _{i=1}^\infty \delta _{(T_i, P_i\varvec{Q}_{i})}, \end{aligned}$$
(3.2)

where

  1. (i)

    \(\sum _{i=1}^\infty \delta _{(T_i,P_i)}\) is a Poisson point process on \([0,1]\times (0,\infty ]\) with intensity measure \(Leb \times d(-\theta y^{-\alpha })\);

  2. (ii)

    \(\{\varvec{Q}_i,\, i\ge 1\}\) is a sequence of i.i.d. elements in \(\mathbb {S}\), independent of \(\sum _{i=1}^\infty \delta _{(T_i, P_i)}\) and with common distribution equal to the distribution of \(\varvec{Q}\) in (2.2).

Proof

Let, for every \(n\in \mathbb {N}\), \(\{\varvec{X}_{n,i}^*,i=1,\ldots ,k_n\}\) be independent copies of \(\varvec{X}_{n,1}\) and define

$$\begin{aligned} N_n^* = \sum _{i=1}^{k_n} \delta _{(i/k_n, \varvec{X}_{n,i}^*)}. \end{aligned}$$
(3.3)

Since by the previous lemma \(k_n \mathbb {P}(\varvec{X}_{n,1} \in \cdot )\rightarrow _{w^\#}\nu \) in \(\mathcal {M}(\tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\})\), the convergence of \(N_n^*\) to \(N''\) in \(\mathcal {M}_p([0,1]\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\})\) follows by a straightforward adaptation of [30, Proposition 3.21] (cf. [15, Lemma 2.2.(1)], see also [16, Theorem 2.4] or [33, Proposition 2.13]). Assumption 3.5 now yields that \(N''_n\) converges in distribution to the same limit, since the convergence determining family \(\mathcal {F}_+\) consists of functions which are a.e. continuous with respect to the measure \(Leb\times \nu \). Finally, the representation of \(N''\) follows easily by a standard Poisson point process transformation argument (see [30, Section 3.3.2.]). \(\square \)

Under the conditions of Theorem 3.6, as already noticed in [6, Remark 4.7], \(\theta =\mathbb {P}(\sup _{i\ge 1} \Vert Y_i\Vert \le 1)\) is also the extremal index of the time series \(\{\Vert X_j\Vert \}\), i.e. \(\lim _{n\rightarrow \infty } \mathbb {P}(a_n^{-1}\max _{1\le i \le n} \Vert X_i\Vert \le x) = \mathrm {e}^{-\theta x^{-\alpha }}\) for all \(x>0\).

Remark 3.7

Note that the restriction to the time interval [0, 1] is arbitrary; it could be replaced by an arbitrary closed interval [0, T].

3.3 Linear processes

As we observed above, \(\beta \)-mixing offers a way of establishing Assumption 3.5 for a wide class of time series models. However, for linear processes, the truncation method offers an alternative and simpler way to obtain the point process convergence stated in the previous theorem.

Let \(\{\xi _t,t\in \mathbb {Z}\}\) be a sequence of i.i.d. random variables with regularly varying distribution with index \(\alpha >0\). Consider the linear process \(\{X_t,t\in \mathbb {Z}\}\) defined by

$$\begin{aligned} X_t = \sum _{j\in \mathbb {Z}} c_j \xi _{t-j}, \end{aligned}$$

where \(\{c_j,j\in \mathbb {Z}\}\) is a sequence of real numbers such that \(|c_0|>0\) and

$$\begin{aligned} \sum _{j\in \mathbb {Z}} |c_j|^\delta< \infty \hbox { with } {\left\{ \begin{array}{ll} \delta<\alpha &{} \hbox { if } \alpha \in (0,1], \\ \delta <\alpha &{} \hbox { if } \alpha \in (1,2] \hbox { and } \mathbb {E}[\xi _0]=0, \\ \delta =2 &{} \hbox { if } \alpha>2 \hbox { and } \mathbb {E}[\xi _0]=0, \\ \delta = 1 &{} \hbox { if } \alpha >1 \hbox { and } \mathbb {E}[\xi _0]\ne 0. \end{array}\right. } \end{aligned}$$
(3.4)

These conditions imply that \(\sum _{j\in \mathbb {Z}} |c_j|^\alpha <\infty \). Furthermore, it has been proved in [25, Lemma A3] that the sequence \(\{X_t,t\in \mathbb {Z}\}\) is well defined, stationary and regularly varying with tail index \(\alpha \) and

$$\begin{aligned} \lim _{u\rightarrow \infty } \frac{\mathbb {P}(|X_0|>u)}{\mathbb {P}(|\xi _0|>u)}=\sum _{j\in \mathbb {Z}} |c_j|^\alpha . \end{aligned}$$
(3.5)

[25, Lemma A.4] proved that for \(\alpha \in (0,2]\), it is possible to take \(\delta =\alpha \) in (3.4) at the cost of some restrictions on the distribution of \(\xi _0\) which are satisfied for Pareto and stable distributions.

The spectral tail process \(\{\varTheta _t\}\) of the linear process was computed in [23]. It can be described as follows: let \(\varTheta ^\xi \) be a \(\{-1,1\}\)-valued random variable with distribution equal to the spectral measure of \(\xi _0\). Then

$$\begin{aligned} \mathcal {L}\left( \{\varTheta _t, t\in \mathbb {Z}\}\right) = \mathcal {L}\left( \left\{ \frac{c_{t+K}}{|c_K|} \varTheta ^\xi ,t\in \mathbb {Z}\right\} \right) , \end{aligned}$$
(3.6)

where \(K\) is an integer valued random variable, independent of \(\varTheta ^\xi \), such that

$$\begin{aligned} \mathbb {P}(K=n) = \frac{|c_n|^\alpha }{\sum _{j\in \mathbb {Z}} |c_j|^\alpha },\quad n\in \mathbb {Z}. \end{aligned}$$
(3.7)

It is also proved in [23] that the coefficient \(\theta \) from (1.9) is given by

$$\begin{aligned} \theta = \frac{\max _{j\in \mathbb {Z}} |c_j|^\alpha }{\sum _{j\in \mathbb {Z}} |c_j|^\alpha }. \end{aligned}$$

Since \(\lim _{|j|\rightarrow \infty } c_j=0\), the random element \(\varvec{Q}\) of the space \(\mathbb {S}\) introduced in (2.2) is well defined and given by

$$\begin{aligned} \varvec{Q}= \left\{ \frac{\varTheta ^\xi c_j}{\max _{i\in \mathbb {Z}}|c_i|}, j\in \mathbb {Z}\right\} . \end{aligned}$$
(3.8)
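For a finitely supported coefficient sequence, these quantities are explicit and easy to evaluate. A small Python sketch (our own helper, for illustration) computes \(\theta \), the law (3.7) of \(K\) and the sequence \(\varvec{Q}\) of (3.8), up to the random sign \(\varTheta ^\xi \):

```python
import numpy as np

def linear_process_cluster(c, alpha):
    """theta from (1.9), the law (3.7) of K, and Q from (3.8) (up to the sign
    Theta^xi) for a finitely supported coefficient sequence c."""
    c = np.asarray(c, float)
    w = np.abs(c) ** alpha
    theta = w.max() / w.sum()     # extremal index of {|X_t|}
    pK = w / w.sum()              # P(K = j) over the support of c
    Q = c / np.abs(c).max()       # Q_j = Theta^xi c_j / max_i |c_i|
    return theta, pK, Q

# the MA(1) model X_t = xi_t - 0.7 xi_{t-1} of Example 4.9, with alpha = 0.7
print(linear_process_cluster([1.0, -0.7], alpha=0.7))  # theta ~ 0.56, Q = (1, -0.7)
```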

The following proposition can be viewed as an extension of [13, Theorem 2.4] and also as a version of Theorem 3.6 adapted to linear processes.

Proposition 3.8

Let \(\{r_n\}\) be a nonnegative, nondecreasing, integer-valued sequence such that \(\lim _{n\rightarrow \infty } r_n =\lim _{n\rightarrow \infty }n/r_n =\infty \) and let \(\{a_n\}\) be a nondecreasing sequence such that \(n\mathbb {P}(|X_0|>a_n)\rightarrow 1\). Then

$$\begin{aligned} N''_n= \sum _{i=1}^{k_n} \delta _{(i/k_n,\,(X_{(i-1)r_n+1},\ldots ,X_{ir_n})/a_n)}{\mathop {\longrightarrow }\limits ^{d}}\sum _{i=1}^\infty \delta _{(T_i, P_i \varvec{Q}_{i})} \; \end{aligned}$$
(3.9)

in \(\mathcal {M}_p([0,1] \times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\})\) where \(\sum _{i=1}^\infty \delta _{(T_i,P_i)}\) is a Poisson point process on \([0,1]\times (0,\infty ]\) with intensity measure \(Leb \times d(-\theta y^{- \alpha })\), independent of the i.i.d. sequence \(\{\varvec{Q}_{i},\,i\ge 1\}\) with values in \(\mathbb {S}\) and common distribution equal to the distribution of \(\varvec{Q}\) in (3.8).

Proof

The proof of (3.11) is based on a truncation argument which compares \(X_t\) with

$$\begin{aligned} X_t^{(m)} = \sum _{j=-m}^{m} c_j \xi _{t-j}, \ \ t\in \mathbb {Z}. \end{aligned}$$

Let \(\{b_n\}\) and \(\{a_{m,n}\}\) be nondecreasing sequences such that \(n\mathbb {P}(|\xi _0|>b_n)\rightarrow 1\) and \(n\mathbb {P}(|X_0^{(m)}|>a_{m,n})\rightarrow 1\). The limit (3.5) implies that \(a_{m,n}\sim \left( \sum _{j=-m}^m |c_j|^\alpha \right) ^{1/\alpha }b_n\). Let \(N''_{m,n}\) be the point process of clusters of the truncated sequence, defined by

$$\begin{aligned} N''_{m,n} = \sum _{i=1}^{k_n} \delta _{(i/k_n,(X_{(i-1)r_n+1}^{(m)},\ldots ,X_{ir_n}^{(m)})/b_n)}. \end{aligned}$$

The process \(\{X_t^{(m)}\}\) is (2m)-dependent (hence \(\beta \)-mixing with \(\beta _j=0\) for \(j>2m\)) and therefore satisfies the conditions of Theorem 3.6. Thus \(N''_{m,n}{\mathop {\longrightarrow }\limits ^{d}}N''_{(m)}\) with

$$\begin{aligned} N''_{(m)} = \sum _{i=1}^\infty \delta _{(T_i,P_i\varTheta _i^\xi \{c_j^{(m)}\})}, \end{aligned}$$

with \(\sum _{i=1}^\infty \delta _{(T_i,P_i)}\) a Poisson point process on \([0,1]\times (0,\infty )\) with mean measure \(Leb\times \mathrm {d}(-y^{-\alpha })\), \(c_j^{(m)} = c_j\) if \(|j|\le m\) and \(c_j^{(m)}=0\) otherwise, and \(\varTheta _i^\xi \), \(i\ge 1\), i.i.d. copies of \(\varTheta ^\xi \). Since \(\{c_j^{(m)}\}\) converges to \(\{c_j\}\) in \(\tilde{l}_0\) as \(m\rightarrow \infty \), it follows that

$$\begin{aligned} N''_{(m)} \rightarrow N''_\infty = \sum _{i=1}^\infty \delta _{(T_i, P_i \varTheta _i^\xi \{c_j\})}, \end{aligned}$$

almost surely in \(\mathcal {M}_p([0,1]\times \tilde{l}_0\setminus \{\varvec{0}\})\).

Define now

$$\begin{aligned} N''_{\infty ,n} = \sum _{i=1}^{k_n} \delta _{(i/k_n,(X_{(i-1)r_n+1},\dots ,X_{ir_n})/b_n)}. \end{aligned}$$

Then, for every \(\eta >0\) and every bounded Lipschitz continuous function f defined on \([0,1]\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}\) with bounded support,

$$\begin{aligned} \lim _{m\rightarrow \infty } \limsup _{n\rightarrow \infty } \mathbb {P}(|N''_{m,n}(f)-N''_{\infty ,n}(f)|>\eta ) = 0. \end{aligned}$$

As in the proof of [13, Theorem 2.4], this follows from

$$\begin{aligned} \lim _{m\rightarrow \infty }\limsup _{n\rightarrow \infty } \mathbb {P}(\max _{1\le i \le n}|X_i^{(m)}-X_i|>b_n\gamma )=0 \end{aligned}$$
(3.10)

for all \(\gamma >0\); this is implied by (3.5), see [13, Lemma 2.3].

This proves that

$$\begin{aligned} N''_{\infty ,n}{\mathop {\longrightarrow }\limits ^{d}}N''_\infty . \end{aligned}$$
(3.11)

Since \(N''_n\) and \(N''_{\infty ,n}\) differ only by a deterministic scaling of the points, this proves our result. \(\square \)

4 Convergence of the partial sum process

In order to study the convergence of the partial sum process in cases where it fails to hold in the usual space D, we first introduce an enlarged space E. In the rest of the paper we restrict to the case of \(\mathbb {R}\)-valued time series.

4.1 The space of decorated càdlàg functions

To establish the convergence of the partial sum process of a dependent sequence \(\{X_n\}\) we will consider the function space \(E\equiv E([0,1],\mathbb {R})\) introduced in Whitt [34, Sections 15.4 and 15.5]. For the benefit of the reader, we briefly describe this space in what follows, closely following the exposition of the aforementioned reference.

The elements of E have the form

$$\begin{aligned} (x, J, \{I(t):t\in J\}) \end{aligned}$$

where

  • \(x\in D([0,1],\mathbb {R})\);

  • \(J\) is a countable subset of [0, 1] with \(Disc(x)\subseteq J\), where Disc(x) is the set of discontinuities of the càdlàg function x;

  • for each \(t\in J\), I(t) is a closed bounded interval (called the decoration) in \(\mathbb {R}\) such that \(x(t),x(t-)\in I(t)\) for all \(t\in J\).

Moreover, we assume that for each \(\epsilon >0\) there are at most finitely many times t for which the length of the interval I(t) is greater than \(\epsilon \). This ensures that the graphs of elements in E, defined below, are compact subsets of \(\mathbb {R}^2\), which allows one to impose a metric on E by using the Hausdorff metric on the space of graphs of elements in E.

Note that every triple \((x,J,\{I(t):t\in J\})\) can be equivalently represented by a set-valued function

$$\begin{aligned} x'(t):= {\left\{ \begin{array}{ll} I(t) &{} \hbox { if } t \in J, \\ \{x(t)\} &{} \hbox { if } t\not \in J, \end{array}\right. } \end{aligned}$$

or by the graph of \(x'\) defined by

$$\begin{aligned} \varGamma _{x'}:=\{(t,z)\in [0,1]\times \mathbb {R}:z\in x'(t)\}. \end{aligned}$$

In the sequel, we will usually denote the elements of E by \(x'\).

Let \(m\) denote the Hausdorff metric on the space of compact subsets of \(\mathbb {R}^d\) (regardless of dimension), i.e. for compact subsets A, B,

$$\begin{aligned} m(A,B)=\sup _{x\in A}\Vert x-B\Vert _{\infty } \vee \sup _{y\in B}\Vert y-A\Vert _{\infty }, \end{aligned}$$

where \(\Vert x-B\Vert _{\infty }=\inf _{y\in B}\Vert x-y\Vert _\infty \). We then define a metric on E, denoted by \(m_E\), by

$$\begin{aligned} m_E(x',y') = m(\varGamma _{x'},\varGamma _{y'}). \end{aligned}$$
(4.1)

We call the topology induced by \(m_E\) on E the \(M_2\) topology. This topology is separable, but the metric space \((E,m_E)\) is not complete. We also define the uniform metric on E by

$$\begin{aligned} m^*(x',y')=\sup _{0\le t \le 1} m(x'(t),y'(t)). \end{aligned}$$
(4.2)

Obviously, \(m^*\) is a stronger metric than \(m_E\), i.e. for any \(x',y'\in E\),

$$\begin{aligned} m_E(x',y')\le m^*(x',y'). \end{aligned}$$
(4.3)

We will often use the following elementary fact: for \(a\le b\) and \(c\le d\) it holds that

$$\begin{aligned} m([a,b],[c,d])\le |c-a|\vee |d-b|. \end{aligned}$$
(4.4)
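Numerically, m can be approximated by discretizing the compact sets involved. A small Python sketch (ours, for illustration only) computes the Hausdorff distance between finite point sets and checks (4.4) on a pair of intervals, where equality happens to hold:

```python
import numpy as np

def hausdorff(A, B):
    """Hausdorff distance m(A, B) between finite point sets in R^d (rows are
    points), with the sup-norm; a discretized stand-in for the metric in (4.1)."""
    A, B = np.atleast_2d(A), np.atleast_2d(B)
    D = np.abs(A[:, None, :] - B[None, :, :]).max(axis=2)   # pairwise sup-norms
    return max(D.min(axis=1).max(), D.min(axis=0).max())

# fact (4.4): m([0,1], [0.4,0.6]) = 0.4 <= |0.4 - 0| v |0.6 - 1| = 0.4
I = np.linspace(0.0, 1.0, 2001)[:, None]
J = np.linspace(0.4, 0.6, 2001)[:, None]
print(hausdorff(I, J))   # ~0.4
```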

By a slight abuse of notation, we identify every \(x\in D\) with an element in E represented by

$$\begin{aligned} (x , Disc(x) , \{[x(t-),x(t)]: t\in Disc(x) \}), \end{aligned}$$

where for any two real numbers ab by [ab] we denote the closed interval \([\min \{a,b\},\max \{a,b\}]\). Consequently, we identify the space D with the subset \(D'\) of E given by

$$\begin{aligned} D'=\{x'\in E:J=Disc(x)\; \text {and for all} \; t\in J, \; I(t)=[x(t-),x(t)]\}. \end{aligned}$$
(4.5)

For an element \(x'\in D'\) we have

$$\begin{aligned} \varGamma _{x'}=\varGamma _{x}, \end{aligned}$$

where \(\varGamma _{x}\) is the completed graph of x. Since the \(M_2\) topology on D corresponds to the Hausdorff metric on the space of the completed graphs \(\varGamma _{x}\), the map \(x\mapsto (x , Disc(x) , \{[x(t-),x(t)]: t\in Disc(x) \})\) is a homeomorphism from D endowed with the \(M_2\) topology onto \(D'\) endowed with the \(M_2\) topology. This yields the following lemma.

Lemma 4.1

The space D endowed with the \(M_2\) topology is homeomorphic to the subset \(D'\) in E with the \(M_2\) topology.

Remark 4.2

Because two elements of E can have intervals at the same time point, addition in E is in general not well behaved. However, the problems disappear if one of the summands is a continuous function. In that case the sum is naturally defined as follows: for an element \(x'=(x,J,\{I(t) : t\in J\})\) in E and a continuous function b on [0, 1], we define the element \(x'+b\) in E by

$$\begin{aligned} x'+b = (x+b,J,\{I(t)+b(t): t\in J\}). \end{aligned}$$

We now state a useful characterization of convergence in \((E,m_E)\) in terms of the local-maximum function defined for any \(x'\in E\) by

$$\begin{aligned} M_{t_1,t_2}(x'):=\sup \{z:z\in x'(t), t_1\le t \le t_2\}, \end{aligned}$$
(4.6)

for \(0\le t_1 < t_2 \le 1\).

Theorem 4.3

(Whitt [34, Theorem 15.5.1]) For elements \(x_n',x'\in E\) the following are equivalent:

  1. (i)

    \(x_n'\rightarrow x'\) in \((E,m_E),\) i.e. \(m_E(x_n',x')\rightarrow 0.\)

  2. (ii)

    For all \(t_1<t_2\) in a countable dense subset of [0, 1],  including 0 and 1, 

    $$\begin{aligned} M_{t_1,t_2}(x_n')\rightarrow M_{t_1,t_2}(x')\quad \text {in}~\mathbb {R}\end{aligned}$$

    and

    $$\begin{aligned} M_{t_1,t_2}(-x_n')\rightarrow M_{t_1,t_2}(-x')\quad \text {in}~\mathbb {R}. \end{aligned}$$
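To make the criterion concrete, the following Python sketch (ours; the encoding of \(x'\) by jump times, piecewise-constant values and a decoration dictionary is a hypothetical representation) evaluates the local-maximum function (4.6); \(M_{t_1,t_2}(-x')\) is obtained by negating values and decorations.

```python
def local_max(times, values, decorations, t1, t2, T=1.0):
    """M_{t1,t2}(x') of (4.6) for a piecewise-constant cadlag x': values[i]
    is taken on [times[i], times[i+1]); decorations maps t in J to I(t)."""
    m = -float("inf")
    for i, t in enumerate(times):
        nxt = times[i + 1] if i + 1 < len(times) else T
        if t <= t2 and nxt > t1:          # the piece meets [t1, t2]
            m = max(m, values[i])
    for t, (lo, hi) in decorations.items():
        if t1 <= t <= t2:                 # the decoration meets [t1, t2]
            m = max(m, hi)
    return m

# x jumps from 0 to 1 at t = 0.5, decorated with I(0.5) = [-0.3, 1.2]
print(local_max([0.0, 0.5], [0.0, 1.0], {0.5: (-0.3, 1.2)}, 0.4, 0.6))  # 1.2
```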

4.2 Invariance principle in the space E

Consider the partial sum process in D([0, 1]) defined by

$$\begin{aligned} S_n(t) = \sum _{i=1}^{\lfloor nt \rfloor } \frac{X_i}{a_n},\quad t\in [0,1], \end{aligned}$$

and define also

$$\begin{aligned} V_n(t)= {\left\{ \begin{array}{ll} S_n(t) &{} \hbox { if } 0< \alpha< 1, \\ S_n(t)-\lfloor nt \rfloor \mathbb {E}\left( \frac{X_1}{a_n}\mathbb {1}_{\{|X_1|/a_n\le 1\}}\right) &{} \hbox { if } 1 \le \alpha < 2. \end{array}\right. } \end{aligned}$$
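A direct Python transcription of these definitions (our sketch; the normalization \(a_n=n^{1/\alpha }\) is a simplifying assumption, valid when \(n\mathbb {P}(|X_0|>a_n)\rightarrow 1\) for that choice, and the truncated mean in the centering is replaced by its empirical analogue):

```python
import numpy as np

def partial_sum_process(X, alpha):
    """Values V_n(k/n), k = 0,...,n, of the partial sum process."""
    n = len(X)
    a_n = n ** (1.0 / alpha)
    S = np.concatenate(([0.0], np.cumsum(X / a_n)))   # S_n(k/n)
    if alpha < 1:
        return S
    # 1 <= alpha < 2: center by the (empirically estimated) truncated mean
    m = np.mean((X / a_n) * (np.abs(X) <= a_n))
    return S - np.arange(n + 1) * m
```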

As usual, when \(1 \le \alpha <2\), an additional condition is needed to deal with the small jumps.

Assumption 4.4

For all \(\delta >0\),

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} \limsup _{n\rightarrow \infty } \mathbb {P}\biggl ( \max _{1 \le k \le n} \biggl | \sum _{i=1}^k \bigl \{X_{i}\mathbb {1}_{\{|X_i|\le a_n \epsilon \}} -\mathbb {E}[X_i \mathbb {1}_{\{|X_{i}| \le a_n\epsilon \}}] \bigr \} \biggr | > a_n\delta \biggr ) = 0. \end{aligned}$$
(4.7)

It is known from [14] that the finite-dimensional distributions of \(V_n\) converge to those of an \(\alpha \)-stable Lévy process. This result is strengthened in [5] to convergence in the \(M_1\) topology if \(Q_{j}Q_{j'}\ge 0\) for all \(j\ne j'\in \mathbb {Z}\), i.e. if all extremes within one cluster have the same sign. In the next theorem, we remove the latter restriction and establish the convergence of the process \(V_n\) in the space E.

For that purpose, we assume only regular variation of the sequence and the conclusion of Theorem 3.6, i.e.

$$\begin{aligned} N''_n{\mathop {\longrightarrow }\limits ^{d}}N''= \sum _{i=1}^\infty \delta _{(T_i, P_i\varvec{Q}_{i})} \end{aligned}$$
(4.8)

in \(\mathcal {M}_p([0,1]\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\})\), where \(\sum _{i=1}^\infty \delta _{(T_i,P_i)}\) is a Poisson point process on \([0,1]\times (0,\infty ]\) with intensity measure \(Leb \times d(-\theta y^{-\alpha })\) with \(\theta >0\), and \(\varvec{Q}_i=\{Q_{i,j},j\in \mathbb {Z}\}\), \(i\ge 1\), are i.i.d. sequences in \(\tilde{l}_0\), independent of \(\sum _{i=1}^\infty \delta _{(T_i, P_i)}\) and such that \(\mathbb {P}(\sup _{j\in \mathbb {Z}} |Q_{1,j}|=1)=1\). Denote by \(\varvec{Q}=\{Q_j,j\in \mathbb {Z}\}\) a random sequence with distribution equal to that of \(\varvec{Q}_1\). We also describe the limit of \(V_n\) in terms of the point process \(N''\).

The convergence (4.8) and Fatou’s lemma imply that \(\theta \sum _{j\in \mathbb {Z}} \mathbb {E}[|Q_j|^\alpha ] \le 1\), as originally noted in [14, Theorem 2.6]. Since \((\sum _{j}|Q_j|)^\alpha \le \sum _{j}|Q_j|^\alpha \) when \(\alpha \le 1\), this implies that for \(\alpha \in (0,1]\),

$$\begin{aligned} \mathbb {E}\left[ \left( \sum _{j\in \mathbb {Z}} |Q_j| \right) ^\alpha \right] < \infty . \end{aligned}$$
(4.9)

For \(\alpha >1\), this will have to be assumed. Furthermore, the case \(\alpha =1\), as usual, requires additional care. We will assume that

$$\begin{aligned} \mathbb {E}\left[ \sum _{j\in \mathbb {Z}} |Q_j| \log \left( |Q_j|^{-1} \sum _{i\in \mathbb {Z}} |Q_i|\right) \right] < \infty , \end{aligned}$$
(4.10)

where we use the convention \(|Q_j| \log \left( |Q_j|^{-1} \sum _{i\in \mathbb {Z}} |Q_i|\right) =0\) if \(|Q_j|=0\). Fortunately, it turns out that conditions (4.9) and (4.10) are satisfied in most examples. See Remark 4.8 below.

Theorem 4.5

Let \(\{X_t,\, t\in \mathbb {Z}\}\) be a stationary \(\mathbb {R}\)-valued regularly varying sequence with tail index \(\alpha \in (0,2)\) and assume that the convergence in (4.8) holds. If \(\alpha \ge 1\), let Assumption 4.4 hold. For \(\alpha >1\), assume that (4.9) holds, and for \(\alpha =1\), assume that (4.10) holds. Then

$$\begin{aligned} V_n{\mathop {\longrightarrow }\limits ^{d}}V'=\left( V , \{T_i \}_{i\in \mathbb {N}} , \{I(T_i)\}_{i\in \mathbb {N}} \right) , \end{aligned}$$

with respect to \(M_2\) topology on \(E([0,1],\mathbb {R})\), where

  1. (i)

    V is an \(\alpha \)-stable Lévy process on [0, 1] given by

    $$\begin{aligned} V(\cdot )&= \sum _{T_i\le \cdot }\sum _{j\in \mathbb {Z}} P_iQ_{i,j}, \ 0< \alpha < 1, \end{aligned}$$
    (4.11a)
    $$\begin{aligned} V(\cdot )&= \lim _{\epsilon \rightarrow 0} \left( \sum _{T_i\le \cdot }\sum _{j\in \mathbb {Z}} P_iQ_{i,j}\mathbb {1}_{\{|P_iQ_{i,j}|>\epsilon \}} - (\cdot ) \int _{\epsilon<|x|\le 1} x \mu (\mathrm {d}x) \right) , \ 1 \le \alpha < 2, \end{aligned}$$
    (4.11b)

where the series in (4.11a) is almost surely absolutely summable and the limit in (4.11b) holds uniformly on [0, 1] almost surely (along some subsequence), with \(\mu \) given in (1.5).

  2. (ii)

    For all \(i\in \mathbb {N},\)

    $$\begin{aligned} I(T_i)=V(T_i-)+P_i\left[ \inf _{k\in \mathbb {Z}}\sum _{j\le k} Q_{i,j},\sup _{k\in \mathbb {Z}}\sum _{j\le k} Q_{i,j}\right] . \end{aligned}$$

Before proving the theorem, we make several remarks. We first note that for \(\alpha <1\), the point process convergence is the only assumption of the theorem. Further, an extension of Theorem 4.5 to multivariate regularly varying sequences would be possible at the cost of various technical issues (similar to those in [34, Section 12.3] for the extension of the \(M_1\) and \(M_2\) topologies to vector-valued processes); moreover, one would need to alter the definition of the space E substantially and introduce a new and weaker notion of the \(M_2\) topology.

Remark 4.6

If (4.9) holds, the sums \(W_i=\sum _{j\in \mathbb {Z}}|Q_{i,j}|\) are almost surely well-defined and \(\{W_i, i\ge 1\}\) is a sequence of i.i.d. random variables with \(\mathbb {E}[W_i^\alpha ] <\infty \). Furthermore, by independence of \(\sum _{i=1}^\infty \delta _{P_i}\) and \(\{W_i\},\) it follows that \(\sum _{i=1}^\infty \delta _{P_i W_i}\) is a Poisson point process on \((0,\infty ]\) with intensity measure \(\theta \mathbb {E}[W_1^\alpha ] \alpha y^{-\alpha -1}dy.\) In particular, for every \(\delta >0\) there a.s. exist at most finitely many indices i such that \(P_iW_i >\delta \). Also, this implies that

$$\begin{aligned} \sup _{i\in \mathbb {N}} \sum _{j\in \mathbb {Z}} P_i|Q_{i,j}|\mathbb {1}_{\{P_i|Q_{i,j}|\le \epsilon \}} \rightarrow 0 \end{aligned}$$
(4.12)

almost surely as \(\epsilon \rightarrow 0.\) These facts will be used several times in the proof.

Remark 4.7

The Lévy process V from Theorem 4.5 is the weak limit in the sense of finite dimensional distributions of the partial sum process \(V_n\), characterized by

$$\begin{aligned}&\log \mathbb {E}[\mathrm {e}^{\mathrm {i}z V(1)}]\nonumber \\&\quad = {\left\{ \begin{array}{ll} \mathrm {i}az + \varGamma (1-\alpha )\cos (\pi \alpha /2) \sigma ^\alpha |z|^\alpha \{1 - \mathrm {i}\phi \mathrm {sgn}(z) \tan (\pi \alpha /2)\} &{} \alpha \ne 1, \\ \mathrm {i}az - \frac{\pi }{2} \sigma |z|\{1 - \mathrm {i}\frac{2}{\pi }\phi \mathrm {sgn}(z) \log (|z|)\} &{} \alpha =1, \end{array}\right. } \end{aligned}$$
(4.13)

with, denoting \(x^{\langle \alpha \rangle } = x|x|^{\alpha -1}=x_+^\alpha - x_{-}^\alpha \),

$$\begin{aligned} \sigma ^\alpha&= \theta \mathbb {E}\left[ \left| \sum _{j\in \mathbb {Z}}Q_j\right| ^\alpha \right] , \ \ \phi = \frac{\mathbb {E}[(\sum _{j\in \mathbb {Z}}Q_j)^{\langle \alpha \rangle }]}{\mathbb {E}[|\sum _{j\in \mathbb {Z}}Q_j|^\alpha ]} \end{aligned}$$

and

  1. (i)

    \(a=0\) if \(\alpha < 1\);

  2. (ii)

    \(a=(\alpha -1)^{-1}\alpha \theta \mathbb {E}\left[ \sum _{j\in \mathbb {Z}} Q_j^{\langle \alpha \rangle }\right] \) if \(\alpha >1\);

  3. (iii)

    if \(\alpha =1\), then

    $$\begin{aligned} a= & {} \theta \left( c_0 \mathbb {E}\left[ \sum _{j\in \mathbb {Z}} Q_j\right] - \mathbb {E}\left[ \sum _{j\in \mathbb {Z}} Q_j \log \left( \left| \sum _{j\in \mathbb {Z}} Q_j \right| \right) \right] \right. \\&\qquad \left. -\,\mathbb {E}\left[ \sum _{j\in \mathbb {Z}} Q_j \log \left( \left| Q_{j} \right| ^{-1}\right) \right] \right) , \end{aligned}$$

    with \(c_0 = \int _{0}^{\infty }(\sin y - y\mathbb {1}_{(0,1]}(y))y^{-2} \mathrm {d}y\).

These parameters were computed in [14, Theorem 3.2], but with a complicated expression for the location parameter \(a\) in the case \(\alpha =1\) (see [14, Remark 3.3]). The explicit expression given here, which holds under assumption (4.10), is new; it is obtained by a direct calculation of the characteristic function of V(1) following the steps of the proof of Lemma 6.4. As is often done in the literature, if the sequence \(\{X_t\}\) is assumed to be symmetric, then assumption (4.10) is not needed and the location parameter is 0.

Remark 4.8

During the revision process for this paper, [29] showed that the quantity \(\theta \) from (1.9) is positive whenever \(\mathbb {P}(\lim _{|t|\rightarrow \infty }|Y_t|=0)=1\), and \(\theta =\mathbb {P}(\sup _{i\le -1}|Y_i|\le 1)\). In particular, the sequence \(\varvec{Q}\) from (2.2) is well defined in this case and moreover, by [29, Lemma 3.11], the condition (4.9) turns out to be equivalent to

$$\begin{aligned} \mathbb {E}\left[ \left( \sum _{j=0}^\infty |\varTheta _j| \right) ^{\alpha -1} \right] < \infty \; \end{aligned}$$
(4.14)

which is automatic if \(\alpha \in (0,1]\). Furthermore, if \(\alpha =1\), [29, Lemma 3.14] shows that the condition (4.10) is then equivalent to

$$\begin{aligned} \mathbb {E}\left[ \log \left( \sum _{j=0}^\infty |\varTheta _j| \right) \right] < \infty . \end{aligned}$$
(4.15)

These conditions are easier to check than conditions (4.9) and (4.10) since it is easier to determine the distribution of the spectral tail process than the distribution of the process \(\varvec{Q}\) from (2.2). In fact, it suffices to determine only the distribution of the forward spectral tail process \(\{\varTheta _j,j\ge 0\}\) which is often easier than determining the distribution of the whole spectral tail process. For example, it follows from the proof of [27, Theorem 3.2] that for functions of Markov chains satisfying a suitable drift condition, (4.14) and (4.15) hold for all \(\alpha >0\). Also, notice that for the linear process \(\{X_t\}\) from Sect. 3.3 these conditions are satisfied if \(\sum _{j\in \mathbb {Z}} |c_j|<\infty \).

Moreover, [29, Corollary 3.12 and Lemma 3.14] imply that the scale, skewness and location parameters from Remark 4.7 can also be expressed in terms of the forward spectral tail process as follows:

$$\begin{aligned} \sigma ^\alpha&= \mathbb {E}\left[ \left| \sum _{j=0}^\infty \varTheta _j\right| ^\alpha - \left| \sum _{j=1}^\infty \varTheta _j\right| ^\alpha \right] , \\ \phi&= \sigma ^{-\alpha } \mathbb {E}\left[ \left( \sum _{j=0}^\infty \varTheta _j\right) ^{\langle \alpha \rangle } - \left( \sum _{j=1}^\infty \varTheta _j\right) ^{\langle \alpha \rangle } \right] , \end{aligned}$$

\(a=0\) if \(\alpha < 1\), \(a= (\alpha -1)^{-1} \alpha \mathbb {E}[\varTheta _0]\) if \(\alpha >1\) (see (6.8)) and

$$\begin{aligned} a&= c_0 \mathbb {E}[\varTheta _0] - \mathbb {E}\left[ \sum _{j=0}^\infty \varTheta _j \log \left( \left| \sum _{j=0}^\infty \varTheta _j\right| \right) -\sum _{j=1}^\infty \varTheta _j \log \left( \left| \sum _{j=1}^\infty \varTheta _j\right| \right) \right] , \end{aligned}$$

if \(\alpha =1\). It can be shown that these expressions coincide for \(\alpha \ne 1\) with those in the literature, see e.g. [28, Theorem 4.3]. As already noted, the expression of the location parameter for \(\alpha =1\) under the assumption (4.10) (or (4.15)) is new.

Example 4.9

Consider again the linear process \(\{X_t\}\) from Sect. 3.3. For infinite order moving average processes, [13] proved convergence of the finite-dimensional distributions of the partial sum process; [2] proved the functional convergence in the \(M_1\) topology (see [34, Section 12.3]) when \(c_j\ge 0\) for all j; using the S topology (which is weaker than the \(M_1\) topology and in which the supremum functional is not continuous), [3] proved the corresponding result under more general conditions on the sequence \(\{c_j\}\) in the case \(\alpha \le 1\).

Our Theorem 4.5 directly applies to the case of a finite order moving average process. To treat an infinite order moving average process, we assume for simplicity that \(\alpha <1\). Applying Theorem 4.5 to the point process convergence in (3.9), one obtains the convergence of the partial sum process \(V_n{\mathop {\longrightarrow }\limits ^{d}}V'=\left( V , \{T_i \}_{i\in \mathbb {N}} , \{I(T_i)\}_{i\in \mathbb {N}} \right) \) in \(E([0,1],\mathbb {R})\) where

$$\begin{aligned} V(\cdot )=\frac{\sum _{j\in \mathbb {Z}} c_j}{\max _{j\in \mathbb {Z}} |c_j| } \sum _{T_i\le \cdot }P_i \varTheta _i^\xi , \end{aligned}$$

and

$$\begin{aligned} I(T_i)= V(T_i-)+\frac{P_i\varTheta _i^\xi }{\max _{j\in \mathbb {Z}} |c_j| } \left[ \inf _{k\in \mathbb {Z}} \sum _{j\le k } c_j,\sup _{k\in \mathbb {Z}} \sum _{j\le k } c_j\right] . \end{aligned}$$

For an illustration consider the process

$$\begin{aligned} X_t = \xi _{t} + c \xi _{t-1}. \end{aligned}$$

In the case \(c \ge 0\), the convergence of the partial sum process in the \(M_1\) topology follows from [2]. On the other hand, for negative c convergence fails in every Skorohod topology, but the partial sums do have a limit in the sense described by our theorem, as can also be guessed from Fig. 1.

Fig. 1
A simulated sample path of the process \(S_n\) in the case of the linear sequence \(X_t = \xi _{t} - 0.7 \xi _{t-1}\) with index of regular variation \(\alpha =0.7\) (in blue). Observe that due to the downward “corrections” after each large jump, in the limit the paths of the process \(S_n\) cannot converge to a càdlàg function.
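To make the mechanism behind Fig. 1 concrete, the following minimal Python sketch (our own illustration, not the authors' code; the Pareto innovations and the normalization \(a_n=n^{1/\alpha }\) are simplifying assumptions) simulates such a path.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
alpha, c, n = 0.7, -0.7, 10_000

# Pareto(alpha) innovations: P(xi > x) = x**(-alpha) for x >= 1
xi = (1 - rng.random(n + 1)) ** (-1 / alpha)
x = xi[1:] + c * xi[:-1]            # X_t = xi_t + c * xi_{t-1}
a_n = n ** (1 / alpha)              # rough normalization: P(xi > a_n) is of order 1/n
s = np.cumsum(x) / a_n              # S_n at times t = i/n

plt.step(np.arange(1, n + 1) / n, s, where="post", lw=0.7)
plt.xlabel("t"); plt.ylabel("S_n(t)")
plt.show()
```

Each large jump \(\xi _t/a_n\) is followed one time step later by the correction \(c\,\xi _t/a_n\); after rescaling time, the two collapse to the same instance, which is precisely what rules out convergence in the Skorohod topologies.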

Remark 4.10

We do not exclude the case \(\sum _{j\in \mathbb {Z}}Q_{j}=0\) with probability one, as happens for instance in Example 4.9 with \(c=-1\). In such a case, the càdlàg component V is simply the null process.

Example 4.11

Consider a stationary GARCH(1, 1) process

$$\begin{aligned} X_t=\sigma _t Z_t,\;\; \sigma _t^2=\alpha _0 + \alpha _1X_{t-1}^2+\beta _1 \sigma _{t-1}^2,\;\; t\in \mathbb {Z}, \end{aligned}$$

where \(\alpha _0,\alpha _1,\beta _1>0\) and \(\{Z_t\}\) is a sequence of i.i.d. random variables with mean zero and variance one. Under mild conditions the process \(\{X_t\}\) is regularly varying and satisfies Assumptions 1.1 and 3.5. These hold for instance in the case of standard normal innovations \({Z}_t\) and sufficiently small parameters \(\alpha _1, \beta _1\), see [26, Section 4.4]. Consider for simplicity such a stationary GARCH(1, 1) process with tail index \(\alpha \in (0,1).\) Since all the conditions of Theorem 4.5 are met, its partial sum process has a limit in the space E (cf. Fig. 2).

Fig. 2
A simulated sample path of the process \(S_n\) in the case of a GARCH(1, 1) process with parameters \(\alpha _0=0.01,\alpha _1=1.45\) and \(\beta _1=0.1,\) and tail index \(\alpha \) between 0.5 and 1.
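For Fig. 2, a corresponding sketch (again our own illustration; the burn-in length and the tail index used in the normalization are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
a0, a1, b1 = 0.01, 1.45, 0.1         # parameters from Fig. 2
n, burn = 10_000, 1_000

z = rng.standard_normal(n + burn)    # standard normal innovations
x = np.empty(n + burn)
sig2 = a0                            # crude initialization, forgotten after burn-in
for t in range(n + burn):
    x[t] = np.sqrt(sig2) * z[t]
    sig2 = a0 + a1 * x[t] ** 2 + b1 * sig2
x = x[burn:]

alpha = 0.8                          # assumed tail index in (0.5, 1)
s = np.cumsum(x) / n ** (1 / alpha)  # rescaled partial sum path
```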

Proof of Theorem 4.5

The proof is split into the case \(\alpha \in (0,1)\), which is simpler, and the case \(\alpha \in [1,2)\), where centering and truncation introduce additional technical difficulties.

(a) Assume first that \(\alpha \in (0,1).\) We divide the proof into several steps.

Step 1 For every \(\epsilon >0\), consider the functions \(s^\epsilon ,u^\epsilon \) and \(v^\epsilon \) defined on \(\tilde{l}_0\) by

$$\begin{aligned} s^\epsilon (\tilde{\varvec{x}})= & {} \sum _j x_j \mathbb {1}_{\{|x_j|>\epsilon \}},\quad u^\epsilon (\tilde{\varvec{x}}) = \inf _k \sum _{j\le k} x_j \mathbb {1}_{\{|x_j|>\epsilon \}}, \\ v^\epsilon (\tilde{\varvec{x}})= & {} \sup _k \sum _{j\le k} x_j \mathbb {1}_{\{|x_j|>\epsilon \}}, \end{aligned}$$

and define the mapping \(T^\epsilon :\mathcal {M}_p([0,1]\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}) \rightarrow E\) by setting, for \(\gamma =\sum _{i=1}^\infty \delta _{t_i,\tilde{\varvec{x}}^i},\)

$$\begin{aligned} T^\epsilon \gamma = \left( \left( \sum _{t_i\le t} s^\epsilon (\tilde{\varvec{x}}^i)\right) _{t\in [0,1]},\{t_i : \Vert \tilde{\varvec{x}}^i\Vert _\infty>\epsilon \} , \{I(t_i):\Vert \tilde{\varvec{x}}^i\Vert _\infty >\epsilon \} \right) , \end{aligned}$$

where

$$\begin{aligned} I(t_i)=\sum _{t_j<t_i} s^\epsilon (\tilde{\varvec{x}}^j) +\left[ \sum _{t_k=t_i} u^\epsilon (\tilde{\varvec{x}}^k) , \sum _{t_k=t_i} v^\epsilon (\tilde{\varvec{x}}^k)\right] . \end{aligned}$$

Since \(\gamma \) belongs to \(\mathcal {M}_p([0,1]\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}),\) there are only finitely many points \((t_i,\tilde{\varvec{x}}^i)\) such that \(\Vert \tilde{\varvec{x}}^i\Vert _\infty >\epsilon \) and furthermore, every \(\tilde{\varvec{x}}^i\) has at most finitely many coordinates larger than \(\epsilon \) in modulus. Therefore, the mapping \(T^\epsilon \) is well-defined, that is, \(T^\epsilon \gamma \) is a proper element of E.
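For a finite cluster, the three functionals are elementary to evaluate; the following sketch (our own helper, with the cluster stored as a finite array and all remaining coordinates implicitly zero) computes \(s^\epsilon \), \(u^\epsilon \) and \(v^\epsilon \).

```python
import numpy as np

def cluster_functionals(x, eps):
    """Return s_eps, u_eps, v_eps for a finite cluster x (zeros elsewhere)."""
    xt = np.where(np.abs(x) > eps, x, 0.0)       # keep coordinates with |x_j| > eps
    ps = np.concatenate(([0.0], np.cumsum(xt)))  # partial sums, incl. the empty one
    return ps[-1], ps.min(), ps.max()

cluster_functionals(np.array([2.0, -1.4]), 0.5)  # -> (0.6, 0.0, 2.0)
```

The leading 0.0 accounts for indices k to the left of the cluster, for which the partial sum \(\sum _{j\le k}\) is empty.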

Next, we define the subsets of \(\mathcal {M}_p([0,1]\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\})\) by

$$\begin{aligned} \Lambda _1&= \left\{ \sum _{i=1}^\infty \delta _{t_i,\tilde{\varvec{x}}^i}: |x_j^i|\ne \epsilon , i\ge 1, j\in \mathbb {Z}\right\} , \\ \Lambda _2&=\left\{ \sum _{i=1}^\infty \delta _{t_i,\tilde{\varvec{x}}^i}: 0<t_i<1 \text { and } t_i\ne t_j \text { for every } i>j\ge 1\right\} . \end{aligned}$$

We claim that \(T^\epsilon \) is continuous on the set \(\Lambda _1 \cap \Lambda _2.\) Assume that \(\gamma _n \rightarrow _{w^\#} \gamma =\sum _{i=1}^\infty \delta _{t_i,\tilde{\varvec{x}}^i} \in \Lambda _1 \cap \Lambda _2\). By an adaptation of [30, Proposition 3.13], this convergence implies that the finitely many points of \(\gamma _n\) in every set B bounded for \(\tilde{d}'\) and such that \(\gamma (\partial B)=0\) converge pointwise to the finitely many points of \(\gamma \) in B. In particular, this holds for \(B=\{(t,\tilde{\varvec{x}})\, : \, \Vert \tilde{\varvec{x}}\Vert _\infty >\epsilon \}\) and it follows that for all \(t_1<t_2\) in [0, 1] such that \(\gamma (\{t_1,t_2\}\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\})=0\),

$$\begin{aligned} M_{t_1,t_2}(T^\epsilon (\gamma _n))\rightarrow M_{t_1,t_2}(T^\epsilon (\gamma ))\quad \text {in}~\mathbb {R}\end{aligned}$$

and

$$\begin{aligned} M_{t_1,t_2}(-T^\epsilon (\gamma _n))\rightarrow M_{t_1,t_2}(-T^\epsilon (\gamma ))\quad \text {in}~\mathbb {R}, \end{aligned}$$

with the local-maximum function \(M_{t_1,t_2}\) defined as in (4.6). Since the set of all such times is dense in [0, 1] and includes 0 and 1, an application of Theorem 4.3 gives that

$$\begin{aligned} T^\epsilon (\gamma _n) \rightarrow T^\epsilon (\gamma ) \end{aligned}$$

in E endowed with the \(M_2\) topology.

Recall the point process \(N''= \sum _{i=1}^\infty \delta _{(T_i, P_i\varvec{Q}_{i})}\) from (4.8). Since the mean measure of \(\sum _{i=1}^\infty \delta _{(T_i,P_i)}\) does not have atoms, it is clear that \(N''\in \Lambda _1\cap \Lambda _2\) a.s. Therefore, by the convergence \(N''_n{\mathop {\longrightarrow }\limits ^{d}}N''\) and the continuous mapping argument

$$\begin{aligned} \tilde{S}_{n,\epsilon }'{\mathop {\longrightarrow }\limits ^{d}}S_\epsilon ', \end{aligned}$$

where \(\tilde{S}_{n,\epsilon }'=T^\epsilon (N_n'')\) and \(S_\epsilon '=T^\epsilon (N'').\)

Step 2 Recall that \(W_i=\sum _{j\in \mathbb {Z}}|Q_{i,j}|\) and \(\sum _{i=1}^\infty \delta _{P_i W_i}\) is a Poisson point process on \((0,\infty ]\) with intensity measure \(\theta \mathbb {E}[W_i^\alpha ] \alpha y^{-\alpha -1}dy\) (see Remark 4.6). Since \(\alpha <1\), one can sum up the points \(\{P_iW_i\},\) i.e.

$$\begin{aligned} \sum _{i=1}^\infty P_i W_i = \sum _{i=1}^\infty \sum _{j\in \mathbb {Z}}P_i |Q_{i,j}|<\infty \; \text { a.s.} \end{aligned}$$
(4.16)

Therefore, defining \(s (\tilde{\varvec{x}}) = \sum _j x_j\), we obtain that the process

$$\begin{aligned} V(t)=\sum _{T_i\le t} s(P_i \varvec{Q}_{i}),\quad t\in [0,1], \end{aligned}$$

is almost surely a well-defined element in D and moreover, it is an \(\alpha \)-stable Lévy process. Further, we define an element \(V'\) in \(E([0,1],\mathbb {R})\) by

$$\begin{aligned} V'= \left( V , \{T_i \}_{i\in \mathbb {N}} , \{I(T_i)\}_{i\in \mathbb {N}} \right) , \end{aligned}$$
(4.17)

where

$$\begin{aligned} I(T_i)&= V(T_i-) +\left[ u(P_i \varvec{Q}_{i}) , v(P_i \varvec{Q}_{i})\right] , \\ u(\tilde{\varvec{x}})&= \inf _k \sum _{j\le k} x_j, \ \ v(\tilde{\varvec{x}}) = \sup _k \sum _{j\le k} x_j. \end{aligned}$$

Since for every \(\delta >0\) there are at most finitely many points \(P_i W_i\) such that \(P_i W_i >\delta \) and \(diam(I(T_i))=v(P_i \varvec{Q}_{i})-u(P_i \varvec{Q}_{i})\le P_i W_i,\) \(V'\) is indeed a proper element of E a.s.

We now show that, as \(\epsilon \rightarrow 0,\) the limits \(S_\epsilon '\) from the previous step converge to \(V'\) in \((E,m)\) almost surely. Recall the uniform metric \(m^*\) on E defined in (4.2). By (4.16) and the dominated convergence theorem

$$\begin{aligned} m^*(S_\epsilon ',V')=\sup _{0\le t \le 1} m(S_\epsilon '(t),V'(t))\le \sum _{i=1}^\infty \sum _{j\in \mathbb {Z}}P_i |Q_{i,j}|\mathbb {1}_{\{P_i |Q_{i,j}| \le \epsilon \}}\rightarrow 0, \end{aligned}$$
(4.18)

almost surely as \(\epsilon \rightarrow 0\). Indeed, let \(S_\epsilon \) be the càdlàg part of \(S_\epsilon ',\) i.e.

$$\begin{aligned} S_\epsilon (t)=\sum _{T_i\le t}s^\epsilon (P_i \varvec{Q}_i)=\sum _{T_i\le t}\sum _{j\in \mathbb {Z}}P_i Q_{i,j}\mathbb {1}_{\{|P_iQ_{i,j}|>\epsilon \}},\quad t\in [0,1]. \end{aligned}$$

If \(t\notin \{T_i\}\) then

$$\begin{aligned} m(S_\epsilon '(t),V'(t))=|S_\epsilon (t)-V(t)|\le \sum _{T_i\le t}\sum _{j\in \mathbb {Z}}P_i|Q_{i,j}|\mathbb {1}_{\{P_i |Q_{i,j}| \le \epsilon \}}. \end{aligned}$$

Further, when \(t=T_k\) for some \(k\in \mathbb {N},\) by using (4.4) we obtain

$$\begin{aligned} m(S_\epsilon '(t),V'(t))&\le |(S_\epsilon (T_k-)+v^\epsilon (P_k\varvec{Q}_k))-(V(T_k-)+v(P_k\varvec{Q}_k)) | \nonumber \\&\vee |(S_\epsilon (T_k-)+u^\epsilon (P_k\varvec{Q}_k))-(V(T_k-)+u(P_k\varvec{Q}_k)) |.\nonumber \\ \end{aligned}$$
(4.19)

The first term on the right-hand side of the equation above is bounded by

$$\begin{aligned}&\left| \sum _{T_i< T_k}\sum _{j\in \mathbb {Z}}P_i Q_{i,j}\mathbb {1}_{\{P_i |Q_{i,j}| \le \epsilon \}}\right| +\left| \sup _{l\in \mathbb {Z}} \sum _{j\le l}P_k Q_{k,j}\mathbb {1}_{\{P_k |Q_{k,j}|> \epsilon \}}- \sup _{l\in \mathbb {Z}} \sum _{j\le l}P_k Q_{k,j}\right| \\&\quad \le \left| \sum _{T_i< T_k}\sum _{j\in \mathbb {Z}}P_i Q_{i,j}\mathbb {1}_{\{P_i |Q_{i,j}| \le \epsilon \}}\right| +\sup _{l\in \mathbb {Z}}\left| \sum _{j\le l}P_k Q_{k,j}\mathbb {1}_{\{P_k |Q_{k,j}| > \epsilon \}}- \sum _{j\le l}P_k Q_{k,j}\right| \\&\quad \le \sum _{T_i\le T_k}\sum _{j\in \mathbb {Z}}P_i|Q_{i,j}|\mathbb {1}_{\{P_i |Q_{i,j}|\le \epsilon \}}. \end{aligned}$$

Since, by similar arguments, one can obtain the same bound for the second term on the right-hand side of (4.19), (4.18) holds.

It now follows from (4.3) that

$$\begin{aligned} S_\epsilon ' \rightarrow V' \end{aligned}$$

almost surely in \((E,m)\).

Step 3 Recall that

$$\begin{aligned} \varvec{X}_{n,i} =(X_{(i-1)r_n+1},\ldots ,X_{ir_n})/a_n \end{aligned}$$

for \(i=1,\ldots ,k_n\) and let \(\tilde{S}_n'\) be an element in E defined by

$$\begin{aligned} \left( \left( \sum _{i/k_n \le t} s(\varvec{X}_{n,i})\right) _{t\in [0,1]} , \{i/k_n \}_{i=1}^{k_n} , \{I(i/k_n)\}_{i=1}^{k_n} \right) , \end{aligned}$$

where

$$\begin{aligned} I(i/k_n)=\sum _{j<i} s(\varvec{X}_{n,j}) +\left[ u(\varvec{X}_{n,i}) , v(\varvec{X}_{n,i})\right] . \end{aligned}$$

By [8, Theorem 4.2] and the previous two steps, to show that

$$\begin{aligned} \tilde{S}_n'{\mathop {\longrightarrow }\limits ^{d}}V' \end{aligned}$$

in \((E,m_E)\), it suffices to prove that, for all \(\delta >0,\)

$$\begin{aligned} \lim _{\epsilon \rightarrow 0}\limsup _{n\rightarrow \infty } \mathbb {P}(m_E(\tilde{S}_{n,\epsilon }',\tilde{S}_n')>\delta )=0. \end{aligned}$$
(4.20)

Note first that, by the same arguments as in the previous step, we have

$$\begin{aligned} m^*(\tilde{S}_{n,\epsilon }',\tilde{S}_n')\le \sum _{j=1}^{k_nr_n}\frac{|X_j|}{a_n}\mathbb {1}_{\{|X_j|\le a_n\epsilon \}}. \end{aligned}$$

By (4.3), Markov’s inequality and Karamata’s theorem ([9, Proposition 1.5.8])

$$\begin{aligned} \limsup _{n\rightarrow \infty } \mathbb {P}(m_E(\tilde{S}_{n,\epsilon }',\tilde{S}_n')>\delta )&\le \limsup _{n\rightarrow \infty } \frac{k_nr_n}{\delta a_n} \mathbb {E}\left[ |X_1|\mathbb {1}_{\{|X_1|\le a_n\epsilon \}}\right] \\&= \lim _{n\rightarrow \infty } \frac{n}{\delta a_n} \cdot \frac{\alpha a_n\epsilon \mathbb {P}(|X_1|>a_n\epsilon )}{1-\alpha } \\&= \frac{\alpha }{(1-\alpha )\delta }\epsilon ^{1-\alpha }. \end{aligned}$$

This proves (4.20) since \(\alpha <1\) and hence

$$\begin{aligned} \tilde{S}_n'{\mathop {\longrightarrow }\limits ^{d}}V' \end{aligned}$$

in \((E,m_E)\).

Step 4 Finally, to show that the original partial sum process \(S_n\) (and therefore \(V_n\), since \(\alpha \in (0,1)\)) also converges in distribution to \(V'\) in \((E,m_E)\), by a Slutsky argument it suffices to prove that

$$\begin{aligned} m_E(S_n,\tilde{S}_n'){\mathop {\longrightarrow }\limits ^{P}}0. \end{aligned}$$
(4.21)

Recall that \(k_n=\lfloor n / r_{n} \rfloor \) so \(\frac{i r_n}{n}\le \frac{i}{k_n}\) for all \(i=0,1,\ldots ,k_n\) and moreover

$$\begin{aligned} \frac{i}{k_n}-\frac{i r_n}{n} = \frac{i}{k_n}\left( 1-\frac{k_n r_n}{n}\right) \le 1-\frac{\lfloor n / r_{n} \rfloor }{n/r_n} = 1-\left( 1-\frac{\{ n / r_{n}\}}{n/r_n}\right) \le \frac{r_n}{n}. \end{aligned}$$
(4.22)

Let \(d_{n,i}\) for \(i=0,\ldots ,k_n-1\) be the Hausdorff distance between the restrictions of the graphs \(\varGamma _{S_n}\) and \(\varGamma _{\tilde{S}_n'}\) to the time intervals \((\frac{i r_n}{n},\frac{(i+1)r_n}{n}]\) and \((\frac{i}{k_n},\frac{i+1}{k_n}],\) respectively (see Fig. 3).

First note that, by (4.22), the time distance between any two points on these graphs is at most \(2r_n/n.\) Further, by construction, \(S_n\) and \(\tilde{S}_n'\) have the same range of values on these time intervals. More precisely,

$$\begin{aligned} \bigcup _{t\in (\frac{i r_n}{n},\frac{(i+1)r_n}{n}]}\{z\in \mathbb {R}:\; (t,z)\in \varGamma _{S_n}\}= & {} \bigcup _{t\in (\frac{i}{k_n},\frac{i+1}{k_n}]}\{z\in \mathbb {R}:\; (t,z)\in \varGamma _{\tilde{S}_n'}\} \\= & {} \tilde{S}_n'((i+1)/k_n). \end{aligned}$$

Therefore, the distance between the graphs comes only from the time component, i.e.

$$\begin{aligned} d_{n,i}\le \frac{2 r_n}{n}\rightarrow 0, \; \text { as } n\rightarrow \infty , \end{aligned}$$

for all \(i=0,1,\ldots ,k_n-1.\)

Moreover, if we let \(d_{n,k_n}\) be the Hausdorff distance between the restriction of the graph \(\varGamma _{S_n}\) on \((\frac{k_n r_n}{n},1]\) and the interval \((1,\tilde{S}_n'(1)),\) it holds that

$$\begin{aligned} d_{n,k_n} \le \frac{r_n}{n} \vee \sum _{j=k_n r_n+1}^n\frac{|X_j|}{a_n}{\mathop {\longrightarrow }\limits ^{P}}0, \end{aligned}$$

as \(n\rightarrow \infty .\) Hence, (4.21) holds since

$$\begin{aligned} m_E(S_n,\tilde{S}_n')\le \bigvee _{i=0}^{k_n}d_{n,i}, \end{aligned}$$

and this finishes the proof in the case \(\alpha \in (0,1)\).

(b) Assume now that \(\alpha \in [1,2)\). As shown in Step 1 of the proof of part (a), it holds that

$$\begin{aligned} \tilde{S}_{n,\epsilon }'{\mathop {\longrightarrow }\limits ^{d}}S_\epsilon ' \end{aligned}$$
(4.23)

in E, where \(\tilde{S}_{n,\epsilon }'=T^\epsilon (N_n'')\) and \(S_\epsilon '=T^\epsilon (N'')\).

For every \(\epsilon >0\) define a càdlàg process \(S_{n,\epsilon }\) by setting, for \(t\in [0,1],\)

$$\begin{aligned} S_{n,\epsilon }(t)=\sum _{i=1}^{\lfloor nt \rfloor } \frac{X_i}{a_n}\mathbb {1}_{\{|X_i|/a_n > \epsilon \}}. \end{aligned}$$

Using the same arguments as in Step 4 of the proof of part (a), it holds that, as \(n\rightarrow \infty ,\)

$$\begin{aligned} m_E(S_{n,\epsilon },\tilde{S}_{n,\epsilon }'){\mathop {\longrightarrow }\limits ^{P}}0. \end{aligned}$$
(4.24)

Therefore, by a Slutsky argument it follows from (4.24) and (4.23) that

$$\begin{aligned} S_{n,\epsilon }{\mathop {\longrightarrow }\limits ^{d}}S_\epsilon ' \end{aligned}$$
(4.25)

in \((E,m_E)\).

Since \(\alpha \in [1,2)\) we need to introduce centering, so we define the càdlàg process \(V_{n,\epsilon }\) by setting, for \(t\in [0,1],\)

$$\begin{aligned} V_{n,\epsilon }(t)=S_{n,\epsilon }(t)-\lfloor nt \rfloor \mathbb {E}\left( \frac{X_1}{a_n}\mathbb {1}_{\{\epsilon <|X_1|/a_n\le 1\}}\right) . \end{aligned}$$

From (1.4) we have, for any \(t \in [0,1],\) as \(n\rightarrow \infty ,\)

$$\begin{aligned} \lfloor nt \rfloor \mathbb {E}\left( \frac{X_1}{a_n}\mathbb {1}_{\{\epsilon< |X_1|/a_n \le 1\}}\right) \rightarrow t \int _{\{x: \;\epsilon < |x| \le 1\}} x \mu (dx). \end{aligned}$$
(4.26)

Since the limit function above is continuous and the convergence is uniform on [0, 1], by Lemma 6.3 and (4.25) it follows that

$$\begin{aligned} V_{n,\epsilon }{\mathop {\longrightarrow }\limits ^{d}}V_\epsilon ' \end{aligned}$$
(4.27)

in E, where \(V_\epsilon '\) is given by (see Remark 4.2)

$$\begin{aligned} V_\epsilon '(t)&= S_\epsilon '(t) -t \int _{\{x: \;\epsilon < |x| \le 1\}} x \mu (dx). \end{aligned}$$

Let \(V_\epsilon \) be the càdlàg part of \(V_\epsilon ',\) i.e.,

$$\begin{aligned} V_\epsilon (t) = \sum _{T_i\le t} s^\epsilon (P_i \varvec{Q}_i) - t \int _{\{x: \;\epsilon < |x| \le 1\}} x \mu (dx). \end{aligned}$$
(4.28)

By Lemma 6.4, there exists an \(\alpha \)-stable Lévy process V such that \(V_\epsilon \) converges uniformly almost surely (along some subsequence) to V.

Next, as in Step 2 of the proof of part (a), we can define an element \(V'\) in E by

$$\begin{aligned} V'=\left( V,\{T_i\}_{i\in \mathbb {N}}, \{I(T_i)\}_{i\in \mathbb {N}}\right) , \end{aligned}$$
(4.29)

where

$$\begin{aligned} I(T_i)=V(T_i-) +\left[ u(P_i \varvec{Q}_{i}) , v(P_i \varvec{Q}_{i})\right] . \end{aligned}$$
(4.30)

Also, one can argue similarly to the proof of (4.18) to conclude that

$$\begin{aligned} m^*(V',V_\epsilon ')\le ||V-V_\epsilon ||_\infty + \sup _{i\in \mathbb {N}} \sum _{j\in \mathbb {Z}} P_i|Q_{i,j}|\mathbb {1}_{\{P_i|Q_{i,j}|\le \epsilon \}}. \end{aligned}$$

Now it follows from (4.3), (4.12) and a.s. uniform convergence of \(V_\epsilon \) to V that

$$\begin{aligned} V_\epsilon '\rightarrow V' \; \text {a.s.} \end{aligned}$$
(4.31)

in \((E,m_E)\).

Finally, by (4.27), (4.31) and [8, Theorem 4.2], to show that

$$\begin{aligned} V_n{\mathop {\longrightarrow }\limits ^{d}}V', \end{aligned}$$
(4.32)

in \((E,m_E)\), it suffices to prove that, for all \(\delta >0,\)

$$\begin{aligned} \lim _{\epsilon \rightarrow 0}\limsup _{n\rightarrow \infty } \mathbb {P}(m_E(V_{n,\epsilon },V_n)>\delta )=0. \end{aligned}$$
(4.33)

But this follows from Assumption 4.4 and (4.3) since

$$\begin{aligned} m^*(V_{n,\epsilon },V_n) \le \max _{1\le k \le n} \left| \sum _{i=1}^k\left( \frac{X_i}{a_n} \mathbb {1}_{\{|X_i|/a_n\le \epsilon \}} - \mathbb {E}\left[ \frac{X_i}{a_n}\mathbb {1}_{\{|X_i|/a_n\le \epsilon \}}\right] \right) \right| . \end{aligned}$$

Hence (4.32) holds and this finishes the proof. \(\square \)

Fig. 3
Restrictions of the graphs \(\varGamma _{S_n}\) and \(\varGamma _{\tilde{S}_n'}\) to the time intervals \((\frac{i r_n}{n},\frac{(i+1)r_n}{n}]\) and \((\frac{i}{k_n},\frac{i+1}{k_n}],\) respectively.

4.3 Supremum of the partial sum process

We next show that the supremum of the partial sum process converges in distribution in D endowed with the \(M_1\) topology, where the limit is the “running supremum” of the limit process \(V'\) from Theorem 4.5.

Let V be the Lévy process defined in (4.11) and define the process \(V^+\) on [0, 1] by

$$\begin{aligned} V^+(t)=&{\left\{ \begin{array}{ll} V(t), &{} t\notin \{T_j\}_{j\in \mathbb {N}}\\ V(t-)+\sup _{k\in \mathbb {Z}}\sum _{j\le k}P_i Q_{i,j}, &{} t=T_i \text { for some } i\in \mathbb {N}.\ \ \end{array}\right. } \end{aligned}$$

Define \(V^-\) analogously, using the infimum instead of the supremum. Note that \(V^+\) and \(V^-\) need not be right-continuous at the jump times \(T_j.\) However, their running supremum and running infimum are càdlàg functions.

Theorem 4.12

Under the same conditions as in Theorem 4.5, it holds that

$$\begin{aligned} \left( \sup _{s\le t} V_n (s)\right) _{t\in [0,1]} {\mathop {\longrightarrow }\limits ^{d}}\left( \sup _{s\le t} V^+ (s)\right) _{t\in [0,1]}, \end{aligned}$$

and

$$\begin{aligned} \left( \inf _{s\le t} V_n (s)\right) _{t\in [0,1]} {\mathop {\longrightarrow }\limits ^{d}}\left( \inf _{s\le t} V^- (s)\right) _{t\in [0,1]}, \end{aligned}$$

jointly in \(D([0,1],\mathbb {R})\) endowed with the \(M_1\) topology.

Proof

We prove the result only for the supremum of the partial sum process, since the infimum case is completely analogous; joint convergence holds because we apply the continuous mapping argument to the same process.

Define the mapping \(\;\sup :E([0,1],\mathbb {R})\rightarrow D([0,1],\mathbb {R})\) by

$$\begin{aligned} \sup (x')(t)=\sup \{z:z\in x'(s), 0\le s \le t\}. \end{aligned}$$

Note that \(\sup (x')\) is non-decreasing, and since for every \(\delta >0\) there are at most finitely many times t for which \(diam(x'(t))\) is greater than \(\delta \), by [34, Theorem 15.4.1] it follows easily that this mapping is well-defined, i.e. that \(\sup (x')\) is indeed an element of D. Also, by construction,

$$\begin{aligned} \sup (V')=\left( \sup _{s\le t} V^+ (s)\right) _{t\in [0,1]} \end{aligned}$$

and

$$\begin{aligned} \sup (V_n)=\left( \sup _{s\le t} V_n (s)\right) _{t\in [0,1]}. \end{aligned}$$

Define the subset of E by

$$\begin{aligned} \Lambda =\left\{ x'\in E \, : \, x'(0)=\{0\}\right\} \end{aligned}$$

and assume that \(x_n'\rightarrow x'\) in \((E,m_E),\) where \(x_n',x'\in \Lambda \). By Theorem 4.3 it follows that

$$\begin{aligned} \sup (x_n')(t)=M_{0,t}(x_n')\rightarrow M_{0,t}(x')=\sup (x')(t) \end{aligned}$$

for all t in a dense subset of (0,1], including 1. Also, the convergence trivially holds for \(t=0\) since \(\sup (x_n')(0)=\sup (x')(0)=0\) for all \(n\in \mathbb {N}.\) Since \(\sup (x')\) is non-decreasing for all \(x'\in E,\) we can apply [34, Corollary 12.5.1] and conclude that

$$\begin{aligned} \sup (x_n') \rightarrow \sup (x') \end{aligned}$$

in D endowed with \(M_1\) topology. Since \(V_n,V'\in \Lambda \) almost surely, by Theorem 4.5 and continuous mapping argument it follows that

$$\begin{aligned} \left( \sup _{s\le t} V_n (s)\right) _{t\in [0,1]} {\mathop {\longrightarrow }\limits ^{d}}\left( \sup _{s\le t} V^+ (s)\right) _{t\in [0,1]} \end{aligned}$$

in D endowed with \(M_1\) topology. \(\square \)

Remark 4.13

Note that when \(\sum _{j\in \mathbb {Z}}Q_{j}=0\) a.s., the limit for the supremum of the partial sum process in Theorem 4.12 is simply a so-called Fréchet extremal process. For an illustration of the general limiting behavior of running maxima in the case of a linear process, consider again the moving average of order 1 from Example 4.9. Figure 1 shows a path (dashed line) of the running maxima of the MA(1) process \(X_t = \xi _{t} -0.7 \xi _{t-1}\).

4.4 \(M_2\) convergence of the partial sum process

We can now characterize the convergence of the partial sum process in the \(M_2\) topology in D([0, 1]) by an appropriate condition on the tail process of the sequence \(\{X_t\}\).

Assumption 4.14

The sequence \(\varvec{Q}\) satisfies

$$\begin{aligned} -\left( \sum _{j\in \mathbb {Z}} Q_{j} \right) _- = \inf _{k\in \mathbb {Z}} \sum _{j\le k} Q_{j} \le \sup _{k\in \mathbb {Z}} \sum _{j\le k} Q_{j} = \left( \sum _{j\in \mathbb {Z}} Q_{j} \right) _+ \; \hbox { a.s. } \end{aligned}$$
(4.34)

i.e. \(-s(\varvec{Q})_- = u(\varvec{Q}) \le v(\varvec{Q}) = s(\varvec{Q})_+\) a.s.

Note that this assumption ensures that \(\sum _{j\in \mathbb {Z}} Q_j\ne 0\) and that the limit process \(V'\) from Theorem 4.5 has sample paths in the subset \(D'\) of E which was defined in (4.5). By Lemma 4.1, Theorem 4.5 and the continuous mapping theorem, the next result follows immediately.

Theorem 4.15

If, in addition to the conditions of Theorem 4.5, Assumption 4.14 holds, then

$$\begin{aligned} V_n{\mathop {\longrightarrow }\limits ^{d}}V \end{aligned}$$

in \(D([0,1],\mathbb {R})\) endowed with the \(M_2\) topology.

Since the supremum functional is continuous with respect to the \(M_2\) topology, this result implies that the limit of the running supremum of the partial sum process is the running supremum of the limiting \(\alpha \)-stable Lévy process as in the case of i.i.d. random variables.

Example 4.16

For the linear process \( X_t = \sum _{j\in \mathbb {Z}} c_j \xi _{t-j}\) from Sect. 3.3 and Example 4.9, the corresponding sequence \(\{Q_j\}\) was given in (3.8). It follows that the condition (4.34) can be expressed as

$$\begin{aligned} - \left( \sum _{j\in \mathbb {Z}} c_j\right) _- = \inf _{k\in \mathbb {Z}} \sum _{j\le k} c_j \le \sup _{k\in \mathbb {Z}} \sum _{j\le k} c_j = \left( \sum _{j\in \mathbb {Z}} c_j\right) _+. \end{aligned}$$
(4.35)

This is exactly [4, Condition 3.2]. Note that (4.35) implies that

$$\begin{aligned} \left| \sum _{j\in \mathbb {Z}} c_j \right| > 0. \end{aligned}$$
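For a concrete coefficient sequence, (4.35) is a finite check; a small sketch (our own helper, for finitely supported \(\{c_j\}\)):

```python
import numpy as np

def satisfies_435(c, tol=1e-12):
    """Check condition (4.35) for a finitely supported coefficient sequence c."""
    ps = np.concatenate(([0.0], np.cumsum(c)))   # partial sums, incl. the empty one
    total = ps[-1]
    return (abs(ps.min() - min(total, 0.0)) < tol and
            abs(ps.max() - max(total, 0.0)) < tol)

satisfies_435([1.0, 0.7])    # True: nonnegative coefficients
satisfies_435([1.0, -0.7])   # False: sup of partial sums is 1 > (0.3)_+
```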

5 Record times

In this section we study record times in a stationary sequence \(\{X_t\}\). Since record times remain unaltered under a strictly increasing transformation, the main result below holds for stationary sequences with a general marginal distribution, as long as they can be monotonically transformed into a regularly varying sequence.

We start by introducing the notion of records for sequences in \(\tilde{l}_0\). For \(y\ge 0\) and \(\varvec{x}= \{x_j\} \in \tilde{l}_0\) define

$$\begin{aligned} R^{\varvec{x}} (y) = \sum _{j=-\infty }^\infty \mathbb {1}_{\{x_j > y \vee \sup _{i<j} x_i \}}, \end{aligned}$$

representing the number of records in the sequence \(\varvec{x}\) larger than y, which is finite for \(\varvec{x}\in \tilde{l}_0\). For notational simplicity, we drop the tilde from \(\tilde{\varvec{x}}\) in this section.
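In code, \(R^{\varvec{x}}(y)\) amounts to a single left-to-right scan; the following sketch (our own transcription of the definition) counts the entries exceeding both y and every earlier entry.

```python
def records_above(x, y):
    """R^x(y): number of records larger than y in the finite cluster x."""
    count, running_max = 0, y
    for v in x:                  # read the cluster in its original order
        if v > running_max:      # v exceeds y and all earlier entries
            count += 1
            running_max = v
    return count

records_above([0.5, 2.0, 1.0, 3.0], 0.8)   # -> 2 (the entries 2.0 and 3.0)
```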

Let \(\gamma = \sum _{i=1}^\infty \delta _{t_i,\varvec{x}^i} \in \mathcal {M}_p({[0,\infty ) \times \tilde{l}_0\setminus \{\tilde{\varvec{0}}\}})\), where \(\varvec{x}^i = \{x^i_j\} \in \tilde{l}_0\). Define, for \(t>0\)

$$\begin{aligned} M^{\gamma }(t) = \sup _{t_i \le t} \Vert \varvec{x}^i\Vert _\infty , \quad M^{\gamma }(t-) = \sup _{t_i < t} \Vert \varvec{x}^i\Vert _\infty , \end{aligned}$$

where we set \(\sup \emptyset = 0\) for convenience. Next, let \(R_\gamma \) be the (counting) point process on \((0,\infty )\) defined by

$$\begin{aligned} R_\gamma&= \sum _{i} \delta _{t_i} R^{\varvec{x}^i} (M^\gamma (t_i-)), \end{aligned}$$

hence for arbitrary \(0<a < b \)

$$\begin{aligned} R_\gamma (a,b]&= \sum _{a< t_i \le b} \sum _{j=-\infty }^\infty \mathbb {1}_{\{x^i_j > M^\gamma (t_i-) \vee \sup _{k<j} x^i_k \}}. \end{aligned}$$

Consider the following subset of \(\mathcal {M}_p({[0,\infty )\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}})\)

$$\begin{aligned} A =&\Bigg \{ \gamma = \sum _{i=1}^\infty \delta _{t_i,\varvec{x}^i} : M^\gamma (t)>0 \hbox { for all }t>0, \hbox { while } \\&\hbox {all }t_i\hbox {'s are mutually different, as well as all nonzero } x^i_j\hbox {'s} \Bigg \}. \end{aligned}$$

The space \(\mathcal {M}_p((0,\infty ))\) is endowed with the \(w^\#\) topology which is equivalent to the usual vague topology since \((0,\infty )\) is locally compact and separable.

Lemma 5.1

The mapping \(\gamma \mapsto R_\gamma \) from \(\mathcal {M}_p({[0,\infty )\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}})\) to \(\mathcal {M}_p((0,\infty ))\) is continuous at every \(\gamma \in A\).

Proof

Fix an arbitrary \(\gamma \in A\), and assume that a sequence \(\{\gamma _n\}\) in \( \mathcal {M}_p([0,\infty )\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\})\) satisfies \(\gamma _n \rightarrow _{w^\#} \gamma \). We must prove that \(R_{\gamma _n}\) converges vaguely to \(R_\gamma \) in \(\mathcal {M}_p((0,\infty ))\). By the Portmanteau theorem, it is sufficient to show that for all \(0<a< b \in \{ t_i\}^c\)

$$\begin{aligned} R_{\gamma _n} (a,b] \rightarrow R_{\gamma } (a,b]\,, \end{aligned}$$

as \(n\rightarrow \infty \). For \(0<a< b \in \{ t_i\}^c\) there are finitely many time instances, say \(t_{i_1},\ldots ,t_{i_k}\in (a,b]\) such that \(\Vert \varvec{x}^{i_l}\Vert _{\infty }> M^\gamma (a)>0\), hence

$$\begin{aligned} R_\gamma (a,b]&= \sum _{t_{i} \in (a,b]} R^{\varvec{x}^i} (M^{\gamma }(t_i-)) = \sum _{l=1}^{k} R^{\varvec{x}^{i_l}} (M^{\gamma }(t_{i_l}-)). \end{aligned}$$
(5.1)

For all \( \gamma _n = \sum _{i=1}^\infty \delta _{t^n_i,\varvec{x}^{n,i}}\) with n large enough, there also exist exactly k (depending on a and b) time instances \(t^n_{i_1},\ldots ,t^n_{i_k}\in (a,b]\) such that \(\Vert \varvec{x}^{n,i_l}\Vert _{\infty }> M^\gamma (a)\). Moreover, they satisfy \(\varvec{x}^{n,i_l} \rightarrow \varvec{x}^{i_l}\) and \(t^n_{i_l} \rightarrow t_{i_l}\) for \(l=1,\ldots ,k\) as \(n\rightarrow \infty \). Hence for n large enough

$$\begin{aligned} R_{\gamma _n}(a,b]&= \sum _{t^n_{i} \in (a,b]} R^{\varvec{x}^{n,i}} (M^{\gamma _n}(t^n_i-)) = \sum _{l=1}^{k} R^{\varvec{x}^{n,i_l}} (M^{\gamma _n}(t^n_{i_l}-)). \end{aligned}$$
(5.2)

Assume \(y_n \rightarrow y>0 \) and \(\varvec{x}^n \rightarrow \varvec{x}=\{x_j\}\), where the nonzero \(x_j\) are pairwise distinct and \(x_j\ne y\) for all \(j\in \mathbb {Z}\). Then it is straightforward to check that

$$\begin{aligned} R^{\varvec{x}^n}(y_n) \rightarrow R^{\varvec{x}}(y). \end{aligned}$$

Observe further that for the choice of \(t_{i_l}\), \(t^n_{i_l}\) made above, it holds that \(M^{\gamma _n} (t^n_{i_l}-) \rightarrow M^{\gamma } (t_{i_l}-)\) since the \(t_i\)'s are all different. Together with (5.1) and (5.2) this yields

$$\begin{aligned} R_{\gamma _n}(a,b] \rightarrow R_{\gamma }(a,b] \end{aligned}$$

as \(n\rightarrow \infty \). \(\square \)

Since we are only interested in records, for convenience we consider a nonnegative stationary regularly varying sequence \(\{X_t\}\).

Adopting the notation of Sect. 3.2, we denote

$$\begin{aligned} N''_n = \sum _{i=1}^{\infty } \delta _{(i/k_n,\varvec{X}_{n,i})}. \end{aligned}$$
(5.3)

We will also need the point process \(N_n\) defined by

$$\begin{aligned} N_n = \sum _{i=1}^{\infty } \delta _{(i/n,X_{i}/a_n)}. \end{aligned}$$

The process \(N_n\) can be viewed as a point process on \([0,\infty ) \times \mathbb {R}{\setminus }\{0\}\), but since \(\mathbb {R}\) can be embedded in \(\tilde{l}_0\) (by identifying a real number \(x\ne 0\) with the sequence having exactly one nonzero coordinate equal to x), in the sequel we treat it as a process on the space \([0,\infty )\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}\).

As in the previous section, we will assume that

$$\begin{aligned} N''_n{\mathop {\longrightarrow }\limits ^{d}}N''= \sum _{i=1}^\infty \delta _{(T_i, P_i\varvec{Q}_{i})}, \end{aligned}$$
(5.4)

as \(n\rightarrow \infty \), where \(N''\) has the same form as in (4.8), but on the space \(\mathcal {M}_p({[0,\infty )\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}})\). For \(\beta \)-mixing and linear processes, this convergence follows by direct extension of results in Sect. 3 from the state space \([0,1]\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}\) to \([0,T]\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}\) for arbitrary \(T>0\).

Theorem 5.2

Let \(\{X_t\}\) be a stationary regularly varying sequence with tail index \(\alpha >0\). Assume that the convergence in (5.4) holds and moreover that

$$\begin{aligned} \mathbb {P}( \hbox {all nonzero } Q_{1,j}\hbox {'s are mutually different}) = 1. \end{aligned}$$

Then

$$\begin{aligned} R_{N_n} {\mathop {\longrightarrow }\limits ^{d}}R_{N''}, \end{aligned}$$

in \(\mathcal {M}_p((0,\infty ))\). Moreover, the limiting process is a compound Poisson process with representation

$$\begin{aligned} R_{N''} = \sum _{i\in \mathbb {Z}} \delta _{\tau _i} \kappa _i, \end{aligned}$$

where \(\sum _{i\in \mathbb {Z}} \delta _{\tau _i}\) is a Poisson point process on \((0,\infty )\) with intensity measure \(x^{-1}\mathrm {d}x\) and \(\{\kappa _i\}\) is a sequence of i.i.d. random variables, independent of it, with the same distribution as the integer-valued random variable \(R^{\varvec{Q}_1} (1/\zeta )\), where \(\zeta \) is a Pareto random variable with tail index \(\alpha \), independent of \(\varvec{Q}_1\).
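The limit is straightforward to simulate on a window \((a,T]\): mapping a unit-rate Poisson process through \(x\mapsto T\mathrm {e}^{-x}\) produces points with intensity \(x^{-1}\mathrm {d}x\). A generative sketch (our own construction; the window and the number of generated gaps are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
T, a = 1.0, 0.01

gaps = rng.exponential(size=200)       # unit-rate Poisson process on (0, inf)
tau = T * np.exp(-np.cumsum(gaps))     # points with intensity x**(-1) dx below T
tau = tau[tau > a]                     # record times falling in the window (a, T]
# each tau_i would carry an i.i.d. cluster size kappa_i distributed as
# R^{Q_1}(1/zeta); for the MA(1) cluster of Example 5.3 below this law is
# the two-point law on {1, 2}.
```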

Proof

Since they are constructed from the same sequence \(X_1,X_2,\ldots \), the record values of the point process \(N''_n \) in (5.3) correspond to the record values of the point process \(N_n\). Using (5.4) and the additional assumption on the Q’s, by Lemma 5.1 it follows that

$$\begin{aligned} R_{N''_n}{\mathop {\longrightarrow }\limits ^{d}}R_{N''}. \end{aligned}$$
(5.5)

Note that the record times i/n of the process \(N_n\) appear at slightly altered times \((\lfloor i/r_n \rfloor +1) /k_n\) in the process \(N''_n\). However, asymptotically the record times are very close. Indeed, take \(f \in C_K^+(0,\infty )\); then f has support in a set of the form \((a,b]\), \(0<a<b\), which can be enlarged slightly to a set \((a-\epsilon ,b+\epsilon ]\), where \(0<a-\epsilon \) for a sufficiently small \(\epsilon >0\). Clearly, f is uniformly continuous on that set. Note further that for any such \(\epsilon \) there is an integer \(n_0\) such that for \(n\ge n_0 \), \(i/n \in (a,b]\) implies \((\lfloor i/r_n \rfloor +1) /k_n \in (a-\epsilon ,b+\epsilon ]\), and vice versa. Moreover, by uniform continuity of the function f, \(n_0\) can be chosen such that for \(n \ge n_0\),

$$\begin{aligned} \left| f\left( \frac{i}{n} \right) - f\left( \frac{\lfloor i/r_n \rfloor +1}{k_n} \right) \right| \le \epsilon . \end{aligned}$$

Consider now the difference between Laplace functionals of the point processes \(R_{N''_n}\) and \(R_{N_n}\) for a function \(f\in C_K^+(0,\infty )\) as above. Since \(\epsilon \) above can be made arbitrarily small, it follows that

$$\begin{aligned} \left| \mathbb {E}[\mathrm {e}^{-R_{N''_n}(f)}] - \mathbb {E}[\mathrm {e}^{-R_{N_n}(f)}] \right| \rightarrow 0, \end{aligned}$$

which together with (5.5) yields the convergence statement of the theorem.

Consider now a point measure \(\gamma = \sum _{i=1}^\infty \delta _{t_i,\varvec{x}^i} \in \mathcal {M}_p({[0,\infty )\times \tilde{l}_0{\setminus }\{\tilde{\varvec{0}}\}})\), but such that all \(\varvec{x}^i\) have only nonnegative components and all \(t_i\)’s are mutually different. We say that a point measure \(\gamma \) has a record at time t if \((t,\varvec{x}) \in \{(t_i,\varvec{x}^i) : i \ge 1 \}\) and \(M^{\gamma }(t-) = \sup _{t_i< t} \Vert \varvec{x}^i\Vert _{\infty } < \Vert \varvec{x}\Vert _{\infty }\). Taking the order into account, at time t we will see exactly \(R^{\varvec{x}} (M^{\gamma }(t-)) \) records. Similarly, a point measure \(\eta = \sum _{i=1}^\infty \delta _{t_i,x_i} \in \mathcal {M}_p({[0,\infty )\times [0,\infty )})\) has a record at time t with corresponding record value x, if \((t,x) \in \{(t_i,x_i) : i \ge 1 \}\) and \(\eta ([0,t)\times [x,\infty )) = 0\).

To prove the representation of the limit, observe that \(N''= \sum _{i=1}^\infty \delta _{(T_i, P_i\varvec{Q}_{i})} \) has records at exactly the same time instances as the process \(M_0 = \sum _{i=1}^\infty \delta _{(T_i, P_i)} \), since by the assumptions of the theorem and by definition of the sequences \(\varvec{Q}_i\), all of their components are in [0, 1] with one of them being exactly equal to 1. Because \(M_0\) is a Poisson point process on \([0,\infty )\times (0,\infty ]\) with intensity measure \(\mathrm {d}x \times \mathrm {d}(-\theta y^{-\alpha })\), it has infinitely many points in any set of the form \([a,b]\times [0,\epsilon ]\) with \(a<b\) and \(\epsilon >0\). Hence, one can a.s. write the record times of \(M_0\) as a double sided sequence \(\tau _n,\ n \in \mathbb {Z}\), such that \(\tau _n < \tau _{n+1}\) for each n. Fix an arbitrary \(s>0\), and assume without loss of generality that \(\tau _1\) represents the first record time strictly greater than s, i.e. \(\tau _1= \inf \{\tau _i : \tau _i >s\}\). Denote the corresponding successive record values by \(U_n\); they clearly satisfy \(U_n < U_{n+1}\), \(U_n = \sup _{T_i\le \tau _n} P_i\) and \(U_0 = \sup _{T_i\le s} P_i\). According to [30, Proposition 4.9], \(\sum _{n\in \mathbb {Z}}\delta _{\tau _n}\) is a Poisson point process with intensity \(x^{-1}\mathrm {d}x\) on \((0,\infty )\). Apply now [30, Proposition 4.7 (iv)] (note that \(U_0\) corresponds to Y(s) in the notation of that proposition) to conclude that \(\{U_{n}/U_{n-1},n\ge 1\}\) is a sequence of i.i.d. random variables with a Pareto distribution with tail index \(\alpha \). Because the record times \(\tau _n\) and record values \(U_n\) for \(n\ge 1\) of the point process \(M_0\) match the records of the point process \(N''= \sum _{i=1}^\infty \delta _{(T_i, P_i\varvec{Q}_{i})}\) on the interval \((s,\infty )\), we only need to count how many values exceeding the previous record \(U_{n-1}\) appear at any given record time \(\tau _n\). If, say, \(\tau _n=T_i\), that number corresponds to the number of \(Q_{i,j}\)'s which, after multiplication by the corresponding \(U_n=P_i\), represent a record larger than \(U_{n-1}\). Hence, that random number has the same distribution as

$$\begin{aligned} \kappa = R^{\varvec{Q}} (U_{0}/U_{1}). \end{aligned}$$

Recall that \(s>0\) was arbitrary. Now, since the point process \(\sum _{i=1}^\infty \delta _{(T_i, P_i)} \), and therefore the sequence \( \{U_n/U_{n-1}, n\ge 1 \}\), is independent of the i.i.d. random elements \(\{\varvec{Q}_i\}\), and since \(U_1/U_0\) has a Pareto distribution with tail index \(\alpha \), the claim follows. \(\square \)

Example 5.3

For an illustration of the previous theorem, consider the moving average process of order 1

$$\begin{aligned} X_t = \xi _{t} + c \xi _{t-1}, \end{aligned}$$

for a sequence of i.i.d. nonnegative random variables \(\{\xi _t,t\in \mathbb {Z}\}\) with regularly varying distribution and tail index \(\alpha >0\). Assume further that \(c>1\). By (3.8), the sequence \(\{Q_j\},\) as a random element of the space \(\tilde{l}_0,\) is in this case equal to the deterministic sequence \(\{\ldots , 0, 1/c,1,0,\ldots \}\). Intuitively speaking, in each cluster of extremely large values there are exactly two successive extreme values, the second one c times larger than the first. Therefore, each such cluster can give rise to at most 2 records. By a straightforward calculation, the random variables \(\kappa _i\) from Theorem 5.2 have the following distribution

$$\begin{aligned} \mathbb {P}(\kappa _i=2)=\mathbb {P}(1/\zeta \le 1/c)=\mathbb {P}(\zeta \ge c)=\frac{1}{c^\alpha }=1-\mathbb {P}(\kappa _i=1). \end{aligned}$$
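A Monte Carlo sanity check of this two-point law (our own sketch; the parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
alpha, c, n_sim = 1.5, 2.0, 100_000

zeta = (1 - rng.random(n_sim)) ** (-1 / alpha)   # Pareto: P(zeta > x) = x**(-alpha)
kappa = np.where(zeta > c, 2, 1)                 # second record iff 1/c > 1/zeta
print(kappa.mean() - 1, c ** (-alpha))           # empirical vs. theoretical P(kappa = 2)
```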

6 Lemmas

6.1 Metric on the space \(\tilde{l}_0\)

Let \((\mathbb {X},d)\) be a metric space. We define the distance between \(x\in \mathbb {X}\) and a subset \(B\subset \mathbb {X}\) by \(d(x,B)=\inf \{d(x,y):y\in B\}.\) Let \(\sim \) be an equivalence relation on \(\mathbb {X}\) and let \(\tilde{\mathbb {X}}\) be the induced quotient space. Define a function \(\tilde{d}:\tilde{\mathbb {X}}\times \tilde{\mathbb {X}}\rightarrow [0,\infty )\) by

$$\begin{aligned} \tilde{d}(\tilde{x},\tilde{y}) = \inf \{d(x',y'):x'\in \tilde{x},y'\in \tilde{y}\}, \end{aligned}$$

for all \(\tilde{x},\tilde{y}\in \tilde{\mathbb {X}}.\)

Lemma 6.1

Let \((\mathbb {X},d)\) be a complete separable metric space. Assume that for all \(\tilde{x},\tilde{y}\in \tilde{\mathbb {X}}\) and all \(x,x'\in \tilde{x}\) we have

$$\begin{aligned} d(x,\tilde{y})=d(x',\tilde{y}). \; \end{aligned}$$
(6.1)

Then \(\tilde{d}\) is a pseudo-metric which makes \(\tilde{\mathbb {X}}\) a separable and complete pseudo-metric space.

Proof

To prove that \(\tilde{d}\) is a pseudo-metric, the only nontrivial step is to show that \(\tilde{d}\) satisfies the triangle inequality, but that is implied by Condition (6.1). Separability is easy to check and it remains to prove that \((\tilde{\mathbb {X}},\tilde{d})\) is complete.

Let \(\{\tilde{x}_n\}\) be a Cauchy sequence in \((\tilde{\mathbb {X}},\tilde{d})\). Then we can find a strictly increasing sequence of nonnegative integers \(\{n_k\}\) such that

$$\begin{aligned} \tilde{d}(\tilde{x}_m,\tilde{x}_n)<\frac{1}{2^{k+1}}, \end{aligned}$$

for all integers \(m,n\ge n_k\) and for every integer \(k\ge 1\). We define a sequence of elements \(\{y_n\}\) in \(\mathbb {X}\) inductively as follows:

  • Let \(y_1\) be an arbitrary element of \(\tilde{x}_{n_1}.\)

  • For \(k\ge 1\) let \(y_{k+1}\) be an element of \(\tilde{x}_{n_{k+1}}\) such that \(d(y_k,y_{k+1})<\frac{1}{2^{k+1}}.\) Such a \(y_{k+1}\) exists by Condition (6.1).

Then the sequence \(\{y_n\}\) is a Cauchy sequence in \((\mathbb {X},d)\). Indeed, for every \(k\ge 1\) and for all \(m,n\ge k\) we have that

$$\begin{aligned} d(y_m,y_n) \le \sum _{l=m\wedge n}^{m\vee n -1} d(y_l,y_{l+1}) < \sum _{l=k}^{\infty }\frac{1}{2^{l+1}}=\frac{1}{2^k}. \end{aligned}$$

Since \((\mathbb {X},d)\) is complete, the sequence \(\{y_n\}\) converges to some \(x\in \mathbb {X}\). Let \(\tilde{x}\) be the equivalence class of x. It follows that the sequence \(\{\tilde{x}_{n_k}\}\) converges to \(\tilde{x}\) because \(\tilde{d}(\tilde{x}_{n_k},\tilde{x})\le d(y_k,x)\) by definition of \(\tilde{d}\). Finally, since \(\{\tilde{x}_n\}\) is a Cauchy sequence, it follows easily that the whole sequence \(\{\tilde{x}_n\}\) also converges to \(\tilde{x}\), hence \((\tilde{\mathbb {X}},\tilde{d})\) is complete. \(\square \)

Proof of Lemma 2.1

Since we have \(\Vert \theta ^k\varvec{x}-\theta ^l\varvec{y}\Vert _\infty =\Vert \theta ^{k-l}\varvec{x}-\varvec{y}\Vert _\infty \) for all \(\varvec{x},\varvec{y}\in l_0\) and \(k,l\in \mathbb {Z}\), it follows that

$$\begin{aligned} \tilde{d}(\tilde{\varvec{x}},\tilde{\varvec{y}}) = \inf \{\Vert \theta ^k\varvec{x}-\varvec{y}\Vert _\infty :k\in \mathbb {Z}\}=\inf \{\Vert \varvec{x}'-\varvec{y}\Vert _\infty :\varvec{x}'\in \tilde{\varvec{x}}\} , \end{aligned}$$

for all \(\tilde{\varvec{x}},\tilde{\varvec{y}}\in \tilde{l}_0\), and all \(\varvec{x}\in \tilde{\varvec{x}},\varvec{y}\in \tilde{\varvec{y}}\). In view of Lemma 6.1 it only remains to show that \(\tilde{d}\) is a metric, rather than just a pseudo-metric.

Assume that \(\tilde{d}(\tilde{\varvec{x}},\tilde{\varvec{y}})=0\) for some \(\tilde{\varvec{x}},\tilde{\varvec{y}}\in \tilde{l}_0\). Then, for arbitrary \(\varvec{x}\in \tilde{\varvec{x}},\varvec{y}\in \tilde{\varvec{y}}\), there exists a sequence of integers \(\{k_n\}\) such that \(\Vert \theta ^{k_n}\varvec{x}-\varvec{y}\Vert _\infty \rightarrow 0\), as \(n\rightarrow \infty \). It suffices to show that the sequence \(\{k_n\}\) is bounded: indeed, by passing to a convergent subsequence, it then follows that there exists an integer k such that \(\varvec{y}=\theta ^k\varvec{x}\), hence \(\tilde{\varvec{x}}=\tilde{\varvec{y}}.\) Suppose now that the sequence \(\{k_n\}\) is unbounded and that \(\varvec{y}\ne \varvec{0}\) (the case \(\varvec{y}=\varvec{0}\) is trivial). Without loss of generality, we may assume that \(k_n\rightarrow \infty \) as \(n\rightarrow \infty \). Since \(\varvec{y}\ne \varvec{0}\) and \(\lim _{|i|\rightarrow \infty } \Vert y_i\Vert = 0\), there exist integers \(i_0\) and \(N>0\) such that \(\Vert y_{i_0}\Vert =\Vert \varvec{y}\Vert _\infty >0\) and \(\Vert y_{i}\Vert <\Vert \varvec{y}\Vert _\infty / 4\) for \(|i|\ge N\). Since \(\Vert \theta ^{k_n}\varvec{x}-\varvec{y}\Vert _\infty \rightarrow 0\) there exists an integer \(n_0>0\) such that \(\Vert \theta ^{k_n}\varvec{x}-\varvec{y}\Vert _\infty <\Vert \varvec{y}\Vert _\infty / 4\) for \(n\ge n_0\). By our assumption, we can find an integer \(n\ge n_0\) such that \(k_n-k_{n_0}+i_0\ge N\), and it follows that

$$\begin{aligned} \frac{3}{4}\Vert \varvec{y}\Vert _\infty< \Vert (\theta ^{k_n}\varvec{x})_{i_0}\Vert = \Vert x_{k_n+i_0}\Vert = \Vert (\theta ^{k_{n_0}}\varvec{x})_{k_n-k_{n_0}+i_0}\Vert < \frac{1}{2}\Vert \varvec{y}\Vert _\infty , \end{aligned}$$

which is a contradiction. Hence, the sequence \(\{k_n\}\) is bounded. \(\square \)
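For two finitely supported representatives, the infimum over shifts in \(\tilde{d}\) is attained among the shifts whose supports overlap, so the metric is computable; a sketch (our own helper, for real-valued clusters padded with zeros):

```python
import numpy as np

def tilde_d(x, y):
    """Shift-invariant sup-metric between the classes of two finite clusters."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    best = max(np.abs(x).max(), np.abs(y).max())   # any non-overlapping shift
    for k in range(1 - len(x), len(y)):            # shifts whose supports overlap
        lo, hi = min(0, k), max(len(x) + k, len(y))
        xs, ys = np.zeros(hi - lo), np.zeros(hi - lo)
        xs[k - lo:k - lo + len(x)] = x             # theta^k x
        ys[-lo:len(y) - lo] = y
        best = min(best, np.abs(xs - ys).max())
    return best

tilde_d([2.0, -1.4], [0.0, 2.0, -1.4, 0.0])        # -> 0.0: same class
```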

6.2 Assumption 3.5 is a consequence of \(\beta \)-mixing

The \(\beta \)-mixing coefficients of the sequence \(\{X_j,j\in \mathbb {Z}\}\) are defined by

$$\begin{aligned} \beta _n = \frac{1}{2} \sup _{\mathcal {A},\mathcal {B}} \sum _{i\in I} \sum _{j\in J} | \mathbb {P}(A_i\cap B_j) - \mathbb {P}(A_i)\mathbb {P}(B_j)|, \end{aligned}$$

where the supremum is taken over all finite partitions \(\mathcal {A}=\{A_i,i\in I\}\) and \(\mathcal {B}=\{B_j,j\in J\}\) such that the sets \(A_i\) are measurable with respect to \(\sigma (X_k,k\le 0)\) and the sets \(B_j\) are measurable with respect to \(\sigma (X_k,k\ge n)\). See [32, Section 1.6].

Lemma 6.2

Assume that the sequence \(\{X_j\}\) is \(\beta \)-mixing with coefficients \(\{\beta _j,j\in \mathbb {N}\}\). Assume that there exists a sequence \(r_n\) satisfying Assumption 1.1 and a sequence \(\ell _n\) such that

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{\ell _n}{r_n} = \lim _{n\rightarrow \infty } \frac{n}{r_n} \beta _{\ell _n} = 0. \end{aligned}$$
(6.2)

Then Assumption 3.5 holds.

Proof

Write \(X_{i,j}\) for \((X_i,\ldots ,X_j)\). Set \(k_n = \lfloor n/r_n\rfloor \) and let \(\tilde{\mathbf {X}}_n\) be the vector of length \(k_n(r_n-\ell _n)\) which concatenates the subvectors \(X_{(j-1)r_n+1,jr_n-\ell _n}\), \(j=1,\ldots ,k_n\). Let \(\tilde{\mathbf {X}}_n^*\) be the vector built from independent blocks \(X_{(j-1)r_n+1,jr_n-\ell _n}^*\) which each have the same distribution as the corresponding original blocks \(X_{(j-1)r_n+1,jr_n-\ell _n}\). Applying [17, Lemma 2] and (6.2), we obtain

$$\begin{aligned} \mathrm {d}_{TV}(\mathcal {L}(\tilde{\mathbf {X}}_n),\mathcal {L}(\tilde{\mathbf {X}}_n^*)) \le k_n \beta _{\ell _n} = o(1). \end{aligned}$$
(6.3)

Set \(\tilde{X}_{n,i}= a_n^{-1}X_{(i-1)r_n+1,ir_n-\ell _n}\) and \(\tilde{X}^*_{n,i}=a_n^{-1}X^*_{(i-1)r_n+1,ir_n-\ell _n}\) and define the following point processes

$$\begin{aligned} \tilde{N}''_n = \sum _{i=1}^{k_n} \delta _{(i/k_n,\tilde{X}_{n,i})}, \ \ \tilde{N}^*_n = \sum _{i=1}^{k_n} \delta _{(i/k_n,\tilde{X}^*_{n,i})}. \end{aligned}$$

Let f be a nonnegative function defined on \([0,1]\times \tilde{l}_0\). Since the exponential of a negative function is less than 1, by definition of the total variation distance, the bound (6.3) yields

$$\begin{aligned} \left| \mathbb {E}\left[ \mathrm {e}^{-\tilde{N}''_n(f)}\right] - \mathbb {E}\left[ \mathrm {e}^{-\tilde{N}^*_n(f)}\right] \right| \le \mathrm {d}_{TV}(\mathcal {L}(\tilde{\mathbf {X}}_n),\mathcal {L}(\tilde{\mathbf {X}}_n^*)) = o(1). \end{aligned}$$
(6.4)

We must now check that the same limit holds with the full blocks instead of the truncated blocks. Under Assumption 1.1 (which holds for any sequence smaller than \(r_n\) hence for \(\ell _n\)), we know by [6, Proposition 4.2] that for every \(\epsilon >0\) and every sequence \(\{\ell _n\}\) such that \(\ell _n\rightarrow \infty \) and \(\ell _n\bar{F}(a_n)\rightarrow 0\),

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{\mathbb {P}(\max _{1\le i \le \ell _n} \Vert X_i\Vert>\epsilon a_n) }{\ell _n \mathbb {P}(\Vert X_0\Vert >a_n)} = \theta \epsilon ^{-\alpha }. \end{aligned}$$
(6.5)

Then, applying (6.5) yields,

$$\begin{aligned} \mathbb {P}\left( \max _{1\le j\le k_n} \max _{1\le i \le \ell _n} \Vert X_{jr_n-i+1}\Vert>\epsilon a_n\right)&\le k_n \mathbb {P}\left( \max _{1\le i \le \ell _n} \Vert X_{i}\Vert>\epsilon a_n\right) \\&= O(k_n \ell _n\mathbb {P}(\Vert X_0\Vert >a_n)) \\&= O(\ell _n/r_n) = o(1). \end{aligned}$$

Assume now that f depends only on the components greater than some \(\epsilon >0\) in norm. Then \(N''_n(f)=\tilde{N}''_n(f)\) unless at least one component at the end of one block is greater than \(\epsilon \) in norm. This yields

$$\begin{aligned} \left| \mathbb {E}[\mathrm {e}^{-N_n''(f)}] - \mathbb {E}[\mathrm {e}^{-\tilde{N}_n''(f)}] \right| \le \mathbb {P}\left( \max _{1\le j\le k_n} \max _{1\le i \le \ell _n} \Vert X_{jr_n-i+1}\Vert >\epsilon a_n\right) = o(1). \end{aligned}$$

The same relation also holds for the independent blocks. Therefore, Assumption 3.5 holds. \(\square \)

6.3 On continuity of addition in E

The next lemma gives sufficient conditions for continuity of addition in the space \(E([0,1],\mathbb {R})\).

Lemma 6.3

Suppose that \(\{x_n\}\) is a sequence in \(D([0,1],\mathbb {R})\) and \(x'=(x,S,\{I(t) : t\in S\})\) is an element of E such that \(x_n\rightarrow x'\) in E. Suppose also that \(\{b_n\}\) is a sequence in \(D([0,1],\mathbb {R})\) which converges uniformly to a continuous function b on [0, 1]. Then the sequence \(\{x_n-b_n\}\) converges in \((E,m_E)\) to the element \(x'-b\in E\) defined by

$$\begin{aligned} x'-b = (x-b,S,\{I(t)-b(t): t\in S\}). \end{aligned}$$

Proof

Recall the definition of \(m_E\) given in (4.1). By Whitt [34, Theorem 15.5.1], to show that \(x_n-b_n \rightarrow x'-b\) in E, it suffices to prove that

$$\begin{aligned} \sup _{(t,z)\in \varGamma _{x_n-b_n}}\Vert (t,z)-\varGamma _{x'-b}\Vert _{\infty }\rightarrow 0. \end{aligned}$$
(6.6)

Take an arbitrary \(\epsilon >0.\) Note that b is uniformly continuous, so by the conditions of the lemma there exist \(0<\delta \le \epsilon \) and \(n_0\in \mathbb {N}\) such that

  1. (i)

    \(|t-s|<\delta \Rightarrow |b(t)-b(s)|<\epsilon ,\)

  2. (ii)

    \(m_E(x_n,x')<\delta ,\) for all \(n\ge n_0\) and

  3. (iii)

    \(|b_n(t)-b(t)|<\epsilon ,\) for all \(t\in [0,1].\)

Also, since b is continuous, it easily follows that \(|b_n(t)-b_n(t-)|\le 2\epsilon \) for all \(n\ge n_0\) and \(t\in [0,1].\)

Take \(n\ge n_0\) and a point \((t,z)\in \varGamma _{x_n-b_n},\) i.e.

$$\begin{aligned} z\in [(x_n(t-)-b_n(t-))\wedge (x_n(t)-b_n(t)),(x_n(t-)-b_n(t-))\vee (x_n(t)-b_n(t))]. \end{aligned}$$

Since \(|b_n(t)-b_n(t-)|\le 2\epsilon \) there exists \(z'\in [x_n(t-)\wedge x_n(t),x_n(t-)\vee x_n(t)]\) (i.e. \((t,z')\in \varGamma _{x_n}\)), such that

$$\begin{aligned} |(z'-b_n(t))-z|\le 2\epsilon . \end{aligned}$$

Next, since \(m_E(x_n,x')<\delta ,\) there exists a point \((s,y)\in \varGamma _{x'}\) such that

$$\begin{aligned} |s-t|\vee |y-z'|<\delta . \end{aligned}$$

Note that \((s,y-b(s))\in \varGamma _{x'-b}\) and by previous arguments

$$\begin{aligned} |(y-b(s))-z|&=|(y-b(s))-z + (z' -b_n(t)) - (z' -b_n(t)) + b(t) - b(t)|\\&\le |y-z'|+|b(t)-b(s)|+|b_n(t)-b(t)|+|(z'-b_n(t))-z|\\&\le \delta + \epsilon + \epsilon + 2\epsilon \\&\le 5\epsilon . \end{aligned}$$

Also, \(|s-t|<\delta \le \epsilon .\) Hence, for all \(n\ge n_0,\)

$$\begin{aligned} \sup _{(t,z)\in \varGamma _{x_n-b_n}}\Vert (t,z)-\varGamma _{x'-b}\Vert _{\infty }\le 5\epsilon . \end{aligned}$$

and since \(\epsilon \) was arbitrary, (6.6) holds. \(\square \)

6.4 A lemma for partial sum convergence in E

Lemma 6.4

Let \(\alpha \in [1,2)\) and let the assumptions of Theorem 4.5 hold. Then there exists an \(\alpha \)-stable Lévy process V on [0, 1] such that, as \(\epsilon \rightarrow 0,\) the process \(V_\epsilon \) defined in (4.28) converges uniformly a.s. (along some subsequence) to V.

Proof

Recall that

$$\begin{aligned} V_\epsilon (t)&= \sum _{T_i\le t} s^\epsilon (P_i \varvec{Q}_i) - t \int _{\{x: \;\epsilon < |x| \le 1\}} x \mu (dx) \; \end{aligned}$$

where

$$\begin{aligned} \mu (dx)=p\alpha x^{-\alpha -1} \mathbb {1}_{(0,\infty )}(x) \mathrm {d}x + (1-p)\alpha (-x)^{-\alpha -1} \mathbb {1}_{(-\infty ,0)}(x) \mathrm {d}x \end{aligned}$$

for \(p=\mathbb {P}(\varTheta _0=1)\). We first show that the centering term can be expressed as an expectation of a functional of the limiting point process \(N''\). More precisely, we show that for all \(\epsilon >0\)

$$\begin{aligned} \int _{\{x: \;\epsilon< |x| \le 1\}} x \mu (dx) =\theta \int _0^\infty \mathbb {E}\left[ y\sum _{j\in \mathbb {Z}} Q_j \mathbb {1}_{\{\epsilon < y |Q_j| \le 1\}} \right] \alpha y^{-\alpha -1} \mathrm {d}y. \end{aligned}$$
(6.7)

First, as shown in [14, Theorem 3.2, Equation (3.13)] it holds that

$$\begin{aligned} \theta \mathbb {E}\left[ \sum _{j\in \mathbb {Z}} Q_j |Q_j|^{\alpha -1}\right] =2p-1, \end{aligned}$$
(6.8)

so by Fubini’s theorem, if \(\alpha >1\)

$$\begin{aligned} \theta \int _0^\infty \mathbb {E}\left[ y\sum _{j\in \mathbb {Z}} Q_j \mathbb {1}_{\{\epsilon < y |Q_j| \le 1\}} \right] \alpha y^{-\alpha -1} \mathrm {d}y&= \alpha \theta \mathbb {E}\left[ \sum _{j\in \mathbb {Z}} Q_j \int _{\epsilon |Q_j|^{-1}}^{|Q_j|^{-1}} y^{-\alpha } \mathrm {d}y \right] \\&= \frac{\alpha }{\alpha - 1} (\epsilon ^{-\alpha +1 } - 1) \theta \mathbb {E}\left[ \sum _{j\in \mathbb {Z}} Q_j |Q_j|^{\alpha -1}\right] \\&= \frac{\alpha }{\alpha - 1} (\epsilon ^{-\alpha +1 } - 1) (2p-1), \end{aligned}$$

and if \(\alpha =1\) the same term equals \( \log (\epsilon ^{-1}) (2p-1)\). The use of Fubini’s theorem is justified because the same calculation shows that the integral converges absolutely, as \(\mathbb {E}[\sum _{j\in \mathbb {Z}} |Q_j|^{\alpha }] < \infty \). The equality in (6.7) now follows by the definition of the measure \(\mu \). Hence, for all \(t\in [0,1]\)

$$\begin{aligned} V_\epsilon (t) = \sum _{T_i\le t} s^\epsilon (P_i \varvec{Q}_i) - t \theta \int _0^\infty \mathbb {E}\left[ y\sum _{j\in \mathbb {Z}} Q_j \mathbb {1}_{\{\epsilon < y |Q_j| \le 1\}} \right] \alpha y^{-\alpha -1} \mathrm {d}y. \end{aligned}$$

Recall from Remark 4.6 that we can define \(W=\sum _{j\in \mathbb {Z}} |Q_j|\), \(W_i=\sum _{j\in \mathbb {Z}}|Q_{i,j}|\) so that \(\{W_i,i\ge 1\}\) is a sequence of i.i.d. random variables with the same distribution as W and \(\mathbb {E}[W_i^\alpha ] <\infty \) and that \(\sum _{i=1}^\infty \delta _{P_i W_i}\) is a Poisson point process on \((0,\infty ]\) with intensity measure \(\theta \mathbb {E}[W_1^\alpha ] \alpha y^{-\alpha -1}dy.\) In particular, for every \(\delta >0\) there are almost surely at most finitely many points \(P_i W_i\) such that \(P_i W_i > \delta \). For \(\delta ,\epsilon >0\), define

$$\begin{aligned} m_{\epsilon ,\delta }&= \theta \int _0^\infty \mathbb {E}\left[ y\sum _{j\in \mathbb {Z}} Q_j \mathbb {1}_{\{\epsilon< y|Q_j|\le 1,\;\delta < yW\}}\right] \alpha y^{-\alpha -1}dy. \end{aligned}$$

Note that \(\lim _{\epsilon \rightarrow 0} m_{\epsilon ,\delta }=m_{0,\delta }\) for all \(\delta >0\) by the dominated convergence theorem. Indeed, if \(\alpha =1\) we have that

$$\begin{aligned}&\theta \int _0^\infty \mathbb {E}\left[ y\sum _{j\in \mathbb {Z}} |Q_j| \mathbb {1}_{\{y|Q_j|\le 1,\;\delta < yW\}} \right] \alpha y^{-\alpha -1} \mathrm {d}y \le \theta \mathbb {E}\left[ \sum _{j\in \mathbb {Z}} |Q_j| \int _{\frac{\delta \wedge 1}{W}}^{\frac{1}{|Q_j|}} y^{-1} \mathrm {d}y \right] \\&\quad = \theta \mathbb {E}\left[ \sum _{j\in \mathbb {Z}} |Q_j| \log (|Q_j|^{-1}) + W \log W + \log ((\delta \wedge 1)^{-1}) W \right] , \end{aligned}$$

which is finite by assumption (4.10); if \(\alpha >1\), a similar calculation using the assumption \(\mathbb {E}[W^\alpha ] <\infty \) justifies the use of the dominated convergence theorem.

Since for every \(\delta >0\) there a.s. exist at most finitely many points \(P_i W_i\) such that \(P_i W_i >\delta \), for every \(\epsilon \ge 0\) we can define the process \(V_{\epsilon ,\delta }\) in D[0, 1] by

$$\begin{aligned} V_{\epsilon ,\delta }(t)= & {} \sum _{T_i\le t} s^\epsilon (P_i \varvec{Q}_i)\mathbb {1}_{\{\delta< P_iW_i\}} - tm_{\epsilon ,\delta }\\= & {} \sum _{T_i\le t}\sum _{j\in \mathbb {Z}}P_i Q_{i,j} \mathbb {1}_{\{\epsilon<P_i |Q_{i,j}|,\; \delta < P_iW_i\}} - t m_{\epsilon ,\delta }. \end{aligned}$$

Furthermore, for every fixed \(\delta >0\), as \(\epsilon \rightarrow 0\), \(V_{\epsilon ,\delta }\) converges uniformly almost surely to \(V_{0,\delta }\).

Next, we prove that for any positive sequence \(\{\delta _k\}\) with \(\delta _k\searrow 0\) as \(k\rightarrow \infty \), \(V_{0,\delta _k}\) converges uniformly almost surely to a process V in D([0, 1]). Note first that by [14, Theorem 3.1] the finite dimensional distributions of \(V_{0,\delta }\) converge to those of an \(\alpha \)-stable Lévy process.

Since \(\sum _{i\ge 1}\delta _{T_i,P_i,\varvec{Q}_i}\) is a Poisson point process on \([0,1]\times (0,\infty ]\times \tilde{l}_0\), the process \(V_{0,\delta }\) has independent increments with respect to \(\delta \), that is for every \(\delta <\delta '\), \(V_{0,\delta }-V_{0,\delta '}\) is independent of \(V_{0,\delta '}\). Moreover, since \(V_{0,\delta }-V_{0,\delta '}\) is a Poisson integral, we have that

$$\begin{aligned} {\text {var}}(V_{0,\delta }(1)-V_{0,\delta '}(1) )&= \theta \int _0^\infty y^2 \mathbb {E}\left[ \left( \sum _{j\in \mathbb {Z}} Q_j \right) ^2\mathbb {1}_{\{\delta < y W \le \delta '\}} \right] \alpha y^{-\alpha -1} \mathrm {d}y \\&\le \theta \mathbb {E}\left[ W^2 \int _0^{\delta '/W} \alpha y^{-\alpha +1} \mathrm {d}y \right] = \frac{\theta \alpha (\delta ')^{2-\alpha }}{(2-\alpha )} \mathbb {E}[W^\alpha ]. \end{aligned}$$

Therefore, \(\lim _{\delta ' \rightarrow 0} {\text {var}}(V_{0,\delta }(1)-V_{0,\delta '}(1) )=0\) and now arguing exactly as in the proof of [31, Proposition 5.7, Property 2] shows that for any positive sequence \(\{\delta _k\}\) with \(\delta _k\searrow 0\), \(\{V_{0,\delta _k}\}\) is almost surely a Cauchy sequence in D([0, 1]) with respect to the supremum metric \(\Vert \cdot \Vert _\infty \). Since the space D([0, 1]) is complete under this metric, we obtain the existence of the process \(V=\{V(t), t\in [0,1]\}\) with paths in D([0, 1]) almost surely and such that \(\lim _{k\rightarrow \infty }\Vert V_{0,\delta _k}-V\Vert _\infty =0\) almost surely.

It only remains to prove that for all \(u>0\),

$$\begin{aligned} \lim _{\delta \rightarrow 0} \limsup _{\epsilon \rightarrow 0} \mathbb {P}( \Vert V_\epsilon -V_{\epsilon ,\delta }\Vert _\infty > u) = 0. \end{aligned}$$
(6.9)

Indeed, this would imply that \(\Vert V_\epsilon -V\Vert _\infty \rightarrow 0\) in probability and hence that, along some subsequence, \(V_\epsilon \) converges to V uniformly almost surely. Since for \(\delta \le 1\), \(yW=\sum _{j\in \mathbb {Z}}y|Q_j|\le \delta \) implies that \(y|Q_j|\le \delta \le 1\) for all \(j\in \mathbb {Z}\), we have that

$$\begin{aligned} V_\epsilon (t)-V_{\epsilon ,\delta }(t)= & {} \sum _{T_i\le t}\sum _{j\in \mathbb {Z}}P_i Q_{i,j} \mathbb {1}_{\{\epsilon< P_i|Q_{i,j}|,\; P_iW_i \le \delta \}} \\&-\,t \theta \int _0^\infty \mathbb {E}\left[ y\sum _{j\in \mathbb {Z}} Q_j \mathbb {1}_{\{\epsilon < y|Q_j|,\; y W \le \delta \}} \right] \alpha y^{-\alpha -1} \mathrm {d}y. \end{aligned}$$

The process \(V_\epsilon -V_{\epsilon ,\delta }\) is a càdlàg martingale, thus applying Doob’s maximal inequality yields

$$\begin{aligned} \mathbb {P}\left( \Vert V_\epsilon -V_{\epsilon ,\delta }\Vert _\infty > u \right)&\le u^{-2} {\text {var}}(V_\epsilon (1)-V_{\epsilon ,\delta }(1) ) \\&= u^{-2} \theta \int _0^\infty y^2 \mathbb {E}\left[ \left( \sum _{j\in \mathbb {Z}} Q_j \mathbb {1}_{\{\epsilon < y|Q_j|,\; y W \le \delta \}} \right) ^2 \right] \alpha y^{-\alpha -1} \mathrm {d}y \\&\le u^{-2} \theta \mathbb {E}\left[ W^2 \int _0^{\delta /W} \alpha y^{-\alpha +1} \mathrm {d}y \right] = \frac{\theta \alpha \delta ^{2-\alpha }}{u^2(2-\alpha )} \mathbb {E}[W^\alpha ] \; \end{aligned}$$

and hence (6.9) holds. \(\square \)