1 Introduction

Over the last decades, statistical analysis of functional time series has become a very active area of research [see the monographs Bosq (2000), Ferraty and Vieu (2006), Horváth and Kokoszka (2012) and Hsing and Eubank (2015), among others]. Many authors impose the assumption of stationarity, which allows for the development of advanced statistical theory. For instance, Bosq (2002) and Dehling and Sharipov (2005) investigate stationary functional processes with a linear representation, and Hörmann and Kokoszka (2010) provide a general framework to model functional observations from stationary processes. Frequency domain analysis of stationary functional time series has been considered by Panaretos and Tavakoli (2013), while van Delft and Eichler (2018) propose a new concept of local stationarity for functional data. The assumption of second-order stationarity is also of particular importance for prediction problems [see Antoniadis and Sapatinas (2003), Aue et al. (2015), Hyndman and Shang (2009), among others] and for dynamic principal component analysis (Hörmann et al. 2015).

Ideally, the assumption of stationarity should be checked before applying any statistical methodology. Several authors have considered this problem, in particular within the context of change point analysis, where the null hypothesis of stationarity is tested against the alternative of a structural change in certain parameters of the process; see Aue et al. (2009), Berkes et al. (2009), Horváth et al. (2010) and Aston and Kirch (2012), among others. Tests that are designed to be powerful against more general alternatives are often based on an analysis in the frequency domain. For example, Aue and van Delft (2017) generalize the approach of Dwivedi and Subba Rao (2011) and Jentsch and Subba Rao (2015) to functional time series. More precisely, they begin by showing that the functional discrete Fourier transform (fDFT) is asymptotically uncorrelated at distinct Fourier frequencies if and only if the process is weakly stationary. The corresponding test is then based on a quadratic form in a finite-dimensional projection of the empirical covariance operator of the fDFTs. Consequently, the properties of the test depend on the number of lagged fDFTs included. As an alternative, van Delft et al. (2017) construct a test using an estimate of a minimal distance between the spectral density operator of a non-stationary process and its best approximation by a spectral density operator corresponding to a stationary process (see also Dette et al. 2011 for a discussion of this approach in the univariate context). The test statistic consists of sums of Hilbert–Schmidt inner products of periodogram operators (evaluated at different frequencies) and is asymptotically normally distributed.

In the present paper, we propose an alternative time-domain test for second-order stationarity of a functional time series. More precisely, we suggest addressing the problem of detecting non-stationarity by individually checking the hypotheses that the mean and the autocovariance operators at a given lag, say h, of a collection (indexed by time) of approximating stationary functional time series are in fact time independent. As explained in the next paragraph, the individual tests are then combined to yield a joint test including autocovariances up to a given maximal lag H. Thus, the approach investigated here is similar in spirit to the classical Portmanteau tests for serial correlation of a univariate time series, where the hypothesis of white noise is checked by investigating whether correlations up to a given lag vanish (see Box and Pierce 1970; Ljung and Box 1978). For the problem of checking stationarity in real-valued time series, similar approaches have been taken by Jin et al. (2015) and Bücher et al. (2018).

To combine the individual tests for stationarity of the mean and of the autocovariance operators at a given lag h, we use appropriate extensions of well-known p value combination methods dating back to Fisher (1932). Each individual test relies on a block multiplier approach, which necessitates the choice of a joint block length parameter m. Following ideas put forward in Politis and White (2004), we propose a procedure that selects this parameter data-adaptively in such a way that, asymptotically, a certain MSE criterion is minimized.

The remainder of this article is organized as follows: In Sect. 2, we collect the necessary mathematical preliminaries. In Sect. 3, we first propose individual tests for the hypothesis of second-order stationarity which are particularly sensitive to deviations in the mean, the variance and a given lag-h autocovariance, respectively. The tests are then combined into a joint test for second-order stationarity which is sensitive to deviations in the mean, the variance and the first H autocovariances. In Sect. 4, we discuss an exemplary locally stationary time series model in great theoretical detail, while finite-sample results and a case study are presented in Sect. 5. The central proofs are collected in Sect. 6, while less central proofs and auxiliary results are provided in the supplementary material.

2 Mathematical preliminaries

2.1 Random elements in \(L^p\)-spaces

For some separable measurable space \((S, \mathcal {S},\nu )\) with a \(\sigma \)-finite measure \(\nu \) and \(p>1\), let \(\mathcal {L}^p(S, \nu )\) denote the set of measurable functions \(f:S \rightarrow \mathbb {R}\) such that \(\Vert f\Vert _p = (\int |f|^p {\,\mathrm {d}}\nu )^{1/p}<\infty \). For \(f\in \mathcal {L}^p(S, \nu )\), let [f] be the set of all functions g such that \(f=g\), \(\nu \)-almost surely. The space \(L^p(S, \nu )\) of all equivalence classes [f] then becomes a separable Banach space, and standard weak convergence theory is applicable. If S is a subset of \(\mathbb {R}^d\) and \(\nu \) is the Lebesgue measure, we occasionally write \(\mathcal {L}^p(S)\) and \(L^p(S)\).

Let \((\Omega , \mathcal {A}, \mathbb {P})\) denote a probability space and let \(X: S\times \Omega \rightarrow \mathbb {R}\) be \((\mathcal {S}\otimes \mathcal {A})\)-measurable such that \(X(\cdot , \omega ) \in \mathcal {L}^p(S, \nu )\) for \(\mathbb {P}\)-almost every \(\omega \). It follows from Lemma 6.1 in Janson and Kaijser (2015) that \(\omega \mapsto [X(\cdot , \omega )]\) is a random variable in \(L^p(S, \nu )\) (equipped with the Borel \(\sigma \)-field). Conversely, note that for any random variable [Y] in \(L^p(S, \nu )\), we can choose a \((\nu \otimes \mathbb {P})\)-a.s. unique \((\mathcal {S}\otimes \mathcal {A})\)-measurable mapping \(Y': S \times \Omega \rightarrow \mathbb {R}\) such that \(Y'(\cdot , \omega ) \in [Y](\omega )\) for \(\mathbb {P}\)-almost every \(\omega \). We can hence (a.s.) identify random variables in \(L^p(S, \nu )\) with measurable functions on \(S\times \Omega \) which are p-integrable in the first argument (\(\mathbb {P}\)-a.s.); slightly abusing notation, we also write X for the equivalence class [X].

A random variable X in \(L^2([0,1]^d)\) is called integrable if \(\mathbb {E}\Vert X\Vert _2 < \infty \). In that case, it follows from the Riesz representation theorem that there exists a unique element \(\mu _X =\mathbb {E}X \in L^2([0,1]^d)\) such that \(\mathbb {E}\langle X,f\rangle = \langle \mu _X,f\rangle \) for all \(f\in L^2([0,1]^d)\), where \(\langle f,g\rangle =\int _{\scriptscriptstyle [0,1]^d} fg {\,\mathrm {d}}\lambda _d\). If X is even square integrable, that is, \(\mathbb {E}\Vert X\Vert _2^2 < \infty \), the covariance operator of X is defined as the operator \(C_X: L^2([0,1]^d) \rightarrow L^2([0,1]^d)\) given by \(C_X(f)=\mathbb {E}[\langle f,X-\mu _X \rangle (X-\mu _X)]\). \(C_X\) is nuclear and hence a Hilbert–Schmidt operator (Bosq 2000, Sect. 1.5), whence, by Theorem 6.11 in Weidmann (1980), there exists a kernel \(c_X \in L^2([0,1]^d \times [0,1]^d)\) such that

$$\begin{aligned} C_X(f)(\tau ) = \int _{[0,1]^d} c_X(\tau ,\sigma ) f(\sigma ) {\,\mathrm {d}}\sigma \end{aligned}$$

for almost every \(\tau \in [0,1]^d\) and every \(f\in L^2([0,1]^d)\). Similarly, for square integrable random elements \(X,Y\in L^2([0,1]^d)\) we define the cross-covariance operator \(C_{X,Y}: L^2([0,1]^d) \rightarrow L^2([0,1]^d)\) by \(C_{X,Y}(f) = \mathbb {E}[\langle X- \mu _X, f\rangle (Y-\mu _Y)]\). By the same reasoning as above, there exists a kernel \(c_{X,Y}\in L^2([0,1]^d \times [0,1]^d)\) such that

$$\begin{aligned} C_{X,Y}(f)(\tau ) = \int _{[0,1]^d} c_{X,Y}(\tau ,\sigma ) f(\sigma ) {\,\mathrm {d}}\sigma . \end{aligned}$$

If X is in fact a \((\mathcal {B}([0,1]^d) \otimes \mathcal {A})\)-measurable function from \([0,1]^d \times \Omega \) to \(\mathbb {R}\) with \(X(\cdot , \omega ) \in \mathcal {L}^2([0,1]^d)\) a.s., then it can be shown that, in the respective \(L^2\)-spaces,

$$\begin{aligned} \mu _X(\tau )&= \mathbb {E}[X(\tau )], \\ c_{X}(\tau ,\sigma )&= \text {Cov}\{ X(\tau ), X(\sigma ) \}, \quad c_{X,Y}(\tau ,\sigma )= \text {Cov}\{ X(\tau ), Y(\sigma ) \}. \end{aligned}$$

By the preceding paragraph, this notation also makes sense for equivalence classes \(X,Y\in L^{2}([0,1]^d)\).
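
To make these objects concrete, the following minimal sketch (our own illustration, not part of the formal development; all function names and the grid-based discretization are our choices) computes empirical counterparts of the mean element and the covariance kernel for a sample of curves recorded on an equispaced grid of \([0,1]\), approximating all integrals by Riemann sums.

```python
import numpy as np

def empirical_mean_and_cov(X):
    """X: array of shape (N, P); row i holds the curve X_i on the grid."""
    mu = X.mean(axis=0)                # estimates mu_X(tau)
    Z = X - mu                         # centered curves
    c = Z.T @ Z / X.shape[0]           # c[s, t] estimates c_X(tau_s, tau_t)
    return mu, c

def apply_cov_operator(c, f, grid_step):
    """Discretization of C_X(f)(tau) = int c_X(tau, sigma) f(sigma) d sigma."""
    return c @ f * grid_step

# toy example: curves spanned by two Fourier functions
rng = np.random.default_rng(0)
N, P = 200, 101
tau = np.linspace(0, 1, P)
X = (rng.normal(size=(N, 1)) * np.sin(2 * np.pi * tau)
     + rng.normal(size=(N, 1)) * np.cos(2 * np.pi * tau))
mu_hat, c_hat = empirical_mean_and_cov(X)
Cf = apply_cov_operator(c_hat, np.sin(2 * np.pi * tau), grid_step=1 / (P - 1))
```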

2.2 Functional time series in \(L^2([0,1])\)

For each \(t\in \mathbb {Z}\), let \(X_{t}: [0,1] \times \Omega \rightarrow \mathbb {R}\) denote a \((\mathcal {B}([0,1]) \otimes \mathcal {A})\)-measurable function with \(X_{t}(\cdot , \omega ) \in \mathcal {L}^2([0,1])\). By the preceding section, we can regard \([X_{t}]\) as a random variable in \(L^2([0,1])\), which we also write as \(X_{t}\). The sequence \( (X_{t})_{t\in \mathbb {Z}} \) will be referred to as a functional time series in \(L^2([0,1])\).

The functional time series will be called (strictly) stationary if, for all \(q\in \mathbb {N}\) and all \(h,t_1,\dots ,t_q\in \mathbb {Z}\),

$$\begin{aligned} (X_{t_1+h},\dots ,X_{t_q+h})\overset{d}{=} (X_{t_1},\dots ,X_{t_q}) \end{aligned}$$

in \(L^2([0,1])^{q}\).

Let \(\rho >0\). A sequence of functional time series \((X_{t,T})_{t\in \mathbb {Z}}\), indexed by \(T\in \mathbb {N}\), is called locally stationary (of order \(\rho \)) if, for any \(u\in [0,1]\), there exists a strictly stationary functional time series \(\{X_t^{\scriptscriptstyle (u)} \mid t\in \mathbb {Z}\}\) in \(L^2([0,1])\) and an array of real-valued random variables \(\{P_{\scriptscriptstyle t,T}^{\scriptscriptstyle (u)} \mid t=1,\dots ,T\}_{T\in \mathbb {N}}\) whose moments \(\mathbb {E}[|P_{t,T}^{\scriptscriptstyle (u)}|^\rho ]\) are bounded uniformly in \(1\le t\le T\), \(T\in \mathbb {N}\) and \(u\in [0,1]\), such that

$$\begin{aligned} \Vert X_{t,T}-X_t^{(u)}\Vert _2 \le \bigg (\bigg |\frac{t}{T}-u\bigg |+\frac{1}{T}\bigg )P_{t,T}^{(u)} \end{aligned}$$
(1)

for all \(t=1,\dots ,T\), \(T\in \mathbb {N}\) and \(u\in [0,1]\). This concept of local stationarity was first introduced by Vogt (2012) for p-dimensional time series (\(p\in \mathbb {N}\)). By the arguments in the preceding section, we may assume that \(X_t^{\scriptscriptstyle (u)}\) is in fact a \((\mathcal {B}([0,1])\otimes \mathcal {A})\)-measurable function from \([0,1]\times \Omega \) to \(\mathbb {R}\) such that \(X_t^{\scriptscriptstyle (u)}(\cdot , \omega ) \in \mathcal {L}^2([0,1])\) for \(\mathbb {P}\)-almost every \(\omega \). In the subsequent sections, we will usually assume that \(\rho \ge 2\) and that \(\mathbb {E}[ \Vert X_t^{(u)}\Vert _2^2]< \infty \) for all \(u\in [0,1]\). Despite the fact that \(\{(X_{t,T})_{t\in \mathbb {Z}}: T\in \mathbb {N}\}\) is a sequence of time series, we will occasionally simply call \((X_{t,T})_{t\in \mathbb {Z}}\) a locally stationary time series.

2.3 Further notation

In the following, we will deal with different norms on the spaces \(L^p([0,1]^d)\), for \(p\ge 1,d\in \mathbb {N}\). To avoid confusion, we denote the corresponding norms by \(\Vert \cdot \Vert _{p,d}\). As a special case, we will write \(\Vert \cdot \Vert _p\) instead of \(\Vert \cdot \Vert _{p,1}\). Further, we introduce the notation \(\Vert \cdot \Vert _{p,\Omega \times [0,1]^d}\) for the p-norm on the space \(L^p(\Omega \times [0,1]^d,\mathbb {P}\otimes \lambda ^d)\). Finally, we define \((f\otimes g)(x,y)=f(x)g(y)\) for functions \(f,g\in L^p([0,1])\).

3 Detecting deviations from second-order stationarity

3.1 Second-order stationarity in locally stationary time series

Before we can propose suitable test statistics for detecting deviations from second-order stationarity in a locally stationary functional time series, we need to clarify what is meant by second-order stationarity. Loosely speaking, we want to test the null hypothesis that neither the mean nor the (auto)covariances vary over time. Meaningful asymptotic results will be obtained by formulating these null hypotheses in terms of the approximating sequences \(\{X_{t}^{\scriptscriptstyle (u)}: t \in \mathbb {Z}\}\) defined in Sect. 2.2. More precisely, we will subsequently assume that \(\mathbb {E}[ \Vert X_t^{(u)}\Vert _2^2]< \infty \) for all \(u\in [0,1]\) and consider the hypotheses

$$\begin{aligned} H_0^{(m)}: \Vert \mathbb {E}[X_0^{(u)}]-\mathbb {E}[X_0^{(v)}]\Vert _2=0 \quad \text { for all } u,v \in [0,1] \end{aligned}$$
(2)

and, for some lag \(h\ge 0\),

$$\begin{aligned} H_0^{(c,h)}:\Vert \mathbb {E}[X_{0}^{(u)} \otimes X_h^{(u)}]-\mathbb {E}[X_{0}^{(v)} \otimes X_h^{(v)}]\Vert _{2,2}=0 \quad \text { for all } u,v \in [0,1]. \end{aligned}$$
(3)

Note that the intersection

$$\begin{aligned} H_0=H_0^{(m)} \cap H_0^{(c,0)} \cap H_0^{(c,1)} \cap \dots \end{aligned}$$

corresponds to the case where the approximating sequences \(\{X_{t}^{\scriptscriptstyle (u)}: t \in \mathbb {Z}\}\), indexed by \(u\in [0,1]\), all share the same first- and second-order characteristics. We will therefore call the sequence of time series \((X_{t,T})_{t\in \mathbb {Z}}\), indexed by \(T\in \mathbb {N}\), second-order stationary if the global hypothesis \(H_0\) is met. The test statistics we are going to propose will be particularly sensitive to deviations from (weak) stationarity in the mean, the variance, and the first H autocovariances, which leads us to define

$$\begin{aligned} H_0^{(H)} = H_0^{(m)} \cap H_0^{(c,0)} \cap H_0^{(c,1)} \cap \dots \cap H_0^{(c,H)}, \end{aligned}$$
(4)

where \(H \in \mathbb {N}_0\) is fixed and denotes the maximal lag under consideration.

Remark 1

The hypotheses \(H_0^{\scriptscriptstyle (m)}\) and \(H_0^{\scriptscriptstyle (c,h)}\) do not depend on the choice of the approximating family \(\{X_t^{\scriptscriptstyle (u)}: t\in \mathbb {Z}\}_{u\in [0,1]}\). Indeed, suppose there were two approximating families \(\{X_t^{\scriptscriptstyle (u)}: t\in \mathbb {Z}\}_{u\in [0,1]}\) and \(\{Y_t^{\scriptscriptstyle (u)}: t\in \mathbb {Z}\}_{u\in [0,1]}\) satisfying (1). By stationarity and the triangle inequality, we have, for any \(u\in [0,1]\), \(t\in \mathbb {Z}\) and \(T\in \mathbb {N}\),

$$\begin{aligned} \mathbb {E}\Vert X_t^{(u)}-Y_t^{(u)}\Vert _2&= \mathbb {E}\Vert X_{\lfloor uT\rfloor }^{(u)}-Y_{\lfloor uT\rfloor }^{(u)}\Vert _2 \\&\le \mathbb {E}\Vert X_{\lfloor uT\rfloor }^{(u)}-X_{\lfloor uT\rfloor ,T}\Vert _2 +\mathbb {E}\Vert X_{\lfloor uT\rfloor ,T}-Y_{\lfloor uT\rfloor }^{(u)}\Vert _2 \le \frac{C}{T}. \end{aligned}$$

This implies \(\mathbb {E}\Vert X_t^{\scriptscriptstyle (u)}-Y_t^{\scriptscriptstyle (u)}\Vert _2=0\) and hence \(\Vert X_t^{\scriptscriptstyle (u)}-Y_t^{\scriptscriptstyle (u)}\Vert _2=0\) almost surely. \(\square \)

The following lemma provides two interesting equivalent formulations of each of the above hypotheses. Introduce the notation \(M:[0,1]^2\rightarrow \mathbb {R}\) and \(M_h:[0,1]^3\rightarrow \mathbb {R}\), defined by

$$\begin{aligned} M(u,\tau )&=\int _0^u \mathbb {E}[X_0^{(w)}(\tau )]{\,\mathrm {d}}w-u\int _0^1 \mathbb {E}[X_0^{(w)}(\tau )]{\,\mathrm {d}}w, \end{aligned}$$
(5)
$$\begin{aligned} M_h(u,\tau _1,\tau _2)&=\int _0^u \mathbb {E}[X_0^{(w)}(\tau _1)X_h^{(w)}(\tau _2)]{\,\mathrm {d}}w-u\int _0^1 \mathbb {E}[X_0^{(w)}(\tau _1)X_h^{(w)}(\tau _2)]{\,\mathrm {d}}w. \end{aligned}$$
(6)

Lemma 1

Let \(\{(X_{t,T})_{t\in \mathbb {Z}}: T\in \mathbb {N}\}\) denote a locally stationary functional time series of order \(\rho \ge 4\) with approximating sequences \((X_{t}^{\scriptscriptstyle (u)})_{t \in \mathbb {Z}}\) satisfying \(\mathbb {E}[\Vert X_{0}^{\scriptscriptstyle (u)} \Vert _2^4]<\infty \) for all \(u\in [0,1]\). Then, the hypothesis \(H_0^{\scriptscriptstyle (m)}\) in (2) is met if and only if

$$\begin{aligned} \Vert M \Vert _{2,2} =0. \end{aligned}$$
(7)

Likewise, for any \(h\in \mathbb {N}_0\), \(H_0^{\scriptscriptstyle (c,h)}\) in (3) is met if and only if

$$\begin{aligned} \Vert M_h \Vert _{2,3} =0. \end{aligned}$$
(8)

Moreover, the hypothesis \(H_0^{\scriptscriptstyle (m)}\) is equivalent to

$$\begin{aligned}&\exists \ C>0: \quad \Vert \mathbb {E}[X_{\lfloor uT\rfloor ,T}]-\mathbb {E}[X_{0,T}]\Vert _{2}\le \frac{C}{T} \quad \text {for all } u\in [0,1], T \in \mathbb {N}, \end{aligned}$$
(9)

and \(H_0^{\scriptscriptstyle (c,h)}\) is equivalent to

$$\begin{aligned}&\exists \ C>0: \quad \Vert \mathbb {E}[X_{\lfloor uT\rfloor ,T}\otimes X_{\lfloor uT\rfloor +h,T} - X_{0,T}\otimes X_{h,T}]\Vert _{2,2} \le \frac{C}{T} \nonumber \\&\quad \text {for all } u\in [0,1], T \in \mathbb {N}. \end{aligned}$$
(10)

The lemma is proven in Sect. 6.2. We will rely heavily on conditions (7) and (8) when constructing the test statistics in the next section. Assertions (9) and (10) are interesting in their own right, as they provide a sub-asymptotic formulation of the hypothesis of second-order stationarity. They are used in the next section to show that the tests are consistent, and they will also be crucial when extending consistency results to the case of piecewise locally stationary processes in Sect. 3.5.

3.2 Test statistics

In the subsequent sections, we assume that we observe, for some \(T\in \mathbb {N}\), an excerpt \(X_{1,T}, \dots , X_{T,T}\) from a locally stationary time series \(\{(X_{t,T})_{t\in \mathbb {Z}}:T\in \mathbb {N}\}\). We are interested in testing the hypotheses \(H_0^{\scriptscriptstyle (m)}\) and \(H_0^{\scriptscriptstyle (c,h)}\) formulated in the preceding section, which can be done individually by a CUSUM-type procedure. More precisely, for \(u,\tau \in [0,1]\), let

$$\begin{aligned} U_T(u,\tau )&= \frac{1}{\sqrt{T}}\left( \sum _{t=1}^{\lfloor uT\rfloor }X_{t,T}(\tau )-u\sum _{t=1}^{T}X_{t,T}(\tau ) \right) \end{aligned}$$
(11)

denote the CUSUM process for the mean, and, for \(u,\tau _1, \tau _2\in [0,1]\) and \(h \in \mathbb {N}_0\), let

$$\begin{aligned}&{U}_{T,h}(u,\tau _1,\tau _2)\nonumber \\&\quad = \frac{1}{\sqrt{T}}\left( \sum _{t=1}^{\lfloor uT\rfloor \wedge (T-h)}X_{t,T}(\tau _1)X_{t+h,T}(\tau _2)-u\sum _{t=1}^{T-h}X_{t,T}(\tau _1)X_{t+h,T}(\tau _2) \right) \end{aligned}$$
(12)

denote the CUSUM process for the (auto)cross-moments at lag \(h\). Under the null hypothesis \(H_0^{\scriptscriptstyle (m)}\), \(T^{-1/2} {U}_T(u,\tau )\) can be regarded as an estimator of the quantity \(M(u,\tau )\) defined in (5), and a similar statement holds for \(T^{-1/2} {U}_{T,h}\), which estimates \(M_h\) in (6). Hence, by Lemma 1, it seems reasonable to reject \(H_0^{\scriptscriptstyle (m)}\) or \(H_0^{\scriptscriptstyle (c,h)}\) for large values of

$$\begin{aligned} \mathcal {S}_T^{(m)} = \Vert {U}_{T} \Vert _{2,2} \quad \text { or } \quad \mathcal {S}_T^{(c,h)} = \Vert {U}_{T,h} \Vert _{2,3}, \end{aligned}$$
(13)

respectively.

Alternatively, one could use the \(L^2\)-norm in \(\tau \) and \((\tau _1,\tau _2)\), respectively, and the supremum in u, as proposed in Sharipov et al. (2016). However, preliminary simulation results suggested that a test based on the \(L^2\)-norm in u performs better in applications with small sample sizes.
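
For illustration, the following sketch shows how the CUSUM processes (11) and (12) and the statistics (13) might be computed for curves recorded at P equispaced points of \([0,1]\); the discretization scheme and all names are our own choices, and every \(L^2\)-norm is approximated by a grid average.

```python
import numpy as np

def cusum_mean(X):
    """U_T(u, tau) from (11) on the grid u = k/T; X has shape (T, P)."""
    T = X.shape[0]
    S = np.vstack([np.zeros(X.shape[1]), np.cumsum(X, axis=0)])  # partial sums
    u = np.arange(T + 1)[:, None] / T
    return (S - u * S[-1]) / np.sqrt(T)

def cusum_lag(X, h):
    """U_{T,h}(u, tau1, tau2) from (12); returns shape (T+1, P, P)."""
    T = X.shape[0]
    prod = np.einsum('ti,tj->tij', X[:T - h], X[h:])   # X_t(tau1) X_{t+h}(tau2)
    S = np.concatenate([np.zeros((1,) + prod.shape[1:]),
                        np.cumsum(prod, axis=0)])
    idx = np.minimum(np.arange(T + 1), T - h)          # floor(uT) ^ (T - h)
    u = np.arange(T + 1)[:, None, None] / T
    return (S[idx] - u * S[-1]) / np.sqrt(T)

def l2_norm(U):
    """L^2-norm in all arguments, approximated by the grid average."""
    return np.sqrt(np.mean(U ** 2))

# statistics from (13):
# stat_mean = l2_norm(cusum_mean(X)); stat_cov_h = l2_norm(cusum_lag(X, h))
```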

In Sect. 3.4, we will propose a procedure that allows us to combine the previous test statistics into a joint test for the combined hypothesis \(H_0^{\scriptscriptstyle (H)}\), with maximal lag \(H\in \mathbb {N}_0\) fixed. For that purpose, we will first need (asymptotic) critical values for the individual test statistics \(\mathcal {S}_T^{(m)}\) and \(\mathcal {S}_T^{(c,h)}\), which in turn can be deduced from the joint asymptotic distribution of the CUSUM processes in (11) and (12). The basic tools are the following partial sum processes

$$\begin{aligned} \tilde{B}_{T}(u,\tau )&=\frac{1}{\sqrt{T}}\sum _{t=1}^{\lfloor uT\rfloor } \big \{ X_{t,T}(\tau )-\mathbb {E}[X_{t,T}(\tau )] \big \}, \\ \tilde{B}_{T,h}(u,\tau _1,\tau _2)&=\frac{1}{\sqrt{T}}\sum _{t=1}^{\lfloor uT\rfloor \wedge (T-h)} \big \{ X_{t,T}(\tau _1)X_{t+h,T}(\tau _2)-\mathbb {E}[X_{t,T}(\tau _1)X_{t+h,T}(\tau _2)] \big \}, \end{aligned}$$

where \(u,\tau ,\tau _1, \tau _2 \in [0,1]\) and \(h\in \mathbb {N}_0\). The expected values within the sums will be denoted by

$$\begin{aligned} \mu _{t,T}(\tau ) = \mathbb {E}[X_{t,T}(\tau )] \quad \text {and} \quad \mu _{t,T,h}(\tau _1, \tau _2) = \mathbb {E}[X_{t,T}(\tau _1)X_{t+h,T}(\tau _2)]. \end{aligned}$$

The following assumptions are sufficient to guarantee weak convergence of these processes.

Condition 1

(Assumptions on the functional time series)

  1. (A1)

    Local stationarity The observations \(X_{1,T}, \dots , X_{T,T}\) are an excerpt from a locally stationary functional time series \(\{(X_{t,T})_{t\in \mathbb {Z}}:T\in \mathbb {N}\}\) of order \(\rho =4\) in \(L^2([0,1],\mathbb {R})\).

  2. (A2)

    Moment condition For any \(k\in \mathbb {N}\), there exists a constant \(C_k<\infty \) such that \(\mathbb {E}\Vert X_{t,T}\Vert _2^k\le C_k\) and \(\mathbb {E}\Vert X_0^{\scriptscriptstyle (u)}\Vert _2^k\le C_k\) uniformly in \(t\in \mathbb {Z},T\in \mathbb {N}\) and \(u\in [0,1]\).

  3. (A3)

    Cumulant condition For any \(j\in \mathbb {N}\), there is a constant \(C_j <\infty \) such that

    $$\begin{aligned} \sum _{t_1,\dots ,t_{j-1}=-\infty }^{\infty } \big \Vert {{\,\mathrm{cum}\,}}(X_{t_1,T},\dots ,X_{t_j,T})\big \Vert _{2,j} \le C_j<\infty , \end{aligned}$$
    (14)

    for any \(t_j\in \mathbb {Z}\) (for \(j=1\), the condition is to be interpreted as \(\Vert \mathbb {E}X_{t_1,T}\Vert _2\le C_1\) for all \(t_1 \in \mathbb {Z}\)). Further, for \(k\in \{2,3,4\}\), there exist functions \(\eta _k:\mathbb {Z}^{k-1}\rightarrow \mathbb {R}\) satisfying

    $$\begin{aligned} \sum _{t_1,\dots ,t_{k-1}=-\infty }^{\infty } (1+|t_1|+\dots +|t_{k-1}|)\eta _k(t_1,\dots ,t_{k-1}) < \infty \end{aligned}$$

    such that, for any \(T\in \mathbb {N}, 1 \le t_1 , \dots , t_k \le T, v, u_1, \dots , u_k \in [0,1],h_1,h_2\in \mathbb {Z}\), \(Z_{t,T}^{\scriptscriptstyle (u)}\in \{X_{\scriptscriptstyle t, T},X_{t}^{\scriptscriptstyle (u)}\}\), and any \(Y_{t,h,T}(\tau _1,\tau _2)\in \{ X_{t,T}(\tau _1), X_{t,T}(\tau _1)X_{t+h,T}(\tau _2) \}\), we have

    1. (i)

      \(\Vert {{\,\mathrm{cum}\,}}(X_{t_1,T}-X_{t_1}^{(t_1/T)},Z_{t_2,T}^{(u_2)},\ldots ,Z_{t_k,T}^{(u_k)})\Vert _{2,k} \le \frac{1}{T} \eta _k(t_2-t_1,\dots ,t_k-t_1)\),

    2. (ii)

      \(\Vert {{\,\mathrm{cum}\,}}(X_{t_1}^{(u_1)}-X_{t_1}^{(v)},Z_{t_2,T}^{(u_2)},\ldots ,Z_{t_k,T}^{(u_k)})\Vert _{2,k} \le |u_1-v| \eta _k(t_2-t_1,\dots ,t_k-t_1)\),

    3. (iii)

      \(\Vert {{\,\mathrm{cum}\,}}(X_{t_1,T},\dots ,X_{t_k,T})\Vert _{2,k} \le \eta _k(t_2-t_1,\ldots ,t_k-t_1)\),

    4. (iv)

      \(\int _{[0,1]^{2}} |{{\,\mathrm{cum}\,}}\big (Y_{t_1,h_1,T}(\tau ),Y_{t_2,h_2,T}(\tau ) \big )|{\,\mathrm {d}}\tau \le \eta _2(t_2-t_1)\).

Assumption (A2) is needed to ensure the existence of all cumulants. The cumulant condition (A3) is a (partially) weakened version of the assumptions made by Lee and Subba Rao (2017) and Aue and van Delft (2017) and has its origins in classical multivariate time series analysis (see Brillinger 1981, Assumption 2.6.2). Lemma 2 shows that the cumulant conditions in (A3) hold provided that (A1), (A2), a further moment condition and a strong mixing condition are satisfied. In particular, they are met for the models employed within our simulation study in Sect. 5 (see in particular Lemma 4).

The following theorem, proven in Sect. 6.2, shows that \(\tilde{B}_{T}\) and \(\tilde{B}_{T,h}\) jointly converge weakly with respect to the \(L^2\)-metric. For \(H \in \mathbb {N}_0\), let the Cartesian product

$$\begin{aligned} \mathcal {H}_{H+2}= L^2([0,1]^2)\times \{ L^2([0,1]^3)\}^{H+1} \end{aligned}$$

be equipped with the sum of the individual scalar products, such that \(\mathcal {H}_{H+2}\) is a Hilbert space itself.

Theorem 1

Suppose that Assumptions (A1)–(A3) are met. Then, the vector \(\mathbb {B}_T=(\tilde{B}_T,\tilde{B}_{T,0},\dots ,\tilde{B}_{T,H})\) converges weakly to a centered Gaussian variable \(\mathbb {B}=(\tilde{B},\tilde{B}_0,\dots ,\tilde{B}_H)\) in \(\mathcal {H}_{H+2}\) with covariance operator \(C_\mathbb {B}:\mathcal {H}_{H+2} \rightarrow \mathcal {H}_{H+2}\) defined as

$$\begin{aligned}&C_{\mathbb {B}} \left( \begin{array}{c} g \\ f_0 \\ \vdots \\ f_{H} \end{array} \right) \left( \begin{array}{c} (u,\tau ) \\ (u_0, \tau _{01}, \tau _{02}) \\ \vdots \\ (u_H, \tau _{H1}, \tau _{H2}) \end{array} \right) \\&\quad = \left( \begin{array}{c} \langle r^{(m)}((u,\tau ), \cdot ), g\rangle + \sum \nolimits _{h=0}^H \langle r^{(m,c)}_h ((u,\tau ), \cdot ), f_h \rangle \\ \langle r^{(m,c)}_0(\cdot , (u_{0}, \tau _{01}, \tau _{02})), g\rangle + \sum \nolimits _{h=0}^H \langle r^{(c)}_{0,h} ((u_0,\tau _{01}, \tau _{02}) , \cdot ), f_h \rangle \\ \vdots \\ \langle r^{(m,c)}_H(\cdot , (u_{H}, \tau _{H1}, \tau _{H2})), g\rangle + \sum \nolimits _{h=0}^H \langle r^{(c)}_{H,h} ((u_H,\tau _{H1}, \tau _{H2}), \cdot ), f_h \rangle \end{array} \right) . \end{aligned}$$

Here, the kernel functions \(r^{(m)}, r^{(c)}_{h,h'}\) and \(r^{(m,c)}_h\) are given by

$$\begin{aligned} r^{(m)}((u,\tau ), (v,\varphi ))&= \text {Cov}\big (\tilde{B}(u,\tau ),\tilde{B}(v,\varphi )\big ) = \sum _{k=-\infty }^{\infty }\int _{0}^{u\wedge v}c_{k,1}(w) {\,\mathrm {d}}w,\\ r_{h,h'}^{(c)}((u,\tau _1, \tau _2), (v,\varphi _1, \varphi _2))&= \text {Cov}\big (\tilde{B}_h(u,\tau _1,\tau _2),\tilde{B}_{h'}(v,\varphi _1,\varphi _2)\big ) = \sum _{k=-\infty }^{\infty }\int _{0}^{u\wedge v}c_{k,2}(w) {\,\mathrm {d}}w,\\ r_{h}^{(m,c)}((u,\tau ), (v,\varphi _1, \varphi _2))&= \text {Cov}\big (\tilde{B}(u,\tau ),\tilde{B}_{h}(v,\varphi _1,\varphi _2)\big ) = \sum _{k=-\infty }^{\infty }\int _{0}^{u\wedge v}c_{k,3}(w) {\,\mathrm {d}}w, \end{aligned}$$

with

$$\begin{aligned} c_{k,1}(w)&= c_{k,1}(w, \tau , \varphi ) = \text {Cov}\big (X_0^{(w)}(\tau ),X_k^{(w)}(\varphi )\big ),\\ c_{k,2}(w)&= c_{k,2}(w, h, h',\tau _1, \tau _2, \varphi _1, \varphi _2)\\&= \text {Cov}\big (X_0^{(w)}(\tau _1)X_h^{(w)}(\tau _2),X_k^{(w)}(\varphi _1)X_{k+h'}^{(w)}(\varphi _2)\big ),\\ c_{k,3}(w)&= c_{k,3}(w, h, \tau , \varphi _1, \varphi _2) = \text {Cov}\big (X_0^{(w)}(\tau ),X_k^{(w)}(\varphi _1)X_{k+h}^{(w)}(\varphi _2)\big ), \end{aligned}$$

for any \(0\le h,h'\le H\). In particular, the infinite sums and integrals converge.

The following corollary on joint weak convergence of the CUSUM processes defined in (11) and (12) is a direct consequence of the continuous mapping theorem. Let

$$\begin{aligned} \tilde{G}_T(u,\tau )&= \tilde{B}_T(u,\tau ) - u \tilde{B}_T(1,\tau ) \\ \tilde{G}_{T,h}(u,\tau _1, \tau _2)&= \tilde{B}_{T,h}(u,\tau _1, \tau _2) - u \tilde{B}_{T,h}(1,\tau _1, \tau _2) \\ \mathbb {G}_T&= (\tilde{G}_T, \tilde{G}_{T,0}, \dots , \tilde{G}_{T,H}) \end{aligned}$$

and, similarly,

$$\begin{aligned} \tilde{G}(u,\tau )&= \tilde{B}(u,\tau ) - u \tilde{B}(1,\tau ) \nonumber \\ \tilde{G}_{h}(u,\tau _1, \tau _2)&= \tilde{B}_{h}(u,\tau _1, \tau _2) - u \tilde{B}_{h}(1,\tau _1, \tau _2) \nonumber \\ \mathbb {G}&= (\tilde{G}, \tilde{G}_{0}, \dots , \tilde{G}_{H}). \end{aligned}$$
(15)

Corollary 1

Suppose that Assumptions (A1)–(A3) are satisfied. If \(H_0^{\scriptscriptstyle (m)}\) holds, then

$$\begin{aligned} \Vert {U}_T - \tilde{G}_T\Vert _{2,2} = o_\mathbb {P}(1). \end{aligned}$$

If \(H_0^{\scriptscriptstyle (c,h)}\) holds, then

$$\begin{aligned} \Vert {U}_{T,h} - \tilde{G}_{T,h}\Vert _{2,3} = o_\mathbb {P}(1). \end{aligned}$$

As a consequence, if the hypothesis \(H_0^{\scriptscriptstyle (H)}\) in (4) holds, then,

$$\begin{aligned} \mathbb {U}_T = (U_T, U_{T,0}, \dots , U_{T,H}) = \mathbb {G}_T + o_\mathbb {P}(1) \rightsquigarrow \mathbb {G}. \end{aligned}$$

On the other hand, if \(H_0^{\scriptscriptstyle (m)}\) or \(H_0^{\scriptscriptstyle (c,h)}\) does not hold, then \(\mathcal {S}_T^{(m)}\rightarrow \infty \) or \(\mathcal {S}_T^{(c,h)} \rightarrow \infty \) in probability, respectively.

The corollary suggests to reject \(H_0^{\scriptscriptstyle (m)}\) or \(H_0^{\scriptscriptstyle (c,h)}\) for large values of \(\mathcal {S}_T^{(m)}\) or \(\mathcal {S}_T^{(c,h)}\), respectively. However, the corresponding null-limiting distributions \(\Vert \tilde{G} \Vert _{2,2}\) and \(\Vert \tilde{G}_h \Vert _{2,3}\) depend in a complicated way on the functions \(c_{k,j}\) defined in Theorem 1 and cannot be easily transformed into a pivotal distribution. We therefore propose to derive critical values by a suitable block multiplier bootstrap approximation worked out in detail in Sect. 3.4.

3.3 Strong mixing and cumulants

In this section, we will demonstrate that Assumption (A3) is met under a strong mixing condition on the locally stationary functional time series. To be precise, let \(\mathcal {F}\) and \(\mathcal {G}\) be \(\sigma \)-fields in \((\Omega , \mathcal {A})\) and define

$$\begin{aligned} \alpha (\mathcal {F},\mathcal {G})=\sup \{ |\mathbb {P}(A\cap B)-\mathbb {P}(A)\mathbb {P}(B)| : A\in \mathcal {F}, B\in \mathcal {G}\}. \end{aligned}$$

A functional time series \(\{(X_{t,T})_{t\in \mathbb {Z}}: T\in \mathbb {N}\}\) in \(L^2([0,1])\) is called \(\alpha \)- or strongly mixing if the mixing coefficients

$$\begin{aligned} \alpha '(k)=\sup \limits _{T\in \mathbb {N}}\sup \limits _{t\in \mathbb {Z}}\alpha \Big (\sigma \big (\{X_{s,T}(\tau )|\tau \in [0,1]\}_{s=-\infty }^t\big ),\, \sigma \big (\{X_{s,T}(\tau )|\tau \in [0,1]\}_{s=t+k}^\infty \big )\Big ) \end{aligned}$$

vanish as k tends to infinity. Analogously, we define

$$\begin{aligned} \alpha ''(k)=\sup _{u\in [0,1]}\sup \limits _{t\in \mathbb {Z}}\alpha \Big (\sigma \big (\{X_{s}^{(u)}(\tau )|\tau \in [0,1]\}_{s=-\infty }^t\big ),\, \sigma \big (\{X_{s}^{(u)}(\tau )|\tau \in [0,1]\}_{s=t+k}^\infty \big )\Big ) \end{aligned}$$

as the mixing coefficients of the family of approximating stationary processes. Further, we define \(\alpha (k)=\max \{\alpha '(k),\alpha ''(k)\}\). A locally stationary functional time series is called strongly mixing if \(\alpha (k)\) vanishes as k tends to infinity, and exponentially strongly mixing if \(\alpha (k) \le c a^k \) for some constants \(c >0 \) and \(a \in (0,1)\). Note that we can define the mixing coefficients in terms of a function in \(\mathcal {L}^2([0,1])\) rather than an element of the space \(L^2([0,1])\) of equivalence classes by Lemma 6.1 in Janson and Kaijser (2015). The main result of this section provides sufficient conditions under which the theory developed so far applies to strongly mixing processes.

Lemma 2

Let \(\{(X_{t,T})_{t\in \mathbb {Z}}: T\in \mathbb {N}\}\) be a strongly mixing locally stationary functional time series in \(L^2([0,1],\mathbb {R})\) such that Assumptions (A1), (A2) and the condition

$$\begin{aligned} \sup _{t,T} \Vert X_{t,T}\Vert _{r, \Omega \times [0,1]} \le C_r < \infty \end{aligned}$$

are satisfied for any integer \(r> 2\). If \(\{(X_{t,T})_{t\in \mathbb {Z}}: T\in \mathbb {N}\}\) is exponentially strongly mixing, then it also satisfies the summability conditions for the cumulants in Assumption (A3).

3.4 Bootstrap approximation

The bootstrap approximation will be based on two smoothing parameters: a block length sequence \(m=m_T\), needed to asymptotically capture the serial dependence within the time series, and a bandwidth sequence \(n=n_T\), needed to estimate expected values locally in time. We will impose the following condition.

Condition 2

(Assumptions on the bootstrap scheme)

  1. (B1)

    Let \(m=m(T) \le T\) be an integer-valued sequence, to be understood as the block length within a block bootstrap procedure. Assume that m tends to infinity and m / T vanishes, as \(T\rightarrow \infty \).

  2. (B2)

    Let \(n=n(T) \le T/2\) be an integer-valued sequence such that both m / n and \(mn^2/T^2\) converge to zero, as T tends to infinity.

  3. (B3)

    Let \(\{R_i^{\scriptscriptstyle (k)}\}_{i,k\in \mathbb {N}}\) denote independent standard normally distributed random variables, independent of the stochastic process \(\{(X_{t,T})_{t\in \mathbb {Z}}:T\in \mathbb {N}\}\).

With this notation, we define

$$\begin{aligned} \hat{B}_{T}^{(k)}(u,\tau ) = \frac{1}{\sqrt{T}} \sum _{i=1}^{\lfloor uT\rfloor }\frac{R_i^{(k)}}{\sqrt{m}}\sum _{t=i}^{(i+m-1)\wedge T} \big \{ X_{t,T}(\tau )-\hat{\mu }_{t,T}(\tau ) \big \} \end{aligned}$$

as a bootstrap approximation for \(\tilde{B}_T(u, \tau )\), where

$$\begin{aligned} \hat{\mu }_{t,T}(\tau ) = \frac{1}{\tilde{n}_{t,0}}\sum _{j=\underline{n}_t}^{\bar{n}_{t,0}}X_{t+j,T}(\tau ) \end{aligned}$$

denotes an estimator for \(\mu _{t,T}(\tau )\) relying on the bandwidth sequence n via

$$\begin{aligned} \bar{n}_{t,h}=n\wedge (T-t-h), \quad \underline{n}_t=-n\vee (1-t), \quad \tilde{n}_{t,h} = \bar{n}_{t,h}-\underline{n}_t+1, \end{aligned}$$
(16)

for \(0\le h\le H\). Similarly, for any \(0\le h\le H\), bootstrap approximations for \(\tilde{B}_{T,h}(u, \tau _1, \tau _2)\) are defined as

$$\begin{aligned} \hat{B}_{T,h}^{(k)}(u,\tau _1,\tau _2) = \frac{1}{\sqrt{T}} \sum _{i=1}^{\lfloor uT\rfloor \wedge (T-h)}\frac{R_i^{(k)}}{\sqrt{m}} \sum _{t=i}^{(i+m-1)\wedge (T-h)} \big \{ X_{t,T}(\tau _1)X_{t+h,T}(\tau _2) - \hat{\mu }_{t,T,h}(\tau _1,\tau _2) \big \}, \end{aligned}$$

where

$$\begin{aligned} \hat{\mu }_{t,T,h}(\tau _1,\tau _2) = \frac{1}{\tilde{n}_{t,h}}\sum _{j=\underline{n}_t}^{\bar{n}_{t,h}}X_{t+j,T}(\tau _1)X_{t+j+h,T}(\tau _2). \end{aligned}$$

Finally, for fixed \(k\in \mathbb {N}\), collect the bootstrap approximations in the vector

$$\begin{aligned} \hat{\mathbb {B}}_T^{(k)} = (\hat{B}_{T}^{(k)},\hat{B}_{T,0}^{(k)},\dots ,\hat{B}_{T,H}^{(k)}). \end{aligned}$$

The following theorem shows that the bootstrap replicates can be regarded as asymptotically independent copies of the original process \(\mathbb {B}_T\) from Theorem 1.

Theorem 2

Suppose that Assumptions (A1)–(A3) and (B1)–(B3) are met. Then, for any fixed \(K\in \mathbb {N}\) and as \(T \rightarrow \infty \),

$$\begin{aligned} \big (\mathbb {B}_T,\hat{\mathbb {B}}_{T}^{(1)},\dots ,\hat{\mathbb {B}}_{T}^{(K)}\big ) \rightsquigarrow \big (\mathbb {B},\mathbb {B}^{(1)},\dots ,\mathbb {B}^{(K)}\big ) \end{aligned}$$

in \(\{ L^2([0,1]^2)\times (L^2([0,1]^3))^{H+1}\}^{K+1}\), where \(\mathbb {B}^{\scriptscriptstyle (k)}\) (\(k=1,\dots ,K\)) are independent copies of the centered Gaussian variable \(\mathbb {B}\) from Theorem 1. Equivalently (Bücher and Kojadinovic 2017, Lemma 2.2),

$$\begin{aligned} d_{\mathrm {BL}} ( \mathbb {P}^{\hat{\mathbb {B}}_{T}^{(1)} \mid X_{1,T}, \dots , X_{T,T}}, \mathbb {P}^{\mathbb {B}_T} ) =o_\mathbb {P}(1), \quad T \rightarrow \infty , \end{aligned}$$

where \(d_{\mathrm {BL}}\) denotes the bounded Lipschitz metric between probability distributions on \(L^2([0,1]^2)\times (L^2([0,1]^3))^{H+1}\).

The proof is given in Sect. 6.2. The preceding theorem, together with Corollary 1, suggests defining the following bootstrap approximations for the CUSUM processes defined in (11) and (12):

$$\begin{aligned} \hat{G}_T^{(k)}(u,\tau )&=\hat{B}_{T}^{(k)}(u,\tau ) -u\hat{B}_{T}^{(k)}(1,\tau ),\\ \hat{G}_{T,h}^{(k)}(u,\tau _1,\tau _2)&=\hat{B}_{T,h}^{(k)}(u,\tau _1,\tau _2) -u\hat{B}_{T,h}^{(k)}(1,\tau _1,\tau _2), \\ \hat{\mathbb {G}}_T^{(k)}&=(\hat{G}_T^{(k)},\hat{G}_{T,0}^{(k)},\dots ,\hat{G}_{T,H}^{(k)}). \end{aligned}$$

Theorem 2, Corollary 1 and the continuous mapping theorem then imply that, under the hypothesis \(H_0^{\scriptscriptstyle (H)}\) in (4),

$$\begin{aligned} (\varvec{S}_T, \varvec{S}_T^{(1)}, \dots , \varvec{S}_T^{(K)})&\equiv (\Phi (\mathbb {U}_T), \Phi (\hat{\mathbb {G}}_T^{\scriptscriptstyle (1)}),\dots ,\Phi (\hat{\mathbb {G}}_T^{\scriptscriptstyle (K)})) \\&= (\Phi (\mathbb {G}_T), \Phi (\hat{\mathbb {G}}_T^{\scriptscriptstyle (1)}),\dots ,\Phi (\hat{\mathbb {G}}_T^{\scriptscriptstyle (K)}))+o_\mathbb {P}(1) \\&\rightsquigarrow (\Phi (\mathbb {G}), \Phi (\mathbb {G}^{\scriptscriptstyle (1)}),\dots ,\Phi (\mathbb {G}^{\scriptscriptstyle (K)})) \equiv (\varvec{S}, \varvec{S}^{(1)}, \dots , \varvec{S}^{(K)}) , \end{aligned}$$

where \(\Phi (G_{-1},G_0, \dots , G_{H}) = (\Vert G_{-1} \Vert _{2,2}, \Vert G_{0}\Vert _{2,3}, \dots , \Vert G_H\Vert _{2,3})\) and where \(\mathbb {G}^{\scriptscriptstyle (1)},\dots ,\mathbb {G}^{\scriptscriptstyle (K)}\) are independent copies of \(\mathbb {G}\). Individual bootstrap-based tests for, e.g., \(H_{0}^{\scriptscriptstyle (c,h)}\) are then naturally defined by the p value

$$\begin{aligned} p_{T,K}(S_{T,h}) = \frac{1}{K} \sum _{k=1}^K \varvec{1}( S_{T,h}^{(k)} \ge S_{T,h}), \end{aligned}$$

where \(S_{T,h}^{\scriptscriptstyle (k)}\) and \(S_{T,h}\) denote the \((h+2)\)nd coordinate of \(\varvec{S}_{T}^{\scriptscriptstyle (k)}\) and \(\varvec{S}_T\), respectively; in particular, \(S_{T,-1}=\mathcal {S}_T^{\scriptscriptstyle (m)}\) and \(S_{T,h}=\mathcal {S}_T^{\scriptscriptstyle (c,h)}\) as defined in (13). Indeed, we can show the following result for each individual test.

Proposition 1

Suppose that Assumptions (A1)–(A3) and (B1)–(B3) are met. Then, for all \(h \in \{-1, 0, \dots , H\}\), provided \(K=K_T \rightarrow \infty \), and with the convention \(H_0^{(c,-1)}=H_0^{(m)}\), we have

$$\begin{aligned} p_{T,K_T}(S_{T,h}) \rightsquigarrow {\left\{ \begin{array}{ll} \mathrm {Uniform}(0,1) &{} \text {if }H_0^{\scriptscriptstyle (c,h)} \text { is met} \\ 0 &{} \text {else}. \end{array}\right. } \end{aligned}$$

Moreover, we can rely on an extension of Fisher's p value combination method (Fisher 1932), as described in Sect. 2 of Bücher et al. (2018), to obtain a combined test for the joint hypothesis \(H_0^{\scriptscriptstyle (H)}\) in (4). More precisely, let \(\psi :(0,1)^{H+2} \rightarrow \mathbb {R}\) be a continuous function that is decreasing in each argument (throughout the simulations, we employ \(\psi (p_{-1},\dots ,p_H)=\sum _{i=-1}^{H}w_i\Phi ^{-1}(1-p_i)\) with weights \(w_{-1}=w_{0}=1/3\) and \(w_1=\cdots =w_H= (3H)^{-1}\)). The combined test is defined by its p value, calculated based on the following algorithm.

Algorithm 1

(Combined Bootstrap test for\(H_0^{\scriptscriptstyle (H)}\))

  1. (1)

    Let \(\varvec{S}_T^{\scriptscriptstyle (0)} = \varvec{S}_{T}\).

  2. (2)

    Given a large integer K, compute the sample of K bootstrap replicates \(\varvec{S}_T^{\scriptscriptstyle (1)},\dots ,\varvec{S}_T^{\scriptscriptstyle (K)}\) of the vector \(\varvec{S}_T^{\scriptscriptstyle (0)}\).

  3. (3)

    Then, for all \(i \in \{0,1,\dots ,K\}\) and \(h \in \{-1,\dots ,H\}\), compute

    $$\begin{aligned} p_{T,K}(S_{T,h}^{(i)}) = \frac{1}{K+1} \bigg \{\frac{1}{2} + \sum _{k=1}^K \varvec{1} \left( S_{T,h}^{(k)} \ge S_{T,h}^{(i)} \right) \bigg \}. \end{aligned}$$
  4. (4)

    Next, for all \(i \in \{0,1,\dots ,K\}\), compute

    $$\begin{aligned} W_{T,K}^{(i)} = \psi \{ p_{T,K}(S_{T,-1}^{(i)}), p_{T,K}(S_{T,0}^{(i)}),\dots ,p_{T,K}(S_{T,H}^{(i)}) \}. \end{aligned}$$
  5. (5)

    The global statistic is \(W_{T,K}^{\scriptscriptstyle (0)}\), and the corresponding p value is given by

    $$\begin{aligned} p_{T,K}(W_{T,K}^{(0)}) =\frac{1}{K} \sum _{k=1}^K \varvec{1} \left( W_{T,K}^{(k)} \ge W_{T,K}^{(0)} \right) . \end{aligned}$$

Consistency of this procedure is a mere consequence of Proposition 2.1 in Bücher et al. (2018); details are omitted for the sake of brevity.
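
For concreteness, the following compact sketch implements Algorithm 1 (our own code; the array layout and the use of scipy's normal quantile function are implementation choices). Row 0 of the input array holds the statistics \(\varvec{S}_T^{\scriptscriptstyle (0)}\) computed from the data, and rows 1 to K hold the bootstrap replicates.

```python
import numpy as np
from scipy.stats import norm

def combined_p_value(S, weights):
    """Algorithm 1: S has shape (K+1, H+2), columns ordered as h = -1, 0, ..., H."""
    K = S.shape[0] - 1
    # step 3: p values p_{T,K}(S_{T,h}^{(i)}) for all i and h
    counts = (S[1:][None, :, :] >= S[:, None, :]).sum(axis=1)   # (K+1, H+2)
    p = (0.5 + counts) / (K + 1)
    # step 4: psi(p) = sum_h w_h * Phi^{-1}(1 - p_h)
    W = (weights * norm.ppf(1 - p)).sum(axis=1)
    # step 5: global p value of W_{T,K}^{(0)}
    return (W[1:] >= W[0]).mean()

# weights used in the simulations for H >= 1: w_{-1} = w_0 = 1/3, w_h = 1/(3H)
H = 2
w = np.concatenate([[1 / 3, 1 / 3], np.full(H, 1 / (3 * H))])
```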

3.5 Consistency against AMOC-piecewise locally stationary alternatives

In the previous section, the proposed tests were shown to be consistent against locally stationary alternatives. In classical change point settings, the underlying CUSUM principle is also known to be consistent against piecewise (locally) stationary alternatives, notably against those that involve a single change in the signal of interest (AMOC = at most one change). We are going to derive such results within the present setting.

For the sake of brevity, we only consider AMOC alternatives in the mean. More precisely, we assume that \(\{(X_{t,T})_{t \in \mathbb {Z}}:T\in \mathbb {N}\}\) follows the data-generating process

$$\begin{aligned} X_{t,T}=\left\{ \begin{array}{ll} \mu _1 + Y_{t,T}~\text {for}~t\le \lfloor \lambda T\rfloor \\ \mu _2 + Y_{t,T}~\text {for}~t\ge \lfloor \lambda T\rfloor +1 \end{array} \right. \end{aligned}$$
(17)

for some \(\lambda \in (0,1)\), \(\mu _1, \mu _2 \in \mathcal {L}^2([0,1])\) and a locally stationary time series \(\{(Y_{t,T})_{t \in \mathbb {Z}}:T\in \mathbb {N}\}\) satisfying Condition 1. In the literature on classical change point detection, one would be interested in testing the null hypothesis that \(\Vert \mu _1 - \mu _2 \Vert _2 =0\) against the alternative that this \(L^2\)-norm is positive.

Now, if \(\Vert \mu _1 - \mu _2 \Vert _2 =0\), we are back in the situation of the preceding sections. However, one can show (by contradiction) that if \(\Vert \mu _1 - \mu _2 \Vert _2 > 0\), \(\{(X_{t,T})_{t \in \mathbb {Z}}:T\in \mathbb {N}\}\) is not locally stationary, whence additional theory must be developed to show consistency of the proposed test statistics. Note that even the formulation of \(H_0^{\scriptscriptstyle (H)}\) relying on (2) and (3) is not possible anymore, so that we need to rely on their equivalent sub-asymptotic counterparts (9) and (10) in Lemma 1.

Proposition 2

Let \(\{(X_{t,T})_{t \in \mathbb {Z}}:T\in \mathbb {N}\}\) be a sequence of functional time series as defined in (17), with \(\mu _1 \ne \mu _2\) in \(L^2([0,1])\) and with \(\{(Y_{t,T})_{t \in \mathbb {Z}}:T\in \mathbb {N}\}\) satisfying Conditions (A1)–(A3). Then, the test statistic \(\mathcal {S}_T^{\scriptscriptstyle (m)}=S_{T,-1}\) based on observations \(X_{1,T}, \dots , X_{T,T}\) diverges to infinity, in probability. If, additionally, (B1)–(B3) are met, then the bootstrap variables \(S_{T,-1}^{\scriptscriptstyle (k)}\) are stochastically bounded. As a consequence, the proposed test is consistent.

Remark 2

A careful inspection of the proof of Proposition 2 shows that the testing procedure is also consistent against local alternatives of the form \(\mu _2 = \mu _1 + d_T\), for any sequence \(d_T \in L^2([0,1])\) with \(\sqrt{T}\, \Vert d_T\Vert _2 \rightarrow \infty \).

3.6 Data-driven choice of the block length parameter m

The bootstrap procedure depends on the choice of the width of the local mean estimator, n, and the length of the bootstrap blocks, m. Preliminary simulation studies suggested that the performance of the procedure crucially depends on the choice of m, while it is less sensitive to the choice of n (which may also be chosen by other standard criteria in specific applications, like adaptations of Silverman’s rule of thumb, cross-validation or visual investigation of respective plots). In this section, we propose a data-driven procedure for choosing the block length m based on a certain optimality criterion.

Recall that the limiting null distributions of the proposed test statistics depend in a complicated way on the covariances \(\text {Cov}\{\tilde{B}(u,\tau ),\tilde{B}(v,\varphi )\}\), \(\text {Cov}\{ \tilde{B}_h(u,\tau _1,\tau _2),\tilde{B}_{h'}(v,\varphi _1,\varphi _2)\}\) and \(\text {Cov}\{\tilde{B}(u,\tau ),\tilde{B}_{h}(v,\varphi _1,\varphi _2)\}\). Following Sect. 5 in Bücher and Kojadinovic (2016), the procedure we propose essentially chooses m in such a way that the bootstrap approximation for \(\sigma _c(\tau , \varphi )=\text {Cov}\{\tilde{B}(1,\tau ),\tilde{B}(1,\varphi )\}\) is optimal, with respect to m, in a certain asymptotic sense. More precisely, we propose to first minimize the integrated mean squared error of the “bootstrap estimator”

$$\begin{aligned} \tilde{\sigma }_T(\tau ,\varphi )&= \text {Cov}\big (\tilde{B}_T^{(1)}(1,\tau ),\tilde{B}_T^{(1)}(1,\varphi )|X_{1,T},\ldots , X_{T,T}\big ) \end{aligned}$$

considered as an estimator for \(\sigma _c(\tau , \varphi )\), with respect to m theoretically (see Lemma 3), and then use a simple plug-in approach to obtain a formula that solely depends on observable quantities. Observe that \(\tilde{\sigma }_T(\tau ,\varphi )\) can be rewritten as

$$\begin{aligned}&\tilde{\sigma }_T(\tau ,\varphi ) =\mathbb {E}[\tilde{B}_T^{(1)}(1,\tau )\,\tilde{B}_T^{(1)}(1,\varphi )\,|\,X_{1,T},\ldots , X_{T,T}]\\&\quad = \frac{1}{T} \sum _{i=1}^{T} \frac{1}{m} \left( \sum _{t=i}^{(i+m-1)\wedge T}\big \{X_{t,T}(\tau )-\mu _{t,T}(\tau )\big \}\right) \left( \sum _{t=i}^{(i+m-1)\wedge T}\big \{X_{t,T}(\varphi )-\mu _{t,T}(\varphi )\big \}\right) \end{aligned}$$

whence \(\tilde{\sigma }_T(\tau ,\varphi )\) is not a proper estimator, as it depends on the unknown expectations \(\mu _{t,T}\). The integrated squared bias and the integrated variance satisfy the following asymptotic expansions. For simplicity, we replace Condition (A3) by a strong mixing condition as in Sect. 3.3.

Lemma 3

Let \(m=m(T)\) be an integer-valued sequence, such that m tends to infinity and \(m^2/T\) vanishes, as T tends to infinity. If conditions (A1) and (A2) are met and \(\{(X_{t,T})_{t\in \mathbb {Z}}: T\in \mathbb {N}\}\) is exponentially strongly mixing, then, as \(T\rightarrow \infty \),

$$\begin{aligned} \int _{[0,1]^2} \big (\mathbb {E}[\tilde{\sigma }_T(\tau ,\varphi )]-\sigma _c(\tau ,\varphi )\big )^2 {\,\mathrm {d}}(\tau ,\varphi )&= \frac{1}{m^2} \Delta + o(m^{-2}), \\ \int _{[0,1]^2}\text {Var}\big (\tilde{\sigma }_T(\tau ,\varphi )\big ) {\,\mathrm {d}}(\tau ,\varphi )&= \frac{m}{T} \Gamma + o(m/T), \end{aligned}$$

where

$$\begin{aligned} \Delta = \bigg \Vert \sum _{k=-\infty }^{\infty } |k| \int _0^1 \text {Cov}(X_0^{(w)},X_k^{(w)}) {\,\mathrm {d}}w \bigg \Vert _{2,2}^2 \end{aligned}$$

and

$$\begin{aligned} \Gamma = \frac{2}{3}\int _0^1 \bigg [ \left( \sum _{k=-\infty }^{\infty } \int _0^1 \text {Cov}\big (X_0^{(w)}(\tau ),X_{k}^{(w)}(\tau )\big ) {\,\mathrm {d}}\tau \right) ^2 + \bigg \Vert \sum _{k=-\infty }^{\infty } \text {Cov}(X_0^{(w)},X_k^{(w)}) \bigg \Vert _{2,2}^2 \bigg ] {\,\mathrm {d}}w. \end{aligned}$$

As a consequence of this lemma, we obtain the expansion

$$\begin{aligned} \text {IMSE}_T(m)&= \int _{[0,1]^2} \text {MSE}(\tilde{\sigma }_T(\tau ,\varphi )) {\,\mathrm {d}}(\tau ,\varphi )\\&= \int _{[0,1]^2} \text {Var}\big (\tilde{\sigma }_T(\tau ,\varphi )\big ) + \big (\mathbb {E}[\tilde{\sigma }_T(\tau ,\varphi )]-\sigma _c(\tau ,\varphi )\big )^2 {\,\mathrm {d}}(\tau ,\varphi ) \\&= \frac{m}{T} \Gamma + \frac{1}{m^2} \Delta + o(m^{-2}) + o(m/T), \end{aligned}$$

which can next be minimized with respect to m to get a natural choice for the block length. More precisely, the dominating function \(\Lambda (m)=\tfrac{m}{T} \Gamma + \tfrac{1}{m^2} \Delta \) is differentiable in m with \(\Lambda '(m)=\tfrac{\Gamma }{T} - \tfrac{2\Delta }{m^3}\) and \(\Lambda ''(m)=\tfrac{6\Delta }{m^4}\), whence \(m=\big (\tfrac{2\Delta T}{\Gamma }\big )^{1/3}\) is the unique minimizer of \(\Lambda \). In practice, both \(\Gamma \) and \(\Delta \) are unknown and must be estimated in terms of the observed data. This leads us to define

$$\begin{aligned} \hat{m} = \big ({2\hat{\Delta }_T T}/{\hat{\Gamma }_T}\big )^{1/3} \end{aligned}$$

where, for some constant \(L\in \mathbb {N}\) specified below,

$$\begin{aligned} \hat{\Delta }_T = \int _{[0,1]^2}\left( \frac{1}{T-2L} \sum _{i=L+1}^{T-L} \sum _{k=-L}^{L}|k| \hat{\gamma }_{i,k,T}(\tau ,\varphi ) \right) ^2 {\,\mathrm {d}}(\tau ,\varphi ) \end{aligned}$$

and

$$\begin{aligned} \hat{\Gamma }_T = \frac{2}{3} \frac{1}{T-2L} \sum _{i=L+1}^{T-L} \bigg [ \left( \sum _{k=-L}^{L} \int _0^1 \hat{\gamma }_{i,k,T}(\tau ,\tau ) {\,\mathrm {d}}\tau \right) ^2 + \int _{[0,1]^2} \left( \sum _{k=-L}^{L} \hat{\gamma }_{i,k,T}(\tau ,\varphi ) \right) ^2 {\,\mathrm {d}}(\tau ,\varphi ) \bigg ]. \end{aligned}$$

Here \(\hat{\gamma }_{i,k,T}\) is defined by

$$\begin{aligned} \hat{\gamma }_{i,k,T}(\tau ,\varphi )= \frac{1}{\bar{n}_{i+k,0}-\underline{n}_{i}+1 } \sum _{j=\underline{n}_{i}}^{\bar{n}_{i+k,0}} \left( X_{i+j,T}(\tau )-\frac{1}{\tilde{n}_{i+j,0}} \sum _{t=\underline{n}_{i+j}}^{\bar{n}_{i+j,0}} X_{i+j+t,T}(\tau )\right) \\ \times \left( X_{i+j+k,T}(\varphi )-\frac{1}{\tilde{n}_{i+j+k,0}}\sum _{t=\underline{n}_{i+j+k}}^{\bar{n}_{i+j+k,0}} X_{i+j+k+t,T}(\varphi )\right) \end{aligned}$$

and \(\bar{n}_{t,h},\underline{n}_t\) and \(\tilde{n}_{t,h}\) are given in (16). Note that the above estimators depend on the choice of the integer L. Following Bücher and Kojadinovic (2016) and Politis and White (2004), we select L to be the smallest integer such that

$$\begin{aligned} \hat{\rho }_{k,T} = \frac{\left\| \frac{1}{T-k}\sum _{i=1}^{T-k}\hat{\gamma }_{i,k,T}\right\| _{2,2}}{\left\| \frac{1}{T} \sum _{i=1}^{T} \hat{\gamma }_{i,0,T}\right\| _{2,2}} \end{aligned}$$

is negligible for any \(k>L\); more precisely, L is chosen as the smallest integer such that \(\hat{\rho }_{L+k,T} \le 2 \sqrt{\log (T)/T}\), for any \(k=1,\ldots , K_T\), with \(K_T= \max \{5, \sqrt{\log T}\}\).
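
The resulting rule can be summarized in a few lines (our own sketch; it assumes that the local autocovariance kernels \(\hat{\gamma }_{i,k,T}\) have already been evaluated on a \(P\times P\) grid and stored in an array, and it approximates all integrals by grid averages).

```python
import numpy as np

def block_length(gamma, T, L):
    """m_hat = (2 * Delta_hat * T / Gamma_hat)^(1/3); gamma has shape
    (T, 2L+1, P, P), with gamma[i, L+k] holding gamma_hat_{i,k,T} on the grid."""
    P = gamma.shape[-1]
    g = gamma[L:T - L]                                # i = L+1, ..., T-L
    k = np.arange(-L, L + 1)
    # Delta_hat: squared L2-norm of the |k|-weighted, time-averaged kernel
    weighted = (np.abs(k)[None, :, None, None] * g).sum(axis=1)
    Delta = np.mean(weighted.mean(axis=0) ** 2)
    # Gamma_hat: (2/3) * time average of squared trace term plus squared L2-norm
    summed = g.sum(axis=1)                            # sum over k = -L, ..., L
    trace = np.trace(summed, axis1=1, axis2=2) / P    # int gamma(tau, tau) d tau
    Gamma = (2 / 3) * np.mean(trace ** 2 + (summed ** 2).mean(axis=(1, 2)))
    return max(1, int(round((2 * Delta * T / Gamma) ** (1 / 3))))
```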

4 Time-varying random operator functional AR processes

We consider an exemplary class of functional locally stationary processes and specify the approximating family of stationary processes. The results in this section are similar to Theorem 3.1 of Bosq (2000).

Let \(\mathcal {L}=\mathcal {L}\big (L^2([0,1]),L^2([0,1])\big )\) be the space of bounded linear operators on \(L^2([0,1])\). Further, denote by \(\Vert \cdot \Vert _\mathcal {L}\) and \(\Vert \cdot \Vert _\mathcal {S}\) the standard operator norm and the Hilbert–Schmidt norm, respectively, i.e.,

$$\begin{aligned} \Vert \ell \Vert _\mathcal {L} = \sup \limits _{\Vert x\Vert _2\le 1} \Vert \ell (x)\Vert _2, \qquad \Vert \ell \Vert _\mathcal {S} = \left( \sum _{j=1}^{\infty } \lambda _j^2\right) ^{1/2} \end{aligned}$$

for \(\ell \in \mathcal {L}\) with singular values \(\lambda _1\ge \lambda _2\ge \dots \). By Eq. (1.55) in Bosq (2000), we have \(\Vert \cdot \Vert _\mathcal {L} \le \Vert \cdot \Vert _\mathcal {S}\). For any \(T\in \mathbb {N}\), consider the recursive functional equation

$$\begin{aligned} X_{t,T} = Y_{t,T} + \mu (t/T), \qquad Y_{t,T} = A_{t/T}(Y_{t-1,T})+ {\varepsilon }_{t,T}, \qquad t \in \mathbb {Z}, \end{aligned}$$
(18)

where \(({\varepsilon }_{t,T})_{t\in \mathbb {Z}}\) is a sequence of independent zero mean innovations in \(L^2([0,1])\) and where \(A_{t/T}:L^2([0,1]) \rightarrow L^2([0,1])\) denotes a possibly random and time-varying bounded linear operator. The equation defines what might be called a (time-varying) random operator functional autoregressive process of order one, denoted by \(\mathrm{tvrFAR}(1)\) (see also van Delft et al. 2017, Sect. 4.1, for the non-random case with \({\varepsilon }_{t,T}\) not depending on T).

In the following, we will only consider the case where \(\mu \) is the null function. In the more general case of \(\mu \) being Lipschitz, if there exists a locally stationary solution \(Y_{t,T}\) of the equation on the right-hand side of (18) with approximating family \(\{Y_t^{\scriptscriptstyle (u)}|t\in \mathbb {Z}\}_{u\in [0,1]}\), then \(X_{t,T} = Y_{t,T} + \mu (t/T)\) is obviously locally stationary with approximating family \(X_t^{\scriptscriptstyle (u)}=Y_t^{\scriptscriptstyle (u)}+\mu (u)\).

To be precise, we restrict ourselves to the following specific parameterization

$$\begin{aligned} \mu \equiv 0, \qquad A_{t/T} = a(t/T) \tilde{A}, \qquad {\varepsilon }_{t,T} = \sigma (t/T) \tilde{\varepsilon }_t, \end{aligned}$$

where a and \(\sigma >0\) are measurable functions on [0, 1]. The following lemma provides sufficient conditions ensuring local stationarity of the model and gives an explicit expression for the approximating family of stationary processes. For a related result in the case where \(A_{t/T}\) is non-random and \({\varepsilon }_{t,T}\) does not depend on T, see Theorem 3.1 in van Delft and Eichler (2018).

For a sequence of operators \((B_i)_i\) in \(\mathcal {L}\), we will write \(\prod _{i=0}^{n} B_i = B_0 \circ \dots \circ B_n\) for \(n\in \mathbb {N}\). The empty product will be identified with the identity on \(L^2([0,1])\), that is, \(\prod _{i=0}^{-1} B_i = {{\,\mathrm{id}\,}}_{L^2([0,1])}\).

Lemma 4

Let \((\tilde{{\varepsilon }}_t)_{t\in \mathbb {Z}}\) be a strong white noise in \(L^2([0,1])\). Further, let a and \(\sigma \) be measurable functions on \((-\infty ,1]\) such that \(\sigma >0\), \(a(u) = a(0)\) and \(\sigma (u)=\sigma (0)\) for all \(u\le 0\). Finally, let \({\varepsilon }_{t,T}=\sigma (t/T)\tilde{\varepsilon }_t\), \({\varepsilon }_t^{(u)} = \sigma (u) \tilde{\varepsilon }_t\) and \(A_u=a(u) \tilde{A}\), where \(\tilde{A}\) denotes a random operator in \(\mathcal {L}\) that is independent of \((\tilde{\varepsilon }_t)_{t\in \mathbb {Z}}\) and satisfies \(\sup _{u \in [0,1]} \Vert A_u\Vert _\mathcal {S}\le q<1\) with probability one. Then:

  1. (i)

    For any \(u\in [0,1]\), there exists a unique stationary solution \((Y_t^{\scriptscriptstyle (u)})_{t \in \mathbb {Z}}\) of the recursive equation

    $$\begin{aligned} Y_t^{(u)}=A_u(Y_{t-1}^{(u)})+{\varepsilon }_{t}^{(u)}, \qquad t \in \mathbb {Z}, \end{aligned}$$

    namely

    $$\begin{aligned} Y_t^{(u)}=\sum _{j=0}^{\infty } A_u^j({\varepsilon }_{t-j}^{(u)}), \end{aligned}$$

    where the latter series converges in \(L^2(\Omega \times [0,1], \mathbb {P}\otimes \lambda )\) and almost surely in \(L^2([0,1])\).

  2. (ii)

    If \(\sigma \) and a are Lipschitz continuous, then there exists a unique locally stationary solution \((Y_{t,T})\) of order \(\rho =2\) with \(\sup _{t\in \mathbb {Z}, T\in \mathbb {N}} \mathbb {E}[\Vert Y_{t,T}\Vert _2^2]<\infty \) of the recursive equation

    $$\begin{aligned} Y_{t,T}=A_{t/T}(Y_{t-1,T})+{\varepsilon }_{t,T}, \qquad t \in \mathbb {Z}, T\in \mathbb {N}, \end{aligned}$$

    namely

    $$\begin{aligned} Y_{t,T}=\sum _{j=0}^{\infty } \left( \prod _{i=0}^{j-1} A_{\tfrac{t-i}{T}} \right) ({\varepsilon }_{t-j,T}), \end{aligned}$$

    the series again being convergent in \(L^2(\Omega \times [0,1], \mathbb {P}\otimes \lambda )\) and almost surely in \(L^2([0,1])\). The locally stationary process has approximating family \(\{(Y_t^{(u)})_{t\in \mathbb {Z}}: u \in [0,1] \}\).

Remark 3

The previous lemma provides sufficient conditions for local stationarity of the random operator FAR model. In the simulation study of the following section, we use specific models for the parameter curves and the functional white noise and show that those models satisfy the cumulant condition in Assumption (A3) as well.

5 Finite-sample results

5.1 Monte Carlo simulations

A large-scale Monte Carlo simulation study was performed to analyze the finite-sample behavior of the proposed tests. The major goals of the study were to analyze the level approximation and the power of the various tests, with a particular view on investigating various forms of alternatives, notably models from \(H_1^{\scriptscriptstyle (m)}, H_1^{\scriptscriptstyle (c,0)}\) and \(H_1^{\scriptscriptstyle (c,1)}\). All stated results related to testing the joint hypothesis \(H_0^{\scriptscriptstyle (H)}\) are for the combined test described in Algorithm 1, using \(\psi (p_{-1},\dots ,p_H)=\sum _{i=-1}^{H}w_i\Phi ^{-1}(1-p_i)\) with weights \(w_{-1}=w_{0}=1/2\) for \(H=0\), and \(w_{-1}=w_{0}=1/3\), \(w_1=\cdots =w_H= (3H)^{-1}\) for \(H\ge 1\).

For the data-generating processes, we employed 10 different choices for the parameters in (18), which will be described next. Let \((\psi _i)_{i\in \mathbb {N}_0}\) denote the Fourier basis of \(L^2([0,1])\), that is, for \(n\in \mathbb {N}\),

$$\begin{aligned} \psi _0 \equiv 1, \quad \psi _{2n-1}(\tau ) = \sqrt{2} \sin (2\pi n\tau ), \quad \psi _{2n}(\tau ) = \sqrt{2} \cos (2\pi n\tau ). \end{aligned}$$

Let \((\tilde{\varepsilon }_t)_{t\in \mathbb {Z}}\) denote an i.i.d. sequence of mean zero random elements in \(L^2([0,1])\), defined by \(\tilde{\varepsilon }_t = \sum _{i=0}^{16} u_{t,i} \psi _i\), where \(u_{t,i}\) are independent and normally distributed with mean zero and variance \(\text {Var}(u_{t,i}) = \exp (-i/10)\). Independent of \((\tilde{\varepsilon }_t)_{t\in \mathbb {Z}}\), let \(\varvec{G}=(G_{i,j})_{i,j=0, \dots , 16}\) denote a matrix with independent normally distributed entries with \(\text {Var}(G_{i,j})=\exp (-i-j)\). Let \(\tilde{A} :L^2([0,1]) \rightarrow L^2([0,1])\) denote the (random) integral operator defined by

$$\begin{aligned} \tilde{A} (f)(t) &= \tfrac{1}{3 {\left| \left| \left| \varvec{G}\right| \right| \right| }_F} \sum _{i,j=0}^{16} G_{i,j} \langle f, \psi _i\rangle \psi _j(t) = \int _0^1 \left( \tfrac{1}{3 {\left| \left| \left| \varvec{G}\right| \right| \right| }_F} \sum _{i,j=0}^{16} G_{i,j} \psi _i(s) \psi _j(t) \right) f(s) {\,\mathrm {d}}s, \end{aligned}$$

where \({\left| \left| \left| \varvec{G}\right| \right| \right| }_F\) denotes the Frobenius norm (note that the Hilbert–Schmidt norm of \(\tilde{A}\) is equal to 1/3, see Horváth and Kokoszka 2012, Sect. 2.2). Finally, let

$$\begin{aligned} a_0(u)&= 1,&a_1(u)&= \tfrac{1}{2} + u, \\ a_2(u)&=1-\tfrac{1}{2}\cos (2\pi u),&a_3(u)&= \tfrac{1}{2}+\mathbb {1}(u \ge 1/2), \end{aligned}$$

for \(u\in [0,1]\) and let \(a_j(u)=a_j(0)\) for \(u\le 0\) and \(a_j(u)=a_j(1)\) for \(u\ge 1\). The following ten data-generating processes are considered (a simulation sketch is given after the list):

  • Stationary case. Let

    $$\begin{aligned} \mu \equiv 0, \qquad A_{t/T} = \tilde{A}, \qquad {\varepsilon }_{t,T} = \tilde{\varepsilon }_t. \end{aligned}$$
    (19)
  • Models deviating from \(H_0^{\scriptscriptstyle (m)}\). For \(j=1,\dots , 3\), consider the choices

    $$\begin{aligned} \mu (\tau ) = a_j(\tau ), \qquad A_{t/T} = \tilde{A}, \qquad {\varepsilon }_{t,T} = \tilde{\varepsilon }_t. \end{aligned}$$
    (20)
  • Models deviating from \(H_0^{\scriptscriptstyle (c,0)}\). For \(j=1,\dots , 3\), consider the choices

    $$\begin{aligned} \mu \equiv 0, \qquad A_{t/T} = \tilde{A}, \qquad {\varepsilon }_{t,T} = a_j(t/T) \tilde{\varepsilon }_t. \end{aligned}$$
    (21)
  • Models deviating from \(H_0^{\scriptscriptstyle (c,1)}\). For \(j=1,\dots , 3\), consider the choices

    $$\begin{aligned} \mu \equiv 0, \qquad A_{t/T} = a_j(t/T) \tilde{A}, \qquad {\varepsilon }_{t,T} = \tilde{\varepsilon }_t. \end{aligned}$$
    (22)
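
To make the data-generating processes concrete, the following Python sketch (our own illustration, not code from the paper) simulates the Fourier coefficients of models (19)–(22). Since all quantities live in the span of \(\psi _0,\dots ,\psi _{16}\), the operator \(\tilde{A}\) reduces to the coefficient matrix \(\varvec{G}/(3{\left| \left| \left| \varvec{G}\right| \right| \right| }_F)\); the seed, burn-in length and function names are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 17                                   # coefficients of psi_0, ..., psi_16
idx = np.arange(d)

# random coefficient matrix of A~: Var(G_ij) = exp(-i-j), then normalized
# so that the Hilbert-Schmidt norm of the induced operator equals 1/3
G = rng.normal(scale=np.exp(-(idx[:, None] + idx[None, :]) / 2.0))
G /= 3.0 * np.linalg.norm(G, "fro")

def simulate_far(T, a=lambda u: 1.0, sigma=lambda u: 1.0, burn_in=50):
    """Coefficients of X_{1,T}, ..., X_{T,T} from the recursion
    X_{t,T} = a(t/T) * A~(X_{t-1,T}) + sigma(t/T) * eps_t."""
    sd = np.exp(-idx / 20.0)             # Var(u_{t,i}) = exp(-i/10)
    X = np.zeros(d)
    out = np.empty((T, d))
    for t in range(-burn_in, T):
        u = min(max(t / T, 0.0), 1.0)    # parameter curves are constant outside [0, 1]
        X = a(u) * (G.T @ X) + sigma(u) * rng.normal(scale=sd)
        if t >= 0:
            out[t] = X
    return out

# model (M_{a,2}) from (22): time-varying operator a_2(t/T) * A~
X = simulate_far(256, a=lambda u: 1.0 - 0.5 * np.cos(2.0 * np.pi * u))
```

Models (20) are then obtained by adding the Fourier coefficients of \(\mu = a_j\) to each row of the output, and models (21) by passing \(a_j\) as sigma instead.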

Remark 4

The models defined in (19)–(22) satisfy condition (A3) if the functions a and \(\sigma \) are Lipschitz continuous. To see this, we derive (14) by way of example; the other parts follow by similar arguments. Observe that in all models, the mean function \(\mu \) is in \(L^2\); thus, (14) is trivial for \(j=1\). For \(j\ge 2\), consider first the case that \(\tilde{A}\) is non-random. Then, by Lemma 4 (ii), the definition of \(A_{t/T}\) and \({\varepsilon }_{t,T}\), and linearity of both (powers of) the operator \(\tilde{A}\) and the cumulants,

$$\begin{aligned}&\Vert {{\,\mathrm{cum}\,}}(X_{t_1,T},\ldots , X_{t_j,T})\Vert _{2,j}\\&\quad = \Vert {{\,\mathrm{cum}\,}}(Y_{t_1,T},\ldots , Y_{t_j,T})\Vert _{2,j}\\&\quad = \left\| {{\,\mathrm{cum}\,}}\left( \sum _{k=0}^\infty \left( \prod _{i=0}^{k-1} A_{\frac{t_1-i}{T}} \right) ({\varepsilon }_{t_1-k,T}), \ldots , \sum _{k=0}^\infty \left( \prod _{i=0}^{k-1} A_{\frac{t_j-i}{T}} \right) ({\varepsilon }_{t_j-k,T}) \right) \right\| _{2,j}\\&\quad = \left\| {{\,\mathrm{cum}\,}}\left( \sum _{k=0}^\infty \left( \prod _{i=0}^{k-1} a\left( \tfrac{t_1-i}{T}\right) \right) \tilde{A}^k \left( \sigma \left( \tfrac{t_1-k}{T}\right) \sum _{\ell =0}^{16} u_{t_1-k,\ell }\psi _\ell \right) , \ldots \right. \right. \\&\qquad \left. \left. \ldots , \sum _{k=0}^\infty \left( \prod _{i=0}^{k-1} a\left( \tfrac{t_j-i}{T}\right) \right) \tilde{A}^k \left( \sigma \left( \tfrac{t_j-k}{T}\right) \sum _{\ell =0}^{16}u_{t_j-k,\ell }\psi _\ell \right) \right) \right\| _{2,j}\\&\quad = \left\| \sum _{k_1,\ldots , k_j=0}^{\infty } \sum _{\ell _1,\ldots ,\ell _j=0}^{16} \left( \prod _{\nu =1}^{j}\left\{ \left( \prod _{i=0}^{k_\nu -1}a \left( \tfrac{t_\nu -i}{T}\right) \right) \sigma \left( \tfrac{t_\nu -k_\nu }{T}\right) \tilde{A}^{k_\nu }(\psi _{\ell _\nu }) \right\} \right) \right. \\&\qquad \left. \times {{\,\mathrm{cum}\,}}(u_{t_1-k_1,\ell _1},\ldots , u_{t_j-k_j,\ell _j})\right\| _{2,j}. \end{aligned}$$

By independence of the random variables \(u_{t,\ell }\), the cumulants on the right-hand side of the previous display are zero whenever there is an index \(\nu \in \{2,\ldots ,j\}\) such that \(\ell _1\ne \ell _\nu \) or \(t_1-k_1\ne t_\nu -k_\nu \). Further, since the random variables \(u_{t,\ell }\) are normally distributed, their cumulants of order \(j>2\) vanish. Thus, (14) is trivial for \(j>2\). Now, let \(j=2\) and consider \(t_2\ge t_1\) without loss of generality. Then, by the previous considerations,

$$\begin{aligned} \Vert&\text {Cov}(X_{t_1,T},X_{t_2,T})\Vert _{2,2} \\&\quad = \left\| \sum _{k=0}^{\infty } \sum _{\ell =0}^{16} \left( \prod _{i=0}^{k-1}a\left( \tfrac{t_1-i}{T}\right) \right) \sigma \left( \tfrac{t_1-k}{T}\right) \tilde{A}^{k}(\psi _\ell )\left( \prod _{i=0}^{t_2-t_1+k-1}a\left( \tfrac{t_2-i}{T}\right) \right) \right. \\&\qquad \left. \times \sigma \left( \tfrac{t_2-(t_2-t_1+k)}{T}\right) \tilde{A}^{t_2-t_1+k}(\psi _\ell ) \text {Var}(u_{t_1-k,\ell })\right\| _{2,2}, \end{aligned}$$

which can be bounded by

$$\begin{aligned} \frac{9}{4}\sum _{k=0}^{\infty } \left( \frac{1}{2}\right) ^{t_2-t_1+2k} \sum _{\ell =0}^{16} \exp \left( -\frac{\ell }{10}\right) \le C \big (\frac{1}{2}\big )^{t_2-t_1} \end{aligned}$$

since \(\Vert \tilde{A}\Vert _\mathcal {S}= \tfrac{1}{3}\), \(\text {Var}(u_{t,\ell })=\exp (-\ell /10)\), and both a and \(\sigma \) are bounded from above by 3/2. Thus,

$$\begin{aligned} \sum _{t_1=-\infty }^{\infty } \Vert \text {Cov}(X_{t_1,T},X_{t_2,T})\Vert _{2,2} \le C \left( \sum _{t_1=-\infty }^{t_2} \left( \frac{1}{2}\right) ^{t_2-t_1} + \sum _{t_1=t_2+1}^{\infty } \left( \frac{1}{2}\right) ^{t_1-t_2}\right) \le C, \end{aligned}$$

which proves (14) in case \(\tilde{A}\) is non-random. The random case follows since the upper bound does not depend on \(\tilde{A}\) and since \(\tilde{A}\) is assumed to be independent of \(\tilde{\varepsilon }_t\).

Subsequently, the respective models will be denoted by \((\mathcal {M}_0)\), \((\mathcal {M}_{m,j})\), \((\mathcal {M}_{v,j})\) and \((\mathcal {M}_{a,j})\), \(j=1, \dots , 3\). Note that the model descriptions are not mutually exclusive: for instance, the models in (20), which deviate from \(H_0^{\scriptscriptstyle (m)}\), also deviate from \(H_0^{\scriptscriptstyle (c,0)}\).

Preliminary simulation studies showed that the data-driven choice of m, as introduced in Sect. 3.6, yields results similar to a manual choice of m and should therefore be favored. Further parameters of the simulation design are as follows: the number of bootstrap replicates was set to \(K=200\), and two sample sizes were considered, namely \(T=256\) and \(T=512\). Note, though, that, unlike many frequency domain-based methods for functional time series, the proposed testing procedure does not require the sample size to be a power of two. The hyperparameter n for estimating local means was set to \(n=45,60,75,90,T\), and the maximal number of lags considered was \(H=4\). Empirical rejection rates are based on \(N=500\) simulation runs each and are summarized in Tables 1 and 2.

Table 1 Empirical rejection rates for various combined tests, based on a sample size of \(T=256\) and a block length parameter m calculated as proposed in Sect. 3.6
Table 2 Empirical rejection rates for various combined tests, based on a sample size of \(T=512\) and a block length parameter m calculated as proposed in Sect. 3.6

From the previous results, it can be seen that different choices of n do not lead to materially different results. For \(T=256\), the tests for the hypotheses \(H_0^{\scriptscriptstyle (m)}\) and \(H_0^{\scriptscriptstyle (c,0)}\) already have good power against the alternatives \((\mathcal {M}_{m,1}),(\mathcal {M}_{m,3})\) and \((\mathcal {M}_{v,1}),(\mathcal {M}_{v,2}),(\mathcal {M}_{v,3})\), respectively. When combining \(H_0^{\scriptscriptstyle (m)}\) and \(H_0^{\scriptscriptstyle (c,0)}\) and taking further autocovariances into account, the test does not lose substantial power. For \(T=512\), the power increases further, such that all tests have good power against the alternatives \((\mathcal {M}_{m,i})\) and \((\mathcal {M}_{v,i})\), \(i=1,2,3\). Detecting the non-stationarities in models \((\mathcal {M}_{a,i}), i=1,2,3\), turns out to be more difficult: even though the power increases with T, the results for small values of T are not convincing. These findings can be explained by the fact that the measures of non-stationarity \(\Vert M\Vert _{2,2}\) and \(\Vert M_h\Vert _{2,3}\), as introduced in (5) and (6), are comparatively small for the models \((\mathcal {M}_{a,i}), i=1,2,3\). This can be deduced from Table 3, where these measures are approximated by their natural estimators \(\Vert M_T\Vert _{2,2}=\Vert U_T\Vert _{2,2}/\sqrt{T}\) and \(\Vert M_{T,h}\Vert _{2,3}=\Vert U_{T,h}\Vert _{2,3}/\sqrt{T}\), based on 2000 Monte Carlo repetitions and various choices of T. Notably, the values for models \((\mathcal {M}_{a,1})\) and \((\mathcal {M}_{a,2})\) are close to those for \((\mathcal {M}_{0})\), which explains the results of the simulation study.

Table 3 \(\Vert M_{T}\Vert _{2,2}\) and \(\Vert M_{T,h}\Vert _{2,3},h=0,\ldots , 4\), calculated by 2000 Monte Carlo repetitions

As pointed out by a reviewer, it is of interest to investigate the sensitivity of the test with respect to the number of components included in the procedure. For this purpose, we highlight in Table 4 the rejection probabilities of the tests for the different hypotheses in the models \((\mathcal {M}_{m,1})\) and \((\mathcal {M}_{v,1})\), which correspond to a change in the mean (and as a consequence in all second-order moments) and changes in all second-order moments (with no change in the mean), respectively.

We observe that the impact of the number of components on the power of the tests shows no clear pattern. For example, the power is almost constant under model \((\mathcal {M}_{m,1})\) if additional components are included in the test statistic. Conversely, under model \((\mathcal {M}_{v,1})\), the power decreases with the number of components.

We also list results for the functional moving average model

[defining equation of the model \(\mathcal {M}_\mathrm{{MA}}\), displayed as an image in the source]

with \(\sigma (u)=\mathbb {1}(u\le 0.5)-\mathbb {1}(u>0.5)\) and \(\tilde{\varepsilon }_t\) as for the autoregressive model. This model has constant mean and covariances, except for the covariance at lag 2; consequently, only the tests for the hypotheses \(H_0^{(H)}\) with \(H\ge 2\) should reject the null, since all moments besides \(\mathbb {E}[X_{t,T}X_{t+2,T}]\) are constant. This effect is clearly visible in the last rows of Table 4, where the largest rejection probabilities are observed for the test constructed for the hypothesis \(H_0^{(2)}\). The power decreases for the hypotheses \(H_0^{(3)}\) and \(H_0^{(4)}\), which can be explained by the fact that the corresponding statistics combine stationary and non-stationary parts of the serial dependence. Compared to the other cases, the test has less power in model \(\mathcal {M}_\mathrm{{MA}}\). An intuitive explanation for this observation is that the weights of the p values, as introduced in Sect. 3.4, favor changes in the mean and the second moment (the corresponding weights in the test statistic are larger), which makes it more difficult to detect changes in models that are non-stationary only in the second-order moments at higher lags. It is worth mentioning, however, that the null hypothesis \(H_0^{(c,2)}\) is rejected in nearly \(100\%\) of the cases in the \(\mathcal {M}_\mathrm{{MA}}\) model (these results are not displayed).

Table 4 Empirical rejection rates for various combined tests, based on sample sizes of \(T=256, 512\), the bandwidth of the local mean estimator \(n=60\) (for \(T=256\)), \(n=90\) (for \(T=512\)), and a block length parameter selected automatically

5.2 Case study

Functional time series arise naturally in the field of meteorology. For instance, the series of daily minimum temperatures recorded at a given location can be divided into yearly functional data.
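
As a small illustration of this preprocessing step, the following Python sketch (our own, with a hypothetical function name) turns a vector of daily observations with year labels into a matrix of yearly curves, each evaluated on a common grid of rescaled time points in [0, 1].

```python
import numpy as np

def to_yearly_curves(values, years, grid_size=365):
    """Split a daily series into one curve per year on a common grid in [0, 1]."""
    values, years = np.asarray(values, dtype=float), np.asarray(years)
    tau = np.linspace(0.0, 1.0, grid_size)                  # rescaled time within a year
    curves = []
    for year in np.unique(years):
        day_vals = values[years == year]
        obs_tau = np.linspace(0.0, 1.0, len(day_vals))      # observed days, rescaled
        curves.append(np.interp(tau, obs_tau, day_vals))    # align leap/regular years
    return np.vstack(curves)                                # (number of years, grid_size)
```

Each row of the result is one functional observation, so a record of \(T\) years yields a functional time series of length \(T\).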

To illustrate the proposed methodology, we consider the daily minimum temperature recorded at eight different locations across Australia. As examples, the temperature curves of Melbourne and Sydney are displayed in Fig. 1. The results of our testing procedure can be found in Table 5, where we employed \(K=1000\) bootstrap replicates, considered up to \(H=4\) lags and chose \(n=25\), based on visual exploration of the respective plots. The null hypothesis of stationarity can be rejected, at level \(\alpha =0.05\), for all measuring stations except Gunnedah Pool, for which the p values exceed \(\alpha \) by a small margin.

Fig. 1 Temperature curves of Melbourne (\(T=161\) years) and Sydney (\(T=160\) years), where the x axis corresponds to a year in rescaled time and the y axis denotes temperature in degrees Celsius

Table 5 p values (in percent) of the (combined) tests for the respective null hypotheses, and the selected value of m

6 Proofs

Throughout the proofs, C denotes a generic constant whose value may change from line to line. If not specified otherwise, all convergences are for \(T\rightarrow \infty \).

6.1 A fundamental approximation lemma in Hilbert spaces

Lemma 5

Fix \(p\in \mathbb {N}\). For \(i=1, \dots , p\) and \(T\in \mathbb {N}\), let \(X_{i,T}\) and \(X_i\) denote random variables in a separable Hilbert space \((H_i, \langle \cdot , \cdot \rangle _i)\). Further, let \((\psi _k^{\scriptscriptstyle (i)})_{k\in \mathbb {N}}\) be an orthonormal basis of \(H_i\) and, for brevity, write \(\langle \cdot , \cdot \rangle = \langle \cdot , \cdot \rangle _i\). Suppose that

$$\begin{aligned} (1)\quad&Y_T^n:=\big ((\langle X_{1,T},\psi _k^{(1)}\rangle )_{k=1}^n,\dots ,(\langle X_{p,T},\psi _k^{(p)}\rangle )_{k=1}^n\big ) \\&\quad \rightsquigarrow \big ((\langle X_{1},\psi _k^{(1)}\rangle )_{k=1}^n,\dots ,(\langle X_{p},\psi _k^{(p)}\rangle )_{k=1}^n\big )=:Y^n \quad \text {as }T\rightarrow \infty ,\text { for any } n\in \mathbb {N}, \\ (2)\quad&\lim \limits _{n\rightarrow \infty } \limsup \limits _{T\rightarrow \infty } \mathbb {P}\Big (\sum _{k=n+1}^{\infty }\sum _{i=1}^{p}\langle X_{i,T},\psi _k^{(i)}\rangle ^2> {\varepsilon }\Big ) =0 \quad \text { for all } {\varepsilon }>0. \end{aligned}$$

Then, using the notation \(\Vert (x_k)_{k\in \mathbb {N}}\Vert _2=\big (\sum _{k=1}^\infty x_k^2\big )^{1/2}\),

$$\begin{aligned} Y_T^\infty:= & {} \big ((\langle X_{1,T},\psi _k^{(1)}\rangle )_{k=1}^\infty ,\dots ,(\langle X_{p,T},\psi _k^{(p)}\rangle )_{k=1}^\infty \big ) \\&\rightsquigarrow \big ((\langle X_1,\psi _k^{(1)}\rangle )_{k=1}^\infty ,\dots , (\langle X_p,\psi _k^{(p)}\rangle )_{k=1}^\infty \big ) =:Y^\infty \quad \text { in } (\ell ^2(\mathbb {N}), \Vert \cdot \Vert _2)^p \end{aligned}$$

and, as a consequence,

$$\begin{aligned} (X_{1,T},\dots ,X_{p,T}) \rightsquigarrow (X_1,\dots , X_p) \quad \text {in } H_1 \times \dots \times H_p. \end{aligned}$$

Proof of Lemma 5

To prove the first part, we employ Theorem 2 of Dehling et al. (2009). Expand the random variables \(Y_T^n\) and \(Y^n\) in \(\mathbb {R}^{pn}\) to

$$\begin{aligned} \tilde{Y}_{T,n}^\infty =\big ((a_{T,k}^{(1)})_{k\in \mathbb {N}},\dots ,(a_{T,k}^{(p)})_{k\in \mathbb {N}}\big )~\text {and}~\tilde{Y}_{n}^\infty =\big ((a_{k}^{(1)})_{k\in \mathbb {N}},\dots ,(a_{k}^{(p)})_{k\in \mathbb {N}}\big ) \end{aligned}$$

in \((\ell ^2(\mathbb {N}), \Vert \cdot \Vert _2)^p\), where \(a_{T,k}^{\scriptscriptstyle (i)}=\langle X_{i,T},\psi _k^{\scriptscriptstyle (i)}\rangle \) and \(a_{k}^{\scriptscriptstyle (i)}=\langle X_i,\psi _k^{\scriptscriptstyle (i)}\rangle \), for any \(1\le k\le n\), and \(a_{T,k}^{\scriptscriptstyle (i)}=a_k^{\scriptscriptstyle (i)}=0\), for any \(k>n\), \(i=1,\dots ,p\). By the continuous mapping theorem, \(\tilde{Y}_{T,n}^\infty \) converges weakly to \(\tilde{Y}_n^\infty \) in \((\ell ^2(\mathbb {N}), \Vert \cdot \Vert _2)^p\), for any \(n\in \mathbb {N}\) and as T tends to infinity.

By assumption (2) and since the space \((\ell ^2(\mathbb {N}), \Vert \cdot \Vert _2)^p\) is separable and complete, there is a random variable \(\tilde{Y}^\infty \in (\ell ^2(\mathbb {N}), \Vert \cdot \Vert _2)^p\) such that \(Y_T^\infty \rightsquigarrow \tilde{Y}^\infty \), as T tends to infinity, and \(\tilde{Y}_n^\infty \rightsquigarrow \tilde{Y}^\infty \), as n tends to infinity, by Theorem 2 of Dehling et al. (2009). Due to the latter convergence, the finite-dimensional distributions of \(\tilde{Y}^\infty \) and \(Y^\infty \) are the same. Thus, by Theorem 1.3 of Billingsley (1999) and Lemma 1.5.3 of van der Vaart and Wellner (1996), \(\tilde{Y}^\infty \) and \(Y^\infty \) have the same distribution in \( (\ell ^2(\mathbb {N}), \Vert \cdot \Vert _2)^p\).

Next, observe that, for an arbitrary Hilbert space H, the function

\(\Phi :=\Bigg \{\begin{array}{llc}\ell ^2(\mathbb {N})&{}\rightarrow &{}H \\ (y_k)_{k\in \mathbb {N}}&{}\mapsto &{}\sum _{k=1}^\infty y_k \psi _k\end{array}\)

is continuous, provided \((\psi _k)_{k\in \mathbb {N}}\) is an orthonormal basis of H. Indeed,

$$\begin{aligned} \big \Vert \Phi \big ((y_k)_{k}\big )-\Phi \big ((z_k)_{k}\big )\big \Vert ^2 =\textstyle \sum _{k=1}^{\infty } (y_k-z_k)^2 = \Vert y-z\Vert _2^2. \end{aligned}$$

Thus, the mapping

$$\begin{aligned} \Phi ':=\Bigg \{\begin{array}{llc}(\ell ^2(\mathbb {N}), \Vert \cdot \Vert _2)^p &{}\rightarrow &{}H_1 \times \dots \times H_p \\ \big ((y_{k,1})_{k\in \mathbb {N}},\dots ,(y_{k,p})_{k\in \mathbb {N}}\big )&{}\mapsto &{}\big (\sum _{k=1}^\infty y_{k,1} \psi _k^{(1)},\dots ,\sum _{k=1}^\infty y_{k,p} \psi _k^{(p)}\big )\end{array} \end{aligned}$$

is continuous too, and the continuous mapping theorem implies that

$$\begin{aligned} \textstyle (X_{1,T},\dots ,X_{p,T})&= \textstyle \Big (\sum _{k=1}^{\infty } \langle X_{1,T},\psi _k^{(1)}\rangle \psi _k^{(1)},\dots ,\sum _{k=1}^{\infty } \langle X_{p,T},\psi _k^{(p)}\rangle \psi _k^{(p)}\Big )\\&\rightsquigarrow \textstyle \Big (\sum _{k=1}^{\infty } \langle X_1,\psi _k^{(1)}\rangle \psi _k^{(1)},\dots ,\sum _{k=1}^{\infty } \langle X_p,\psi _k^{(p)}\rangle \psi _k^{(p)}\Big )\\&=(X_1,\dots ,X_p), \end{aligned}$$

as T tends to infinity. \(\square \)

6.2 Proofs for Sects. 3.1, 3.2, 3.3 and 3.4

Proof of Lemma 2

We only prove the equivalence concerning \(H_0^{\scriptscriptstyle (h)}\); the equivalences regarding \(H_0^{\scriptscriptstyle (m)}\) follow along similar lines.

Step 1: Equivalence between (3) and (8). Suppose that (8) is met. To prove (3), it is sufficient to show that

$$\begin{aligned} \Big \Vert \mathbb {E}[X_0^{(u)}\otimes X_h^{(u)}]-\int _0^1 \mathbb {E}[X_0^{(w)}\otimes X_h^{(w)}]{\,\mathrm {d}}w\Big \Vert _{2,2}=0 \end{aligned}$$
(23)

for any \(u\in [0,1]\).

Fix \(u\in [0,1)\) and let \(\delta >0\) be sufficiently small such that \(u+\delta <1\). By the reverse triangle inequality, we obtain that

$$\begin{aligned} 0 \le&\Bigg | \bigg \Vert \frac{1}{\delta }\int _u^{u+\delta } \mathbb {E}\big [ X_0^{(w)}\otimes X_h^{(w)} -X_0^{(u)}\otimes X_h^{(u)}\big ]{\,\mathrm {d}}w\bigg \Vert _{2,2} \\&\quad - \bigg \Vert \int _0^1 \mathbb {E}\big [X_0^{(w)}\otimes X_h^{(w)}- X_0^{(u)}\otimes X_h^{(u)} \big ]{\,\mathrm {d}}w\bigg \Vert _{2,2} \Bigg | \\ \le&\frac{1}{\delta } \bigg \Vert \int _u^{u+\delta } \mathbb {E}\big [ X_0^{(w)}\otimes X_h^{(w)} \big ]{\,\mathrm {d}}w- \delta \int _0^1 \mathbb {E}\big [ X_0^{(w)}\otimes X_h^{(w)} \big ]{\,\mathrm {d}}w\bigg \Vert _{2,2} \\ =&\frac{1}{\delta } \bigg \Vert \int _0^{u+\delta } \mathbb {E}\big [ X_0^{(w)}\otimes X_h^{(w)} \big ]{\,\mathrm {d}}w-(u+\delta ) \int _0^1 \mathbb {E}\big [ X_0^{(w)}\otimes X_h^{(w)} \big ]{\,\mathrm {d}}w\\&\quad - \int _0^{u} \mathbb {E}\big [ X_0^{(w)}\otimes X_h^{(w)} \big ]{\,\mathrm {d}}w+ u \int _0^1 \mathbb {E}\big [ X_0^{(w)}\otimes X_h^{(w)} \big ]{\,\mathrm {d}}w\bigg \Vert _{2,2} \\ \le&\frac{1}{\delta } \bigg \Vert \int _0^{u+\delta } \mathbb {E}\big [ X_0^{(w)}\otimes X_h^{(w)} \big ]{\,\mathrm {d}}w-(u+\delta ) \int _0^1 \mathbb {E}\big [ X_0^{(w)}\otimes X_h^{(w)} \big ]{\,\mathrm {d}}w\bigg \Vert _{2,2} \\&\quad + \frac{1}{\delta } \bigg \Vert \int _0^u \mathbb {E}\big [ X_0^{(w)}\otimes X_h^{(w)} \big ]{\,\mathrm {d}}w-u \int _0^1 \mathbb {E}\big [ X_0^{(w)}\otimes X_h^{(w)} \big ]{\,\mathrm {d}}w\bigg \Vert _{2,2}. \end{aligned}$$

By continuity of integrals in the upper integration limit, it follows from (8) that both summands on the right-hand side of this display are equal to zero. As a consequence,

$$\begin{aligned}&\bigg \Vert \int _0^1 \mathbb {E}[X_0^{(w)}\otimes X_h^{(w)}] {\,\mathrm {d}}w - \mathbb {E}[X_0^{(u)}\otimes X_h^{(u)} ] \bigg \Vert _{2,2} \nonumber \\&\quad = \bigg \Vert \frac{1}{\delta }\int _u^{u+\delta } \mathbb {E}[ X_0^{(w)}\otimes X_h^{(w)} -X_0^{(u)}\otimes X_h^{(u)}]{\,\mathrm {d}}w\bigg \Vert _{2,2}. \end{aligned}$$
(24)

By Jensen’s inequality, we can bound the right-hand side of this display from above by

$$\begin{aligned} \left( \int _{[0,1]^2} \frac{1}{\delta ^2} \int _u^{u+\delta }\big (\mathbb {E}[X_0^{(w)}(\tau _1)X_h^{(w)}(\tau _2)-X_0^{(u)}(\tau _1)X_h^{(u)}(\tau _2)]\big )^2 {\,\mathrm {d}}w {\,\mathrm {d}}(\tau _1,\tau _2)\right) ^{1/2}. \end{aligned}$$
(25)

By employing Jensen’s inequality again, we can bound the integrand by

$$\begin{aligned} \mathbb {E}\big [\big (X_0^{(w)}(\tau _1)X_h^{(w)}(\tau _2)-X_0^{(u)}(\tau _1)X_h^{(u)}(\tau _2)\big )^2\big ]. \end{aligned}$$

Thus, by Fubini’s theorem, (25) is less than or equal to

$$\begin{aligned} \bigg ( \frac{1}{\delta ^2} \int _u^{u+\delta }\mathbb {E}[\Vert X_0^{(w)}\otimes X_h^{(w)}-X_0^{(u)}\otimes X_h^{(u)}\Vert _{2,2}^2] {\,\mathrm {d}}w\bigg )^{1/2}. \end{aligned}$$

This term is of the order \(O(\delta ^{1/2})\) due to the inequality

$$\begin{aligned}&\mathbb {E}\Vert X_0^{(w)}\otimes X_h^{(w)}-X_0^{(u)}\otimes X_h^{(u)} \Vert _{2,2}^2 \nonumber \\&\quad = \mathbb {E}\Vert X_0^{(w)}\otimes X_h^{(w)}-X_0^{(u)}\otimes X_h^{(w)}+X_0^{(u)}\otimes X_h^{(w)}-X_0^{(u)}\otimes X_h^{(u)} \Vert _{2,2}^2 \nonumber \\&\quad \le \ 2\big \{ \mathbb {E}\Vert (X_0^{(w)}-X_0^{(u)})\otimes X_h^{(w)}\Vert _{2,2}^2+ \mathbb {E}\Vert X_0^{(u)}\otimes (X_h^{(w)}-X_h^{(u)}) \Vert _{2,2}^2\big \} \nonumber \\&\quad \le 2\big \{ \mathbb {E}[ \Vert X_0^{(w)}-X_0^{(u)}\Vert _2^4]^{1/2} \ \mathbb {E}[\Vert X_h^{(w)}\Vert _{2}^4]^{1/2}+ \mathbb {E}[ \Vert X_0^{(u)}\Vert _2^4]^{1/2} \ \mathbb {E}[ \Vert X_h^{(w)}-X_h^{(u)} \Vert _{2}^4]^{1/2} \big \} \nonumber \\&\quad \le C |u-w|^2, \end{aligned}$$
(26)

where the final bound follows from Lemma C.2 in the supplementary material. Since \(\delta \) was chosen arbitrarily, we obtain that the right-hand side of (24) is equal to zero. This proves (23) for \(u\in [0,1)\), and the case \(u=1\) follows from (26), which is also valid for \(u=1\).

Conversely, if (3) holds true, we have by a change of variables, linearity of the integral, Jensen’s inequality and Fubini’s theorem,

$$\begin{aligned}&\bigg \Vert \int _0^u \mathbb {E}\big [ X_0^{(w)}\otimes X_h^{(w)} \big ]{\,\mathrm {d}}w-u \int _0^1 \mathbb {E}\big [ X_0^{(w)}\otimes X_h^{(w)} \big ]{\,\mathrm {d}}w\bigg \Vert _{2,3}^2\\&\quad = \bigg \Vert u \int _0^1 \mathbb {E}\big [ X_0^{(uw)}\otimes X_h^{(uw)} \big ]- \mathbb {E}\big [ X_0^{(w)}\otimes X_h^{(w)} \big ]{\,\mathrm {d}}w\bigg \Vert _{2,3}^2 \\&\quad \le \int _0^1 \int _0^1 u^2 \big \Vert \mathbb {E}[X_0^{(uw)}\otimes X_h^{(uw)}]-\mathbb {E}[X_0^{(w)}\otimes X_h^{(w)}]\big \Vert _{2,2}^2 {\,\mathrm {d}}w {\,\mathrm {d}}u=0. \end{aligned}$$

Step 2: Equivalence between (3) and (10). Note that, irrespective of whether (3) or (10) is met, local stationarity of \((X_{t,T})\) of order \(\rho \ge 4\) and stationarity of \((X_t^{\scriptscriptstyle (u)})_{t\in \mathbb {Z}}\) with \(\mathbb {E}\Vert X_t^{\scriptscriptstyle (u)}\Vert _2^4 < \infty \) imply that

$$\begin{aligned}&\ \Vert \mathbb {E}[X_{\lfloor uT\rfloor ,T}\otimes X_{\lfloor uT\rfloor +h,T}- X_{\lfloor uT\rfloor }^{(u)}\otimes X_{\lfloor uT\rfloor +h}^{(u)}]\Vert _{2,2}^2 \nonumber \\&\quad \le \ \mathbb {E}[\Vert X_{\lfloor uT\rfloor ,T}\otimes X_{\lfloor uT\rfloor +h,T}-X_{\lfloor uT\rfloor }^{(u)}\otimes X_{\lfloor uT\rfloor +h}^{(u)}\Vert _{2,2}^2] \nonumber \\&\quad \le \ 2 \big \{ \mathbb {E}[\Vert (X_{\lfloor uT\rfloor ,T}-X_{\lfloor uT\rfloor }^{(u)})\otimes X_{\lfloor uT\rfloor +h,T}\Vert _{2,2}^2]\nonumber \\&\qquad + \mathbb {E}[\Vert X_{\lfloor uT\rfloor }^{(u)}\otimes (X_{\lfloor uT\rfloor +h,T}-X_{\lfloor uT\rfloor +h}^{(u)})\Vert _{2,2}^2]\big \} \nonumber \\&\quad \le \ 2 \big \{ \mathbb {E}[\Vert X_{\lfloor uT\rfloor ,T}-X_{\lfloor uT\rfloor }^{(u)}\Vert _2^2 \Vert X_{\lfloor uT\rfloor +h,T}\Vert _2^2]\nonumber \\&\qquad + \mathbb {E}[\Vert X_{\lfloor uT\rfloor }^{(u)}\Vert _2^2 \Vert X_{\lfloor uT\rfloor +h,T}-X_{\lfloor uT\rfloor +h}^{(u)}\Vert _2^2]\big \} \nonumber \\&\quad \le \ (C/T^2) \big \{ \mathbb {E}[ (P_{\lfloor uT\rfloor ,T}^{(u)})^4]^{1/2} \mathbb {E}[\Vert X_{\lfloor uT\rfloor +h,T}\Vert _2^4]^{1/2} \nonumber \\&\qquad + \mathbb {E}[\Vert X_{\lfloor uT\rfloor }^{(u)}\Vert _2^4]^{1/2} \mathbb {E}[ (P_{\lfloor uT\rfloor +h,T}^{(u)})^4]^{1/2} \big \} \nonumber \\&\quad \le \ {C}/{T^2}, \end{aligned}$$
(27)

for any \(u\in [0,1]\) and \(T\in \mathbb {N}\) and for some universal constant \(C>0\).

Now, suppose that (3) is met. Then, the previous display implies that

$$\begin{aligned}&\ \Vert \mathbb {E}[X_{\lfloor uT\rfloor ,T}\otimes X_{\lfloor uT\rfloor +h,T}]-\mathbb {E}[X_{0,T}\otimes X_{h,T}]\Vert _{2,2}\\&\quad \le \ \Vert \mathbb {E}[X_{\lfloor uT\rfloor ,T}\otimes X_{\lfloor uT\rfloor +h,T}]-\mathbb {E}[X_{\lfloor uT\rfloor }^{(u)}\otimes X_{\lfloor uT\rfloor +h}^{(u)}]\Vert _{2,2} \\&\qquad + \Vert \mathbb {E}[X_{0}^{(u)}\otimes X_{h}^{(u)}]-\mathbb {E}[X_{0}^{(0)}\otimes X_{h}^{(0)}]\Vert _{2,2} + \Vert \mathbb {E}[X_{0}^{(0)}\otimes X_{h}^{(0)}]\\&\qquad -\mathbb {E}[X_{0,T}\otimes X_{h,T}]\Vert _{2,2}\\&\quad \le \ {C}/{T} + 0 + {C}/{T} = {2C}/{T}, \end{aligned}$$

for any \(u\in [0,1]\) and \(T\in \mathbb {N}\), that is, (10) is met.

Conversely, if (10) is met, then, by (27) and (1), for any \(u,v\in [0,1]\) and \(T\in \mathbb {N}\),

$$\begin{aligned}&\ \Vert \mathbb {E}[X_{0}^{(u)} \otimes X_h^{(u)}-X_{0}^{(v)} \otimes X_h^{(v)}]\Vert _{2,2} \\&\quad =\ \Vert \mathbb {E}[X_{\lfloor uT\rfloor }^{(u)} \otimes X_{\lfloor uT\rfloor +h}^{(u)}-X_{\lfloor vT\rfloor }^{(v)} \otimes X_{\lfloor vT\rfloor +h}^{(v)}]\Vert _{2,2} \\&\quad \le \ \Vert \mathbb {E}[X_{\lfloor uT\rfloor }^{(u)} \otimes X_{\lfloor uT\rfloor +h}^{(u)}-X_{\lfloor uT\rfloor ,T} \otimes X_{\lfloor uT\rfloor +h,T}]\Vert _{2,2} \\&\qquad + \Vert \mathbb {E}[X_{\lfloor uT\rfloor ,T} \otimes X_{\lfloor uT\rfloor +h,T} - X_{0,T} \otimes X_{h,T}] \Vert _{2,2} \\&\qquad + \Vert \mathbb {E}[X_{0,T} \otimes X_{h,T}-X_{\lfloor vT\rfloor ,T} \otimes X_{\lfloor vT\rfloor +h,T}]\Vert _{2,2} \\&\ \qquad + \Vert \mathbb {E}[X_{\lfloor vT\rfloor ,T} \otimes X_{\lfloor vT\rfloor +h,T}- X_{\lfloor vT\rfloor }^{(v)} \otimes X_{\lfloor vT\rfloor +h}^{(v)}]\Vert _{2,2} \\&\quad \le \ 4 C/T. \end{aligned}$$

Since T was arbitrary, the left-hand side of this display must be zero, whence (3). \(\square \)

Proof of Theorem 1

This theorem is an immediate consequence of Theorem C.3 of the supplementary material. \(\square \)

Proof of Corollary 1

Suppose that \(H_0^{\scriptscriptstyle (c,h)}\) is met. Then, by the triangle inequality and a slight abuse of notation (note that u is a variable of integration in the norm \(\Vert \cdot \Vert _{2,3}\)), for \(h\le T\),

$$\begin{aligned}&\ \Vert U_{T,h}-\tilde{G}_{T,h}\Vert _{2,\Omega \times [0,1]^3} \\&\quad =\ \bigg \Vert \frac{1}{\sqrt{T}}\bigg (\sum _{t=1}^{\lfloor uT\rfloor \wedge (T-h)}\mathbb {E}[X_{t,T}\otimes X_{t+h,T}]-u\sum _{t'=1}^{T-h}\mathbb {E}[X_{t',T}\otimes X_{t'+h,T}]\bigg ) \bigg \Vert _{2,3}\\&\quad = \ \frac{1}{\sqrt{T}}\bigg \Vert \frac{1}{T-h}\sum _{t=1}^{\lfloor uT\rfloor \wedge (T-h)}\sum _{t'=1}^{T-h}\big \{\mathbb {E}[X_{t,T}\otimes X_{t+h,T}]-\mathbb {E}[X_{t',T}\otimes X_{t'+h,T}]\big \}\\&\qquad +\bigg (\frac{1}{T-h}-\frac{u}{\lfloor uT\rfloor \wedge (T-h)}\bigg )\sum _{t=1}^{\lfloor uT\rfloor \wedge (T-h)}\sum _{t'=1}^{T-h}\mathbb {E}[X_{t',T}\otimes X_{t'+h,T}] \bigg \Vert _{2,3}\\&\quad \le \ \frac{C}{T^{3/2}}\sum _{t,t'=1}^{T-h}\Vert \mathbb {E}[X_{t,T}\otimes X_{t+h,T}]-\mathbb {E}[X_{t',T}\otimes X_{t'+h,T}]\Vert _{2,2}+\frac{C}{T^{3/2}}\sum _{t=1}^{T-h}\Vert \mathbb {E}[X_{t,T}\otimes X_{t+h,T}]\Vert _{2,2}. \end{aligned}$$

This expression is of the order \(O(T^{-1/2})\) by (10) and Assumption (A2). Hence, \(\Vert U_{T,h}-\tilde{G}_{T,h}\Vert _{2,3}=o_\mathbb {P}(1)\), and the assertion for \(U_{T}\) follows along similar lines.

Now, consider the assertion regarding the alternative \(H_1^{\scriptscriptstyle (H)} = H_1^{\scriptscriptstyle (m)} \cup H_1^{\scriptscriptstyle (c,0)} \cup \dots \cup H_1^{\scriptscriptstyle (c,H)}\). We only treat the case where \(H_1^{\scriptscriptstyle (c,h)}\) is met for some \(h\in \{0, \dots , H\}\); the case \(H_1^{\scriptscriptstyle (m)}\) is similar. It remains to show that \(\Vert U_{T,h}\Vert _{2,3}\rightarrow \infty \) in probability.

By the reverse triangle inequality, we have

$$\begin{aligned} \Vert U_{T,h}\Vert _{2,3} = \Vert \tilde{G}_{T,h} + \mathbb {E}U_{T,h}\Vert _{2,3} \ge \big | \Vert \tilde{G}_{T,h}\Vert _{2,3} - \Vert \mathbb {E}U_{T,h}\Vert _{2,3} \big |. \end{aligned}$$

The term \(\Vert \tilde{G}_{T,h}\Vert _{2,3}\) converges weakly to \(\Vert \tilde{G}_h\Vert _{2,3}\). Thus, it suffices to show that the second term \(\Vert \mathbb {E}U_{T,h}\Vert _{2,3}\) diverges to infinity. For that purpose, note that another application of the reverse triangle inequality implies that

$$\begin{aligned} \Vert \mathbb {E}U_{T,h}\Vert _{2,3}&=\ \left\| \frac{1}{\sqrt{T}} \left( \sum _{t=1}^{\lfloor uT\rfloor \wedge (T-h)} \mathbb {E}[X_{t,T}\otimes X_{t+h,T}]-u\sum _{t=1}^{T-h}\mathbb {E}[X_{t,T}\otimes X_{t+h,T}]\right) \right\| _{2,3} \\&\ge \ |S_{1,T}-S_{2,T}|, \end{aligned}$$

where

$$\begin{aligned} S_{1,T}= \left\| \frac{1}{\sqrt{T}}\left( \sum _{t=1}^{\lfloor uT\rfloor \wedge (T-h)} \left\{ \mathbb {E}[X_{t,T}\otimes X_{t+h,T}]-\mathbb {E}[X_t^{(t/T)} \otimes X_{t+h}^{(t/T)}]\right\} \right. \right. \\ \left. \left. - u\sum _{t=1}^{T-h}\left\{ \mathbb {E}[X_{t,T}\otimes X_{t+h,T}]-\mathbb {E}[X_t^{(t/T)}\otimes X_{t+h}^{(t/T)}]\right\} \right) \right\| _{2,3} \end{aligned}$$

and

$$\begin{aligned} S_{2,T}&=\left\| \frac{1}{\sqrt{T}}\left( \sum _{t=1}^{\lfloor uT\rfloor \wedge (T-h)} \mathbb {E}\left[ X_t^{(t/T)}\otimes X_{t+h}^{(t/T)}\right] -u\sum _{t=1}^{T-h}\mathbb {E}\left[ X_t^{(t/T)}\otimes X_{t+h}^{(t/T)}\right] \right) \right\| _{2,3}. \end{aligned}$$

In the following, we will show that \(S_{1,T}\) vanishes as T increases and that \(S_{2,T}\) diverges to infinity. We have

$$\begin{aligned} S_{1,T} \le \left\{ \int _0^1\left( \frac{1}{\sqrt{T}} \sum _{t=1}^{\lfloor uT\rfloor \wedge (T-h)} \left\| \mathbb {E}[X_{t,T}\otimes X_{t+h,T}]-\mathbb {E}[X_t^{(t/T)}\otimes X_{t+h}^{(t/T)}] \right\| _{2,2}\right. \right. \\ \left. \left. + \frac{u}{\sqrt{T}} \sum _{t=1}^{T-h} \left\| \mathbb {E}[X_{t,T}\otimes X_{t+h,T}]-\mathbb {E}[X_t^{(t/T)}\otimes X_{t+h}^{(t/T)}] \right\| _{2,2} \right) ^2 du \right\} ^{1/2}, \end{aligned}$$

which is of order \(O(T^{-1/2})\) since \(\Vert \mathbb {E}[X_{t,T}\otimes X_{t+h,T}]-\mathbb {E}[X_t^{\scriptscriptstyle (t/T)}\otimes X_{t+h}^{\scriptscriptstyle (t/T)}] \Vert _{2,2}\le C/T\) by (1). For the second term \(S_{2,T}\), we have, by stationarity,

$$\begin{aligned} S_{2,T}&= \sqrt{T}\left\| \frac{1}{T}\left( \sum _{t=1}^{\lfloor uT\rfloor \wedge (T-h)} \mathbb {E}[X_0^{(t/T)}\otimes X_{h}^{(t/T)}]-u\sum _{t=1}^{T-h}\mathbb {E}[X_0^{(t/T)}\otimes X_{h}^{(t/T)}]\right) \right\| _{2,3}, \end{aligned}$$

where the norm converges to

$$\begin{aligned} \bigg \Vert \int _0^u \mathbb {E}[X_0^{(w)}\otimes X_h^{(w)}]{\,\mathrm {d}}w - u \int _0^1 \mathbb {E}[X_0^{(w)}\otimes X_h^{(w)}]{\,\mathrm {d}}w\bigg \Vert _{2,3}, \end{aligned}$$

by the dominated convergence theorem and the moment condition (A2). The expression in the latter display is strictly positive since (8) is not satisfied and by the continuity of

$$\begin{aligned} \bigg \Vert \int _0^u \mathbb {E}[X_0^{(w)}\otimes X_h^{(w)}] {\,\mathrm {d}}w - u \int _0^1 \mathbb {E}[X_0^{(w)}\otimes X_h^{(w)}]{\,\mathrm {d}}w\bigg \Vert _{2,2} \end{aligned}$$

in \(u\in [0,1]\). Thus, \(S_{2,T}\rightarrow \infty \), which implies the assertion. \(\square \)

Proof of Lemma 3

We will only give a proof of (14); the remaining parts (i)–(iv) of the cumulant condition (A3) follow by similar arguments, which are omitted for the sake of brevity. According to Theorem 3 in Statulevicius and Jakimavicius (1988), we have

$$\begin{aligned} |{{\,\mathrm{cum}\,}}(X_{t_1,T}(\tau _1),\dots ,X_{t_k,T}(\tau _k))|\le 3(k-1)!2^{k-1} \alpha ^{\delta /(1+\delta )}(t_{i+1}-t_i) \prod _{j=1}^{k} \big (\mathbb {E}|X_{t_j,T}(\tau _j)|^{(1+\delta )k}\big )^{\frac{1}{(1+\delta )k}}, \end{aligned}$$

for any increasing sequence \(t_1\le t_2\le \dots \le t_k\) and any \(i\in \{1,\dots ,k-1\}\). Straightforward calculations combined with Hölder’s and Jensen’s inequalities lead to

$$\begin{aligned} \left\| \prod _{j=1}^k \mathbb {E}[|X_{t_j,T}|^{(1+\delta )k}]^{\frac{1}{(1+\delta )k}}\right\| _{2,k}&= \prod _{j=1}^k \big \Vert \mathbb {E}[|X_{t_j,T}|^{(1+\delta )k}]^{\frac{1}{(1+\delta )k}}\big \Vert _{2}\\&= \prod _{j=1}^k \bigg (\int _{[0,1]} \mathbb {E}[|X_{t_j,T}(\tau )|^{(1+\delta )k}]^{\frac{2}{(1+\delta )k}}\hbox {d}\tau \bigg )^{1/2}\\&\le \prod _{j=1}^k \bigg (\int _{[0,1]} \mathbb {E}[|X_{t_j,T}(\tau )|^{(1+\delta )k}]\hbox {d}\tau \bigg )^{\frac{1}{(1+\delta )k}}\\&= \prod _{j=1}^k \mathbb {E}\Big [\big \Vert X_{t_j,T}\big \Vert _{(1+\delta )k}^{(1+\delta )k}\Big ]^{\frac{1}{(1+\delta )k}}\\&\le \sup _{t,T}\mathbb {E}\Big [\big \Vert X_{t,T}\big \Vert _{(1+\delta )k}^{(1+\delta )k}\Big ]^{1/(1+\delta )}\\&\le C_{k,1}. \end{aligned}$$

Thus, combining the previous results leads to

$$\begin{aligned} \Vert {{\,\mathrm{cum}\,}}(X_{t_1,T},\dots ,X_{t_k,T})\Vert _{2,k}&\le 3(k-1)!2^{k-1}C_{k,1} \alpha ^{\delta /(1+\delta )}(t_{i+1}-t_i) \\&\le C_{k,4} \alpha ^{\delta /(1+\delta )}(t_{i+1}-t_i), \end{aligned}$$

for any \(i=1,\dots ,k-1\), where the constant \(C_{k,4}>0\) depends on k only. Hence,

$$\begin{aligned} \Vert {{\,\mathrm{cum}\,}}(X_{t_1,T},\dots ,X_{t_k,T})\Vert _{2,k}\le C_{k,4} \prod _{i=1}^{k-1} \alpha ^{\frac{\delta }{(1+\delta )(k-1)}}(t_{i+1}-t_i). \end{aligned}$$

Analogously, for arbitrary, not necessarily increasing \(t_1,\dots ,t_k\), we may obtain that

$$\begin{aligned}\Vert {{\,\mathrm{cum}\,}}(X_{t_1,T},\dots ,X_{t_k,T})\Vert _{2,k}\le C_{k,4} \prod _{i=1}^{k-1} \alpha ^{\frac{\delta }{(1+\delta )(k-1)}}\big (t_{(i+1)}-t_{(i)}\big ),\end{aligned}$$

where \(\big (t_{(1)},\dots ,t_{(k)}\big )\) denotes the order statistic of \((t_1,\dots ,t_k)\). The latter expression is symmetric in its arguments, and thus, we have, for any \(t_k\in \mathbb {Z}\),

$$\begin{aligned}&\sum _{t_1,\dots ,t_{k-1}=-\infty }^{\infty } \big \Vert {{\,\mathrm{cum}\,}}(X_{t_1,T},\dots ,X_{t_k,T})\big \Vert _{2,k}\\&\quad \le C_{k,4} \sum _{t_1,\dots ,t_{k-1}=-\infty }^{\infty } \prod _{i=1}^{k-1} \alpha ^{\frac{\delta }{(1+\delta )(k-1)}}\big (t_{(i+1)}-t_{(i)}\big )\\&\quad \le C_{k,4} (k-1)! \sum _{-\infty< t_1\le \dots \le t_{k-1}< \infty } \prod _{i=1}^{k-1} \alpha ^{\frac{\delta }{(1+\delta )(k-1)}}(t_{i+1}-t_{i})\\&\quad \le C_{k,4} (k-1)! \sum _{-\infty< t_2\le \dots \le t_{k-1}< \infty } \sum _{s_1=-\infty }^{\infty } \alpha ^{\frac{\delta }{(1+\delta )(k-1)}}(s_1) \prod _{i=2}^{k-1} \alpha ^{\frac{\delta }{(1+\delta )(k-1)}}(t_{i+1}-t_{i}). \end{aligned}$$

By assumption, \(\{(X_{t,T})_{t\in \mathbb {Z}}: T\in \mathbb {N}\}\) is exponentially strongly mixing, so the inner sum is finite and can be bounded by some constant \(C_{k,2}\). Thus,

$$\begin{aligned}&\sum _{t_1,\ldots ,t_{k-1}=-\infty }^{\infty } \big \Vert {{\,\mathrm{cum}\,}}(X_{t_1,T},\dots ,X_{t_k,T})\big \Vert _{2,k}\\&\quad \le C_{k,4} (k-1)! C_{k,2} \sum _{-\infty< t_2\le \ldots \le t_{k-1}< \infty }\prod _{i=1}^{k-1} \alpha ^{\frac{\delta }{(1+\delta )(k-1)}}(t_{i+1}-t_{i}). \end{aligned}$$

Repeating this argument successively, we finally obtain (14) as asserted. \(\square \)

Proof of Theorem 2

By Slutsky’s lemma and Theorem C.3 of the supplementary material, it is sufficient to prove that

$$\begin{aligned} \big (\hat{\mathbb {B}}_{T}^{(1)}-\mathbb {B}_{T}^{(1)},\dots ,\hat{\mathbb {B}}_{T}^{(K)}-\mathbb {B}_{T}^{(K)}\big ) =o_\mathbb {P}(1) \end{aligned}$$

in \(\{ L^2([0,1]^2)\times \{L^2([0,1]^3)\}^{H+1}\}^{K}\), as T tends to infinity. This in turn is equivalent to

$$\begin{aligned} \big (\Vert \hat{B}_{T}^{(k)}-\tilde{B}_T^{(k)}\Vert _{2,2},\Vert \hat{B}_{T,0}^{(k)}-\tilde{B}_{T,0}^{(k)}\Vert _{2,3},\dots ,\Vert \hat{B}_{T,H}^{(k)}-\tilde{B}_{T,H}^{(k)}\Vert _{2,3}\big )_{k=1,\dots ,K} =o_\mathbb {P}(1) \end{aligned}$$

in \(\mathbb {R}^{K(H+2)}\). The last convergence holds true if and only if the coordinates converge, i.e., if \(\Vert \hat{B}_{T}^{\scriptscriptstyle (k)}-\tilde{B}_{T}^{\scriptscriptstyle (k)}\Vert _{2,2}=o_\mathbb {P}(1)\) and \(\Vert \hat{B}_{T,h}^{\scriptscriptstyle (k)}-\tilde{B}_{T,h}^{\scriptscriptstyle (k)}\Vert _{2,3}=o_\mathbb {P}(1)\), for all \(k=1,\dots ,K\) and \(h=0,\dots ,H\). We only consider the latter assertion (the former can be treated similarly) and, in fact, we will show convergence in \(L^2(\Omega , \mathbb {P})\), which is even stronger. For this purpose, observe that, by Fubini’s theorem and the independence of the family \((R_i^{\scriptscriptstyle (k)})_{i\in \mathbb {N}}\),

$$\begin{aligned}&\mathbb {E}\Vert \hat{B}_{T,h}^{(k)}-\tilde{B}_{T,h}^{(k)}\Vert _{2,3}^2\\&\quad = \mathbb {E}\Bigg [\int _{[0,1]^3}\frac{1}{mT}\bigg \{\sum _{i=1}^{\lfloor uT\rfloor \wedge (T-h)} R_i^{(k)} \sum _{t=i}^{(i+m-1)\wedge (T-h)}\big (\mu _{t,T,h}(\tau _1, \tau _2)- \hat{\mu }_{t,T,h}(\tau _1, \tau _2)\big ) \bigg \}^2 {\,\mathrm {d}}(u,\tau _1,\tau _2)\Bigg ]\\&\quad = \frac{1}{mT}\int _{[0,1]^3}\sum _{i=1}^{\lfloor uT\rfloor \wedge (T-h)} \mathbb {E}\Bigg [\bigg \{\sum _{t=i}^{(i+m-1)\wedge (T-h)} \big (A_{t,1} + A_{t,2}\big )\bigg \}^2\Bigg ]{\,\mathrm {d}}(u,\tau _1,\tau _2), \end{aligned}$$

where

$$\begin{aligned} A_{t,1}(\tau _1,\tau _2)= \textstyle \frac{1}{\tilde{n}_{t,h}}\sum _{k=\underline{n}_t}^{\bar{n}_{t,h}}\big \{\mathbb {E}[X_{t,T}(\tau _1)X_{t+h,T}(\tau _2)]-\mathbb {E}[X_{t+k,T}(\tau _1)X_{t+k+h,T}(\tau _2)]\big \} \end{aligned}$$

and

$$\begin{aligned} A_{t,2}(\tau _1,\tau _2)= \textstyle \frac{1}{\tilde{n}_{t,h}}\sum _{k=\underline{n}_t}^{\bar{n}_{t,h}}\big \{X_{t+k,T}(\tau _1)X_{t+k+h,T}(\tau _2)-\mathbb {E}[X_{t+k,T}(\tau _1)X_{t+k+h,T}(\tau _2)]\big \} \end{aligned}$$

Since \(A_{t,1}\) is deterministic and since \(A_{t,2}\) is centered, we can rewrite the expectation in the previous integral as

$$\begin{aligned}&\mathbb {E}\left[ \left\{ \sum _{t=i}^{(i+m-1)\wedge (T-h)}A_{t,1}(\tau _1,\tau _2)+A_{t,2}(\tau _1,\tau _2) \right\} ^2\right] \\&\quad =\left( \sum _{t=i}^{(i+m-1)\wedge (T-h)}A_{t,1}(\tau _1,\tau _2) \right) ^2+\mathbb {E}\left[ \left( \sum _{t=i}^{(i+m-1)\wedge (T-h)}A_{t,2}(\tau _1,\tau _2)\right) ^2\right] . \end{aligned}$$

In the following, we bound both parts separately. For the term \(A_{t,1}\), first note that, by stationarity of \((X_t^{\scriptscriptstyle (u)})_{t\in \mathbb {Z}}\),

$$\begin{aligned}&\mathbb {E}[X_{t,T}(\tau _1) X_{t+h,T}(\tau _2)]-\mathbb {E}[X_{t+k,T}(\tau _1)X_{t+k+h,T}(\tau _2)]\\&\quad = \mathbb {E}[X_{t,T}(\tau _1) X_{t+h,T}(\tau _2)-X_t^{(t/T)}(\tau _1) X_{t+h}^{(t/T)}(\tau _2)]\\&\qquad \phantom {=}-\mathbb {E}[X_{t+k,T}(\tau _1) X_{t+k+h,T}(\tau _2)-X_{t+k}^{(t/T)}(\tau _1) X_{t+k+h}^{(t/T)}(\tau _2)] \end{aligned}$$

in \(L^2([0,1]^2)\). Thus, by Jensen’s inequality and Fubini’s theorem, we have

$$\begin{aligned}&\ \frac{1}{mT}\int _{[0,1]^3}\sum _{i=1}^{\lfloor uT\rfloor \wedge (T-h)} \left( \sum _{t=i}^{(i+m-1) \wedge (T-h)}A_{t,1}\right) ^2 {\,\mathrm {d}}(u,\tau _1,\tau _2)\\&\quad \le \ \frac{1}{mT}\int _{[0,1]^3}\sum _{i=1}^{\lfloor uT\rfloor \wedge (T-h)} \mathbb {E}\left[ \left\{ \sum _{t=i}^{(i+m-1)\wedge (T-h)}\frac{1}{\tilde{n}_{t,h}} \sum _{k=\underline{n}_t}^{\bar{n}_{t,h}}X_{t,T}(\tau _1)X_{t+h,T}(\tau _2) {-}X_t^{(t/T)}(\tau _1)X_{t+h}^{(t/T)}(\tau _2)\right. \right. \\&\qquad \left. \left. -X_{t+k,T}(\tau _1)X_{t+k+h,T}(\tau _2)+X_{t+k}^{(t/T)}(\tau _1)X_{t+k+h}^{(t/T)}(\tau _2) \right\} ^2\right] {\,\mathrm {d}}(u,\tau _1,\tau _2)\\&\quad \le \ \frac{1}{mT}\sum _{i=1}^{T-h}\mathbb {E}\left\| \sum _{t=i}^{(i+m-1)\wedge (T-h)} \frac{1}{\tilde{n}_{t,h}}\sum _{k=\underline{n}_t}^{\bar{n}_{t,h}} X_{t,T} \otimes X_{t+h,T}-X_t^{(t/T)}\otimes X_{t+h}^{(t/T)}\right. \\&\qquad \left. -X_{t+k,T}\otimes X_{t+k+h,T}+X_{t+k}^{(t/T)}\otimes X_{t+k+h}^{(t/T)}\right\| _{2,2}^2. \end{aligned}$$

The norm on the right-hand side of the previous inequality can be bounded, by the triangle inequality, by

$$\begin{aligned}&\sum _{t=i}^{(i+m-1)\wedge (T-h)}\frac{1}{\tilde{n}_{t,h}}\sum _{k=\underline{n}_t}^{\bar{n}_{t,h}} \big \{ \Vert X_{t,T}\otimes X_{t+h,T}-X_t^{(t/T)}\otimes X_{t+h}^{(t/T)}\Vert _{2,2}\\&\quad +\Vert X_{t+k,T}\otimes X_{t+k+h,T}-X_{t+k}^{(t/T)}\otimes X_{t+k+h}^{(t/T)}\Vert _{2,2}\big \} \end{aligned}$$

and the inner summands can be bounded due to the local stationarity of \((X_{t,T})\): first,

$$\begin{aligned}&\Vert X_{t,T} \, \otimes X_{t+h,T}-X_t^{(t/T)}\otimes X_{t+h}^{(t/T)}\Vert _{2,2} \\&\quad \le \ \Vert X_{t,T}\otimes (X_{t+h,T}-X_{t+h}^{(t/T)})\Vert _{2,2}+\Vert X_{t+h}^{(t/T)}\otimes (X_{t,T}-X_t^{(t/T)})\Vert _{2,2}\\&\quad = \Vert X_{t,T}\Vert _2\Vert X_{t+h,T}-X_{t+h}^{(t/T)}\Vert _2+\Vert X_{t+h}^{(t/T)}\Vert _2\Vert X_{t,T}-X_t^{(t/T)}\Vert _2\\&\quad \le T^{-1}\big \{(h+1)\Vert X_{t,T}\Vert _2+\Vert X_{t+h}^{(t/T)}\Vert _2\big \}P_{t,T}^{(t/T)} \end{aligned}$$

and similarly

$$\begin{aligned}&\,\Vert X_{t+k,T} \, \otimes X_{t+k+h,T}-X_{t+k}^{(t/T)}\otimes X_{t+k+h}^{(t/T)}\Vert _{2,2} \\&\quad \le T^{-1}\big \{(|k+h|+1)\Vert X_{t+k,T}\Vert _2+(|k|+1)\Vert X_{t+k+h}^{(t/T)}\Vert _2\big \}P_{t,T}^{(t/T)}. \end{aligned}$$

Combining these bounds, we obtain that

$$\begin{aligned}&\ \frac{1}{mT}\int _{[0,1]^3}\sum _{i=1}^{\lfloor uT\rfloor \wedge (T-h)} \left( \sum _{t=i}^{(i+m-1)\wedge (T-h)}A_{t,1}(\tau _1,\tau _2)\right) ^2 {\,\mathrm {d}}(u,\tau _1,\tau _2)\\&\quad \le \frac{1}{mT}\sum _{i=1}^{T-h}\sum _{t,t'=i}^{(i+m-1)\wedge (T-h)}\frac{1}{\tilde{n}_{t,h} \tilde{n}_{t',h}} \sum _{k=\underline{n}_t}^{\bar{n}_{t,h}}\sum _{k'=\underline{n}_{t'}}^{\bar{n}_{t',h}}\frac{1}{T^2}\mathbb {E}\Big [(|k|+h+1)(|k'|+h+1)P_{t,T}^{(t/T)}P_{t',T}^{(t'/T)}\\&\ \qquad \times \big (\Vert X_{t,T}\Vert _2+\Vert X_{t+h}^{(t/T)} \Vert _2+\Vert X_{t+k,T}\Vert _2+\Vert X_{t+k+h}^{(t/T)}\Vert _2\big )\\&\ \qquad \times \big (\Vert X_{t',T}\Vert _2+\Vert X_{t'+h}^{(t'/T)} \Vert _2+\Vert X_{t'+k',T}\Vert _2+\Vert X_{t'+k'+h}^{(t'/T)}\Vert _2\big )\Big ]\\&\quad \le \frac{C}{mT^3}\sum _{i=1}^{T-h}\sum _{t,t'=i}^{(i+m-1) \wedge (T-h)}\frac{1}{\tilde{n}_{t,h} \tilde{n}_{t',h}} \sum _{k=\underline{n}_t}^{\bar{n}_{t,h}}\sum _{k'=\underline{n}_{t'}}^{\bar{n}_{t',h}} (|k|+h+1)(|k'|+h+1)=O\Big (\frac{mn^2}{T^2}\Big ), \end{aligned}$$

which converges to zero by Assumption (B2).

For the term \(A_{t,2}\), first observe that, by Jensen’s inequality for convex functions,

$$\begin{aligned}&\ \mathbb {E}\bigg [\bigg (\sum _{t=i}^{(i+m-1)\wedge (T-h)}A_{t,2}(\tau _1,\tau _2)\bigg )^2\bigg ]\\&\quad \le m\sum _{t=i}^{(i+m-1)\wedge (T-h)}\frac{1}{\tilde{n}_{t,h}^2}\sum _{k,k'=\underline{n}_t}^{\bar{n}_{t,h}}\text {Cov}\big \{ X_{t+k,T}(\tau _1)X_{t+k+h,T}(\tau _2),X_{t+k',T}(\tau _1)X_{t+k'+h,T}(\tau _2)\big \}. \end{aligned}$$

By the same arguments as in the proof of Proposition C.7 of the supplementary material and Assumption (A3), one can see that the right-hand side of the inequality

$$\begin{aligned}&\frac{1}{mT}\sum _{i=1}^{T-h} \int _{[0,1]^2} \mathbb {E}\bigg [\bigg (\sum _{t=i}^{(i+m-1)\wedge (T-h)}A_{t,2}(\tau _1,\tau _2)\bigg )^2\bigg ] {\,\mathrm {d}}(\tau _1,\tau _2)\\&\quad \le \frac{1}{T}\sum _{i=1}^{T-h} \sum _{t=i}^{(i+m-1)\wedge (T-h)} \frac{1}{\tilde{n}_{t,h}^2} \sum _{k,k'=\underline{n}_t}^{\bar{n}_{t,h}} \Vert \text {Cov}(X_{t+k,T}\otimes X_{t+k+h,T}, X_{t+k',T} \otimes X_{t+k'+h,T}) \Vert _{1,2} \end{aligned}$$

is of order \(O(m/n)\). The assertion follows since \(m/n=o(1)\) by Assumption (B2).

\(\square \)

Proof of Proposition 1

The cumulative distribution function of the \((h+2)\)th coordinate of \(\varvec{S}\) is continuous by Theorem 7.5 of Davydov and Lifshits (1985). The assertion under the null hypothesis follows from Lemma 4.1 in Bücher and Kojadinovic (2017). Consistency follows from the fact that the bootstrap quantiles are stochastically bounded by Theorem 2, whereas the test statistic diverges by Corollary 1. \(\square \)