1 Introduction

Since the seminal work of Efron (1979), bootstrapping has been established as a major tool for estimating unknown finite sample distributions of general statistics. Among others, this method has successfully been applied to construct confidence intervals for sample quantiles of continuous distributions; see e.g. Serfling (2002, Chapter 2.6), Sun and Lahiri (2006) and Sharipov and Wendler (2013) and references therein. In this case, the asymptotic behavior of quantile estimators is well understood: based on the well-known Bahadur representation, a CLT can be established for sample quantiles whenever the underlying distribution has a differentiable cumulative distribution function (cdf) and a positive density at the quantile level of interest. This allows for the application of classical results on the bootstrap to mimic the unknown finite sample distribution.

Quantile estimation has many practical applications for discrete-valued data, too. For instance, Chen and Lazar (2010) use it to analyze epileptic seizure count data. Moreover, it plays a central role in survey analysis, e.g. to report the median age at first marriage or the median customer satisfaction, where the latter is typically categorical data. For an overview of results on bootstrapping sample quantiles in the context of survey data, we refer to Shao and Chen (1998). However, the results obtained there rely on certain smoothness assumptions on the underlying distribution, which generally do not hold true for discrete data. In supply chain management, especially for sporadic demand, quantile estimation is required to develop inventory policies that lead to a prescribed \(\alpha \)-service level in the sense of Tempelmeier (2000, Sect. 2.1). Confidence intervals can then be used to quantify the uncertainty of these estimates.

However, if the underlying distribution is discrete, this task is much more delicate than in the continuous case. In general, sample quantiles may not even be consistent for the population quantiles. This issue arises because the cdf is a step function: if the level of the quantile of interest lies in the image of the cdf, the sample quantile is inconsistent and, consequently, CLTs no longer hold true. Before we illustrate this inconsistency with the help of a simple, but very insightful toy example below, we first fix some notation that is used throughout this paper. Let \(Q_p\) for \(p\in (0,1)\) be the usual population \(p\)-quantile of a probability distribution with cdf \(F\), defined via its generalized inverse, i.e.

$$\begin{aligned} Q_p=F^{-1}(p)=\inf \left\{ t:F(t)\ge p\right\} . \end{aligned}$$
(1)

With observations \(X_1,\ldots ,X_n\) at hand, the sample \(p\)-quantile \(\widehat{Q}_p\) is defined as the empirical counterpart to (1), that is,

$$\begin{aligned} \widehat{Q}_p=\widehat{F}_n^{-1}(p)=\inf \{t:\widehat{F}_n(t)\ge p\}, \end{aligned}$$
(2)

where \(\widehat{F}_n(x)=n^{-1}\sum _{i=1}^n 1(X_i\le x)\) denotes the empirical distribution function. Here and in the sequel, \(\lceil x\rceil \) (\(\lfloor x\rfloor \)) denotes the smallest (largest) integer that is larger (smaller) than or equal to \(x\).
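
For illustration, definition (2) can be evaluated directly. The following minimal Python sketch (the function name is ours) makes the generalized-inverse construction explicit: since \(\widehat{F}_n\) jumps at the order statistics, the infimum in (2) is attained at the \(\lceil np\rceil \)-th order statistic.

```python
import numpy as np

def sample_quantile(x, p):
    """Sample p-quantile as in (2): the smallest t with F_n(t) >= p."""
    xs = np.sort(np.asarray(x))
    n = len(xs)
    # F_n equals k/n at the k-th order statistic, so the infimum in (2)
    # is attained at the ceil(n*p)-th order statistic.
    return xs[int(np.ceil(n * p)) - 1]

# Example: sample median of a small 0/1 sample
print(sample_quantile([0, 1, 1, 0, 1], 0.5))  # prints 1
```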

Toy example: coin flip data Suppose a coin is flipped independently \(n\) times and we observe a sequence \(X_1,\ldots ,X_n\) of zeros and ones such that \(P(X_i=0)={\theta }=1-P(X_i=1)\) for some \({\theta }\in (0,1)\). Let \(X_\mathrm{med}=Q_{0.5}\) and \(\widehat{X}_\mathrm{med}=\widehat{Q}_{0.5}\) denote the population median and the sample median, respectively. This leads to

$$\begin{aligned} P(\widehat{X}_\mathrm{med}=0)=\sum _{k=\lceil \frac{n}{2}\rceil }^n\left( {\begin{array}{c}n\\ k\end{array}}\right) {\theta }^k(1-{\theta })^{n-k}. \end{aligned}$$
(3)

If the coin is fair, i.e. \({\theta }=1/2\), we have \(X_\mathrm{med}=0\) and, by symmetry, we get

$$\begin{aligned} P(\widehat{X}_\mathrm{med}=0)= {\left\{ \begin{array}{ll} \frac{1}{2}, & n\text { odd},\\ \frac{1}{2}+\left( {\begin{array}{c}n\\ n/2\end{array}}\right) \left( \frac{1}{2}\right) ^{n+1}, & n\text { even}. \end{array}\right. } \end{aligned}$$
(4)

From Stirling’s formula (see e.g. Krantz 1991, Theorem 10.23), we get \(\left( {\begin{array}{c}n\\ n/2\end{array}}\right) \left( \frac{1}{2}\right) ^{n+1}=O(n^{-1/2})\), which leads to

$$\begin{aligned} P(\widehat{X}_\mathrm{med}=0)=1-P(\widehat{X}_\mathrm{med}=1)\rightarrow \frac{1}{2} \end{aligned}$$
(5)

as \(n\rightarrow \infty \). In particular, \(\widehat{X}_\mathrm{med}\mathop {\longrightarrow }\limits ^{P}0\) fails, i.e. the sample median is not a consistent estimator of the population median, and its limiting distribution is an equally-weighted 2-point distribution.
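
The convergence in (5) is easy to check numerically from the exact representation (3). A minimal sketch (the sample sizes are illustrative):

```python
from math import comb, ceil

def prob_median_zero(n, theta=0.5):
    """P(sample median = 0) for n i.i.d. coin flips with P(X=0) = theta,
    i.e. the binomial tail sum in (3)."""
    return sum(comb(n, k) * theta**k * (1 - theta)**(n - k)
               for k in range(ceil(n / 2), n + 1))

for n in (10, 100, 1000):
    print(n, prob_median_zero(n))
# the excess over 1/2 decays like n^(-1/2), in line with (4) and (5)
```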

In this paper, as a first result, we show that one consequence of the estimation inconsistency illustrated in (5) is that the classical bootstrap of Efron for i.i.d. data is inconsistent for sample quantiles whenever these do not consistently estimate the true quantile. More precisely, we prove that the Kolmogorov–Smirnov distance between the cdfs and their bootstrap analogues does not converge to zero, but to non-degenerate random variables. In the special case of the sample median for the fair coin flip discussed in the example below and in Theorem 1 in Sect. 2.1, these limits turn out to be functions of a random variable \(U\sim \mathrm{Unif}(0,1)\). To the authors’ knowledge, such a phenomenon has not been observed in the bootstrap literature so far.

Toy example: bootstrapping coin flip data Let \(X_1^*,\ldots ,X_n^*\) be i.i.d. (Efron) bootstrap replicates of \(X_1,\ldots ,X_n\) and let \(\widehat{X}_\mathrm{med}^*\) denote the bootstrap sample median based on the bootstrap observations. Then, we have analogously to (3)

$$\begin{aligned} P^*(\widehat{X}_\mathrm{med}^*=0)=\sum _{k=\lceil \frac{n}{2}\rceil }^n\left( {\begin{array}{c}n\\ k\end{array}}\right) {\widehat{\theta }_n^k}(1-{\widehat{\theta }_n})^{n-k}, \end{aligned}$$
(6)

where \({\widehat{\theta }_n}=n^{-1}\sum _{t=1}^n 1(X_t=0)\) and \(P^*\) denotes as usual the bootstrap distribution (conditional on \(X_{1},\ldots ,X_{n}\)). In Theorem 1 below, we show that

$$\begin{aligned} P^*(\widehat{X}_\mathrm{med}^*=0)=1-P^*(\widehat{X}_\mathrm{med}^*=1) \mathop {\longrightarrow }\limits ^{{\mathcal D}}U\sim \mathrm{Unif}(0,1). \end{aligned}$$
(7)

By combining the result in (7) with (5), we get inconsistency of Efron’s bootstrap; see Theorem 1 below for details.
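
The limit (7) can be visualized by Monte Carlo simulation: generate \({\widehat{\theta }}_n\), evaluate (6) exactly, and inspect the empirical distribution of the resulting values. A sketch (sample size and number of replications are illustrative):

```python
import numpy as np
from math import comb, ceil

rng = np.random.default_rng(1)
n, reps = 1000, 2000
ks = range(ceil(n / 2), n + 1)
combs = [comb(n, k) for k in ks]     # binomial coefficients, precomputed once

vals = []
for _ in range(reps):
    th = rng.binomial(n, 0.5) / n    # theta_hat_n: observed fraction of zeros
    # P*(bootstrap median = 0): the exact binomial tail sum (6)
    vals.append(sum(c * th**k * (1 - th)**(n - k) for c, k in zip(combs, ks)))

# for U ~ Unif(0,1), the ten deciles should be roughly equally filled
print(np.histogram(vals, bins=10, range=(0, 1))[0])
```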

In view of the results displayed in the toy example, it is worth noting that, more generally, the population \(p\)-quantile \(Q_p\) may be defined as any real number \(q\) that satisfies the two inequalities

$$\begin{aligned} P(X\le q)\ge p \quad \text { and } \quad P(X\ge q)\ge 1-p, \end{aligned}$$
(8)

where \(X\sim F\), i.e. definition (1) corresponds to the smallest possible value of \(q\) in (8). In particular, it is not unusual to define the median \(X_\mathrm{med}\) as the center of the smallest and largest possible values satisfying (8), with the sample median \(\widehat{X}_\mathrm{med}\) defined in direct analogy. However, this choice does not affect the inconsistency results above at all, and we prefer the definitions via (1) and (2) for two reasons. Firstly, they fit naturally into the more general notation of the (generalized) inverse of the cdf and, secondly, the (sample) median then takes values in the support of \(P^{{X}}\) only.

Still, one would like to establish consistent bootstrap results not only in the continuous setting, but also for general discrete distributions. As the use of ordinary quantiles in discrete settings is debatable, we provide in this paper two different and complementary strategies to tackle the bootstrap inconsistency for sample quantiles in the discrete setup illustrated in the toy example above.

In the first part of this paper, we investigate whether the \(m\)-out-of-\(n\) bootstrap (or low-intensity bootstrap) leads to asymptotically consistent bootstrap approximations. In several contexts where the classical bootstrap fails, this modified bootstrap scheme is known to be a valid alternative; see e.g. Swanepoel (1986), Angus (1993), Deheuvels et al. (1993), Athreya and Fukuchi (1994, 1997) and Del Barrio et al. (2013) among others. We prove that the i.i.d. \(m\)-out-of-\(n\) bootstrap is consistent for sample quantiles without centering in the i.i.d. discrete data case, but also that the inconsistency of Efron's bootstrap remains if the procedure is applied with centering. These differing results may seem odd at first sight, but they can be explained by systematically different centering schemes. Another somewhat surprising result is that, on the one hand, bootstrap consistency can be achieved for i.i.d. data as well as for dependent time series data with one and the same i.i.d. \(m\)-out-of-\(n\) bootstrap procedure (without centering) as long as only single sample quantiles are considered. On the other hand, an \(m\)-out-of-\(n\) block bootstrap procedure à la Athreya et al. (1999) has to be used to mimic correctly the joint limiting distribution of several sample quantiles in the time series case. To establish this theory, we derive the joint limiting distribution of vectors of sample quantiles for weakly dependent time series processes, which might be of independent interest.

The consistency results achieved for the \(m\)-out-of-\(n\) bootstrap are then applied to construct bootstrap confidence intervals. As these tend to be conservative due to the discreteness of the true distribution, we propose randomization techniques, similar to the construction of randomized tests (e.g. classical Neyman–Pearson tests), to obtain bootstrap confidence intervals of asymptotically correct size.

All aforementioned difficulties related to discrete distributions stem mainly from the jumps of the distribution function, which lead to many quantile levels sharing the same quantile value. An alternative view on quantiles of discretely distributed data is provided by the so-called mid-distribution function proposed by Parzen (1997, 2004). This concept has been further studied in Ma et al. (2011) and has been applied successfully, e.g. to probabilistic index models in Thas et al. (2006). The corresponding mid-quantile function is a continuous, piecewise linear modification of the ordinary quantile function.

In the second part of this paper, we make use of mid-quantiles. Although mid-quantiles no longer take values in the support of the discrete distribution only, they allow for a meaningful interpretation in many relevant situations. For example, compare two (small) samples stemming from coin flip scenarios. Both sample medians may equal 0, but this conveys little information, since the samples may differ widely: say, in the first sample five out of nine flips come up heads (coded as 0), whereas in the second sample eight out of nine are heads. It is much more informative to consider the empirical proportions of heads and tails within each sample to describe the underlying distributions and to reflect possible differences. Based on such considerations, Parzen (1997, 2004) established the concept of mid-distribution functions to handle sample medians more adequately. Contrary to ordinary quantiles, it turns out that mid-quantiles can be estimated consistently. Moreover, (non-)central limit theorems for sample mid-quantiles can be derived, where the limiting distributions crucially depend on whether or not the mid-distribution function is differentiable at the quantile of interest.

First, we generalize the limiting results obtained in Ma et al. (2011) to the time series case under a so-called \(\tau \)-weak dependence condition introduced by Dedecker and Prieur (2005). This extension is motivated by a growing literature on modeling of and statistical inference for count data, which appear, e.g. as transaction numbers of financial assets or in biology, where the evolution of infection numbers over time is of great interest; see for instance Fokianos et al. (2009) and Ferland et al. (2006). In particular, the theory provided in this paper covers parameter-driven integer-valued autoregressive (INAR) models but also observation-driven integer-valued GARCH (INGARCH) models. By construction, the mid-quantile function is continuous, but it fails to be differentiable at certain points. Due to this non-smoothness, it turns out that classical i.i.d. or block bootstrap methods do not lead to completely satisfactory results, and \(m\)-out-of-\(n\) variants are required here as well. Moreover, due to boundary effects, randomization techniques still have to be used to construct confidence intervals of asymptotically correct level \(1-\alpha \) for \(\alpha \in (0,1)\).

The rest of the paper is organized as follows. Section 2 focuses on bootstrapping classical quantiles. In Sect. 2.1 we show inconsistency of Efron's bootstrap in the special case of the fair coin flip. Afterwards, in Sect. 2.2, we discuss the validity of low-intensity bootstrap methods for quantiles in a much more general framework that covers a large class of discretely distributed time series. In Sect. 2.3 randomization techniques for the construction of confidence sets are provided, before the finite sample behavior of our methods is illustrated in Sect. 2.4. In Sect. 3 we consider the alternative concept of mid-quantiles. In Sect. 3.1 we generalize the asymptotic results established in Ma et al. (2011) for the i.i.d. case to weakly dependent time series data. Bootstrap validity is discussed in Sect. 3.2 and, based on these results, confidence intervals for mid-quantiles are provided in Sect. 3.3. Numerical experiments are reported in Sect. 3.4. Finally, both concepts are discussed in a comparative conclusion. All proofs and auxiliary results are deferred to a final section of the paper.

2 Bootstrapping sample quantiles

2.1 Inconsistency of Efron’s bootstrap

In this section, we prove for the simple example of a fair coin flip and the sample median that Efron's bootstrap is, in general, not capable of consistently estimating the limiting distribution of sample quantiles from discretely distributed data. To check for bootstrap consistency, we make use of the Kolmogorov–Smirnov distance and show that neither

$$\begin{aligned} d_{KS}(\widehat{X}_\mathrm{med}^*,\widehat{X}_\mathrm{med})=\sup _{x\in \mathbb {R}}\left| P^*(\widehat{X}_\mathrm{med}^*\le x)-P(\widehat{X}_\mathrm{med}\le x)\right| \end{aligned}$$
(9)

(without centering) nor

$$\begin{aligned}&d_{KS}(\widehat{X}_\mathrm{med}^*-\widehat{X}_\mathrm{med},\widehat{X}_\mathrm{med}-X_\mathrm{med})\nonumber \\&\quad =\sup _{x\in \mathbb {R}}\left| P^*(\widehat{X}_\mathrm{med}^*-\widehat{X}_\mathrm{med}\le x)-P(\widehat{X}_\mathrm{med}-X_\mathrm{med}\le x)\right| \end{aligned}$$
(10)

(with centering) converges to zero as the sample size increases; instead, both converge to non-degenerate distributions, which turn out to differ in these two cases. Dealing with the non-centered case (9) first, and since \(X_i\in \{0,1\}\) in the coin flip example, it suffices to consider

$$\begin{aligned} \sup _{x\in [0,1)}\left| P^*(\widehat{X}_\mathrm{med}^*\le x)-P(\widehat{X}_\mathrm{med}\le x)\right| =\left| P^*(\widehat{X}_\mathrm{med}^*=0)-P(\widehat{X}_\mathrm{med}=0)\right| , \qquad \end{aligned}$$
(11)

because \(|P^*(\widehat{X}_\mathrm{med}^*\le x)-P(\widehat{X}_\mathrm{med}\le x)|=0\) holds for all \(x\notin [0,1)\). Further, we know that \(P(\widehat{X}_\mathrm{med}=0)\rightarrow 1/2\) as \(n\rightarrow \infty \) by (5), such that we have to investigate

$$\begin{aligned} P^*(\widehat{X}_\mathrm{med}^*=0)=\sum _{k=\lceil \frac{n}{2}\rceil }^n\left( {\begin{array}{c}n\\ k\end{array}}\right) {\widehat{\theta }}_n^k(1-{\widehat{\theta }}_n)^{n-k} \end{aligned}$$
(12)

in more detail. For the case with centering (10), things become slightly different and it suffices to consider

$$\begin{aligned} \sup _{k\in \{-1,0\}}\left| P^*(\widehat{X}_\mathrm{med}^*-\widehat{X}_\mathrm{med}\le k)-P(\widehat{X}_\mathrm{med}-X_\mathrm{med}\le k)\right| \end{aligned}$$
(13)

in this case. Precisely, we get the following results.

Theorem 1

(Inconsistency of Efron’s bootstrap) For independent and fair (\({\theta }=0.5\)) coin flip random variables \(X_1,\ldots ,X_n\) and i.i.d. (Efron) bootstrap replicates \(X_1^*,\ldots ,X_n^*\), it holds

$$\begin{aligned} P^*(\widehat{X}_\mathrm{med}^*=0)=\sum _{k=\lceil \frac{n}{2}\rceil }^n\left( {\begin{array}{c}n\\ k\end{array}}\right) {\widehat{\theta }}_n^k(1- {\widehat{\theta }}_n)^{n-k}\overset{\mathcal {D}}{\longrightarrow }U\sim \mathrm{Unif}(0,1). \end{aligned}$$
(14)

This leads to:

  (i)

    For Efron’s bootstrap without centering, it holds

    $$\begin{aligned} d_{KS}(\widehat{X}_\mathrm{med}^*,\widehat{X}_\mathrm{med})\overset{\mathcal {D}}{\longrightarrow }\left| U-\frac{1}{2}\right| \sim \mathrm{Unif}(0,1/2). \end{aligned}$$
    (15)
  (ii)

    For Efron’s bootstrap with centering, it holds

    $$\begin{aligned}&d_{KS}(\widehat{X}_\mathrm{med}^*-\widehat{X}_\mathrm{med},\widehat{X}_\mathrm{med}-X_\mathrm{med})\nonumber \\&\quad \overset{\mathcal {D}}{\longrightarrow }1\left( \frac{1}{2}\le U\right) U+1\left( \frac{1}{2}>U\right) -\frac{1}{2}=:S, \end{aligned}$$
    (16)

    where the cdf of \(S\) is given by

    $$\begin{aligned} F_{S}(x)=x1_{[0,\frac{1}{2})}(x)+1_{[\frac{1}{2},\infty )}(x). \end{aligned}$$

2.2 The \(m\)-out-of-\(n\) bootstrap

2.2.1 Coin flip data

Of course, there are other situations discussed in the literature where Efron's ordinary bootstrap fails; see Bickel and Freedman (1981, Section 6), Mammen (1992) and Horowitz (2001) and references therein. The most prominent example is the maximum of i.i.d. random variables \(X_1,\ldots ,X_n\), that is, \(M_n=\max (X_1,\ldots ,X_n)\). In this case, bootstrap inconsistency of \(M_n^*=\max (X_1^*,\ldots ,X_n^*)\) has been investigated in Angus (1993). To circumvent this problem and in view of the well-known limiting result [cf. Resnick (1987), Chapter 1]

$$\begin{aligned} P(a_n^{-1}(M_n-b_n)\le x)\mathop {\longrightarrow }\limits _{n\rightarrow \infty }G(x)\quad \forall x\in \mathbb {R}\end{aligned}$$

for suitable distributions \(P^{X_1}\), sequences \((a_n)_n\) and \((b_n)_n\) and a non-degenerate cdf \(G\), Swanepoel (1986), Deheuvels et al. (1993) and Athreya and Fukuchi (1994, 1997) used the low-intensity \(m\)-out-of-\(n\) bootstrap. That is, one draws with replacement \(m\) times, where \(m\rightarrow \infty \) such that \(m=o(n)\), to get \(X_1^*,\ldots ,X_m^*\), and mimics the distribution of \(a_n^{-1}(M_n-b_n)\) by that of \(a_m^{-1}(M_m^*-b_m)\). This approach has been generalized by Athreya et al. (1999) to time series data, where additionally a low-intensity block bootstrap has been proposed and investigated.

The situation addressed in this paper is somewhat comparable. A closer inspection of (3) and (6) leads to the conclusion that if we were allowed to replace \({\widehat{\theta }}_n\) by \({\theta }\) for asymptotic considerations, we would get the same limiting results. Obviously, in view of (5) and (7), this is not the case. However, as

$$\begin{aligned} \sqrt{n}({\widehat{\theta }}_n-{\theta })\overset{\mathcal {D}}{\rightarrow }\mathcal {N}\left( 0,{\theta (1-\theta )}\right) , \end{aligned}$$
(17)

the inconsistency stated in Theorem 1 for the coin flip can be explained by the fact that the convergence \({\widehat{\theta }}_n-{\theta }=O_P(n^{-1/2})\) is just “too slow”. Hence, the bootstrap is not able to mimic the underlying scenario correctly, since the latter differs completely between \({\theta }=1/2\) and \({\theta }\ne 1/2\): the limiting distribution is a non-degenerate 2-point distribution in the first case and degenerate in the second; compare Theorem 6 below and Fig. 1. Therefore, natural questions are whether an \(m\)-out-of-\(n\) bootstrap is capable of “speeding up” the convergence of \({\widehat{\theta }}_n\) (relative to the convergence of the empirical cdf on the bootstrap side) and whether this leads to bootstrap consistency. The following theorem summarizes our findings in this direction for the sample median without and with centering, corresponding to the results of Theorem 1.

Fig. 1

1000 realizations of \(\sup _{k}\left| P^*(\widehat{X}_\mathrm{med}^*\le k)-P(\widehat{X}_\mathrm{med}\le k)\right| \) of a coin flip for sample sizes \(n\in \{100,500,1000\}\) (from left to right) and for \({\theta }\in \{0.5,0.45\}\) (from top to bottom)

Theorem 2

(Consistency and inconsistency of the \(m\)-out-of-\(n\) bootstrap for the sample median) For independent and fair \(({\theta }=0.5)\) coin flip random variables \(X_1,\ldots ,X_n\), we draw i.i.d. bootstrap replicates \(X_1^*,\ldots ,X_m^*\). Suppose that \(m/n+1/m=o(1)\) as \(n\rightarrow \infty \) and denote the bootstrap sample median based on \(X_1^*,\dots , X_m^*\) by \(\widehat{X}^*_{m,\mathrm{med}}\).

  (i)

    For the \(m\)-out-of-\(n\) bootstrap without centering, it holds

    $$\begin{aligned} d_{KS}(\widehat{X}_{m,\mathrm{med}}^*,\widehat{X}_\mathrm{med})\mathop {\longrightarrow }\limits ^{P}0. \end{aligned}$$
  (ii)

    For the \(m\)-out-of-\(n\) bootstrap with centering, it holds

    $$\begin{aligned} d_{KS}(\widehat{X}_{m,\mathrm{med}}^*-\widehat{X}_\mathrm{med},\widehat{X}_\mathrm{med}-X_\mathrm{med})\overset{\mathcal {D}}{\longrightarrow }\frac{1}{2}1\left( U<\frac{1}{2}\right) =:\widetilde{S}, \end{aligned}$$

    where \(U\sim \mathrm{Unif}(0,1)\) such that \(2\widetilde{S}\sim \mathrm{Bin}(1,0.5)\) is Bernoulli-distributed.

Remark 3

The results of Theorem 2, stating consistency for the non-centered sample median but inconsistency for the centered version of the \(m\)-out-of-\(n\) bootstrap, seem surprising at first sight. However, a closer inspection of part (ii) explains this oddity by the fact that \(X_\mathrm{med}=0\), while \(\widehat{X}_\mathrm{med}\) and \(\widehat{X}^*_{m,\mathrm{med}}\) take the values \(0\) and \(1\) with limiting probability \(1/2\) each. Hence, the centering differs on the bootstrap and the non-bootstrap side of (ii). This effect is caused by the estimation inconsistency of the sample median.

Fig. 2

Histograms of \(\widehat{X}_{m,\mathrm{med}}^*\) based on i.i.d. bootstrap replicates \(X_1^*,\ldots ,X_m^*\) from fair coin flip data \(X_1,\ldots ,X_n\) for \(n=10000\) and \(m=n\) (first column) and \(m=n^{2/3}\) (second column)

In Fig. 2, the differing asymptotic behavior of \(\widehat{X}_{m,\mathrm{med}}^*\) for \(m=n\) and \(m=n^{2/3}\) is illustrated via histograms for coin flip data. In the first case, the asymptotic uniform distribution of \(P^*(\widehat{X}_{n,\mathrm{med}}^*=0)\) is reflected by the high variability of the histograms, whereas the probabilities appear much more balanced in the second case.
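
This experiment can be reproduced with a few lines of code by evaluating the \(m\)-out-of-\(n\) analogue of (6) for \(m=n\) and \(m=n^{2/3}\). A sketch, with the binomial tail sum computed on the log scale so that it remains numerically stable for large \(m\):

```python
import numpy as np
from math import lgamma, log, exp, ceil

rng = np.random.default_rng(2)

def boot_prob_median_zero(theta_hat, m):
    """P*(bootstrap median = 0) = P(Bin(m, theta_hat) >= ceil(m/2)),
    i.e. (6) with m resampled observations; summed on the log scale."""
    lp, l1p = log(theta_hat), log(1 - theta_hat)
    return sum(exp(lgamma(m + 1) - lgamma(k + 1) - lgamma(m - k + 1)
                   + k * lp + (m - k) * l1p)
               for k in range(ceil(m / 2), m + 1))

n = 10000
for _ in range(5):
    theta_hat = rng.binomial(n, 0.5) / n
    print(round(boot_prob_median_zero(theta_hat, n), 3),                # m = n
          round(boot_prob_median_zero(theta_hat, round(n**(2/3))), 3))  # m = n^(2/3)
# first column: highly variable across realizations; second: close to 1/2
```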

So far, we have considered only the toy example of i.i.d. fair coin flip random variables and the sample median, which may seem very restrictive at first sight. In the following, we turn to a much more general setup and show that its asymptotics follow immediately from the results established for the coin flip example. Consequently, the example turns out to be not that toyish at all.

2.2.2 General setup

We now turn to more general distributions than the Bernoulli distribution and suppose that \((X_t)_{t\in \mathbb {Z}}\) is a sequence of random variables that may exhibit a certain dependence structure. In the last decade, Poisson autoregressions [e.g. Ferland et al. (2006) and Fokianos et al. (2009)], INAR processes [e.g. McKenzie (1988), Weiß (2008) and Drost et al. (2009)] and various extensions of these models have attracted increasing interest; see Fokianos (2011). We intend to derive results that hold true for a broad range of processes including the aforementioned ones. Doukhan et al. (2012a, b) showed that these processes are \(\tau \)-dependent with geometrically decaying coefficients. Therefore, we use this concept in the sequel and state its definition for the sake of completeness. However, it can be seen from the proofs below that any other concept of weak dependence that is sufficient for a CLT of the empirical distribution function can be applied here as well.

Definition 4

Let \((\Omega ,{\mathcal A},P)\) be a probability space and \((X_t)_{t\in \mathbb {Z}}\) be a strictly stationary sequence of integrable \(\mathbb {R}^d\)-valued random variables. The process is called \(\tau \)-(weakly) dependent if

$$\begin{aligned} \tau (h) = \sup _{D\in \mathbb {N}} \frac{1}{D} \sup _{h\le t_1<\dots <t_D} \left\{ \tau \left( \sigma (X_t,t\le 0),(X_{t_1},\dots ,X_{t_D})\right) \right\} \,\mathop {\longrightarrow }_{h\rightarrow \infty }\, 0, \end{aligned}$$

where

$$\begin{aligned} \tau ({\mathcal M}, X)=E\left( \sup _{f\in \Lambda _1(\mathbb {R}^p)} \left| \int _{\mathbb {R}^p} f(x) dP^{X|{\mathcal M}}(x)-\int _{\mathbb {R}^{p}} f(x)d P^X(x)\right| \right) . \end{aligned}$$

Here, \({\mathcal M}\) is a sub-\(\sigma \)-algebra of \({\mathcal A}\), \(P^{X|{\mathcal M}}\) denotes the conditional distribution of the \(\mathbb {R}^p\)-valued random variable \(X\) given \({\mathcal M}\), and \( \Lambda _1(\mathbb {R}^{p})\) denotes the set of 1-Lipschitz functions from \(\mathbb {R}^p\) to \(\mathbb {R}\), i.e. \(f\in \Lambda _1(\mathbb {R}^p)\) if \(|f(x)-f(y)|\le \Vert x-y\Vert _1=\sum _{j=1}^p |x_j-y_j|\; \forall \, x,y\in \mathbb {R}^p\).

Remark 5

If a process \((X_t)_{t\in \mathbb {Z}}\) on \((\Omega ,{\mathcal A},P)\) is \(\tau \)-dependent and if \({\mathcal A}\) is rich enough, then there exists, for all \(t<t_1<\cdots <t_D\in \mathbb {Z}\), \(D\in \mathbb {N}\), a random vector \((\widetilde{X}_{t_1},\ldots ,\widetilde{X}_{t_D})'\) which is independent of \((X_s)_{s\le t}\), has the same distribution as \((X_{t_1},\ldots ,X_{t_D})'\) and satisfies

$$\begin{aligned} \frac{1}{D} \sum _{j=1}^D E\Vert \widetilde{X}_{t_j} - X_{t_j} \Vert _1 \,\le \, \tau (t_1 - t); \end{aligned}$$
(18)

cf. Dedecker and Prieur (2004). This \(L_1\)-coupling property will be an essential device for the proofs of our results below. Also note that, in particular, sequences of i.i.d. random variables \((X_t)_{t\in \mathbb {Z}}\) are \(\tau \)-dependent with \(\tau (0)\le 2 E\Vert X_1\Vert \) and \(\tau (h)=0\) for \(h\ne 0\). Nevertheless, we state the i.i.d. case separately in all our theorems, since \(\tau \)-dependent processes are assumed to have a finite first moment, which is not necessary in our results if the data are i.i.d.

Regarding the marginal distribution \(P^{X_1}\), we assume that it has support \(\text {supp} (P^{X_1})=V\), that is, \(P(X_i\in V)=1\), where

$$\begin{aligned} V=\{v_j\mid j\in T\subseteq \mathbb {Z}\} \end{aligned}$$
(19)

for some finite or countable index set \(T\) with \(v_j<v_{j+1}\) for all \(j\in T\). Further, we assume that \(V\) has no accumulation point. As the cdf \(F\) is a step function, there is always a \(p\in (0,1)\) such that the \(p\)-quantile \(Q_p=v_j\), say, as well as \(v_{j+1}\), satisfies both inequalities in (8). Recall that this covers in particular the population median in the fair coin flip example. In the following, we consider the asymptotics for the sample quantile \(\widehat{Q}_p\) as defined in (2) and its bootstrap analogue

$$\begin{aligned} \widehat{Q}_{p,m}^*=(\widehat{F}_m^*)^{-1}(p)=\inf \{t:\widehat{F}_m^*(t)\ge p\}, \end{aligned}$$

where \(\widehat{F}_m^*(x)=m^{-1}\sum _{i=1}^m 1(X_i^*\le x)\) denotes the empirical bootstrap distribution function. Similar to (3), for all \(x\in \mathbb {R}\), we have

$$\begin{aligned} P(\widehat{Q}_p\le x)=P\left( \sum _{i=1}^n 1(X_i\le x)\ge \lceil np \rceil \right) =\sum _{j=\lceil np \rceil }^n \left( {\begin{array}{c}n\\ j\end{array}}\right) F^j(x)(1-F(x))^{n-j}. \end{aligned}$$

For the bootstrap \(p\)-quantile \(\widehat{Q}^*_{p,m}\) based on i.i.d. bootstrap pseudo replicates \(X_1^*,\ldots ,X_m^*\), we get the analogous representation

$$\begin{aligned} P^*(\widehat{Q}^*_{p,m}\le x)&= P^*\left( \sum _{i=1}^m 1(X_i^*\le x)\ge \lceil mp \rceil \right) \\&= \sum _{j=\lceil mp \rceil }^m \left( {\begin{array}{c}m\\ j\end{array}}\right) \widehat{F}_n^j(x)(1-\widehat{F}_n(x))^{m-j}. \end{aligned}$$
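
Both binomial representations are straightforward to evaluate numerically. A sketch for the i.i.d. case with the marginal cdf passed as a function (names are ours); the bootstrap analogue is obtained by replacing \(F\) with \(\widehat{F}_n\) and \(n\) with \(m\):

```python
from math import comb, ceil

def quantile_cdf(x, p, n, F):
    """P(Q_hat_p <= x) for n i.i.d. observations with marginal cdf F."""
    Fx = F(x)
    return sum(comb(n, j) * Fx**j * (1 - Fx)**(n - j)
               for j in range(ceil(n * p), n + 1))

# fair coin, i.e. the cdf of Bin(1, 0.5): P(sample median <= 0) tends to 1/2
F = lambda x: 0.0 if x < 0 else (0.5 if x < 1 else 1.0)
for n in (10, 100, 1000):
    print(n, quantile_cdf(0, 0.5, n, F))
```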

Further, for all \(x\in \mathbb {R}\) and analogously to (17), we have in the i.i.d. case

$$\begin{aligned} \sqrt{n}(\widehat{F}_n(x)-F(x))\overset{\mathcal {D}}{\rightarrow }\mathcal {N}\left( 0,W\right) \quad \text {where}\quad W=\mathrm{cov}(1(X_0\le x),1(X_0\le x)). \end{aligned}$$

As for the median in the coin flip example and analogously to (9) and (10), to check for bootstrap consistency we have to consider \(d_{KS}(\widehat{Q}^*_{p,m},\widehat{Q}_p)\) or \(d_{KS}(\widehat{Q}^*_{p,m}-\widehat{Q}_p,\widehat{Q}_p-Q_p)\). To this end, we first study the asymptotics of the empirical quantile \(\widehat{Q}_p\). In particular, part (iii) of the following theorem addresses the joint limiting distributions of several empirical quantiles. To the authors' knowledge, such a result has not been established in this generality so far and may be of independent interest.

Theorem 6

(Asymptotics of empirical quantiles for discrete distributions) Let \(X_1,\ldots ,X_n\) be discretely distributed random variables which are either i.i.d.  or observations of a strictly stationary and \(\tau \)-dependent process \((X_t)_{t\in \mathbb {Z}}\) with \(\sum _{h=0}^\infty \tau (h)<\infty \) and \(\text {supp} (P^{X_1})=V\) as described above.

  (i)

    If \(F(Q_p)>p\),

    $$\begin{aligned} P(\widehat{Q}_p=Q_p)\mathop {\longrightarrow }\limits _{n\rightarrow \infty }1. \end{aligned}$$
  (ii)

    If \(F(Q_p)=p\) and \(Q_p=v_j\), say, for some \(v_j\in V\),

    $$\begin{aligned} P(\widehat{Q}_p=v_j)\mathop {\longrightarrow }\limits _{n\rightarrow \infty }1/2\quad \mathrm{and}\quad P(\widehat{Q}_p=v_{j+1})\mathop {\longrightarrow }\limits _{n\rightarrow \infty }1/2. \end{aligned}$$
  (iii)

    For \(p_1,\ldots ,p_d\) such that \(F(Q_{p_i})=p_i\), \(i=1,\ldots ,k\) and \(F(Q_{p_i})>p_i\), \(i=k+1,\ldots ,d\) with \(Q_{p_i}=v_{l_i}\), say, joint convergence in distribution of \(\underline{\widehat{Q}}=(\widehat{Q}_{p_1},\ldots ,\widehat{Q}_{p_d})'\) holds. Precisely, we have

    $$\begin{aligned} P(\underline{\widehat{Q}}=\underline{q}) \mathop {\longrightarrow }\limits _{n\rightarrow \infty } {\left\{ \begin{array}{ll} P\left( \bigcap _{j=1}^k\left\{ (2\cdot 1(q_j=Q_{p_j})-1)Z_j\ge 0\right\} \right) , & q_i=Q_{p_i},\ i=k+1,\ldots ,d, \\ 0, & \text {otherwise}, \end{array}\right. } \end{aligned}$$
    (20)

    where \(\underline{q}=(q_1,\ldots ,q_d)'\) with \(q_i\in \{v_{l_i},v_{l_i+1}\}\). Here, the probability of the empty intersection is set to one and \(\underline{Z}=(Z_1,\ldots ,Z_k)^\prime \sim \mathcal {N}(\underline{0},\mathbf {W})\) with covariance matrix \(\mathbf {W}\) having entries

    $$\begin{aligned} W_{i,j}={\left\{ \begin{array}{ll}\mathrm{cov} (1(X_0\le q_{i}),1(X_0\le q_{j})), & \text {i.i.d. case}, \\ \sum _{h\in \mathbb {Z}} \mathrm{cov}(1(X_h\le q_{i}),1(X_0\le q_{j})), & \text {time series case}. \end{array}\right. } \end{aligned}$$

Note that the asymptotics do not depend on the dependence structure of the underlying process as long as single quantiles are considered; compare Remark 10. This no longer holds true when the joint distribution of several quantiles is considered. Part (iii) above shows that \(\underline{\widehat{Q}}\) converges to a random variable with 2-point marginal distributions that are indeed dependent, not only in the time series case but also for i.i.d. random variables. More precisely, the probability that the vector of empirical quantiles \(\underline{\widehat{Q}}\) equals the vector \(\underline{q}\) corresponds asymptotically to the probability that the normally distributed random vector \(\underline{Z}\) takes values in a certain orthant of \(\mathbb {R}^k\) depending on \(\underline{q}\). This is illustrated in the following example.

Example 7

In the situation of Theorem 6(iii) let \(k=2\) and suppose \((Q_{p_1},Q_{p_2})=(v_{i_1},v_{i_2})\).

  (i)

    If \(\underline{q}=(v_{i_1},v_{i_2})\), we have \(P(\underline{\widehat{Q}}=\underline{q})\underset{n\rightarrow \infty }{\rightarrow } P(0 \le Z_1,0\le Z_2)\).

  (ii)

    If \(\underline{q}=(v_{i_1},v_{i_2+1})\), we have \(P(\underline{\widehat{Q}}=\underline{q})\underset{n\rightarrow \infty }{\rightarrow } P(0 \le Z_1,0\ge Z_2)\).

After having established asymptotic theory for sample quantiles in this general setup, it remains to consider the bootstrap analogue, i.e. \(P^*(\widehat{Q}^*_{p,m}\le x)\), in more detail. In particular, for \(x=Q_p\) we have by Theorem 8 below

$$\begin{aligned} P^*(\widehat{Q}^*_{p,m}=Q_p)&= P^*(\widehat{Q}^*_{p,m}\le Q_p)-o_P(1)\\&= \sum _{k=\lceil mp \rceil }^m \left( {\begin{array}{c}m\\ k\end{array}}\right) \widehat{F}_n^k(Q_p)(1-\widehat{F}_n(Q_p))^{m-k}-o_P(1), \end{aligned}$$

which has (asymptotically) exactly the same shape as (12), and the results of Theorems 1 and 2 transfer directly to this more general setup.

Theorem 8

(Consistency of the i.i.d. \(m\)-out-of-\(n\) bootstrap) Let \(X_1,\ldots ,X_n\) be discretely distributed i.i.d. random variables with \(\text {supp} (P^{X_1})=V\) as above, and draw i.i.d. bootstrap replicates \(X_1^*,\dots , X_m^*\). Suppose that \(m/n+1/m=o(1)\) as \(n\rightarrow \infty \) and let \(\underline{\widehat{Q}}=(\widehat{Q}_{p_1},\ldots ,\widehat{Q}_{p_d})'\) as in Theorem 6 and \(\underline{\widehat{Q}}^*_m=(\widehat{Q}_{p_1,m}^*,\ldots ,\widehat{Q}_{p_d,m}^*)'\) for \(p_1,\ldots ,p_d\in (0,1)\). Then, we have bootstrap consistency, i.e.

$$\begin{aligned} d_{KS}\left( \underline{\widehat{Q}}^*_m,\underline{\widehat{Q}}\right) :=\sup _{\underline{x}\in \mathbb {R}^d}\left| P^*(\underline{\widehat{Q}}^*_m\le \underline{x})- P(\underline{\widehat{Q}}\le \underline{x})\right| \mathop {\longrightarrow }\limits ^{P} 0. \end{aligned}$$

Here, the short-hand \(\underline{x}\le \underline{y}\) for \(\underline{x},\underline{y}\in \mathbb {R}^d\) is used to denote \(x_i\le y_i\) for all \(i=1,\ldots ,d\).

To capture the dependence structure of the process \((X_t)_{t\in \mathbb {Z}}\) in the time series case, we employ an \(m\)-out-of-\(n\) (moving) block bootstrap procedure; a code sketch follows the two steps below:

  Step 1.

    Choose a bootstrap sample size \(m\), a block length \(l\) and let \(b=\lceil m/l\rceil \) be the smallest number of blocks required to get a bootstrap sample of length \(bl\ge m\). Define blocks \(B_{i,l}=(X_{i+1},\ldots ,X_{i+l})\), \(i=0,\ldots ,n-l\) and let \(i_0,\ldots ,i_{b-1}\) be i.i.d.  random variables uniformly distributed on the set \(\{0,1,2,\ldots ,n-l\}\).

  Step 2.

    Lay the blocks \(B_{i_0,l},\ldots ,B_{i_{b-1},l}\) end-to-end together to get

    $$\begin{aligned} B_{i_0,l},\ldots ,B_{i_{b-1},l}&= X_{i_0+1},\ldots ,X_{i_0+l},X_{i_1+1},\ldots ,X_{i_1+l},\ldots ,X_{i_{b-1}+1},\ldots ,X_{i_{b-1}+l} \\&= X_1^*,\ldots ,X_{bl}^* \end{aligned}$$

    and discard the last \(bl-m\) values to get a bootstrap sample \(X_1^*,\ldots ,X_m^*\).
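
A compact sketch of Steps 1 and 2 (the function name and the parameter choices are ours; the data-generating call is only a placeholder):

```python
import numpy as np

def m_out_of_n_block_bootstrap(x, m, l, rng):
    """One bootstrap sample from the m-out-of-n (moving) block bootstrap."""
    x = np.asarray(x)
    n = len(x)
    b = -(-m // l)                               # b = ceil(m / l) blocks
    starts = rng.integers(0, n - l + 1, size=b)  # i_0, ..., i_{b-1}
    # lay the blocks end to end and discard the last b*l - m values
    idx = (starts[:, None] + np.arange(l)).ravel()
    return x[idx][:m]

rng = np.random.default_rng(0)
x = rng.poisson(4.0, size=1000)                  # placeholder count time series
x_star = m_out_of_n_block_bootstrap(x, m=100, l=10, rng=rng)
print(len(x_star), x_star[:10])
```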

An application of this block bootstrap is in particular necessary to obtain bootstrap consistency if several quantiles are considered jointly. This leads to the following theorem.

Theorem 9

(Consistency of the block-wise \(m\)-out-of-\(n\) bootstrap) Let \(X_1,\ldots ,X_n\) be discretely distributed random variables with \(\text {supp} (P^{X_1})=V\) as above that are observations of a strictly stationary and \(\tau \)-dependent process \((X_t)_{t\in \mathbb {Z}}\) with \(\sum _{h=0}^\infty h\,\tau (h)<\infty \). We apply the block-wise \(m\)-out-of-\(n\) bootstrap to get a bootstrap sample \(X_1^*,\dots , X_m^*\). Suppose that \(m/n+l/m+1/l=o(1)\) as \(n\rightarrow \infty \). With the notation of Theorem 8, we have bootstrap consistency, i.e.

$$\begin{aligned} d_{KS}\left( \underline{\widehat{Q}}^*_m,\underline{\widehat{Q}}\right) \mathop {\longrightarrow }\limits ^{P} 0. \end{aligned}$$

Remark 10

It can be seen from Theorem 6(iii) that \(P( \widehat{Q}_p=Q_p)\longrightarrow P(Z\ge 0)=1/2\) as \(n\rightarrow \infty \) if \(F(Q_p)=p\). Here, \(Z\) is a centered normal variable whose variance depends on the dependence structure of the underlying process. However, for the limit behavior of the sample quantile itself, the variance of \(Z\) is not relevant and we only require symmetry around the origin. In the case of \(F(Q_p)>p\), the proof of \(P(\widehat{Q}_p=Q_p)\longrightarrow 1\) is based on the WLLN, which holds for i.i.d. as well as for \(\tau \)-weakly dependent data. This implies in particular that, to mimic the asymptotic behavior of a single quantile correctly, we do not have to imitate the dependence structure correctly. Hence, the i.i.d. \(m\)-out-of-\(n\) bootstrap is also valid for sequences of weakly dependent random variables if single quantiles are considered; for details, follow the lines of the proof of Theorem 8. A similar phenomenon occurs when the \(m\)-out-of-\(n\) bootstrap is used to mimic the distribution of \(M_n=\max (X_1,\dots ,X_n)\); see Theorem 4 and Section 4 in Athreya et al. (1999).

2.3 Randomized construction of confidence sets

In discrete setups it is more appropriate to work with confidence sets rather than confidence intervals for population quantiles. By consistency of the non-centered \(m\)-out-of-\(n\) i.i.d. bootstrap (and the \(m\)-out-of-\(n\) block bootstrap), we can apply this method to derive such confidence sets. Due to the discreteness of the underlying distribution, a naive construction of confidence sets will be too conservative, that is, the effective limiting coverage of a naive asymptotic \((1-\alpha )\)-confidence set is strictly larger than \(1-\alpha \); in fact, it equals one if \(\alpha < 1/2\). If one does not want to use conservative confidence sets with (too) large coverages, one can compensate this effect by randomization techniques. More precisely, we proceed as follows: we calculate one confidence set for the sample quantile with coverage larger than the prescribed size \(1-\alpha \) and another one with a coverage (asymptotically) smaller than \(1-\alpha \). Then, we randomly choose (with an appropriate distribution) one of these sets and use it to construct a final confidence set for the population quantile of asymptotic level \(1-\alpha \). Another difficulty that has to be taken into account is that we have bootstrap consistency only without centering, that is,

$$\begin{aligned} P^*(\widehat{Q}_{p,m}^*\le x)\approx P(\widehat{Q}_{p}\le x), \text { but } P^*(\widehat{Q}_{p,m}^*-\widehat{Q}_{p}\le x)\not \approx P(\widehat{Q}_{p}-Q_{p}\le x), \end{aligned}$$
(21)

such that the standard construction of bootstrap confidence intervals is not possible. Let \(V_n\) denote the support of the empirical marginal distribution based on \(X_1, \dots , X_n\). Then, we define large and small confidence sets \(\mathrm{CS}_L\) and \(\mathrm{CS}_S\), respectively, for the sample quantile

$$\begin{aligned} \mathrm{CS}_{L}&=\left[ F^{*-1}_{\widehat{Q}^*_{p,m}}(\alpha /2), \,F^{*-1}_{\widehat{Q}^*_{p,m}}(1-\alpha /2)\right] \cap V_n,\\ \mathrm{CS}_{S}&=\left[ F^{*-1}_{\widehat{Q}^*_{p,m}}(\alpha /2),\, F^{*-1}_{\widehat{Q}^*_{p,m}}(1-\alpha /2)\right) \cap V_n, \end{aligned}$$

and their coverages

$$\begin{aligned} \mathrm{cov}_L=P^*(\widehat{Q}^*_{p,m}\in \mathrm{CS}_L),\quad \mathrm{cov}_S=P^*(\widehat{Q}^*_{p,m}\in \mathrm{CS}_S). \end{aligned}$$

Note that \(\mathrm{cov}_L\ge 1-\alpha \), while the size of \(\mathrm{cov}_S\) is not clear in finite samples; it will turn out to be less than \(1-\alpha \) in the limit. Finally, we specify

$$\begin{aligned} p^*=\frac{1-\alpha -{\text {cov}}_S}{{\text {cov}}_L-{\text {cov}}_S} \end{aligned}$$

and define the bootstrap approximation of the confidence set for the sample quantile

$$\begin{aligned} \widetilde{\mathrm{CS}}= {\left\{ \begin{array}{ll} \mathrm{CS}_L\quad \text {if } Y\le p^*\\ \mathrm{CS}_S\quad \text {if } Y> p^*, \end{array}\right. } \end{aligned}$$

where \(Y\sim \mathrm{Unif}(0,1)\) is chosen independently from all observations and all bootstrap variables. A corresponding confidence set for the population quantile is then given by

$$\begin{aligned} \mathrm{CS}=\widetilde{\mathrm{CS}}-\widehat{Q}_p+H(\widehat{Q}^*_{p,m}). \end{aligned}$$

Due to (21), and since \(P(\widehat{Q}_p\in \widetilde{\mathrm{CS}})\rightarrow 1-\alpha \) holds, the use of a correction term \(H(\widehat{Q}^*_{p,m}):=F^{*-1}_{\widehat{Q}^*_{p,m}}(0.4)\) as an approximation of the true quantile \(Q_p\) is necessary; see the proof of Theorem 11 below. In principle, any value in \((0,1/2]\) can be used instead of 0.4.
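
The complete construction can be summarized in a short sketch. We assume that the sample quantiles of \(B\) bootstrap replicates have already been computed (e.g. with the \(m\)-out-of-\(n\) procedures above), so that \(F^{*-1}_{\widehat{Q}^*_{p,m}}\) can be approximated by empirical quantiles of these \(B\) values; all names are ours, and np.quantile with method="inverted_cdf" (NumPy \(\ge \) 1.22) implements the generalized inverse (1).

```python
import numpy as np

def randomized_confidence_set(q_star, q_hat, support, alpha, rng):
    """Randomized bootstrap confidence set CS for Q_p, following Sect. 2.3.
    q_star: array of B bootstrap sample quantiles;
    support: array of support points (V or V_n)."""
    lo = np.quantile(q_star, alpha / 2, method="inverted_cdf")
    hi = np.quantile(q_star, 1 - alpha / 2, method="inverted_cdf")
    cs_L = support[(support >= lo) & (support <= hi)]   # closed at hi
    cs_S = support[(support >= lo) & (support < hi)]    # right-open at hi
    cov_L = np.isin(q_star, cs_L).mean()                # bootstrap coverages
    cov_S = np.isin(q_star, cs_S).mean()
    if cov_L == cov_S:                 # degenerate case: no randomization needed
        cs = cs_L
    else:
        p_star = (1 - alpha - cov_S) / (cov_L - cov_S)
        cs = cs_L if rng.uniform() <= p_star else cs_S  # randomization step
    H = np.quantile(q_star, 0.4, method="inverted_cdf") # correction term
    return cs - q_hat + H
```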

Theorem 11

Suppose that either the assumptions of Theorem 8 or Theorem 9 hold true. Then, for \(\alpha \in (0,1/2)\), we have

$$\begin{aligned} P\left( Q_p\in \mathrm{CS}\right) \mathop {\longrightarrow }\limits _{n\rightarrow \infty }1-\alpha . \end{aligned}$$

Remark 12

(On the use of \(V\) or \(V_n\)) The effect of using \(V\) or \(V_n\) is asymptotically negligible. In applications, it might be reasonable to assume either that \(V\) is known in advance or that it is unknown. In the first case, \(V\) should be used to construct the confidence sets; in the latter case, \(V_n\) is the more reasonable choice.

Table 1 Coverage rates of \((1-\alpha )\)-bootstrap confidence sets \(\mathrm{CS}\) with \(\alpha =0.05\) for the median \(X_\mathrm{med}\) of \(X_t\sim \mathrm{Bin}(N,0.5)\) for several choices of \(N\), sample sizes \(n\) and bootstrap sample sizes \(m\)

2.4 Simulations

In this section, we illustrate the bootstrap performance by means of coverage rates of \((1-\alpha )\)-confidence sets \(\mathrm{CS}\) for \(\alpha =0.05\), as proposed in the previous section. To cover both i.i.d. and time series data, let \(X_1,\ldots ,X_n\) be either

  (a)

    an i.i.d. realization of a binomial distribution \(X_i\sim \mathrm{Bin}(N,{\theta })\)

or

  (b)

    a realization of a (Poisson-)INAR(1) model \(X_t=\beta \circ X_{t-1}+\epsilon _t\), where \(\epsilon _t\sim \mathrm{Poi}({\lambda }(1-\beta ))\) is Poisson distributed and \(\beta \circ k\sim \mathrm{Bin}(k,\beta )\) for \(k\in \mathbb {N}_0\) denotes the binomial thinning operator.

The quantity of interest is the (sample) median, where we consider different parameter settings for both cases (a) and (b), leading to degenerate one-point as well as non-degenerate two-point limiting distributions. In all simulations, we have used \(V\) to construct confidence sets; compare Remark 12.
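
For the time series case (b), data can be generated by iterating the thinning recursion; a sketch (the function name and the burn-in length are ours):

```python
import numpy as np

def simulate_inar1(n, beta, lam, rng, burn_in=200):
    """Poisson-INAR(1): X_t = beta o X_{t-1} + eps_t with
    eps_t ~ Poi(lam*(1-beta)), so the stationary marginal is Poi(lam)."""
    x = rng.poisson(lam)                 # start near the stationary mean
    out = np.empty(n, dtype=np.int64)
    for t in range(n + burn_in):
        # binomial thinning beta o x plus Poisson innovation
        x = rng.binomial(x, beta) + rng.poisson(lam * (1 - beta))
        if t >= burn_in:
            out[t - burn_in] = x
    return out

rng = np.random.default_rng(3)
x = simulate_inar1(1000, beta=0.5, lam=4.0, rng=rng)
print(np.median(x))                      # close to the population median 4
```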

In Table 1, we show coverage rates of confidence sets for model (a) for several sample sizes \(n\in \{100,500,1000,5000\}\) and parameter settings with \({\theta }=0.5\) and \(N\in \{1,2,19,20,39,40\}\), where odd \(N\) leads to a non-degenerate limiting distribution (\(N=1\) is the fair coin flip) and even \(N\) results in a degenerate one-point limiting distribution. In Fig. 3, we show typical bootstrap confidence sets for the examples \(\mathrm{Bin}(19, 0.5)\) and \(\mathrm{Bin}(39,0.5)\). As our theory in Sect. 2.2 suggests, we use the \(m\)-out-of-\(n\) bootstrap to mimic correctly the limiting behavior of sample quantiles in the degenerate as well as the non-degenerate case. To illustrate how sensitively the bootstrap reacts to the choice of the bootstrap sample size, we show results for several (rounded) values of \(m\in \{n^{1/2},n^{2/3},n^{3/4}\}\). For each parameter setting, we generate \(K=1000\) samples, and \(B=1000\) bootstrap replicates are used to construct the confidence set as described in Sect. 2.3. By the de Moivre–Laplace theorem, a binomial probability mass function can be well approximated by a normal density around the mean if the number of trials \(N\) is large. As suggested by one referee, we also included coverage rates of the corresponding asymptotic confidence intervals. To construct these intervals, we treated the data as being normally distributed with mean \(Np\) and variance \(N p(1-p)\), such that a Bahadur-type CLT

$$\begin{aligned} \sqrt{n}(\widehat{Q}_p-Q_p)=\frac{1}{f(Q_p)}\sqrt{n}(p-\widehat{F}_n(Q_p))+o_P(1)\mathop {\longrightarrow }\limits ^{\mathcal {D}}\mathcal {N}(0,S), \end{aligned}$$
(22)
Fig. 3

Confidence sets \(\mathrm{CS}\) for the median \(X_\mathrm{med}\) from five realizations of \(X_1,\ldots ,X_n\) with \(X_i\sim \mathrm{Bin}(N,0.5)\) i.i.d. for \(N=19\) (left panels) and \(N=39\) (right panels), several sample sizes \(n\) and bootstrap sample sizes \(m=n^{2/3}\). The true median is marked with a red vertical line

would hold true with limiting variance \(S=\mathrm{var}(1(X_1\le Q_p))/f^2(Q_p)\) if \(X_1,\ldots ,X_n\) are i.i.d. and \({f(\cdot )=\varphi ((\cdot -N\theta )/\sqrt{N\theta (1-\theta )})/\sqrt{N\theta (1-\theta )}}\), where \(\varphi \) denotes the probability density function of the standard normal distribution. Precisely, we used the confidence intervals \([\widehat{X}_\mathrm{med}-q_{1-\alpha /2}\,\widehat{S}/\sqrt{n},\ \widehat{X}_\mathrm{med}-q_{\alpha /2}\,\widehat{S}/\sqrt{n}]\), where \(\widehat{S}\) is an empirical version of \(S\) and \(q_{\alpha }\) denotes the \(\alpha \)-quantile of the standard normal distribution. Table 1 reports a good overall finite sample performance of our procedure, whereas the CLT-based confidence intervals that treat the data as being normally distributed clearly fail here: in the case of a limiting two-point distribution, the coverage is around 50 %, and in the case of a limiting one-point distribution it converges to 100 %. Observe that these confidence intervals are merely ad hoc and not asymptotically valid, so we did not expect good results here. For our procedure, we see that an increasing binomial parameter \(N\) leads to a higher variance of the data generating process, i.e. \(\mathrm{var}(X_t)=N/4\). Hence, confidence sets are larger and we observe a slight overcoverage. Moreover, we observe that confidence sets for even \(N\) are more conservative than for odd \(N\), which is due to the degeneracy of the limit distribution of the sample median for even \(N\).

Remark 13

Theorem 11 yields asymptotic validity of the procedure under the minimal condition \(m=o(n)\) on the intensity parameter. The question of its optimal choice has been discussed in different contexts; we refer the reader to the overview of the literature in Santana (2009, Sections 4.4 and 5.5) and to Bickel and Sakov (2008). Unfortunately, these findings cannot be applied directly in the present context, since their prerequisites are violated in our special case: Bickel and Sakov (2008) require the limiting distribution of the quantity under consideration to be non-degenerate, which contradicts our Theorem 6(i). However, our simulation study shows that the bootstrap method is robust against different choices of the intensity \(m\). If \(N\) is large, small choices of \(m\) lead to more conservative confidence intervals than large ones; this overcoverage can be explained by the larger variability caused by small bootstrap sample sizes \(m\). It seems that \(m=n^{2/3}\) is a good compromise and might be recommended as a rule of thumb. A detailed investigation of the optimal choice of \(m\) is left to further research.

In setup (b), displayed in Table 2, we consider again the non-degenerate case, for \(\lambda =3.67206\ldots \) such that \(X_\mathrm{med}=3\), as well as the degenerate case, for \(\lambda =4\) such that \(X_\mathrm{med}=4\). As discussed in Remark 10, Table 2 shows that the i.i.d. low-intensity bootstrap already leads to valid results, and the block bootstrap does not lead to visible improvements in performance.

3 Mid-distribution quantiles

3.1 Asymptotics for sample mid-quantiles

Suppose we observe \(X_1,\dots , X_n\) from a (\(\tau \)-dependent) process with discrete support \(\text {supp}(P^{X_1})=V\) as defined in (19). Instead of considering classical quantiles as in Sect. 2, Parzen (1997, 2004) and Ma et al. (2011) suggested investigating a modified quantile function of the corresponding so-called mid-distribution function \(F_\mathrm{mid}\), which is given by

$$\begin{aligned} F_\mathrm{mid}(x)=F(x)-0.5 p(x),\quad x\in \mathbb {R}, \end{aligned}$$

where, as before, \(F\) denotes the cdf of the random variable \(X\) with probability mass function \(p(x)=P(X=x)\). This concept allows for a meaningful interpretation of quantiles in the discrete setup and appears to be beneficial in the case of tied samples; we refer to Ma et al. (2011) for details. In particular, it is argued there that mid-quantiles, related to the mid-distribution function in the sense of (23) below, behave more favorably than classical quantiles. That is, contrary to classical sample quantiles in discrete setups, sample (mid-)quantiles based on the mid-distribution function converge to non-degenerate limiting distributions when properly centered and inflated with the usual \(\sqrt{n}\)-rate, as long as they do not correspond to the boundary values of the support of the underlying distribution. In the latter case, the limiting distribution is degenerate for any choice of the inflation factor. Moreover, the asymptotic theory for mid-quantiles and classical quantiles coincides if the underlying distribution is absolutely continuous. In view of this, mid-quantiles can be interpreted as a natural generalization of classical quantiles that is robust to discreteness of the underlying distribution.

Table 2 Coverage rates of \((1-\alpha )\)-bootstrap confidence sets \(\mathrm{CS}\) with \(\alpha =0.05\) for the median \(X_\mathrm{med}\) of the INAR(1) model \(X_t=\beta \circ X_{t-1}+\epsilon _t\), \(\beta =0.5\), for two choices of \(\lambda \) and several sample sizes \(n\) and bootstrap sample sizes \(m\)

We first assume the support \(V\) to be bounded, say \(V=\{v_1,\ldots ,v_d\}\) with \(v_1<\cdots <v_d\). However, it turns out that the case of unbounded support can be treated similarly, and the asymptotics are even easier; see Remark 15 below. According to Ma et al. (2011), the mid-quantile function is the linear interpolation of the points \((F_\mathrm{mid}(v_j),v_j)\), \(j=1,\ldots ,d\). More precisely, we define the \(p\)th population mid-quantile \(Q_{p,\mathrm{mid}}\) as

$$\begin{aligned} Q_{p,\mathrm{mid}}= \left\{ \begin{array}{ll} v_1 &\quad \text {if } p<F_\mathrm{mid}(v_1),\\ v_k &\quad \text {if } p=F_\mathrm{mid}(v_k),\ k=1,\dots , d,\\ \lambda v_k+(1-\lambda )v_{k+1} &\quad \text {if } p=\lambda F_\mathrm{mid}(v_k)+(1-\lambda )F_\mathrm{mid}(v_{k+1}),\ \lambda \in (0,1),\\ &\qquad k=1,\dots , d-1,\\ v_d &\quad \text {if } p>F_\mathrm{mid}(v_d), \end{array}\right. \end{aligned}$$
(23)

and its empirical counterpart \(\widehat{Q}_{p,\mathrm{mid}}\) as

$$\begin{aligned} \widehat{Q}_{p,\mathrm{mid}}= \left\{ \begin{array}{ll} v_1 &\quad \text {if } p<\widehat{F}_\mathrm{mid}(v_1),\\ v_k &\quad \text {if } p=\widehat{F}_\mathrm{mid}(v_k)<\widehat{F}_\mathrm{mid}(v_{k+1}),\ k=1,\dots , d,\\ \lambda _n v_k+(1-\lambda _n)v_{k+1} &\quad \text {if } p=\lambda _n \widehat{F}_\mathrm{mid}(v_k)+(1-\lambda _n) \widehat{F}_\mathrm{mid}(v_{k+1}),\ \lambda _n \in (0,1),\\ &\qquad \widehat{F}_\mathrm{mid}(v_k)<\widehat{F}_\mathrm{mid}(v_{k+1}),\ k=1,\dots , d-1,\\ v_d &\quad \text {if } p>\widehat{F}_\mathrm{mid}(v_d), \end{array}\right. \end{aligned}$$
(24)

where \(\widehat{F}_\mathrm{mid}(x)=n^{-1}\sum _{k=1}^n \{1(X_k\le x)-0.5\cdot 1(X_k=x)\}\) is the empirical counterpart of \(F_\mathrm{mid}(x)\); see also Fig. 4 for an illustration. There, we compare the classical and the mid-quantile function for the \(\mathrm{Bin}(1,0.5)\) and the \(\mathrm{Bin}(3,0.5)\) distribution. For both distributions, the classical median cannot be estimated consistently, which follows from the plateau of the classical quantile function starting at argument 0.5. In contrast, the mid-quantile function is strictly increasing around this argument, which results in \(\sqrt{n}\)-consistency of its empirical version; see also Theorem 14 below.
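
Since \(\widehat{F}_\mathrm{mid}\) is strictly increasing on the observed support points, the empirical mid-quantile (24) amounts to linear interpolation of the points \((\widehat{F}_\mathrm{mid}(v_j),v_j)\) with flat extrapolation at the boundaries. A sketch (the function name is ours):

```python
import numpy as np

def mid_quantile(x, p):
    """Empirical mid-quantile (24): linear interpolation of the points
    (F_mid(v_j), v_j), constant outside [F_mid(v_1), F_mid(v_d)]."""
    vals, counts = np.unique(np.asarray(x), return_counts=True)
    rel = counts / counts.sum()
    F_mid = np.cumsum(rel) - 0.5 * rel   # F_hat(v_j) - 0.5 * p_hat(v_j)
    return np.interp(p, F_mid, vals)     # np.interp clips at the boundaries

# balanced coin flip sample: the sample mid-median equals 1/2
print(mid_quantile([0, 1] * 50, 0.5))    # prints 0.5
```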

Fig. 4

Comparison of quantile function (black, dashed) and mid-quantile function (red, solid) for \(\mathrm{Bin}(1,0.5)\) (left panel) and \(\mathrm{Bin}(3,0.5)\) (right panel)

Our first goal is to extend the asymptotic results of Ma et al. (2011) from i.i.d. data to strictly stationary, \(\tau \)-dependent processes. Similar to Sect. 2, any other concept of dependence might be applied as long as the CLT for the empirical distribution function holds. For the sake of definiteness, we restrict ourselves to \(\tau \)-dependence here.

Theorem 14

(Asymptotics of sample mid-quantiles for discrete distributions) Suppose that \(X_1, \dots , X_n\) are either i.i.d. or observations of a strictly stationary, \(\tau \)-dependent process \((X_t)_{t\in \mathbb {Z}}\) with \(\sum _{h=0}^\infty \tau (h)<\infty \). Let the support of \(P^{X_1}\) be \(V=\{v_1,\dots , v_d\}\) such that \(v_1<\dots <v_d\) and denote the corresponding probabilities by \(a_1,\dots , a_d\). Further, define \(a_0=a_{d+1}=0\), \(v_0=v_1\) and \(v_{d+1}=v_d\). Then, we have

$$\begin{aligned} \sqrt{n}(\widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})\mathop {\longrightarrow }\limits ^{{\mathcal D}}{\left\{ \begin{array}{ll} 0 &\quad \mathrm{if}\ p<F_\mathrm{mid}(v_1)\ \mathrm{or}\ p>F_\mathrm{mid}(v_d), \\ Z_1 &\quad \mathrm{if}\ p=\lambda F_\mathrm{mid}(v_{k+1})+(1-\lambda )F_\mathrm{mid}(v_{k+2}),\ \lambda \in (0,1),\\ &\qquad k=0,\ldots ,d-2, \\ Z_2 &\quad \mathrm{if}\ p=F_\mathrm{mid}(v_{k+1}),\ k=1,\dots , d-2, \\ Z_3 &\quad \mathrm{if}\ p=F_\mathrm{mid}(v_{1}), \\ Z_4 &\quad \mathrm{if}\ p=F_\mathrm{mid}(v_{d}), \end{array}\right. } \end{aligned}$$
(25)

where \(Z_1,Z_2,Z_3,Z_4\) are random variables having certain non-degenerate distributions as described in the following. \(Z_1\) is centered and normally distributed with variance

$$\begin{aligned} \sigma _1^2=4\left( \frac{v_{k+1}-v_{k+2}}{a_{k+2}+a_{k+1}}\right) ^2\, h_{k+2}'\Sigma ^{(k+2)}h_{k+2}, \end{aligned}$$
(26)

where

$$\begin{aligned} h_{k+2}=\left( 1,\dots ,1,1-\frac{F_\mathrm{mid}(v_{k+2})-p}{a_{k+1}+a_{k+2}},\frac{1}{2}-\frac{F_\mathrm{mid}(v_{k+2})-p}{a_{k+1}+a_{k+2}}\right) ' \end{aligned}$$

and \(\Sigma ^{(k+2)}=(\Sigma _{j_1,j_2})_{j_1,j_2=1,\dots , k+2}\) with \(\Sigma _{j_1,j_2}=\sum _{h\in \mathbb {Z}}\mathrm{cov}(1(X_h=v_{j_1}), 1(X_0=v_{j_2}))\). The density of \(Z_2\) is that of a centered normal distribution with variance

$$\begin{aligned} \sigma _{2-}^2=4\left( \frac{v_{k+1}-v_{k}}{a_{k}+a_{k+1}}\right) ^2\,\left\{ (1,\dots , 1,0.5)\Sigma ^{(k+1)}(1,\dots , 1,0.5)'\right\} \end{aligned}$$

on the negative real line and that of a centered normal distribution with variance

$$\begin{aligned} \sigma _{2+}^2=4\left( \frac{v_{k+2}-v_{k+1}}{a_{k+1}+a_{k+2}}\right) ^2\,\left\{ (1,\dots , 1,0.5)\Sigma ^{(k+1)}(1,\dots , 1,0.5)'\right\} \end{aligned}$$

on the positive real line; such distributions are termed half-Gaussian or two-piece normal distributions. The distribution of \(Z_3\) has point mass of \(1/2\) in zero and admits a density on the positive real line which is that of a centered normal distribution with variance \(\sigma _{2+}^2\). Similarly, \(Z_4\) has point mass of \(1/2\) in zero and admits a density on the negative real line which is that of a centered normal distribution with variance \(\sigma _{2-}^2\).

Observe that, depending on the situation, the limiting results established in Theorem 14 comprise four different types of distributions: degenerate, Gaussian, half-Gaussian, and half-Gaussian with point mass at the boundary. Also observe that we present the limiting results for sample mid-quantiles in a different way than Ma et al. (2011); the formulation in (25) will turn out to be convenient for investigating the applicability of bootstrap methods in the sequel. Nevertheless, in comparison to the i.i.d. case, only the covariance matrix \(\Sigma ^{(k+2)}\) changes.

Remark 15

(Boundary issues)

  (i)

    In the boundary cases \(p<F_\mathrm{mid}(v_1)\) and \(p>F_\mathrm{mid}(v_d)\) we even get \(\widehat{Q}_{p,\mathrm{mid}}=Q_{p,\mathrm{mid}}\) with probability tending to one; see the proof of Theorem 14. These stronger results are used in the proofs of bootstrap consistency later on.

  (ii)

    Note that the results of Theorem 14 carry over to countable support \(V\) as long as it does not contain an accumulation point. Then, the cases \(p<F_\mathrm{mid}(v_1)\) and/or \(p>F_\mathrm{mid}(v_d)\) simply disappear; see also Remark 2 in Ma et al. (2011).

Remark 16

Similar to Theorem 6, it is possible to prove joint convergence of several sample mid-quantiles. For clarity of exposition, we do not give the exact convergence results here, but mention that multivariate limiting distributions of several sample mid-quantiles can be obtained essentially by combining the univariate results of Theorem 14 above.

Before considering the bootstrap for mid-quantiles in Sect. 3.2, we first illustrate the concept of mid-quantiles with the help of a continuation of the coin flip example discussed in the Introduction; compare also Fig. 4.

Toy example: coin flip data for mid-quantiles Suppose a fair coin is flipped independently \(n\) times and we observe a sequence \(X_1,\ldots ,X_n\) of zeros and ones such that \(P(X_t=0)=1/2=1-P(X_t=1)\). Let \(X_\mathrm{med,mid}=Q_{0.5,\mathrm{mid}}\) and \(\widehat{X}_\mathrm{med,mid}=\widehat{Q}_{0.5,\mathrm{mid}}\) denote the population mid-median and the sample mid-median, respectively. Then, (23) gives \(X_\mathrm{med,mid}=1/2\) and from Theorem 14, we get

$$\begin{aligned} \sqrt{n}(\widehat{X}_\mathrm{med,mid}-X_\mathrm{med,mid})\mathop {\longrightarrow }\limits ^{{\mathcal D}}\mathcal {N}(0,1/4). \end{aligned}$$
(27)

Thus, the sample mid-median satisfies a CLT and, in particular, is a \(\sqrt{n}\)-consistent estimator of the mid-median.
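The statement (27) is also easy to check by simulation. The following Monte Carlo sketch (ours; sample size, replication count and seed are arbitrary choices) recomputes the sample mid-median from the empirical mid-cdf of the coin flip data and compares the empirical variance of \(\sqrt{n}(\widehat{X}_\mathrm{med,mid}-1/2)\) with the theoretical value \(1/4\).

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n, reps = 500, 10_000
stats = np.empty(reps)
for r in range(reps):
    x = rng.integers(0, 2, size=n)          # fair coin flips with values in {0, 1}
    F0 = 0.5 * np.mean(x == 0)              # mid-cdf at 0: p0_hat - 0.5 * p0_hat
    F1 = 1.0 - 0.5 * np.mean(x == 1)        # mid-cdf at 1
    lam = (F1 - 0.5) / (F1 - F0)            # interpolation weight at level p = 1/2
    mid_median = lam * 0.0 + (1.0 - lam) * 1.0
    stats[r] = np.sqrt(n) * (mid_median - 0.5)
print(stats.var())                          # should be close to 1/4, cf. (27)
```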

3.2 Bootstrapping sample mid-quantiles

We showed that standard bootstrap proposals may fail for classical sample quantiles in the purely discrete data case. A closer inspection of the bootstrap invalidity result of Theorem 1 shows that this issue is caused by the discreteness of the distribution, which in turn leads to quantile functions having jumps. In view of this observation, the use of mid-quantiles may circumvent this problem, because the corresponding mid-quantile function is piecewise linear and thus, in particular, continuous by construction; compare Fig. 4.

In a first step, we investigate to what extent \(m\)-out-of-\(n\)-type bootstraps (i.i.d. and block) are capable of correctly mimicking the limiting distributions established in Theorem 14. Here, we explicitly allow for the case \(m=n\) to cover also Efron’s bootstrap and the standard moving block bootstrap. To fix some notation, let \(\widehat{Q}_{p,\mathrm{mid},m}^*\) denote the \(p\)th bootstrap sample mid-quantile based on bootstrap observations \(X_1^*,\ldots ,X_m^*\). More precisely, and in analogy to (24), we define

$$\begin{aligned} \widehat{Q}_{p,\mathrm{mid},m}^*=\left\{ \begin{array}{l@{\quad }l} v_1 \quad &{} \text {if } p<\widehat{F}_{\mathrm{mid},m}^*(v_1)\\ v_k \quad &{} \text {if } p=\widehat{F}_{\mathrm{mid},m}^*(v_k)<\widehat{F}_{\mathrm{mid},m}^*(v_{k+1}),\; k=1,\dots , d\\ \lambda _m^* v_k+(1-\lambda _m^*)v_{k+1} \quad &{} \text {if } p=\lambda _m^* \widehat{F}_{\mathrm{mid},m}^*(v_k)+(1-\lambda _m^*) \widehat{F}_{\mathrm{mid},m}^*(v_{k+1}),\,\lambda _m^* \in (0,1),\\ \quad &{} \widehat{F}_{\mathrm{mid},m}^*(v_k)<\widehat{F}_{\mathrm{mid},m}^*(v_{k+1}),\, k=1,\dots , d-1\\ v_d \quad &{} \text {if } p>\widehat{F}_{\mathrm{mid},m}^*(v_d) \end{array}\right. \end{aligned}$$
(28)

where \(\widehat{F}_{\mathrm{mid},m}^*(x)=m^{-1}\sum _{k=1}^m \{1(X_k^*\le x)-0.5\cdot 1(X_k^*=x)\}\) is the bootstrap counterpart of \(\widehat{F}_\mathrm{mid}(x)\) based on \(X_1^*,\ldots ,X_m^*\).
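In code, the empirical mid-cdf and the interpolation defining (24) and (28) can be realized as in the following sketch (ours; the helper names mid_cdf and mid_quantile are hypothetical). We assume that every support point occurs in the (bootstrap) sample, so that the empirical mid-cdf is strictly increasing along \(v_1<\cdots <v_d\).

```python
import numpy as np

def mid_cdf(sample, support):
    """Empirical mid-cdf at each support point: P(X <= v) - 0.5 * P(X = v)."""
    sample = np.asarray(sample)
    return np.array([np.mean(sample <= v) - 0.5 * np.mean(sample == v)
                     for v in support])

def mid_quantile(sample, support, p):
    """Sample p-mid-quantile via linear interpolation of the mid-cdf, cf. (24), (28)."""
    F = mid_cdf(sample, support)
    if p <= F[0]:
        return support[0]
    if p >= F[-1]:
        return support[-1]
    k = np.searchsorted(F, p) - 1               # F[k] < p <= F[k+1]
    lam = (F[k + 1] - p) / (F[k + 1] - F[k])    # solves p = lam*F[k] + (1-lam)*F[k+1]
    return lam * support[k] + (1 - lam) * support[k + 1]
```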

Theorem 17

(Asymptotics of bootstrap sample mid-quantiles for discrete distributions) Suppose either (i) or (ii) holds, where

  1. (i)

    \(X_1,\ldots ,X_n\) are i.i.d. and we draw i.i.d. bootstrap replicates \(X_1^*,\ldots ,X_m^*\) such that \(m\rightarrow \infty \) and \(m=o(n)\) or \(m=n\) as \(n\rightarrow \infty \)

  2. (ii)

    \(X_1,\ldots ,X_n\) are \(\tau \)-dependent with \(\sum _{h=1}^\infty h\tau (h)<\infty \) and we apply an \(m\)-out-of-\(n\) block bootstrap with block length \(l\) to get \(X_1^*,\dots , X_m^*\) such that \(l/m+1/l=o(1)\) and \(m=o(n)\) or \(m=n\) as \(n\rightarrow \infty \)

Then, we have

$$\begin{aligned}&\sqrt{m}(\widehat{Q}_{p,\mathrm{mid},m}^*-\widehat{Q}_{p,\mathrm{mid}})\mathop {\longrightarrow }\limits ^{{\mathcal D}} {\left\{ \begin{array}{ll} 0 &{}\quad \mathrm{if}\, p<F_\mathrm{mid}(v_1)\,\mathrm{or}\, p>F_\mathrm{mid}(v_d) \\ Z_1 &{}\quad \mathrm{if}\, p=\lambda F_\mathrm{mid}(v_{k+1})+(1-\lambda )F_\mathrm{mid}(v_{k+2}),\, \lambda \in (0,1),\\ {} &{}\quad k=0,\ldots ,d-2 \end{array}\right. } \nonumber \\ \end{aligned}$$
(29)

and

$$\begin{aligned} \sqrt{m}(\widehat{Q}_{p,\mathrm{mid},m}^*-Q_{p,\mathrm{mid}})\mathop {\longrightarrow }\limits ^{{\mathcal D}}{\left\{ \begin{array}{ll}Z_2 &{} \mathrm{if}\, p=F_\mathrm{mid}(v_{k+1}),\, k=1,\dots , d-2\\ Z_3 &{} \mathrm{if}\, p=F_\mathrm{mid}(v_{1}) \\ Z_4 &{} \mathrm{if}\, p=F_\mathrm{mid}(v_{d}) \end{array}\right. } \end{aligned}$$
(30)

in probability, respectively. The distributions of \(Z_1\) to \(Z_4\) are described in Theorem 14.

At this point, it is worth noting that the results of Theorem 17 above do not at all require an \(m\)-out-of-\(n\)-type bootstrap procedure with \(m=o(n)\) to correctly mimic the complicated limiting distributions in all cases presented in Theorem 14. However, a comparison of (25) with (29) and (30) shows that the correct centering of the bootstrap sample mid-quantiles depends on the true situation. That is, \(\widehat{Q}_{p,\mathrm{mid},m}^*\) has to be centered around the sample mid-quantile \(\widehat{Q}_{p,\mathrm{mid}}\) in the first two cases and around the population quantile \(Q_{p,\mathrm{mid}}\) in the latter three. Yet, as the true mid-quantile function is generally unknown, the true situation is not known either. Consequently, the results of Theorem 17 are per se useless for practical applications, as it is not clear which centering has to be used.

To overcome this issue, we require the bootstrap procedure to be valid for all different cases when centered around one and the same quantity. To achieve this, note that the difference of the left-hand sides of (29) and (30) computes to

$$\begin{aligned} \sqrt{m}(\widehat{Q}_{p,\mathrm{mid},m}^*-Q_{p,\mathrm{mid}})-\sqrt{m}(\widehat{Q}_{p,\mathrm{mid},m}^*-\widehat{Q}_{p,\mathrm{mid}})&= \sqrt{\frac{m}{n}}\left\{ \sqrt{n}(\widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})\right\} \nonumber \\&= O_{P}\left( \sqrt{\frac{m}{n}}\right) \end{aligned}$$
(31)

and vanishes for \(m=o(n)\), but not for \(m=n\). This leads to the following result.

Corollary 18

(Consistency of \(m\)-out-of-\(n\) bootstraps for sample mid-quantiles) Suppose either (i) or (ii) in Theorem 17 holds with \(m=o(n)\). Then, we have

$$\begin{aligned} d_\mathrm{KS}\left( \sqrt{m}(\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}}),\,\sqrt{n}(\widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})\right) \mathop {\longrightarrow }\limits ^{\mathcal {P}} 0. \end{aligned}$$
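A minimal sketch of the \(m\)-out-of-\(n\) moving block bootstrap behind Corollary 18 reads as follows (ours, purely illustrative; it reuses the hypothetical mid_quantile helper from the sketch after (28), and block length l = 1 reproduces the i.i.d. bootstrap of Theorem 17(i)). The returned draws approximate the distribution of \(\sqrt{m}(\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})\) appearing on the left-hand side of the Kolmogorov–Smirnov distance above.

```python
import numpy as np

def block_bootstrap_draws(x, support, p, m, l, B=999, rng=None):
    """Draws of sqrt(m) * (Q*_{p,mid,m} - Q_hat_{p,mid}) via the m-out-of-n
    moving block bootstrap with block length l (l = 1: i.i.d. bootstrap)."""
    rng = rng or np.random.default_rng()
    x = np.asarray(x)
    n, b = len(x), -(-m // l)                    # b = ceil(m / l) blocks per replicate
    q_hat = mid_quantile(x, support, p)          # sample mid-quantile of the data
    draws = np.empty(B)
    for i in range(B):
        starts = rng.integers(0, n - l + 1, size=b)
        xs = np.concatenate([x[s:s + l] for s in starts])[:m]
        draws[i] = np.sqrt(m) * (mid_quantile(xs, support, p) - q_hat)
    return draws
```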

3.3 Randomized construction of confidence intervals

We invoke the ideas of Sect. 2.3 to construct confidence intervals of level \(1-\alpha \) for mid-quantiles. In contrast to classical quantiles, which take their values only in the countable set \(V\), these quantities take their values in the whole interval \([v_1,v_d]\) if \(V=\{v_1,\dots , v_d\}\) with \(v_1<\cdots <v_d\). In particular, whenever the limiting distribution in Theorems 14 and 17 has the whole real line as its support, it is continuous. In these cases, no randomization techniques are required to construct asymptotically exact \((1-\alpha )\) confidence sets. Otherwise, a randomization procedure as described in the sequel has to be applied. Note that the asymptotics in the previous section do not rely on the (empirical) mid-quantile itself but on suitably centered and inflated versions. Therefore, instead of \(\mathrm{CS}_L\) and \(\mathrm{CS}_S\) defined in Sect. 2.3, we consider large and small intervals of the form

$$\begin{aligned} \mathrm{CI}_{L,\mathrm{mid}}&=\left[ F^{*-1}_{\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})}(\alpha /2), \,F^{*-1}_{\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})}(1-\alpha /2)\right] ,\\ \mathrm{CI}_{S,\mathrm{mid}}^{(r)}&=\left[ F^{*-1}_{\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})}(\alpha /2),\, F^{*-1}_{\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})}(1-\alpha /2)\right) ,\\ \mathrm{CI}_{S,\mathrm{mid}}^{(l)}&=\left( F^{*-1}_{\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})}(\alpha /2),\, F^{*-1}_{\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})}(1-\alpha /2)\right] \end{aligned}$$

and their coverages

$$\begin{aligned} \mathrm{cov}_{L,\mathrm{mid}}&=P^*\left( \sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})\in \mathrm{CI}_{L,\mathrm{mid}}\right) ,\\ \mathrm{cov}_{S,\mathrm{mid}}^{(r)}&= P^*\left( \sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})\in \mathrm{CI}_{S,\mathrm{mid}}^{(r)}\right) ,\\ \mathrm{cov}_{S,\mathrm{mid}}^{(l)}&=P^*\left( \sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})\in \mathrm{CI}_{S,\mathrm{mid}}^{(l)}\right) . \end{aligned}$$

Finally, we specify the probability for choosing the large interval

$$\begin{aligned} p^*_\mathrm{mid}={\left\{ \begin{array}{ll} \frac{1-\alpha -\mathrm{cov}_{S,\mathrm{mid}}^{(r)}}{\mathrm{cov}_{L,\mathrm{mid}}-\mathrm{cov}_{S,\mathrm{mid}}^{(r)}}, &{} \mathrm{cov}_{S,\mathrm{mid}}^{(r)}\le 1-\alpha \\ \frac{1-\alpha -\mathrm{cov}_{S,\mathrm{mid}}^{(l)}}{\mathrm{cov}_{L,\mathrm{mid}}-\mathrm{cov}_{S,\mathrm{mid}}^{(l)}}, &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

and define the bootstrap approximation of the confidence set for the \(p\)-level mid-quantile

$$\begin{aligned} \mathrm{CI}={\left\{ \begin{array}{ll} \left[ \widehat{Q}_{p,\mathrm{mid}}-\frac{F^{*-1}_{\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})}(1-\alpha /2)}{\sqrt{n}},\;\widehat{Q}_{p,\mathrm{mid}}-\frac{F^{*-1}_{\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})}(\alpha /2)}{\sqrt{n}}\right] &{}\quad \text {if } Y\le p^*_\mathrm{mid}\\ \left[ \widehat{Q}_{p,\mathrm{mid}}-\frac{F^{*-1}_{\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})}(1-\alpha /2)}{\sqrt{n}},\;\widehat{Q}_{p,\mathrm{mid}}-\frac{F^{*-1}_{\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})}(\alpha /2)}{\sqrt{n}}\right) &{}\quad \text {if } Y> p^*_\mathrm{mid}\text { and }\mathrm{cov}_{S,\mathrm{mid}}^{(r)}\le 1-\alpha \\ \left( \widehat{Q}_{p,\mathrm{mid}}-\frac{F^{*-1}_{\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})}(1-\alpha /2)}{\sqrt{n}},\;\widehat{Q}_{p,\mathrm{mid}}-\frac{F^{*-1}_{\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})}(\alpha /2)}{\sqrt{n}}\right] &{}\quad \text {otherwise,} \end{array}\right. } \end{aligned}$$

where \(Y\sim \mathrm{Unif}(0,1)\) is chosen independently of all observations and all bootstrap variables. This gives an asymptotic confidence interval of level \(1-\alpha \).
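In algorithmic form, the randomized construction above amounts to the following sketch (ours; np.quantile serves as a stand-in for the generalized inverse \(F^{*-1}\), and the returned flag indicates whether the closed, i.e. large, interval was selected; the reported endpoints coincide for all three candidate intervals, only the openness of the endpoints differs).

```python
import numpy as np

def randomized_ci(q_hat, draws, n, alpha=0.05, rng=None):
    """Randomized confidence interval for the p-mid-quantile (Sect. 3.3);
    `draws` are bootstrap realizations of sqrt(m)*(Q*_{p,mid,m} - Q_hat_{p,mid})."""
    rng = rng or np.random.default_rng()
    lo = np.quantile(draws, alpha / 2)               # F*^{-1}(alpha/2)
    hi = np.quantile(draws, 1 - alpha / 2)           # F*^{-1}(1 - alpha/2)
    cov_L = np.mean((draws >= lo) & (draws <= hi))   # closed interval
    cov_Sr = np.mean((draws >= lo) & (draws < hi))   # right-open interval
    cov_Sl = np.mean((draws > lo) & (draws <= hi))   # left-open interval
    cov_S = cov_Sr if cov_Sr <= 1 - alpha else cov_Sl
    # probability of choosing the large interval; guard the degenerate case
    # cov_L == cov_S, in which the randomization is irrelevant
    p_star = (1 - alpha - cov_S) / (cov_L - cov_S) if cov_L > cov_S else 1.0
    closed = rng.uniform() <= p_star                 # Y ~ Unif(0, 1)
    return q_hat - hi / np.sqrt(n), q_hat - lo / np.sqrt(n), closed
```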

Theorem 19

Suppose that the assumptions of Corollary 18 hold true. Then, for \(\alpha \in (0,1/2)\),

$$\begin{aligned} P(Q_ {p,\mathrm{mid}}\in \mathrm{CI})\mathop {\longrightarrow }\limits _{n\rightarrow \infty }1-\alpha . \end{aligned}$$

3.4 Simulations

In this section, we illustrate the bootstrap performance by means of coverage rates of \((1-\alpha )\)-confidence intervals \(\mathrm{CI}\) for \(\alpha =0.05\) as proposed in the previous section. To make the simulation results comparable to those obtained in Sect. 2.4, we use the same settings here. Recall that the mid-quantile function is continuous, which leads to confidence intervals rather than confidence sets; compare Fig. 5. Contrary to the results in setup (a) obtained for classical quantiles, all choices of the binomial parameter \(N\) lead to non-degenerate distributions for the sample mid-median. In view of Table 3, we observe that the bootstrap works equally well in both cases. As for the classical quantiles, we also include, for the i.i.d. setup, coverage rates of asymptotic confidence intervals that treat the data as being normally distributed, which allows us to make use of the CLT established in Theorem 1, Case 1 of Ma et al. (2011). This result leads to the same limiting distribution as obtained in the CLT for classical quantiles in (22), and we used the same construction of confidence intervals as in Sect. 2.4. For these CLT-based confidence intervals, Table 3 shows a systematic overcoverage, which is in general less pronounced for larger values of \(N\). To explain this observation, note that these confidence intervals are not asymptotically valid, as they assume an underlying normal distribution. Even though the concept of mid-quantiles slightly differs from the classical one in the discrete setup, a smooth modification of the quantile function appears to be beneficial with respect to the coverage rate performance of bootstrap confidence intervals.

Fig. 5

Confidence intervals CI for the mid-median \(X_\mathrm{med,mid}\) from five realizations of \(X_1,\ldots ,X_n\) with \(X_i\sim \mathrm{Bin}(N,0.5)\) i.i.d.  for \(N=19\) (left panels) and \(N=39\) (right panels), several sample sizes \(n\) and bootstrap sample sizes \(m=n^{2/3}\). The true mid-median is marked with a red vertical line

In comparison to the results displayed in Table 2 for classical quantiles, Table 4 illustrates the necessity of a block-type resampling scheme that takes the dependence structure of the INAR process in setting (b) into account.

Table 3 Coverage rates of (\(1-\alpha \))-bootstrap confidence sets CI with \(\alpha =0.05\) for the mid-median \(X_\mathrm{med,mid}\) of \(X_i\sim \mathrm{Bin}(N,0.5)\) for several choices of \(N\), sample sizes \(n\) and bootstrap sample sizes \(m\)
Table 4 Coverage rates of (\(1-\alpha \))-bootstrap confidence sets CI with \(\alpha =0.05\) for the mid-median \(X_\mathrm{med,mid}\) of the INAR(1) model \(X_t=\beta \circ X_{t-1}+\epsilon _t\), \(\beta =0.5\) for two choices of \(\lambda \) and several sample sizes \(n\) and bootstrap sample sizes \(m\)
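For reference, the data of setting (b) can be generated as in the following sketch (ours; the innovation law is assumed to be Poisson(\(\lambda \)) here, matching the parameter \(\lambda \) reported in Table 4, while the precise specification is given in Sect. 2.4 and not reproduced in this excerpt). Here, \(\circ \) denotes binomial thinning, and the chain is initialized in its stationary Poisson marginal.

```python
import numpy as np

def inar1(n, beta=0.5, lam=1.0, rng=None):
    """Poisson-INAR(1): X_t = beta o X_{t-1} + eps_t with binomial thinning o
    and Poisson(lam) innovations; stationary marginal is Poisson(lam/(1-beta))."""
    rng = rng or np.random.default_rng()
    x = np.empty(n, dtype=np.int64)
    x[0] = rng.poisson(lam / (1.0 - beta))          # start in stationarity
    for t in range(1, n):
        x[t] = rng.binomial(x[t - 1], beta) + rng.poisson(lam)
    return x
```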

4 Conclusion

In this paper, we investigated bootstrap validity for classical quantiles as well as so-called mid-quantiles of discrete distributions. The classical quantile function is piecewise constant and discontinuous, which makes statistical inference challenging. The concept of mid-distribution tries to overcome this deficiency by relying on piecewise linear mid-quantile functions that are continuous, but not differentiable. This approach is partly motivated by the fact that the latter function coincides with the classical quantile function if the underlying distribution is continuous. Indeed, in contrast to classical quantiles, mid-quantiles can be estimated consistently. Regarding the validity of bootstrap methods, however, this concept alone is not entirely successful. In both cases, low-intensity (block) bootstrap methods are required to mimic the distribution of the (mid-)quantile estimators correctly. In particular, two tuning parameters, namely the intensity \(m\) and the block length \(l\), have to be chosen, irrespective of the type of quantiles. Moreover, to overcome the issue of potentially too conservative intervals, randomization techniques have to be invoked. An overview of the (in-)consistency of all bootstrap methods addressed in this paper is given in Table 5.

Table 5 Bootstrap (in-)consistency for single sample (mid-)quantiles

Still, the smoothness of mid-quantile functions in comparison to ordinary quantile functions turns out to be beneficial with respect to the finite sample performance. Despite the application of randomization techniques, confidence sets for classical quantiles tend to be quite conservative. This effect is not observed for the mid-distribution counterparts, where bootstrap consistency for commonly centered quantities leads to a straightforward construction of confidence intervals. Therefore, the question arises whether further smooth modifications of mid-quantiles may lead to even better results. A first attempt, motivated by the Harrell–Davis quantile estimator for continuous distributions, has been proposed by Wang and Hutson (2011). These quantile estimators appear as sums of weighted order statistics, where the weights are smooth functions of Beta cdfs. However, while Harrell and Davis (1982) use this method for the order statistics of the sample itself, Wang and Hutson (2011) apply it to the support instead. Hence, it is not clear how their definition of quantiles can be used directly for continuous data and whether there is a deep relationship between classical quantiles and these variants, as in the case of mid-quantiles. Therefore, we did not follow this line of research in the present paper. Nevertheless, we conjecture that proving consistency of i.i.d. and block bootstrap methods is straightforward, since the proof of asymptotic normality in Wang and Hutson (2011) relies only on the CLT for the empirical cdf and the \(\Delta \)-method. The construction of other smooth modifications of quantiles and, even more importantly, the identification of their relationship to classical quantiles for continuous distributions and their convenience for practitioners go far beyond the scope of our paper and should be investigated in future research.

5 Appendix: Proofs and auxiliary results

5.1 Proofs of the main results

Proof of Theorem 1

We first prove (14). With the notation

$$\begin{aligned} Z_n=\sqrt{n}\frac{\widehat{F}_n(\epsilon )-F(\epsilon )}{\sqrt{\mathrm{var}(1(X_1\le \epsilon ))}} \quad \text {and} \quad Z_n^*=\sqrt{n}\frac{\widehat{F}_n^*(\epsilon )-\widehat{F}_n(\epsilon )}{\sqrt{\mathrm{var}(1(X_1\le \epsilon ))}} \end{aligned}$$

for any fixed \(\epsilon \in (0,1)\) and using the fact that for any distribution function \(G\) on \(\mathbb {R}\), \(G(x)\ge t\) if and only if \(x\ge G^{-1}(t)\), we get

$$\begin{aligned} P^*(\widehat{X}_\mathrm{med}^*=0)&= P^*(\widehat{X}_\mathrm{med}^*\le \epsilon )=P^*\left( \frac{1}{2}\le \widehat{F}^*_n\left( \epsilon \right) \right) =1-P^*\left( Z_n^*<-Z_n\right) \\&= 1-\Phi \left( -Z_n\right) +\big (\Phi \left( -Z_n\right) -P^*\left( Z_n^*<-Z_n\right) \big ). \end{aligned}$$

In conjunction with Polya’s Theorem, we get from Lemma 21 that

$$\begin{aligned} \left| \Phi \left( -Z_n\right) -P^*\left( Z_n^*<-Z_n\right) \right| \le \sup _{x\in \mathbb {R}}\left| \Phi \left( x\right) -P^*\left( Z_n^*<x\right) \right| =o_P(1). \end{aligned}$$

By Slutsky’s Theorem, it remains to show that

$$\begin{aligned} 1-\Phi \left( -Z_n\right) \mathop {\longrightarrow }\limits ^{{\mathcal D}} U\sim \mathrm{Unif}(0,1), \end{aligned}$$

which follows from \(Z_n\mathop {\longrightarrow }\limits ^{{\mathcal D}} Z\sim \mathcal {N}(0,1)\), the Simulation Lemma and from \(U:=1-\widetilde{U}\sim \mathrm{Unif}(0,1)\) if \(\widetilde{U}\sim \mathrm{Unif}(0,1)\). The result in (i) follows immediately from (11) and

$$\begin{aligned} \left| P^*(\widehat{X}_\mathrm{med}^*=0)-P(\widehat{X}_\mathrm{med}=0)\right| \mathop {\longrightarrow }\limits ^{{\mathcal D}}\left| U-\frac{1}{2}\right| \sim \mathrm{Unif}(0,1/2). \end{aligned}$$

Now, we show the result in (ii). As \(\widehat{X}_\mathrm{med},\widehat{X}_\mathrm{med}^*\in \{0,1\}\), \(X_\mathrm{med}=0\) and due to (5) and (13), we have to derive the asymptotics of the bivariate random variables

$$\begin{aligned}&\left( \begin{array}{l} P^*(\widehat{X}_\mathrm{med}^*-\widehat{X}_\mathrm{med}\le -1)-P(\widehat{X}_\mathrm{med}-X_\mathrm{med}\le -1) \\ P^*(\widehat{X}_\mathrm{med}^*-\widehat{X}_\mathrm{med}\le 0)-P(\widehat{X}_\mathrm{med}-X_\mathrm{med}\le 0)\end{array}\right) \\&\quad =\left( \begin{array}{l} P^*(\widehat{X}_\mathrm{med}^*-\widehat{X}_\mathrm{med}\le -1) \\ P^*(\widehat{X}_\mathrm{med}^*-\widehat{X}_\mathrm{med}\le 0)-\frac{1}{2}\end{array}\right) +o_P(1) \end{aligned}$$

to compute the supremum of both components. By straightforward calculations and due to \(P^*(\widehat{X}_\mathrm{med}^*\le 0)=1-\Phi (-Z_n)+o_P(1)\) as obtained in the first part of this proof, the last expression becomes

$$\begin{aligned}&\left( \begin{array}{l} 1(\widehat{X}_\mathrm{med}=0)P^*(\widehat{X}_\mathrm{med}^*\le -1)+1(\widehat{X}_\mathrm{med}=1)P^*(\widehat{X}_\mathrm{med}^*\le 0) \\ 1(\widehat{X}_\mathrm{med}=0)P^*(\widehat{X}_\mathrm{med}^*\le 0)+1(\widehat{X}_\mathrm{med}=1)P^*(\widehat{X}_\mathrm{med}^*\le 1)-\frac{1}{2} \end{array}\right) +o_P(1)\\&\quad = \left( \begin{array}{l} 1(\frac{1}{2}<\Phi (-Z_n))P^*(\widehat{X}_\mathrm{med}^*\le 0) \\ 1(\frac{1}{2}\ge \Phi (-Z_n))P^*(\widehat{X}_\mathrm{med}^*\le 0)+1(\frac{1}{2}<\Phi (-Z_n))-\frac{1}{2}\end{array}\right) +o_P(1) \\&\quad = \left( \begin{array}{l} 1(\frac{1}{2}<\Phi (-Z_n))(1-\Phi (-Z_n)) \\ 1(\frac{1}{2}\ge \Phi (-Z_n))(1-\Phi (-Z_n))+1(\frac{1}{2}<\Phi (-Z_n))-\frac{1}{2}\end{array}\right) +o_P(1), \end{aligned}$$

which converges in probability by the continuous mapping theorem [see e.g. Pollard (1984, III.6)] towards

$$\begin{aligned}&\left( \begin{array}{l} 1(\frac{1}{2}<\Phi (-Z))(1-\Phi (-Z)) \\ 1(\frac{1}{2}\ge \Phi (-Z))(1-\Phi (-Z))+1(\frac{1}{2}<\Phi (-Z))-\frac{1}{2}\end{array}\right) \nonumber \\&\quad =\left( \begin{array}{l} 1(\frac{1}{2}<\widetilde{U})(1-\widetilde{U}) \\ 1(\frac{1}{2}\ge \widetilde{U})(1-\widetilde{U})+1(\frac{1}{2}<\widetilde{U})-\frac{1}{2}\end{array}\right) . \end{aligned}$$
(32)

Further, it holds

$$\begin{aligned} 1\left( \frac{1}{2}<\widetilde{U}\right) (1-\widetilde{U})&= {\left\{ \begin{array}{ll}1-\widetilde{U}, &{} \frac{1}{2}<\widetilde{U} \\ 0, &{} \frac{1}{2}\ge \widetilde{U}\end{array}\right. }\le {\left\{ \begin{array}{ll}\frac{1}{2}, &{} \frac{1}{2}<\widetilde{U} \\ 1-\widetilde{U}-\frac{1}{2}, &{} \frac{1}{2}\ge \widetilde{U}\end{array}\right. }\\&= 1\left( \frac{1}{2}\ge \widetilde{U}\right) (1-\widetilde{U})+1\left( \frac{1}{2}<\widetilde{U}\right) -\frac{1}{2} \end{aligned}$$

such that the second component of (32) is always the maximum of both. To derive the cdf, let \(x\in \mathbb {R}\) and, with \(U=1-\widetilde{U}\), we get

$$\begin{aligned}&P\left( 1\left( \frac{1}{2}\ge \widetilde{U}\right) (1-\widetilde{U})+1\left( \frac{1}{2}<\widetilde{U}\right) -\frac{1}{2}\le x\right) \\&\quad = P\left( 1\left( \frac{1}{2}\le U\right) U+1\left( \frac{1}{2}>U\right) -\frac{1}{2}\le x\right) \\&\quad = P\left( 1\left( \frac{1}{2}\le U\right) U+1\left( \frac{1}{2}>U\right) -\frac{1}{2}\le x,U\ge \frac{1}{2}\right) \\&\qquad +P\left( 1\left( \frac{1}{2}\le U\right) U+1\left( \frac{1}{2}>U\right) -\frac{1}{2}\le x,U< \frac{1}{2}\right) \\&\quad = P\left( \frac{1}{2}\le U\le x+\frac{1}{2}\right) +P\left( \frac{1}{2}\le x,U<\frac{1}{2}\right) \\&\quad = {\left\{ \begin{array}{ll} 0, &{} x<0 \\ x, &{} x\in [0,1/2) \\ 1/2, &{} x\ge 1/2\end{array}\right. }+{\left\{ \begin{array}{ll} 0, &{} x<0 \\ 0, &{} x\in [0,1/2) \\ 1/2, &{} x\ge 1/2\end{array}\right. } \\&\quad = x1_{[0,\frac{1}{2})}(x)+1_{[\frac{1}{2},\infty )}(x). \end{aligned}$$

\(\square \)
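As a quick sanity check of the cdf just derived (a sketch of ours, not part of the proof), one may simulate the limiting variable in its \(U\)-form and recover \(P(\cdot \le x)=x\) on \([0,1/2)\) together with the jump to \(1\) at \(x=1/2\).

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.uniform(size=1_000_000)           # U ~ Unif(0, 1)
w = np.where(u >= 0.5, u, 1.0) - 0.5      # 1(1/2 <= U)*U + 1(1/2 > U) - 1/2
for x in (0.1, 0.25, 0.4, 0.5):
    print(x, np.mean(w <= x))             # approximately x for x < 1/2, then 1
```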

Proof of Theorem 2

  1. (i)

    This part is a special case of Theorem 8.

  2. (ii)

    The second statement follows similarly to the proof of Theorem 1 using part (i), the results from above and from

    $$\begin{aligned}&\left( \begin{array}{l} P^*(\widehat{X}_{m,\mathrm{med}}^*-\widehat{X}_\mathrm{med}\le -1)-P(\widehat{X}_\mathrm{med}-X_\mathrm{med}\le -1) \\ P^*(\widehat{X}_{m,\mathrm{med}}^*-\widehat{X}_\mathrm{med}\le 0)-P(\widehat{X}_\mathrm{med}-X_\mathrm{med}\le 0)\end{array}\right) \\&\quad = \left( \begin{array}{l} 1(\frac{1}{2}<\Phi (-Z_n))P^*(\widehat{X}_{m,\mathrm{med}}^*\le 0) \\ 1(\frac{1}{2}\ge \Phi (-Z_n))P^*(\widehat{X}_{m,\mathrm{med}}^*\le 0)+1(\frac{1}{2}<\Phi (-Z_n))-\frac{1}{2}\end{array}\right) +o_P(1) \\&\quad = \left( \begin{array}{l} 1(\frac{1}{2}<\widetilde{U})\frac{1}{2} \\ 1(\frac{1}{2}\ge \widetilde{U})\frac{1}{2}+1(\frac{1}{2}<\widetilde{U})-\frac{1}{2}\end{array}\right) +o_P(1) \\&\quad = \left( \begin{array}{l} 1(U<\frac{1}{2})\frac{1}{2} \\ 1(U\ge \frac{1}{2})\frac{1}{2}+1(U<\frac{1}{2})-\frac{1}{2}\end{array}\right) +o_P(1) \end{aligned}$$

    as \(1(U<1/2)1/2=1(U\ge 1/2)1/2+1(U<1/2)-1/2\) and \(1(U<1/2)=2\widetilde{S}\) is Bernoulli-distributed. \(\square \)

Proof of Theorem 6

  1. (i)

    Note that \(\widehat{Q}_p\) and \(Q_p\) take their values in \(V\) only. Under our assumptions on \(V\) there exists an \(\epsilon >0\) such that for each \(p\in (0,1)\)

    $$\begin{aligned} P(\widehat{Q}_p=Q_p)=P(\widehat{Q}_p\in (Q_p-\epsilon , Q_p+\epsilon ]). \end{aligned}$$

    This implies

    $$\begin{aligned} P(\widehat{Q}_p=Q_p)=P(p\le \widehat{F}_n(Q_p+\epsilon ))-P(p\le \widehat{F}_n(Q_p-\epsilon )) \end{aligned}$$
    (33)

    due to the monotonicity of \(\widehat{F}_n\). The first term on the rhs tends to one by the WLLN, which is a consequence of Theorem 20, and the second term vanishes asymptotically with the same reasoning.

  2. (ii)

    This follows from part (iii) with \(d=k=1\) and symmetry of a univariate, centered normal random variable.

  3. (iii)

    As we proved consistency of the sample quantiles if \(F(Q_{p_i})>p_i\) in (i), we can restrict the computations to the case where \(F(Q_{p_i})=p_i\) and \(i=1,\ldots ,k\) in the following. Similarly to (33), we get

$$\begin{aligned} P(\underline{\widehat{Q}}=\underline{q})&= P\left( \underline{0}\in \times _{j=1}^k \left( \sqrt{n}(\widehat{F}_n(q_j-\epsilon )-F(Q_{p_j}+\epsilon )),\,\sqrt{n}(\widehat{F}_n(q_j+\epsilon )-F(Q_{p_j}+\epsilon ))\right] \right) \\&\underset{n\rightarrow \infty }{\rightarrow } P\left( \underline{0}\in \times _{j=1}^k \left( -\infty \,1(q_j=Q_{p_j})+Z_j1(q_j=v_{l_j+1}),\,\infty \,1(q_j=v_{l_j+1})+Z_j1(q_j=Q_{p_j})\right) \right) \\&= P\left( \bigcap _{j=1}^k\left\{ (2\cdot 1(q_j=Q_{p_j})-1)Z_j\ge 0\right\} \right) , \end{aligned}$$
    (34)

    where the multivariate CLT

    $$\begin{aligned} \sqrt{n}\left( \begin{array}{l} \widehat{F}_n(Q_{p_1})-F(Q_{p_1}) \\ \vdots \\ \widehat{F}_n(Q_{p_k})-F(Q_{p_k}) \end{array}\right) \overset{\mathcal {D}}{\rightarrow }\underline{Z}\sim \mathcal {N}(\underline{0},\mathbf {W}) \end{aligned}$$

    has been used; see Theorem 20.\(\square \)

Proof of Theorem 8

It suffices to verify

$$\begin{aligned} \sup _{\underline{x}\in \mathbb {R}^d}\left| P^*(\underline{\widehat{Q}}^*_m\le \underline{x})- P(\underline{\widehat{Q}}\le \underline{x})\right| =\sup _{\underline{k}\in V^d}\left| P^*(\underline{\widehat{Q}}^*_m\le \underline{k})- P(\underline{\widehat{Q}}\le \underline{k})\right| =o_P(1) \end{aligned}$$

due to the discrete nature of the underlying process. First we get from Theorem 6(i) above with \(Q_{p_j}=v_{l_j}\) that

$$\begin{aligned} \lim _{n\rightarrow \infty } P(\widehat{Q}_{p_j}<Q_{p_j})=\lim _{n\rightarrow \infty } P(\widehat{Q}_{p_j}>v_{l_j+1})=0. \end{aligned}$$

Similarly, deducing a bootstrap WLLN from Lemma 21, we get

$$\begin{aligned} P^*(\widehat{Q}_{p_j,m}^*<Q_{p_j})+ P^*(\widehat{Q}_{p_j,m}^*>v_{l_j+1})=o_P(1) \end{aligned}$$

as well. Using the notation of Theorem 6(iii), it remains to show that

$$\begin{aligned} P^*(\underline{\widehat{Q}}_m^*=\underline{q})\mathop {\longrightarrow }\limits ^{P}P\left( \bigcap _{j=1}^k\left\{ (2\cdot 1(q_j=Q_{p_j})-1)Z_j\ge 0\right\} \right) . \end{aligned}$$

Actually, we get

$$\begin{aligned}&P^*(\underline{\widehat{Q}}_m^*=\underline{q})\\&\quad = P^*\left( \underline{0}\in \times _{j=1}^k \left( \sqrt{m}(\widehat{F}_m^*(q_j-\epsilon )-F(Q_{p_j}+\epsilon )),\sqrt{m}(\widehat{F}_m^*(q_j+\epsilon )-F(Q_{p_j}+\epsilon ))\right) \right) \\&\quad = P^*\left( \underline{0}\in \times _{j=1}^k \left( \sqrt{m}(\widehat{F}_m^*(q_j-\epsilon )-\widehat{F}_n(Q_{p_j}+\epsilon ))+O_P((m/n)^{1/2}), \right. \right. \\&\qquad \qquad \quad \qquad \qquad \qquad \left. \left. \sqrt{m}(\widehat{F}_m^*(q_j+\epsilon )-\widehat{F}_n(Q_{p_j}+\epsilon ))+O_P((m/n)^{1/2})\right) \right) . \end{aligned}$$

Further, as \(m=o(n)\), Lemma 21 implies that the last right-hand side converges in probability to \(P\left( \bigcap _{j=1}^k\left\{ (2\cdot 1(q_j=Q_{p_j})-1)Z_j\ge 0\right\} \right) \), which proves bootstrap consistency. \(\square \)

Proof of Theorem 9

The proof follows in analogy to the proof of Theorem 8 from Theorem 22. \(\square \)

Proof of Theorem 11

For a specific \(j\in \mathbb {Z}\) we have \(Q_p=v_j\). From bootstrap consistency we obtain \(P^*(\widehat{Q}^*_{p,m}\in [Q_p,v_{j+1}])\mathop {\longrightarrow }\limits ^{{\mathcal P}}1\) and \(P^*(\widehat{Q}^*_{p,m}=v_{j+1})\mathop {\longrightarrow }\limits ^{ {\mathcal P}}1/2\) or 0 if \(F(Q_p)=p\) and \(F(Q_p)>p\), respectively. Hence, \(\mathrm{cov}_L\mathop {\longrightarrow }\limits ^{{\mathcal P}}1\).

Concerning the coverage of the small set we obtain

$$\begin{aligned} \mathrm{cov}_S\mathop {\longrightarrow }\limits ^{ {\mathcal P}} \left\{ \begin{array}{l@{\quad }l} 1/2 \quad &{} \text {if } F(Q_p)=p\\ 0 \quad &{} \text {if } F(Q_p)>p \end{array}\right. . \end{aligned}$$

In particular, this implies that

$$\begin{aligned} p^*\mathop {\longrightarrow }\limits ^{ {\mathcal P}} \left\{ \begin{array}{l@{\quad }l} 1-2\alpha \quad &{} \text {if } F(Q_p)=p \\ 1-\alpha \quad &{} \text {if } F(Q_p)>p \end{array}\right. . \end{aligned}$$

From Theorems 6, 8 and 9, we get \(H(\widehat{Q}^*_{p,m})=Q_p\) with probability tending to one. Noting that the difference between both coverages is larger than 1/4 with probability tending to one, we obtain

$$\begin{aligned} P(Q_p\in \mathrm{CS})&=P(\widehat{Q}_p\in \mathrm{CS}_L\,1(Y\le p^*)+\mathrm{CS}_S\, 1(Y> p^*),\, \mathrm{cov}_L-\mathrm{cov}_S\ge 1/4)\\&\quad +o(1)\\&= E\bigg (p^* 1(\widehat{Q}_p\in \mathrm{CS}_L,\, \mathrm{cov}_L-\mathrm{cov}_S\ge 1/4)\\&\quad +(1-p^*) 1(\widehat{Q}_p\in \mathrm{CS}_S,\,\mathrm{cov}_L-\mathrm{cov}_S\ge 1/4)\bigg )\,+o(1)\\&=(1-\alpha )E\left( \frac{1}{\mathrm{cov}_L-\mathrm{cov}_S}1(\widehat{Q}_p\in \mathrm{CS}_L\backslash \mathrm{CS}_S,\,\mathrm{cov}_L-\mathrm{cov}_S\ge 1/4)\right) \\&\quad -E\left( \frac{\mathrm{cov}_S}{\mathrm{cov}_L-\mathrm{cov}_S}1(\widehat{Q}_p\in \mathrm{CS}_L,\,\mathrm{cov}_L-\mathrm{cov}_S\ge 1/4)\right) +o(1)\\&\quad +E\left( \frac{\mathrm{cov}_L}{\mathrm{cov}_L-\mathrm{cov}_S}1(\widehat{Q}_p\in \mathrm{CS}_S,\, \mathrm{cov}_L-\mathrm{cov}_S\ge 1/4)\right) +o(1)\\&=:P_1+P_2+P_3+o(1). \end{aligned}$$

Moreover, it holds

$$\begin{aligned} P\left( \widehat{Q}_p\in \mathrm{CS}_L\right) =P\left( \widehat{Q}_p\in \left[ F^{*-1}_{\widehat{Q}_{p,m}^*}(\alpha /2), \,F^{*-1}_{\widehat{Q}_{p,m}^*}(1-\alpha /2)\right] \right) \mathop {\longrightarrow }\limits _{n\rightarrow \infty }1, \end{aligned}$$

and

$$\begin{aligned} P\left( \widehat{Q}_p\in \mathrm{CS}_S\right)&=P\left( \widehat{Q}_p\in \left[ F^{*-1}_{\widehat{Q}_{p,m}^*}(\alpha /2), \,F^{*-1}_{\widehat{Q}_{p,m}^*}(1-\alpha /2)\right) \right) \\&\mathop {\longrightarrow }\limits _{n\rightarrow \infty }{\left\{ \begin{array}{ll} 1/2&{}\text {if } F(Q_p)=p\\ 0&{}\text {if } F(Q_p)>p \end{array}\right. }, \end{aligned}$$

and therefore

$$\begin{aligned} P\left( \widehat{Q}_p\in \mathrm{CS}_L\backslash \mathrm{CS}_S\right) \mathop {\longrightarrow }\limits _{n\rightarrow \infty }{\left\{ \begin{array}{ll} 1/2&{}\text {if } F(Q_p)=p\\ 1&{}\text {if } F(Q_p)>p \end{array}\right. }. \end{aligned}$$

Bringing all together, we get from Theorem 25.11 in Billingsley (1995)

$$\begin{aligned} P_1\mathop {\longrightarrow }\limits _{n\rightarrow \infty }1-\alpha ,\quad P_2\mathop {\longrightarrow }\limits _{n\rightarrow \infty }{\left\{ \begin{array}{ll} -1&{}\text {if } F(Q_p)=p\\ 0&{}\text {if } F(Q_p)>p \end{array}\right. },\quad \text {and}\quad P_3\mathop {\longrightarrow }\limits _{n\rightarrow \infty }{\left\{ \begin{array}{ll} 1&{}\text {if } F(Q_p)=p\\ 0&{}\text {if } F(Q_p)>p \end{array}\right. } \end{aligned}$$

since the random variables whose expectations we calculate in \(P_1,\dots , P_3\) are bounded by 4. This finally implies that \(\mathrm{CS}\) has asymptotically exact level \(1-\alpha \). \(\square \)

Proof of Theorem 14

The proofs of the first two cases \(p<F_\mathrm{mid}(v_1)\) and \(p>F_\mathrm{mid}(v_d)\) follow the same lines. They can be carried out in complete analogy to the proofs of Theorem 2, Case 1 and Case 2 in Ma et al. (2011) if we can show that

$$\begin{aligned} \frac{1}{n}\sum _{t=1}^n 1(X_t=v_k)\mathop {\longrightarrow }\limits ^{P}\;P(X_1=v_k),\quad k=1,\dots , d. \end{aligned}$$
(35)

This in turn follows from the WLLN that can be deduced from Theorem 20 noting that \(1(X_t= v_k)=1(X_t\le v_k)-1(X_t\le v_{k-1})\) for \(k=2,\dots , d\) and \(1(X_t=v_1)=1(X_t\le v_1)\).

If \(p=\lambda F_\mathrm{mid}(v_{k+1})+(1-\lambda )F_\mathrm{mid}(v_{k+2})\) such that \(\lambda \in (0,1)\), we can pursue the steps of the proof of Theorem 2, Case 3 in Ma et al. (2011) to get

$$\begin{aligned}&\sqrt{n}\left( \widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}}\right) \nonumber \\&=\sqrt{n}\,(v_{k+1}-v_{k+2}) \left[ \frac{\widehat{F}_\mathrm{mid}(v_{k+2})-p}{\widehat{F}_\mathrm{mid}(v_{k+2})-\widehat{F}_\mathrm{mid}(v_{k+1})}-\frac{F_\mathrm{mid}(v_{k+2})-p}{F_\mathrm{mid}(v_{k+2})-F_\mathrm{mid}(v_{k+1})}\right] . \end{aligned}$$
(36)

Now, asymptotics of (36) can be deduced easily using the \(\Delta \)-method, if we can show that

$$\begin{aligned} \frac{1}{\sqrt{n}}\left( Y_{1}+\cdots + Y_n \right) \mathop {\longrightarrow }\limits ^{{\mathcal D}}{\mathcal N}(0_d,\Sigma ), \end{aligned}$$
(37)

where \(Y_t=(1(X_t=v_1)-P(X_t=v_1),\dots , 1(X_t=v_d)-P(X_t=v_d))'\), \(t=1,\dots ,n\), and \(\Sigma =(\Sigma _{j_1,j_2})_{j_1,j_2=1,\dots , d}\). Now using the same representation of the indicator functions as in the first part of the proof, (37) follows from Theorem 20 and the continuous mapping theorem. To this end, note that \(\widehat{F}_\mathrm{mid}(v_k)=n^{-1}\sum _{t=1}^n\{\sum _{i=1}^k 1(X_t=v_i)-0.5\ 1(X_t=v_k)\}\) and similarly \(F_\mathrm{mid}(v_k)=\sum _{i=1}^ka_i-0.5\ a_k\).

The assertion for the case \(p=F_\mathrm{mid}(v_{k+1})\), \(k=1,\dots , d-2\) can be deduced from Theorem 20 in the same manner as in the proof of Theorem 2, Case 4 in Ma et al. (2011).

The proofs of the last two boundary cases \(p=F_\mathrm{mid}(v_1)\) and \(p=F_\mathrm{mid}(v_d)\) follow the same lines and we show only the first one. As \(\sqrt{n}(\widehat{F}_\mathrm{mid}(v_1)-F_\mathrm{mid}(v_1))=O_P(1)\) by Theorem 20, for sufficiently large \(n\), there is a \(\lambda _n\) such that \(0<\lambda _n<1\) and \(p=\lambda _n \widehat{F}_\mathrm{mid}(v_2)+(1-\lambda _n)\widehat{F}_\mathrm{mid}(v_1)\) if \(\widehat{F}_\mathrm{mid}(v_1)<p\). Then, from the definition of \(\widehat{Q}_{p,\mathrm{mid}}\), we get

$$\begin{aligned} \sqrt{n}(\widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})=\widetilde{Z}_n\frac{v_2-v_1}{\widehat{F}_\mathrm{mid}(v_2)-\widehat{F}_\mathrm{mid}(v_1)}1(0<\widetilde{Z}_n), \end{aligned}$$
(38)

where \(\widetilde{Z}_n=\sqrt{n}(p-\widehat{F}_\mathrm{mid}(v_1))=\sqrt{n}(F_\mathrm{mid}(v_1)-\widehat{F}_\mathrm{mid}(v_1))\). From (38), we get

$$\begin{aligned}&P\left( \sqrt{n}(\widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})\le x\right) \\&\quad ={\left\{ \begin{array}{ll}0, &{} x<0 \\ P(\widetilde{Z}_n\ge 0), &{} x=0 \\ P(\widetilde{Z}_n\ge 0)+P\left( \widetilde{Z}_n\frac{v_2-v_1}{\widehat{F}_\mathrm{mid}(v_2)-\widehat{F}_\mathrm{mid}(v_1)}\in (0,x]\right) , &{} x>0\end{array}\right. } \end{aligned}$$

and, again using Theorem 20, the cases on the last right-hand side converge to the claimed limiting distribution. \(\square \)

Proof of Theorem 17

To prove the first case of (29), let \(\widehat{a}_j^*=m^{-1}\sum _{t=1}^m 1(X_t^*=v_j)\), \(\widehat{a}_j=n^{-1}\sum _{t=1}^n 1(X_t=v_j)\), \(j=1,\dots , d,\) and \(\widehat{a}_0=\widehat{a}_{d+1}=\widehat{a}_0^*=\widehat{a}_{d+1}^*=0\). For sufficiently large \(n\), with probability tending to one and because of \(\sqrt{n}(\widehat{a}_j-a_j)=O_P(1)\) and \(\sqrt{m}(\widehat{a}_j^*-\widehat{a}_j)=O_{P^*}(1)\) due to Lemma 21 and Theorem 22, we can find a \(\lambda _m^*\) with \(0<\lambda _m^*<1\) such that \(p=\lambda _m^*\widehat{a}_0^*+(1-\lambda _m^*)\widehat{a}_1^*\). Consequently, from (28), we get

$$\begin{aligned} \widehat{Q}_{p,\mathrm{mid},m}^*=\lambda _m^*v_0+(1-\lambda _m^*)v_1=v_1=Q_{p,\mathrm{mid}} \end{aligned}$$

with probability tending to one as \(v_0=v_1\). By analogue arguments, we get also \(\widehat{Q}_{p,\mathrm{mid},m}^*=v_d=Q_{p,\mathrm{mid}}\) with probability tending to one if \(p>F_\mathrm{mid}(v_d)\).

Similarly, for the second case of (29), with probability tending to one, we can find a \(\lambda _m^*\) with \(0<\lambda _m^*<1\) such that \(p=\lambda _m^* \widehat{F}_{\mathrm{mid},m}^*(v_{k+1})+(1-\lambda _m^*)\widehat{F}_{\mathrm{mid},m}^*(v_{k+2})\). Similar to (36), this leads to

$$\begin{aligned}&\sqrt{m}(\widehat{Q}_{p,\mathrm{mid},m}^*-\widehat{Q}_{p,\mathrm{mid}}) \\&\quad = \sqrt{m}\,(v_{k+1}-v_{k+2})\left[ \frac{\widehat{F}_{\mathrm{mid},m}^*(v_{k+2})-p}{\widehat{F}_{\mathrm{mid},m}^*(v_{k+2})-\widehat{F}_{\mathrm{mid},m}^*(v_{k+1})}-\frac{\widehat{F}_\mathrm{mid}(v_{k+2})-p}{\widehat{F}_\mathrm{mid}(v_{k+2})-\widehat{F}_\mathrm{mid}(v_{k+1})}\right] \end{aligned}$$

which converges conditionally to the claimed normal distribution by Lemma 21 and Theorem 22 and by the \(\Delta \)-method similar to the proof of Theorem 14.

To prove (30), as \(\sqrt{m}(\widehat{F}_{\mathrm{mid},m}^*(v_{k+1})-\widehat{F}_\mathrm{mid}(v_{k+1}))=O_{P^*}(1)\) by Lemma 21 and Theorem 22, we get, similarly to the proof of Case 4 of Theorem 2 in Ma et al. (2011), that

$$\begin{aligned} \widehat{F}_\mathrm{mid}(v_{k+1})&= \widehat{F}_{\mathrm{mid},m}^*(v_{k+1})\\&\quad +1\big (\widehat{F}_{\mathrm{mid},m}^*(v_{k+1})\ge \widehat{F}_\mathrm{mid}(v_{k+1})\big )\,\lambda _{m1}^*\big (\widehat{F}_{\mathrm{mid},m}^*(v_{k})-\widehat{F}_{\mathrm{mid},m}^*(v_{k+1})\big ) \\&\quad +1\big (\widehat{F}_{\mathrm{mid},m}^*(v_{k+1})< \widehat{F}_\mathrm{mid}(v_{k+1})\big )\,\lambda _{m2}^*\big (\widehat{F}_{\mathrm{mid},m}^*(v_{k+2})-\widehat{F}_{\mathrm{mid},m}^*(v_{k+1})\big ) \end{aligned}$$

holds for some \(0\le \lambda _{m1}^*,\lambda _{m2}^*<1\). With \(\widetilde{Z}_m^*=\sqrt{m}(\widehat{F}_\mathrm{mid}(v_{k+1})-\widehat{F}_{\mathrm{mid},m}^*(v_{k+1}))\) and (28), this leads to

$$\begin{aligned} \widehat{Q}_{p,\mathrm{mid},m}^*&= 1(0\ge \widetilde{Z}_m^*)\left\{ \lambda _{m1}^*v_k+(1-\lambda _{m1}^*)v_{k+1}\right\} \\&\quad +1(0<\widetilde{Z}_m^*)\left\{ \lambda _{m2}^*v_{k+2}+(1-\lambda _{m2}^*)v_{k+1}\right\} \\&= v_{k+1}+\left( 1(0\ge \widetilde{Z}_m^*)\frac{v_{k+1}-v_k}{\widehat{F}_{\mathrm{mid},m}^*(v_{k+1})-\widehat{F}_{\mathrm{mid},m}^*(v_{k})}\right. \\&\qquad \qquad \quad \left. +1(0<\widetilde{Z}_m^*)\frac{v_{k+2}-v_{k+1}}{\widehat{F}_{\mathrm{mid},m}^*(v_{k+2})-\widehat{F}_{\mathrm{mid},m}^*(v_{k+1})}\right) \frac{\widetilde{Z}_m^*}{\sqrt{m}} \end{aligned}$$

and

$$\begin{aligned} \sqrt{m}(\widehat{Q}_{p,\mathrm{mid},m}^*-Q_{p,\mathrm{mid}})&= \left( 1(0\ge \widetilde{Z}_m^*)\frac{v_{k+1}-v_k}{\widehat{F}_{\mathrm{mid},m}^*(v_{k+1})-\widehat{F}_{\mathrm{mid},m}^*(v_{k})}\right. \\&\quad \quad \left. +1(0<\widetilde{Z}_m^*)\frac{v_{k+2}-v_{k+1}}{\widehat{F}_{\mathrm{mid},m}^*(v_{k+2})-\widehat{F}_{\mathrm{mid},m}^*(v_{k+1})}\right) \widetilde{Z}_m^*. \end{aligned}$$

Finally, we can show for all \(x\in \mathbb {R}\) that

$$\begin{aligned}&P^*\left( \sqrt{m}(\widehat{Q}_{p,\mathrm{mid},m}^*-Q_{p,\mathrm{mid}})\le x\right) \rightarrow \\&\quad {\left\{ \begin{array}{ll}0, &{} x<0,k=0 \\ P\left( \widetilde{Z}\le x\frac{F_\mathrm{mid}(v_{k+1})-F_\mathrm{mid}(v_{k})}{v_{k+1}-v_k}\right) , &{} x<0,k>0 \\ \frac{1}{2}, &{} x=0,k\in \{0,1,\ldots ,d-2\} \\ \frac{1}{2}+P\left( \widetilde{Z}\in (0,x\frac{F_\mathrm{mid}(v_{k+2})-F_\mathrm{mid}(v_{k+1})}{v_{k+2}-v_{k+1}}]\right) , &{} x>0,k<d-1 \\ 1, &{} x\ge 0,k=d-1 \end{array}\right. } \end{aligned}$$

in probability, where \(\widetilde{Z}\sim {\mathcal N}(0,\sigma _{\widetilde{Z}}^2)\) and

$$\begin{aligned} \sigma _{\widetilde{Z}}^2&= \sum _{h\in \mathbb {Z}} \mathrm{cov}\left( \sum _{j=1}^{k+1} 1(X_h=v_j)-0.5\cdot 1(X_h=v_{k+1}),\right. \\&\qquad \qquad \qquad \left. \sum _{j=1}^{k+1} 1(X_0=v_j)-0.5\cdot 1(X_0=v_{k+1})\right) \end{aligned}$$

can be obtained from Lemma 21 and Theorem 22, respectively. This completes the proof. \(\square \)

Proof of Corollary 18

First it follows from Theorem 17 and (31) that the distribution of \(\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})\) converges in probability to the same limit as the distribution of \(\sqrt{n} (\widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})\), i.e. either to zero or to one of the distributions of \(Z_1\) to \(Z_4\). To prove convergence of the corresponding distribution functions in the Kolmogorov–Smirnov metric, we treat the different cases separately. First, let \(p<F_\mathrm{mid}(v_1)\) (or \(p> F_\mathrm{mid}(v_d)\), which can be treated in the same manner, so that its proof is omitted). From Remark 15(i) we obtain

$$\begin{aligned}&\sup _{x\in \mathbb {R}}\left| P^*(\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})\le x)- P(\sqrt{n} (\widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})\le x)\right| \\&\quad \le \sup _{x< 0}\left| P^*(\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})\le x)- P(\sqrt{n} (\widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})\le x)\right| \\&\qquad + 1-P^*(\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})\le 0)+1-P(\sqrt{n} (\widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})\le 0)\\&\quad \le \lim _{x\uparrow 0}P^*(\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})\le x) + \lim _{x\uparrow 0} P(\sqrt{n} (\widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})\le x) + o_P(1)\\&\quad \le P^*(\widehat{Q}^*_{p,\mathrm{mid},m}<\widehat{Q}_{p,\mathrm{mid}}) + P(\widehat{Q}_{p,\mathrm{mid}}< Q_{p,\mathrm{mid}}) + o_P(1)\\&\quad =o_P(1). \end{aligned}$$

In the second and third case, i.e. when the limiting distribution is that of \(Z_1\) or \(Z_2\), Polya’s theorem can be applied to deduce convergence in the Kolmogorov–Smirnov metric from distributional convergence, since the limiting distribution function is continuous. It remains to consider the cases \(p=F_\mathrm{mid}(v_1)\) and \(p=F_\mathrm{mid}(v_d)\). Since they are similar again, we focus on the first setup. With the same arguments as in the proof of Polya’s theorem we get

$$\begin{aligned}&\sup _{x\in \mathbb {R}}\left| P^*(\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})\le x)- P(\sqrt{n} (\widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})\le x)\right| \\&\quad \le \sup _{x\le 0}\left| P^*(\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})\le x)- P(\sqrt{n} (\widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})\le x)\right| \\&\qquad + o_P(1). \end{aligned}$$

Now, we proceed similarly to the first case and, finally, we get

$$\begin{aligned}&\sup _{x\le 0}\left| P^*(\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}-\widehat{Q}_{p,\mathrm{mid}})\le x)- P(\sqrt{n} (\widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})\le x)\right| \\&\quad \le \left| P^*(\widehat{Q}^*_{p,\mathrm{mid},m}<\widehat{Q}_{p,\mathrm{mid}})- P(\widehat{Q}_{p,\mathrm{mid}}<Q_{p,\mathrm{mid}})\right| \\&\qquad +\!\frac{1}{2} \!-\! P^*(\sqrt{m} (\widehat{Q}^*_{p,\mathrm{mid},m}\!-\!\widehat{Q}_{p,\mathrm{mid}})\le 0)\!+\!\frac{1}{2}- P(\sqrt{n} (\widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})\le 0)\\&\quad = o_P(1). \end{aligned}$$

\(\square \)

Proof of Theorem 19

First, note that

$$\begin{aligned} P(Q_ {p,\mathrm{mid}}\in \mathrm{CI})&=P\Big (\sqrt{n}( \widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})\in \mathrm{CI}_{S,\mathrm{mid}}^{(r)}\cap \mathrm{CI}_{S,\mathrm{mid}}^{(l)}\\&\quad \qquad +\mathrm{CI}_{L,\mathrm{mid}}\backslash ( \mathrm{CI}_{S,\mathrm{mid}}^{(r)}\cap \mathrm{CI}_{S,\mathrm{mid}}^{(l)}) \;1(Y\le p^*_\mathrm{mid})\\&\qquad \quad +\mathrm{CI}_{L,\mathrm{mid}}\backslash \mathrm{CI}^{(l)}_{S,\mathrm{mid}} \;1(Y> p^*_\mathrm{mid}, \;\mathrm{cov}_{S,\mathrm{mid}}^{(r)}\le 1-\alpha )\\&\qquad \quad +\mathrm{CI}_{L,\mathrm{mid}}\backslash {\mathrm{CI}}_{S,\mathrm{mid}}^{(r)} \;1(Y> p^*_\mathrm{mid}, \;\mathrm{cov}_{S,\mathrm{mid}}^{(r)}> 1-\alpha )\Big ), \end{aligned}$$

where \(+\) above indicates the disjoint union. We consider the rhs in a case-by-case manner.

The cases \(p<F_\mathrm{mid}(v_1)\) and \(p>F_\mathrm{mid}(v_d)\) can be treated similarly and we only give the calculations for the first setup. Here, \(\mathrm{cov}_{L,\mathrm{mid}}\mathop {\longrightarrow }\limits ^{{\mathcal P}}1\) and \(\mathrm{cov}_{S,\mathrm{mid}}^{(r)}\mathop {\longrightarrow }\limits ^{{\mathcal P}} 0 \) which then implies that \(p^*_\mathrm{mid}\mathop {\longrightarrow }\limits ^{{\mathcal P}}1-\alpha \). Now the proof can be carried out in complete analogy to the proof of Theorem 11 (case of \(F(Q_p)>p\)).

Also the cases where \(Z_1\) and \(Z_2\) are the limiting variables have a similar structure which results from continuity of the corresponding limiting cdf’s. Here, we get

$$\begin{aligned} P(Q_{p,\mathrm{mid}}\in \mathrm{CI})&=P\left( \sqrt{n}( \widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})\in \mathrm{CI}_{S,\mathrm{mid}}^{(r)}\cap \mathrm{CI}_{S,\mathrm{mid}}^{(l)}\right) + o_P(1) \end{aligned}$$

where the latter probability tends to \(1-\alpha \) as \(n\rightarrow \infty \).

Next we consider the case \(p=F_\mathrm{mid}(v_1)\). Since \(\alpha \) is assumed to be less than \(1/2\) and the limiting distribution has a normal density on the positive half line, \(\mathrm{cov}_{S,\mathrm{mid}}^{(r)}\mathop {\longrightarrow }\limits ^{{\mathcal P}}1-\alpha /2\) which in turn implies

$$\begin{aligned} P(Q_{p,\mathrm{mid}}\in \mathrm{CI})&=P\Big (\sqrt{n}( \widehat{Q}_{p,\mathrm{mid}}-Q_{p,\mathrm{mid}})\in \mathrm{CI}_{S,\mathrm{mid}}^{(r)}\cap \mathrm{CI}^{(l)}_{S,\mathrm{mid}}\\&\quad \qquad +\mathrm{CI}_{L,\mathrm{mid}}\backslash ( \mathrm{CI}_{S,\mathrm{mid}}^{(r)}\cap \mathrm{CI}^{(l)}_{S,\mathrm{mid}}) \;1(Y\le p^*_\mathrm{mid})\\&\quad \qquad +\mathrm{CI}_{L,\mathrm{mid}}\backslash \mathrm{CI}^{(r)}_{S,\mathrm{mid}} \;1(Y> p^*_\mathrm{mid}, \;\mathrm{cov}_{S,\mathrm{mid}}^{(r)}> 1-\alpha )\Big ) + o(1). \end{aligned}$$

Since \(p_\mathrm{mid}^*\mathop {\longrightarrow }\limits ^{{\mathcal P}} 1-\alpha \), it can be shown in analogy to the proof of Theorem 11 that \(P(Q_{p,\mathrm{mid}}\in \mathrm{CI})\mathop {\longrightarrow }\limits _{n\rightarrow \infty }1-\alpha \). It remains to investigate the case \(p=F_\mathrm{mid}(v_d)\). Here, \(\mathrm{cov}_{L,\mathrm{mid}}\mathop {\longrightarrow }\limits ^{{\mathcal P}}1-\alpha /2\) and \(\mathrm{cov}_{S,\mathrm{mid}}^{(r)}\mathop {\longrightarrow }\limits ^{{\mathcal P}}1/2 -\alpha /2\) which then implies that \(p_\mathrm{mid}^*\mathop {\longrightarrow }\limits ^{{\mathcal P}} 1-\alpha \). The desired result follows with the same arguments as before. \(\square \)

5.2 Auxiliary results

Theorem 20

(CLT under \(\tau \)-dependence) Suppose that \((X_t)_{t\in \mathbb {Z}}\) is a \(\tau \)-dependent process with \(\sum _{h=0}^\infty \tau (h)<\infty \). Then for all \(x_1,\dots , x_D\in \mathbb {R}, \;D\in \mathbb {N},\)

$$\begin{aligned} \frac{1}{\sqrt{n}} \sum _{t=1}^n (1(X_t\le x_1)-F(x_1),\dots , 1(X_t\le x_D)-F(x_D))'\mathop {\longrightarrow }\limits ^{\mathcal {D}} {\mathcal N}(\underline{0},\mathbf{W}) \end{aligned}$$

with

$$\begin{aligned} \mathbf{W}=\left( \sum _{h\in \mathbb {Z}} \mathrm{cov} (1(X_h\le x_{j_1}),1(X_0\le x_{j_2}))\right) _{j_1,j_2=1,\dots ,D}. \end{aligned}$$

Proof

We apply the multivariate central limit theorem for weakly dependent data of Leucht and Neumann (2013, Theorem 6.1). To this end, we check its prerequisites with \(Z_t:=(1(X_t\le x_1)-F(x_1),\dots , 1(X_t\le x_D)-F(x_D))'/\sqrt{n}\). Obviously, these variables are centered and \(\sum _{t=1}^n E\Vert Z_t\Vert _2^2<\infty \). Also the Lindeberg condition clearly holds true by stationarity and boundedness of the underlying process \((X_t)_{t\in \mathbb {Z}}\). Next we have to show that

$$\begin{aligned} \left[ \mathrm{cov}\left( \sum _{t=1}^n Z_t\right) \right] _{j_1,j_2}\mathop {\longrightarrow }\limits _{n\rightarrow \infty }W_{j_1,j_2}. \end{aligned}$$

We consider the component-wise absolute difference between both terms

$$\begin{aligned}&\left| \frac{1}{n}\sum _{s,t=1}^n \mathrm{cov}(1(X_s\le x_{j_1}),1(X_t\le x_{j_2}))-\sum _{h\in \mathbb {Z}}\mathrm{cov}(1(X_h\le x_{j_1}),1(X_0\le x_{j_2}))\right| \\&\quad \le \sum _{h\in \mathbb {Z}} \min \left\{ \frac{|h|}{n},1\right\} |{\text {cov}} (1(X_h\le x_{j_1}),1(X_0\le x_{j_2}))| \end{aligned}$$

which converges to zero by the dominated convergence theorem if \(\sum _{h\in \mathbb {Z}} |\mathrm{cov} (1(X_h\le x_{j_1}),1(X_0\le x_{j_2}))|<\infty \). This in turn can be deduced from the presumed summability of the \(\tau \)-coefficients if \(|\mathrm{cov} (1(X_h\le x_{j_1}),1(X_0\le x_{j_2}))|\le \mathrm{const.}\, \tau (h)\). To see this, first note that for any \( \nu <\min _{k}\{v_{k+1}-v_k\}\) and for \(v_{k}\le x< v_{k+1}\)

$$\begin{aligned} 1({X_1\le x})=1({X_1\le v_k})=1(X_1\le v_k+\nu )-\frac{X_1-v_k}{\nu } 1(v_k\le X_1\le v_k+\nu )\quad \mathrm{a.s.} \end{aligned}$$

where the rhs is a Lipschitz continuous function in \(X_1\). Now we use coupling arguments to obtain an upper bound for the absolute values of the covariances under consideration when \(h>0\). The case \(h<0\) can be treated similarly and is therefore omitted. Let \(\widetilde{X}_{h}\) denote a copy of \(X_h\) that is independent of \(X_0\) and such that \(E|\widetilde{X}_{h}-X_h|\le \tau (h)\). With \(x_{j_1} \in [v_k,v_{k+1})\) for a suitable \(k\), we obtain

$$\begin{aligned}&|\mathrm{cov} (1(X_h\le x_{j_1}),1(X_0\le x_{j_2}))|\nonumber \\&\quad \le E\left| 1(X_h\le x_{j_1})-1(\widetilde{X}_h\le x_{j_1})\right| \nonumber \\&\quad \le E\left| 1(X_h\le v_k+\nu )-\frac{X_h-v_k}{\nu } 1(v_k\le X_h\le v_k+\nu )-1(\widetilde{X}_h\le v_k+\nu )\right. \nonumber \\&\qquad \left. +\frac{\widetilde{X}_h-v_k}{\nu } 1(v_k\le \widetilde{X}_h\le v_k+\nu )\right| \nonumber \\&\quad \le \frac{1}{\nu } E|\widetilde{X}_{h}-X_h|\nonumber \\&\quad \le \frac{\tau (h)}{\nu }. \end{aligned}$$
(39)

Finally, we have to check two conditions of weak dependence. Let \(g:\mathbb {R}^{Du}\rightarrow \mathbb {R}\) be a measurable function with \(\Vert g\Vert _\infty \le 1\) and \(1\le s_1<s_2<\dots <s_u<s_u+h=t_1\le t_2\in \mathbb {N}\). Again, in analogy to (39), we obtain

$$\begin{aligned} \mathrm{cov}(g(Z_{s_1},\dots , Z_{s_u})Z_{s_u,j_1},Z_{t_1,j_2})\le \frac{1}{\nu \,n}\,\tau (t_1-s_u), \end{aligned}$$

which implies condition (6.27) with \(\theta _h=\tau (h)/\nu \) in Leucht and Neumann (2013). Validity of their condition (6.28) follows from

$$\begin{aligned} \mathrm{cov}(g(Z_{s_1},\dots , Z_{s_u}),Z_{t_1,j_1}\,Z_{t_2,j_2})\le \frac{4}{\nu \,n}\tau (t_1-s_u), \end{aligned}$$

which completes the proof of the multivariate CLT. \(\square \)

Lemma 21

(Bootstrap analogue to Theorem 20 for i.i.d. data) Suppose that \((X_t)_{t\in \mathbb {Z}}\) is a sequence of i.i.d. random variables. Let \(X_1^*,\dots , X_m^*\) be drawn independently from \(\widehat{F}_n\). Suppose that \(m\rightarrow \infty \) and \(m=o(n)\) or \(m=n\). Then, for all \(x_1,\dots , x_D\in \mathbb {R}, \;D\in \mathbb {N},\)

$$\begin{aligned} \frac{1}{\sqrt{m}} \sum _{t=1}^m (1(X_t^*\le x_1)-\widehat{F}_n(x_1),\dots , 1(X_t^*\le x_D)-\widehat{F}_n(x_D))'\mathop {\longrightarrow }\limits ^{D}{\mathcal N}(\underline{0},\mathbf{W}) \end{aligned}$$

in probability, where

$$\begin{aligned} \mathbf{W}=\big ( \mathrm{cov} (1(X_0\le x_{j_1}),1(X_0\le x_{j_2}))\big )_{j_1,j_2=1,\dots ,D}. \end{aligned}$$

Proof

This is an immediate consequence of Theorem 2.2 in Bickel and Freedman (1981). \(\square \)

Theorem 22

(Block bootstrap analogue to Theorem 20) Suppose that the assumptions of Theorem 20 hold true and that \(\sum _{h=1}^\infty h\,\tau (h)<\infty \). Let \(X_1^*,\dots , X_m^*\) be an \(m\)-out-of-\(n\) block bootstrap sample. Suppose that \(l/m+1/l=o(1)\) as well as \(m=o(n)\) or \(m=n\) as \(n\rightarrow \infty \). Then, for all \(x_1,\dots , x_D\in \mathbb {R}, \;D\in \mathbb {N}\),

$$\begin{aligned} \frac{1}{\sqrt{m}} \sum _{k=1}^m (1(X_k^*\le x_1)-\widehat{F}_n(x_1),\dots , 1(X_k^*\le x_D)-\widehat{F}_n(x_D))'\mathop {\longrightarrow }\limits ^{D}{\mathcal N}(\underline{0},\mathbf{W}) \end{aligned}$$

in probability, where

$$\begin{aligned} \mathbf {W}=\left( \sum _{h\in \mathbb {Z}} \mathrm{cov}\left( 1(X_h\le x_{j_1}),1(X_0\le x_{j_2})\right) \right) _{j_1,j_2=1,\ldots ,D}. \end{aligned}$$

Proof

For notational convenience, we suppose \(m=lb\) and let us introduce the notation

$$\begin{aligned} Z_k^*&= \frac{1}{\sqrt{m}}\left( 1(X^*_k\le x_1)-\widehat{F}_n(x_1),\ldots ,1(X^*_k\le x_D)-\widehat{F}_n(x_D)\right) ^\prime , \\ \widetilde{Z}_k^*&= \frac{1}{\sqrt{m}}\left( 1(X^*_k\le x_1)-E^*(1(X^*_k\le x_1)),\ldots ,1(X^*_k\le x_D)\!-\!E^*(1(X^*_k\le x_D))\right) ^\prime , \end{aligned}$$

such that it suffices to show \(\sum _{k=1}^m (Z_k^*-\widetilde{Z}_k^*)=o_{P^*}(1)\) and \(\sum _{k=1}^m \widetilde{Z}_k^*\mathop {\longrightarrow }\limits ^{D}{\mathcal N}(\underline{0},\mathbf{W})\) in probability. Considering the first part component-wise, for all \(j\), we get

$$\begin{aligned} \sum _{k=1}^m (Z_k^*-\widetilde{Z}_k^*)_j&= \frac{1}{\sqrt{m}}\sum _{k=1}^m \left( E^*(1(X^*_k\le x_j))-\widehat{F}_n(x_j)\right) \\&= \sqrt{m}\left( \frac{1}{n-l+1}\sum _{t=1}^n 1(X_t\le x_j)-\frac{1}{n}\sum _{t=1}^n 1(X_t\le x_j)\right) \\&+\frac{\sqrt{m}}{n-l+1}\sum _{t=1}^{l-1} \frac{t-l}{l}1(X_t\le x_j)\\&+\frac{\sqrt{m}}{n-l+1}\sum _{t=n-l+2}^n \frac{n-l+1-t}{l}1(X_t\le x_j) \\&= A_1+A_2+A_3. \end{aligned}$$

Taking the unconditional expectation of the last right-hand side gives zero, so that it suffices to show \(A_i-E(A_i)=o_P(1)\) for \(i=1,2,3\). For the first term, we get from Theorem 20 that

$$\begin{aligned} A_1-E(A_1)&= \frac{\sqrt{m}}{\sqrt{n}} \frac{l-1}{n-l+1}\left( \frac{1}{\sqrt{n}}\sum _{t=1}^n \big (1(X_t\le x_j)-E(1(X_t\le x_j))\big )\right) \\&= O_P\left( \frac{\sqrt{m}}{\sqrt{n}} \frac{l}{n}\right) \end{aligned}$$

vanishes as \(l=o(m)\) by assumption. For the second term, we obtain

$$\begin{aligned} \mathrm{var}(A_2)&= \frac{m}{(n-l+1)^2}\sum _{t_1,t_2=1}^{l-1} \frac{(t_1-l)(t_2-l)}{l^2}\mathrm{cov}(1(X_{t_1}\le x_j),1(X_{t_2}\le x_j)) \\&\le \frac{ml}{(n-l+1)^2}\sum _{h=-(l-2)}^{l-2} \frac{1}{l}\sum _{t=\max (1,1-h)}^{\min (l-1,l-1-h)}\left| \frac{(h+t-l)(t-l)}{l^2}\right| \\&\quad \cdot |\mathrm{cov}(1(X_{h+t}\le x_j),1(X_{t}\le x_j))| \\&= O\left( \frac{ml}{n^2}\right) \end{aligned}$$

since the covariances are summable by \(\sum _{h=1}^\infty \tau (h)<\infty \); see also (39) for details. Hence, \(A_2\) vanishes under the same conditions on \(l\) and \(m\) as for the term \(A_1\) above. The arguments for \(A_3\) are completely analogous and we omit the details. To prove the (conditional) CLT along the lines of Section 4.2.2 in Wieczorek (2014) for

$$\begin{aligned} \sum _{k=1}^m \widetilde{Z}_k^*=\sum _{r=1}^b\left( \sum _{s=(r-1)l+1}^{rl} \widetilde{Z}_s^*\right) =: \sum _{r=1}^b \widetilde{Y}_r^*, \end{aligned}$$

observe that \(\{\widetilde{Y}_r^*,r=1,\ldots ,b\}\) forms a triangular array of (conditionally) i.i.d.  random variables with \(E^*(\widetilde{Y}_r^*)=0\) by construction. Further, for [the \((j_1,j_2)\)-component of] the conditional covariance, we have

$$\begin{aligned} \left[ \mathrm{cov}^*\left( \sum _{r=1}^b\widetilde{Y}_r^*\right) \right] _{j_1,j_2}&= \sum _{r=1}^b \sum _{s_1,s_2=(r-1)l+1}^{rl} \mathrm{cov}^*\left( \widetilde{Z}_{s_1,j_1}^*,\widetilde{Z}_{s_2,j_2}^*\right) \\&= \frac{1}{l}\sum _{s_1,s_2=1}^l \mathrm{cov}^*(1(X_{s_1}^*\le x_{j_1}),1(X_{s_2}^*\le x_{j_2})) \\&= \frac{1}{l}\sum _{s_1,s_2=1}^l\left( \frac{1}{n-l+1}\sum _{t=0}^{n-l} 1(X_{t+s_1}\le x_{j_1})1(X_{t+s_2}\le x_{j_2})\right) \\&\quad -\frac{1}{l}\left( \frac{1}{n-l+1}\sum _{t_1=0}^{n-l}\sum _{s_1=1}^l 1(X_{t_1+s_1}\le x_{j_1})\right) \\&\quad \times \left( \frac{1}{n-l+1}\sum _{t_2=0}^{n-l}\sum _{s_2=1}^l 1(X_{t_2+s_2}\le x_{j_2})\right) \\&= \frac{1}{l}\sum _{s_1,s_2=1}^l\left( \frac{1}{n-l+1}\sum _{t=0}^{n-l}T_{t+s_1,j_1}T_{t+s_2,j_2}\right) \\&\quad -\left( \frac{1}{\sqrt{l}\, (n-l+1)}\sum _{t_1=0}^{n-l}\sum _{s_1=1}^l T_{t_1+s_1,j_1}\right) \\&\quad \times \left( \frac{1}{\sqrt{l}\, (n-l+1)}\sum _{t_2=0}^{n-l}\sum _{s_2=1}^l T_{t_2+s_2,j_2}\right) \\&=: I_1-I_2\cdot I_3, \end{aligned}$$

where we have set \(T_{t,j}=1(X_t\le x_j)-P(X_t\le x_j)\). The terms \(I_2\) and \(I_3\) behave similarly and we only consider \(I_2\). Since \(E I_2=0\), we show \(I_2=o_P(1)\) by proving that its variance vanishes asymptotically. We get

$$\begin{aligned} \mathrm{var}(I_2)&\le \frac{1}{n-l+1}\sum _{h_1=-(n-l)}^{n-l}\sum _{h_2=-(l-1)}^{l-1}\left( \frac{n-l+1-|h_1|}{n-l+1}\right) \left( \frac{l-|h_2|}{l}\right) \\&\quad \times \left| \mathrm{cov}(1(X_{h_1+h_2}\le x_{j_1}),1(X_{0}\le x_{j_1}))\right| \end{aligned}$$

which is of order \(O(l/n)\) by (39).

By taking unconditional expectation of the first term \(I_1\), we obtain

$$\begin{aligned} E(I_1)=\sum _{h=-(l-1)}^{l-1} \frac{l-|h|}{l} \mathrm{cov}(1(X_h\le x_{j_1}),1(X_0\le x_{j_2})) \end{aligned}$$

and, by dominated convergence, the latter tends to \(W_{j_1,j_2}\) as desired. Hence, it remains to show that \(\mathrm{var}(I_1)=o(1)\) holds. By rewriting the arising covariances in terms of cumulants, we get

$$\begin{aligned} \mathrm{var}(I_1)&= \frac{1}{l^2(n-l+1)^2}\sum _{s_1,s_2,s_3,s_4=1}^l\sum _{t_1,t_2=0}^{n-l}\Big ( E\left( T_{t_1+s_1,j_1}T_{t_2+s_3,j_1}\right) E\left( T_{t_1+s_2,j_2}T_{t_2+s_4,j_2}\right) \\&\quad +E\left( T_{t_1+s_1,j_1}T_{t_2+s_4,j_2}\right) E\left( T_{t_1+s_2,j_2}T_{t_2+s_3,j_1}\right) \\&\quad +\mathrm{cum}\left( T_{t_1+s_1,j_1},T_{t_1+s_2,j_2},T_{t_2+s_3,j_1},T_{t_2+s_4,j_2}\right) \Big ), \end{aligned}$$

where we have used that \(\mathrm{cum}(A,B,C,D)=E(ABCD)-E(AB)E(CD)-E(AC)E(BD)-E(AD)E(BC)\) holds for centered random variables \(A,B,C,D\). As \(E\left( T_{t_1+s_1,j_1}T_{t_2+s_3,j_1}\right) =\mathrm{cov}(1(X_{t_1+s_1}\le x_{j_1}),1(X_{t_2+s_3}\le x_{j_1}))\le C\tau (|t_1+s_1-t_2-s_3|)\) by the covariance inequality (39), the first and second summands on the right-hand side above can be shown to be of order \(O(l/n)\).

Next, we establish an upper bound for \(|\mathrm{cum}(T_{t_1,j_1},T_{t_2,j_2},T_{t_3,j_3},T_{t_4,j_4})|\), where we assume w.l.o.g. that \(t_1\le \cdots \le t_4\). Let \(R=\max \{t_4-t_3,t_3-t_2,t_2-t_1\}\). We consider each of the three possible values of \(R\) separately. First, suppose that \(R=t_4-t_3\). Then, using the same coupling techniques as in the proof of Theorem 20, we get similarly to (39)

$$\begin{aligned} |\mathrm{cum}(T_{t_1,j_1},T_{t_2,j_2},T_{t_3,j_3},T_{t_4,j_4})|&\le C\tau (R)\,[1+|ET_{t_1,j_1}T_{t_2,j_2}|+|ET_{t_1,j_1}T_{t_3,j_3}|\nonumber \\&\quad +|ET_{t_2,j_2}T_{t_3,j_3}|]\nonumber \\&\le 4C \,\tau (R) \end{aligned}$$
(40)

with some finite constant \(C\) since \(\Vert T_{t_l,j_l}\Vert _\infty \le 1\). If \(R=t_3-t_2\), we obtain

$$\begin{aligned} |\mathrm{cum}(T_{t_1,j_1},T_{t_2,j_2},T_{t_3,j_3},T_{t_4,j_4})|&\le |\mathrm{cov}(T_{t_1,j_1}T_{t_2,j_2},T_{t_3,j_3}T_{t_4,j_4})|\nonumber \\&\quad +C\, \tau (R)\,[|ET_{t_2,j_2}T_{t_4,j_4}|+|ET_{t_1,j_1}T_{t_4,j_4}|]\nonumber \\&\le 4C\, \tau (R). \end{aligned}$$
(41)
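
Here we have used that, for centered random variables, the cumulant identity above can be rearranged to

$$\begin{aligned} \mathrm{cum}(A,B,C,D)=\mathrm{cov}(AB,CD)-E(AC)E(BD)-E(AD)E(BC). \end{aligned}$$

Since \(R=t_3-t_2\) separates the pair \((T_{t_1,j_1},T_{t_2,j_2})\) from \((T_{t_3,j_3},T_{t_4,j_4})\), the covariance inequality (39) bounds \(|\mathrm{cov}(T_{t_1,j_1}T_{t_2,j_2},T_{t_3,j_3}T_{t_4,j_4})|\) as well as \(|E(T_{t_1,j_1}T_{t_3,j_3})|\) and \(|E(T_{t_2,j_2}T_{t_3,j_3})|\) by multiples of \(\tau (R)\), which produces the two bracketed terms.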

Finally, in case of \(R=t_2-t_1\), the cumulant can be bounded as follows:

$$\begin{aligned} |\mathrm{cum}(T_{t_1,j_1},T_{t_2,j_2},T_{t_3,j_3},T_{t_4,j_4})|&\le 3C\,\tau (R)+C\, \tau (R)\,[|ET_{t_3,j_3}T_{t_4,j_4}| \nonumber \\&\quad +\, |ET_{t_2,j_2}T_{t_4,j_4}| \,+\, |ET_{t_2,j_2}T_{t_3,j_3}|]\nonumber \\&\le 6C\,\tau (R). \end{aligned}$$
(42)
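
In this last case, the gap \(R=t_2-t_1\) separates the singleton \(\{t_1\}\) from \(\{t_2,t_3,t_4\}\); since \(E(T_{t_1,j_1})=0\), the cumulant identity now takes the form

$$\begin{aligned} \mathrm{cum}(A,B,C,D)=\mathrm{cov}(A,BCD)-E(AB)E(CD)-E(AC)E(BD)-E(AD)E(BC), \end{aligned}$$

and each of \(|E(AB)|,|E(AC)|,|E(AD)|\le C\tau (R)\) by (39), which yields the three bracketed terms, while the covariance term accounts for the leading \(3C\,\tau (R)\).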

To sum up, we obtain

$$\begin{aligned} \mathrm{var}(I_1)&\le \frac{1}{l^2(n-l+1)^2}\sum _{s_1,s_2,s_3,s_4=1}^l\sum _{t_1,t_2=0}^{n-l}|\mathrm{cum}\left( T_{t_1+s_1,j_1},T_{t_1+s_2,j_2},T_{t_2+s_3,j_1},T_{t_2+s_4,j_2}\right) |+o(1)\\&\le \frac{6C\,l}{n} \sum _{h=1}^{n-l}h\,\tau (h) +o(1), \end{aligned}$$

which vanishes asymptotically since we assumed \(\sum _{h=1}^\infty h\, \tau (h)<\infty \).

To complete the proof of the bootstrap CLT, it remains to verify the Lindeberg condition in order to apply (a multivariate version of) the Lindeberg–Feller CLT for independent triangular arrays. That is, as \(\mathrm{cov}^*(\sum _{r=1}^b\widetilde{Y}_r^*)=O_P(1)\) holds by the calculations above, it remains to show, for all \(\epsilon >0\), that

$$\begin{aligned} \sum _{r=1}^b E^*\left( \Vert \widetilde{Y}_r^*\Vert _2^2\,1(\Vert \widetilde{Y}_r^*\Vert _2\ge \epsilon )\right) =b\, E^*\left( \Vert \widetilde{Y}_1^*\Vert _2^2\,1(\Vert \widetilde{Y}_1^*\Vert _2\ge \epsilon )\right) =o_P(1) \end{aligned}$$

as \(\{\widetilde{Y}_r^*,r=1,\ldots ,b\}\) forms a triangular array of (conditionally) i.i.d.  random variables. Computing the conditional expectation leads to

$$\begin{aligned} b\, E^*\left( \Vert \widetilde{Y}_1^*\Vert _2^2\,1(\Vert \widetilde{Y}_1^*\Vert _2\ge \epsilon )\right)&= \frac{b}{n-l+1}\sum _{t=0}^{n-l}\left\| \sum _{s=1}^l \widetilde{Z}_{s+t}\right\| _2^2\,1\left( \left\| \sum _{s=1}^l \widetilde{Z}_{s+t}\right\| _2\ge \epsilon \right) ,\nonumber \\ \end{aligned}$$
(43)

where

$$\begin{aligned} \widetilde{Z}_{t+s}&= \frac{1}{\sqrt{m}}\big (1(X_{t+s}\le x_1)-E^*(1(X_s^*\le x_1)),\ldots ,1(X_{t+s}\le x_D)\\&\quad -E^*(1(X_s^*\le x_D))\big )^\prime \end{aligned}$$

with \(E^*(1(X_s^*\le x_j))=\frac{1}{n-l+1}\sum _{t_1=0}^{n-l} 1(X_{t_1+s}\le x_j)\) for \(j=1,\ldots ,D\). Now, we want to replace \(\Vert \sum _{s=1}^l \widetilde{Z}_{s+t}\Vert _2^2\) by \(\Vert \sum _{s=1}^l Z_{s+t}\Vert _2^2\), where

$$\begin{aligned} Z_{t+s}=\frac{1}{\sqrt{m}}\big (1(X_{t+s}\le x_1)-F( x_1),\ldots ,1(X_{t+s}\le x_D)-F( x_D)\big )^\prime , \end{aligned}$$

which leads to the upper bound

$$\begin{aligned}&\frac{2b}{n-l+1}\sum _{t=0}^{n-l}\left\| \sum _{s=1}^l Z_{s+t}\right\| _2^2\,1\left( \left\| \sum _{s=1}^l \widetilde{Z}_{s+t}\right\| _2\ge \epsilon \right) \\&\quad +\frac{2b}{n-l+1}\sum _{t=0}^{n-l}\left\| \sum _{s=1}^l (\widetilde{Z}_{s+t}-Z_{s+t})\right\| _2^2\,1\left( \left\| \sum _{s=1}^l \widetilde{Z}_{s+t}\right\| _2\ge \epsilon \right) \\&=: II_1+II_2 \end{aligned}$$

for (43). Considering the second summand above, it is straightforward to show that

$$\begin{aligned} \sum _{s=1}^l (Z_{s+t}-\widetilde{Z}_{s+t})=\frac{1}{\sqrt{m}(n-l+1)}\sum _{s=1}^l\sum _{t_1=0}^{n-l}(T_{t_1+s,1},\ldots , T_{t_1+s,D})' \end{aligned}$$

which is independent of \(t\), such that, with \(T_t=(T_{t,1},\ldots ,T_{t,D})'\),

$$\begin{aligned} II_2\le \frac{2b}{m}\left\| \frac{1}{n-l+1}\sum _{s=1}^l\sum _{t_1=0}^{n-l}T_{t_1+s}\right\| _2^2=O_P\left( \frac{l}{n}\right) \end{aligned}$$

by the same arguments as used before to treat \(I_2\). Concerning \(II_1\), as all summands are non-negative, it suffices to show \(E|II_1|=E(II_1)=o(1)\). From stationarity and an application of the Cauchy–Schwarz inequality, we get

$$\begin{aligned} \left( E(II_{1})\right) ^{2}&= \left( E\left( 2b\left\| \sum _{s=1}^l Z_s\right\| _2^2\,1\left( \Vert \sum _{s=1}^l \widetilde{Z}_s\Vert _2\ge \epsilon \right) \right) \right) ^{2}\\&\le 4b^2E\left( \left\| \sum _{s=1}^l Z_s\right\| _2^4\right) P\left( \left\| \sum _{s=1}^l \widetilde{Z}_s\right\| _2\ge \epsilon \right) . \end{aligned}$$
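
The second factor above can be made explicit via Markov's inequality: since \(E(\Vert \sum _{s=1}^l \widetilde{Z}_s\Vert _2^2)=O(l/m)\), we have

$$\begin{aligned} P\left( \left\| \sum _{s=1}^l \widetilde{Z}_s\right\| _2\ge \epsilon \right) \le \frac{1}{\epsilon ^2}\,E\left( \left\| \sum _{s=1}^l \widetilde{Z}_s\right\| _2^2\right) =O\left( \frac{l}{m}\right) =o(1),
\end{aligned}$$

as \(l=o(m)\) by assumption.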

Hence the second factor vanishes and, since \(\Vert v\Vert _2^4\le D\sum _{j=1}^D v_j^4\) for any \(v\in \mathbb {R}^D\), it remains to show that \(b^2E\big (\big (\sum _{s=1}^l Z_{s,j}\big )^4\big )=O(1)\) for all \(j\). Rewriting things in terms of cumulants, we get

$$\begin{aligned} b^2E\left( \Big (\sum _{s=1}^l Z_{s,j}\Big )^4\right)&= b^2\sum _{s_1,s_2,s_3,s_4=1}^l E(Z_{s_1,j}Z_{s_2,j}Z_{s_3,j}Z_{s_4,j}) \\&= 3\left( \frac{1}{l}\sum _{s_1,s_2=1}^l \mathrm{cov}(1(X_{s_1}\le x_j),1(X_{s_2}\le x_j))\right) ^2 \\&\quad +\frac{1}{l^2}\sum _{s_1,s_2,s_3,s_4=1}^l \mathrm{cum}\left( T_{s_1,j},T_{s_2,j},T_{s_3,j},T_{s_4,j}\right) \end{aligned}$$

as higher-order cumulants are invariant under shifts. The first summand on the last right-hand side is uniformly bounded. The second summand is also of order \(O(1)\) by (40)–(42) and \(\sum _{h=1}^\infty h\,\tau (h)<\infty \). \(\square \)