Abstract
We employ general empirical process methods to establish, under mild regularity conditions, moderate deviation principles for kernel-type function estimators based on an infinite-dimensional covariate. In doing so, we introduce a moderate deviation principle for a function-indexed process, built on intricate exponential contiguity arguments. The primary objective of this paper is to contribute to the functional data analysis literature by establishing functional moderate deviation principles for both the Nadaraya–Watson and the conditional distribution processes. These principles serve as fundamental tools for analyzing the behavior of such processes in the functional data setting. By extending the scope of moderate deviation principles to functional data analysis, we sharpen the understanding of the statistical properties and limitations of kernel-type function estimators with infinite-dimensional covariates, thereby contributing to the advancement of statistical methodology in this area.
1 INTRODUCTION
Regression analysis has proved to be a flexible tool and provides a powerful statistical modeling framework in a variety of applied and theoretical contexts in which one seeks to model the predictive relationship between responses and predictors. Parametric regression models provide useful tools for analyzing practical data when the models are correctly specified, but may suffer from large modeling biases when the model structure is misspecified, as is the case in many practical problems. As an alternative, nonparametric smoothing methods ease these concerns about modeling bias. Kernel methods are a popular choice among the many approaches to constructing good function estimators, which also include nearest-neighbor, spline, and wavelet methods. These methods have been applied to a wide variety of data. In the present paper, we focus on constructing consistent kernel-type estimators. For good sources of references to the research literature in this area, along with statistical applications, consult [20, 26, 27, 30, 31, 35, 39, 51, 52, 57, 76, 78, 106, 108, 124, 126, 128, 130, 136, 138] and the references therein.
Recently, increasing interest has been given to regression models in which the response variable is real-valued and the explanatory variable takes the form of smooth functions that vary randomly between repeated observations or measurements. Statistical problems related to functional random variables, that is, variables taking values in an infinite-dimensional space, have attracted growing interest in the statistics literature over the last decades. The development of this research theme is motivated by the abundance of data measured on increasingly fine temporal/spatial grids, as is the case, for instance, in meteorology, medicine, satellite imagery, and many other research areas. The statistical modeling of these data, viewed as random functions, has thus led to several challenging theoretical and numerical research questions. For an overview of theoretical as well as practical aspects of functional data analysis, the reader can refer to the monographs of [14] for linear models for random variables taking values in a Hilbert space, and [120] for scalar-on-function and function-on-function linear models, functional principal component analysis, and parametric discriminant analysis. Ferraty and Vieu [62], by contrast, focused on nonparametric methods, mainly kernel-type estimation for scalar-on-function nonlinear regression models, and extended such tools to classification and discriminant analysis. Horváth and Kokoszka [82] discussed the generalization of several important statistical concepts, such as goodness-of-fit tests, portmanteau tests, and change detection, to the functional data framework. For the latest contributions in FDA and related topics, one can refer to [1–4, 18, 25, 37, 54, 93, 103, 139]. In various scenarios, there is keen interest in gauging the rate at which specific probabilities converge; frequently, these probabilities exhibit rapid exponential convergence.
Numerous researchers have explored large deviations and uncovered various applications, primarily within mathematical physics. More precisely, large deviation results are used across a wide range of problems in probability and statistics. A primary statistical application is the evaluation of efficiency through the comparison of test procedures: one identifies the most efficient procedure as the one requiring the least data to achieve predefined performance levels, typically expressed in terms of the risk of the first kind, test power, and alternative hypotheses. For further information, refer to [112]. Additional applications arise in the assessment of estimation techniques, in which the error rates associated with each method are computed and compared. It is important to observe that large deviation results can serve as effective tools for establishing the consistency of estimators and their convergence rates. Valuable resources on large deviations include [9, 49, 50, 132]. Extending beyond the conventional results of weak and strong convergence in regression analysis, the problem of functional moderate deviations introduces new challenges that these existing frameworks cannot readily address. There is an extensive large and moderate deviation literature spanning many areas of probability and statistics; we refer to the book [49] and the references therein for an account of large deviation results and applications. In the nonparametric function estimation setting, several results have been obtained in recent years. We refer to [114], where the Nadaraya–Watson and histogram estimates of the regression function are studied, both in the real vector case.
Louani and Ould Maouloud [95] established a large deviation principle (LDP, in short) for a vector process, which allows one to derive LDPs for the kernel density and regression function estimators via the contraction principle. Further results for multivariate regression estimates are due to [105], where large together with moderate deviation principles are stated for the Nadaraya–Watson estimator as well as for the semi-recursive kernel estimator. Large deviation results for the kernel regression function estimate with a functional covariate are obtained in [95]. For more references we refer to [10–12, 42, 60, 66, 75, 84–86, 88, 104, 117, 119, 123, 125, 133].
The selection of the kernel function in our setup is mostly unconstrained, apart from certain mild requirements specified subsequently. The choice of bandwidth, however, presents a greater challenge. It is important to acknowledge that the bandwidth plays a critical role in achieving a good rate of consistency; specifically, it significantly impacts the magnitude of the bias of the estimate. Broadly speaking, our focus lies in determining a bandwidth that yields an estimator with a favorable trade-off between the bias and the variance of the estimators under consideration. It is more suitable to let the bandwidth vary with the applied criteria, the available data, and the location, which cannot be achieved by conventional methods. For further elaboration and analysis, readers are encouraged to consult [96]. The main aim of the present paper is to establish large deviation and moderate deviation results for functional data uniformly in the bandwidth. The uniform-in-bandwidth problem has attracted great attention; we refer to [15, 17, 21, 22, 25, 27, 29, 33, 34, 43, 47, 53, 59, 96, 98, 99]. To effectively tackle the challenges posed by functional moderate deviations, novel methodologies and theoretical frameworks must be developed. Advanced statistical techniques, such as empirical process theory, offer promising avenues for addressing these challenges. By extending our focus beyond conventional regression models and embracing the complexities of functional moderate deviations, we can enhance our understanding of the behavior and limitations of kernel estimators. This, in turn, provides valuable insights into the underlying processes and patterns within functional data.
The layout of the article is as follows. Section 2 gives the notation and the definitions we need. Section 3 shows the moderate deviation principle, which is equivalent to the moderate deviation principle of the finite-dimensional distributions, given in Theorem 3.1, plus an exponential asymptotic equicontinuity condition with respect to a pseudometric, given in Theorem 3.2. Section 4 provides applications of our main results, including the kernel regression function estimate in Subsection 4.1, the kernel conditional distribution function in Subsection 4.2, the kernel quantile regression in Subsection 4.3, the kernel conditional density function in Subsection 4.4, and finally the kernel conditional copula function in Subsection 4.5. We discuss a bandwidth choice for practical use in Section 5. A summary of the findings, highlighting remaining open issues, may be found in Section 6. All proofs are deferred to Section 7. Due to the lengthiness of the proofs, we limit ourselves to the most important arguments. Finally, a few relevant technical results are given in Appendix A.
2 THE GENERAL PROCESS
We consider a sequence \(\{\left(X_{i},Y_{i}\right):i\geq 1\}\) of i.i.d. random copies of the random element (rv) \((X,Y)\), where \(X\) takes its values in some abstract space \(\mathcal{E}\) and \(Y\) is an \(\mathbb{R}^{q}\)-valued random variable, \(q\geq 1,\) with density \(g(\cdot)\) with respect to the Lebesgue measure on \(\mathbb{R}^{q}\). Suppose that \(\mathcal{E}\) is endowed with a semi-metric \(d(\cdot,\cdot)\) defining a topology to measure the proximity between two elements of \(\mathcal{E}\), and which is disconnected from the definition of \(X\) to avoid measurability problems. This covers the case of semi-normed spaces of possibly infinite dimension (e.g., Hilbert or Banach spaces). We will consider especially the conditional expectation of \(l(Y)\) given \(X=x\),
whenever this regression function is meaningful. Here and elsewhere, \(l(\cdot)\) denotes a specified measurable function from \(\mathbb{R}^{q}\) to \(\mathbb{R}\), which is assumed to be bounded on each compact subset of \(\mathbb{R}^{q}\). The general Nadaraya–Watson [107, 137] type estimator of \(r^{l}(\cdot)\) was introduced by [61] for \(l(y)=y\). It is defined, for any fixed \(x\in\mathcal{E}\), by
where \(K(\cdot)\) is a real-valued kernel function, \(h\) is the bandwidth parameter and, for \(k=1,2\) and
where \(\Delta_{i}(x,h)=K(h^{-1}d(x,X_{i}))\). Notice that taking \(l_{A}(y)=\mathbb{1}_{\{y\in A\}}\) in the statement (2.1), where \(A\) is a subset of \(\mathbb{R}\), we obtain the well-known kernel estimator \(\hat{\mu}(A|x)\) of the conditional empirical measure
Properties of \(\hat{\mu}(A|x)\), whenever \(A=(-\infty,t]\) for \(t\in\mathbb{R}\) and \(x\in\mathbb{R}\), have been investigated by several authors, among whom we cite [16, 17, 121, 122, 129]. In the functional data case, see [18, 65, 103].
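The estimator just described can be sketched numerically. The following minimal Python sketch is ours, not the paper's: it assumes curves sampled on a common grid, an \(L_{2}\)-type choice for the abstract semi-metric \(d\), a particular kernel satisfying (A1), and the standard ratio form \(\sum_{i}l(Y_{i})\Delta_{i}(x,h)/\sum_{i}\Delta_{i}(x,h)\) of Nadaraya–Watson-type estimators.

```python
import math

def semi_metric(x1, x2):
    """L2-type semi-metric between two curves sampled on a common grid
    (a common choice in FDA; the paper only assumes an abstract d)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)) / len(x1))

def quadratic_kernel(t):
    """A kernel supported on [0,1] with K(1) > 0 and K' <= 0 on (0,1),
    as required by (A1); this particular choice is illustrative."""
    return 0.75 * (1.0 - 0.5 * t * t) if 0.0 <= t <= 1.0 else 0.0

def nw_estimate(x, X, Y, h, l=lambda y: y, K=quadratic_kernel, d=semi_metric):
    """Nadaraya-Watson-type estimate of E[l(Y) | X = x]:
    sum_i l(Y_i) K(d(x, X_i)/h) / sum_i K(d(x, X_i)/h)."""
    weights = [K(d(x, xi) / h) for xi in X]
    denom = sum(weights)
    if denom == 0.0:
        raise ValueError("empty neighbourhood: increase the bandwidth h")
    return sum(w * l(yi) for w, yi in zip(weights, Y)) / denom
```

With curves \(X_{i}(t)=a_{i}t\) and responses \(Y_{i}=a_{i}\), the estimate at the curve \(t\mapsto 0.5\,t\) recovers \(0.5\) by symmetry of the kernel weights.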
The purpose of this paper is to establish some general moderate deviation results which allow us to derive, under mild regularity conditions, as a by-product, the uniform functional moderate deviation principle for the kernel \(l\)-indexed regression function estimator, whenever \(l(\cdot)\) belongs to an appropriate class \(\mathcal{L}\). Towards this end, consider two real continuous functions \(c_{l}(\cdot)\) and \(d_{l}(\cdot)\) from \(\mathcal{E}\) to \(\mathbb{R}\), and define the following process. For any \(x\in\mathcal{E}\) and \(z=(l,\varrho)\in\mathcal{L}\times\mathcal{H}_{0}\), where \(\mathcal{H}_{0}=[\vartheta_{1},\vartheta_{2}]\) and \(0<\vartheta_{1}<\vartheta_{2}<\infty\), set (assuming that this expression is meaningful)
In what follows, we first establish a functional uniform moderate deviation principle for the process \(\{W_{n}(x,z):z\in\mathcal{L}\times\mathcal{H}_{0}\}\), with a fixed \(x\in\mathcal{J}\), where \(\mathcal{J}\) denotes a suitable subset of \(\mathcal{E}\). Subsequently, through the utilization of exponential contiguity arguments, we deduce the corresponding moderate deviation principle for the regression estimate \(\widehat{r}_{n}^{l}(\cdot)\), encompassing the kernel distribution estimator. To provide more clarity, we present a corollary that delineates the behavior of the conditional distribution kernel estimator under scenarios of moderate deviations. This is complemented by the introduction of innovative applications of our main findings, as detailed in Section 4. Finally, we establish the moderate deviation principle for
3 MAIN RESULTS
We will impose the following set of assumptions for our main results.
(A1) \(K(\cdot)\) is a nonnegative bounded kernel with support \([0,1]\) and \(K(1)>0\). The derivative \(K^{\prime}(\cdot)\) of \(K(\cdot)\) exists on \([0,1]\), is bounded, and satisfies \(K^{\prime}(t)\leq 0\) for all \(t\in(0,1)\).
(A2) For each \(x\in\mathcal{E}\) and a real number \(v\), there exist a nonnegative functional \(f_{v}(\cdot)\), a function \(g_{x,v}(\cdot)\) and a nonnegative real function \(\phi(\cdot)\) tending to zero, as its argument tends to \(0\), such that,
(i) \(F_{x}^{v}(u):=\mathbb{P}(d(x,X_{1})\leq u|Y=v)=\phi(u)f_{v}(x)+g_{x,v}(u)\) with, uniformly in \(v\), \(g_{x,v}(u)=o(\phi(u))\) as \(u\rightarrow 0\) and \(g_{x,v}(u)/\phi(u)\) is almost surely bounded;
(ii) there exists a nondecreasing bounded function \(\tau_{0}(u)\) such that, uniformly in \(u\in[0,1]\),
(A3) Let \(h=h_{n}\) and \(w_{n}\) be sequences of positive numbers such that, as \(n\rightarrow\infty\),
(A4) For any real numbers \(a\) and \(b\), and any \((x,l)\in\mathcal{E}\times\mathcal{L}\),
Discussion of Assumptions
Condition (A1) is very usual in the nonparametric estimation literature devoted to the functional data context. Notice that the symmetric kernel of [115] is not adequate in this context, since the random variable \(d\left(x,X_{i}\right)\) is positive; we therefore consider \(K(\cdot)\) with support \([0,1]\). This is a natural generalization of the assumption usually made on the kernel in the multivariate case, where \(K(\cdot)\) is supposed to be a spherically symmetric density function. Because Lebesgue measure does not exist on an infinite-dimensional space, assumption (A2) involves the small ball techniques related to the fractal dimension used in this paper; see for instance [100], who in turn was inspired by [68] in his nonparametric density estimation under functional observations. From [64], one can cite:
1. in the case when \(\mathcal{E}=\mathbb{R}^{d}\), \(\mathbb{P}(d(x,X_{1})\leq u|Y=v)\approx C(d)u^{d}f_{v}(x)\), where \(C(d)\) is the volume of the unit ball in \(\mathbb{R}^{d}\);
2. \(\mathbb{P}(d(x,X_{1})\leq u|Y=v)\approx f_{v}(x)u^{\gamma}\) for some \(\gamma>0\), then \(\tau_{0}(u)=u^{\gamma}\);
3. \(\mathbb{P}(d(x,X_{1})\leq u|Y=v)\approx f_{v}(x)u^{\gamma}\exp\left\{-c/u^{\kappa}\right\}\) for some \(\gamma>0\) and \(\kappa>0\), then \(\tau_{0}(u)=\delta_{1}(u)\), where \(\delta_{1}(\cdot)\) is the Dirac mass at \(1\).
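Case 1 above can be checked numerically in the simplest setting \(\mathcal{E}=\mathbb{R}\) with \(d(x,x^{\prime})=|x-x^{\prime}|\), where \(C(1)=2\) is the length of the unit ball: for small \(u\), \(\mathbb{P}(|X-x|\leq u)\approx 2u\,f(x)\). The Monte Carlo sketch below is purely illustrative (the distributional choice is ours, and the conditioning on \(Y\) is dropped for simplicity).

```python
import math
import random

random.seed(12345)

def normal_density(t):
    """Standard normal density, playing the role of f_v in case 1 with d = 1."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

# Small ball probability P(|X - x| <= u) for X ~ N(0, 1): for small u it is
# approximately C(1) * u * f(x), with C(1) = 2 the length of the unit ball in R.
x, u, n = 0.0, 0.1, 200_000
sample = [random.gauss(0.0, 1.0) for _ in range(n)]
small_ball = sum(abs(s - x) <= u for s in sample) / n
approx = 2.0 * u * normal_density(x)
```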
Masry [100] explains that if \(\mathcal{E}=\mathbb{R}\), then the condition coincides with the fundamental axioms of probability calculus; furthermore, if \(\mathcal{E}\) is an infinite-dimensional Hilbert space, then \(\phi(h)\) can decrease toward \(0\) at an exponential speed as \(n\to\infty\). (A2)(ii) shows that the small ball probability can be written, approximately, as the product of two independent functions; refer to [101] for the diffusion process, [13] for a Gaussian measure, and [91] for a general Gaussian process, while [100] employed these assumptions for strongly mixing processes. For example, the function \(\phi(\cdot)\) can be expressed as \(\phi(\epsilon)=\epsilon^{\delta}\exp(-C/\epsilon^{a})\) with \(\delta\geq 0\) and \(a\geq 0\); this corresponds to the Ornstein–Uhlenbeck and general diffusion processes (for such processes, \(a=2\) and \(\delta=0\)) and to the fractal processes (for such processes, \(\delta>0\) and \(a=0\)). This class of processes also satisfies condition (A2). For other examples, we refer to [24, 28, 64, 128]. Since \(n\phi\left(h\right)\rightarrow\infty\), suppose \(\phi\left(h\right)=n^{-c}\) for some \(0<c<1\). Then Condition (A3) is satisfied provided \(w_{n}=n^{\gamma}\), for \(0<\gamma<(1-c)/2\). Assumption (A4) is imposed to ensure the needed finiteness and differentiability properties of the finite-dimensional moment generating function associated with the process \(\{W_{n}(x,z):z\in\mathcal{L}\times\mathcal{H}_{0}\}\). All these general assumptions are sufficiently weak relative to the different objects involved in the statement of our main results. They cover and exploit the principal axes of this contribution, namely the topological structure of the functional variables and the probability measure on this functional space.
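The parametrization just discussed makes the moderate deviation speed explicit: with \(\phi(h)=n^{-c}\) and \(w_{n}=n^{\gamma}\), the speed is \(n\phi(h)/w_{n}^{2}=n^{1-c-2\gamma}\), which diverges exactly when \(\gamma<(1-c)/2\). A small numerical sanity check of this arithmetic (the particular values of \(c\) and \(\gamma\) are illustrative):

```python
def mdp_speed(n, c, gamma):
    """Speed n * phi(h) / w_n**2 with phi(h) = n**(-c) and w_n = n**gamma,
    i.e. n**(1 - c - 2*gamma); it diverges iff gamma < (1 - c) / 2."""
    phi_h = n ** (-c)
    w_n = n ** gamma
    return n * phi_h / w_n ** 2

c, gamma = 0.5, 0.2                      # gamma < (1 - c)/2 = 0.25
speeds = [mdp_speed(n, c, gamma) for n in (10**3, 10**5, 10**7)]
assert speeds == sorted(speeds)          # the speed increases with n
assert mdp_speed(10**7, c, 0.3) < 1.0    # gamma > 0.25: the speed collapses
```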
From now on, consider the following covariance function defined, for any \(z_{1}:=(l_{1},\varrho_{1})\in\mathcal{L}\times\mathcal{H}_{0}\), \(z_{2}:=(l_{2},\varrho_{2})\in\mathcal{L}\times\mathcal{H}_{0}\) and \(\gamma\in\mathcal{H}_{0}\),
where
and
with \(\varrho=\min(\varrho_{1},\varrho_{2})\) and
where \(\mathbb{1}_{A}\) denotes the indicator function of \(A\). Note that \(\tau(a,b)=(\tau(b,a))^{-1}\), which implies \(R_{\gamma}(x,z_{1},z_{2})=R_{\gamma}(x,z_{2},z_{1})\). Let \(\{\Xi_{\gamma}(x,z):\ z\in\mathcal{L}\times\mathcal{H}_{0}\}\) be a mean-zero Gaussian process such that, for any \((z_{1},z_{2})\in\big{(}\mathcal{L}\times\mathcal{H}_{0}\big{)}^{2}\),
Let \(\mathcal{Z}_{x,\gamma}\) be the closed linear subspace of the space \(L_{2}\), generated by
Define the function \(\varphi:\mathcal{Z}_{x,\gamma}\rightarrow l_{\infty}\big{(}\mathcal{L}\times\mathcal{H}_{0}\big{)}\) by
Note that the reproducing kernel Hilbert space associated with the covariance function \(R_{\gamma}(x,z_{1},z_{2})\) is the Hilbert space \(\{\varphi(\xi):\xi\in\mathcal{Z}_{x,\gamma}\}\) equipped with the inner product
For any \((z_{1},\ldots,z_{m})\in\big{(}\mathcal{L}\times\mathcal{H}_{0}\big{)}^{m}\) and any \(\lambda_{1},\ldots,\lambda_{m}\in\mathbb{R}\), set
The following theorem gives a finite-dimensional moderate deviation principle for the process \(\{W_{n}(x,z):z\in\mathcal{L}\times\mathcal{H}_{0}\}\). Let, for any \(x\in\mathcal{E}\),
Theorem 3.1. Assume that assumptions (A1)–(A4) are fulfilled, that \(f(x)>0\), and that \(\tau_{0}(\frac{\vartheta_{1}}{\vartheta_{2}})>0\). Then the random sequence \(w_{n}(W_{n}(x,z_{1}),\ldots,W_{n}(x,z_{m}))\) satisfies an LDP with the speed \(n\phi(\gamma h)/w_{n}^{2}\) \((\gamma\in\mathcal{H}_{0})\) and the good rate function \(\Gamma^{\gamma}_{x,z_{1},\ldots,z_{m}}(\cdot)\) defined in \((3.3)\), where \(z_{i}=(l_{i},\varrho_{i})\) for \(i=1,\ldots,m\).
The proof of Theorem 3.1 is postponed until Section 7.
Remark 3.1. For any \(z:=(l,\varrho)\in\mathcal{L}\times\mathcal{H}_{0}\), it follows from Theorem 3.1 that the random sequence \(w_{n}W_{n}(x,z)\) satisfies an LDP with the speed \(n\phi(\gamma h)/w_{n}^{2}\) \((\gamma\in\mathcal{H}_{0})\) and the good rate function \(\Gamma_{x,z}^{\gamma}(\cdot)\) given by
where
Remark 3.2. If we suppose that the derivative of \(\tau_{0}(\cdot)\) exists, then, using the facts that \(\tau_{0}(0)=0\) and \(\tau_{0}(1)=1\) and integrating by parts, we obtain
which gives a simpler form of the rate function. We also observe, whenever \(K(u)=\mathbb{1}_{[0,1]}(u)\), that
In the sequel, we investigate the functional moderate deviation principle of the process
in the space \(l_{\infty}\big{(}\mathcal{L}\times\mathcal{H}_{0}\big{)}\) equipped with the uniform topology.
Let \(L(\cdot)\) denote the finite-valued measurable envelope function of the class \(\mathcal{L}\) of measurable functions on \(\mathbb{R}\), that is,
Define
where the supremum is taken over all probability measures \(Q\) on \(\mathbb{R}\), for
and \(d_{Q}\) is the \(L_{2}(Q)\)-metric. As usual, \(N(\epsilon,\mathcal{L},d_{Q})\) is the minimal number of balls \(\{l:d_{Q}(l,l^{\prime})<\epsilon\}\) of \(d_{Q}\)-radius \(\epsilon\) needed to cover \(\mathcal{L}\). To formulate our functional moderate deviation principle, we consider some additional conditions.
(B1) (i) For some \(C^{\prime}>0\) and \(\nu_{2}>0\),
(ii) \(\mathcal{L}\) is a pointwise measurable class, that is, there exists a countable subclass \(\mathcal{L}_{0}\) of \(\mathcal{L}\) such that, for any function \(l\in\mathcal{L}\), we can find a sequence of functions \(\{l_{n}\}\) in \(\mathcal{L}_{0}\) for which
(B2) The class
satisfies the Condition (B1).
(B3) Uniformly in \(v\in\mathbb{R}\), the function \(f_{v}(\cdot)\) is continuous and strictly positive on \(\mathcal{J}\).
As in [48], let us denote by \(\{\mathcal{M}(x):x\geqslant 0\}\) a continuous, increasing and non-negative function fulfilling, for some \(q>2\), ultimately as \(x\uparrow\infty\),
where ‘\(\uparrow\)’ (resp. ‘\(\downarrow\)’) stands for non-decreasing (resp. non-increasing). For each \(t\geqslant\mathcal{M}(0)\), we denote by \(\mathcal{M}^{\text{inv}}(t)\) the uniquely defined non-negative number such that \(\mathcal{M}\left(\mathcal{M}^{\mathrm{inv}}(t)\right)=t.\) The following choices of \(\mathcal{M}(\cdot)\) are of particular interest:
(i) \(\mathcal{M}(x)=x^{p}\) for some \(p>2\);
(ii) \(\mathcal{M}(x)=\exp(sx)\) for some \(s>0\).
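Both choices of \(\mathcal{M}(\cdot)\) admit explicit inverses, \(\mathcal{M}^{\mathrm{inv}}(t)=t^{1/p}\) and \(\mathcal{M}^{\mathrm{inv}}(t)=\log(t)/s\) respectively, and the defining identity \(\mathcal{M}(\mathcal{M}^{\mathrm{inv}}(t))=t\) can be checked directly. A minimal sketch (the function names are ours):

```python
import math

def make_power(p):
    """Choice (i): M(x) = x**p for some p > 2, with inverse t**(1/p)."""
    assert p > 2
    return (lambda x: x ** p), (lambda t: t ** (1.0 / p))

def make_exponential(s):
    """Choice (ii): M(x) = exp(s*x) for some s > 0, with inverse log(t)/s,
    defined for t >= M(0) = 1."""
    assert s > 0
    return (lambda x: math.exp(s * x)), (lambda t: math.log(t) / s)

# Verify M(M_inv(t)) = t for both choices on a few test points.
for M, M_inv in (make_power(3.0), make_exponential(0.5)):
    for t in (1.0, 2.0, 10.0):
        assert abs(M(M_inv(t)) - t) < 1e-9
```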
(B4) (i) For some \(t>\max(s,1)\),
(ii)
(iii)
(iv)
\(\mathbf{{(B}^{\prime}}\mathbf{4)}\)
(i) There exists a constant \(L_{0}>0\) such that \(L(Y)\mathbb{1}_{\{X\in\mathcal{J}\}}\leq L_{0}\) a.s.;
(ii)
Comments on Additional Hypotheses
For Assumption (B1)(i), [116, Examples 26 and 38], [113, Lemma 22], [55, Subsection 4.7], [131, Theorem 2.6.7], and [89, Subsection 9.1] provide a number of sufficient conditions under which (B1)(i) holds; we may also refer to [46, 15, 16, 27, Subsection 3.2] for further discussion. For instance, it is satisfied, for general \(d\geq 1\), whenever \(l(\mathbf{x})=\Psi(p(\mathbf{x}))\), where \(p(\mathbf{x})\) is either a polynomial in \(d\) variables or the \(\alpha\)th power of the absolute value of a real polynomial for some \(\alpha>0\), and \(\Psi(\cdot)\) is a real-valued function of bounded variation; this covers commonly used kernels, such as the Gaussian, Epanečnikov, and uniform kernels, and we refer the reader to [59, p. 1381]. We also mention that Condition (B1)(i) is satisfied whenever the class of functions consists of functions of bounded variation on \(\mathbb{R}^{q}\) in the sense of Hardy and Kauser ([80, 90, 134]); see, e.g., [44, 81, 111, 135]. Assumption (B1)(ii) is made to avoid measurability difficulties. Our definition of ‘‘pointwise measurability’’ is borrowed from Example 2.3.4 in [131]; [72, p. 262] calls a pointwise measurable class one satisfying the pointwise countable approximation property. This condition is discussed in [131, Example 2.3.4, p. 110] and [89, Subsection 8.2, p. 110], and it is satisfied whenever \(l(\cdot)\) is right continuous. Assumption (B1)(i) ensures that \(\mathcal{L}\) is of VC type with characteristics \(C^{\prime}\) and \(\nu_{2}\). Condition (B2) in [73] is formulated as follows:
(B\(\mathbf{{}^{\prime}}\)2) \(K(\cdot)>0\) is a bounded and compactly supported measurable function that belongs to the linear span (the set of finite linear combinations) of functions \(k(\cdot)\geq 0\) satisfying the following property: the subgraph of \(k(\cdot)\), \(\{(s,u):k(s)\geq u\}\), can be represented as a finite number of Boolean operations among sets of the form
where \(p\) is a polynomial on \(\mathbb{R}\times\mathbb{R}\) and \(\varphi\) is an arbitrary real function.
Indeed, for a fixed polynomial \(p\), the family of sets
is contained in the family of positivity sets of a finite-dimensional space of functions, and then the entropy bound follows by Theorems 4.2.1 and 4.2.4 in [55]. Our results are demonstrated through an examination of the bounded scenario, where the Condition (B\({}^{\prime}\)4) is imposed, and the unbounded scenarios, which are explored under the Condition (B4). To establish the result of Theorem 3.2, we may relax and replace the Assumption (B4)(iv) by the following assumption:
A function \(\pi:T\rightarrow T\) is said to be a finite partition function of a set \(T\) if, for each \(t\in T\), \(\pi(\pi(t))=\pi(t)\) and the cardinality of \(\{\pi(t):t\in T\}\) is finite. Let \(\pi(T)=\{t_{1},\ldots,t_{m}\}\) and \(A_{j}=\{t\in T:\pi(t)=t_{j}\}\) for \(1\leq j\leq m\); then \(\{A_{1},\ldots,A_{m}\}\) is a partition of the set \(T\). Note that finite partition functions can be used to characterize the compactness of \(l_{\infty}(T)\). A set \(B\) of \(l_{\infty}(T)\) is compact if and only if it is closed and bounded and if for each \(\tau>0\) there exists a finite partition function \(\pi:T\rightarrow T\) such that
for instance, see [56, Theorem IV.5.6] or [6, p. 573]. We also have that if \(B\) is a compact set of \(l_{\infty}(T)\), then \(B\) is a set of uniformly bounded and uniformly equicontinuous functions in the pseudo-metric space \((T,d_{B})\), where
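A minimal concrete instance of a finite partition function, taking \(T=[0,1)\) and \(\pi\) mapping each point to the midpoint of its cell, illustrates the idempotence \(\pi(\pi(t))=\pi(t)\) and the finiteness of \(\{\pi(t):t\in T\}\) (the construction is ours, for illustration only):

```python
def make_partition_function(m):
    """Finite partition function on T = [0, 1): pi maps t to the midpoint of
    the cell [j/m, (j+1)/m) containing it, so pi(pi(t)) = pi(t) and the
    range {pi(t)} has exactly m points, inducing the partition {A_1,...,A_m}."""
    def pi(t):
        j = min(int(t * m), m - 1)
        return (j + 0.5) / m
    return pi

pi = make_partition_function(4)
for t in (0.0, 0.1, 0.3, 0.49, 0.75, 0.999):
    assert pi(pi(t)) == pi(t)                           # idempotence
assert len({pi(i / 1000) for i in range(1000)}) == 4    # finite range of size m
```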
From now on, for any real-valued function \(\psi\) defined on a set \(T\), we use the notation
For future use, let us introduce two classes of continuous and bounded functions on \(\mathcal{J}\) indexed by \(\mathcal{L}\):
We shall always assume that the classes \(\mathcal{C}\) and \(\mathcal{D}\) are compact with respect to the uniform topology. Let us define
For any \(x\in\mathcal{J}\), the moderate deviation principle for the process \(\{W_{n}(x,z):z\in\mathcal{L}\times\mathcal{H}_{0}\}\) in the space \(l_{\infty}\big{(}\mathcal{L}\times\mathcal{H}_{0}\big{)}\) is presented in the following theorem.
Theorem 3.2. Assume that Assumptions (A1)–(A4), (B1)–(B3), and (B4) or (B\({}^{\prime}\)4) hold true, and \(\tau_{0}(\displaystyle\frac{\vartheta_{1}}{\vartheta_{2}})>0\). Furthermore, consider the classes of continuous functions \(\mathcal{C}\) and \(\mathcal{D}\) given above. Then we have, for any \(\gamma\in\mathcal{H}_{0}\) and any \(x\in\mathcal{J}\),
(i) for any \(0<c<\infty\), \(\{\psi\in l_{\infty}(\mathcal{L}\times\mathcal{H}_{0}):\,I_{x}^{\gamma}(\psi)\leq c\}\) is a compact set of \(l_{\infty}(\mathcal{L}\times\mathcal{H}_{0})\);
(ii) for all open subsets \(O\) of \(l_{\infty}(\mathcal{L}\times\mathcal{H}_{0})\),
and for all closed subsets \(F\) of \(l_{\infty}(\mathcal{L}\times\mathcal{H}_{0})\),
where
The proof of Theorem 3.2 is postponed until Section 7.
Remark 3.3. Making use of similar arguments as in the paper [8, p. 5], it follows, whenever \({\sup_{z\in\mathcal{L}\times\mathcal{H}_{0}}R_{\gamma}(x,z,z)>0}\), that for any \(\lambda\geq 0\),
and, whenever \(\sup_{z\in\mathcal{L}\times\mathcal{H}_{0}}R_{\gamma}(x,z,z)=0\),
Therefore, by Theorem 3.2, for any \(\lambda\geq 0\), we have
The uniform moderate deviation principle on \(\mathcal{J}\times\mathcal{L}\times\mathcal{H}_{0}\) is presented in the following theorem. Towards this end, for any \(\epsilon>0\), consider the number
which is the minimal number of open balls of \(d\)-radius \(\epsilon\) needed to cover the subset \(\mathcal{J}\). We assume that \(\mathcal{J}\) satisfies the following property.
(J) For any \(\epsilon>0\), there exist \(C>0\) and \(\nu>0\) such that \(\mathcal{N}(\epsilon,\mathcal{J},d)\leq C\epsilon^{-\nu}.\)
Theorem 3.3. If assumptions of Theorem 3.2 and assumption (J) hold, then for any \(\lambda>0\),
where
The proof of Theorem 3.3 is postponed until Section 7.
Remark 3.4. As in the bootstrap of [97] (see also [19, 36, 127] for recent references), and following [46], we introduce an auxiliary i.i.d. sequence \(Z_{1},Z_{2},\ldots\) of real-valued rv’s distributed as \(Z\), independent of \(\left\{\left({X}_{i},{Y}_{i}\right):i\geq 1\right\}\), and such that
(R1) \(\mathbb{E}(Z)=1\); \(\mathbb{E}\left(Z^{2}\right)=2;\)
(R2) for some \(\epsilon>0,\) \(\mathbb{E}\left(e^{tZ}\right)<\infty\) for all \(|t|\leq\epsilon\).
Setting \(T_{n}=Z_{1}+\cdots+Z_{n}\), we define the weights \(\left\{\mathfrak{W}_{i,n}:1\leq i\leq n\right\}\) by setting, for \(i=1,\ldots,n\),
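A standard distribution satisfying (R1) and (R2) is \(Z\sim\mathrm{Exp}(1)\): \(\mathbb{E}(Z)=1\), \(\mathbb{E}(Z^{2})=\mathrm{Var}(Z)+(\mathbb{E}Z)^{2}=2\), and \(\mathbb{E}(e^{tZ})=1/(1-t)<\infty\) for \(|t|\leq\epsilon\) with any \(\epsilon<1\). The sketch below checks (R1) by Monte Carlo and forms \(T_{n}\); the normalization \(Z_{i}/T_{n}\) shown is the Bayesian-bootstrap choice and is used purely for illustration, since the paper's own display defines \(\mathfrak{W}_{i,n}\).

```python
import random

random.seed(7)

# Z ~ Exp(1) satisfies (R1): E(Z) = 1 and E(Z^2) = Var(Z) + (E Z)^2 = 1 + 1 = 2,
# and (R2): E(exp(t*Z)) = 1/(1 - t) < infinity for |t| <= epsilon, any epsilon < 1.
n = 100_000
Z = [random.expovariate(1.0) for _ in range(n)]
mean_Z = sum(Z) / n
mean_Z2 = sum(z * z for z in Z) / n

T_n = sum(Z)
# Illustrative normalization Z_i / T_n (the Bayesian-bootstrap choice); the
# paper's exact definition of the weights W_{i,n} is given in its own display.
W = [z / T_n for z in Z]
```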
Introduce the resampled version of the process \((2.2)\) given by
Following [46], we observe that \(W_{n}^{*}(x,l,h)\) reduces, after some easy changes, to a process of the form \(W_{n}(x,l^{*},h)\) for a suitable measurable \(l^{*}(\cdot)\). Without loss of generality, we set \(Z=Q(U)\) and \(Z_{i}=Q\left(U_{i}\right)\) for \(i=1,\ldots,n\), where \(U\) and \(U_{1},\ldots,U_{n}\) are independent rv’s, uniformly distributed on \((0,1)\) and independent of \(\left\{\left({X}_{i},{Y}_{i}\right):1\leq i\leq n\right\}\). This allows us to define a measurable function \(\psi^{*}(\cdot)\) on \(\mathbb{R}^{q+1}\), and a rv \(\mathbf{Y}^{*}\), by
Letting \(\left({X}_{i},\mathbf{Y}_{i}^{*}\right),i=1,2,\ldots\), denote i.i.d. random copies of \(\left(X,\mathbf{Y}^{*}\right)\), it is readily checked that \(\{\left({X}_{i},\mathbf{Y}_{i}^{*}\right):1\leq i\leq n\}\), and \(l^{*}(\cdot)\) fulfill the general assumptions imposed in Theorem 3.2. Then the process
satisfies an LDP in the space \(l_{\infty}(\mathcal{L}\times\mathcal{H}_{0})\) with the speed \(n\phi(\gamma h)/w_{n}^{2}\) and the good rate function \(I_{x}^{\gamma}(\cdot)\). Then we have, for any \(\gamma\in\mathcal{H}_{0}\) and any \(x\in\mathcal{J}\),
(i) for any \(0<c<\infty\), \(\{\psi\in l_{\infty}(\mathcal{L}\times\mathcal{H}_{0}):\,I_{x}^{\gamma}(\psi)\leq c\}\) is a compact set of \(l_{\infty}(\mathcal{L}\times\mathcal{H}_{0})\);
(ii) for all open subsets \(O\) of \(l_{\infty}(\mathcal{L}\times\mathcal{H}_{0})\),
and for all closed subsets \(F\) of \(l_{\infty}(\mathcal{L}\times\mathcal{H}_{0})\),
where we recall
By Theorem 3.3 we have, for any \(\lambda>0\),
while it is possible that the last result also holds for the exchangeably weighted bootstrap, such a determination is beyond the scope of this paper and appears to be quite difficult.
Remark 3.5. According to [64], our methodology depends heavily on the function \(\phi(\cdot)\). This is evident in our conditions (via the function \(\tau_{0}(\cdot)\)) and in the convergence rates of our estimates (via the asymptotic behavior of the quantity \(n\phi(h)\)). More precisely, the behavior of \(\phi(\cdot)\) around \(0\) turns out to be of paramount importance; thus, the small ball probabilities of the underlying functional variable \(X\) are crucial. In probability theory, the calculation of the quantity \(\mathbb{P}(||X-x||<s)\) for ‘‘small’’ \(s\) (i.e., for \(s\) tending toward zero) and for a fixed \(x\) is known as the ‘‘small ball problem.’’ Unfortunately, solutions are available for very few random variables (or processes) \(X\), even when \(x=0\). In several functional spaces, taking \(x\neq 0\) leads to formidable, possibly insurmountable, obstacles. Typically, authors emphasize Gaussian random variables. We refer to [91] for a summary of the key findings regarding small ball probabilities. If \(X\) is a Gaussian random element on the separable Banach space \(\mathcal{E}\) and \(x\) belongs to the reproducing kernel Hilbert space associated with \(X\), then the following well-known result holds:
As far as we know, the results available in the published literature are basically all of the form
where \(\alpha,\beta,c_{\chi}\), and \(C\) are positive constants and \(||\cdot||\) may be a supremum norm, an \(L^{p}\) norm, or a Besov norm. The interested reader can refer to [62–64, 128] for more discussion. Notice that the pioneering book [62] extensively comments on the links between nonparametric functional statistics, small-ball probability theory, and the topological structure of the functional space \(\mathcal{E}\).
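The form above can be checked numerically in simple cases. The following sketch (illustrative only, not part of the paper's methodology; the function name `small_ball_prob` and all tuning constants are ours) Monte Carlo-estimates \(\mathbb{P}(\sup_{t\in[0,1]}|W(t)|<s)\) for a standard Wiener process \(W\), taking \(x=0\) and the supremum norm, with the path discretized on a finite grid (which slightly inflates the probability):

```python
import numpy as np

def small_ball_prob(s, n_paths=20000, n_grid=100, seed=0):
    """Monte Carlo estimate of P(sup_t |W(t)| < s) for standard Brownian
    motion on [0, 1], discretized on an equispaced grid (here x = 0)."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_grid
    incr = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_grid))
    paths = np.cumsum(incr, axis=1)           # W(t_k), k = 1, ..., n_grid
    sup_norm = np.abs(paths).max(axis=1)      # sup-norm of each path
    return float(np.mean(sup_norm < s))

p_big = small_ball_prob(1.0)    # moderate radius: non-negligible probability
p_small = small_ball_prob(0.5)  # halving s makes the probability collapse
```

The sharp decay as \(s\) shrinks is consistent with the known sup-norm rate \(\mathbb{P}(\sup_{t}|W(t)|<s)\sim(4/\pi)\exp(-\pi^{2}/(8s^{2}))\), an instance of the form displayed above with \(\alpha=2\).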
4 APPLICATIONS
While only the examples provided below will be discussed, they serve as archetypes for various functionals and can be explored similarly.
4.1 The Kernel Regression Function Estimate
Some further conditions are needed to establish the functional moderate deviation principle for the kernel regression function estimate \(\widehat{r}_{n}^{l}(x)\).
(C1) (i) For each \((x,x^{\prime})\in\mathcal{J}^{2}\), \(l\in\mathcal{L}\), and some constants \(\beta>0\) and \(\varsigma>0\)
(ii) \(w_{n}h^{\beta}\rightarrow 0\) as \(n\rightarrow\infty\).
Assumption (C1)(i) imposes some smoothness on the regression operator. Define the function \(I_{x,1}^{\gamma}(\cdot)\) as the function \(I_{x}^{\gamma}(\cdot)\) in the statement (3.5), with \(c_{l}(x)=1\) and \(d_{l}(x)=-r^{l}(x)\) for any \(x\in\mathcal{J}\). The large deviation principle for the process \(\{w_{n}(\widehat{r}_{n}^{l}(x,\varrho h)-r^{l}(x)):(l,\varrho)\in\mathcal{L}\times\mathcal{H}_{0}\}\) in the space \(l_{\infty}(\mathcal{L}\times\mathcal{H}_{0})\) is presented in the following corollary.
Corollary 4.1. Under the assumptions of Theorem \(3.2\), assume that the Conditions (C1) hold. Then the process
satisfies an LDP in the space \(l_{\infty}(\mathcal{L}\times\mathcal{H}_{0})\) with the speed \(n\phi(\gamma h)/w_{n}^{2}\) and the good rate function \(I_{x,1}^{\gamma}(\cdot)\).
The proof of Corollary 4.1 is postponed until Section 7.
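For intuition, the estimator \(\widehat{r}_{n}^{l}(x,h)\) can be computed in a few lines. The sketch below is ours, not the paper's: it assumes curves observed on a common grid, uses the sup-norm as the semi-metric, a triangular kernel, and \(l\) the identity; the name `nw_functional` and the toy data are hypothetical.

```python
import numpy as np

def nw_functional(x, X, Y, h, l=lambda y: y,
                  K=lambda u: np.maximum(1.0 - u, 0.0)):
    """Nadaraya-Watson-type estimate of r^l(x) = E[l(Y) | X = x] for a
    functional covariate: curves compared through the sup-norm semi-metric."""
    d = np.abs(X - x[None, :]).max(axis=1)    # d(X_i, x): sup-norm distance
    w = K(d / h)                              # kernel weights
    if w.sum() == 0.0:
        return np.nan
    return float(np.sum(l(Y) * w) / w.sum())

# toy functional sample: n curves a_i sin(2*pi*t) with response E[Y | X] = a_i
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 50)
a = rng.uniform(-1.0, 1.0, size=200)
X = a[:, None] * np.sin(2 * np.pi * t)[None, :]
Y = a + 0.05 * rng.normal(size=200)
x0 = 0.5 * np.sin(2 * np.pi * t)              # target curve (a = 0.5)
r_hat = nw_functional(x0, X, Y, h=0.2)        # should be close to 0.5
```

Only the distance computation depends on the functional nature of \(X\); replacing the sup-norm by any other semi-metric leaves the rest unchanged.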
Proposition 4.1. Under the assumptions of Theorem \(3.2\), assume that the Conditions (C1) hold. Then, for any \(\delta>0\), we have
Moreover, the sequence of sets of functions
is an asymptotically almost sure sequence of confidence regions for \(r^{l}(x)\), where \(\hat{\sigma}_{n}^{2}:=\hat{\sigma}_{n}^{2}(x)\) is any consistent estimate of the variance of \(\widehat{r}_{n}^{l}(x,h)\).
Remark 4.2. Let \(||\cdot||\) be a norm on \(\mathcal{E}=\mathbb{R}^{d}\). Denote by \(B(x,r)\) the set of all points \(z\in\mathbb{R}^{d}\) satisfying \(||x-z||\leq r\). For each \(n\geq 1\) and \(k\in\{1,\ldots,n\}\), the \(k\)-nearest neighbor bandwidth at \(x\) is denoted by \(\hat{k}_{n,x}\) and defined as the smallest radius \(r\geq 0\) such that the ball \(B(x,r)\) contains at least \(k\) points from the collection \(\{X_{1},\ldots,X_{n}\}\), i.e.,
The \(k\)-nearest neighbor estimate of the regression function, \(x\mapsto\mathbb{E}[l(Y)|X=x]\), is defined as, for all \(x\in\mathbb{R}^{d}\),
This estimate is an adaptive bandwidth version of the Nadaraya–Watson estimate which here would be defined in the same way except that a non-random bandwidth (depending only on \(n\), e.g., \(n^{-1/5}\)) is used in place of \(\hat{k}_{n,x}\). It would be of interest to investigate the process defined in (2.2) in the \(k\)-nearest neighbor setting.
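A minimal sketch of the \(k\)-nearest neighbor estimate just described, with \(\mathcal{E}=\mathbb{R}^{2}\), the Euclidean norm, and a uniform kernel on the ball \(B(x,\hat{k}_{n,x})\); the function name `knn_regression` and the toy data are our own illustrative choices.

```python
import numpy as np

def knn_regression(x, X, Y, k):
    """k-nearest-neighbor regression estimate of E[Y | X = x], using the
    adaptive bandwidth: the smallest radius whose ball holds k covariates."""
    d = np.linalg.norm(X - x, axis=1)   # ||X_i - x||
    h_knn = np.sort(d)[k - 1]           # k-NN bandwidth at x
    w = (d <= h_knn).astype(float)      # uniform kernel on B(x, h_knn)
    return float(np.sum(w * Y) / np.sum(w)), float(h_knn)

rng = np.random.default_rng(2)
X = rng.uniform(-1.0, 1.0, size=(500, 2))
Y = X[:, 0] ** 2 + X[:, 1] ** 2 + 0.1 * rng.normal(size=500)
m_hat, h_knn = knn_regression(np.zeros(2), X, Y, k=25)   # true m(0) = 0
```

Note that the bandwidth is random here, which is precisely why the uniform-in-bandwidth results of this paper do not immediately cover this estimator.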
4.2 The Kernel Conditional Distribution Function
To present the functional moderate deviation principle for the conditional distribution function estimate \(\widehat{F}_{nh}(t|x):=\widehat{\mu}((-\infty,t]|x)\) for \((x,t)\in\mathcal{J}\times\mathbb{R}\) as a special case, the Conditions (C1) have to be formulated for the conditional distribution function \(F(t|x):=\mu((-\infty,t]|x)\) for \((x,t)\in\mathcal{J}\times\mathbb{R}\) as follows.
(C2) (i) For any \((x,x^{\prime})\in\mathcal{J}^{2}\), any \((t,t^{\prime})\in\mathbb{R}^{2}\), some \(\beta_{1}>0\), \(\beta_{2}>0\), and a constant \(\varsigma^{\prime}>0\)
(ii) \(w_{n}h^{\beta_{1}}\rightarrow 0\) as \(n\rightarrow\infty\).
Assumption (C2)(i) introduces a level of smoothness to the conditional distribution. Define the function \(I_{2,x}^{\gamma}(\cdot)\) as the function \(I^{\gamma}_{x}(\cdot)\) in statement (3.5), where \(l(y)={\mathchoice{\rm 1\mskip-4.0mu l}{\rm 1\mskip-4.0mu l}{\rm 1\mskip-4.5mu l}{\rm 1\mskip-5.0mu l}}_{(-\infty,t]}(y)\), \(c_{l}(x)=1\), and \(d_{l}(x)=-F(t|x)\) for any \((x,t)\in\mathcal{J}\times\mathbb{R}\). Let the conditional distribution function estimate be defined as follows
The large deviation principle for the process \(\{w_{n}(\widehat{F}_{n,\varrho h}(t|x)-F(t|x)):(t,\varrho)\in\mathbb{R}\times\mathcal{H}_{0}\}\) in \(l_{\infty}(\mathbb{R}\times\mathcal{H}_{0})\) is presented in the following corollary.
Corollary 4.2. Assume that assumptions (A1)–(A5), (B2)–(B3), and (C2) hold true. Then the process
satisfies a large deviation principle in \(l_{\infty}(\mathbb{R}\times\mathcal{H}_{0})\) with the speed \(n\phi(\gamma h)/w_{n}^{2}\) and the good rate function \(I_{2,x}^{\gamma}(\cdot)\).
The proof of Corollary 4.2 is postponed until Section 7.
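As noted above, \(\widehat{F}_{nh}(t|x)\) is just the Nadaraya–Watson form applied to the indicator class \(l(y)=\mathbf{1}_{(-\infty,t]}(y)\). A sketch with a scalar covariate for readability (in the functional case only the distance computation changes); the name `cond_cdf`, the Epanechnikov-type kernel, and the toy data are ours:

```python
import numpy as np

def cond_cdf(t, x, X, Y, h, K=lambda u: np.maximum(1.0 - u**2, 0.0)):
    """Kernel estimate of F(t|x) = P(Y <= t | X = x): Nadaraya-Watson
    weights applied to the indicator l(y) = 1{y <= t}."""
    w = K(np.abs(X - x) / h)
    return float(np.sum((Y <= t) * w) / np.sum(w))

rng = np.random.default_rng(3)
X = rng.uniform(0.0, 1.0, size=2000)
Y = X + rng.normal(0.0, 0.5, size=2000)   # F(t|x) = Phi((t - x)/0.5)
F_hat = cond_cdf(0.5, 0.5, X, Y, h=0.1)   # true value F(0.5|0.5) = 0.5
```

For fixed \(x\), the map \(t\mapsto\widehat{F}_{nh}(t|x)\) is a genuine (step) distribution function, which is what the quantile inversion of Section 4.3 exploits.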
4.3 The Kernel Quantile Regression
For a given \(\alpha\in(0,1)\), the \(\alpha\)th-order conditional quantile of the distribution of a real-valued \(Y\) given \(X=x\) is defined as
Notice that, whenever \(F(\cdot|x)\) is strictly increasing and continuous in a neighborhood of \(q_{\alpha}(x)\), the function \(F(\cdot|x)\) has a unique quantile of order \(\alpha\) at the point \(q_{\alpha}(x)\), that is, \(F\left(q_{\alpha}(x)|x\right)=\alpha\). In such a case,
which may be estimated uniquely by
Conditional quantiles have been widely studied in the literature when the predictor \(X\) is of finite dimension; see, for instance, [45]. Let us first recall some concepts of Hadamard differentiability [70, 89, 131]. Let \(\mathcal{X}\) and \(\mathcal{Y}\) be two metrizable topological linear spaces. A map \(\Phi\) defined on a subset \(\mathcal{D}_{\Phi}\) of \(\mathcal{X}\) with values in \(\mathcal{Y}\) is called Hadamard differentiable at \(x\) if there exists a continuous mapping \(\Phi_{x}^{\prime}:\mathcal{X}\mapsto\mathcal{Y}\) such that
holds for all sequences \(t_{n}\) converging to \(0+\) and \(\nu_{n}\) converging to \(\nu\) in \(\mathcal{X}\) such that \(x+t_{n}\nu_{n}\in\mathcal{D}_{\Phi}\) for every \(n\).
Corollary 4.3. Let \(0<p<q<1\) be fixed and let \(F(\cdot|x)\) be a conditional distribution function with continuous and positive derivative \(f(\cdot|x)\) on the interval \(\left[F^{-1}(p|x)-\varepsilon,F^{-1}(q|x)+\varepsilon\right]\) for some \(\varepsilon>0\). Then, under the conditions of Corollary \(4.2\), the process \(\left\{w_{n}\left(\widehat{q}_{n,\alpha}(x,\varrho)-q_{\alpha}(x)\right)\right\}\) satisfies the LDP in \(l_{\infty}([p,q]\times\mathcal{H}_{0})\) with speed \(n\phi(\gamma h)/w_{n}^{2}\) and rate function \(I_{\gamma,x}^{EQ}\) given by
The proof of Corollary 4.3 is postponed until Section 7.
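The plug-in estimate \(\widehat{q}_{n,\alpha}(x)\) is the generalized inverse of the kernel conditional distribution estimate at level \(\alpha\). A sketch (scalar covariate for readability; the name `cond_quantile`, the kernel, and the toy data are ours):

```python
import numpy as np

def cond_quantile(alpha, x, X, Y, h,
                  K=lambda u: np.maximum(1.0 - u**2, 0.0)):
    """Kernel conditional quantile: generalized inverse of the kernel
    conditional distribution function estimate at level alpha."""
    w = K(np.abs(X - x) / h)
    order = np.argsort(Y)
    cum = np.cumsum(w[order]) / np.sum(w)        # F_hat(.|x) at sorted Y
    return float(Y[order][np.searchsorted(cum, alpha)])

rng = np.random.default_rng(4)
X = rng.uniform(0.0, 1.0, size=3000)
Y = 2.0 * X + rng.normal(0.0, 0.3, size=3000)    # median(Y | X = x) = 2x
q_hat = cond_quantile(0.5, 0.5, X, Y, h=0.1)     # true conditional median = 1
```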
Remark 4.3. It would be interesting to extend our findings to the following settings.
1. (Expectile regression.) For \(p\in(0,1)\), let \(l(T-\boldsymbol{\theta})=(p-\mathbf{1}\{T-\boldsymbol{\theta}\leq 0\})|T-\boldsymbol{\theta}|\); then the zero of \(r^{l}(\cdot)\) with respect to \(\boldsymbol{\theta}\) leads to quantities called expectiles by [110]. Expectiles, as defined by [110], may be introduced either as a generalization of the mean or as an alternative to quantiles. Indeed, expectile regression inherits from classical regression a high sensitivity to extreme values, allowing for more reactive risk management, while quantile regression provides exhaustive information on the effect of the explanatory variable on the response variable by examining its conditional distribution; refer to [102, 103] for further details on expectiles in the functional data setting.
2. (Conditional winsorized mean.) As in [83], if we consider \(l(T-\boldsymbol{\theta})=-k\), \(T-\boldsymbol{\theta}\), or \(k\) according as \(T-\boldsymbol{\theta}<-k\), \(|T-\boldsymbol{\theta}|\leq k\), or \(T-\boldsymbol{\theta}>k\), then the zero of \(r^{l}(\cdot)\) with respect to \(\boldsymbol{\theta}\) will be the conditional winsorized mean. Notably, this parameter was not considered in the literature on nonparametric functional data analysis involving wavelet estimators. Our paper offers asymptotic results for the conditional winsorized mean when the covariates are functions.
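Both score functions above are easy to experiment with. The sketch below (the names `psi_expectile`, `psi_winsorized`, and `m_estimate` are ours) finds the zero of the empirical counterpart of \(\boldsymbol{\theta}\mapsto\mathbf{E}[l(Y-\boldsymbol{\theta})]\) by a damped fixed-point iteration; for simplicity it treats the unconditional case, and adding kernel weights as in (2.1) would localize the estimate at a covariate value \(x\).

```python
import numpy as np

def psi_expectile(t, p):
    """Expectile score (p - 1{t <= 0})|t|; its zero in theta defines
    the p-expectile of the distribution of Y."""
    return np.abs(p - (t <= 0)) * t

def psi_winsorized(t, k):
    """Winsorized (Huber-type) score: -k, t, or k according to
    t < -k, |t| <= k, or t > k; its zero in theta is the winsorized mean."""
    return np.clip(t, -k, k)

def m_estimate(Y, psi, theta0=0.0, n_iter=200, step=0.5):
    """Solve (1/n) sum_i psi(Y_i - theta) = 0 by damped fixed-point steps."""
    theta = theta0
    for _ in range(n_iter):
        theta = theta + step * np.mean(psi(Y - theta))
    return float(theta)

rng = np.random.default_rng(5)
Y = rng.normal(1.0, 1.0, size=5000)
mean_w = m_estimate(Y, lambda t: psi_winsorized(t, 1.5))   # near E[Y] = 1
exp_half = m_estimate(Y, lambda t: psi_expectile(t, 0.5))  # 0.5-expectile
```

For \(p=1/2\) the expectile score reduces to \(t/2\), so the \(1/2\)-expectile coincides with the mean, a quick sanity check for the iteration.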
4.4 The Kernel Conditional Density Function
By setting \(l(\cdot)=\frac{1}{h_{1}}K_{1}(h_{1}^{-1}(\cdot-t))\) in (2.1), for \({t}\in\mathbb{R}\), where \(h_{1}\) is a bandwidth parameter and \(K_{1}(\cdot)\) is a kernel function, we obtain the kernel estimator of the conditional density function \(f({t}|{x})\) given by
(C3) (i) For any \((x,x^{\prime})\in\mathcal{J}^{2}\), any \((t,t^{\prime})\in[a,b]^{2}\subset\mathbb{R}^{2}\), some \(\beta_{1}>0\), \(\beta_{2}>0\), and a constant \(\varrho>0\)
(ii) \(w_{n}(h^{\beta_{1}}+h_{1}^{\beta_{2}})\rightarrow 0\) as \(n\rightarrow\infty\).
\(\mathbf{(B^{\prime}2)}\) The class
satisfies the Condition (B1).
Assumption (C3)(i) imposes some smoothness of the conditional density. Define the function \(I_{3,x}^{\gamma}(\cdot)\) as the function \(I^{\gamma}_{x}(\cdot)\) in the statement (3.5), with \(l(\cdot)=\frac{1}{h_{1}}K_{1}(h_{1}^{-1}(\cdot-t))\), \(c_{l}(x)=1\) and \(d_{l}(x)=-f(t|x)\) for any \((x,t)\in\mathcal{J}\times[a,b]\).
Corollary 4.4. Assume that assumptions (A1)–(A5), (B\({}^{\prime}\)2)–(B4), and (C3) hold. Then the process
satisfies a large deviation principle in \(l_{\infty}([a,b]\times\mathcal{H}_{0})\) with the speed \(nh_{1}\phi(\gamma h)/w_{n}^{2}\) and the good rate function \(I_{3,x}^{\gamma}(\cdot)\).
The proof of Corollary 4.4 is similar to the proof of Corollary 4.2 and therefore omitted.
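The double-kernel construction above translates directly into code: plug \(l(y)=K_{1}((y-t)/h_{1})/h_{1}\) into the Nadaraya–Watson form. A sketch with a scalar covariate (the name `cond_density`, the kernels, and the toy data are our illustrative choices):

```python
import numpy as np

def cond_density(t, x, X, Y, h, h1,
                 K=lambda u: np.maximum(1.0 - u**2, 0.0),
                 K1=lambda u: np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)):
    """Double-kernel estimate of f(t|x): Nadaraya-Watson weights in x,
    a smoothing kernel K1 with bandwidth h1 in the response direction."""
    w = K(np.abs(X - x) / h)
    l = K1((Y - t) / h1) / h1
    return float(np.sum(l * w) / np.sum(w))

rng = np.random.default_rng(6)
X = rng.uniform(0.0, 1.0, size=5000)
Y = X + rng.normal(0.0, 0.5, size=5000)
f_hat = cond_density(0.5, 0.5, X, Y, h=0.1, h1=0.15)
# true value: normal density at its mode, 1/(0.5*sqrt(2*pi)) ~ 0.798
```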
4.5 The Kernel Conditional Copula Function
Let us recall the setting of [69]. Assume that \(\left(X_{1},Y_{11},Y_{21}\right),\ldots,\left({X}_{n},Y_{1n},Y_{2n}\right)\) is a sample of \(n\) independent and identically distributed triples of random variables. The random variables \(Y_{1i}\) and \(Y_{2i}\) are real and the \({X}_{i}\)’s are random elements. Suppose that the conditional distribution of \(\left(Y_{1},Y_{2}\right)^{\top}\) given \({X}=x\) exists and denote the corresponding conditional joint distribution function by
If the marginals of \(H_{x}(\cdot,\cdot)\) denoted as
are continuous, then according to Sklar’s theorem (see e.g., [109]) there exists a unique copula \(C_{x}(\cdot,\cdot)\) which equals
where
is the conditional quantile function of \(Y_{1}\) given \({X}=x\) and \(F_{2x}^{-1}(\cdot)\) is the conditional quantile function of \(Y_{2}\) given \({X}=x\). The conditional copula \(C_{x}(\cdot,\cdot)\) fully describes the conditional dependence structure of \(\left(Y_{1},Y_{2}\right)^{\top}\) given \({X}=x\). An estimator of the joint conditional distribution function \(H_{x}(\cdot,\cdot)\) is
Then analogously as in [69] and [23] one can suggest the following empirical estimator of the copula \(C_{x}(\cdot,\cdot)\)
where \(F_{1xh}(\cdot)\) and \(F_{2xh}(\cdot)\) are the corresponding marginal distribution functions of \(H_{xh}(\cdot,\cdot)\), i.e., \(F_{1xh}\left(y_{1}\right)=H_{xh}\left(y_{1},+\infty\right)\) and \(F_{2xh}\left(y_{2}\right)=H_{xh}\left(+\infty,y_{2}\right)\).
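The construction \(C_{xh}(u,v)=H_{xh}\big(F_{1xh}^{-1}(u),F_{2xh}^{-1}(v)\big)\) can be sketched with weighted empirical distributions (scalar covariate for readability; the name `cond_copula`, the kernel, and the simulated dependence structure are ours):

```python
import numpy as np

def cond_copula(u, v, x, X, Y1, Y2, h,
                K=lambda w: np.maximum(1.0 - w**2, 0.0)):
    """Empirical conditional copula: evaluate the weighted joint empirical
    distribution at the weighted marginal quantiles F1^{-1}(u), F2^{-1}(v)."""
    wts = K(np.abs(X - x) / h)
    wts = wts / wts.sum()
    def q(Y, p):                           # weighted marginal quantile
        order = np.argsort(Y)
        cum = np.cumsum(wts[order])
        return Y[order][np.searchsorted(cum, p)]
    y1, y2 = q(Y1, u), q(Y2, v)
    return float(np.sum(wts * (Y1 <= y1) * (Y2 <= y2)))

rng = np.random.default_rng(7)
X = rng.uniform(0.0, 1.0, size=4000)
Z = rng.normal(size=4000)
Y1 = Z + 0.3 * rng.normal(size=4000)       # Y1, Y2 strongly positively
Y2 = Z + 0.3 * rng.normal(size=4000)       # dependent, given any x
c_hat = cond_copula(0.5, 0.5, 0.5, X, Y1, Y2, h=0.2)
```

Under conditional independence one would have \(C_{x}(1/2,1/2)=1/4\); values well above \(1/4\) reveal positive conditional dependence.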
Corollary 4.5. Let \(0<p<q<1\) be fixed. Suppose that \(F_{1x}\left(\cdot\right)\) and \(F_{2x}\left(\cdot\right)\) are continuously differentiable on the intervals \(\left[F_{1x}^{-1}(p)-\varepsilon,F_{1x}^{-1}(q)+\varepsilon\right]\) and \(\left[F_{2x}^{-1}(p)-\varepsilon,F_{2x}^{-1}(q)+\varepsilon\right]\) with strictly positive derivatives \(f_{1}(\cdot|x)\) and \(f_{2}(\cdot|x)\), respectively, for some \(\varepsilon>0\). Furthermore, assume that \(\partial H_{x}/\partial y_{1}\) and \(\partial H_{x}/\partial y_{2}\) exist and are continuous on the product of these intervals. Then, under the conditions of Corollary \(4.2\), the process
satisfies the LDP in \(l_{\infty}\left([p,q]^{2}\times\mathcal{H}_{0}\right)\) with speed \(n\phi(\gamma h)/w_{n}^{2}\) and rate function \(I^{C}_{\gamma,x}(\cdot)\) defined by
where
Corollary 4.5 is a direct consequence of Corollary 4.2, Lemma 3.9.28 of [131] (or [32]), and Theorem 3.1 of [67], in a similar way as Theorem 4.6 of the last-mentioned reference.
Remark 4.4. We define the conditional hazard function on \(\mathbb{R}\) by
The kernel estimator \({S}_{n;h_{n}}(y|x)\) is defined for all \(y\in\mathbb{R}\) such that \(F(y|x)<1\), by
Our result can be applied to \({S}_{n;h_{n}}(t|x)\) by combining Corollaries 4.2 and 4.4.
Remark 4.5. Over the past few decades, the single index model has been adopted to decrease the dimensionality of the explanatory variable, aiming to circumvent the ‘‘curse of dimensionality’’ while preserving the benefits of nonparametric smoothing in multivariate regression. By assuming that \(\mathcal{X}\) is a Hilbert space, in the single index setting, the process in (2.2) takes the form
where \(\theta\) is a functional single index valued in a subset \(\Theta\) of a separable Hilbert space \(\mathcal{X}\) and \(\Delta_{i}(x,\theta,h)=K(h^{-1}|\langle X_{i}-x,\theta\rangle|)\); one can refer to [37, 38]. Although it is conceivable that our findings may apply to the single index model, establishing such a conclusion is outside the purview of this paper and seems to pose significant challenges.
Remark 4.6. This study holds significance in the realm of functional data analysis. Firstly, the results presented in this paper are enriched by an additional uniformity constraint, specifically \(a_{n}\leq h\leq b_{n}\). Secondly, the scope of applications is broadened to encompass novel areas in the field, including kernel quantile regression, kernel conditional density function, and kernel conditional copula function. These extensions represent pioneering contributions in functional data analysis. The findings of this study play a crucial role in establishing uniform consistency for various estimators, employing data-driven bandwidths, associated with the aforementioned applications. Further insights and relevant results can be explored in [41] and [94].
5 THE BANDWIDTH SELECTION CRITERION
Numerous methods have been developed to construct asymptotically optimal bandwidth selection rules for nonparametric kernel estimators, particularly for the Nadaraya–Watson regression estimator. Prominent works in this regard include [25, 30, 77, 79, 118]. The selection of this parameter is crucial, whether in the standard finite-dimensional case or within the infinite-dimensional framework, to ensure effective practical performance. Let us define the leave-out-\(\left(X_{i},Y_{i}\right)\) estimator of the regression function
To minimize the quadratic loss function, we introduce the following criterion, where \(\mathbb{W}(\cdot)\) is a known non-negative weight function
Following the ideas developed by [118], a natural way of choosing the bandwidth is to minimize the preceding criterion; thus, let us choose \(\widehat{h}_{n}\in[a_{n},b_{n}]\) minimizing, among \(h\in[a_{n},b_{n}]\):
One can replace (5.2) by
In practice, one takes, for \(j=1,\ldots,n\), the uniform global weights \(\mathbb{W}\left(x_{j}\right)=1\), and the local weights
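Concretely, with uniform global weights \(\mathbb{W}\equiv 1\) and a scalar covariate, the leave-one-out criterion can be computed as follows (an illustrative sketch; the name `cv_bandwidth`, the kernel, the candidate grid, and the toy data are ours):

```python
import numpy as np

def cv_bandwidth(X, Y, grid, K=lambda u: np.maximum(1.0 - u**2, 0.0)):
    """Least-squares cross-validation: pick h in the grid minimizing
    CV(h) = mean_j (Y_j - r_hat_{-j}(X_j, h))^2 with weights W = 1."""
    D = np.abs(X[:, None] - X[None, :])        # pairwise |X_i - X_j|
    scores = []
    for h in grid:
        W = K(D / h)
        np.fill_diagonal(W, 0.0)               # leave (X_j, Y_j) out
        denom = W.sum(axis=1)
        safe = np.where(denom > 0, denom, 1.0)
        r_loo = np.where(denom > 0, (W @ Y) / safe, Y.mean())
        scores.append(np.mean((Y - r_loo) ** 2))
    return grid[int(np.argmin(scores))], np.array(scores)

rng = np.random.default_rng(8)
X = rng.uniform(0.0, 1.0, size=400)
Y = np.sin(2 * np.pi * X) + 0.2 * rng.normal(size=400)
grid = np.array([0.01, 0.03, 0.05, 0.1, 0.2, 0.5])
h_cv, scores = cv_bandwidth(X, Y, grid)
```

The criterion penalizes both extremes: a tiny \(h\) leaves too few neighbors in each window, a large \(h\) oversmooths the regression curve, so the minimizer sits in the interior of the grid.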
By similar reasoning, one can select \(\widehat{h}_{n}\) for the process defined in (2.2). Let us define
The following corollary is an immediate consequence of Theorem 3.2.
Corollary 5.1. Assume that assumptions (A1)–(A4), (B1)–(B3), and (B4) or (B\({}^{\prime}\)4) hold true. Furthermore, consider the classes of continuous functions \(\mathcal{C}\) and \(\mathcal{D}\) given above. Then we have, for any \(x\in\mathcal{J}\),
(i) for any \(0<c<\infty\), \(\{\psi\in l_{\infty}(\mathcal{L}):\,I_{x}(\psi)\leq c\}\) is a compact set of \(l_{\infty}(\mathcal{L})\);
(ii) for all open subsets \(O\) of \(l_{\infty}(\mathcal{L})\),
and for all closed subsets \(F\) of \(l_{\infty}(\mathcal{L})\),
where
and \(\tilde{\mathcal{Z}}_{x}\) is the closed linear subspace of the space \(L_{2}\) generated by the mean-zero Gaussian process \(\Big{\{}\Xi(x,l):l\in\mathcal{L}\Big{\}}.\)
As in [87], let us minimize the following errors:
or
where \(W_{1}(\cdot)\) and \(W_{2}(\cdot)\) are some non-negative weight functions. These theoretical errors are not computable in practice and the following leave-one-out cross-validation criterion can be constructed to approximate them in some fully data-driven way:
where
Then the smoothing parameter \(b\) is selected by the following procedure:
While the aforementioned cross-validation procedures focus on approximating quadratic errors of estimation, alternative approaches for selecting smoothing parameters may prioritize optimizing the predictive power of the method. This can be achieved by minimizing one of the following prediction criteria
where the prediction is performed using either the conditional median
or
the conditional mode, viz.
For more discussion, one can refer to [16, 17, 29].
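One such prediction criterion can be sketched as follows: the bandwidth is chosen to minimize the leave-one-out absolute prediction error of the kernel conditional median (the name `median_cv`, the kernel, the grid, and the toy data are our illustrative choices; the conditional-mode variant is analogous):

```python
import numpy as np

def median_cv(X, Y, grid, K=lambda u: np.maximum(1.0 - u**2, 0.0)):
    """Prediction-based bandwidth choice: minimize
    mean_j |Y_j - med_hat_{-j}(X_j)| over the candidate grid, predicting
    each Y_j by the leave-one-out kernel conditional median."""
    order = np.argsort(Y)
    Ys = Y[order]
    errs = []
    for h in grid:
        W = K(np.abs(X[:, None] - X[None, :]) / h)
        np.fill_diagonal(W, 0.0)                   # leave-one-out
        cum = np.cumsum(W[:, order], axis=1)       # weights along sorted Y
        tot = cum[:, -1][:, None]
        idx = np.argmax(cum >= 0.5 * tot, axis=1)  # first index past half mass
        errs.append(np.mean(np.abs(Y - Ys[idx])))
    return grid[int(np.argmin(errs))]

rng = np.random.default_rng(9)
X = rng.uniform(0.0, 1.0, size=300)
Y = np.sin(2 * np.pi * X) + 0.2 * rng.normal(size=300)
h_med = median_cv(X, Y, [0.02, 0.05, 0.1, 0.3])
```

Isolated points with an empty window default to the smallest response in this sketch; a production version would handle that case, and the choice of the weight functions, more carefully.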
Remark 5.2. It is essential to highlight that the primary challenge in employing an estimator like the one in (2.1) lies in properly selecting the smoothing parameter \(h\). The consistency results with uniformity in bandwidth, as presented in Corollary 4.1, indicate that the choice of \(h_{1}\) and \(h_{2}\) within certain intervals guarantees the moderate deviations principle for \(\widehat{r}_{n}^{l}(x,h)\). In other words, the fluctuations in bandwidth within a small interval do not impact the moderate deviations principle of the nonparametric estimator \(\widehat{r}_{n}^{l}(x,h)\) for \(r^{l}(x)\).
Remark 5.3. It is straightforward to modify the proofs of our results to show that they remain true when the entropy condition is substituted by the bracketing condition: for some \(C_{0}>0\) and \(v_{0}>0,\)
Remark 5.4. Observe that the standardizing factor is \(\left(n\phi(h_{n})\right)^{1/2}\) with \(\phi(h_{n})\rightarrow 0\), indicating a lower rate of convergence. This is the cost incurred when estimating conditional (local) quantities.
6 CONCLUSIONS
We employ general empirical process methods to establish, under mild regularity conditions, the functional moderate deviations of kernel-type function estimators that depend on an infinite-dimensional covariate. We present a valuable moderate deviation principle for a function-indexed process by leveraging intricate exponential contiguity arguments. Moderate deviation principles are a useful tool for analyzing the behavior of the estimators in question. This paper aims to make several noteworthy contributions to the existing literature on functional data analysis. Specifically, we focus on establishing functional moderate deviation principles for the Nadaraya–Watson estimators, the conditional distribution processes, the kernel quantile regression, the kernel conditional density function, and the kernel conditional copula function. Our findings extend the current knowledge in the field and offer new avenues for future research in functional data analysis. Extending our work to encompass \(k\)-nearest neighbor estimators holds significant interest. However, achieving this goal requires the development of new technical arguments and presently lies beyond reach. Exploring \(k\)-nearest neighbor estimators would expand the scope of our research and provide valuable insights into their performance and properties. Additional extensions involve dimension-reduction models. While our paper presents fully nonparametric functional models, the recent FDA literature has emphasized the interest of semi-parametric models as a bridge between fully flexible (but high-dimensional) models and linear (but excessively restrictive) low-dimensional models. This field comprises, for example, functional single index models (refer to [37, 38, 74] for the most recent advancements), projection pursuit models (refer to [40]), and partial linear models (refer to [5, 92]).
As far as we are aware, the literature on these models does not yet address moderate deviation principles, and we hypothesize that our ideas and methodologies could be used successfully to derive such results, thereby opening up new research avenues. Such an extension would require innovative approaches and advanced theoretical frameworks to effectively tackle the challenges associated with projection pursuit regression and projection pursuit conditional distribution processes. By embarking on this path, we can further enrich the existing literature and contribute to advancing functional data analysis. Finally, extensions of our ideas would concern dependent statistical samples, with possible time series applications. The literature on dependent kernel functional estimators is rather well developed (cf. [24, 28, 62, 100, 128]), but always without moderate deviation results. This extension should be harder to obtain than the previous ones: one main difficulty in developing a dependent extension of our work would be the statement of new probabilistic results, since those used herein (cf. results in [6, 8]) are tailored specifically to independent and identically distributed (i.i.d.) samples.
PROOFS
In this section, we present the proofs of all the theoretical findings outlined in this study. The previously introduced notation will be consistently applied throughout the ensuing discussion. We now explore slightly more general processes than those defined in (2.2). Specifically, we offer a broader framework that does not necessitate \(\mathcal{M}_{x,l}(\cdot)=c_{l}(x)l(\cdot)+d_{l}(x)\). The proofs of Theorems 3.1 and 3.2 are intricate and will be dissected into multiple lemmas, each elucidated in Section 8.
Proof of Theorem 3.1
The proof utilizes the Gärtner–Ellis theorem as a principal tool. For any integer \(k\geq 1\), any \((x,l,\varrho)\in\mathcal{E}\times\mathcal{L}\times\mathcal{H}_{0}\), and any nonnegative real function \(M_{x,l}(\cdot)\) defined on \(\mathbb{R}\), we have the following
Given the existence of \(K^{\prime}(\cdot)\), it follows
which, by using the Condition (A2)(i), implies that
Therefore, for any \((x,l,\varrho)\in\mathcal{E}\times\mathcal{L}\times\mathcal{H}_{0}\), we infer
Let \(y_{1}:=(x,l_{1})\) and \(y_{2}:=(x,l_{2})\) in \(\mathcal{E}\times\mathcal{L}\), and \((\varrho_{1},\varrho_{2})\in(\mathcal{H}_{0})^{2}\). For any real function \(\mathcal{M}_{y_{1},y_{2}}(\cdot)\) defined on \(\mathbb{R}\) such that \(\mathbf{E}[|\mathcal{M}_{y_{1},y_{2}}(Y)|]<\infty\), observe that
Similarly, we have
which, by using the Condition (A2)(i), implies that
Therefore, we have
For some \(\gamma\in\mathcal{H}_{0}\), set \(\beta_{n}=n\phi(\gamma h)/w_{n}\). For any tuple \((\theta_{1},\ldots,\theta_{m})\in\mathbb{R}^{m}\), the Laplace transform corresponding to \(\beta_{n}(W_{n}(x,z_{1}),\ldots,W_{n}(x,z_{m}))\) is explicitly defined as follows
Let us now evaluate the quantity \(\varphi^{x,z_{1},\ldots,z_{m}}_{n}(\theta_{1},\ldots,\theta_{m})\). We remark that
where
and
Now, observe that
Set \(\varrho=\min(\varrho_{j},\varrho_{p})\). Referring to (7.2) and under the conditions (A1)–(A2) and (A4)(ii), we deduce that
where
Using the equality (7.1) with \(M_{x,l}(v)=1\), in accordance with conditions (A1)–(A2), we find
Now, by Condition (A2)(ii), we have
Therefore, we obtain
Making use of the condition (A3), we infer that
Likewise, by the condition (A3), we derive
Using the \(C_{r}\)-inequality and the boundedness of \(K(\cdot)\), we get, for some \(\kappa_{0}>0,\)
By (7.1), and making use of the conditions (A1)–(A2), there exists a constant \(\kappa_{1}>0\) such that
Again, by the equality (7.1) with \(M_{x,l}(\cdot)=1\), we have
For each \(j=1,\ldots,m\), under conditions (A1)–(A2) and considering the fact that \(f(x)>0\), we obtain
For sufficiently large \(n\) and for some \(\gamma\in\mathcal{H}_{0}\), there exists \(t_{0}>0\) such that \(2^{m}\beta_{n}/(n\phi(\gamma h))=2^{m}/w_{n}\leq t_{0}\). Now, by employing (7.11)–(7.14), we derive
Making use of (7.14), we readily infer
where \(\kappa_{2},\kappa_{3}>0\). By Conditions (A4)(ii)–(iii), we get
Thus, based on the Condition (A3), we deduce
By combining (7.3)–(7.10) with (7.15), we readily obtain
where the function \(R_{\gamma}\) is defined in the Statement (3.1). Note that the Condition (A4) implies that the function \(\Psi_{\gamma}^{x,z_{1},\ldots,z_{m}}\) is finite and differentiable everywhere. The Fenchel–Legendre transform of \(\Psi_{\gamma}^{x,z_{1},\ldots,z_{m}}\) is given by
We need to establish the essential smoothness of the function \(\Psi_{\gamma}^{x,z_{1},\ldots,z_{m}}\) and utilize the Gärtner–Ellis theorem for the proof, as outlined in [49] on page 44. Under the assumption (A4), it is evident that the interior of the set
is not empty. Furthermore, under these conditions, the function \(\Psi_{\gamma}^{x,z_{1},\ldots,z_{m}}\) is shown to be steep, establishing its essential smoothness. In conclusion, this ensures that, for any \(\theta_{1},\ldots,\theta_{m}\in\mathbb{R}\), the function has the properties necessary for applying the Gärtner–Ellis theorem in our proof
there exists a mean-zero Gaussian process \(\{\Xi_{\gamma}(x,z),z\in\mathcal{L}\times\mathcal{H}_{0}\}\) such that
Considering Eq. (3.3) and employing Lemma \(4.1\) from the work of [7], one can express the rate function as follows
where \(\mathcal{Z}_{\gamma}\) is defined in the introduction of this rate function [7, p. 6]. \(\Box\)
The proof of Theorem 3.2 requires several intermediary results, which we present in the following set of lemmas.
Lemma 7.1. Under the assumptions (A1)–(A2) and (B3) we have, for any nonnegative function \(M_{0}(\cdot)\) such that \(\mathbf{E}[M_{0}(Y)]<\infty\),
where \(\alpha_{0}\) is given in (3.2).
The proof of Lemma 7.1 is postponed until Section 8.
In the proof of Theorem 3.2 we shall apply Lemma 7.1 with \(M_{0}(\cdot)\) belonging to a finite class \(\mathcal{M}_{0}\) of nonnegative real-valued functions such that \(\mathbf{E}[M_{0}(Y)]<\infty\). Denote
By Lemma 7.1
By condition (B3), we have \(\delta_{f}:=\inf_{v}\inf_{x\in\mathcal{J}}f_{v}(x)>0\) and therefore, for \(n\) large enough,
Then for any \((x,\varrho)\in\mathcal{J}\times\mathcal{H}_{0}\) and \(M_{0}(\cdot)\in\mathcal{M}_{0}\), we have
where \(J(M_{0}):=\int M_{0}(v)f_{v}(x)g(v)\,dv\). By assumptions (A1) and (A2) we can see that \(\alpha_{0}>0\). Then by condition (B4)(i) and the fact that \(0<\tau_{0}\left(\displaystyle\frac{\vartheta_{1}}{\vartheta_{2}}\right)\leq\tau_{0}\left(\displaystyle\frac{\varrho}{\gamma}\right)\), we obtain, for any \((x,\varrho)\in\mathcal{J}\times\mathcal{H}_{0}\),
where \(C_{1}\) and \(C_{2}\) are two strictly positive constants. For any \(l\in\mathcal{L}\), set
Lemma 7.2. Assuming that the conditions (A1)–(A2), (B1), (B3), and (B4)(i) or (B\({}^{\prime}\)4)(i) are satisfied, along with \(\tau_{0}\left(\displaystyle\frac{\vartheta_{1}}{\vartheta_{2}}\right)>0\), then for any given \(\epsilon>0\), there exists a finite subclass \(\mathcal{L}_{\epsilon}\) of \(\mathcal{L}\) such that, for sufficiently large \(n\), for any \(l_{1}\in\mathcal{L}\), we have
The proof of Lemma 7.2 is postponed until Section 8.
Set \(\epsilon>0\) and select \(n_{0}>0\) to be sufficiently large such that (7.19) holds for all \(n\geq n_{0}\). For any \(l_{1},l_{2}\in\mathcal{L}\), define
Consider any \(\zeta_{1},\zeta_{2}>0\) and define
where \(z_{i}=(l_{i},\varrho_{i})\), \(i=1,2\). For any \(0<\delta<1\), let \(l_{n}=\mathcal{N}(\delta\phi(\gamma h),\mathcal{J},d)\).
Proposition 7.3. Under assumptions of Theorem 3.2, for any \(\eta>0\), we have, for any \(x\in\mathcal{J}\)
where \(B(x,\delta\phi(\gamma h))\) is the open ball with center \(x\) and radius \(\delta\phi(\gamma h)\). In order to prove Proposition 7.3, we need an exponential inequality for the empirical process. Let us first introduce some additional notation. Let \((\mathcal{X},\mathcal{A})\) be a measurable space on which we consider a uniformly bounded collection of measurable functions \(\mathcal{F}\). The class \(\mathcal{F}\) is said to be a bounded measurable VC class of functions if it satisfies the Condition (B1). For any map \(T\) from \(\mathcal{F}\) into \(\mathbb{R}\), set
Let \(\mu\) be any probability measure on \((\mathcal{X},\mathcal{A})\) and \(Pr=\prod_{i\in\mathbb{N}}\mu_{i}\) the product probability measure where, for \(i\in\mathbb{N}\), \(\mu_{i}=\mu\). Set \(\pi_{i}:\mathcal{X}^{\mathbb{N}}\mapsto\mathcal{X}\), \(i\in\mathbb{N}\), to be the coordinate functions. The following lemma is due to [71].
Lemma 7.4. Let \(\mathcal{F}\) be a measurable uniformly bounded VC class of functions, and let \(\sigma^{2}\) and \(U\) be any numbers such that \(\sigma^{2}\geq\sup_{f\in\mathcal{F}}{\textrm{Var}}_{Pr}(f)\), \(U\geq\sup_{f\in\mathcal{F}}||f||_{\infty},\) and \(0<\sigma^{2}\leq U/2\). Then there exist constants \(C\) and \(M\), depending only on the characteristic \((C,\nu)\) of the class \(\mathcal{F}\), such that the inequality
whenever
The proof of Proposition 7.3 is split up into two cases, the unbounded case where the Assumption (B4) is assumed and the bounded case where we suppose that the Condition (B\({}^{\prime}\)4) is satisfied.
The unbounded case will follow as a consequence of a sequence of lemmas. Set, for any \((x,z)=(x,l,\varrho)\in\mathcal{J}\times\mathcal{L}\times\mathcal{H}_{0}\)
We will first show that the processes
are exponentially contiguous.
Lemma 7.5. Under the assumptions (A1)–(A3) and (B3)–(B4)(ii), for any \(\eta>0\), we have
The proof of Lemma 7.5 is postponed until Section 8.
For any \(z=(l,\varrho)\in\mathcal{L}\times\mathcal{H}_{0}\) and \(x\in\mathcal{J}\), set
Lemma 7.6. Assume that assumptions (A1)–(A2) and (B1)–(B3) hold true. Furthermore, consider the classes of continuous functions \(\mathcal{C}\) and \(\mathcal{D}\) given above. Then the class
is a pointwise measurable class of functions with the envelope function
and satisfying the condition
where \(C_{0}\), \(C_{1}\), and \(\nu\) are suitable positive constants.
The proof of Lemma 7.6 is postponed until Section 8.
Lemma 7.7. Assuming that the conditions (A1)–(A2), (B1), (B3), and (B4)(i) or (B\({}^{\prime}\)4)(i) are satisfied, along with \(\tau_{0}\left(\displaystyle\frac{\vartheta_{1}}{\vartheta_{2}}\right)>0\), then, for \(n\) large enough and any \(\epsilon>0\), there exist \(\delta,\zeta_{1},\zeta_{2}>0\) such that for any \((z_{1},z_{2})\in\mathcal{F}_{0}(\zeta_{1},\zeta_{2})\), we have, for any \(x_{1},x_{2}\in\mathcal{J}\) such that \(d(x_{1},x_{2})\leq\delta\phi(\gamma h)\),
for suitable constant \(C>0\).
The proof of Lemma 7.7 is postponed until Section 8.
Proof of Proposition 7.3
The proof uses Lemma 7.4 as a device. By Lemma 7.6 the following classes
are measurable VC classes of functions. Now, for \(n\) large enough, take \(U=C_{0}(C_{\mathcal{L}}+D_{\mathcal{L}})\mathcal{M}^{\textrm{inv}}(n)\), where \(C_{0}\) is as in Lemma 7.6, \(\mathcal{F}=\mathcal{F}_{n,x}(\delta,\zeta_{1},\zeta_{2})\), and, by Lemma 7.7, \(\sigma^{2}=C\epsilon\phi(\gamma h)\). Then, for \(n\) large enough, we have \(\sigma\leq U/2\), and by the Condition (B4)(iii)–(iv), we infer
and
Consequently, there exists a positive integer \(n_{0}\) such that, for any \(n\geq n_{0}\),
and
Applying Lemma 7.4, for any \(n\geq n_{0}\), we obtain
Therefore, by (B4)(iii)–(iv), we infer
Letting \(\epsilon\) go to \(0\), we prove that the process
fulfills the results of Proposition 7.3. Finally, the same conclusion holds for the process \(\{{W}_{n}(x,z):(x,z)\in\mathcal{J}\times\mathcal{L}\times\mathcal{H}_{0}\}\) in the unbounded case in view of Lemma 7.5. In the bounded case, under Condition (B\({}^{\prime}\)4), we take
The same arguments as above yield, under Condition (B\({}^{\prime}\)4), the same inequality as in (7.22). This completes the proof. \(\Box\)
Proof of Theorem 3.2
By combining the findings from Theorem 3.1, Proposition 7.3, and Lemma 7.2, and by applying Theorem 3.1 in [6], we establish that the process \(\{w_{n}W_{n}(x,z):z\in\mathcal{L}\times\mathcal{H}_{0}\}\) satisfies the large deviation principle in \(l_{\infty}(\mathcal{L}\times\mathcal{H}_{0})\), with speed \(n\phi(\gamma h)/w_{n}^{2}\) and the corresponding good rate function
Finally, by Theorem 4.2 in [7], this rate function can be expressed as
Hence the proof is complete. \(\Box\)
By condition (J), there exist \(x_{n,1},\ldots,x_{n,l_{n}}\) in \(\mathcal{J}\) such that
and there exists \(\nu\geq 0\) such that \(l_{n}\leq C(\delta\phi(\gamma h))^{-\nu}\), for some suitable positive constant \(C\). Here \(B_{n,k}\) denotes the open ball with center \(x_{n,k}\) and radius \(\delta\phi(\gamma h)\).
Proof of Theorem 3.3
The lower bound is easy. In fact, for any \(x\in\mathcal{J}\), by Theorem 3.2 we have
Hence
Now we show the upper bound. By condition (J) and Proposition 7.3, we obtain from Condition (B4)(iv) and the inequality \(\log(a+b)\leq\log 2+\max(\log a,\log b)\), \(a\geq 0\), \(b\geq 0\), for any \(\epsilon<\lambda\),
On the other hand
It follows, by Condition (B4)(iv), that
The proof of Theorem 3.3 is then completed by letting \(\epsilon\) tend to zero, since the function \(I^{\gamma}(\cdot)\) is continuous. \(\Box\)
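The elementary inequality \(\log(a+b)\leq\log 2+\max(\log a,\log b)\) used in the upper bound is worth spelling out, since it is what lets the union over the \(l_{n}\) covering balls cost only a negligible additive term after normalizing by the speed:

```latex
a+b \;\le\; 2\max(a,b)
\quad\Longrightarrow\quad
\log(a+b)\;\le\;\log 2+\log\max(a,b)\;=\;\log 2+\max(\log a,\log b).
% After multiplying by w_n^2/(n\phi(\gamma h)) \to 0, the additive \log 2
% (and, iterating over the union bound, the \log l_n term, which condition (J)
% keeps of order \log(\phi(\gamma h)^{-1})) vanishes in the limit.
```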
To prove Corollary 4.1, we need some intermediate results. For any \(x\in\mathcal{J}\) and any \((l,\varrho)\in\mathcal{L}\times\mathcal{H}_{0}\), set
Lemma 7.8. Under the assumptions of Theorem 3.2, assume that the condition (C1) holds. Then the process
satisfies a large deviation principle in \(l_{\infty}(\mathcal{L}\times\mathcal{H}_{0})\) with the speed \((n\phi(\gamma h)/w_{n}^{2})\) and the good rate function \(I_{1,x}^{\gamma}(\cdot)\).
The proof of Lemma 7.8 is postponed until Section 8.
Proof of Corollary 4.1
By choosing \(c_{l}(x)=1\) and \(d_{l}(x)=-r^{l}(x)\), the sequence \(\mathfrak{B}_{n}(x,l,\varrho h)\) in (7.23) may then be written as
The subsequent statements show that the processes
and
are exponentially contiguous. Indeed, we have
Since \(\lim\limits_{n\rightarrow\infty}w_{n}=\infty\), for \(n\) large enough, it follows that
Now, by Lemma 7.8 the sequence
satisfies a large deviation principle with the speed \((n\phi(\gamma h)/w_{n}^{2})\) and the good rate function \(I^{\gamma}_{1,x}(\cdot)\); hence, there exists a constant \(c_{1}>0\) such that
Moreover, an application of Theorem 3.2 guarantees the existence of a real \(c_{2}>0\) such that
We then deduce that
which means that the processes
and
are exponentially contiguous. Thus, Corollary 4.1 follows by making use of Lemma 7.8. \(\Box\)
Proof of Corollary 4.2
To see how Corollary 4.2 follows from Theorem 3.2, take in this case
for each \(x\in\mathcal{J}\). Under Condition (C2), the set of functions \(\mathcal{D}\) constitutes a collection of uniformly equicontinuous functions on \((\mathcal{J},d)\), and \(\mathcal{D}\) is a set of uniformly bounded functions. Applying the Arzelà–Ascoli theorem (see, for instance, Theorem IV.6.7 in [56]), it follows that \(\mathcal{D}\) is a compact set within \(l_{\infty}(\mathcal{J})\). It is worth noting that this theorem applies even when \((\mathcal{J},d)\) is a totally bounded pseudo-metric space, not necessarily a compact one. Now, employing the same line of reasoning as in the proof of Corollary 4.1, the proof is complete. \(\Box\)
Proof of Corollary 4.3
By Lemma 3.9.23 of [131], the inverse map \(\Phi:G\mapsto G^{-1}\), as a map from \(D_{1}[F^{-1}(p|x)-\varepsilon,F^{-1}(q|x)+\varepsilon]\) to \(l_{\infty}[p,q]\), is Hadamard differentiable at \(F(\ \cdot|x)\) tangentially to \(C[F^{-1}(p|x)-\varepsilon,F^{-1}(q|x)+\varepsilon]\), and the derivative is the map
Therefore, by Theorem 3.1 of [67] and Corollary 4.2, we conclude that
satisfies the LDP in \(l_{\infty}([p,q]\times\mathcal{H}_{0})\) with speed \(n\phi(\gamma h)/w_{n}^{2}\) and the rate function \(I_{\gamma,x}^{EQ}(\cdot).\) \(\Box\)
PROOF OF THE TECHNICAL LEMMAS
Proof of Lemma 7.1
Note that conditions (A1) and (A2)(i) imply the equality (7.1), and this last equality with \(M_{x,l}(v)=M_{0}(v)\) gives that, for any \((x,\varrho)\in\mathcal{J}\times\mathcal{H}_{0}\),
where
Now, observe that, for any \((x,\varrho)\in\mathcal{J}\times\mathcal{H}_{0}\),
Making use of the conditions (A1)–(A2) and (B3), we derive
This completes the proof of Lemma 7.1. \(\Box\)
Proof of Lemma 7.2
Making use of the conditions (A1)–(A2) and (B3)–(B4)(i), and applying Lemma 7.1, there exists a positive constant \(C_{1}\) such that, for every pair \(l,l^{\prime}\in\mathcal{L}\), every \(x\) in \(\mathcal{J}\), every \(\varrho\) in \(\mathcal{H}_{0}\), and for sufficiently large \(n\), the following holds:
According to (A2)(ii), we obtain, uniformly for \(\varrho\) in \(\mathcal{H}_{0}\),
By combining the last equation with the observation that \(0<\tau_{0}\left(\displaystyle\frac{\vartheta_{1}}{\vartheta_{2}}\right)\leq\tau_{0}\left(\displaystyle\frac{\varrho}{\gamma}\right)\), we can deduce the existence of a positive constant \(C_{2}\) such that, for sufficiently large \(n,\)
Henceforth, there exists a positive constant \(C>0\) such that, for every pair \(l,l^{\prime}\in\mathcal{L}\), every \(\varrho\) in \(\mathcal{H}_{0}\), and for sufficiently large \(n\),
Given the Condition (B1) on the class \(\mathcal{L}\), which states that it is totally bounded with respect to the distance \(d_{Q}(\cdot,\cdot)\), where \(Q(\cdot)\) is a distribution with density \(g(\cdot)\), it follows that for any \(\delta>0\), there exists a finite subclass \(\mathcal{L}_{1}\subset\mathcal{L}\) such that
Furthermore, due to the compactness of the function classes \(\mathcal{C}\) and \(\mathcal{D}\), we can identify finite subclasses \(\mathcal{L}_{2}\subset\mathcal{L}\) and \(\mathcal{L}_{3}\subset\mathcal{L}\) such that
Therefore, employing the observation that both
After a brief arithmetic manipulation and selecting a sufficiently small \(\delta>0\), we obtain
where the minimum is taken over \(\mathcal{L}_{1}\times\mathcal{L}_{2}\times\mathcal{L}_{3}\). Now, consider any triple \((l_{1}^{\prime},l_{2}^{\prime},l_{3}^{\prime})\in\mathcal{L}_{1}\times\mathcal{L}_{2}\times\mathcal{L}_{3}\) for which there exists \(l^{\prime}\in\mathcal{L}\) such that
select one of them to form the desired subclass \(\mathcal{L}_{\epsilon}\). To conclude the proof, it is enough to apply the triangle inequality. \(\Box\)
Proof of Lemma 7.5
Set, for \(z=(x,l,\varrho)\in\mathcal{J}\times\mathcal{L}\times\mathcal{H}_{0}\),
Observe first that \(s/\mathcal{M}(s)\) is a decreasing function. In turn, this implies, for any \(v\) such that \(L(v)\geq\mathcal{M}^{\textrm{inv}}(n)\), that
Thus, for \(n\) large enough, uniformly in \((x,l,\varrho)\in\mathcal{J}\times\mathcal{L}\times\mathcal{H}_{0}\), we have
Thus, by conditions (A1)–(A2) and (B3)–(B4)(i) and using Lemma 7.1, there exist two strictly positive constants \(C_{1},C_{2}\) such that, for any \(x\in\mathcal{J}\) and \(\varrho\in\mathcal{H}_{0}\), we have
and
Hence, we readily infer
which by (A3) converges to \(0\) as \(n\rightarrow\infty\). Considering now the inequality (8.2) and the boundedness of the kernel \(K(\cdot)\) in (A1), for any \(z\in\mathcal{J}\times\mathcal{L}\times\mathcal{H}_{0}\), we have
By (A2)(ii), we have, uniformly in \(\varrho\in\mathcal{H}_{0}\),
Combining this with the fact that \(0<\tau_{0}(\displaystyle\frac{\vartheta_{1}}{\vartheta_{2}})\leq\tau_{0}(\displaystyle\frac{\varrho}{\gamma})\) implies that there exists a constant \(C_{2}>0\) such that, for \(n\) large enough,
Therefore, for any \(\eta>0\) and \(n\) large enough, we have
The application of the exponential Tchebychev inequality, with \(t\) chosen as in condition (B4)(i), yields
which by Assumptions (A3) and (B4)(ii) converges to \(-\infty\) as \(n\rightarrow\infty\). \(\Box\)
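For completeness, here is the exponential Tchebychev (Chernoff-type Markov) inequality invoked in the last step, in generic form for a real-valued random variable \(S\), a level \(\eta>0\), and any \(t>0\):

```latex
\mathbb{P}\bigl(S\geq\eta\bigr)
  =\mathbb{P}\bigl(e^{tS}\geq e^{t\eta}\bigr)
  \leq e^{-t\eta}\,\mathbb{E}\bigl[e^{tS}\bigr].
% Taking logarithms and choosing t as in condition (B4)(i) produces the bound
% whose normalized logarithm tends to -\infty under (A3) and (B4)(ii).
```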
Proof of Lemma 7.6
As in the proof of Lemma 5 in [58], it follows that the class
satisfies for any probability measure \(Q\) on Borel subsets of \(\mathcal{J}\times\mathbb{R}\) the condition
where \(\tilde{C},\tilde{\nu}>0\) are suitable positive constants. Next, consider the function class
Applying the same reasoning as in the proof of Lemma 5 in [58] and leveraging the Vapnik–Červonenkis property of \(\mathcal{L}\), along with the inequality (8.2) and the bounded nature of the class \(\mathcal{C}\), it follows that the class \(\tilde{\mathfrak{M}_{2}}\) possesses a polynomial covering number. Consequently, by Lemma A.1 in [58], the product class \(\tilde{\mathfrak{M}_{1}}\cdot\tilde{\mathfrak{M}_{2}}\) exhibits a polynomial covering number. Now consider the class
Utilizing inequality (8.2), the bounded nature of the class \(\mathcal{D}\), and invoking Lemma A.1 from [58], it can be established that the product class \(\mathcal{K}\cdot\tilde{\mathfrak{M}_{3}}\) possesses a polynomial covering number. Consequently, the class resulting from the summation of \(\tilde{\mathfrak{M}_{1}}\cdot\tilde{\mathfrak{M}_{2}}\) and \(\mathcal{K}\cdot\tilde{\mathfrak{M}_{3}}\) also exhibits a polynomial covering number. Finally, the class \(\mathfrak{M}\) satisfies this covering property as well. The measurability follows from the continuity of the kernel function, the separability of \((\mathcal{J},d)\), and the fact that the functions belong to the classes \(\mathcal{C}\) and \(\mathcal{D}\). \(\Box\)
Proof of Lemma 7.7
Recall Eq. (7.21). Note that, for any \((x_{1},x_{2})\in\mathcal{J}^{2}\) and any \((z_{1},z_{2})\in\big{(}\mathcal{L}\times\mathcal{H}_{0}\big{)}^{2}\),
According to Lemma 7.2, for sufficiently large \(n\), there exists a constant \(C>0\) such that
Now, observe that
Using the fact that the classes \(\mathcal{C}\) and \(\mathcal{D}\) are uniformly equicontinuous, it follows, for any \(\epsilon>0\), that there exists \(\delta>0\) such that for any \(x_{1},x_{2}\in\mathcal{J}\), with \(d(x_{1},x_{2})\leq\delta\), and
Hence, we have
Using the same arguments as in (8.1), by the Conditions (A1)–(A2) and (B3)–(B4)(i), and applying Lemma 7.1 with the observation that \(0<\tau_{0}\left(\displaystyle\frac{\vartheta_{1}}{\vartheta_{2}}\right)\leq\tau_{0}\left(\displaystyle\frac{\varrho_{1}}{\gamma}\right)\), there exists a positive constant \(C_{1}\) such that
Similarly, and using the fact that the classes \(\mathcal{C}\) and \(\mathcal{D}\) are uniformly bounded, for suitable finite constants \(C_{1}\) and \(C_{2}\), we have
By (7.17) and (B4)(i), and by using the fact that the classes \(\mathcal{C}\) and \(\mathcal{D}\) are uniformly bounded, for suitable finite constants \(C_{1}\) and \(C_{2}\), we have
Now, by (A1)(i), we obtain, for some constant \(C\),
Using the same arguments as in (7.1), we get
where
By the Conditions (A1)–(A2) and (B4)(i), it follows that
By using the same arguments as above, it follows that
which implies, by using the fact that \(d(x_{1},x_{2})\leq\delta\phi(\gamma h)\) and \(\phi(\gamma h)/h=O(1)\),
Similarly
and
Finally, from (8.4)–(8.7), we deduce
which completes the proof of Lemma 7.7. \(\Box\)
Proof of Lemma 7.8
Observe, for any \((x,z)\in\mathcal{J}\times\mathcal{L}\times\mathcal{H}_{0}\), that
where \(z=(l,\varrho)\). To prove Lemma 7.8, we have to show that
Now, for any \((x,l,\varrho)\in\mathcal{J}\times\mathcal{L}\times\mathcal{H}_{0}\), observe that
Since the kernel function \(K(\cdot)\) is \([0,1]\)-supported by condition (A1), it follows that
Condition (C1) then yields the claimed result. \(\Box\)
Notes
Let us first recall the concepts of large and moderate deviations. A sequence \(\left\{Z_{n},n\geq 1\right\}\) of \(\mathbb{R}\)-valued random variables is said to satisfy a large deviation principle (LDP) with speed \(v_{n}\) and rate function \(I(\cdot)\) if, for any closed set \(F\subset\mathbb{R}\),
$$\limsup_{n\rightarrow\infty}v_{n}^{-1}\log\left(\mathbb{P}\left(Z_{n}\in F\right)\right)\leq-\inf_{x\in F}I(x),$$
and, for any open set \(G\subset\mathbb{R}\),
$$\liminf_{n\rightarrow\infty}v_{n}^{-1}\log\left(\mathbb{P}\left(Z_{n}\in G\right)\right)\geq-\inf_{x\in G}I(x).$$
Let \(a_{n}\) be a nonrandom sequence tending to infinity. If there exists a function \(c(n)\) such that \(\left(a_{n}\left(Z_{n}-c(n)\right)\right)\) satisfies an LDP, then \(Z_{n}\) is said to satisfy a moderate deviation principle (MDP). Roughly speaking, the MDP for \(Z_{n}\) is the LDP for \(\left(a_{n}\left(Z_{n}-c(n)\right)\right)\).
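As a classical one-dimensional illustration of this dichotomy (not part of the functional framework of this paper): for i.i.d. centered real random variables with variance \(\sigma^{2}\) and a finite exponential moment in a neighborhood of zero, the sample mean \(\bar{X}_{n}\) satisfies an MDP with a Gaussian rate function:

```latex
% Take Z_n = \bar X_n, c(n) = 0, and a_n = \sqrt{n}/b_n, where b_n \to \infty
% and b_n/\sqrt{n} \to 0. Then a_n Z_n = \sqrt{n}\,\bar X_n/b_n satisfies an
% LDP with speed v_n = b_n^2 and good rate function
I(x)=\frac{x^{2}}{2\sigma^{2}},
% interpolating between the CLT regime (b_n bounded) and Cramér's LDP
% (b_n of order \sqrt{n}).
```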
A semi-metric (sometimes called a pseudo-metric) \(d(\cdot,\cdot)\) satisfies all the properties of a metric except that \(d(x_{1},x_{2})=0\) is allowed for some \(x_{1}\neq x_{2}\).
Given two functions \(l\) and \(u\), the bracket \([l,u]\) represents the set of all functions \(f\) such that \(l\leq f\leq u\). An \(\varepsilon\)-bracket is a bracket \([l,u]\) with \(||u-l||<\varepsilon\). The bracketing number \(N_{[]}(\mathcal{F},||\cdot||,\varepsilon)\) is the minimum number of \(\varepsilon\)-brackets needed to cover the class \(\mathcal{F}\). The entropy with bracketing is the logarithm of the bracketing number. Note that, in the definition of the bracketing number, the upper and lower bounds \(u\) and \(l\) of the brackets need not belong to \(\mathcal{F}\) itself, but they are assumed to have finite norms; see Definition 2.1.6 in [131].
REFERENCES
I. M. Almanjahie, S. Bouzebda, Z. Chikr Elmezouar, and A. Laksaci, ‘‘The functional \(k\)NN estimator of the conditional expectile: Uniform consistency in number of neighbors,’’ Stat. Risk Model. 38 (3–4), 47–63 (2022).
I. M. Almanjahie, S. Bouzebda, Z. Kaid, and A. Laksaci, ‘‘Nonparametric estimation of expectile regression in functional dependent data,’’ J. Nonparametr. Stat. 34 (1), 250–281 (2022).
I. M. Almanjahie, S. Bouzebda, Z. Kaid, and A. Laksaci, ‘‘The local linear functional \(k\)NN estimator of the conditional expectile: Uniform consistency in number of neighbors,’’ Metrika 34 (1), 1–29 (2024).
G. Aneiros, R. Cao, R. Fraiman, and P. Vieu, ‘‘Editorial for the special issue on functional data analysis and related topics,’’ J. Multivariate Anal. 170, 1–2 (2019).
G. Aneiros-Pérez and P. Vieu, ‘‘Nonparametric time series prediction: A semi-functional partial linear modeling,’’ J. Multivariate Anal. 99 (5), 834–857 (2008).
M. A. Arcones, ‘‘The large deviation principle for stochastic processes. I,’’ Teor. Veroyatnost. i Primenen. 47 (4), 727–746 (2002).
M. A. Arcones, ‘‘The large deviation principle for stochastic processes. II,’’ Teor. Veroyatnost. i Primenen. 48 (1), 122–150 (2003).
M. A. Arcones, Moderate Deviations of Empirical Processes. In Stochastic Inequalities and Applications, Vol. 56 of Progr. Probab. (Birkhäuser, Basel, 2003), pp. 189–212.
R. R. Bahadur, Some limit theorems in statistics, No. 4: Conference Board of the Mathematical Sciences Regional Conference Series in Applied Mathematics, Society for Industrial and Applied Mathematics (Philadelphia, PA, 1971).
R. R. Bahadur and S. L. Zabell, ‘‘Large deviations of the sample mean in general vector spaces,’’ Ann. Probab. 7 (4), 587–621 (1979).
N. Berrahou, ‘‘Principe de grandes déviations uniforme pour l’estimateur de la densité par la méthode des delta-suites,’’ C. R. Math. Acad. Sci. Paris 343 (9), 595–600 (2006).
N. Berrahou, ‘‘Large deviations probabilities for a symmetry test statistic based on delta-sequence density estimation,’’ Statist. Probab. Lett. 78 (3), 238–248 (2008).
V. I. Bogachev, Gaussian Measures, Vol. 62: Mathematical Surveys and Monographs. American Mathematical Society (Providence, RI, 1998).
D. Bosq, Linear Processes in Function Spaces: Theory and Applications, Vol. 149: Lecture Notes in Statistics (Springer-Verlag, New York, 2000).
S. Bouzebda, ‘‘On the strong approximation of bootstrapped empirical copula processes with applications,’’ Math. Methods Statist. 21 (3), 153–188 (2012).
S. Bouzebda, ‘‘General tests of conditional independence based on empirical processes indexed by functions,’’ Jpn. J. Stat. Data Sci. 6 (1), 115–177 (2023).
S. Bouzebda, ‘‘On the weak convergence and the uniform-in-bandwidth consistency of the general conditional \(U\)-processes based on the copula representation: Multivariate setting,’’ Hacet. J. Math. Stat. 52 (5), 1303–1348 (2023).
S. Bouzebda and M. Chaouch, ‘‘Uniform limit theorems for a class of conditional \(Z\)-estimators when covariates are functions,’’ J. Multivariate Anal. 189 (104872), 21 (2022).
S. Bouzebda and M. Cherfi, ‘‘General bootstrap for dual \(\phi\)-divergence estimates,’’ J. Probab. Stat., Art. ID 834107, 33 (2012).
S. Bouzebda and S. Didi, ‘‘Some results about kernel estimators for function derivatives based on stationary and ergodic continuous time processes with applications,’’ Comm. Statist. Theory Methods 51 (12), 3886–3933 (2022).
S. Bouzebda and I. Elhattab, ‘‘Uniform in bandwidth consistency of the kernel-type estimator of the Shannon’s entropy,’’ C. R. Math. Acad. Sci. Paris 348 (5–6), 317–321 (2010).
S. Bouzebda and I. Elhattab, ‘‘Uniform-in-bandwidth consistency for kernel-type estimators of Shannon’s entropy,’’ Electron. J. Stat. 5, 440–459 (2011).
S. Bouzebda and N. Limnios, ‘‘The uniform CLT for the empirical estimator of countable state space semi-Markov kernels indexed by functions with applications,’’ J. Nonparametr. Stat. 34 (4), 758–788 (2022).
S. Bouzebda and B. Nemouchi, ‘‘Weak-convergence of empirical conditional processes and conditional \(U\)-processes involving functional mixing data,’’ Stat. Inference Stoch. Process. 26 (1), 33–88 (2023).
S. Bouzebda and A. Nezzal, ‘‘Uniform consistency and uniform in number of neighbors consistency for nonparametric regression estimates and conditional \(U\)-statistics involving functional data,’’ Jpn. J. Stat. Data Sci. 5 (2), 431–533 (2022).
S. Bouzebda and A. Nezzal, ‘‘Asymptotic properties of conditional \(U\)-statistics using delta sequences,’’ Comm. Statist. Theory Methods, 1–56 (2024). https://doi.org/10.1080/03610926.2023.2179887
S. Bouzebda and A. Nezzal, ‘‘Uniform in number of neighbors consistency and weak convergence of \(k\)NN empirical conditional processes and \(k\)NN conditional \(U\)-processes involving functional mixing data,’’ AIMS Math. 9 (2), 4427–4550 (2024).
S. Bouzebda and I. Soukarieh, ‘‘Non-parametric conditional \(U\)-processes for locally stationary functional random fields under stochastic sampling design,’’ Mathematics 11 (1), 1–70 (2023).
S. Bouzebda and N. Taachouche, ‘‘On the variable bandwidth kernel estimation of conditional \(U\)-statistics at optimal rates in sup-norm,’’ Phys. A 625 (129000), 72 (2023).
S. Bouzebda and N. Taachouche, ‘‘Rates of the strong uniform consistency for the kernel-type regression function estimators with general kernels on manifolds,’’ Math. Methods Statist. 32 (1), 27–80 (2023).
S. Bouzebda and N. Taachouche, ‘‘Rates of the strong uniform consistency with rates for conditional \(U\)-statistics estimators with general kernels on manifolds,’’ Math. Methods Statist. 33 (1), 1–55 (2024).
S. Bouzebda and T. Zari, ‘‘Strong approximation of multidimensional \(\mathbb{P}\)–\(\mathbb{P}\) plots processes by Gaussian processes with applications to statistical tests,’’ Math. Methods Statist. 23 (3), 210–238 (2014).
S. Bouzebda, I. Elhattab, and C. T. Seck, ‘‘Uniform in bandwidth consistency of nonparametric regression based on copula representation,’’ Statist. Probab. Lett. 137, 173–182 (2018).
S. Bouzebda, I. Elhattab, and B. Nemouchi, ‘‘On the uniform-in-bandwidth consistency of the general conditional \(U\)-statistics based on the copula representation,’’ J. Nonparametr. Stat. 33 (2), 321–358 (2021).
S. Bouzebda, M. Chaouch, and S. Didi Biha, ‘‘Asymptotics for function derivatives estimators based on stationary and ergodic discrete time processes,’’ Ann. Inst. Statist. Math. 74 (4), 737–771 (2022).
S. Bouzebda, I. Elhattab, and A. A. Ferfache, ‘‘General \(M\)-estimator processes and their \(m\) out of \(n\) bootstrap with functional nuisance parameters,’’ Methodol. Comput. Appl. Probab. 24 (4), 2961–3005 (2022b).
S. Bouzebda, A. Laksaci, and M. Mohammedi, ‘‘Single index regression model for functional quasi-associated time series data,’’ REVSTAT 20 (5), 605–631 (2022c).
S. Bouzebda, A. Laksaci, and M. Mohammedi, ‘‘The \(k\)-nearest neighbors method in single index regression model for functional quasi-associated time series data,’’ Rev. Mat. Complut. 36 (2), 361–391 (2023).
J. E. Chacón and T. Duong, Multivariate Kernel Smoothing and Its Applications, Vol. 160: Monographs on Statistics and Applied Probability (CRC Press, Boca Raton, FL, 2018).
D. Chen, P. Hall, and H.-G. Müller, ‘‘Single and multiple index functional regression models with nonparametric link,’’ Ann. Statist. 39 (3), 1720–1747 (2011).
M. Cherfi, ‘‘Large deviations theorems in nonparametric regression on functional data,’’ C. R. Math. Acad. Sci. Paris 349 (9–10), 583–585 (2011).
H. Chernoff, ‘‘A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations,’’ Ann. Math. Statistics 23, 493–507 (1952).
K. Chokri and S. Bouzebda, ‘‘Uniform-in-bandwidth consistency results in the partially linear additive model components estimation,’’ Comm. Statist. Theory Methods, 1–42 (2023).
J. A. Clarkson and C. R. Adams, ‘‘On definitions of bounded variation for functions of two variables,’’ Trans. Amer. Math. Soc. 35 (4), 824–854 (1933).
S. Dabo-Niang and A. Laksaci, ‘‘Nonparametric quantile regression estimation for functional dependent data,’’ Comm. Statist. Theory Methods 41 (7), 1254–1268 (2012).
P. Deheuvels, ‘‘One bootstrap suffices to generate sharp uniform bounds in functional estimation,’’ Kybernetika (Prague) 47 (6), 855–865 (2011).
P. Deheuvels, ‘‘Uniform-in-bandwidth functional limit laws for multivariate empirical processes,’’ in: High dimensional probability VIII–The Oaxaca volume, Vol. 74 of Progr. Probab. (Birkhäuser/Springer, Cham, 2019), pp. 201–239.
P. Deheuvels and D. M. Mason, ‘‘General asymptotic confidence bands based on kernel-type function estimators,’’ Stat. Inference Stoch. Process. 7 (3), 225–277 (2004).
A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications, Vol. 38: Applications of Mathematics, 2nd ed. (Springer-Verlag, New York, 1998).
J.-D. Deuschel and D. W. Stroock, Large Deviations, Vol. 137: Pure and Applied Mathematics (Academic Press, Inc., Boston, MA, 1989).
L. Devroye and L. Györfi, Nonparametric Density Estimation: The \(L_{1}\) View. Wiley Series in Probability and Mathematical Statistics: Tracts on Probability and Statistics (John Wiley and Sons, Inc., New York, 1985).
L. Devroye and G. Lugosi, Combinatorial Methods in Density Estimation, Springer Series in Statistics (Springer-Verlag, New York, 2001).
J. Dony and U. Einmahl, ‘‘Uniform in bandwidth consistency of kernel regression estimators at a fixed point,’’ in: High dimensional probability V: The Luminy volume, Vol. 5: Inst. Math. Stat. (IMS) Collect. Inst. Math. Statist (Beachwood, OH, 2009), pp. 308–325.
L. Douge, ‘‘Théorèmes limites pour des variables quasi-associées hilbertiennes,’’ Ann. I.S.U.P. 54 (1–2), 51–60 (2010).
R. M. Dudley, Uniform Central Limit Theorems, Vol. 63: Cambridge Studies in Advanced Mathematics (Cambridge University Press, Cambridge, 1999).
N. Dunford and J. T. Schwartz, Linear Operators, I: General Theory, Pure and Applied Mathematics, Vol. 7 (Interscience Publishers, Inc., New York; Interscience Publishers Ltd., London. With the assistance of W. G. Bade and R. G. Bartle, 1958).
P. P. B. Eggermont and V. N. LaRiccia, Maximum Penalized Likelihood Estimation, Vol. II (Springer Series in Statistics. Springer, Dordrecht, 2009).
U. Einmahl and D. M. Mason, ‘‘An empirical process approach to the uniform consistency of kernel-type function estimators,’’ J. Theoret. Probab. 13 (1), 1–37 (2000).
U. Einmahl and D. M. Mason, ‘‘Uniform in bandwidth consistency of kernel-type function estimators,’’ Ann. Statist. 33 (3), 1380–1403 (2005).
R. S. Ellis, Entropy, Large Deviations, and Statistical Mechanics, Classics in Mathematics (Springer-Verlag, Berlin, Reprint of the 1985 original, 2006).
F. Ferraty and P. Vieu, ‘‘Dimension fractale et estimation de la régression dans des espaces vectoriels semi-normés,’’ C. R. Acad. Sci. Paris Sér. I Math. 330 (2), 139–142 (2000).
F. Ferraty and P. Vieu, Nonparametric Functional Data Analysis: Theory and Practice, Springer Series in Statistics (Springer, New York, 2006).
F. Ferraty, A. Laksaci, and P. Vieu, ‘‘Estimating some characteristics of the conditional distribution in nonparametric functional models,’’ Stat. Inference Stoch. Process. 9 (1), 47–76 (2006).
F. Ferraty, A. Mas, and P. Vieu, ‘‘Nonparametric regression on functional data: Inference and practical aspects,’’ Australian and New Zealand Journal of Statistics 49, 267–286 (2007).
F. Ferraty, A. Laksaci, A. Tadj, and P. Vieu, ‘‘Rate of uniform consistency for nonparametric estimates with functional variables,’’ J. Statist. Plann. Inference 140 (2), 335–352 (2010).
J. C. Fu, ‘‘Large sample point estimation: A large deviation theory approach,’’ Ann. Statist. 10 (3), 762–771 (1982).
F. Gao and X. Zhao, ‘‘Delta method in large deviations and moderate deviations for estimators,’’ Ann. Statist. 39 (2), 1211–1240 (2011).
T. Gasser, P. Hall, and B. Presnell, ‘‘Nonparametric estimation of the mode of a distribution of random curves,’’ J. R. Stat. Soc. Ser. B Stat. Methodol. 60 (4), 681–691 (1998).
I. Gijbels, M. Omelka, and N. Veraverbeke, ‘‘Multivariate and functional covariates and conditional copulas,’’ Electron. J. Stat. 6, 1273–1306 (2012).
R. D. Gill, ‘‘Non- and semi-parametric maximum likelihood estimators and the von Mises method. I,’’ Scand. J. Statist. 16 (2), 97–128. With a discussion by J. A. Wellner and J. Præstgaard and a reply by the author (1989).
E. Giné and A. Guillou, ‘‘On consistency of kernel density estimators for randomly censored data: Rates holding uniformly over adaptive intervals,’’ Ann. Inst. H. Poincaré Probab. Statist. 37 (4), 503–522 (2001).
E. Giné and R. Nickl, Mathematical Foundations of Infinite-Dimensional Statistical Models, Cambridge Series in Statistical and Probabilistic Mathematics (Cambridge University Press, New York, 2016).
E. Giné, V. Koltchinskii, and J. Zinn, ‘‘Weighted uniform consistency of kernel density estimators,’’ Ann. Probab. 32 (3B), 2570–2605 (2004).
A. Goia and P. Vieu, ‘‘A partitioned single functional index model,’’ Comput. Statist. 30 (3), 673–692 (2015).
P. Groeneboom, J. Oosterhoff, and F. H. Ruymgaart, ‘‘Large deviation theorems for empirical probability measures,’’ Ann. Probab. 7 (4), 553–586 (1979).
L. Györfi, M. Kohler, A. Krzyżak, and H. Walk, A Distribution-Free Theory of Nonparametric Regression, Springer Series in Statistics (Springer-Verlag, New York, 2002).
P. Hall, ‘‘Asymptotic properties of integrated square error and cross-validation for kernel estimation of a regression function,’’ Z. Wahrsch. Verw. Gebiete 67 (2), 175–196 (1984).
W. Härdle, Applied Nonparametric Regression, Vol. 19: Econometric Society Monographs (Cambridge University Press, Cambridge, 1990).
W. Härdle and J. S. Marron, ‘‘Optimal bandwidth selection in nonparametric regression function estimation,’’ Ann. Statist. 13 (4), 1465–1481 (1985).
G. H. Hardy, ‘‘On double Fourier series and especially those which represent the double zeta-function with real and incommensurable parameters,’’ Quart. J. Math. 37 (1), 53–79 (1905).
E. W. Hobson, The theory of functions of a real variable and the theory of Fourier’s series, Vol. II (Dover Publications, Inc., New York, N.Y., 1958).
L. Horváth and P. Kokoszka, Inference for Functional Data with Applications (Springer Series in Statistics. Springer, New York, 2012).
P. J. Huber, ‘‘Robust estimation of a location parameter,’’ Ann. Math. Statist. 35, 73–101 (1964).
W. C. M. Kallenberg, ‘‘Chernoff efficiency and deficiency,’’ Ann. Statist. 10 (2), 583–594 (1982).
W. C. M. Kallenberg, ‘‘Intermediate efficiency, theory and examples,’’ Ann. Statist. 11 (1), 170–182 (1983a).
W. C. M. Kallenberg, ‘‘On moderate deviation theory in estimation,’’ Ann. Statist. 11 (2), 498–504 (1983b).
L.-Z. Kara, A. Laksaci, M. Rachdi, and P. Vieu, ‘‘Data-driven \(k\)NN estimation in nonparametric functional data analysis,’’ J. Multivariate Anal. 153, 176–188 (2017).
A. D. M. Kester and W. C. M. Kallenberg, ‘‘Large deviations of estimators,’’ Ann. Statist. 14 (2), 648–664 (1986).
M. R. Kosorok, Introduction to Empirical Processes and Semiparametric Inference (Springer Series in Statistics, Springer, New York, 2008).
M. Krause, ‘‘Über Mittelwertsätze im Gebiete der Doppelsummen und Doppelintegrale,’’ Leipz. Ber. 55, 239–263 (1903).
W. V. Li and Q.-M. Shao, Gaussian Processes: Inequalities, Small Ball Probabilities, and Applications, in: Stochastic Processes: Theory and Methods, Vol. 19: Handbook of Statist. (North-Holland, Amsterdam, 2001), pp. 533–597.
H. Lian, ‘‘Functional partial linear model,’’ J. Nonparametr. Stat. 23 (1), 115–128 (2011).
N. Ling and P. Vieu, ‘‘Nonparametric modelling for functional data: Selected survey and tracks for future,’’ Statistics 52 (4), 934–949 (2018).
Q. Liu and S. Zhao, ‘‘Pointwise and uniform moderate deviations for nonparametric regression function estimator on functional data,’’ Statist. Probab. Lett. 83 (5), 1372–1381 (2013).
D. Louani and S. M. Ould Maouloud, ‘‘Large deviation results for the nonparametric regression function estimator on functional data,’’ Math. Methods Statist. 21 (4), 298–313 (2012).
D. M. Mason, ‘‘Proving consistency of non-standard kernel estimators,’’ Stat. Inference Stoch. Process. 15 (2), 151–176 (2012).
D. M. Mason and M. A. Newton, ‘‘A rank statistics approach to the consistency of a general bootstrap,’’ Ann. Statist. 20 (3), 1611–1624 (1992).
D. M. Mason and J. W. H. Swanepoel, ‘‘A general result on the uniform in bandwidth consistency of kernel-type function estimators,’’ TEST 20 (1), 72–94 (2011).
D. M. Mason and J. W. H. Swanepoel, ‘‘Uniform in bandwidth consistency of kernel estimators of the density of mixed data,’’ Electron. J. Stat. 9 (1), 1518–1539 (2015).
E. Masry, ‘‘Nonparametric regression estimation for dependent functional data: asymptotic normality,’’ Stochastic Process. Appl. 115 (1), 155–177 (2005).
E. Mayer-Wolf and O. Zeitouni, ‘‘The probability of small Gaussian ellipsoids and associated conditional moments,’’ Ann. Probab. 21 (1), 14–24 (1993).
M. Mohammedi, S. Bouzebda, and A. Laksaci, ‘‘On the nonparametric estimation of the functional expectile regression,’’ C. R. Math. Acad. Sci. Paris 358 (3), 267–272 (2020).
M. Mohammedi, S. Bouzebda, and A. Laksaci, ‘‘The consistency and asymptotic normality of the kernel type expectile regression estimator for functional data,’’ J. Multivariate Anal. 181 (104673), 24 (2021).
A. Mokkadem and M. Pelletier, ‘‘Moderate deviations principles for the kernel estimator of nonrandom regression functions,’’ Afr. Stat. 11 (2), 995–1021 (2016).
A. Mokkadem, M. Pelletier, and B. Thiam, ‘‘Large and moderate deviations principles for kernel estimators of the multivariate regression,’’ Math. Methods Statist. 17 (2), 146–172 (2008).
H.-G. Müller, Nonparametric Regression Analysis of Longitudinal Data, Vol. 46: Lecture Notes in Statistics (Springer-Verlag, Berlin, 1988).
E. A. Nadaraja, ‘‘On a regression estimate,’’ Teor. Verojatnost. i Primenen. 9, 157–159 (1964).
E. A. Nadaraja, Nonparametric Estimation of Probability Densities and Regression Curves, Vol. 20: Mathematics and Its Applications (Soviet Series) (Kluwer Academic Publishers Group, Dordrecht, 1989).
R. B. Nelsen, An Introduction to Copulas, Springer Series in Statistics (Springer, New York, 2nd ed., 2006).
W. K. Newey and J. L. Powell, ‘‘Asymmetric least squares estimation and testing,’’ Econometrica 55 (4), 819–847 (1987).
H. Niederreiter, Random Number Generation and Quasi-Monte-Carlo Methods, Vol. 63: CBMS-NSF Regional Conference Series in Applied Mathematics, Society for Industrial and Applied Mathematics (SIAM) (Philadelphia, PA, 1992).
Y. Nikitin, Asymptotic Efficiency of Nonparametric Tests (Cambridge University Press, Cambridge, 1995).
D. Nolan and D. Pollard, ‘‘\(U\)-processes: Rates of convergence,’’ Ann. Statist. 15 (2), 780–799 (1987).
S. M. Ould Maouloud, ‘‘Some uniform large deviation results in nonparametric function estimation,’’ J. Nonparametr. Stat. 20 (2), 129–152 (2008).
E. Parzen, ‘‘On estimation of a probability density function and mode,’’ Ann. Math. Statist. 33, 1065–1076 (1962).
D. Pollard, Convergence of Stochastic Processes, Springer Series in Statistics (Springer-Verlag, New York, 1984).
A. Puhalskii and V. Spokoiny, ‘‘On large-deviation efficiency in statistical inference,’’ Bernoulli 4 (2), 203–272 (1998).
M. Rachdi and P. Vieu, ‘‘Nonparametric regression for functional data: Automatic smoothing parameter selection,’’ J. Statist. Plann. Inference 137 (9), 2784–2801 (2007).
M. E. Radavichyus, ‘‘Probabilities of large deviations for maximum likelihood estimators,’’ Dokl. Akad. Nauk SSSR 268 (3), 551–556 (1983).
J. O. Ramsay and B. W. Silverman, Functional Data Analysis, Springer Series in Statistics (Springer, New York, 2nd ed., 2005).
G. G. Roussas, ‘‘Nonparametric estimation of the transition distribution function of a Markov process,’’ Ann. Math. Statist. 40, 1386–1400 (1969).
M. Samanta, ‘‘Nonparametric estimation of conditional quantiles,’’ Statist. Probab. Lett. 7 (5), 407–412 (1989).
I. N. Sanov, ‘‘On the probability of large deviations of random magnitudes,’’ Mat. Sb. (N.S.) 42(84), 11–44 (1957).
D. W. Scott, Multivariate Density Estimation, Wiley Series in Probability and Statistics (John Wiley and Sons, Inc., Hoboken, NJ, 2nd ed., 2015).
A. Sieders and K. Dzhaparidze, ‘‘A large deviation result for parameter estimators and its application to nonlinear regression analysis,’’ Ann. Statist. 15 (3), 1031–1049 (1987).
B. W. Silverman, Density Estimation for Statistics and Data Analysis, Monographs on Statistics and Applied Probability (Chapman and Hall, London, 1986).
I. Soukarieh and S. Bouzebda, ‘‘Renewal type bootstrap for increasing degree \(U\)-process of a Markov chain,’’ J. Multivariate Anal. 195 (105143), 25 (2023).
I. Soukarieh and S. Bouzebda, ‘‘Weak convergence of the conditional \(U\)-statistics for locally stationary functional time series,’’ Stat. Inference Stoch. Process. 1–78 (2024). https://doi.org/10.1007/s11203-023-09305-y
W. Stute, ‘‘Conditional empirical processes,’’ Ann. Statist. 14 (2), 638–647 (1986).
R. A. Tapia and J. R. Thompson, Nonparametric Probability Density Estimation, Vol. 1: Johns Hopkins Series in the Mathematical Sciences (Johns Hopkins University Press, Baltimore, Md., 1978).
A. W. van der Vaart and J. A. Wellner, Weak Convergence and Empirical Processes, Springer Series in Statistics (Springer-Verlag, New York, 1996).
S. R. S. Varadhan, Large Deviations and Applications, in: École d’Été de Probabilités de Saint-Flour XV–XVII, 1985–1987, Vol. 1362: Lecture Notes in Math. (Springer, Berlin, 1988), pp. 1–49.
S. R. S. Varadhan, Large Deviations, Vol. 27: Courant Lecture Notes in Mathematics (Courant Institute of Mathematical Sciences, New York; American Mathematical Society, Providence, RI, 2016).
G. Vitali, ‘‘Sui gruppi di punti e sulle funzioni di variabili reali,’’ Torino Atti 43, 229–246 (1908).
A. G. Vituškin, O mnogomernykh variatsiyakh (Gosudarstv. Izdat. Tehn.-Teor. Lit., Moscow, 1955).
M. P. Wand and M. C. Jones, Kernel Smoothing, Vol. 60: Monographs on Statistics and Applied Probability (Chapman and Hall, Ltd., London, 1995).
G. S. Watson, ‘‘Smooth regression analysis,’’ Sankhyā Ser. A 26, 359–372 (1964).
W. Wertz, Statistical Density Estimation: A Survey, Vol. 13: Angewandte Statistik und Ökonometrie [Applied Statistics and Econometrics] (Vandenhoeck and Ruprecht, Göttingen, With German and French Summaries, 1978).
C. Wu, N. Ling, P. Vieu, and W. Liang, ‘‘Partially functional linear quantile regression model and variable selection with censoring indicators MAR,’’ J. Multivariate Anal. 197, Paper No. 105189 (2023).
ACKNOWLEDGEMENTS
The authors express their gratitude to the Editor-in-Chief, an Associate Editor, and the referee for their invaluable comments. These remarks have significantly enhanced the original work, leading to a more focused and improved presentation.
Funding
This work was supported by ongoing institutional funding. No additional grants to carry out or direct this particular research were obtained.
Ethics declarations
The authors of this work declare that they have no conflicts of interest.
Publisher’s Note.
Allerton Press remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
APPENDIX A
This appendix collects supplementary material that supports a more complete understanding of the paper.
Theorem A.1 (Theorem 3.1 in [6]). Let \(\left\{U_{n}(t):t\in T\right\}\) be a sequence of stochastic processes, where \(T\) is an index set, and let \(\left\{\varepsilon_{n}\right\}\) be a sequence of positive numbers converging to zero. Let \(I:l_{\infty}(T)\rightarrow[0,\infty]\) and, for each \(t_{1},\ldots,t_{m}\in T\), let \(I_{t_{1},\ldots,t_{m}}:\mathbb{R}^{m}\rightarrow[0,\infty]\). Let \(d(\cdot,\cdot)\) be a pseudometric on \(T\). Consider the following conditions:
(a.1) \((T,d)\) is totally bounded;
(a.2) for each \(t_{1},\ldots,t_{m}\in T,\left(U_{n}\left(t_{1}\right),\ldots,U_{n}\left(t_{m}\right)\right)\) satisfies the LDP with the rate \(\varepsilon_{n}^{-1}\) and good rate function \(I_{t_{1},\ldots,t_{m}}\);
(a.3) for each \(\tau>0\),
\[
\lim_{\delta\rightarrow 0}\limsup_{n\rightarrow\infty}\varepsilon_{n}\log\mathbb{P}^{*}\Big(\sup_{d(s,t)\leq\delta}\left|U_{n}(s)-U_{n}(t)\right|\geq\tau\Big)=-\infty;
\]
(b.1) for each \(0\leq c<\infty\), the level set \(\left\{z\in l_{\infty}(T):I(z)\leq c\right\}\) is a compact subset of \(l_{\infty}(T)\);
(b.2) for each \(A\subset l_{\infty}(T)\),
\[
-\inf_{z\in\mathring{A}}I(z)\leq\liminf_{n\rightarrow\infty}\varepsilon_{n}\log\mathbb{P}^{*}\left(U_{n}\in A\right)\leq\limsup_{n\rightarrow\infty}\varepsilon_{n}\log\mathbb{P}^{*}\left(U_{n}\in A\right)\leq-\inf_{z\in\bar{A}}I(z).
\]
If the set of conditions (a) is satisfied, then the set of conditions (b) holds with \(I(\cdot)\) given by
\[
I(z)=\sup\Big\{I_{t_{1},\ldots,t_{m}}\big(z(t_{1}),\ldots,z(t_{m})\big):t_{1},\ldots,t_{m}\in T,\ m\geq 1\Big\}.
\]
If the set of conditions (b) is satisfied, then the set of conditions (a) holds with
\[
I_{t_{1},\ldots,t_{m}}(u_{1},\ldots,u_{m})=\inf\Big\{I(z):z\in l_{\infty}(T),\ z(t_{j})=u_{j},\ 1\leq j\leq m\Big\},
\]
and the pseudometric \(d(\cdot,\cdot)\) is defined by
\[
d(s,t)=\sum_{j=1}^{\infty}2^{-j}\min\big(d_{j}(s,t),1\big),
\]
where
\[
d_{j}(s,t)=\sup\Big\{|u-v|:I_{s,t}(u,v)\leq j\Big\}.
\]
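As an illustration of how the conditions of Theorem A.1 interact (our own sketch, not part of [6]), consider the special case where the finite-dimensional moderate deviation limits are centered Gaussian; the covariance kernel \(\sigma(\cdot,\cdot)\) below is a hypothetical placeholder, and \(\Sigma\) is assumed invertible.

```latex
% Sketch (assumed Gaussian case, not taken from [6]): suppose that,
% for fixed t_1,...,t_m, the vector (U_n(t_1),...,U_n(t_m)) satisfies
% the MDP of condition (a.2) with a centered Gaussian limit whose
% covariance matrix is \Sigma = (\sigma(t_i,t_j))_{1\le i,j\le m}.
% Then the good rate function is the classical quadratic form
\[
  I_{t_1,\ldots,t_m}(u_1,\ldots,u_m)
    = \tfrac{1}{2}\, u^{\top}\Sigma^{-1}u,
  \qquad u=(u_1,\ldots,u_m)^{\top},
\]
% and the functional rate function produced by Theorem A.1 is the
% supremum of these quadratic forms over all finite grids:
\[
  I(z) = \sup_{m\ge 1}\ \sup_{t_1,\ldots,t_m\in T}
    \tfrac{1}{2}\,
    \bigl(z(t_1),\ldots,z(t_m)\bigr)\,\Sigma^{-1}\,
    \bigl(z(t_1),\ldots,z(t_m)\bigr)^{\top}.
\]
```

In this Gaussian setting the tightness condition (a.3) is the only non-routine requirement, which is why exponential equicontinuity arguments of this type are the main technical burden in the functional MDPs of the paper.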
Cite this article
Berrahou, NE., Bouzebda, S. & Douge, L. Functional Uniform-in-Bandwidth Moderate Deviation Principle for the Local Empirical Processes Involving Functional Data. Math. Meth. Stat. 33, 26–69 (2024). https://doi.org/10.3103/S1066530724700030