1 Introduction

Let \(\psi \) be an integrable function, \(\int _{{\mathbb R}^n}\psi =0\), and, for some \(\varepsilon >0\),

$$\begin{aligned} |\psi (x)|\le \frac{c}{(1+|x|)^{n+\varepsilon }}\quad \text {and}\quad \int _{{\mathbb R}^n}|\psi (x+h)-\psi (x)|dx\le c|h|^{\varepsilon }. \end{aligned}$$
(1.1)

Let \({\mathbb R}^{n+1}_+={\mathbb R}^{n}\times {\mathbb R}_{+}\) and \(\Gamma _{\alpha }(x)=\{(y,t)\in {\mathbb {R}}^{n+1}_+:|y-x|<\alpha t\}\). Set \(\psi _t(x)=t^{-n}\psi (x/t)\). Define the square function \(S_{\alpha ,\psi }(f)\) by

$$\begin{aligned} S_{\alpha ,\psi }(f)(x)=\left( \int _{\Gamma _{\alpha }(x)}|f*\psi _t(y)|^2\frac{dydt}{t^{n+1}}\right) ^{1/2}\quad (\alpha >0). \end{aligned}$$

We drop the subscript \(\alpha \) if \(\alpha =1\).

Given a weight \(w\), define its \(A_p\) characteristic by

$$\begin{aligned}{}[w]_{A_p}=\sup _{Q}\left( \frac{1}{|Q|}\int _Qw\,dx\right) \left( \frac{1}{|Q|}\int _Qw^{-\frac{1}{p-1}}\,dx\right) ^{p-1}, \end{aligned}$$

where the supremum is taken over all cubes \(Q\subset {\mathbb R}^n\).

It was proved in [13] that for any \(1<p<\infty \),

$$\begin{aligned} \Vert S_{\psi }\Vert _{L^p(w)}\le c_{p,n,\psi }[w]_{A_p}^{\max \left( \frac{1}{2},\frac{1}{p-1}\right) }, \end{aligned}$$
(1.2)

and this estimate is sharp in terms of \([w]_{A_p}\) (we also refer to [13] for a detailed history of closely related results).

Similarly one can show that

$$\begin{aligned} \Vert S_{\alpha ,\psi }\Vert _{L^p(w)}\le c_{p,n,\psi }\gamma (\alpha )[w]_{A_p}^{\max \left( \frac{1}{2},\frac{1}{p-1}\right) }\quad (\alpha \ge 1,1<p<\infty ); \end{aligned}$$
(1.3)

however, the sharp dependence on \(\alpha \) in this estimate cannot be determined by means of the approach from [13]. The aim of this paper is to find the sharp \(\gamma (\alpha )\) in (1.3).

Let us explain first why the method from [13] gives a rough estimate for \(\gamma (\alpha )\). The proof in [13] was based on the intrinsic square function \(G_{\alpha ,\beta }(f)\) of Wilson [19], defined as follows. For \(0<\beta \le 1\), let \({\mathcal C}_{\beta }\) be the family of functions \(\varphi \) supported in the unit ball, with mean zero, and such that \(|\varphi (x)-\varphi (x')|\le |x-x'|^{\beta }\) for all \(x\) and \(x'\). If \(f\in L^1_{\text {loc}}({\mathbb R}^n)\) and \((y,t)\in {\mathbb R}^{n+1}_+\), we define \( A_{\beta }(f)(y,t)=\sup _{\varphi \in {\mathcal C}_{\beta }}|f*\varphi _t(y)| \) and

$$\begin{aligned} G_{\alpha ,\beta }(f)(x)=\left( \int _{\Gamma _{\alpha }(x)} \big (A_{\beta }(f)(y,t)\big )^2\frac{dydt}{t^{n+1}}\right) ^{1/2}. \end{aligned}$$

Set \(G_{1,\beta }(f)=G_{\beta }(f)\).

The intrinsic square function has several interesting features (established in [19]). First, though \(G_{\beta }(f)\) is defined by means of kernels with uniform compact support, it pointwise dominates \(S_{\psi }(f)\). Also there is a pointwise relation between \(G_{\alpha ,\beta }(f)\) with different apertures:

$$\begin{aligned} G_{\alpha ,\beta }(f)(x)\le \alpha ^{(3/2)n+\beta }G_{\beta }(f)(x)\quad (\alpha \ge 1). \end{aligned}$$
(1.4)

Notice that for the usual square functions \(S_{\alpha ,\psi }(f)\) such a pointwise relation is not available.

In [13], (1.2) with \(G_{\beta }(f)\) instead of \(S_{\psi }(f)\) was obtained. Combining this with (1.4), we would obtain that one can take \(\gamma (\alpha )=\alpha ^{(3/2)n+\beta }\) in (1.3), assuming that \(\psi \in {\mathcal C}_{\beta }\). For non-compactly supported \(\psi \), some additional ideas from [19] can be used, but they lead to an even worse estimate for \(\gamma (\alpha )\). Observe also that it is not clear to us whether (1.4) can be improved.
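The chain of inequalities behind the choice \(\gamma (\alpha )=\alpha ^{(3/2)n+\beta }\) can be sketched as follows. This is only an outline under the assumption \(\psi \in {\mathcal C}_{\beta }\), in which case \(|f*\psi _t(y)|\le A_{\beta }(f)(y,t)\) by the definition of \(A_{\beta }\), and hence \(S_{\alpha ,\psi }(f)\le G_{\alpha ,\beta }(f)\) pointwise. The constant \(c\) stands for the constant in the \(G_{\beta }\) version of (1.2); it does not depend on \(\alpha \) or \(w\).

$$\begin{aligned} \Vert S_{\alpha ,\psi }(f)\Vert _{L^p(w)}&\le \Vert G_{\alpha ,\beta }(f)\Vert _{L^p(w)}\le \alpha ^{(3/2)n+\beta }\Vert G_{\beta }(f)\Vert _{L^p(w)}\\&\le c\,\alpha ^{(3/2)n+\beta }[w]_{A_p}^{\max \left( \frac{1}{2},\frac{1}{p-1}\right) }\Vert f\Vert _{L^p(w)}\quad (\alpha \ge 1). \end{aligned}$$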

It is easy to see that the dependence \(\gamma (\alpha )=\alpha ^{(3/2)n+\beta }\) in (1.3) is far from the sharp one. For instance, it is obvious that the information on \(\beta \) should not appear in (1.3). All this indicates that the intrinsic square function approach is not suitable for our purposes in determining the sharp \(\gamma (\alpha )\).

Suppose we seek \(\gamma (\alpha )\) in the form \(\gamma (\alpha )=\alpha ^{r}\). Then a simple observation shows that \(r\ge n\) for any \(1<p<\infty \). Indeed, consider the Littlewood–Paley function \(g_{\mu ,\psi }^*(f)\) defined by

$$\begin{aligned} g_{\mu ,\psi }^*(f)(x)=\biggl (\int \int _{{\mathbb R}^{n+1}_+}\biggl (\frac{t}{t+|x-y|}\biggr )^{\mu n}|f*\psi _t(y)|^2\frac{dydt}{t^{n+1}}\biggr )^{1/2}. \end{aligned}$$

Using the standard estimate

$$\begin{aligned} g_{\mu ,\psi }^*(f)(x)\le S_{\psi }(f)(x)+\sum _{k=0}^{\infty }2^{-k\mu n/2}S_{2^{k+1},\psi }(f)(x), \end{aligned}$$

we obtain that (1.3) for some \(p=p_0\) and \(\gamma (\alpha )=\alpha ^{r_0}\) implies

$$\begin{aligned} \Vert g_{\mu ,\psi }^*\Vert _{L^{p_0}(w)}\lesssim \Big (\sum _{k=0}^{\infty }2^{-k\mu n/2}2^{kr_0}\Big ) [w]_{A_{p_0}}^{\max \left( \frac{1}{2},\frac{1}{p_0-1}\right) }. \end{aligned}$$
(1.5)

This means that if \(\mu >2r_0/n\), then \(g_{\mu ,\psi }^*\) is bounded on \(L^{p_0}(w)\) for \(w\in A_{p_0}\). From this, by the Rubio de Francia extrapolation theorem, \(g_{\mu ,\psi }^*\) is bounded on the unweighted \(L^p\) for any \(p>1\), whenever \(\mu >2r_0/n\). But it is well known [8] that \(g_{\mu ,\psi }^*\) is not bounded on \(L^p\) if \(1<\mu <2\) and \(1<p\le 2/\mu \). Hence, if \(r_0<n\), then choosing \(\mu \) with \(\max (1,2r_0/n)<\mu <2\) we would obtain a contradiction to the latter fact for \(p\) sufficiently close to \(1\).
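For completeness, here is a sketch of how the standard estimate and the convergence condition \(\mu >2r_0/n\) arise. The decomposition into differences of cones is written out by us, and the constants are not claimed to be optimal. If \((y,t)\notin \Gamma _{2^k}(x)\), then \(|x-y|\ge 2^kt\), so

$$\begin{aligned} \biggl (\frac{t}{t+|x-y|}\biggr )^{\mu n}\le \frac{1}{(1+2^k)^{\mu n}}\le 2^{-k\mu n}\quad \big ((y,t)\in \Gamma _{2^{k+1}}(x)\setminus \Gamma _{2^k}(x),\ k\ge 0\big ), \end{aligned}$$

while the same quantity is at most \(1\) on \(\Gamma _{1}(x)\). Since \({\mathbb R}^{n+1}_+\) is the union of \(\Gamma _1(x)\) and these differences, we get

$$\begin{aligned} g_{\mu ,\psi }^*(f)(x)^2\le S_{\psi }(f)(x)^2+\sum _{k=0}^{\infty }2^{-k\mu n}S_{2^{k+1},\psi }(f)(x)^2, \end{aligned}$$

and the standard estimate follows from \((\sum _ka_k^2)^{1/2}\le \sum _ka_k\) for \(a_k\ge 0\). Inserting (1.3) with \(\gamma (\alpha )=\alpha ^{r_0}\) produces the series

$$\begin{aligned} \sum _{k=0}^{\infty }2^{-k\mu n/2}2^{(k+1)r_0}=2^{r_0}\sum _{k=0}^{\infty }2^{-k(\mu n/2-r_0)}, \end{aligned}$$

which converges exactly when \(\mu >2r_0/n\).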

Our main result shows that for any \(1<p<\infty \) one can take the optimal power growth \(\gamma (\alpha )=\alpha ^n\).

Theorem 1.1

For any \(1<p<\infty \) and for all \(1\le \alpha <\infty \),

$$\begin{aligned} \Vert S_{\alpha ,\psi }\Vert _{L^p(w)}\le c_{p,n,\psi }\alpha ^n[w]_{A_p}^{\max \left( \frac{1}{2},\frac{1}{p-1}\right) }. \end{aligned}$$

Combining Theorem 1.1 with (1.5) (taking \(r_0=n\)), we immediately obtain the following.

Corollary 1.2

Let \(\mu >2\). Then for any \(1<p<\infty \),

$$\begin{aligned} \Vert g_{\mu ,\psi }^*(f)\Vert _{L^{p}(w)}\le c_{p,n,\mu ,\psi }[w]_{A_p}^{\max \left( \frac{1}{2},\frac{1}{p-1}\right) }. \end{aligned}$$

Observe that if \(\mu =2\), then \(g_{2,\psi }^*\) is also bounded on \(L^p(w)\) for \(w\in A_p\) (see [17]). However, the sharp dependence on \([w]_{A_p}\) in the corresponding \(L^p(w)\) inequality is unknown to us.

We emphasize that the growth \(\gamma (\alpha )=\alpha ^n\) is best possible in the weighted \(L^p(w)\) estimate for \(w\in A_p\). In the unweighted case a better dependence on \(\alpha \) is known, namely, \(\Vert S_{\alpha ,\psi }\Vert _{L^p}\le c_{p,n,\psi }\alpha ^{\frac{n}{\min (p,2)}},\) see [1, 18].

Some words about the proof of Theorem 1.1. As in [13], we use here the local mean oscillation decomposition. But in [13] we worked with the intrinsic square function, and since this operator is defined by kernels with uniform compact support, we arrived at the operator

$$\begin{aligned} {\mathcal A}(f)(x)=\Big (\sum _{j,k}(f_{\gamma Q_j^k})^2\chi _{Q_j^k}(x)\Big )^{1/2}, \end{aligned}$$

where \(\{Q_j^k\}\) is a sparse family (see Sect. 2.2 for the definition of this notion) and \(\gamma >1\) (here we use the standard notation \(f_Q=\frac{1}{|Q|}\int _Qf\), and \(\gamma Q\) denotes the \(\gamma \)-fold concentric dilate of \(Q\)). This operator can be handled fairly easily.

Here we work with the square function \(S_{\alpha ,\psi }(f)\) directly, more precisely we consider its smooth variant \(\widetilde{S}_{\alpha ,\psi }(f)\). Applying the local mean oscillation decomposition to \(\widetilde{S}_{\alpha ,\psi }(f)\), we obtain that \(S_{\alpha ,\psi }(f)\) is essentially pointwise bounded by \(\alpha ^n{\mathcal B}(f)\), where

$$\begin{aligned} {\mathcal B}(f)(x)=\sum _{m=0}^{\infty }\frac{1}{2^{m\delta }}\Big (\sum _{j,k}(f_{2^m Q_j^k})^2\chi _{Q_j^k}(x)\Big )^{1/2}\quad (\delta >0). \end{aligned}$$

Observe that this pointwise aperture estimate is interesting in its own right. In order to handle \({\mathcal B}\), we use a mixture of ideas from recent papers on a simple proof of the \(A_2\) conjecture [14] and sharp weighted estimates for multilinear Calderón–Zygmund operators [5]. In particular, similarly to [14], we obtain the \(X^{(2)}\)-norm boundedness of \({\mathcal B}\) by \({\mathcal A}\) on an arbitrary Banach function space \(X\).

The paper is organized as follows. The next section contains some preliminary information. In Sect. 3, we obtain the main estimate, namely, the local mean oscillation estimate of \(\widetilde{S}_{\alpha ,\psi }(f)\). The proof of Theorem 1.1 is contained in Sect. 4. Section 5 contains some concluding remarks concerning the sharp aperture-weighted weak type estimates for \(S_{\alpha ,\psi }(f)\).

2 Preliminaries

2.1 A Weak Type \((1,1)\) Estimate for Square Functions

It is well known that the operator \(S_{\alpha ,\psi }\) is of weak type \((1,1)\). However, we could not find in the literature the sharp dependence on \(\alpha \) in the corresponding inequality. Hence we give below an argument based on general square functions.

For a measurable function \(F\) on \({\mathbb R}^{n+1}_+\) define

$$\begin{aligned} S_{\alpha }(F)(x)=\biggl (\int _{\Gamma _{\alpha }(x)}|F(y,t)|^2\frac{dydt}{t^{n+1}}\biggr )^{1/2}. \end{aligned}$$

Lemma 2.1

For any \(\alpha \ge 1\),

$$\begin{aligned} \Vert S_{\alpha }(F)\Vert _{L^{1,\infty }}\le c_n\alpha ^n\Vert S_{1}(F)\Vert _{L^{1,\infty }}. \end{aligned}$$
(2.1)

Proof

We will use the following estimate, which can be found in [18, p. 315]: if \(\Omega \subset {\mathbb R}^n\) is an open set and

$$\begin{aligned} U=\{x\in {\mathbb R}^n:M\chi _{\Omega }(x)>1/(2\alpha ^n)\}, \end{aligned}$$

where \(M\) is the Hardy–Littlewood maximal operator, then

$$\begin{aligned} \int _{{\mathbb R}^n\setminus U}S_{\alpha }(F)(x)^2dx\le 2\alpha ^n\int _{{\mathbb R}^n\setminus \Omega }S_{1}(F)(x)^2dx \end{aligned}$$

(observe that the definitions of \(S_{\alpha }(F)\) here and in [18] differ by the factor \(\alpha ^{n/2}\)).

Let \(\Omega _{\xi }=\{x:S_{1}(F)(x)>\xi \}\) and \(U_{\xi }=\{x:M\chi _{\Omega _{\xi }}(x)>1/(2\alpha ^n)\}\). Using the weak type \((1,1)\) estimate for \(M\), Chebyshev’s inequality, and the above estimate, we obtain

$$\begin{aligned}&|\{x\in {\mathbb R}^n:S_{\alpha }(F)(x)>\xi \}|\\&\le |U_{\xi }|+|\{x\in {\mathbb R}^n\setminus U_{\xi }:S_{\alpha }(F)(x)>\xi \}|\\&\le c_n\alpha ^n|\{x:S_{1}(F)(x)>\xi \}|+\frac{1}{\xi ^2}\int _{{\mathbb R}^n\setminus U_{\xi }}S_{\alpha }(F)(x)^2dx\\&\le c_n\alpha ^n|\{x:S_{1}(F)(x)>\xi \}|+\frac{2\alpha ^n}{\xi ^2}\int _{{\mathbb R}^n\setminus \Omega _{\xi }}S_{1}(F)(x)^2dx. \end{aligned}$$

Further,

$$\begin{aligned} \int _{{\mathbb R}^n\setminus \Omega _{\xi }}S_{1}(F)(x)^2dx\le 2\int _{0}^{\xi }\lambda |\{x:S_{1}(F)(x)>\lambda \}|d\lambda \le 2\xi \Vert S_{1}(F)\Vert _{L^{1,\infty }}. \end{aligned}$$

Combining this with the previous estimate gives

$$\begin{aligned} |\{x:S_{\alpha }(F)(x)>\xi \}|\le c_n\alpha ^n|\{x:S_{1}(F)(x)>\xi \}|+\frac{4\alpha ^n}{\xi }\Vert S_{1}(F)\Vert _{L^{1,\infty }}, \end{aligned}$$

which proves (2.1). \(\square \)
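The two steps of this proof that are only indicated can be spelled out as follows. This is a sketch, assuming the usual convention \(\Vert g\Vert _{L^{1,\infty }}=\sup _{\lambda >0}\lambda |\{x:|g(x)|>\lambda \}|\), which is not stated explicitly above. Since \(S_1(F)\le \xi \) outside \(\Omega _{\xi }\), Fubini's theorem gives

$$\begin{aligned} \int _{{\mathbb R}^n\setminus \Omega _{\xi }}S_{1}(F)(x)^2dx&=2\int _0^{\xi }\lambda |\{x\notin \Omega _{\xi }:S_{1}(F)(x)>\lambda \}|d\lambda \\&\le 2\int _0^{\xi }\lambda \cdot \frac{\Vert S_{1}(F)\Vert _{L^{1,\infty }}}{\lambda }d\lambda =2\xi \Vert S_{1}(F)\Vert _{L^{1,\infty }}. \end{aligned}$$

Multiplying the last displayed inequality of the proof by \(\xi \) then yields

$$\begin{aligned} \xi |\{x:S_{\alpha }(F)(x)>\xi \}|\le c_n\alpha ^n\,\xi |\{x:S_{1}(F)(x)>\xi \}|+4\alpha ^n\Vert S_{1}(F)\Vert _{L^{1,\infty }}\le (c_n+4)\alpha ^n\Vert S_{1}(F)\Vert _{L^{1,\infty }}, \end{aligned}$$

and taking the supremum over \(\xi >0\) gives (2.1).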

Note that the sharp unweighted \(L^p\) estimates relating square functions of different apertures were obtained recently in [1].

By Lemma 2.1 and by the weak type \((1,1)\) estimate for \(S_{\psi }(f)\) [9],

$$\begin{aligned} \Vert S_{\alpha ,\psi }(f)\Vert _{L^{1,\infty }}\le c_{n,\psi }\alpha ^{n}\Vert f\Vert _{L^1}. \end{aligned}$$
(2.2)

2.2 Dyadic Grids and Sparse Families

Recall that the standard dyadic grid in \({\mathbb R}^n\) consists of the cubes

$$\begin{aligned} 2^{-k}([0,1)^n+j),\quad k\in {\mathbb Z}, j\in {\mathbb Z}^n. \end{aligned}$$

Denote the standard grid by \({\mathcal D}\).

By a general dyadic grid \({\fancyscript{D}}\) we mean a collection of cubes with the following properties: (i) for any \(Q\in {\fancyscript{D}}\) its sidelength \(\ell _Q\) is of the form \(2^k, k\in {\mathbb Z}\); (ii) \(Q\cap R\in \{Q,R,\emptyset \}\) for any \(Q,R\in {\fancyscript{D}}\); (iii) the cubes of a fixed sidelength \(2^k\) form a partition of \({\mathbb R}^n\).

Given a cube \(Q_0\), denote by \({\mathcal D}(Q_0)\) the set of all dyadic cubes with respect to \(Q_0\), that is, the cubes from \({\mathcal D}(Q_0)\) are formed by repeated subdivision of \(Q_0\) and each of its descendants into \(2^n\) congruent subcubes. Observe that if \(Q_0\in {\fancyscript{D}}\), then each cube from \({\mathcal D}(Q_0)\) will also belong to \({\fancyscript{D}}\).

We will use the following proposition from [10].

Proposition 2.2

There are \(2^n\) dyadic grids \({\fancyscript{D}}_{i}\) such that for any cube \(Q\subset {\mathbb R}^n\) there exists a cube \(Q_{i}\in {\fancyscript{D}}_{i}\) such that \(Q\subset Q_{i}\) and \(\ell _{Q_{i}}\le 6\ell _Q\).

We say that \(\{Q_j^k\}\) is a sparse family of cubes if: (i) the cubes \(Q_j^k\) are disjoint in \(j\), with \(k\) fixed; (ii) if \(\Omega _k=\cup _jQ_j^k\), then \(\Omega _{k+1}\subset ~\Omega _k\); (iii) \(|\Omega _{k+1}\cap Q_j^k|\le \frac{1}{2}|Q_j^k|\).
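A standard consequence of (i)-(iii), recorded here only as an illustration (the notation \(E_j^k\) is introduced just for this remark), is that a sparse family carries pairwise disjoint "major subsets". Set \(E_j^k=Q_j^k\setminus \Omega _{k+1}\). Then, by (iii),

$$\begin{aligned} |E_j^k|=|Q_j^k|-|\Omega _{k+1}\cap Q_j^k|\ge \frac{1}{2}|Q_j^k|. \end{aligned}$$

Moreover, the sets \(E_j^k\) are pairwise disjoint: for fixed \(k\) this follows from (i), while for \(k<k'\) property (ii) gives

$$\begin{aligned} E_{j'}^{k'}\subset \Omega _{k'}\subset \Omega _{k+1}\quad \text {and}\quad E_j^k\cap \Omega _{k+1}=\emptyset . \end{aligned}$$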

2.3 A “Local Mean Oscillation Decomposition”

The non-increasing rearrangement of a measurable function \(f\) on \({\mathbb R}^n\) is defined by

$$\begin{aligned} f^*(t)=\inf \{\alpha >0:|\{x\in {\mathbb R}^n:|f(x)|>\alpha \}|<t\}\quad (0<t<\infty ). \end{aligned}$$
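As a simple illustration of this definition (an example of ours), let \(f=\chi _E\) with \(0<|E|<\infty \). For \(0<\alpha <1\) the level set \(\{|f|>\alpha \}\) equals \(E\), and for \(\alpha \ge 1\) it is empty. Hence every \(\alpha >0\) is admissible in the infimum when \(t>|E|\), while only \(\alpha \ge 1\) is admissible when \(t\le |E|\), so that

$$\begin{aligned} (\chi _E)^*(t)={\left\{ \begin{array}{ll} 1, &{} 0<t\le |E|\\ 0, &{} t>|E|.\end{array}\right. } \end{aligned}$$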

Given a measurable function \(f\) on \({\mathbb R}^n\) and a cube \(Q\), the local mean oscillation of \(f\) on \(Q\) is defined by

$$\begin{aligned} \omega _{\lambda }(f;Q)=\inf _{c\in {\mathbb R}} \big ((f-c)\chi _{Q}\big )^*\big (\lambda |Q|\big )\quad (0<\lambda <1). \end{aligned}$$

By a median value of \(f\) over \(Q\) we mean a (possibly nonunique) real number \(m_f(Q)\) such that

$$\begin{aligned} \max \big (|\{x\in Q: f(x)>m_f(Q)\}|,|\{x\in Q: f(x)<m_f(Q)\}|\big )\le |Q|/2. \end{aligned}$$

It is easy to see that the set of all median values of \(f\) is either one point or a closed interval. In the latter case we will assume for definiteness that \(m_f(Q)\) is the maximal median value. Observe that it follows from the definitions that

$$\begin{aligned} |m_f(Q)|\le (f\chi _Q)^*(|Q|/2). \end{aligned}$$
(2.3)
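One way to verify (2.3) from the definitions is sketched below; the case distinction is ours. Write \(m=m_f(Q)\) and suppose \(m>0\) (the case \(m=0\) is trivial). For \(0<\alpha <m\),

$$\begin{aligned} |\{x\in Q:|f(x)|>\alpha \}|\ge |\{x\in Q:f(x)\ge m\}|=|Q|-|\{x\in Q:f(x)<m\}|\ge |Q|/2, \end{aligned}$$

so no \(\alpha <m\) is admissible in the infimum defining \((f\chi _Q)^*(|Q|/2)\), and therefore \((f\chi _Q)^*(|Q|/2)\ge m\). If \(m<0\), the same argument applies with \(\{x\in Q:f(x)\le m\}\), whose measure is at least \(|Q|-|\{x\in Q:f(x)>m\}|\ge |Q|/2\), and with \(0<\alpha <|m|\).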

Given a cube \(Q_0\), the dyadic local sharp maximal function \(m^{\#,d}_{\lambda ;Q_0}f\) is defined by

$$\begin{aligned} m^{\#,d}_{\lambda ;Q_0}f(x)=\sup _{x\in Q'\in {\mathcal D}(Q_0)}\omega _{\lambda }(f;Q'). \end{aligned}$$

The following theorem was proved in [15] (a very similar version can be found in [12]).

Theorem 2.3

Let \(f\) be a measurable function on \({\mathbb R}^n\) and let \(Q_0\) be a fixed cube. Then there exists a (possibly empty) sparse family of cubes \(Q_j^k\in {\mathcal D}(Q_0)\) such that for a.e. \(x\in Q_0\),

$$\begin{aligned} |f(x)-m_f(Q_0)|\le 4m_{\frac{1}{2^{n+2}};Q_0}^{\#,d}f(x)+2\sum _{k,j} \omega _{\frac{1}{2^{n+2}}}(f;Q_j^k)\chi _{Q_j^k}(x). \end{aligned}$$

3 A Key Estimate

In this section we will obtain the main local mean oscillation estimate of \(S_{\alpha ,\psi }\). We consider a smooth version of \(S_{\alpha ,\psi }\) defined as follows. Let \(\Phi \) be a Schwartz function such that

$$\begin{aligned} \chi _{B(0,1)}(x)\le \Phi (x)\le \chi _{B(0,2)}(x). \end{aligned}$$

Define

$$\begin{aligned} \widetilde{S}_{\alpha ,\psi }(f)(x)=\left( \int \int _{{\mathbb R}^{n+1}_+}\Phi \Big (\frac{x-y}{t\alpha }\Big )|f*\psi _t(y)|^2\frac{dydt}{t^{n+1}}\right) ^{1/2}\quad (\alpha >0). \end{aligned}$$

It is easy to see that

$$\begin{aligned} S_{\alpha ,\psi }(f)(x)\le \widetilde{S}_{\alpha ,\psi }(f)(x)\le S_{2\alpha ,\psi }(f)(x). \end{aligned}$$

Hence, by (2.2),

$$\begin{aligned} \Vert \widetilde{S}_{\alpha ,\psi }(f)\Vert _{L^{1,\infty }}\le c_{n,\psi }\alpha ^{n}\Vert f\Vert _{L^1}. \end{aligned}$$
(3.1)
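The two-sided comparison and (3.1) can be checked as follows. This is a short sketch; we assume \(2\alpha \ge 1\), so that (2.2) applies with aperture \(2\alpha \). Since \((y,t)\in \Gamma _{\alpha }(x)\) means \(|x-y|/(t\alpha )<1\),

$$\begin{aligned} \chi _{\Gamma _{\alpha }(x)}(y,t)=\chi _{B(0,1)}\Big (\frac{x-y}{t\alpha }\Big )\le \Phi \Big (\frac{x-y}{t\alpha }\Big )\le \chi _{B(0,2)}\Big (\frac{x-y}{t\alpha }\Big )=\chi _{\Gamma _{2\alpha }(x)}(y,t). \end{aligned}$$

Integrating against \(|f*\psi _t(y)|^2\frac{dydt}{t^{n+1}}\) over \({\mathbb R}^{n+1}_+\) and taking square roots gives the comparison, and then

$$\begin{aligned} \Vert \widetilde{S}_{\alpha ,\psi }(f)\Vert _{L^{1,\infty }}\le \Vert S_{2\alpha ,\psi }(f)\Vert _{L^{1,\infty }}\le c_{n,\psi }(2\alpha )^{n}\Vert f\Vert _{L^1}=2^nc_{n,\psi }\alpha ^{n}\Vert f\Vert _{L^1}. \end{aligned}$$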

Lemma 3.1

For any cube \(Q\subset {\mathbb R}^n\),

$$\begin{aligned} \omega _{\lambda }(\widetilde{S}_{\alpha ,\psi }(f)^2;Q)\le c_{n,\lambda ,\psi }\alpha ^{2n}\sum _{k=0}^{\infty }\frac{1}{2^{k\delta }}\left( \frac{1}{|2^kQ|}\int _{2^kQ}|f|\right) ^2, \end{aligned}$$
(3.2)

where \(\delta =\varepsilon \) from condition (1.1) if \(\varepsilon <1\), and \(\delta <1\) if \(\varepsilon =1\).

Proof

Given a cube \(Q\), let \(T(Q)=\{(y,t):y\in Q, 0<t<{\ell }_Q\},\) where \({\ell }_Q\) denotes the side length of \(Q\). For \(x\in Q\) we decompose \(\widetilde{S}_{\alpha ,\psi }(f)(x)^2\) into the sum of

$$\begin{aligned} I_1(f)(x)=\int \int _{T(2Q)} \Phi \Big (\frac{x-y}{t\alpha }\Big )|f*\psi _t(y)|^2\frac{dydt}{t^{n+1}} \end{aligned}$$

and

$$\begin{aligned} I_2(f)(x)=\int \int _{{\mathbb R}^{n+1}_+\setminus T(2Q) }\Phi \Big (\frac{x-y}{t\alpha }\Big )|f*\psi _t(y)|^2\frac{dydt}{t^{n+1}}. \end{aligned}$$

Let us show first that

$$\begin{aligned} (I_1(f)\chi _Q)^*(\lambda |Q|)\le c_{n,\lambda ,\psi }\alpha ^{2n}\sum _{k=0}^{\infty }\frac{1}{2^{k\varepsilon }}\left( \frac{1}{|2^kQ|}\int _{2^kQ}|f|\right) ^2. \end{aligned}$$
(3.3)

Using that \((a+b)^2\le 2(a^2+b^2)\), we get

$$\begin{aligned} I_1(f)(x)\le 2\big (I_1(f\chi _{4Q})(x)+I_1(f\chi _{{\mathbb R}^n\setminus 4Q})(x)\big ). \end{aligned}$$

Hence,

$$\begin{aligned} (I_1(f)\chi _Q)^*(\lambda |Q|)&\le 2\big ((I_1(f\chi _{4Q}))^*(\lambda |Q|/2)\\&+(I_1(f\chi _{{\mathbb R}^n\setminus 4Q})\chi _Q)^*(\lambda |Q|/2)\big ).\nonumber \end{aligned}$$
(3.4)

By (3.1),

$$\begin{aligned} (I_1(f\chi _{4Q}))^*(\lambda |Q|/2)&\le (\widetilde{S}_{\alpha ,\psi }(f\chi _{4Q}))^*(\lambda |Q|/2)^2\\&\le c_{n,\lambda ,\psi }\alpha ^{2n}\left( \frac{1}{|4Q|}\int _{4Q}|f|\right) ^2.\nonumber \end{aligned}$$
(3.5)
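The passage from (3.1) to (3.5) rests on two elementary facts about rearrangements, which we spell out as a sketch: \((g^2)^*(s)=(g^*(s))^2\) for \(g\ge 0\), because \(\{g^2>\alpha ^2\}=\{g>\alpha \}\), and \(g^*(s)\le \Vert g\Vert _{L^{1,\infty }}/s\), because \(|\{|g|>\alpha \}|\le \Vert g\Vert _{L^{1,\infty }}/\alpha <s\) as soon as \(\alpha >\Vert g\Vert _{L^{1,\infty }}/s\). Since \(I_1(g)\le \widetilde{S}_{\alpha ,\psi }(g)^2\) pointwise, taking \(g=\widetilde{S}_{\alpha ,\psi }(f\chi _{4Q})\) and \(s=\lambda |Q|/2\) gives

$$\begin{aligned} (\widetilde{S}_{\alpha ,\psi }(f\chi _{4Q}))^*(\lambda |Q|/2)\le \frac{2}{\lambda |Q|}c_{n,\psi }\alpha ^n\int _{4Q}|f|=\frac{2\cdot 4^n}{\lambda }c_{n,\psi }\alpha ^n\frac{1}{|4Q|}\int _{4Q}|f|, \end{aligned}$$

and squaring yields (3.5).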

Further, by (1.1), for \((y,t)\in T(2Q)\),

$$\begin{aligned} |(f\chi _{{\mathbb R}^n\setminus 4Q})*\psi _t(y)|&\le c_{\psi }t^{\varepsilon }\int _{{\mathbb R}^n\setminus 4Q}|f(\xi )|\frac{1}{(t+|y-\xi |)^{n+\varepsilon }}d\xi \\&\le c_{n,\psi }(t/\ell _Q)^{\varepsilon }\sum _{k=0}^{\infty }\frac{1}{2^{k\varepsilon }}\frac{1}{|2^kQ|}\int _{2^kQ}|f|. \end{aligned}$$
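The second inequality above comes from a dyadic splitting of \({\mathbb R}^n\setminus 4Q\); the following sketch records the computation, with constants that we do not attempt to optimize. The size condition in (1.1) gives \(|\psi _t(z)|\le c\,t^{\varepsilon }(t+|z|)^{-n-\varepsilon }\). If \(y\in 2Q\) and \(\xi \in 2^{k+1}Q\setminus 2^kQ\) with \(k\ge 2\), then \(|y-\xi |\ge (2^{k-1}-1)\ell _Q\ge 2^{k-2}\ell _Q\), and hence

$$\begin{aligned} \int _{{\mathbb R}^n\setminus 4Q}\frac{|f(\xi )|}{(t+|y-\xi |)^{n+\varepsilon }}d\xi&\le \sum _{k=2}^{\infty }\frac{1}{(2^{k-2}\ell _Q)^{n+\varepsilon }}\int _{2^{k+1}Q}|f|\\&=\frac{2^{3n+2\varepsilon }}{\ell _Q^{\varepsilon }}\sum _{k=2}^{\infty }\frac{1}{2^{k\varepsilon }}\frac{1}{|2^{k+1}Q|}\int _{2^{k+1}Q}|f|. \end{aligned}$$

Re-indexing the sum (\(k+1\mapsto k\)) and multiplying by \(c\,t^{\varepsilon }\) gives the stated bound.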

Hence, using Chebyshev’s inequality and that \(\int _{{\mathbb R}^n}\Phi \Big (\frac{x-y}{t\alpha }\Big )dx\le c_n(t\alpha )^n\), we have

$$\begin{aligned}&(I_1(f\chi _{{\mathbb R}^n\setminus 4Q})\chi _Q)^*(\lambda |Q|/2)\\&\le \frac{2}{\lambda |Q|} \int \int _{T(2Q)}\Big (\int _{{\mathbb R}^n}\Phi \Big (\frac{x-y}{t\alpha }\Big )dx\Big )|(f\chi _{{\mathbb R}^n\setminus 4Q})*\psi _t(y)|^2\frac{dydt}{t^{n+1}}\\&\le c_{n,\lambda ,\psi }\alpha ^n(1/\ell _Q)^{2\varepsilon }\left( \sum _{k=0}^{\infty }\frac{1}{2^{k\varepsilon }}\frac{1}{|2^kQ|}\int _{2^kQ}|f|\right) ^2\int _0^{2\ell _Q}t^{2\varepsilon -1}dt\\&\le c_{n,\lambda ,\psi }\alpha ^n\left( \sum _{k=0}^{\infty }\frac{1}{2^{k\varepsilon }}\frac{1}{|2^kQ|}\int _{2^kQ}|f|\right) ^2. \end{aligned}$$

By Hölder’s inequality,

$$\begin{aligned} \left( \sum _{k=0}^{\infty }\frac{1}{2^{k\varepsilon }}\frac{1}{|2^kQ|}\int _{2^kQ}|f|\right) ^2\le \left( \sum _{k=0}^{\infty }\frac{1}{2^{k\varepsilon }}\right) \sum _{k=0}^{\infty }\frac{1}{2^{k\varepsilon }}\left( \frac{1}{|2^kQ|}\int _{2^kQ}|f|\right) ^2. \end{aligned}$$

Combining this with the previous estimate and with (3.5) and (3.4) proves (3.3).

Let \(x,x_0\in Q\), and let us estimate now \(|I_2(f)(x)-I_2(f)(x_0)|\). We have

$$\begin{aligned}&|I_2(f)(x)-I_2(f)(x_0)|\\&\le \sum _{k=1}^{\infty }\int \int _{T(2^{k+1}Q)\setminus T(2^kQ)} \Big |\Phi \Big (\frac{x-y}{t\alpha }\Big )-\Phi \Big (\frac{x_0-y}{t\alpha }\Big )\Big ||f*\psi _t(y)|^2\frac{dydt}{t^{n+1}}. \end{aligned}$$

Suppose \((y,t)\in T(2^{k+1}Q)\setminus T(2^kQ)\). If \(y\in 2^kQ\), then \(t\ge 2^k\ell _Q\). On the other hand, if \(y\in 2^{k+1}Q\setminus 2^kQ\), then for any \(x\in Q, |y-x|\ge \frac{2^k-1}{2}\ell _Q\). Hence, if \(t<\frac{2^k-1}{4\alpha }\ell _Q\), then \(|y-x|/\alpha t>2\) and \(|y-x_0|/\alpha t>2\), and therefore,

$$\begin{aligned} \Phi \Big (\frac{x-y}{t\alpha }\Big )-\Phi \Big (\frac{x_0-y}{t\alpha }\Big )=0. \end{aligned}$$

Assume that \(t\ge \frac{2^k-1}{4\alpha }\ell _Q\). This easily implies \(t\ge 2^{k-3}\ell _Q/\alpha \). Thus, using that

$$\begin{aligned} \Big |\Phi \Big (\frac{x-y}{t\alpha }\Big )-\Phi \Big (\frac{x_0-y}{t\alpha }\Big )\Big |\le \frac{\sqrt{n}\ell _Q}{\alpha t}\Vert \nabla \Phi \Vert _{L^{\infty }}, \end{aligned}$$

we get

$$\begin{aligned}&\Big |\Phi \Big (\frac{x-y}{t\alpha }\Big )-\Phi \Big (\frac{x_0-y}{t\alpha }\Big )\Big |\chi _{\{T(2^{k+1}Q)\setminus T(2^kQ)\}}(y,t)\\&\le c_n\frac{\ell _Q}{\alpha t}\chi _{\{(y,t):y\in 2^{k+1}Q, 2^{k-3}\ell _Q/\alpha \le t\le 2^{k+1}\ell _Q\}}(y,t). \end{aligned}$$
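The elementary inequalities used in this step are as follows (a sketch for \(k\ge 1\) and \(\alpha \ge 1\)). First, if \(|y-x|\ge \frac{2^k-1}{2}\ell _Q\) and \(t<\frac{2^k-1}{4\alpha }\ell _Q\), then

$$\begin{aligned} \frac{|y-x|}{\alpha t}>\frac{2^k-1}{2}\ell _Q\cdot \frac{4}{(2^k-1)\ell _Q}=2. \end{aligned}$$

Second, since \(2^k-1\ge 2^{k-1}\) for \(k\ge 1\),

$$\begin{aligned} t\ge \frac{2^k-1}{4\alpha }\ell _Q\ \Longrightarrow \ t\ge \frac{2^{k-1}}{4\alpha }\ell _Q=\frac{2^{k-3}\ell _Q}{\alpha }, \end{aligned}$$

and in the remaining case \(y\in 2^kQ\) we have \(t\ge 2^k\ell _Q\ge 2^{k-3}\ell _Q/\alpha \). Finally, the gradient bound follows from the mean value theorem together with \(|x-x_0|\le \sqrt{n}\ell _Q\) for \(x,x_0\in Q\).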

Hence,

$$\begin{aligned}&\int \int _{T(2^{k+1}Q)\setminus T(2^kQ)} \Big |\Phi \Big (\frac{x-y}{t\alpha }\Big )-\Phi \Big (\frac{x_0-y}{t\alpha }\Big )\Big ||f*\psi _t(y)|^2\frac{dydt}{t^{n+1}}\\&\quad \le c_n\frac{\ell _Q}{\alpha }\int _{2^{k-3}\ell _Q/\alpha }^{2^{k+1}\ell _Q}\int _{2^{k+1}Q}|f*\psi _t(y)|^2\frac{dydt}{t^{n+2}}\le c_n(J_1+J_2), \end{aligned}$$

where

$$\begin{aligned} J_1=\frac{\ell _Q}{\alpha }\int _{2^{k-3}\ell _Q/\alpha }^{2^{k+1}\ell _Q}\int _{2^{k+1}Q} |(f\chi _{2^{k+2}Q})*\psi _t(y)|^2\frac{dydt}{t^{n+2}} \end{aligned}$$

and

$$\begin{aligned} J_2=\frac{\ell _Q}{\alpha }\int _{2^{k-3}\ell _Q/\alpha }^{2^{k+1}\ell _Q}\int _{2^{k+1}Q} |(f\chi _{{\mathbb R}^n\setminus 2^{k+2}Q})*\psi _t(y)|^2\frac{dydt}{t^{n+2}}. \end{aligned}$$

Let us first estimate \(J_1\). Using Minkowski’s integral inequality, we obtain

$$\begin{aligned} J_1\le \frac{\ell _Q}{\alpha }\left( \int _{2^{k+2}Q}|f(\xi )|\Big ( \int _{2^{k-3}\ell _Q/\alpha }^{2^{k+1}\ell _Q}\int _{2^{k+1}Q}\psi _t(y-\xi )^2\frac{dydt}{t^{n+2}}\Big )^{1/2}d\xi \right) ^2. \end{aligned}$$

Since

$$\begin{aligned} \int _{2^{k+1}Q}\psi _t(y-\xi )^2dy\le \frac{\Vert \psi \Vert _{L^{\infty }}}{t^n}\Vert \psi _t\Vert _{L^1}= \frac{\Vert \psi \Vert _{L^{\infty }}\Vert \psi \Vert _{L^1}}{t^n}, \end{aligned}$$

we get

$$\begin{aligned} J_1&\le c_{\psi }\frac{\ell _Q}{\alpha }\Big (\int _{2^{k+2}Q}|f(\xi )|d\xi \Big )^2\int _{2^{k-3}\ell _Q/\alpha }^{\infty }\frac{dt}{t^{2n+2}}\\&\le c_{n,\psi }\alpha ^{2n}2^{-k}\Big (\frac{1}{|2^{k+2}Q|}\int _{2^{k+2}Q}|f(\xi )|d\xi \Big )^2. \end{aligned}$$
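The exponent bookkeeping in the last inequality can be checked directly; the explicit constant below is ours and is only meant as a sanity check. With \(a=2^{k-3}\ell _Q/\alpha \),

$$\begin{aligned} \frac{\ell _Q}{\alpha }\int _{a}^{\infty }\frac{dt}{t^{2n+2}}=\frac{1}{2n+1}\frac{\ell _Q}{\alpha }\Big (\frac{\alpha }{2^{k-3}\ell _Q}\Big )^{2n+1}=\frac{\alpha ^{2n}2^{-(k-3)(2n+1)}}{(2n+1)\ell _Q^{2n}}, \end{aligned}$$

and \(\big (\int _{2^{k+2}Q}|f|\big )^2=2^{2n(k+2)}\ell _Q^{2n}\big (\frac{1}{|2^{k+2}Q|}\int _{2^{k+2}Q}|f|\big )^2\). Multiplying, the powers of \(\ell _Q\) cancel and the power of \(2\) is

$$\begin{aligned} -(k-3)(2n+1)+2n(k+2)=-k+10n+3, \end{aligned}$$

which gives the factor \(\alpha ^{2n}2^{-k}\) up to a constant depending only on \(n\).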

We turn to the estimate of \(J_2\). By (1.1), for \((y,t)\in T(2^{k+1}Q)\),

$$\begin{aligned} |(f\chi _{{\mathbb R}^n\setminus 2^{k+2}Q})*\psi _t(y)|&\le c_{\psi }t^{\varepsilon }\int _{{\mathbb R}^n\setminus 2^{k+2}Q}|f(\xi )|\frac{1}{(t+|y-\xi |)^{n+\varepsilon }}d\xi \\&\le c_{n,\psi }(t/\ell _Q)^{\varepsilon }\sum _{i=k}^{\infty }\frac{1}{2^{i\varepsilon }}\frac{1}{|2^iQ|}\int _{2^iQ}|f|. \end{aligned}$$

Therefore,

$$\begin{aligned} J_2&\le c_{n,\psi }\frac{\ell _Q}{\alpha }\Big (\sum _{i=k}^{\infty }\frac{1}{2^{i\varepsilon }}\frac{1}{|2^iQ|}\int _{2^iQ}|f|\Big )^2\frac{1}{\ell _Q^{2\varepsilon }} \int _{2^{k-3}\ell _Q/\alpha }^{2^{k+1}\ell _Q}\int _{2^{k+1}Q}\frac{dydt}{t^{n+2-2\varepsilon }}\\&\le c_{n,\psi }\alpha ^{n-2\varepsilon }2^{(2\varepsilon -1)k}\Big (\sum _{i=k}^{\infty }\frac{1}{2^{i\varepsilon }}\frac{1}{|2^iQ|}\int _{2^iQ}|f|\Big )^2. \end{aligned}$$
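A sketch of the computation behind the last inequality, under the assumption \(n+1-2\varepsilon >0\) (in the borderline case \(n=1\), \(\varepsilon =1\) the \(t\)-integral below is instead \(\log (16\alpha )\), which is still dominated by a constant multiple of \(\alpha ^{2n}\)). With \(a=2^{k-3}\ell _Q/\alpha \) and \(|2^{k+1}Q|=2^{(k+1)n}\ell _Q^n\),

$$\begin{aligned} \frac{\ell _Q}{\alpha }\frac{1}{\ell _Q^{2\varepsilon }}\,|2^{k+1}Q|\int _a^{\infty }\frac{dt}{t^{n+2-2\varepsilon }}&=\frac{1}{n+1-2\varepsilon }\frac{\ell _Q^{n+1-2\varepsilon }}{\alpha }2^{(k+1)n}\Big (\frac{\alpha }{2^{k-3}\ell _Q}\Big )^{n+1-2\varepsilon }\\&=\frac{2^{4n+3-6\varepsilon }}{n+1-2\varepsilon }\alpha ^{n-2\varepsilon }2^{(2\varepsilon -1)k}. \end{aligned}$$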

Combining the estimates for \(J_1\) and \(J_2\), we obtain

$$\begin{aligned} |I_2(f)(x)-I_2(f)(x_0)|&\le c_{n,\psi }\alpha ^{2n}\sum _{k=1}^{\infty }\frac{1}{2^k}\Big (\frac{1}{|2^{k}Q|}\int _{2^{k}Q}|f(\xi )|d\xi \Big )^2\\&+ c_{n,\psi }\alpha ^{n-2\varepsilon }\sum _{k=1}^{\infty }\frac{2^{2\varepsilon k}}{2^k}\Big (\sum _{i=k}^{\infty }\frac{1}{2^{i\varepsilon }}\frac{1}{|2^iQ|}\int _{2^iQ}|f|\Big )^2. \end{aligned}$$

By Hölder’s inequality,

$$\begin{aligned}&\sum _{k=1}^{\infty }\frac{2^{2\varepsilon k}}{2^k}\left( \sum _{i=k}^{\infty }\frac{1}{2^{i\varepsilon }}\frac{1}{|2^iQ|}\int _{2^iQ}|f|\right) ^2\\&\quad \le c_{\varepsilon }\sum _{k=1}^{\infty }\frac{2^{\varepsilon k}}{2^k}\sum _{i=k}^{\infty }\frac{1}{2^{i\varepsilon }}\left( \frac{1}{|2^iQ|}\int _{2^iQ}|f|\right) ^2\\&\quad \le c_{\varepsilon }\sum _{k=1}^{\infty }\gamma (k,\varepsilon )\left( \frac{1}{|2^kQ|}\int _{2^kQ}|f|\right) ^2, \end{aligned}$$

where

$$\begin{aligned} \gamma (k,\varepsilon )={\left\{ \begin{array}{ll} \frac{1}{2^{\varepsilon k}}, &{} \varepsilon <1\\ \frac{k}{2^k}, &{} \varepsilon =1.\end{array}\right. } \end{aligned}$$
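The two inequalities above can be unpacked as follows; this is a sketch in which we abbreviate \(b_i=\frac{1}{|2^iQ|}\int _{2^iQ}|f|\). By the Cauchy-Schwarz form of Hölder's inequality,

$$\begin{aligned} \frac{2^{2\varepsilon k}}{2^k}\Big (\sum _{i=k}^{\infty }\frac{b_i}{2^{i\varepsilon }}\Big )^2\le \frac{2^{2\varepsilon k}}{2^k}\Big (\sum _{i=k}^{\infty }\frac{1}{2^{i\varepsilon }}\Big )\sum _{i=k}^{\infty }\frac{b_i^2}{2^{i\varepsilon }}=\frac{1}{1-2^{-\varepsilon }}\frac{2^{\varepsilon k}}{2^k}\sum _{i=k}^{\infty }\frac{b_i^2}{2^{i\varepsilon }}. \end{aligned}$$

Interchanging the order of summation then gives

$$\begin{aligned} \sum _{k=1}^{\infty }2^{(\varepsilon -1)k}\sum _{i=k}^{\infty }\frac{b_i^2}{2^{i\varepsilon }}=\sum _{i=1}^{\infty }\frac{b_i^2}{2^{i\varepsilon }}\sum _{k=1}^{i}2^{(\varepsilon -1)k}, \end{aligned}$$

where the inner sum is at most \(1/(2^{1-\varepsilon }-1)\) if \(\varepsilon <1\) and equals \(i\) if \(\varepsilon =1\). This is the origin of the two cases in \(\gamma (k,\varepsilon )\).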

Therefore,

$$\begin{aligned} |I_2(f)(x)-I_2(f)(x_0)|\le c_{n,\psi }\alpha ^{2n}\sum _{k=1}^{\infty }\gamma (k,\varepsilon )\left( \frac{1}{|2^kQ|}\int _{2^kQ}|f|\right) ^2. \end{aligned}$$

From this and from (3.3),

$$\begin{aligned} \omega _{\lambda }(\widetilde{S}_{\alpha ,\psi }(f)^2;Q)&\le (I_1(f)\chi _Q)^*(\lambda |Q|)+\Vert I_2(f)-I_2(f)(x_0)\Vert _{L^{\infty }(Q)}\\&\le c_{n,\lambda ,\psi }\alpha ^{2n}\sum _{k=0}^{\infty }\gamma (k,\varepsilon )\left( \frac{1}{|2^kQ|}\int _{2^kQ}|f|\right) ^2, \end{aligned}$$

which completes the proof. \(\square \)

4 Proof of Theorem 1.1

4.1 Several Auxiliary Operators

Throughout this subsection we assume that \(f,g\ge 0\). Given a sparse family \({\mathcal S}=\{Q_j^k\}\subset {\fancyscript{D}}\), define

$$\begin{aligned} {\mathcal T}^{\mathcal S}_{2,m}f(x)=\left( \sum _{j,k}(f_{2^mQ_j^k})^2\chi _{Q_j^k}(x)\right) ^{1/2}. \end{aligned}$$

The following result was proved in [4].

Lemma 4.1

For any \(1<p<\infty \),

$$\begin{aligned} \Vert {\mathcal T}^{\mathcal S}_{2,0}\Vert _{L^p(w)}\le c_{n,p}[w]_{A_p}^{\max (\frac{1}{2},\frac{1}{p-1})}. \end{aligned}$$

Given a sparse family \({\mathcal S}=\{Q_j^k\}\subset {\mathcal D}\), define

$$\begin{aligned} {\fancyscript{M}}_{m}^{\mathcal S}(f,g)(x)=\sum _{j,k}(f_{2^mQ_j^k})\left( \frac{1}{|2^mQ_j^k|}\int _{Q_j^k}g\right) \chi _{2^mQ_j^k}(x). \end{aligned}$$

Applying Proposition 2.2, we decompose the family of cubes \(\{Q_j^k\}\) into \(2^n\) disjoint families \(F_{i}\) such that for any \(Q_j^k\in F_{i}\) there exists a cube \(P_{j,k}^{m,i}\in {\fancyscript{D}}_{i}\) such that \(2^mQ_j^k\subset P_{j,k}^{m,i}\) and \(\ell _{P_{j,k}^{m,i}}\le 6\ell _{2^mQ_j^k}\). Hence,

$$\begin{aligned} {\fancyscript{M}}_{m}^{\mathcal S}(f,g)(x)\le 6^{2n}\sum _{i=1}^{2^n} {\fancyscript{M}}_{i,m}^{\mathcal S}(f,g)(x), \end{aligned}$$
(4.1)

where

$$\begin{aligned} {\fancyscript{M}}_{i,m}^{\mathcal S}(f,g)(x) =\sum _{j,k}(f_{P_{j,k}^{m,i}})\left( \frac{1}{|P_{j,k}^{m,i}|}\int _{Q_j^k}g\right) \chi _{P_{j,k}^{m,i}}(x). \end{aligned}$$
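The factor \(6^{2n}\) in (4.1) can be traced term by term. This is a sketch in which we write \(P=P_{j,k}^{m,i}\) and use \(f,g\ge 0\), \(2^mQ_j^k\subset P\) and \(|P|\le 6^n|2^mQ_j^k|\); the sum defining \({\fancyscript{M}}_{i,m}^{\mathcal S}\) is understood to run over \(Q_j^k\in F_i\). We have

$$\begin{aligned} f_{2^mQ_j^k}\le \frac{|P|}{|2^mQ_j^k|}f_{P}\le 6^nf_P,\qquad \frac{1}{|2^mQ_j^k|}\int _{Q_j^k}g\le \frac{6^n}{|P|}\int _{Q_j^k}g,\qquad \chi _{2^mQ_j^k}\le \chi _{P}. \end{aligned}$$

Multiplying these three inequalities and summing over \(Q_j^k\in F_i\) and then over \(i\) gives (4.1).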

The following statement was obtained in [5].

Lemma 4.2

Suppose that the sum defining \({\fancyscript{M}}_{i,m}^{\mathcal S}(f,g)\) contains finitely many terms. Then there are at most \(2^n\) cubes \(Q_{\nu }\in {\fancyscript{D}}_{i}\) covering the support of \({\fancyscript{M}}_{i,m}^{\mathcal S}(f,g)\) so that for every \(Q_{\nu }\) there are two sparse families \({\mathcal S}_{i,1}\) and \({\mathcal S}_{i,2}\) from \({\fancyscript{D}}_{i}\) having the property that for a.e. \(x\in Q_{\nu }\),

$$\begin{aligned} {\fancyscript{M}}_{i,m}^{\mathcal S}(f,g)(x)\le c_n(m+1)\sum _{\kappa =1}^2\sum _{Q_j^k\in {\mathcal S}_{i,\kappa }} f_{Q_j^k}g_{Q_j^k}\chi _{Q_j^k}(x). \end{aligned}$$

Observe that the proof of Lemma 4.2 is based on Theorem 2.3 along with [14, Lemma 3.2]. Formally, Lemma 4.2 follows from [5, Lemma 4.2], taking there \(m=2\) (which corresponds to the bilinear case) and \(l=m\), and from the subsequent argument in [5, Sect. 4.2].

Let \(X\) be a Banach function space, and let \(X'\) denote the associate space (see [2, Ch. 1]). Given a Banach function space \(X\), denote by \(X^{(2)}\) the space endowed with the norm

$$\begin{aligned} \Vert f\Vert _{X^{(2)}}=\Vert |f|^2\Vert _{X}^{1/2}. \end{aligned}$$

It is well known [16, Ch. 1] that \(X^{(2)}\) is also a Banach space.
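As a concrete instance of this construction (an example of ours), take \(X=L^q(w)\) with \(1\le q<\infty \). Then

$$\begin{aligned} \Vert f\Vert _{X^{(2)}}=\Big (\int _{{\mathbb R}^n}|f|^{2q}w\,dx\Big )^{\frac{1}{2q}}=\Vert f\Vert _{L^{2q}(w)}, \end{aligned}$$

so that \(X^{(2)}=L^{2q}(w)\). In particular, \(X=L^{3/2}(w)\) corresponds to \(X^{(2)}=L^3(w)\).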

Lemma 4.3

For any Banach function space \(X\),

$$\begin{aligned} \sup _{{\mathcal S}\in {\mathcal D}}\Vert {\mathcal T}^{\mathcal S}_{2,m}f\Vert _{X^{(2)}}\le c_nm^{1/2}\max _{1\le i\le 2^n}\sup _{{\mathcal S}\in {\fancyscript{D}}_{i}}\Vert {\mathcal T}^{\mathcal S}_{2,0}f\Vert _{X^{(2)}}. \end{aligned}$$

Proof

By the standard argument, it suffices to prove the estimate for a finite partial sum \({\widetilde{\mathcal T}}^{\mathcal S}_{2,m}f\) from the series defining \({\mathcal T}^{\mathcal S}_{2,m}f\). Fix \({\mathcal S}\in {\mathcal D}\). By duality, there exists \(g\ge 0\) with \(\Vert g\Vert _{X'}=1\) such that

$$\begin{aligned} \Vert {\widetilde{\mathcal T}}^{\mathcal S}_{2,m}f\Vert _{X^{(2)}}^2&= \int _{{\mathbb R}^n} ({\widetilde{\mathcal T}}^{\mathcal S}_{2,m}f)^2g\,dx=\sum _{j,k}(f_{2^mQ_j^k})^2\int _{Q_j^k}g\\&= \int _{{\mathbb R}^n}{\fancyscript{M}}_{m}^{\mathcal S}(f,g)f\,dx,\nonumber \end{aligned}$$
(4.2)

where the sum defining \({\fancyscript{M}}_{m}^{\mathcal S}(f,g)\) contains finitely many terms. By Lemma 4.2 and by Hölder’s inequality,

$$\begin{aligned} \int _{Q_{\nu }}{\fancyscript{M}}_{i,m}^{\mathcal S}(f,g)f\,dx&\le c_nm\sum _{\kappa =1}^2\sum _{Q_j^k\in {\mathcal S}_{i,\kappa }} (f_{Q_j^k})^2\int _{Q_j^k}g\\&\le c_nm\sum _{\kappa =1}^2\int _{{\mathbb R}^n} ({\mathcal T}^{{\mathcal S}_{i,\kappa }}_{2,0}f)^2g\,dx\\&\le 2c_nm\sup _{{\mathcal S}\in {\fancyscript{D}}_{i}}\Vert {\mathcal T}^{\mathcal S}_{2,0}f\Vert _{X^{(2)}}^2. \end{aligned}$$

Summing up over \(Q_{\nu }\) and using (4.1), we obtain

$$\begin{aligned} \int _{{\mathbb R}^n}{\fancyscript{M}}_{m}^{\mathcal S}(f,g)f\,dx\le c_nm\max _{1\le i\le 2^n} \sup _{{\mathcal S}\in {\fancyscript{D}}_{i}}\Vert {\mathcal T}^{\mathcal S}_{2,0}f\Vert _{X^{(2)}}^2. \end{aligned}$$

This along with (4.2) completes the proof. \(\square \)
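Two identities used in this proof can be verified directly; the following is a sketch for \(f,g\ge 0\). The last equality in (4.2) holds because

$$\begin{aligned} \int _{{\mathbb R}^n}{\fancyscript{M}}_{m}^{\mathcal S}(f,g)f\,dx=\sum _{j,k}(f_{2^mQ_j^k})\left( \frac{1}{|2^mQ_j^k|}\int _{Q_j^k}g\right) \int _{2^mQ_j^k}f=\sum _{j,k}(f_{2^mQ_j^k})^2\int _{Q_j^k}g. \end{aligned}$$

The Hölder step is the inequality, valid for any sparse family \({\mathcal S}'\) and any \(g\ge 0\) with \(\Vert g\Vert _{X'}=1\),

$$\begin{aligned} \int _{{\mathbb R}^n}({\mathcal T}^{{\mathcal S}'}_{2,0}f)^2g\,dx\le \Vert ({\mathcal T}^{{\mathcal S}'}_{2,0}f)^2\Vert _{X}\Vert g\Vert _{X'}=\Vert {\mathcal T}^{{\mathcal S}'}_{2,0}f\Vert _{X^{(2)}}^2. \end{aligned}$$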

4.2 Proof of Theorem 1.1

Let \(Q\in {\mathcal D}\). By Lemma 3.1, for all \(x\in Q\),

$$\begin{aligned} m_{\frac{1}{2^{n+2}};Q}^{\#,d}\big ((\widetilde{S}_{\alpha ,\psi }(f)^2)\big )(x)\le c_{n,\psi }\alpha ^{2n}Mf(x)^2. \end{aligned}$$

Hence, applying Theorem 2.3 to \(\widetilde{S}_{\alpha ,\psi }(f)^2\), we get that there exists a sparse family \({\mathcal S}=\{Q_j^k\}\subset {\mathcal D}(Q)\) such that for a.e. \(x\in Q\),

$$\begin{aligned} |\widetilde{S}_{\alpha ,\psi }(f)(x)^2-m_{Q}(\widetilde{S}_{\alpha ,\psi }(f)^2)|\le c_{n,\psi }\alpha ^{2n}\Big (Mf(x)^2+\sum _{m=0}^{\infty }\frac{1}{2^{m\delta }}\big ({\mathcal T}^{\mathcal S}_{2,m}f(x)\big )^2\Big ). \end{aligned}$$

Hence,

$$\begin{aligned} |\widetilde{S}_{\alpha ,\psi }(f)(x)^2-m_{Q}(\widetilde{S}_{\alpha ,\psi }(f)^2)|^{1/2}\le c_{n,\psi }\alpha ^n\big (Mf(x)+{\mathcal T}(f)(x)\big ), \end{aligned}$$
(4.3)

where

$$\begin{aligned} {\mathcal T}(f)(x)=\sum _{m=0}^{\infty }\frac{1}{2^{m\delta /2}}{\mathcal T}^{\mathcal S}_{2,m}f(x). \end{aligned}$$
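The steps leading to (4.3) can be sketched as follows, with \(f\) replaced by \(|f|\) in the averages if \(f\) is not nonnegative. Applying Lemma 3.1 to each \(Q_j^k\), multiplying by \(\chi _{Q_j^k}\) and summing over \(j,k\) gives

$$\begin{aligned} \sum _{k,j}\omega _{\frac{1}{2^{n+2}}}(\widetilde{S}_{\alpha ,\psi }(f)^2;Q_j^k)\chi _{Q_j^k}(x)\le c_{n,\psi }\alpha ^{2n}\sum _{m=0}^{\infty }\frac{1}{2^{m\delta }}\big ({\mathcal T}^{\mathcal S}_{2,m}f(x)\big )^2. \end{aligned}$$

Taking square roots then uses the inequality, valid for \(a,b_m\ge 0\),

$$\begin{aligned} \Big (a^2+\sum _{m=0}^{\infty }\frac{b_m^2}{2^{m\delta }}\Big )^{1/2}\le a+\sum _{m=0}^{\infty }\frac{b_m}{2^{m\delta /2}}, \end{aligned}$$

applied with \(a=Mf(x)\) and \(b_m={\mathcal T}^{\mathcal S}_{2,m}f(x)\).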

Assuming, for instance, that \(f\in L^1\), and using (2.3) and (3.1), we get

$$\begin{aligned} \lim _{|Q|\rightarrow \infty }m_{Q}(\widetilde{S}_{\alpha ,\psi }(f)^2)=0. \end{aligned}$$

Therefore, letting \(Q\) tend to any one of the \(2^n\) quadrants and using Fatou’s lemma, by (4.3) we obtain

$$\begin{aligned} \Vert \widetilde{S}_{\alpha ,\psi }(f)\Vert _{L^p(w)}\le c_{n,\psi }\alpha ^n\big (\Vert Mf\Vert _{L^p(w)}+\Vert {\mathcal T}(f)\Vert _{L^p(w)}\big ). \end{aligned}$$
(4.4)

Combining Lemma 4.1 and Lemma 4.3 with \(X=L^{3/2}(w)\) yields

$$\begin{aligned} \Vert {\mathcal T}(f)\Vert _{L^3(w)}&\le \sum _{m=0}^{\infty }\frac{1}{2^{m\delta /2}}\Vert {\mathcal T}^{\mathcal S}_{2,m}f\Vert _{L^3(w)}\\&\le c_n\sum _{m=0}^{\infty }\frac{m^{1/2}}{2^{m\delta /2}} \max _{1\le i\le 2^n}\sup _{{\mathcal S}\in {\fancyscript{D}}_{i}}\Vert {\mathcal T}^{\mathcal S}_{2,0}f\Vert _{L^3(w)}\\&\le c_{n,\delta }[w]_{A_3}^{1/2}\Vert f\Vert _{L^3(w)}. \end{aligned}$$

Hence, by the sharp version of the Rubio de Francia extrapolation theorem (see [6] or [7]),

$$\begin{aligned} \Vert {\mathcal T}(f)\Vert _{L^p(w)}\le c_{n,p,\delta }[w]_{A_p}^{\max (\frac{1}{2},\frac{1}{p-1})}\Vert f\Vert _{L^p(w)}\quad (1<p<\infty ). \end{aligned}$$
(4.5)

Thus, applying this result along with Buckley’s estimate \(\Vert M\Vert _{L^p(w)}\le c_{n,p}[w]_{A_p}^{\frac{1}{p-1}}\) (see [3]) and (4.4), we get

$$\begin{aligned} \Vert S_{\alpha ,\psi }\Vert _{L^p(w)}\le \Vert \widetilde{S}_{\alpha ,\psi }\Vert _{L^p(w)}\le c_{n,p,\psi }\alpha ^n[w]_{A_p}^{\max (\frac{1}{2},\frac{1}{p-1})}, \end{aligned}$$

and therefore, the proof is complete.
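The exponent in (4.5) can be traced as follows. We recall the sharp extrapolation theorem in the form we believe it is stated in [6] or [7], and the precise formulation should be checked there: if \(\Vert T\Vert _{L^{p_0}(w)}\le c[w]_{A_{p_0}}^{a}\) for all \(w\in A_{p_0}\), then \(\Vert T\Vert _{L^{p}(w)}\le c'[w]_{A_{p}}^{a\max (1,\frac{p_0-1}{p-1})}\) for all \(1<p<\infty \) and \(w\in A_p\). With \(p_0=3\) and \(a=1/2\) this gives

$$\begin{aligned} \frac{1}{2}\max \Big (1,\frac{2}{p-1}\Big )=\max \Big (\frac{1}{2},\frac{1}{p-1}\Big ). \end{aligned}$$

In the final step, Buckley's exponent is absorbed since \([w]_{A_p}\ge 1\):

$$\begin{aligned}{}[w]_{A_p}^{\frac{1}{p-1}}\le [w]_{A_p}^{\max (\frac{1}{2},\frac{1}{p-1})}. \end{aligned}$$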

5 Concluding Remarks

In a recent work [11], the following weak type estimate was obtained for \(G_{\beta }(f)\) (and hence for \(S_{\psi }(f)\)): if \(1<p<3\), then

$$\begin{aligned} \Vert G_{\beta }(f)\Vert _{L^{p,\infty }(w)}\lesssim [w]_{A_p}^{\max (\frac{1}{2},\frac{1}{p})}\Phi _p([w]_{A_p})\Vert f\Vert _{L^p(w)}, \end{aligned}$$

where \(\Phi _p(t)=1\) if \(1<p<2\) and \(\Phi _p(t)=1+\log t\) if \(p\ge 2\). The proof was based on the local mean oscillation decomposition technique along with the estimate

$$\begin{aligned} \Vert {\mathcal T}^{\mathcal S}_{2,0}f\Vert _{L^{p,\infty }(w)}\lesssim [w]_{A_p}^{\max (\frac{1}{2},\frac{1}{p})}\Phi _p([w]_{A_p})\Vert f\Vert _{L^p(w)}. \end{aligned}$$
(5.1)

Since the space \(L^{p,\infty }(w)\) is normable if \(p>1\) (see, e.g., [2, p. 220]), combining Lemma 4.3 with \(X=L^{1+\varepsilon ,\infty }(w), \varepsilon >0,\) and (5.1) yields for \(2<p<3\) that

$$\begin{aligned} \Vert {\mathcal T}f\Vert _{L^{p,\infty }(w)}\lesssim [w]_{A_p}^{\max (\frac{1}{2},\frac{1}{p})}\Phi _p([w]_{A_p})\Vert f\Vert _{L^p(w)}. \end{aligned}$$
(5.2)
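The restriction \(p>2\) in (5.2) can be seen from the following computation, given as a sketch in terms of the usual quasi-norm \(\Vert f\Vert _{L^{q,\infty }(w)}=\sup _{\lambda >0}\lambda \,w(\{|f|>\lambda \})^{1/q}\) (normability means that an equivalent norm exists). Substituting \(\lambda =s^2\),

$$\begin{aligned} \Vert |f|^2\Vert _{L^{q,\infty }(w)}^{1/2}=\Big (\sup _{s>0}s^2\,w(\{|f|>s\})^{1/q}\Big )^{1/2}=\sup _{s>0}s\,w(\{|f|>s\})^{\frac{1}{2q}}=\Vert f\Vert _{L^{2q,\infty }(w)}. \end{aligned}$$

Thus \(X=L^{q,\infty }(w)\) gives \(X^{(2)}=L^{2q,\infty }(w)\) up to equivalence of norms, and the requirement \(q>1\) translates into \(p=2q>2\).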

Hence, exactly as above, by (4.3) (and by the weak type estimate for \(M\) proved in [3]), we obtain

$$\begin{aligned} \Vert S_{\alpha ,\psi }(f)\Vert _{L^{p,\infty }(w)}\lesssim \alpha ^n[w]_{A_p}^{\max (\frac{1}{2},\frac{1}{p})}\Phi _p([w]_{A_p})\Vert f\Vert _{L^p(w)}\quad (2<p<3). \end{aligned}$$

We emphasize that our approach does not allow us to extend this estimate to \(1<p\le 2\). This is clearly related to the same problem with (5.2). The limitation \(2<p<3\) in (5.2) is due to Lemma 4.3, where the condition that \(X\) is a Banach function space was essential in the proof. This raises the natural question of whether Lemma 4.3 holds under the condition that \(X\) is a quasi-Banach space. Observe that the same question can be asked regarding a recent estimate relating \(X\)-norms of Calderón–Zygmund and dyadic positive operators [15].