1 Introduction and Main Results

The Hardy–Littlewood maximal function Mf of a locally integrable function f on \({\mathbb {R}}^d\) is defined by

$$\begin{aligned} Mf(x)=\sup _{r>0}\frac{1}{|B(x,r)|}\int _{B(x,r)}|f(y)|\,dy,\quad x\in {\mathbb {R}}^d, \end{aligned}$$
(1.1)

where B(x, r) denotes the Euclidean ball of radius r centred at x and |E| is the Lebesgue measure of a measurable subset E of \({\mathbb {R}}^d\). If the supremum is taken over all balls containing x, then the resulting operator is the uncentred Hardy–Littlewood maximal operator. The operator M was shown to be of strong type (p, p) for \(1<p\le \infty \) and of weak type (1, 1) by Hardy and Littlewood [18] for \(d=1\) and by Wiener [31] for general d. More precisely, M satisfies the strong type (p, p) inequality

$$\begin{aligned} \int _{{\mathbb {R}}^d}Mf(x)^p\,dx \le C_p \int _{{\mathbb {R}}^d}|f(x)|^p\,dx, \quad 1<p\le \infty , \end{aligned}$$
(1.2)

and the weak type (1, 1) inequality

$$\begin{aligned} |\{x\in {\mathbb {R}}^d:Mf(x)>\lambda \}|\le \frac{C}{\lambda }\Vert f\Vert _1\quad \text{ for } \text{ all }~\lambda >0. \end{aligned}$$
(1.3)
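
As a quick illustration of why only a weak type estimate can hold at \(p=1\), consider the following standard one-dimensional computation. The function \(f=\chi _{[-1,1]}\) is our choice for illustration and is not taken from the references. For \(|x|>1\), the interval \(B(x,|x|+1)\) contains \([-1,1]\) up to one endpoint, so

$$\begin{aligned} Mf(x)\ge \frac{1}{2(|x|+1)}\int _{B(x,|x|+1)}\chi _{[-1,1]}(y)\,dy=\frac{1}{|x|+1},\quad |x|>1. \end{aligned}$$

Hence \(Mf\notin L^1({\mathbb {R}})\) although \(f\in L^1({\mathbb {R}})\), so (1.2) cannot hold with \(p=1\), whereas this lower bound is consistent with the decay of order \(\lambda ^{-1}\) allowed by (1.3).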

In [30], Stein observed that the Euclidean balls and the translation invariant Lebesgue measure in \({\mathbb {R}}^d\) can be replaced by more general sets and a nonnegative Borel measure respectively to pose and answer similar questions. More specifically, we have the following.

For each \(x\in {\mathbb {R}}^d\), let \(\{E_r(x):0<r<\infty \}\) be a collection of nonempty, bounded open subsets of \({\mathbb {R}}^d\) containing x and let \({\mathcal {E}}=\{E_r(x):0<r<\infty ,x\in {\mathbb {R}}^d\}\). Assume that the sets in \({\mathcal {E}}\) are monotonic in r in the sense that \(E_r(x)\subset E_s(x)\) if \(0<r\le s\). Let \(\mu \) be a nonnegative Borel measure with \(\mu ({\mathbb {R}}^d)>0\). Further, assume that the sets in the family \({\mathcal {E}}\) and the measure \(\mu \) satisfy the following properties:

  1. (i)

    there exists a constant \(\theta >1\) such that for all x, y and r, \(E_r(x)\cap E_r(y)\not =\emptyset \) implies \(E_r(y)\subset E_{\theta r}(x)\);

  2. (ii)

    there exists a constant \(C_\mu >1\) such that

    $$\begin{aligned} \mu \bigl (E_{2r}(x)\bigr )\le C_\mu \,\mu \bigl (E_r(x)\bigr )~\text{ for } \text{ all }~x\in {\mathbb {R}}^d~\text{ and }~0<r<\infty ; \end{aligned}$$
  3. (iii)

    \(\bigcap _{r>0}\overline{E_r(x)}=\{x\}\) and \(\bigcup _{r>0}E_r(x)={\mathbb {R}}^d\);

  4. (iv)

    for each open set U and \(r>0\), the function \(x\rightarrow \mu \bigl (E_r(x)\cap U\bigr )\) is continuous.

Define the associated maximal operator \(M_{{\mathcal {E}}}\) by

$$\begin{aligned} M_{{\mathcal {E}}} f(x)=\sup _{r>0}\frac{1}{\mu (E_r(x))}\int _{E_r(x)}|f(y)|\,d\mu (y),\quad x\in {\mathbb {R}}^d. \end{aligned}$$

Stein proved the following result. We refer to Chapter I of [30] for the proof of the following theorem and several examples of families \({\mathcal {E}}\) in \({\mathbb {R}}^d\) satisfying these properties.

Theorem 1.1

(Stein). Let f be a function defined on \({\mathbb {R}}^d\) and \({\mathcal E}=\{E_r(x):0<r<\infty ,x\in {\mathbb {R}}^d\}\) be as above. If \(f\in L^p({\mathbb {R}}^d)\), \(1\le p\le \infty \), then \(M_{{\mathcal {E}}}f\) is defined almost everywhere. Moreover, the operator \(M_{{\mathcal {E}}}\) is of weak type (1, 1) and strong type (p, p) for \(1<p\le \infty \).

This result can further be extended to other settings. For example, the underlying space \({\mathbb {R}}^d\) can be replaced by a finitely generated discrete group of polynomial growth, or a smooth compact Riemannian manifold. See p. 37 in [30] for the details.

If \({{\mathcal {E}}}\) consists of all Euclidean balls in \({\mathbb {R}}^d\), then we get the usual Hardy–Littlewood maximal operator defined in (1.1) which has the weak type (1, 1) and strong type (p, p) properties, as mentioned earlier. However, if \({{\mathcal {E}}}\) is the family of all rectangles in \({\mathbb {R}}^d\), then the corresponding maximal operator is neither of weak type (1, 1) nor of strong type (p, p) (see [17]). If the rectangles have sides parallel to the coordinate axes, then \(M_{{\mathcal {E}}}\) is not of weak type (1, 1), even though it is of strong type (p, p) for \(p>1\), see [29]. The situation does not improve even if the sets in \({{\mathcal {E}}}\) satisfy a monotone condition. For instance, Hunt provided the following example on the real line. Let \(E_N(x)=x+S_N\), where \(S_N=\bigcup _{k=N}^\infty (2^{-k},2^{-k}+2^{-2k})\). Then \(S_{N+1}\subset S_N\) and the associated maximal operator is still not of weak type (1, 1), see [26] for the details of this construction.

We now consider the boundedness of the Hardy–Littlewood maximal operator on weighted spaces. A nonnegative, locally integrable function on \({\mathbb {R}}^d\) is called a weight. Let \(1<p<\infty \). We say that a weight w is in the Muckenhoupt weight class \(A_p\) if

$$\begin{aligned}{}[w]_{A_p}=\sup _{B}\left( \frac{1}{|B|}\int _B w(x)\,dx\right) \left( \frac{1}{|B|}\int _B w(x)^{-\frac{1}{p-1}}\,dx\right) ^{p-1}<\infty , \end{aligned}$$

where the supremum is taken over all balls B in \({\mathbb {R}}^d\). The quantity \([w]_{A_p}\) is called the \(A_p\) constant or the \(A_p\) characteristic of the weight w.
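
To make the definition concrete, here is a sketch of the computation for the power weight \(w(x)=|x|^a\) on balls centred at the origin; the weight and the restriction to such balls are our illustrative assumptions. For \(B=B(0,r)\), polar coordinates give, when \(a>-d\) and \(a<d(p-1)\) respectively,

$$\begin{aligned} \frac{1}{|B|}\int _B |x|^a\,dx=\frac{d}{d+a}\,r^a,\qquad \left( \frac{1}{|B|}\int _B |x|^{-\frac{a}{p-1}}\,dx\right) ^{p-1}=\left( \frac{d}{d-\frac{a}{p-1}}\right) ^{p-1}r^{-a}. \end{aligned}$$

The product of the two factors is independent of r and finite when \(-d<a<d(p-1)\), while one of the two integrals diverges outside this range. It is a classical fact, whose proof also requires treating balls not centred at the origin, that this is precisely the range in which \(|x|^a\in A_p\).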

The weight w belongs to the weight class \(A_1\) if there exists a constant \(C>0\) such that

$$\begin{aligned} Mw(x)\le C w(x)\quad \text{ for } \text{ a.e. }~x\in {\mathbb {R}}^d. \end{aligned}$$

It follows from Hölder’s inequality that the \(A_p\) classes are increasing with respect to p. Hence, one can define the larger class \(A_\infty =\bigcup _{p>1}A_p\). The \(A_\infty \) class can also be characterized in terms of the finiteness of appropriate constants. The classical definition of the \(A_\infty \) constant is due to Hruščev [19] (see also [16]) and is obtained by taking the limit of the \(A_p\) constant as p goes to infinity:

$$\begin{aligned}{}[w]_{A_\infty }^*=\sup _B\left( \frac{1}{|B|}\int _B w(x)\,dx\right) \exp \left( \frac{1}{|B|}\int _B \log w(x)^{-1}\,dx\right) . \end{aligned}$$
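
The two facts just used can be checked directly; the following is only a sketch, and the auxiliary exponents s and t are notation introduced here. For \(1<p<q<\infty \), Jensen's inequality with exponent \(s=\frac{q-1}{p-1}>1\) gives, for every ball B,

$$\begin{aligned} \left( \frac{1}{|B|}\int _B w^{-\frac{1}{q-1}}\,dx\right) ^{q-1}\le \left( \frac{1}{|B|}\int _B w^{-\frac{1}{p-1}}\,dx\right) ^{\frac{q-1}{s}}=\left( \frac{1}{|B|}\int _B w^{-\frac{1}{p-1}}\,dx\right) ^{p-1}, \end{aligned}$$

so that \([w]_{A_q}\le [w]_{A_p}\) and \(A_p\subset A_q\). Moreover, writing \(t=\frac{1}{p-1}\), the second factor in the \(A_p\) constant is \(\bigl (\frac{1}{|B|}\int _B (w^{-1})^t\,dx\bigr )^{1/t}\). As \(t\rightarrow 0\), that is, as \(p\rightarrow \infty \), these normalized \(L^t\) means decrease to the geometric mean \(\exp \bigl (\frac{1}{|B|}\int _B\log w^{-1}\,dx\bigr )\), provided \(w^{-1}\in L^{t_0}(B)\) for some \(t_0>0\).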

However, in recent times the so-called Fujii–Wilson \(A_{\infty }\) constant has been used widely in the literature because of its flexibility in establishing sharp bounds for the norms of some important operators in harmonic analysis. This definition goes back to the characterization of the \(A_{\infty }\) class given by Fujii [15] and later rediscovered by Wilson [32, 33]. We also follow this approach and define the \(A_{\infty }\) constant as

$$\begin{aligned}{}[w]_{A_\infty }=\sup _{B}\frac{1}{w(B)}\int _B M(w\chi _B)\,d\mu , \end{aligned}$$

where \(w(B)=\int _B w\,d\mu \), and M is the uncentred Hardy–Littlewood maximal operator.

The characterization of the weak type (1, 1) and strong type (p, p), \(1<p<\infty \), for the Hardy–Littlewood maximal operator M on weighted spaces is the following.

Theorem 1.2

  1. (a)

    Let \(1<p<\infty \). Then \(w\in A_p\) if and only if

    $$\begin{aligned} \int _{{\mathbb {R}}^d} Mf(x)^p w(x)\,dx\le C\int _{{\mathbb {R}}^d}|f(x)|^p w(x)\,dx\quad \text{ for } \text{ all }~f\in L^p_w({\mathbb {R}}^d). \end{aligned}$$
    (1.4)
  2. (b)

    The weight \(w\in A_1\) if and only if

    $$\begin{aligned} w(\{x\in {\mathbb {R}}^d:Mf(x)>\lambda \})\le \frac{C}{\lambda }\int _{{\mathbb {R}}^d}|f(x)|w(x)\,dx\quad \text{ for } \text{ all }~f\in L^1_w({\mathbb {R}}^d). \end{aligned}$$
    (1.5)

Here \(L^p_w({\mathbb {R}}^d)\) denotes the space of p-integrable functions on \({\mathbb {R}}^d\) with respect to the measure \(w(x)\,dx\) for \(1\le p<\infty \). The one-dimensional result was obtained by Muckenhoupt [25] and the higher dimensional results are due to Coifman and Fefferman [9].

Theorem 1.2 was extended by Calderón [4] to spaces of homogeneous type \((\mathcal {S},d,\mu )\).

Suppose \(\mathcal {S}\) is a topological space. We recall that a quasi-metric d on \(\mathcal {S}\) is a function \(d:\mathcal {S}\times \mathcal {S}\rightarrow [0,\infty )\) satisfying the following properties:

  1. (a)

    \(d(x,y)=0\) if and only if \(x=y\);

  2. (b)

    \(d(x,y)=d(y,x)\) for all \(x,y\in \mathcal {S}\);

  3. (c)

    there exists a constant \(\kappa \ge 1\) such that \(d(x,z)\le \kappa (d(x,y)+d(y,z))\) for all \(x,y,z\in \mathcal {S}\).

Given \(x\in \mathcal {S}\) and \(r>0\), let \(B_r(x)=\{y\in \mathcal {S}:d(x,y)<r\}\) be the ball with centre x and radius r. Let \({\mathcal {B}}\) be the collection of all such balls. A set \(\mathcal {S}\) together with a quasi-metric d and a nonnegative Borel measure \(\mu \) is called a space of homogeneous type if \(\mu \) satisfies the doubling condition, that is, there exists a constant \(C>1\) such that

$$\begin{aligned} \mu (B_{2r}(x))\le C\mu (B_r(x))\quad \text{ for } \text{ all }~x\in \mathcal {S}, r>0. \end{aligned}$$

Analogous to the Euclidean case, the uncentred Hardy–Littlewood maximal operator \(\mathcal {M}\) on \(\mathcal {S}\) is defined by

$$\begin{aligned} \mathcal {M}f(x)=\sup _{x\in B\in \mathcal B}\frac{1}{\mu (B)}\int _{B}|f|\,d\mu ,\quad x\in \mathcal {S}. \end{aligned}$$

In a similar way, we also define the \(A_p\) constants of a weight w on a space of homogeneous type. We also denote the space of p-integrable functions on \(\mathcal {S}\) with respect to the measure \(w(x)\,d\mu (x)\) as \(L^p_w(\mathcal {S})\).

Calderón [4] showed that in a space of homogeneous type \(\mathcal {S}\), the results of Muckenhoupt, Coifman and Fefferman concerning the characterization of boundedness of the maximal operator in terms of the \(A_p\) weights hold provided

(\(\star \)):

the mapping \(r\rightarrow \mu (B_r(x))\) is continuous for each \(x\in \mathcal {S}\).

It is of interest to have optimal or at least good bounds for the operator norm \(\Vert M\Vert _{L^p_w({\mathbb {R}}^d)\rightarrow L^p_w({\mathbb {R}}^d)}\) in terms of the size of the constant \([w]_{A_p}\). Since \([w]_{A_p}\ge 1\), the problem is to find estimates of the form

$$\begin{aligned} \Vert M\Vert _{L^p_w({\mathbb {R}}^d)\rightarrow L^p_w({\mathbb {R}}^d)} \le C [w]_{A_p}^{\alpha (p)} \end{aligned}$$

with \(\alpha (p)\) as small as possible, where C is a constant depending only on p and the dimension d.

We remark that the results of Muckenhoupt [25], Coifman and Fefferman [9], and Calderón [4] are qualitative, in the sense that they do not provide any information about \(\alpha (p)\). Buckley [2] proved the first quantitative result on the boundedness of M by providing the best possible power dependence on the \(A_p\) constant. He proved the following result.

Theorem 1.3

Let M be the Hardy–Littlewood maximal operator defined in (1.1) and \(1<p<\infty \). Then there is a constant \(C>0\) such that

$$\begin{aligned} \Vert M\Vert _{L^p_w({\mathbb {R}}^d)\rightarrow L^p_w({\mathbb {R}}^d)}\le C [w]_{A_p}^{\frac{1}{p-1}}. \end{aligned}$$
(1.6)

This estimate is sharp in the sense that the exponent \(\frac{1}{p-1}\) cannot be replaced by any smaller quantity and hence \(\alpha (p)=\frac{1}{p-1}\). Recently, in spaces of homogeneous type, Hytönen, Pérez and Rela [20] proved the following result which shows that Buckley’s theorem can be improved further in terms of various mixed characteristics of the underlying weight w.

Theorem 1.4

Let \(\mathcal {S}\) be a space of homogeneous type and \(\mathcal {M}\) be the Hardy–Littlewood maximal operator on \(\mathcal {S}\) and let \(1<p<\infty \). Then there is a constant \(C>0\) such that

$$\begin{aligned} \Vert \mathcal {M}\Vert _{L^p_w(\mathcal {S})\rightarrow L^p_w(\mathcal {S})}\le C\bigl ([w]_{A_p}[\sigma ]_{A_{\infty }}\bigr )^{\frac{1}{p}}, \end{aligned}$$
(1.7)

where \(\sigma =w^{-\frac{1}{p-1}}\) is the dual weight of w.

We recall that if w is a weight in \(A_p\), \(1<p<\infty \), then the function \(\sigma =w^{-\frac{1}{p-1}}\) is a weight in \(A_{p'}\) and \([\sigma ]_{A_{p'}}=[w]_{A_{p}}^{\frac{1}{p-1}}\). The mixed bound in (1.7) is sharper than the estimate involving only the \(A_p\) constant in (1.6) and improves Buckley’s theorem since \([\sigma ]_{A_{\infty }}\le C [\sigma ]_{A_{p'}}=C[w]_{A_p}^\frac{1}{p-1}\), which yields (1.6).
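
For the reader's convenience, the exponent arithmetic behind this implication can be spelled out. Here \(C'\) denotes the constant in the bound \([\sigma ]_{A_{\infty }}\le C'[\sigma ]_{A_{p'}}\), a label introduced only for this display:

$$\begin{aligned} \bigl ([w]_{A_p}[\sigma ]_{A_{\infty }}\bigr )^{\frac{1}{p}}\le \bigl (C'[w]_{A_p}^{1+\frac{1}{p-1}}\bigr )^{\frac{1}{p}}=(C')^{\frac{1}{p}}[w]_{A_p}^{\frac{1}{p}\cdot \frac{p}{p-1}}=(C')^{\frac{1}{p}}[w]_{A_p}^{\frac{1}{p-1}}. \end{aligned}$$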

The above result of Hytönen, Pérez and Rela is applicable only to the maximal operator and the weights associated with a family of balls generated by a quasi-metric of a space of homogeneous type. However, there are many examples of important families of measurable sets arising in harmonic analysis and PDE which cannot be generated by some quasi-metric. Let us illustrate this situation by exhibiting two concrete examples. Some more examples are given in [10].

Example 1.1

A family of convex sets in \({\mathbb {R}}^d\) was considered by Caffarelli and Gutiérrez in [3] as follows. Let \(\phi \) be a convex smooth function on \({\mathbb {R}}^d\). For \(x\in {\mathbb {R}}^d\), let \(\ell (y)\) be a supporting hyperplane of \(\phi \) at the point \((x,\phi (x))\). Given \(x\in {\mathbb {R}}^d\) and \(r>0\), define the set \(S(x,r)=\{y\in {\mathbb {R}}^d:\phi (y)<\ell (y)+r\}\). These sets are called sections and are obtained by projecting onto \({\mathbb {R}}^d\) the points on the graph of \(\phi \) that lie below a supporting hyperplane lifted by r. Let \(\mu =\det D^2\phi \) be the Monge–Ampère measure. The family \(\mathcal {F}=\{S(x,r):x\in {\mathbb {R}}^d,r>0\}\) is related to the convex solutions of the Monge–Ampère equation.

Example 1.2

Let \(\gamma : [0,\infty )\rightarrow {\mathbb {R}}\) be a convex function that is \(C^2\) on \((0,\infty )\) and satisfies \(\gamma (0)=\gamma '(0)=0\). For \(r>0\), set \(h(r)=r\gamma '(r)-\gamma (r)\). Suppose there exists a constant \(C>0\) such that \(\frac{h(2r)}{h(r)}\le C\) for all \(r>0\). Moreover, let \(Q_0=\{(x,y)\in {\mathbb {R}}^2:|x|,|y|<1\}\) and for any \(r>0\), set \(P_r(0)=A_r Q_0\), where

$$\begin{aligned} A_r= \left( {\begin{array}{cc} r &{} 0 \\ \gamma (r)+ h(r) &{} h(r) \\ \end{array}} \right) \end{aligned}$$

Let \(\mathcal {P}=\{P_r(x)=x+P_r(0): x\in {\mathbb {R}}^2, r>0\}\) and \(\mu \) be the Lebesgue measure. The family \(\mathcal {P}\) was considered in [6] and it arises in the study of \(L^p\) estimates of the maximal operator and Hilbert transform associated with convex curves.
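
As a sanity check of the hypotheses on \(\gamma \) in Example 1.2, consider the model curve \(\gamma (r)=r^k\) with \(k\ge 2\); this particular choice is ours and serves only as an illustration. Then \(\gamma \) is convex, \(C^2\) on \((0,\infty )\), satisfies \(\gamma (0)=\gamma '(0)=0\), and

$$\begin{aligned} h(r)=r\cdot kr^{k-1}-r^k=(k-1)r^k,\qquad \frac{h(2r)}{h(r)}=2^k\quad \text{ for } \text{ all }~r>0, \end{aligned}$$

so the doubling hypothesis on h holds with \(C=2^k\).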

Thus, the following question arises naturally.

Let \(\mathcal {F}\) be a family of measurable sets which are not associated with any quasi-metric. Consider the maximal operator and the weights associated with the family \(\mathcal {F}\) in the usual way. Can we extend Theorem 1.4 for such families?

The first objective of this article is to give an affirmative answer to this question. We consider this question in a more general framework which not only covers all these families, but also gives a mixed type bound for Stein’s maximal operator (see Theorem 1.1) and improves Theorem 1.4. For the precise statement of our result, see Theorem 1.6. We first describe the setting in which we are going to work.

Observe that it makes sense to define the sets \(E_r(x)\) of Theorem 1.1 in topological spaces. In [10], Ding, Lee and Lin considered the following more general setting.

Let X be a topological space equipped with a nonnegative Borel measure \(\mu \). For each \(x\in X\) and \(r>0\), let \(E_r(x)\) be a nonempty, bounded open subset of X containing x and let \({\mathbb {E}}=\{E_r(x):0<r<\infty ,x\in X\}\). Assume that the family \({\mathbb {E}}\) and the measure \(\mu \) satisfy the following conditions:

  1. (A)

    \(\bigcup _{r>0}E_r(x)=X\);

  2. (B)

    \(\bigcap _{r>0}E_r(x)=\{x\}\);

  3. (C)

    \(E_r(x)\subset E_s(x)\) if \(0<r\le s\);

  4. (D)

    for all \(x\in X\) and \(r>0\), we have \(0<\mu (E_r(x))<\infty \), and \(\mu \) satisfies a doubling condition, i.e., there exists a constant \(C_\mu >1\) such that

    $$\begin{aligned} \mu (E_{2r}(x))\le C_\mu \,\mu (E_r(x))~\text{ for } \text{ all }~x\in X~\text{ and }~E_r(x)\in {\mathbb {E}}; \end{aligned}$$
    (1.8)
  5. (E)

    for each open set U and \(r>0\), the function \(x\rightarrow \mu (E_r(x)\cap U)\) is continuous;

  6. (F)

    there exists a constant \(\theta >1\) such that for all \(E_r(x)\in {\mathbb {E}}\), \(y\in E_r(x)\) implies \(E_r(x)\subset E_{\theta r}(y)\) and \(E_r(y)\subset E_{\theta r}(x)\);

  7. (G)

    the mapping \(r\rightarrow \mu (E_r(x))\) is continuous for each \(x\in X\).

It is easy to verify that (F) is equivalent to the following condition:

(F\('\)):

there exists a constant \(\theta >1\) such that for all \(x,y\in X\) and \(r>0\), \(E_r(x)\cap E_r(y)\not =\emptyset \) implies \(E_r(y)\subset E_{\theta r}(x)\).
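
A sketch of this verification is as follows; note that in this argument the constant may change from \(\theta \) to \(\theta ^2\) in one direction, which is harmless. Assume (F\('\)) and let \(y\in E_r(x)\). Since \(y\in E_r(y)\), the sets \(E_r(x)\) and \(E_r(y)\) intersect, so (F\('\)) gives \(E_r(y)\subset E_{\theta r}(x)\), and the same condition with the roles of x and y exchanged gives \(E_r(x)\subset E_{\theta r}(y)\). Conversely, assume (F) and let \(z\in E_r(x)\cap E_r(y)\). Applying (F) to \(z\in E_r(y)\), and then to \(z\in E_r(x)\subset E_{\theta r}(x)\) (by (C)) at radius \(\theta r\), we obtain

$$\begin{aligned} E_r(y)\subset E_{\theta r}(z)\subset E_{\theta ^2 r}(x). \end{aligned}$$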

The maximal operator on X associated with the family \({\mathbb {E}}\) is defined by

$$\begin{aligned} M_{\mathbb {E}}f(x)=\sup _{E_r(x)\in {\mathbb {E}}}\frac{1}{\mu (E_r(x))}\int _{E_r(x)}|f(y)|\,d\mu (y),\quad x\in X. \end{aligned}$$
(1.9)

The Muckenhoupt weights can also be defined analogously in this setting. Let \(1<p<\infty \). A nonnegative locally integrable function w on X is said to be in the weight class \(A_{p,{\mathbb {E}}}\) if

$$\begin{aligned}{}[w]_{A_{p,{\mathbb {E}}}}=\sup _{E\in {\mathbb {E}}}\Bigl (\frac{1}{\mu (E)}\int _E w\,d\mu \Bigr )\Bigl (\frac{1}{\mu (E)}\int _E w^{-\frac{1}{p-1}}\,d\mu \Bigr )^{p-1}<\infty . \end{aligned}$$

The \(A_{\infty ,{\mathbb {E}}}\) constant of w is defined by

$$\begin{aligned}{}[w]_{A_{\infty ,{\mathbb {E}}}}=\sup _{E\in {\mathbb {E}}}\frac{1}{w(E)}\int _E M_{\mathbb {E}}(w\chi _E)\,d\mu , \end{aligned}$$

where \(w(E)=\int _E w\,d\mu \).

Observe that if \({\mathbb {E}}\) is the set of all balls (or cubes with sides parallel to the coordinate axes) and \(\mu \) is the Lebesgue measure on \({\mathbb {R}}^d\), then the \(A_{p,{\mathbb {E}}}\) weights are the usual Muckenhoupt \(A_p\) weights. In [10], the authors obtained the following result.

Theorem 1.5

Let \(1<p<\infty \). Then \(w\in A_{p,{\mathbb {E}}}\) if and only if the maximal operator \(M_{{\mathbb {E}}}\) satisfies the strong type (p, p) inequality

$$\begin{aligned} \Vert M_{{\mathbb {E}}}f\Vert _{L_{w}^p(X)}\le C\Vert f\Vert _{L_{w}^p(X)},\quad f\in L_{w}^p(X), \end{aligned}$$

for some constant \(C>0\).

Here \(L^p_w(X)\) denotes the space of p-integrable functions on X with respect to the measure \(w(x)\,d\mu (x)\).

The first goal of this article is to study the quantitative aspect of the constant C in Theorem 1.5, i.e., the dependence of the norm of the operator \(M_{{\mathbb {E}}}\) on mixed characteristics of the underlying weight \(w\in A_{p,{\mathbb {E}}}\). We obtain sharp quantitative norm estimates of \(M_{{\mathbb {E}}}\) in the spirit of Theorem 1.4 and our result is the following.

Theorem 1.6

Let \(M_{{\mathbb {E}}}\) be the maximal operator defined on X and \(w\in A_{p,{\mathbb {E}}}\), \(1<p<\infty \). Then there exists a constant \(C>0\) such that

$$\begin{aligned} \Vert M_{{\mathbb {E}}}f\Vert _{L_w^p(X)}\le C\Bigl ([w]_{A_{p,{\mathbb {E}}}}[\sigma ]_{A_{\infty ,{\mathbb {E}}}}\Bigr )^{\frac{1}{p}}\Vert f\Vert _{L_w^p(X)}, \end{aligned}$$
(1.10)

where \(\sigma =w^{-\frac{1}{p-1}}\) is the dual weight of w.

We would like to mention that in order to prove the above theorem, we do not require the theory of \(A_{p,{\mathbb {E}}}\) weights established in [10]. We will make use of basic tools such as a Vitali type covering lemma (Lemma 2.1) and the Lebesgue differentiation theorem (Corollary 2.3) described in the next section. One of the major difficulties arising in our setting is the absence of geometry. Instead, we work directly with the basic assumptions (A)–(G). We also refer the reader to [28] for a variant of Theorem 1.4 on a locally compact abelian group having a covering family as defined in [12].

In [10], the authors proved the analogue of the well-known open property of \(A_p\) weights: if \(w\in A_p\) for some \(p>1\), then w also belongs to \(A_{p-\delta }\) for some \(\delta >0\). For our purpose we need some quantitative information about \(\delta \). To achieve this, we prove the following sharp version of the reverse Hölder inequality for weights in \(A_{\infty ,{\mathbb {E}}}\) with a precise quantitative expression for the exponent. The price we pay for this is that the inequality is in a weak form.

Theorem 1.7

(Sharp weak reverse Hölder inequality). If \(w\in A_{\infty ,{\mathbb {E}}}\), then

$$\begin{aligned} \frac{1}{\mu (E)}\int _Ew^{1+\epsilon }\,d\mu \le 4C_\mu \theta ^{2\alpha }\Bigl (\frac{1}{\mu (\widehat{E})}\int _{\widehat{E}}w\,d\mu \Bigr )^{1+\epsilon }\quad \text{ for } \text{ all }~E\in {\mathbb {E}}, \end{aligned}$$

where \(\epsilon =\frac{1}{2[w]_{A_{\infty ,{\mathbb {E}}}}C-1}\) and \(C=2(2C_\mu )^4\theta ^{8\alpha }4^\alpha \).

The constant \(\alpha \) that appears above is defined in (2.1). The above estimate is a weaker version of the reverse Hölder inequality in the sense that the set on the right of the inequality is an enlargement of the set on the left. The set \(\widehat{E}\) is defined in (2.6). In the case of spaces of homogeneous type, it is a dilation of E. As an application of Theorem 1.7, we appropriately quantify the \(\delta \) associated with the open property of \(A_p\) weights. Finally, using this precise open property of \(A_p\) weights, together with an interpolation type argument, we obtain the desired mixed bound in (1.10).

Another fundamental generalization of the maximal function inequality (1.3) is due to Fefferman and Stein [13]. In their pioneering work, they proved that the following two-weights inequality

$$\begin{aligned} w(\{x\in {\mathbb {R}}^d:Mf(x)>\lambda \})\le \frac{C}{\lambda }\int _{{\mathbb {R}}^d}|f(x)|Mw(x)\,dx \end{aligned}$$
(1.11)

holds for all nonnegative functions f and w.

In the literature, this inequality is popularly known as the endpoint Fefferman–Stein weighted inequality and is interesting for several reasons. The first of them is that it was a precursor of the weighted theory of Muckenhoupt and gives an improvement of the inequality (1.5). Note that if \(w\in A_1\), then inequality (1.5) readily follows from (1.11). In [13], Fefferman and Stein exploited this inequality to derive the vector-valued analogue of maximal inequalities (1.2) and (1.3) and applied these to obtain certain estimates for Marcinkiewicz integrals. If \(f=(f_1,f_2,\dots )\) is a sequence of functions on \({\mathbb {R}}^d\), \(Mf=(Mf_1,Mf_2,\dots )\) and \(\Vert f(x)\Vert _{\ell ^r}=(\sum _{k=1}^\infty |f_k(x)|^r)^\frac{1}{r}\), \(1<r<\infty \), then their result is the following.

Theorem 1.8

Let \(1<p<\infty \). Then there exists a constant \(C_{r,p}>0\) such that

$$\begin{aligned} \int _{{\mathbb {R}}^d}\Vert Mf(x)\Vert _{\ell ^r}^p\,dx\le C_{r,p}\int _{{\mathbb {R}}^d}\Vert f(x)\Vert _{\ell ^r}^p\,dx. \end{aligned}$$

Moreover, the following weak type (1, 1) estimate holds: there exists a constant \(C>0\) such that

$$\begin{aligned} |\{x\in {\mathbb {R}}^d:\Vert Mf(x)\Vert _{\ell ^r}>\lambda \}|\le \frac{C}{\lambda }\int _{{\mathbb {R}}^d}\Vert f(x)\Vert _{\ell ^r}\,dx\quad \text{ for } \text{ all }~\lambda >0. \end{aligned}$$

Note that if we put \(f_1=f,f_2=f_3=\dots =0\), then we obtain the inequalities (1.2) and (1.3).

This is a very deep theorem and has been generalized and used in many different contexts in modern harmonic analysis explaining the central role of the inequality (1.11). Another objective of this article is to extend the endpoint Fefferman–Stein weighted inequality (1.11) and Theorem 1.8 to our setting. Prior to stating our results, we indicate some difficulties we encounter in this setting.

There are many different proofs of Fefferman–Stein weighted inequality available in the literature in the standard case of \({\mathbb {R}}^d\). In the original paper [13], the authors first proved the inequality (1.11) for the dyadic maximal operator \(M_d\). Then a pointwise inequality associating \(M_d\) with the truncated maximal operators allowed them to obtain the inequality (1.11) for the truncated maximal operators by an application of Minkowski’s integral inequality. Finally, using the monotone convergence theorem, they extended those inequalities for the ordinary maximal operator.

There is another proof in [16] which is based on the following observation. At each level \(\lambda >0\), the set \(\{x\in {\mathbb {R}}^d:Mf(x)>\lambda \}\) can be covered by a countable union of dilated dyadic cubes with some control on the average of f on those dyadic cubes. By reprising the classical argument for proving the inequality (1.3) as mentioned in [30], this proof avoids the use of dyadic cubes. However, it relies on the Vitali covering lemma and regularity of the Lebesgue measure.

In our situation, we lack the concept of dyadic cubes and we do not assume any regularity on the measure \(\mu \). Consequently, it seems difficult to adapt the above approaches to our setting to obtain an analogue of (1.11). To overcome this problem, we prove the following covering lemma in our setting.

Lemma 1.9

Let \(\mathcal {F}=\{E_{r_{\alpha }}(x_{\alpha }): x_{\alpha }\in X,\alpha \in \Lambda \}\) be a family in \({\mathbb {E}}\) and \(\Sigma =\bigcup _{\alpha \in \Lambda }E_{r_{\alpha }}(x_{\alpha })\) such that \(\mu (\Sigma )<\infty \). Then there exists a disjoint countable subfamily \(\{E_{r_{i}}(x_{i})\}\subset \mathcal {F}\) satisfying the property: for any \(E_{r_{\alpha }}(x_{\alpha })\in \mathcal {F}\), there is an \(E_{r_{i}}(x_{i})\) such that \(E_{r_{\alpha }}(x_{\alpha })\subset E_{\theta ^3 r_{i}}(x_{i})\), where \(\theta \) is the constant given in condition (F).

As an application of this lemma, we shall derive an analogue of (1.11) in our setting as follows.

Theorem 1.10

There exists a constant \(C>0\) such that

$$\begin{aligned} w(\{x\in X:M_{{\mathbb {E}}}f(x)>\lambda \})\le \frac{C}{\lambda }\int _{X}|f(x)|M_{{\mathbb {E}}}w(x)\,d\mu (x),\quad \lambda >0, \end{aligned}$$
(1.12)

for all measurable functions f and w.

The \(L^p\) version of the endpoint Fefferman–Stein estimate (1.12) is the following inequality. Let \(1<p<\infty \). Then

$$\begin{aligned} \int _X M_{{\mathbb {E}}}f(x)^p w(x)\,d\mu (x)\le C_p\int _X |f(x)|^p M_{{\mathbb {E}}}w(x)\,d\mu (x) \end{aligned}$$
(1.13)

for all measurable functions f and w. This estimate is a simple consequence of the fact

$$\begin{aligned} \Vert M_{{\mathbb {E}}}f\Vert _{L^{\infty }_w(X)}\le C\Vert f\Vert _{L^{\infty }_{M_{{\mathbb {E}}}w}(X)} \end{aligned}$$

combined with (1.12) and the Marcinkiewicz interpolation theorem.
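
For orientation, the interpolation step can be sketched as follows. This is the standard argument rather than a reproduction of the proof given later, and the labels \(C_1\) (the constant in (1.12), read with the factor \(\lambda ^{-1}\) on the right as in (1.11)) and \(C_\infty \) (the constant in the \(L^\infty \) bound) are introduced only here. For \(\lambda >0\), put \(f^\lambda =f\chi _{\{|f|>\lambda /(2C_\infty )\}}\). Then \(|f-f^\lambda |\le \lambda /(2C_\infty )\), so \(M_{{\mathbb {E}}}(f-f^\lambda )\le \lambda /2\) almost everywhere with respect to \(w\,d\mu \), and by sublinearity \(\{M_{{\mathbb {E}}}f>\lambda \}\subset \{M_{{\mathbb {E}}}f^\lambda >\lambda /2\}\) up to a null set. Using (1.14) for the measure \(w\,d\mu \), assumed \(\sigma \)-finite, and Tonelli's theorem,

$$\begin{aligned} \int _X M_{{\mathbb {E}}}f(x)^p w(x)\,d\mu (x)&\le \int _0^\infty p\lambda ^{p-1}\,\frac{2C_1}{\lambda }\int _{\{|f|>\lambda /(2C_\infty )\}}|f(x)|M_{{\mathbb {E}}}w(x)\,d\mu (x)\,d\lambda \\&=2C_1p\int _X|f(x)|M_{{\mathbb {E}}}w(x)\int _0^{2C_\infty |f(x)|}\lambda ^{p-2}\,d\lambda \,d\mu (x)\\&=\frac{2C_1p\,(2C_\infty )^{p-1}}{p-1}\int _X|f(x)|^pM_{{\mathbb {E}}}w(x)\,d\mu (x). \end{aligned}$$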

At this point, we take the opportunity to mention some related works on endpoint Fefferman–Stein inequalities in a variety of contexts. In the setting of the spaces of homogeneous type, Aimar, Bernardis and Nowak [1] proved a dyadic version of (1.12) following the proof given in [13]. Luque and Parissis [24] derived similar inequalities for the strong maximal function on the Euclidean spaces. In a recent work, Ombrosi, Rivera-Ríos and Safe [27] proved an analogue of (1.12) on the infinite rooted k-ary tree.

Finally, we provide an extension of Theorem 1.8 to our setting.

Theorem 1.11

Suppose \(1<p,q<\infty \). Let \(f=\{f_i\}_i\) be a sequence of measurable functions on X. Then, we have the strong type (p, p) inequality

$$\begin{aligned} \int _X\Bigl (\sum _i M_{{\mathbb {E}}}f_i(x)^q\Bigr )^{\frac{p}{q}}\,d\mu (x)\le C\int _X\Bigl (\sum _i|f_i(x)|^q\Bigr )^{\frac{p}{q}}\,d\mu (x). \end{aligned}$$

Moreover, we have the weak type (1, 1) inequality

$$\begin{aligned} \mu \Bigl (\Bigl \{x\in X:\Bigl (\sum _i M_{{\mathbb {E}}}f_i(x)^q\Bigr )^{\frac{1}{q}}>\lambda \Bigr \}\Bigr )\le \frac{C}{\lambda }\int _X\Bigl (\sum _i|f_i(x)|^q\Bigr )^{\frac{1}{q}}\,d\mu (x). \end{aligned}$$

Throughout the article we use the following notation. For \(1\le p\le \infty \), \(p'\) denotes the conjugate exponent of p defined by the condition \(\frac{1}{p}+\frac{1}{p'}=1\). For a measurable subset S of X and a measurable function f on X, we will use the notation \(f(S)=\int _S f\,d\mu \). We will also write \(f_S\) to denote the \(\mu \)-average of f over S, i.e.,

$$\begin{aligned} f_S=\frac{1}{\mu (S)}\int _S f\,d\mu . \end{aligned}$$

For \(1\le p<\infty \), the spaces \(L^p(X)\) and \(L^p_w(X)\) denote the p-integrable functions on X with respect to the measures \(\mu \) and \(w(x)\,d\mu (x)\), respectively.

For a measurable function f on a measure space \((X,\mu )\), the distribution function of f is the function \(d_f\) defined on \([0,\infty )\) as

$$\begin{aligned} d_f(\lambda )=\mu (\{x\in X:|f(x)|>\lambda \}). \end{aligned}$$

The set \(\{x\in X:|f(x)|>\lambda \}\) will be called the distribution set of f and will be denoted by \(\mathcal {D}_{f}(\lambda )\). That is,

$$\begin{aligned} \mathcal {D}_{f}(\lambda )=\{x\in X:|f(x)|>\lambda \}. \end{aligned}$$

On many occasions, we will need to evaluate the \(L^p\)-norm of a function precisely. The following formula, which computes the \(L^p\)-norm of f in terms of its distribution function \(d_f\), is very helpful and will be used several times.

Let \((X,\mu )\) be a \(\sigma \)-finite measure space. Then for all \(p>0\), we have

$$\begin{aligned} \int _X |f(x)|^p\,d\mu (x)&=\int _0^\infty p\lambda ^{p-1}d_f(\lambda )\,d\lambda \\&=\int _0^\infty p\lambda ^{p-1}\mu (\{x\in X:|f(x)|>\lambda \})\,d\lambda . \end{aligned}$$
(1.14)
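
The formula (1.14) follows from a short computation, sketched here for completeness. For each \(x\in X\),

$$\begin{aligned} |f(x)|^p=\int _0^{|f(x)|}p\lambda ^{p-1}\,d\lambda =\int _0^\infty p\lambda ^{p-1}\chi _{\{|f|>\lambda \}}(x)\,d\lambda . \end{aligned}$$

Integrating in x with respect to \(\mu \) and interchanging the order of integration, which is permitted by Tonelli's theorem since \(\mu \) is \(\sigma \)-finite and the integrand is nonnegative, gives (1.14).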

The article is organized as follows. We begin Sect. 2 by describing the setting in which we will prove the results of this article. We also prove a quantitative version of weighted weak type (p, p) estimates (Lemma 2.4) for the maximal operator \(M_{{\mathbb {E}}}\). Furthermore, we introduce the concept of enlargement of an open set \(E\in {\mathbb {E}}\) and the associated local maximal function \(M_E\) and prove some results. This includes an important covering lemma (Lemma 2.7) and a localization property of the maximal operator \(M_E\) which will be crucial to prove the sharp form of the reverse Hölder inequality. Section 3 contains the proofs of all the results described above, namely, Theorem 1.6 about the sharp mixed bound for \(M_{{\mathbb {E}}}\), Theorem 1.7 about the sharp reverse Hölder inequality for \(A_{\infty ,{\mathbb {E}}}\) weights, a covering lemma (Lemma 1.9), the Fefferman–Stein weighted inequality (Theorem 1.10), and finally Theorem 1.11 regarding the vector-valued inequalities for the maximal operator \(M_{\mathbb {E}}\).

2 Preliminary Results

Let X be a topological space equipped with a nonnegative Borel measure \(\mu \). Let \({\mathbb {E}}=\{E_r(x):r>0,x\in X\}\) be a family of open subsets of X, where x is an interior point of \(E_r(x)\). We assume that \({\mathbb {E}}\) and \(\mu \) satisfy the conditions (A)–(G) described in the previous section.

It is easy to verify that there exists \(\alpha >1\) such that for all \(a>1\), we have

$$\begin{aligned} \mu (E_{ar}(x))\le 2C_\mu a^\alpha \mu (E_{r}(x)), \quad x\in X, r>0. \end{aligned}$$
(2.1)

Indeed, since \(a>1\) and \(C_\mu >1\), we can find nonnegative integers j and k such that \(2^j<a\le 2^{j+1}\) and \(2^k<C_\mu \le 2^{k+1}\). Using property (C) and the doubling condition (1.8), we get

$$\begin{aligned} \mu (E_{ar}(x))&\le \mu (E_{2^{j+1}r}(x))\le C_\mu ^{j+1}\mu (E_r(x)) \\&\le 2^{(j+1)(k+1)}\mu (E_r(x))\le 2C_\mu a^\alpha \mu (E_{r}(x)), \end{aligned}$$

where \(\alpha =\log _2C_\mu +1>1\) is independent of a.
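
The last inequality in this chain can be unpacked as follows; this is only a sketch of the arithmetic, using \(2^k<C_\mu \), \(2^j<a\), \(k+1<\log _2C_\mu +1=\alpha \) and \(a>1\):

$$\begin{aligned} 2^{(j+1)(k+1)}=2^{k+1}\bigl (2^{j}\bigr )^{k+1}\le 2C_\mu \,a^{k+1}\le 2C_\mu \,a^{\alpha }. \end{aligned}$$

For instance, for Lebesgue measure and Euclidean balls in \({\mathbb {R}}^d\) one may take \(C_\mu =2^d\), which gives \(\alpha =d+1\). The true growth is \(a^d\), so (2.1) is not sharp in this case, but it has the uniform form needed in the sequel.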

We shall make use of the following Vitali type covering lemma in the sequel.

Lemma 2.1

(Lemma 2.2, [10]). For \(0<r_0<\infty \), \(X_0\subset X\), let

$$\begin{aligned} {\mathcal {F}}=\{E_r(x)\in {\mathbb {E}}:0<r\le r_0,x\in X_0\}. \end{aligned}$$

Then there exists a disjoint countable subfamily \(\{E_{r_i}(x_i):i\in J\}\subset {\mathcal {F}}\) satisfying the following property: for any \(E_r(x)\in {\mathcal {F}}\), there exists an \(E_{r_i}(x_i)\) such that \(E_r(x)\subset E_{4\theta ^4r_i}(x_i)\).

An immediate consequence of this lemma is the \(L^p\)-mapping properties of \(M_{{\mathbb {E}}}\) which are also extensions of the inequalities (1.2) and (1.3) in this setting. More precisely, we have the following theorem.

Theorem 2.2

(Theorem 2.4, [10]). Let \(M_{{\mathbb {E}}}\) be the maximal operator on X associated with the family \({\mathbb {E}}\) defined in (1.9). Then

  1. (a)

    \(M_{{\mathbb {E}}}\) is bounded from \(L^1(X)\) to weak \(L^1(X)\);

  2. (b)

    \(M_{{\mathbb {E}}}\) is bounded on \(L^p(X)\) for \(1 < p \le \infty \).

As a corollary of the above maximal theorem, we have the Lebesgue differentiation theorem. We will need this result later, so we state it here for easy reference.

Corollary 2.3

Let f be a locally integrable function on X. Then

$$\begin{aligned} \lim _{r\rightarrow 0}\frac{1}{\mu (E_r(x))}\int _{E_r(x)} f(y)\,d\mu (y)=f(x)\quad \text{ for } \text{ a.e. }\,x\in X, \end{aligned}$$

where \(E_r(x)\in {\mathbb {E}}\).

In [10], the authors developed an analogue of the classical theory of \(A_p\) weights for the family \({\mathbb {E}}\) of open subsets in X with the underlying measure \(\mu \) as above. In particular, they proved that the class of weights for which the operator \(M_{{\mathbb {E}}}\) acts boundedly from \(L^p_w(X)\) to weak \(L^p_w(X)\) is precisely the Muckenhoupt \(A_{p, {\mathbb {E}}}\) class, \(1\le p<\infty \). However, no information was obtained regarding the weak norm of \(M_{{\mathbb {E}}}\) in terms of the \(A_{p,{\mathbb {E}}}\) characteristic of the weight w. In the sequel, we will need a quantitative estimate of this norm, which we now state and prove.

Lemma 2.4

Let \(1<p<\infty \) and \(w\in A_{p,{\mathbb {E}}}\). Then the maximal operator \(M_{{\mathbb {E}}}\) satisfies the following weak type (p, p) inequality:

$$\begin{aligned} w(\{x\in X:M_{{\mathbb {E}}}f(x)>\lambda \})\le \frac{(2C_\mu (4\theta ^4)^\alpha )^p}{\lambda ^p}[w]_{A_{p,{\mathbb {E}}}}\Vert f\Vert _{L^p_w(X)}^p,\quad \lambda >0. \nonumber \\ \end{aligned}$$
(2.2)

Proof

Let \(\lambda >0\) and \(f\in L^p_w(X)\) be given. By a simple application of Hölder’s inequality, it is easy to check that f is locally integrable and therefore \(M_{{\mathbb {E}}}f\) makes sense. Let \(r_0>0\) be fixed and define the following truncated (uncentred) maximal operator \(\widetilde{M}_{{\mathbb {E}},r_0}\) by

$$\begin{aligned} \widetilde{M}_{{\mathbb {E}},r_0}f(x)=\sup _{\begin{array}{c} x\in E_r(z)\in {\mathbb {E}}\\ r\le r_0 \end{array}}\frac{1}{\mu (E_r(z))}\int _{E_r(z)}|f(y)|\,d\mu (y). \end{aligned}$$
(2.3)

We consider the corresponding distribution set

$$\begin{aligned} \mathcal {D}_{\widetilde{M}_{{\mathbb {E}},r_0}f}(\lambda )=\{x\in X:\widetilde{M}_{{\mathbb {E}},r_0}f(x)>\lambda \}. \end{aligned}$$

It is easy to see that the family \(\{\mathcal {D}_{\widetilde{M}_{{\mathbb {E}},r_0}f}(\lambda ):r_0>0\}\) is increasing in \(r_0\) and its limit is \(\mathcal {D}_{M_{{\mathbb {E}}}f}(\lambda )=\{x\in X:M_{{\mathbb {E}}}f(x)>\lambda \}\). Therefore, in order to establish (2.2), it will be enough to prove that the inequality

$$\begin{aligned} w(\mathcal {D}_{\widetilde{M}_{{\mathbb {E}},r_0}f}(\lambda ))\le \frac{C}{\lambda ^p}[w]_{A_{p,{\mathbb {E}}}}\Vert f\Vert _{L^p_w(X)}^p \end{aligned}$$
(2.4)

holds, where the constant C is independent of \(r_0\) and \(\lambda \).

First of all, we note that

$$\begin{aligned} \mathcal {D}_{\widetilde{M}_{{\mathbb {E}},r_0}f}(\lambda )\subseteq \bigcup _{E_r(z)\in {\mathcal {F}}}E_r(z), \end{aligned}$$

where \({\mathcal {F}}\subset {\mathbb {E}}\) is the family of all sets \(E_r(z)\) with \(r\le r_0\) for which

$$\begin{aligned} \frac{1}{\mu (E_r(z))}\int _{E_r(z)} |f|\,d\mu >\lambda . \end{aligned}$$

Thus, by Lemma 2.1, there exists a disjoint family \(\{E_{r_i}(x_i)\}_i\) in \(\mathcal {F}\) such that \(\mathcal {D}_{\widetilde{M}_{{\mathbb {E}},r_0}f}(\lambda )\subseteq \bigcup _i E_{4\theta ^4r_i}(x_i)\) and

$$\begin{aligned} \frac{1}{\mu (E_{r_i}(x_i))}\int _{E_{r_i}(x_i)}|f|\,d\mu >\lambda \quad \text{ for } \text{ all }~i. \end{aligned}$$

Using these facts, along with (2.1), we obtain

$$\begin{aligned} \lambda ^p\,w(\mathcal {D}_{\widetilde{M}_{{\mathbb {E}},r_0}f}(\lambda ))\le & {} \sum _i\lambda ^pw(E_{4\theta ^4r_i}(x_i))\nonumber \\\le & {} \sum _iw(E_{4\theta ^4r_i}(x_i))\left( \frac{2C_\mu (4\theta ^4) ^\alpha }{\mu (E_{4\theta ^4r_i}(x_i))}\int _{E_{r_i}(x_i)}|f(y)|\,d\mu (y)\right) ^p. \nonumber \\ \end{aligned}$$
(2.5)

Now we estimate each term in this sum. By Hölder’s inequality and the \(A_{p,{\mathbb {E}}}\) condition, we get

$$\begin{aligned}{} & {} { \frac{w(E_{4\theta ^4r_i}(x_i))}{\mu (E_{4\theta ^4r_i}(x_i))^p}\Bigl (\int _{E_{r_i}(x_i)}|f(y)|\,d\mu (y)\Bigr )^p } \\{} & {} \quad \le \frac{w(E_{4\theta ^4r_i}(x_i))}{\mu (E_{4\theta ^4r_i}(x_i))^p}\Bigl (\int _{E_{r_i}(x_i)}|f|^pw\,d\mu \Bigr ) \Bigl (\int _{E_{4\theta ^4r_i}(x_i)}w^{1-p'}\,d\mu \Bigr )^{p-1}\\{} & {} \quad = \Bigl (\frac{1}{\mu (E_{4\theta ^4r_i}(x_i))}\int _{E_{4\theta ^4r_i}(x_i)}w\,d\mu \Bigr )\\{} & {} \qquad \times \Bigl (\frac{1}{\mu (E_{4\theta ^4r_i}(x_i))}\int _{E_{4\theta ^4r_i}(x_i)}w^{1-p'}\,d\mu \Bigr )^{p-1}\int _{E_{r_i}(x_i)}|f|^pw\,d\mu \nonumber \\{} & {} \quad \le [w]_{A_{p,{\mathbb {E}}}}\int _{E_{r_i}(x_i)}|f|^pw\,d\mu . \end{aligned}$$
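
The first inequality above is Hölder's inequality with exponents p and \(p'\), applied to the factorisation \(|f|=|f|w^{1/p}\cdot w^{-1/p}\). As a sketch of that step, using \(-p'/p=1-p'\), \(p/p'=p-1\) and the inclusion \(E_{r_i}(x_i)\subset E_{4\theta ^4r_i}(x_i)\):

$$\begin{aligned} \Bigl (\int _{E_{r_i}(x_i)}|f|\,d\mu \Bigr )^p &\le \Bigl (\int _{E_{r_i}(x_i)}|f|^pw\,d\mu \Bigr )\Bigl (\int _{E_{r_i}(x_i)}w^{-p'/p}\,d\mu \Bigr )^{p/p'} \\ &\le \Bigl (\int _{E_{r_i}(x_i)}|f|^pw\,d\mu \Bigr )\Bigl (\int _{E_{4\theta ^4r_i}(x_i)}w^{1-p'}\,d\mu \Bigr )^{p-1}. \end{aligned}$$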

Substituting in (2.5) and using the fact that \(\{E_{r_i} (x_i)\}_i\) is a disjoint family, we get our desired inequality

$$\begin{aligned} w(\mathcal {D}_{\widetilde{M}_{{\mathbb {E}},r_0}f}(\lambda ))\le \frac{(2C_\mu (4\theta ^4)^\alpha )^p}{\lambda ^p}[w]_{A_{p,{\mathbb {E}}}}\Vert f\Vert _{L^p_w(X)}^p. \end{aligned}$$

This proves (2.4) and the proof of the lemma is complete. \(\square \)
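
The passage from (2.4) to (2.2) is a routine limiting step. As a sketch, assuming the monotone exhaustion stated before (2.4) and taking \(r_0=n\in {\mathbb {N}}\), continuity from below of the measure \(w\,d\mu \) gives

$$\begin{aligned} w(\mathcal {D}_{M_{{\mathbb {E}}}f}(\lambda ))=\lim _{n\rightarrow \infty }w(\mathcal {D}_{\widetilde{M}_{{\mathbb {E}},n}f}(\lambda ))\le \frac{(2C_\mu (4\theta ^4)^\alpha )^p}{\lambda ^p}[w]_{A_{p,{\mathbb {E}}}}\Vert f\Vert _{L^p_w(X)}^p, \end{aligned}$$

since the bound on the right does not depend on n.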

We will also need the following Calderón–Zygmund decomposition of an integrable function. For a proof, we refer to [11].

Lemma 2.5

(Calderón–Zygmund decomposition). Let \(f\in L^1(X)\) and \(\lambda >0\). There exists a collection of pairwise disjoint open sets \(\{E_{r_i}(x_i):i\in J\}\) such that

  1. (a)

    \(\lambda <\frac{1}{\mu (E_{r_i}(x_i))}\int _{E_{r_i}(x_i)}|f|\,d\mu \le 2C_\mu \theta ^\alpha \lambda \quad \text{ for } \text{ all }~i\in J\),

  2. (b)

    \(\sum _{i\in J}\mu (E_{r_i}(x_i))\le \frac{\Vert f\Vert _1}{\lambda }\),

  3. (c)

    \(|f(x)|\le \lambda \) if \(x\not \in \bigcup _{i\in J}E_{r_i}(x_i)\).

To prove the sharp reverse Hölder inequality, we need to enlarge the sets \(E_r(x)\). We first define this concept of enlargement and study some basic properties.

Let \(E=E_{r_0}(x_0)\) be a fixed open set in \({\mathbb {E}}\). Define

$$\begin{aligned} {{\mathcal {B}}}_E=\{E_r(y):y\in E,r\le r_0\} \end{aligned}$$

and

$$\begin{aligned} \widehat{E}=\bigcup _{E_r(y)\in {{\mathcal {B}}}_E}E_r(y). \end{aligned}$$
(2.6)

Lemma 2.6

Let \(E=E_{r_0}(x_0)\) be a fixed open set in \({\mathbb {E}}\).

  1. (a)

    If \(F\in {{\mathcal {B}}}_E\), then \(F\subset E_{\theta r_0}(x_0)\).

  2. (b)

    For any \(z\in E\), we have \(\widehat{E}\subset E_{\theta ^2r_0}(z)\). This implies

    $$\begin{aligned} \mu (\widehat{E})\le \mu (E_{\theta ^2r_0}(z))\le 2C_\mu \theta ^{2\alpha }\mu (E_{r_0}(z)). \end{aligned}$$

Proof

(a) If \(F\in {{\mathcal {B}}}_E\), then \(F=E_r(x)\) for some \(x\in E_{r_0}(x_0)\) and \(r\le r_0\). Hence, \(x\in E_{r_0}(x)\cap E_{r_0}(x_0)\) so that \(E_{r_0}(x)\subset E_{\theta r_0}(x_0)\), by property (F\('\)). Therefore, \(F=E_r(x)\subset E_{r_0}(x)\subset E_{\theta r_0}(x_0)\).

(b) Let \(F\in {{\mathcal {B}}}_E\) so that \(F=E_r(x)\), \(x\in E_{r_0}(x_0)\), \(r\le r_0\). By (a), \(F=E_r(x)\subset E_{\theta r_0}(x_0)\). Now, if \(z\in E_{r_0}(x_0)\), then \(z\in E_{\theta r_0}(x_0)\). Also, \(z\in E_{\theta r_0}(z)\). Therefore, \(z\in E_{\theta r_0}(x_0)\cap E_{\theta r_0}(z)\). Hence, \( E_{\theta r_0}(x_0)\subset E_{\theta ^2 r_0}(z)\). Thus, \(F=E_r(x)\subset E_{\theta r_0}(x_0)\subset E_{\theta ^2 r_0}(z)\). This shows that \(\widehat{E}\subset E_{\theta ^2r_0}(z)\). The last assertion follows from (2.1). \(\square \)

Given a fixed open set \(E=E_{r_0}(x_0)\) in \({\mathbb {E}}\), we define the local maximal function relative to E as follows:

$$\begin{aligned} M_E f(y)= \left\{ \begin{array}{lll} \sup \limits _{F\in {{\mathcal {B}}}_E, y\in F}\frac{1}{\mu (F)}\int _F|f(z)|\,d\mu (z), &{} \text{ if }~y\in \widehat{E},\\ \\ 0, &{} \text{ otherwise }. \end{array} \right. \end{aligned}$$

Note that by the Lebesgue differentiation theorem (Corollary 2.3), it follows that

$$\begin{aligned} f(y)\le M_E f(y)\quad \text{ for } \text{ a.e. }~y\in E. \end{aligned}$$
(2.7)

We consider the distribution set at level \(\lambda >0\) for the local maximal function of a weight w:

$$\begin{aligned} \mathcal {D}_{M_E w}(\lambda )=\{x\in \widehat{E}:M_E w(x)>\lambda \}. \end{aligned}$$
(2.8)

The next result contains a key decomposition of \(\mathcal {D}_{M_E w}(\lambda )\) of Calderón–Zygmund type. We shall make use of it several times in the sequel.

Lemma 2.7

Let \(E=E_{r_0}(x_0)\) be a fixed open set in \({\mathbb {E}}\) and w be a nonnegative integrable function with support in \(\widehat{E}\). For \(\lambda >w_{\widehat{E}}\), define \(\mathcal {D}_{M_E w}(\lambda )\) as above. Then, there exists a countable family of pairwise disjoint open sets \(\{E_{r_i}(x_i):i\in J\}\) from \({\mathcal {B}}_E\) such that

  1. (i)

    \(\bigcup _{i\in J} E_{r_i}(x_i)\subset \mathcal {D}_{M_E w}(\lambda )\subset \bigcup _{i\in J}E_{4\theta ^4r_i}(x_i)\),

  2. (ii)

    \(\lambda <\frac{1}{\mu (E_{r_i}(x_i))}\int _{E_{r_i}(x_i)}w\,d\mu \le 2C_\mu \theta ^\alpha \lambda \) for all \(i\in J\),

  3. (iii)

    \(r_i\le r_0\) for all \(i\in J\),

  4. (iv)

    if \(r>r_i\) for some i, then \(\frac{1}{\mu (E_{r}(x_i))}\int _{E_{r}(x_i)}w\,d\mu \le 2C_\mu \theta ^\alpha \lambda \).

Proof

For \(x\in \mathcal {D}_{M_E w}(\lambda )\), define

$$\begin{aligned} R_x=\sup \left\{ r:\text{ there } \text{ exists }~F=E_r(y)~\text{ with }~x\in F~\text{ and }~\frac{1}{\mu (F)}\int _{F}w\,d\mu >\lambda \right\} . \end{aligned}$$

Note that \(R_x<\infty \) as \(r\le r_0\). So, for each \(x\in \mathcal {D}_{M_E w}(\lambda )\), there exist \(r_x>0\) and \(y_x\in E_{r_0}(x_0)\) such that

$$\begin{aligned} \frac{1}{\mu (E_{r_x}(y_x))}\int _{E_{r_x}(y_x)}w\,d\mu >\lambda \quad \text{ and }\quad r_x\le R_x<\theta r_x. \end{aligned}$$

Observe that

$$\begin{aligned} \mathcal {D}_{M_E w}(\lambda )\subseteq \bigcup _{x\in \mathcal {D}_{M_E w}(\lambda )}E_{r_x}(y_x)\quad \text{ and }\quad r_x\le r_0~\text{ for } \text{ all }~x. \end{aligned}$$

Therefore, by Lemma 2.1, there exists a countable set J and a collection of pairwise disjoint open sets \(\{E_{r_{x_i}}(y_{x_i}):i\in J\}\) such that \(\mathcal {D}_{M_E w}(\lambda ) \subseteq \bigcup _{i\in J}E_{4\theta ^4 r_{x_i}}(y_{x_i})\). To simplify notation, we write \(r_i\) for \(r_{x_i}\), \(R_i\) for \(R_{x_i}\), and \(x_i\) for \(y_{x_i}\).

Note that (i) is trivial since each \(E_{r_i}(x_i)\) is contained in \(\mathcal {D}_{M_E w}(\lambda )\). To prove (ii), observe that by (2.1), we get

$$\begin{aligned} \lambda< & {} \frac{1}{\mu (E_{r_i}(x_i))}\int _{E_{r_i}(x_i)}w\,d\mu \\\le & {} \frac{\mu (E_{\theta r_i}(x_i))}{\mu (E_{r_i}(x_i))}\cdot \frac{1}{\mu (E_{\theta r_i}(x_i))}\int _{E_{\theta r_i}(x_i)}w\,d\mu \le 2C_\mu \theta ^\alpha \lambda . \end{aligned}$$

Part (iii) is obvious. Finally, if \(r>r_i\) for some i, then \(\theta r>\theta r_i>R_i\). Hence, again by (2.1),

$$\begin{aligned} \frac{1}{\mu (E_{r}(x_i))}\int _{E_{r}(x_i)}w\,d\mu= & {} \frac{\mu (E_{\theta r}(x_i))}{\mu (E_{r}(x_i))}\cdot \frac{1}{\mu (E_{\theta r}(x_i))}\int _{E_{r}(x_i)}w\,d\mu \\\le & {} 2C_\mu \theta ^\alpha \frac{1}{\mu (E_{\theta r}(x_i))}\int _{E_{\theta r}(x_i)}w\,d\mu \\\le & {} 2C_\mu \theta ^\alpha \lambda . \end{aligned}$$

This proves (iv). \(\square \)

The next result is a continuation of the above lemma where we present a localization argument for the local maximal function. The corresponding result for the dyadic case in \({\mathbb {R}}^d\) is a direct consequence of the maximality of the cubes in the Calderón–Zygmund decomposition. Preserving the notation of Lemma 2.7, we have the following.

Lemma 2.8

Let \(E=E_{r_0}(x_0)\) be a fixed open set in X and \(L=(2C_\mu \theta ^\alpha )^3\). If \(x\in E_{4\theta ^4r_i}(x_i)\cap \mathcal {D}_{M_E w}(L\lambda )\), then

$$\begin{aligned} M_E w(x)\le M_E(w\chi _{E_{4\theta ^5r_i}(x_i)})(x). \end{aligned}$$

Proof

Let \(x\in E_{4\theta ^4r_i}(x_i)\cap \mathcal {D}_{M_E w}(L\lambda )\). Then, there exists \(F=E_r(y)\), \(y\in E\) and \(x\in E_r(y)\) with \(r\le r_0\) such that \(\frac{1}{\mu (F)}\int _{F}w\,d\mu >L\lambda \). Thus, \(x\in E_r(y)\cap E_{4\theta ^4r_i}(x_i)\).

We claim that \(r\le 4\theta ^4r_i\). Once the claim is proved, we have \(x\in E_{4\theta ^4r_i}(y)\cap E_{4\theta ^4r_i}(x_i)\). Hence, \(F=E_r(y)\subset E_{4\theta ^4r_i}(y)\subset E_{4\theta ^5r_i}(x_i)\), by property (F\('\)). Then, by the definition of \(M_E\), we have

$$\begin{aligned} \frac{1}{\mu (F)}\int _{F}w\,d\mu =\frac{1}{\mu (F)}\int _{F}w\chi _{E_{4\theta ^5r_i}(x_i)}\,d\mu \le M_E\bigl (w\chi _{E_{4\theta ^5r_i}(x_i)}\bigr )(x). \end{aligned}$$

Therefore, \(M_E w(x)\le M_E\bigl (w\chi _{E_{4\theta ^5r_i}(x_i)}\bigr )(x)\).

We now prove the claim. If possible, let \(r>4\theta ^4r_i\). Since \(x\in E_r(x_i)\cap E_r(y)\), using property (F\('\)), we have \(E_r(x_i)\subset E_{\theta r}(y)\) and \(E_r(y)\subset E_{\theta r}(x_i)\). Hence,

$$\begin{aligned} \frac{\mu (E_{\theta r}(x_i))}{\mu (E_{r}(y))} \le 2C_\mu \theta ^\alpha \frac{\mu (E_r(x_i))}{\mu (E_{r}(y))} \le 2C_\mu \theta ^\alpha \frac{\mu (E_{\theta r}(y))}{\mu (E_{r}(y))} \le (2C_\mu \theta ^\alpha )^2. \end{aligned}$$

Furthermore, by Lemma 2.7 (iv), we have

$$\begin{aligned} \frac{1}{\mu (E_{\theta r}(x_i))}\int _{E_{\theta r}(x_i)}w\,d\mu \le 2C_\mu \theta ^\alpha \lambda , \end{aligned}$$

since \(\theta r>r>4\theta ^4r_i>r_i\). So we conclude that

$$\begin{aligned} \frac{1}{\mu (F)}\int _{F}w\,d\mu= & {} \frac{1}{\mu (E_r(y))}\int _{E_r(y)}w\,d\mu \\\le & {} \frac{\mu (E_{\theta r}(x_i))}{\mu (E_{r}(y))}\cdot \frac{1}{\mu (E_{\theta r}(x_i))}\int _{E_{\theta r}(x_i)}w\,d\mu \\\le & {} (2C_\mu \theta ^\alpha )^2\frac{1}{\mu (E_{\theta r}(x_i))}\int _{E_{\theta r}(x_i)}w\,d\mu \\\le & {} (2C_\mu \theta ^\alpha )^2(2C_\mu \theta ^\alpha )\lambda \\= & {} L\lambda , \end{aligned}$$

which is a contradiction. This finishes the proof of the lemma. \(\square \)

3 Proofs of the Main Results

3.1 Sharp Weak Reverse Hölder Inequality

Proof of Theorem 1.7

We want to prove that if \(w\in A_{\infty ,{\mathbb {E}}}\), then for all \(E\in {\mathbb {E}}\), we have

$$\begin{aligned} \frac{1}{\mu (E)}\int _Ew^{1+\epsilon }\,d\mu \le 4C_\mu \theta ^{2\alpha }\Bigl (\frac{1}{\mu (\widehat{E})}\int _{\widehat{E}}w\,d\mu \Bigr )^{1+\epsilon }, \end{aligned}$$
(3.1)

where \(\epsilon =\frac{1}{2[w]_{A_{\infty ,{\mathbb {E}}}}C-1}\) and \(C=2(2C_\mu )^4\theta ^{8\alpha }4^\alpha \).

Let \(E=E_{r_0}(x_0)\in {\mathbb {E}}\) be a fixed open set. We begin by reducing the above inequality to a self-improving property of the maximal operator \(M_E\) when restricted to \(A_{\infty ,{\mathbb {E}}}\) weights. Using (2.7), we obtain

$$\begin{aligned} \int _Ew^{1+\epsilon }\,d\mu \le \int _E(M_Ew)^\epsilon w\,d\mu \le \int _{\widehat{E}}(M_Ew)^\epsilon w\,d\mu . \end{aligned}$$

Let \(\mathcal {D}_{M_E w}(\lambda )\) be defined as in (2.8). Using (1.14), we write the last integral as

$$\begin{aligned} \int _{\widehat{E}}(M_Ew)^\epsilon w\,d\mu= & {} \int _0^\infty \epsilon \lambda ^{\epsilon -1}w(\mathcal {D}_{M_E w}(\lambda ))\,d\lambda \nonumber \\= & {} \int _0^{w_{\widehat{E}}}\epsilon \lambda ^{\epsilon -1}w(\mathcal {D}_{M_E w}(\lambda ))\,d\lambda +\int _{w_{\widehat{E}}}^\infty \epsilon \lambda ^{\epsilon -1}w(\mathcal {D}_{M_E w}(\lambda ))\,d\lambda \nonumber \\\le & {} w(\widehat{E})(w_{\widehat{E}})^\epsilon +\int _{w_{\widehat{E}}}^\infty \epsilon \lambda ^{\epsilon -1}w(\mathcal {D}_{M_E w}(\lambda ))\,d\lambda . \end{aligned}$$
(3.2)

Now, in order to estimate the second term of (3.2), we apply the Calderón–Zygmund decomposition of \(\mathcal {D}_{M_E w}(\lambda )\) for each \(\lambda >w_{\widehat{E}}\) (see Lemma 2.7) and obtain a countable family of pairwise disjoint open sets \(\{E_{r_i}(x_i)\}_i\) such that \(\mathcal {D}_{M_E w}(\lambda )\subseteq \bigcup _{i}E_{4\theta ^4 r_i}(x_i)\). Moreover,

$$\begin{aligned} \frac{1}{\mu (E_{4\theta ^4 r_i}(x_i))}\int _{E_{4\theta ^4 r_i}(x_i)}w(y)\,d\mu (y)\le 2C_{\mu }\theta ^{\alpha }\lambda \quad \text{ for } \text{ all }~i. \end{aligned}$$

That is,

$$\begin{aligned} w(E_{4\theta ^4 r_i}(x_i))\le 2C_{\mu }\theta ^{\alpha }\lambda \mu (E_{4\theta ^4 r_i}(x_i)). \end{aligned}$$

Therefore, using the doubling property (2.1), we have

$$\begin{aligned} \int _{w_{\widehat{E}}}^\infty \epsilon \lambda ^{\epsilon -1}w(\mathcal {D}_{M_E w}(\lambda ))\,d\lambda\le & {} \int _{w_{\widehat{E}}}^\infty \epsilon \lambda ^{\epsilon -1}\sum _iw(E_{4\theta ^4r_i}(x_i))\,d\lambda \\\le & {} 2C_\mu \theta ^\alpha \int _{w_{\widehat{E}}}^\infty \epsilon \lambda ^{\epsilon }\sum _i\mu (E_{4\theta ^4r_i}(x_i))\,d\lambda \\\le & {} 2C_\mu \theta ^\alpha \cdot 2C_\mu (4\theta ^4)^\alpha \int _{w_{\widehat{E}}}^\infty \epsilon \lambda ^{\epsilon }\sum _i\mu (E_{r_i}(x_i))\,d\lambda \\\le & {} (2C_\mu )^2 4^\alpha \theta ^{5\alpha }\int _0^\infty \epsilon \lambda ^{\epsilon }\mu (\mathcal {D}_{M_E w}(\lambda ))\,d\lambda \\= & {} (2C_\mu )^2 4^\alpha \theta ^{5\alpha }\frac{\epsilon }{\epsilon +1}\int _{\widehat{E}}(M_Ew)^{1+\epsilon }\,d\mu . \end{aligned}$$

The last line follows since

$$\begin{aligned} \int _{\widehat{E}}(M_Ew)^{\epsilon +1}\,d\mu =\int _0^\infty (\epsilon +1)\lambda ^{\epsilon }\mu (\mathcal {D}_{M_E w}(\lambda ))\,d\lambda . \end{aligned}$$
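
The identity above is the usual layer-cake formula. As a sketch, writing \(g=M_Ew\ge 0\) on \(\widehat{E}\) and interchanging the order of integration by Tonelli's theorem, one has

$$\begin{aligned} \int _{\widehat{E}}g^{1+\epsilon }\,d\mu &= \int _{\widehat{E}}\int _0^{g(x)}(1+\epsilon )\lambda ^{\epsilon }\,d\lambda \,d\mu (x) \\ &= \int _0^\infty (1+\epsilon )\lambda ^{\epsilon }\,\mu (\{x\in \widehat{E}:g(x)>\lambda \})\,d\lambda , \end{aligned}$$

and the set in the last integrand is \(\mathcal {D}_{M_E w}(\lambda )\) as defined in (2.8).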

Substituting this estimate in (3.2) and taking average over E, we get

$$\begin{aligned}{} & {} { \frac{1}{\mu (E)}\int _Ew^{1+\epsilon }\,d\mu } \\{} & {} \quad \le \frac{1}{\mu (E)} w(\widehat{E})(w_{\widehat{E}})^\epsilon + (2C_\mu )^2 4^\alpha \theta ^{5\alpha }\frac{\epsilon }{\epsilon +1}\cdot \frac{1}{\mu (E)}\int _{\widehat{E}}(M_Ew)^{1+\epsilon }\,d\mu \\{} & {} \quad \le (w_{\widehat{E}})^{1+\epsilon }2C_\mu \theta ^{2\alpha }+(2C_\mu )^2 4^\alpha \theta ^{5\alpha }\frac{\epsilon }{\epsilon +1}\cdot 2C_\mu \theta ^{2\alpha } \frac{1}{\mu (\widehat{E})}\int _{\widehat{E}}(M_Ew)^{1+\epsilon }\,d\mu , \end{aligned}$$

where in the last inequality, we have used Lemma 2.6 (b).

We turn now to the estimation of the integral in the second term of the last inequality. We intend to show that

$$\begin{aligned} \frac{1}{\mu (\widehat{E})}\int _{\widehat{E}}(M_Ew)^{1+\epsilon }\,d\mu \le 2[w]_{A_{\infty ,{\mathbb {E}}}}\Bigl (\frac{1}{\mu (\widehat{E})}\int _{\widehat{E}}w\,d\mu \Bigr )^{1+\epsilon } \end{aligned}$$
(3.3)

for any \(\epsilon \le \frac{1}{2C[w]_{A_{\infty ,{\mathbb {E}}}}-1}\), where \(C=2(2C_\mu )^4 4^\alpha \theta ^{8\alpha }\).

If this inequality is proved, then by a simple computation, we obtain

$$\begin{aligned} \frac{1}{\mu (E)}\int _Ew^{1+\epsilon }\,d\mu \le 4C_\mu \theta ^{2\alpha } \Bigl (\frac{1}{\mu (\widehat{E})}\int _{\widehat{E}}w\,d\mu \Bigr )^{1+\epsilon }, \end{aligned}$$

which is precisely (3.1). So the proof of the theorem will be complete once we establish (3.3) and we now proceed to do so.
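
Before doing so, the "simple computation" invoked above can be sketched as follows. Take (3.3) for granted with \(\epsilon =\frac{1}{2[w]_{A_{\infty ,{\mathbb {E}}}}C-1}\), so that \(\frac{\epsilon }{\epsilon +1}=\frac{1}{2[w]_{A_{\infty ,{\mathbb {E}}}}C}\). Inserting (3.3) into the last estimate for the average of \(w^{1+\epsilon }\) over E gives

$$\begin{aligned} \frac{1}{\mu (E)}\int _Ew^{1+\epsilon }\,d\mu &\le 2C_\mu \theta ^{2\alpha }(w_{\widehat{E}})^{1+\epsilon }\Bigl (1+(2C_\mu )^2 4^\alpha \theta ^{5\alpha }\cdot \frac{2[w]_{A_{\infty ,{\mathbb {E}}}}}{2[w]_{A_{\infty ,{\mathbb {E}}}}C}\Bigr ) \\ &= 2C_\mu \theta ^{2\alpha }(w_{\widehat{E}})^{1+\epsilon }\Bigl (1+\frac{1}{2(2C_\mu )^2\theta ^{3\alpha }}\Bigr ) \le 4C_\mu \theta ^{2\alpha }(w_{\widehat{E}})^{1+\epsilon }, \end{aligned}$$

where the last step uses \(C_\mu >1\) and \(\theta >1\).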

By the definition of the local maximal function \(M_Ew\) of w, we may assume that the weight w is supported on \(\widehat{E}\). Again, we consider the distribution set \(\mathcal {D}_{M_E w}(\lambda )\) as above and using (1.14), we write

$$\begin{aligned} \int _{\widehat{E}}(M_Ew)^{1+\epsilon }\,d\mu= & {} \int _0^\infty \epsilon \lambda ^{\epsilon -1}M_E w(\mathcal {D}_{M_E w}(\lambda ))\,d\lambda \nonumber \\= & {} \int _0^{w_{\widehat{E}}}\epsilon \lambda ^{\epsilon -1}M_E w(\mathcal {D}_{M_E w}(\lambda ))\,d\lambda \nonumber \\{} & {} +\int _{w_{\widehat{E}}}^\infty \epsilon \lambda ^{\epsilon -1}M_E w(\mathcal {D}_{M_E w}(\lambda ))\,d\lambda . \end{aligned}$$
(3.4)

Now, we estimate each integral separately. The first integral is easier to handle. Indeed, an application of Lemma 2.6, combined with the definition of the \(A_{\infty ,{\mathbb {E}}}\) constant, yields

$$\begin{aligned} \int _0^{w_{\widehat{E}}}\epsilon \lambda ^{\epsilon -1}M_E w(\mathcal {D}_{M_E w}(\lambda ))\,d\lambda\le & {} \Bigl (\int _0^{w_{\widehat{E}}}\epsilon \lambda ^{\epsilon -1}\,d\lambda \Bigr ) M_E w(\widehat{E}) \nonumber \\= & {} (w_{\widehat{E}})^\epsilon \int _{\widehat{E}}M_Ew(x)\,d\mu (x)\nonumber \\\le & {} (w_{\widehat{E}})^\epsilon \int _{E_{\theta ^2r_0}(y)}M_E(w\chi _{E_{\theta ^2r_0}(y)})(x)\,d\mu (x)\nonumber \\\le & {} (w_{\widehat{E}})^\epsilon [w]_{A_{\infty ,{\mathbb {E}}}}w(\widehat{E}), \end{aligned}$$
(3.5)

where in the second last inequality \(y\in E\).

We now estimate the second integral. Apply the Calderón–Zygmund decomposition of \(\mathcal {D}_{M_E w}(\lambda )\) for each \(\lambda >w_{\widehat{E}}\) to obtain a collection of pairwise disjoint open sets \(\{E_{r_i}(x_i)\}\) satisfying conditions (i) to (iv) of Lemma 2.7. In order to simplify notation, we write \(E_i=E_{4\theta ^4r_i}(x_i)\) and decompose \(E_i\) as \(E_i=E_i^1\cup E_i^2\), where

$$\begin{aligned} E_i^1=E_i\cap \mathcal {D}_{M_E w}(L\lambda )~\text{ and }~E_i^2=E_i\setminus \mathcal {D}_{M_E w}(L\lambda ). \end{aligned}$$

Recall that \(L=(2C_\mu \theta ^\alpha )^3\). So, in our new notation, \(\mathcal {D}_{M_E w}(\lambda )\subseteq \bigcup _{i}E_i\) and thus \(M_{E}w(\mathcal {D}_{M_E w}(\lambda )) \le \sum _{i}M_{E}w(E_i)\). Now, we proceed to estimate each \(M_{E}w(E_i)\). We have

$$\begin{aligned} M_Ew(E_i)= & {} \int _{E_i^1}M_Ew\,d\mu +\int _{E_i^2}M_Ew\,d\mu \\\le & {} \int _{E_i^1}M_E(w\chi _{E_{4\theta ^5r_i}(x_i)})\,d\mu +L\lambda \mu (E_i^2) \\\le & {} [w]_{A_{\infty ,{\mathbb {E}}}}w(E_{4\theta ^5r_i}(x_i))+L\lambda \mu (E_{4\theta ^5r_i}(x_i)) \\= & {} \Bigl ([w]_{A_{\infty ,{\mathbb {E}}}}w_{E_{4\theta ^5r_i}(x_i)}+L\lambda \Bigr )\mu (E_{4\theta ^5r_i}(x_i)), \end{aligned}$$

where we have used Lemma 2.8 in the first inequality and the definition of \(A_{\infty ,{\mathbb {E}}}\) constant in the second inequality.

Now if \(r>r_i\), then by Lemma 2.7, \(w_{E_r(x_i)}=\frac{1}{\mu (E_r(x_i))}\int _{E_r(x_i)}w\,d\mu \le 2C_\mu \theta ^\alpha \lambda \). Taking this fact into account and using the doubling condition (2.1), we obtain

$$\begin{aligned} M_Ew(E_i)\le & {} \Bigl ([w]_{A_{\infty ,{\mathbb {E}}}}2C_\mu \theta ^\alpha \lambda +L\lambda \Bigr )2C_\mu (4\theta ^5)^\alpha \mu (E_{r_i}(x_i)) \\\le & {} \Bigl ([w]_{A_{\infty ,{\mathbb {E}}}}(2C_\mu \theta ^\alpha )^3+(2C_\mu \theta ^\alpha )^3\Bigr )\lambda 2C_\mu (4\theta ^5)^\alpha \mu (E_{r_i}(x_i)) \\\le & {} [w]_{A_{\infty ,{\mathbb {E}}}}\lambda C\mu (E_{r_i}(x_i)), \end{aligned}$$

where \(C=2(2C_\mu )^4 4^\alpha \theta ^{8\alpha }\). Adding up these inequalities, we get

$$\begin{aligned} M_Ew(\mathcal {D}_{M_E w}(\lambda ))\le & {} \sum _iM_Ew(E_{i}) \\\le & {} [w]_{A_{\infty ,{\mathbb {E}}}}\lambda C\sum _i\mu (E_{r_i}(x_i))\\\le & {} \lambda C[w]_{A_{\infty ,{\mathbb {E}}}}\mu (\mathcal {D}_{M_E w}(\lambda )). \end{aligned}$$

Thus, the second integral in (3.4) is bounded by

$$\begin{aligned} \int _{w_{\widehat{E}}}^\infty \epsilon \lambda ^{\epsilon -1}M_E w(\mathcal {D}_{M_E w}(\lambda ))\,d\lambda\le & {} C[w]_{A_{\infty ,{\mathbb {E}}}}\int _{w_{\widehat{E}}}^\infty \epsilon \lambda ^{\epsilon }\mu (\mathcal {D}_{M_E w}(\lambda ))\,d\lambda \nonumber \\\le & {} C[w]_{A_{\infty ,{\mathbb {E}}}}\int _0^\infty \epsilon \lambda ^{\epsilon }\mu (\mathcal {D}_{M_E w}(\lambda ))\,d\lambda \nonumber \\= & {} C[w]_{A_{\infty ,{\mathbb {E}}}}\frac{\epsilon }{\epsilon +1}\int _{\widehat{E}}(M_Ew)^{1+\epsilon }\,d\mu . \end{aligned}$$
(3.6)
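
For reference, the constant C in the bound for \(M_Ew(E_i)\) used above can be traced as follows. The sketch uses \(2C_\mu \theta ^\alpha \ge 1\) and \([w]_{A_{\infty ,{\mathbb {E}}}}\ge 1\); the latter is an assumption here, standard for characteristics of this type:

$$\begin{aligned} \bigl ([w]_{A_{\infty ,{\mathbb {E}}}}+1\bigr )(2C_\mu \theta ^\alpha )^3\cdot 2C_\mu (4\theta ^5)^\alpha &\le 2[w]_{A_{\infty ,{\mathbb {E}}}}(2C_\mu )^3\theta ^{3\alpha }\cdot 2C_\mu 4^\alpha \theta ^{5\alpha } \\ &= 2(2C_\mu )^4 4^\alpha \theta ^{8\alpha }[w]_{A_{\infty ,{\mathbb {E}}}} = C[w]_{A_{\infty ,{\mathbb {E}}}}. \end{aligned}$$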

Now, we put together the estimates (3.5) and (3.6) in (3.4) and get

$$\begin{aligned} \int _{\widehat{E}}(M_Ew)^{1+\epsilon }\,d\mu \le (w_{\widehat{E}})^\epsilon [w]_{A_{\infty ,{\mathbb {E}}}}w(\widehat{E})+C[w]_{A_{\infty ,{\mathbb {E}}}}\frac{\epsilon }{\epsilon +1}\int _{\widehat{E}}(M_Ew)^{1+\epsilon }\,d\mu . \end{aligned}$$

Finally, taking average over \(\widehat{E}\), we have

$$\begin{aligned} \Bigl (1-[w]_{A_{\infty ,{\mathbb {E}}}}C\frac{\epsilon }{\epsilon +1}\Bigr )\frac{1}{\mu (\widehat{E})}\int _{\widehat{E}}(M_Ew)^{1+\epsilon }\,d\mu \le (w_{\widehat{E}})^{1+\epsilon }[w]_{A_{\infty ,{\mathbb {E}}}}. \end{aligned}$$

Note that \(1-C[w]_{A_{\infty ,{\mathbb {E}}}}\frac{\epsilon }{\epsilon +1}\ge \frac{1}{2}\) if and only if \(\epsilon \le \frac{1}{2[w]_{A_{\infty ,{\mathbb {E}}}}C-1}\). In particular, if we choose \(\epsilon =\frac{1}{2[w]_{A_{\infty ,{\mathbb {E}}}}C-1}\), where \(C=2(2C_\mu )^4 4^\alpha \theta ^{8\alpha }\), then

$$\begin{aligned} \frac{1}{\mu (\widehat{E})}\int _{\widehat{E}}(M_Ew)^{1+\epsilon }\,d\mu \le 2[w]_{A_{\infty ,{\mathbb {E}}}}\Bigl (\frac{1}{\mu (\widehat{E})}\int _{\widehat{E}}w\,d\mu \Bigr )^{1+\epsilon }. \end{aligned}$$

Thus, (3.3) holds and the proof of Theorem 1.7 is complete. \(\square \)
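
The equivalence used to choose \(\epsilon \) in the proof is elementary. As a sketch, writing \(K=C[w]_{A_{\infty ,{\mathbb {E}}}}\) (a shorthand introduced only here) and noting that \(2K-1>0\):

$$\begin{aligned} 1-K\frac{\epsilon }{\epsilon +1}\ge \frac{1}{2} \iff 2K\epsilon \le \epsilon +1 \iff \epsilon (2K-1)\le 1 \iff \epsilon \le \frac{1}{2K-1}. \end{aligned}$$

At the endpoint \(\epsilon =\frac{1}{2K-1}\) one has \(\frac{\epsilon }{\epsilon +1}=\frac{1}{2K}\).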

3.2 Open Property of \(A_{p,{\mathbb {E}}}\) Weights

We denote \(r(w)=1+\epsilon \), where \(\epsilon \) is as above. Explicitly,

$$\begin{aligned} r(w)=1+\frac{1}{2[w]_{A_{\infty ,{\mathbb {E}}}}C-1},\quad \text{ where }~C=2(2C_\mu )^4 4^\alpha \theta ^{8\alpha }. \end{aligned}$$

An immediate consequence of Theorem 1.7 is the following quantitative open property of the \(A_{p,{\mathbb {E}}}\) weights.

Corollary 3.1

(Open property of \(A_{p,{\mathbb {E}}}\) weights) Let \(1<p<\infty \) and \(w\in A_{p,{\mathbb {E}}}\). Let \(\sigma =w^{1-p'}\) be the dual weight of w. Put \(\delta =\frac{p-1}{(r(\sigma ))'}\). Then, \(w\in A_{p-\delta ,{\mathbb {E}}}\) and

$$\begin{aligned}{}[w]_{A_{p-\delta ,{\mathbb {E}}}}\le (2C_\mu \theta ^{2\alpha })^p(4C_\mu \theta ^{2\alpha })^{\frac{p-1}{r(\sigma )}}[w]_{A_{p,{\mathbb {E}}}}. \end{aligned}$$
(3.7)

Proof

We use the same classical ideas as in the Euclidean case. For the above choice of \(\delta =\frac{p-1}{(r(\sigma ))'}\), we have \(1-(p-\delta )'=(1-p')r(\sigma )\) and \(r(\sigma )=\frac{p-1}{p-\delta -1}\). Applying the sharp weak reverse Hölder inequality for the weight \(\sigma \), we have

$$\begin{aligned} \Bigl (\frac{1}{\mu (E)}\int _Ew^{1-(p-\delta )'}\,d\mu \Bigr )^{p-\delta -1}= & {} \Bigl (\frac{1}{\mu (E)}\int _Ew^{(1-p')r(\sigma )}\,d\mu \Bigr )^{\frac{p-1}{r(\sigma )}} \\= & {} \Bigl (\frac{1}{\mu (E)}\int _E\sigma ^{r(\sigma )}\,d\mu \Bigr )^{\frac{p-1}{r(\sigma )}} \\\le & {} \Bigl ((4C_\mu \theta ^{2\alpha })^{\frac{1}{r(\sigma )}} \frac{1}{\mu (\widehat{E})}\int _{\widehat{E}}\sigma \,d\mu \Bigr )^{p-1} \end{aligned}$$

for any \(E\in {\mathbb {E}}\).
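
The two exponent identities used at the start of the proof can be checked directly. As a sketch, write \(r=r(\sigma )\) and recall that \(\frac{1}{r'}=1-\frac{1}{r}\) and \(1-q'=-\frac{1}{q-1}\) for any exponent \(q>1\):

$$\begin{aligned} p-\delta -1 &= (p-1)\Bigl (1-\frac{1}{r'}\Bigr )=\frac{p-1}{r}, \\ 1-(p-\delta )' &= -\frac{1}{p-\delta -1}=-\frac{r}{p-1}=(1-p')\,r. \end{aligned}$$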

Let \(E=E_{r_0}(x_0)\) be a fixed open set in \({\mathbb {E}}\). By Lemma 2.6 (b) applied with \(z=x_0\), we have \(\widehat{E}\subset E_{\theta ^2r_0}(x_0)\). Hence, by the doubling property of \(\mu \), we have

$$\begin{aligned}{} & {} { \Bigl (\frac{1}{\mu (E)}\int _Ew\,d\mu \Bigr )\cdot \Bigl (\frac{1}{\mu (E)}\int _Ew^{1-(p-\delta )'}\,d\mu \Bigr )^{p-\delta -1} } \\{} & {} \quad \le \Bigl (\frac{1}{\mu (E)}\int _Ew\,d\mu \Bigr )\cdot \Bigl ((4C_\mu \theta ^{2\alpha })^{\frac{1}{r(\sigma )}} \frac{1}{\mu (\widehat{E})}\int _{\widehat{E}}\sigma \,d\mu \Bigr )^{p-1} \\{} & {} \quad \le 2C_\mu \theta ^{2\alpha }\Bigl (\frac{1}{\mu (E_{\theta ^2r_0}(x_0))}\int _{E_{\theta ^2r_0}(x_0)}w\,d\mu \Bigr ) \\{} & {} \qquad \times \Bigl ((4C_\mu \theta ^{2\alpha })^{\frac{1}{r(\sigma )}}2C_\mu \theta ^{2\alpha } \frac{1}{\mu (E_{\theta ^2r_0}(x_0))}\int _{E_{\theta ^2r_0}(x_0)}w^{1-p'}\,d\mu \Bigr )^{p-1} \\{} & {} \quad \le (2C_\mu \theta ^{2\alpha })^p(4C_\mu \theta ^{2\alpha })^{\frac{p-1}{r(\sigma )}}[w]_{A_{p,{\mathbb {E}}}}. \end{aligned}$$

In other words, \([w]_{A_{p-\delta ,{\mathbb {E}}}}\le (2C_\mu \theta ^{2\alpha })^p(4C_\mu \theta ^{2\alpha })^{\frac{p-1}{r(\sigma )}}[w]_{A_{p,{\mathbb {E}}}}\). \(\square \)
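
The factors \(2C_\mu \theta ^{2\alpha }\) in the second inequality of the proof come from comparing measures. As a sketch, since \(E\subset \widehat{E}\subset E_{\theta ^2r_0}(x_0)\), the estimate (2.1) with \(a=\theta ^2\) gives

$$\begin{aligned} \mu (E)\le \mu (\widehat{E})\le \mu (E_{\theta ^2r_0}(x_0))\le 2C_\mu \theta ^{2\alpha }\mu (E), \end{aligned}$$

so that for any nonnegative function g,

$$\begin{aligned} \frac{1}{\mu (E)}\int _Eg\,d\mu \le \frac{2C_\mu \theta ^{2\alpha }}{\mu (E_{\theta ^2r_0}(x_0))}\int _{E_{\theta ^2r_0}(x_0)}g\,d\mu \quad \text{ and }\quad \frac{1}{\mu (\widehat{E})}\int _{\widehat{E}}g\,d\mu \le \frac{2C_\mu \theta ^{2\alpha }}{\mu (E_{\theta ^2r_0}(x_0))}\int _{E_{\theta ^2r_0}(x_0)}g\,d\mu . \end{aligned}$$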

3.3 Sharp Mixed Bound for the Maximal Operator

Proof of Theorem 1.6

Let \(f\in L_w^p(X)\) be given. As we mentioned earlier, the idea behind the proof of (1.10) is to use an interpolation type argument. To do so, we need a suitable truncation of f, namely \(f_t=f\chi _{|f|>t}\), \(t>0\). Then, a simple computation shows that

$$\begin{aligned} \{x\in X:M_{{\mathbb {E}}}f(x)>2t\}\subset \{x\in X:M_{{\mathbb {E}}}f_t(x)>t\}. \end{aligned}$$
(3.8)
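
The computation behind (3.8) can be sketched as follows. Pointwise, \(|f|\le |f_t|+t\): where \(|f|>t\) the two functions f and \(f_t\) agree, and elsewhere \(|f|\le t\). Averaging over any set \(E_r(z)\) and taking the supremum gives

$$\begin{aligned} M_{{\mathbb {E}}}f(x)\le M_{{\mathbb {E}}}f_t(x)+t, \end{aligned}$$

so \(M_{{\mathbb {E}}}f(x)>2t\) forces \(M_{{\mathbb {E}}}f_t(x)>t\).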

Let us estimate the \(L^p_w(X)\)-norm of \(M_{{\mathbb {E}}}f\). First of all, by (3.8), we write

$$\begin{aligned} \Vert M_{{\mathbb {E}}}f\Vert _{L_w^p(X)}^p= & {} \int _0^\infty pt^{p-1}w(\{x\in X:M_{{\mathbb {E}}}f(x)>t\})\,dt \\= & {} 2^p\int _0^\infty pt^{p-1}w(\{x\in X:M_{{\mathbb {E}}}f(x)>2t\})\,dt \\\le & {} 2^p\int _0^\infty pt^{p-1}w(\{x\in X:M_{{\mathbb {E}}}f_t(x)>t\})\,dt. \end{aligned}$$

We will now use the tools that we have developed earlier. Let \(w\in A_{p,{\mathbb {E}}}\). By Corollary 3.1, we have \(w\in A_{p-\delta ,{\mathbb {E}}}\), where \(\delta =\frac{p-1}{(r(\sigma ))'}\). Therefore, by Lemma 2.4 and (3.7), we obtain

$$\begin{aligned} \Vert M_{{\mathbb {E}}}f\Vert _{L_w^p(X)}^p\le & {} 2^p p\int _0^\infty t^{p-1}\frac{[w]_{A_{p-\delta ,{\mathbb {E}}}}}{t^{p-\delta }}(2C_\mu (4\theta ^4)^{\alpha })^{p-\delta }\int _X|f_t|^{p-\delta }w\,d\mu \,dt\nonumber \\\le & {} 2^p p\int _0^\infty t^{\delta -1}(2C_{\mu }\theta ^{2\alpha })^p (4C_{\mu }\theta ^{2\alpha })^\frac{p-1}{r(\sigma )} [w]_{A_{p,{\mathbb {E}}}} \\{} & {} \times (2C_\mu (4\theta ^4)^{\alpha })^{p-\delta }\int _X|f_t|^{p-\delta }w\,d\mu \,dt \\\le & {} 2^p p(2C_{\mu }\theta ^{2\alpha })^p (4C_{\mu }\theta ^{2\alpha })^\frac{p-1}{r(\sigma )}[w]_{A_{p,{\mathbb {E}}}} \\{} & {} \times (2C_\mu (4\theta ^4)^{\alpha })^{p}\int _0^\infty t^{\delta -1}\int _X|f_t|^{p-\delta }w\,d\mu \,dt\\= & {} A [w]_{A_{p,{\mathbb {E}}}}\int _0^\infty t^{\delta -1}\int _X|f_t|^{p-\delta }w\,d\mu \,dt \\\le & {} \frac{A [w]_{A_{p,{\mathbb {E}}}}}{\delta }\int _X|f|^pw\,d\mu , \end{aligned}$$

where \(A=2^p p(2C_{\mu }\theta ^{2\alpha })^p (4C_{\mu }\theta ^{2\alpha })^\frac{p-1}{r(\sigma )} (2C_\mu (4\theta ^4)^{\alpha })^{p}\). Note that \(r(\sigma )=1+\frac{1}{2[\sigma ]_{A_{\infty ,{\mathbb {E}}}}C-1}\), where \(C=2(2C_\mu )^4 4^\alpha \theta ^{8\alpha }\). Hence,

$$\begin{aligned} \delta =\frac{p-1}{(r(\sigma ))'}=\frac{p-1}{2[\sigma ]_{A_{\infty ,{\mathbb {E}}}}C}. \end{aligned}$$
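
Two steps here deserve a line of justification; both are sketches of routine computations. First, with \(K=[\sigma ]_{A_{\infty ,{\mathbb {E}}}}C\) (a shorthand introduced only here), the conjugate exponent of \(r(\sigma )\) is

$$\begin{aligned} r(\sigma )=\frac{2K}{2K-1},\qquad (r(\sigma ))'=\frac{r(\sigma )}{r(\sigma )-1}=2K. \end{aligned}$$

Second, the factor \(\frac{1}{\delta }\) in the final bound for \(\Vert M_{{\mathbb {E}}}f\Vert _{L_w^p(X)}^p\) comes from Tonelli's theorem, since \(f_t(x)=f(x)\) when \(t<|f(x)|\) and \(f_t(x)=0\) otherwise:

$$\begin{aligned} \int _0^\infty t^{\delta -1}\int _X|f_t|^{p-\delta }w\,d\mu \,dt =\int _X|f|^{p-\delta }w\Bigl (\int _0^{|f(x)|}t^{\delta -1}\,dt\Bigr )d\mu (x) =\frac{1}{\delta }\int _X|f|^pw\,d\mu . \end{aligned}$$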

With \(\delta \) as above, we obtain

$$\begin{aligned} \Vert M_{{\mathbb {E}}}f\Vert _{L_w^p(X)}^p\le 2AC\frac{[\sigma ]_{A_{\infty ,{\mathbb {E}}}}[w]_{A_{p,{\mathbb {E}}}}}{p-1}\int _X|f|^pw\,d\mu . \end{aligned}$$

From this we conclude that

$$\begin{aligned} \Vert M_{{\mathbb {E}}}f\Vert _{L_w^p(X)}\le C\Bigl ([w]_{A_{p,{\mathbb {E}}}}[\sigma ]_{A_{\infty ,{\mathbb {E}}}}\Bigr )^{\frac{1}{p}}\Vert f\Vert _{L_w^p(X)}. \end{aligned}$$

This completes the proof of the theorem. \(\square \)

3.4 Endpoint Fefferman–Stein Weighted Inequality

A crucial ingredient to prove this inequality is the covering lemma outlined in Lemma 1.9. We first prove this result.

Proof of Lemma 1.9

We will use an inductive argument to select the disjoint countable subfamily \(\{E_{r_{i}}(x_{i})\}\) of \(\mathcal {F}\) with the desired property.

Step I. We start by picking an open set \(E_{r_{0,1}}(x_{0,1})\in {\mathcal {F}}\) such that

$$\begin{aligned} 2\mu (E_{r_{0,1}}(x_{0,1}))>\sup \limits _{\alpha \in \Lambda }\mu (E_{r_\alpha }(x_\alpha )). \end{aligned}$$

Define the sets \({\mathcal {F}}_1\) and \({\mathcal {R}}_1\) as follows:

$$\begin{aligned} {\mathcal {F}}_1=\{E_r(x)\in {\mathcal {F}}:E_{\theta ^2r}(x)\cap E_{r_{0,1}}(x_{0,1})\not =\emptyset \} \end{aligned}$$

and

$$\begin{aligned} {\mathcal {R}}_1=\{r:E_r(x)\in {\mathcal {F}}_1\}. \end{aligned}$$

It is obvious that \(E_{r_{0,1}}(x_{0,1})\in {\mathcal {F}}_1\). We claim that \(\sup {\mathcal {R}}_1<\infty \). Suppose on the contrary \(\sup {\mathcal {R}}_1=\infty \). Then, we can find a sequence \(\{r_j\}\) such that \(r_j\rightarrow \infty \) as \(j\rightarrow \infty \) with the property that \(E_{r_j}(x_j)\in {\mathcal {F}}_1\) and \(E_{\theta ^2r_j}(x_j)\cap E_{r_{0,1}}(x_{0,1})\not =\emptyset \). Without loss of generality, we can assume that \(\theta ^2r_j>r_{0,1}\) for all j. Since \(E_{r_{0,1}}(x_{0,1})\subset E_{\theta ^2r_j}(x_{0,1})\), we have \(E_{\theta ^2r_j}(x_j)\cap E_{\theta ^2r_j}(x_{0,1})\not =\emptyset \). By property (F\('\)), \(E_{\theta ^2r_j}(x_{0,1})\subset E_{\theta ^3r_j}(x_j)\) for all j. Applying (2.1), we see that

$$\begin{aligned} \mu (E_{\theta ^2r_j}(x_{0,1}))\le \mu (E_{\theta ^3r_j}(x_j))\le 2C_\mu \theta ^{3\alpha }\mu (E_{r_j}(x_j))\le 2C_\mu \theta ^{3\alpha }\mu (\Sigma ). \end{aligned}$$

Since \(\mu (E_{\theta ^2r_j}(x_{0,1}))\rightarrow \infty \) as \(j\rightarrow \infty \), this would mean that \(\mu (\Sigma )=\infty \). This is a contradiction and hence the claim is proved. Therefore, we can choose \(E_{r_1}(x_1)\in {\mathcal {F}}_1\) such that \(\theta r_1>\sup {\mathcal {R}}_1\).
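The choice of \(E_{r_1}(x_1)\) is possible because \(\theta >1\); the following short sketch records the reason. Since \(E_{r_{0,1}}(x_{0,1})\in {\mathcal {F}}_1\), the set \({\mathcal {R}}_1\) is nonempty, so \(0<\sup {\mathcal {R}}_1<\infty \) and

$$\begin{aligned} \frac{\sup {\mathcal {R}}_1}{\theta }<\sup {\mathcal {R}}_1. \end{aligned}$$

By the definition of the supremum, there is \(r_1\in {\mathcal {R}}_1\) with \(r_1>\sup {\mathcal {R}}_1/\theta \), that is, \(\theta r_1>\sup {\mathcal {R}}_1\); any set \(E_{r_1}(x_1)\in {\mathcal {F}}_1\) with this radius will do.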

We further claim that if \(E_r(x)\in {\mathcal {F}}_1\) satisfies \(E_r(x)\cap E_{r_1}(x_1)\not =\emptyset \), then \(E_r(x)\subset E_{\theta ^3r_1}(x_1)\). To prove this claim, we first observe that for any such \(E_r(x)\), we have \(r\le \theta ^2 r_1\). Indeed, suppose that \(r>\theta ^2 r_1\) for some such \(E_r(x)\). Since \(E_{r_{0,1}}(x_{0,1})\in {\mathcal {F}}_1\), by the definition of \({\mathcal {R}}_1\), we have \(\theta r_1>r_{0,1}\) so that \(r>\theta ^2 r_1>\theta r_1>r_{0,1}\). Also, by the choice of \(E_{r_1}(x_1)\), we have \(E_{r_{0,1}}(x_{0,1})\cap E_{\theta ^2 r_1}(x_1)\not =\emptyset \). Let y be a point in their intersection. Then, by property (F\('\)),

$$\begin{aligned} E_{r_{0,1}}(x_{0,1})\subset E_{\theta r_{0,1}}(y)\subset E_{\theta ^2 r_1}(y)\subset E_{\theta ^3 r_1}(x_1)\subset E_{\theta r}(x_1). \end{aligned}$$

On the other hand, by our hypothesis \(E_r(x)\cap E_{r_1}(x_1)\not =\emptyset \). This implies \(E_{\theta r}(x)\cap E_{\theta r}(x_1)\not =\emptyset \). Hence, \(E_{\theta r}(x_1)\subset E_{\theta ^2 r}(x)\) by property (F\('\)). Thus, \(E_{r_{0,1}}(x_{0,1})\subset E_{\theta ^2 r}(x)\). In particular, \(E_{r_{0,1}}(x_{0,1})\cap E_{\theta ^2 r}(x)\not =\emptyset \). This implies that \(r\in {\mathcal {R}}_1\) so that \(r\le \sup \mathcal R_1<\theta r_1<\theta ^2 r_1\). This is a contradiction.

So we have proved that if \(E_r(x)\in {\mathcal {F}}_1\) satisfies \(E_r(x)\cap E_{r_1}(x_1)\not =\emptyset \), then \(r\le \theta ^2 r_1\). Consequently, \(E_{\theta ^2 r_1}(x)\cap E_{\theta ^2 r_1}(x_1)\not =\emptyset \). Therefore, by property (F\('\)), \(E_r(x)\subset E_{\theta ^2 r_1}(x)\subset E_{\theta ^3 r_1}(x_1)\). Thus, the second claim is also proved.

Step II. Let

$$\begin{aligned} \widetilde{{\mathcal {F}}_2}=\{E_r(x)\in {\mathcal {F}}:E_r(x)\cap E_{r_1}(x_1)=\emptyset \}. \end{aligned}$$

We choose \(E_{r_{0,2}}(x_{0,2})\in \widetilde{{\mathcal {F}}_2}\) such that

$$\begin{aligned} 2\mu (E_{r_{0,2}}(x_{0,2}))>\sup _{E_r(x)\in \widetilde{\mathcal F_2}}\mu (E_r(x)). \end{aligned}$$

Let

$$\begin{aligned} {\mathcal {F}}_2=\{E_r(x)\in \widetilde{\mathcal F_2}:E_{\theta ^2r}(x)\cap E_{r_{0,2}}(x_{0,2})\not =\emptyset \} \end{aligned}$$

and

$$\begin{aligned} {\mathcal {R}}_2=\{r:E_r(x)\in {\mathcal {F}}_2\}. \end{aligned}$$

Proceeding as in Step I, we have \(\sup {\mathcal {R}}_2<\infty \) and hence we can choose \(E_{r_2}(x_2)\in {\mathcal {F}}_2\) such that \(\theta r_2>\sup {\mathcal {R}}_2\). Furthermore, if \(E_r(x)\in {\mathcal {F}}_2\) satisfies \(E_r(x)\cap E_{r_2}(x_2)\not =\emptyset \), then \(E_r(x)\subset E_{\theta ^3r_2}(x_2)\).

Step III. We continue as above. If the process stops after J steps, then \(\{E_{r_i}(x_i):i=1,2,\dots ,J\}\) is the required collection. Otherwise, this process generates a countable collection of disjoint open sets \(\{E_{r_i}(x_i):i=1,2,\dots \}\). Now, we show that if \(E_r(x)\in {\mathcal {F}}\), then it intersects with at least one \(E_{r_i}(x_i)\). This will complete the proof of the lemma. If this is not the case, then \(E_r(x)\cap E_{r_i}(x_i)=\emptyset \) for all i. This means that \(E_r(x)\in \widetilde{{\mathcal {F}}_i}\) for every i. Therefore,

$$\begin{aligned} 2\mu (E_{r_{0,i}}(x_{0,i}))>\mu (E_r(x))\quad \text{ for } \text{ all }~i. \end{aligned}$$

Also, since \(E_{\theta ^2r_{0,i}}(x_{0,i})\cap E_{r_{0,i}}(x_{0,i})\not =\emptyset \), we have \(E_{r_{0,i}}(x_{0,i})\in {\mathcal {F}}_i\), so that \(r_{0,i}\in {\mathcal {R}}_i\). By the choice of \(r_i\), it follows that \(r_{0,i}\le \sup {\mathcal {R}}_i<\theta r_i\).

On the other hand, by the choice of \(E_{r_i}(x_i)\), the condition \(E_{\theta ^2r_{i}}(x_i)\cap E_{r_{0,i}}(x_{0,i})\not =\emptyset \) implies that \(E_{\theta ^2r_i}(x_i)\cap E_{\theta ^2r_i}(x_{0,i})\not =\emptyset \). Then it follows that \(E_{\theta ^2r_i}(x_{0,i})\subset E_{\theta ^3r_i}(x_i)\). Therefore,

$$\begin{aligned} E_{r_{0,i}}(x_{0,i})\subset E_{\theta r_i}(x_{0,i})\subset E_{\theta ^2r_i}(x_{0,i})\subset E_{\theta ^3r_i}(x_i). \end{aligned}$$

Summarizing these, we obtain

$$\begin{aligned} 0<\mu \bigl (E_r(x)\bigr )<2\mu \bigl (E_{r_{0,i}}(x_{0,i})\bigr )\le 2\mu \bigl (E_{\theta ^3r_i}(x_i)\bigr )\le 2\cdot 2C_\mu \theta ^{3\alpha }\mu \bigl (E_{r_i}(x_i)\bigr ). \end{aligned}$$

Since \(\{E_{r_i}(x_i)\}\) is an infinite pairwise disjoint collection and each \(\mu (E_{r_i}(x_i))\) is bounded below by the positive constant \(\mu (E_r(x))/(4C_\mu \theta ^{3\alpha })\), which is independent of i, summing over i gives \(\mu (\Sigma )=\infty \), a contradiction. Hence, we conclude that every \(E_r(x)\) in \(\mathcal {F}\) intersects at least one \(E_{r_i}(x_i)\) and in that case \(E_r(x)\subseteq E_{\theta ^3r_i}(x_i) \). This completes the proof of the lemma. \(\square \)

We are now ready to prove the endpoint Fefferman–Stein weighted inequality.

Proof of Theorem 1.10

Let f be a locally integrable function on X and \(\lambda >0\) be given. We want to show that

$$\begin{aligned} w(\{x\in X:M_{{\mathbb {E}}}f(x)>\lambda \})\le C\int _{X}|f(x)|M_{{\mathbb {E}}}w(x)\,d\mu (x), \end{aligned}$$
(3.9)

where C is independent of f and \(\lambda \).

Let \(\mathcal {D}_{M_{{\mathbb {E}}}f}(\lambda )=\{x\in X:M_{{\mathbb {E}}}f(x)>\lambda \}\) be the distribution set of \(M_{{\mathbb {E}}}f\) at the level \(\lambda \). Fix an arbitrary element \(x_0\) in X. For \(n\in {\mathbb {N}}\), let \(f_n\) be the function obtained by restricting f to the set \(E_n(x_0)\), i.e., \(f_n=f\cdot \chi _{E_n(x_0)}\). We also consider the corresponding distribution set \(\mathcal {D}_{M_{{\mathbb {E}}}f_n}(\lambda )=\{x\in X:M_{{\mathbb {E}}}f_n(x)>\lambda \}\). Clearly, the family \(\mathcal {D}_{M_{{\mathbb {E}}}f_n}(\lambda )\) is increasing in n and, moreover, \(\mathcal {D}_{M_{{\mathbb {E}}}f}(\lambda )=\bigcup _{n}\mathcal {D}_{M_{{\mathbb {E}}}f_n}(\lambda )\). This suggests that we may compute the value of \(w(\mathcal {D}_{M_{{\mathbb {E}}}f}(\lambda ))\) as the limit of \(w(\mathcal {D}_{M_{{\mathbb {E}}}f_n}(\lambda ))\) as \(n \rightarrow \infty \).
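The limiting step can be made explicit; the following is a sketch that uses only continuity from below of the measure \(w\,d\mu \) and the two properties of the sets \(\mathcal {D}_{M_{{\mathbb {E}}}f_n}(\lambda )\) just stated. Since these sets increase with n and their union is \(\mathcal {D}_{M_{{\mathbb {E}}}f}(\lambda )\),

$$\begin{aligned} w\bigl (\mathcal {D}_{M_{{\mathbb {E}}}f}(\lambda )\bigr )=\lim _{n\rightarrow \infty }w\bigl (\mathcal {D}_{M_{{\mathbb {E}}}f_n}(\lambda )\bigr )=\sup _{n}w\bigl (\mathcal {D}_{M_{{\mathbb {E}}}f_n}(\lambda )\bigr ). \end{aligned}$$

In particular, any bound for \(w(\mathcal {D}_{M_{{\mathbb {E}}}f_n}(\lambda ))\) that is uniform in n passes to \(w(\mathcal {D}_{M_{{\mathbb {E}}}f}(\lambda ))\).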

So let n be fixed and \(x\in \mathcal {D}_{M_{{\mathbb {E}}}f_n}(\lambda )\). Then there exists an open set \(E_s(z)\in {\mathbb {E}}\) containing x such that

$$\begin{aligned} \frac{1}{\mu \bigl (E_s(z)\bigr )}\int _{E_s(z)}|f_n(y)|\,d\mu (y)>\lambda . \end{aligned}$$

The last inequality also shows that \(E_s(z) \subseteq \mathcal {D}_{M_{{\mathbb {E}}}f_n}(\lambda )\) and thus \(\mathcal {D}_{M_{{\mathbb {E}}}f_n}(\lambda )\) may be written as a union of open sets from \({\mathbb {E}}\). Further, by the weak (1, 1) property of \(M_{{\mathbb {E}}}\) (Theorem 2.2), we have \(\mu (\mathcal {D}_{M_{{\mathbb {E}}}f_n}(\lambda ))<\infty \). Hence, by Lemma 1.9, there exists a countable family of pairwise disjoint open sets \(\{E_{r_i} (x_i)\}_i\) such that

$$\begin{aligned} \mathcal {D}_{M_{{\mathbb {E}}}f_n}(\lambda ) \subseteq \bigcup _{i} E_{\theta ^3 r_i} (x_i). \end{aligned}$$
(3.10)

Moreover,

$$\begin{aligned} \frac{1}{\mu \bigl (E_{r_i} (x_i)\bigr )}\int _{E_{r_i} (x_i)}|f_n(x)|\,d\mu (x)>\lambda \quad \text{ for } \text{ all }~i. \end{aligned}$$
(3.11)

Therefore, using (3.10), (3.11) and (2.1), we obtain

$$\begin{aligned} w\bigl (\mathcal {D}_{M_{{\mathbb {E}}}f_n}(\lambda )\bigr )\le & {} \sum _i w\bigl (E_{\theta ^3 r_i}(x_i)\bigr ) \\= & {} \sum _i\mu \bigl (E_{\theta ^3 r_i}(x_i)\bigr )\cdot \frac{1}{\mu \bigl (E_{\theta ^3 r_i}(x_i)\bigr )} \int _{E_{\theta ^3 r_i}(x_i)}w\,d\mu \\\le & {} 2C_\mu \theta ^{3\alpha }\sum _i\Bigl (\frac{1}{\lambda }\int _{E_{r_i}(x_i)}|f_n|\,d\mu \Bigr )\frac{1}{\mu \bigl (E_{\theta ^3 r_i}(x_i)\bigr )}\int _{E_{\theta ^3 r_i}(x_i)}w\,d\mu \\= & {} \frac{2C_\mu \theta ^{3\alpha }}{\lambda }\sum _i\left\{ \int _{E_{r_i}(x_i)}|f_n|\left( \frac{1}{\mu \bigl (E_{\theta ^3 r_i}(x_i)\bigr )}\int _{E_{\theta ^3 r_i}(x_i)}w\,d\mu \right) \,d\mu \right\} \\\le & {} \frac{2C_\mu \theta ^{3\alpha }}{\lambda }\sum _i\int _{E_{r_i}(x_i)}|f_n(x)|M_{{\mathbb {E}}}w(x)\,d\mu (x) \\\le & {} \frac{2C_\mu \theta ^{3\alpha }}{\lambda }\int _X|f_n(x)|M_{{\mathbb {E}}}w(x)\,d\mu (x)\\\le & {} \frac{2C_\mu \theta ^{3\alpha }}{\lambda }\int _X|f(x)|M_{{\mathbb {E}}}w(x)\,d\mu (x). \end{aligned}$$
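The third and fifth steps of this chain deserve a word of justification; what follows is a sketch that uses (2.1) only in the form in which it was applied in the proof of Lemma 1.9. For the third step, (2.1) and (3.11) give

$$\begin{aligned} \mu \bigl (E_{\theta ^3 r_i}(x_i)\bigr )\le 2C_\mu \theta ^{3\alpha }\mu \bigl (E_{r_i}(x_i)\bigr )<\frac{2C_\mu \theta ^{3\alpha }}{\lambda }\int _{E_{r_i}(x_i)}|f_n|\,d\mu . \end{aligned}$$

For the fifth step, every \(x\in E_{r_i}(x_i)\) also lies in \(E_{\theta ^3 r_i}(x_i)\) by monotonicity. Since \(M_{{\mathbb {E}}}w(x)\) dominates the average of w over any set of \({\mathbb {E}}\) containing x (this is how \(M_{{\mathbb {E}}}\) is used at the beginning of the proof), we get

$$\begin{aligned} \frac{1}{\mu \bigl (E_{\theta ^3 r_i}(x_i)\bigr )}\int _{E_{\theta ^3 r_i}(x_i)}w\,d\mu \le M_{{\mathbb {E}}}w(x)\quad \text{ for } \text{ all }~x\in E_{r_i}(x_i). \end{aligned}$$

The sixth step uses the pairwise disjointness of the sets \(E_{r_i}(x_i)\), and the final step uses \(|f_n|\le |f|\).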

We observe that the final bound in the chain of inequalities above is independent of \(x_0\) and n. Hence, we conclude that

$$\begin{aligned} w\bigl (\mathcal {D}_{M_{{\mathbb {E}}}f}(\lambda )\bigr )\le \frac{2C_\mu \theta ^{3\alpha }}{\lambda }\int _X|f(x)|M_{{\mathbb {E}}}w(x)\,d\mu (x) \end{aligned}$$

for all \(\lambda >0\). \(\square \)

Remark 3.2

Strictly speaking, we have proved the Fefferman–Stein weighted inequality for the case \(\mu (X)=\infty \) since the covering Lemma 1.9 is valid only under this assumption. However, the assertion of Theorem 1.10 is still true for the case \(\mu (X)<\infty \). In this case, we use the covering Lemma 2.1 instead of Lemma 1.9 and replace the assumptions \(0<r<\infty \) and \(0<\mu (E_r(x))<\infty \) by \(0<r<\mu (X)\) and \(0<\mu (E_r(x))<\mu (X)\) respectively. Then, the argument presented here for the proof of (3.9) will go through for this case as well.

3.5 Vector-Valued Inequalities

Proof of Theorem 1.11

We use the following notation. If \(f=(f_1,f_2,\dots )\) is a sequence of functions on X, then \(M_{{\mathbb {E}}}f=(M_{{\mathbb {E}}}f_1,M_{{\mathbb {E}}}f_2,\dots )\) and

$$\begin{aligned} \Vert f(x)\Vert _{\ell ^q}=\left( \sum _i|f_i(x)|^q\right) ^\frac{1}{q}, \quad 1<q<\infty . \end{aligned}$$

We shall follow the same basic ideas as the arguments presented in [13]. If \(p=q>1\), then the proof is straightforward and therefore we omit the details. We will need this result for the proof of the case \(p=1\). Thus, for all \(q>1\), there is a constant \(A_q\) such that

$$\begin{aligned} \int _X\Vert M_{{\mathbb {E}}}f(x)\Vert _{\ell ^q}^q\,d\mu (x)\le A_q\int _X\Vert f(x)\Vert _{\ell ^q}^q\,d\mu (x). \end{aligned}$$
(3.12)

For the case \(p=1\), we proceed as follows. Let \(f=(f_1,f_2,\dots )\) be such that \(\int _X\Vert f(x)\Vert _{\ell ^q}\,d\mu (x)<\infty \) and \(\lambda >0\) be given. We want to prove

$$\begin{aligned} \mu \bigl (\bigl \{x\in X:\Vert M_{{\mathbb {E}}}f(x)\Vert _{\ell ^q}>\lambda \bigr \}\bigr )\le \frac{C_q}{\lambda }\int _X\Vert f(x)\Vert _{\ell ^q}\,d\mu (x), \end{aligned}$$
(3.13)

where \(C_q\) is independent of f and \(\lambda \).

We apply the Calderón–Zygmund decomposition to the function \((\sum _i|f_i|^q)^{\frac{1}{q}}\) to obtain a countable collection of pairwise disjoint open sets \(\{E_{r_j}(x_j):j\in J\}\) such that

  1. (a)

    \(\lambda <\frac{1}{\mu (E_{r_j}(x_j))}\int _{E_{r_j}(x_j)}\Vert f(x)\Vert _{\ell ^q}\,d\mu (x)\le 2C_\mu \theta ^\alpha \lambda \) for all \(j\in J\),

  2. (b)

    \(\sum \limits _{j\in J}\mu (E_{r_j}(x_j))\le \frac{1}{\lambda }\int _X\Vert f(x)\Vert _{\ell ^q}\,d\mu (x)\),

  3. (c)

    \(\Vert f(x)\Vert _{\ell ^q}\le \lambda \), if \(x\not \in \cup _{j\in J}E_{r_j}(x_j)\).

Let \(\Omega =\bigcup _j E_{r_j}(x_j)\). We shall use the open sets \(E_{r_j}(x_j)\) to decompose the function f into two parts as follows. For each i, define \(f_{1,i}=f_i\cdot \chi _\Omega \) on X. Setting \(f_{2,i}=f_i-f_{1,i}\), we obtain a decomposition \(f=f^1+f^2\), where

$$\begin{aligned} f^1=(f_{1,1},f_{1,2},f_{1,3},\dots )\quad \text{ and }\quad f^2=(f_{2,1},f_{2,2},f_{2,3},\dots ). \end{aligned}$$

Observe that \(f^2\) is supported outside of \(\Omega \), \(\Vert f^2(x)\Vert _{\ell ^q}\le \lambda \) for a.e. x, and \(\Vert f^2(x)\Vert _{\ell ^q}\le \Vert f(x)\Vert _{\ell ^q}\). Using these facts and (3.12), we obtain

$$\begin{aligned}{} & {} { \lambda ^q\mu \bigl (\bigl \{x\in X:\Vert M_{{\mathbb {E}}}f^2(x)\Vert _{\ell ^q}>\lambda \bigr \}\bigr ) } \\{} & {} \quad \le q\int _0^\lambda \mu \bigl (\bigl \{x\in X:\Vert M_{{\mathbb {E}}}f^2(x)\Vert _{\ell ^q}>t\bigr \}\bigr )t^{q-1}\,dt \\{} & {} \quad \le \int _0^\infty \mu \bigl (\bigl \{x\in X:\Vert M_{{\mathbb {E}}}f^2(x)\Vert _{\ell ^q}^q>s\bigr \}\bigr )\,ds \\{} & {} \quad = \int _X\Vert M_{{\mathbb {E}}}f^2(x)\Vert _{\ell ^q}^q\,d\mu (x) \\{} & {} \quad \le A_q\int _X\Vert f^2(x)\Vert _{\ell ^q}^q\,d\mu (x) \\{} & {} \quad \le \lambda ^{q-1}A_q\int _X\Bigl (\sum _i|f_{2,i}(x)|^q\Bigr )^{\frac{1}{q}}\,d\mu (x) \\{} & {} \quad \le \lambda ^{q-1}A_q\int _X\Vert f(x)\Vert _{\ell ^q}\,d\mu (x). \end{aligned}$$

Hence,

$$\begin{aligned} \mu \bigl (\bigl \{x\in X:\Vert M_{{\mathbb {E}}}f^2(x)\Vert _{\ell ^q}>\lambda \bigr \}\bigr )\le \frac{A_q}{\lambda }\int _X\Vert f(x)\Vert _{\ell ^q}\,d\mu (x). \end{aligned}$$
(3.14)
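The chain of inequalities leading to (3.14) rests on two elementary facts, recorded here as a brief sketch. First, the distribution function \(t\mapsto \mu (\{x\in X:\Vert M_{{\mathbb {E}}}f^2(x)\Vert _{\ell ^q}>t\})\) is nonincreasing, so that

$$\begin{aligned} \lambda ^q\mu \bigl (\bigl \{\Vert M_{{\mathbb {E}}}f^2\Vert _{\ell ^q}>\lambda \bigr \}\bigr )=q\int _0^\lambda \mu \bigl (\bigl \{\Vert M_{{\mathbb {E}}}f^2\Vert _{\ell ^q}>\lambda \bigr \}\bigr )t^{q-1}\,dt\le q\int _0^\lambda \mu \bigl (\bigl \{\Vert M_{{\mathbb {E}}}f^2\Vert _{\ell ^q}>t\bigr \}\bigr )t^{q-1}\,dt, \end{aligned}$$

and the substitution \(s=t^q\) turns the right-hand side into \(\int _0^{\lambda ^q}\mu (\{\Vert M_{{\mathbb {E}}}f^2\Vert _{\ell ^q}^q>s\})\,ds\). Second, the pointwise bound \(\Vert f^2(x)\Vert _{\ell ^q}\le \lambda \) gives

$$\begin{aligned} \Vert f^2(x)\Vert _{\ell ^q}^q=\Vert f^2(x)\Vert _{\ell ^q}^{q-1}\,\Vert f^2(x)\Vert _{\ell ^q}\le \lambda ^{q-1}\Vert f^2(x)\Vert _{\ell ^q}\quad \text{ for } \text{ a.e. }~x\in X, \end{aligned}$$

which accounts for the factor \(\lambda ^{q-1}\) in the last two lines of the chain.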

Also, by Minkowski’s inequality, we have

$$\begin{aligned} \Vert M_{{\mathbb {E}}}f(x)\Vert _{\ell ^q}\le \Vert M_{{\mathbb {E}}}f^1(x)\Vert _{\ell ^q}+\Vert M_{{\mathbb {E}}}f^2(x)\Vert _{\ell ^q}. \end{aligned}$$
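At the level of distribution sets, this pointwise inequality is used as follows; the sketch below makes explicit a halving of the threshold that affects only the constants. If the left-hand side exceeds \(\lambda \), then at least one of the two terms on the right exceeds \(\lambda /2\), so

$$\begin{aligned} \mu \bigl (\bigl \{\Vert M_{{\mathbb {E}}}f\Vert _{\ell ^q}>\lambda \bigr \}\bigr )\le \mu \bigl (\bigl \{\Vert M_{{\mathbb {E}}}f^1\Vert _{\ell ^q}>\lambda /2\bigr \}\bigr )+\mu \bigl (\bigl \{\Vert M_{{\mathbb {E}}}f^2\Vert _{\ell ^q}>\lambda /2\bigr \}\bigr ). \end{aligned}$$

Running the chain of inequalities that led to (3.14) with \(\lambda /2\) in place of \(\lambda \) on the left-hand side only (the bound \(\Vert f^2(x)\Vert _{\ell ^q}\le \lambda \) is unchanged) gives

$$\begin{aligned} \mu \bigl (\bigl \{\Vert M_{{\mathbb {E}}}f^2\Vert _{\ell ^q}>\lambda /2\bigr \}\bigr )\le \frac{2^qA_q}{\lambda }\int _X\Vert f(x)\Vert _{\ell ^q}\,d\mu (x), \end{aligned}$$

and the estimates for \(f^1\) below adapt in the same way, with correspondingly larger constants.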

So, in view of the estimate (3.14), it follows that inequality (3.13) will be proved once we prove that

$$\begin{aligned} \mu \bigl (\bigl \{x\in X:\Vert M_{{\mathbb {E}}}f^1(x)\Vert _{\ell ^q}>\lambda \bigr \}\bigr )\le \frac{A_q}{\lambda }\int _X\Vert f(x)\Vert _{\ell ^q}\,d\mu (x). \end{aligned}$$
(3.15)

Now we further reduce the above inequality to simpler inequalities. We split the distribution set of \(\Vert M_{{\mathbb {E}}}f^1(\cdot )\Vert _{\ell ^q}\) as follows. Set

$$\begin{aligned} \widetilde{\Omega }=\bigcup _{j\in J}\widetilde{E}_{r_j}(x_j), \end{aligned}$$

where \(\widetilde{E}_{r_j}(x_j)=E_{\theta r_j}(x_j)\). Then

$$\begin{aligned} \mu \bigl (\bigl \{x\in X:\Vert M_{{\mathbb {E}}}f^1(x)\Vert _{\ell ^q}>\lambda \bigr \}\bigr )\le \mu (\widetilde{\Omega })+\mu \bigl (\bigl \{x\in X\setminus \widetilde{\Omega }:\Vert M_{{\mathbb {E}}}f^1(x)\Vert _{\ell ^q}>\lambda \bigr \}\bigr ). \end{aligned}$$

The first term is easily controlled by (2.1) and (b):

$$\begin{aligned} \mu (\widetilde{\Omega })\le & {} \sum _{j\in J} \mu (\widetilde{E}_{r_j}(x_j)) \\\le & {} 2C_\mu \theta ^\alpha \sum _{j\in J}\mu (E_{r_j}(x_j)) \\\le & {} \frac{2C_\mu \theta ^\alpha }{\lambda }\int _X\Vert f(x)\Vert _{\ell ^q}\,d\mu (x). \end{aligned}$$

Taking the above estimate into account, we see that proving (3.15) reduces to showing that

$$\begin{aligned} \mu \bigl (\bigl \{x\in X\setminus \widetilde{\Omega }:\Vert M_{{\mathbb {E}}}f^1(x)\Vert _{\ell ^q}>\lambda \bigr \}\bigr )\le \frac{C}{\lambda }\int _X\Vert f(x)\Vert _{\ell ^q}\,d\mu (x). \end{aligned}$$
(3.16)

Let us define

$$\begin{aligned} \widetilde{f_i}(x)= \left\{ \begin{array}{ll} \frac{1}{\mu \bigl (E_{r_j}(x_j)\bigr )}\int _{E_{r_j}(x_j)}|f_i(y)|\,d\mu (y), &{} \text{ if }~x\in E_{r_j}(x_j),\\ \\ 0, &{} \text{ otherwise }. \end{array} \right. \end{aligned}$$

Then, the function \(\Vert \widetilde{f}(\cdot )\Vert _{\ell ^q}=\bigl (\sum _i|\widetilde{f_i}(\cdot )|^q\bigr )^{\frac{1}{q}}\) is supported on \(\Omega \) and, by Minkowski’s integral inequality and (a), is bounded by \(2C_\mu \theta ^\alpha \lambda \). By a computation similar to the one leading to (3.14), now also using (b), we obtain

$$\begin{aligned} \mu \bigl (\bigl \{x\in X:\Vert M_{{\mathbb {E}}}\widetilde{f}(x)\Vert _{\ell ^q}>\lambda \bigr \}\bigr )\le \frac{A_q(2C_\mu \theta ^\alpha )^q}{\lambda }\int _X\Vert f(x)\Vert _{\ell ^q}\,d\mu (x). \end{aligned}$$
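The bound on \(\Vert \widetilde{f}(\cdot )\Vert _{\ell ^q}\) and the estimate just displayed can be checked as follows; this is a sketch that uses Minkowski’s integral inequality, properties (a) and (b), Chebyshev’s inequality and (3.12). For \(x\in E_{r_j}(x_j)\),

$$\begin{aligned} \Vert \widetilde{f}(x)\Vert _{\ell ^q}=\Bigl (\sum _i\Bigl (\frac{1}{\mu \bigl (E_{r_j}(x_j)\bigr )}\int _{E_{r_j}(x_j)}|f_i(y)|\,d\mu (y)\Bigr )^q\Bigr )^{\frac{1}{q}}\le \frac{1}{\mu \bigl (E_{r_j}(x_j)\bigr )}\int _{E_{r_j}(x_j)}\Vert f(y)\Vert _{\ell ^q}\,d\mu (y)\le 2C_\mu \theta ^\alpha \lambda . \end{aligned}$$

Consequently, since \(\mu (\Omega )=\sum _{j\in J}\mu (E_{r_j}(x_j))\),

$$\begin{aligned} \mu \bigl (\bigl \{\Vert M_{{\mathbb {E}}}\widetilde{f}\Vert _{\ell ^q}>\lambda \bigr \}\bigr )\le & {} \frac{1}{\lambda ^q}\int _X\Vert M_{{\mathbb {E}}}\widetilde{f}(x)\Vert _{\ell ^q}^q\,d\mu (x) \\\le & {} \frac{A_q}{\lambda ^q}\int _\Omega \Vert \widetilde{f}(x)\Vert _{\ell ^q}^q\,d\mu (x) \\\le & {} \frac{A_q(2C_\mu \theta ^\alpha \lambda )^q}{\lambda ^q}\,\mu (\Omega ) \\\le & {} \frac{A_q(2C_\mu \theta ^\alpha )^q}{\lambda }\int _X\Vert f(x)\Vert _{\ell ^q}\,d\mu (x). \end{aligned}$$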

Therefore, in order to prove (3.16), it is enough to show that

$$\begin{aligned} M_{{\mathbb {E}}}f_{1,i}(x)\le CM_{{\mathbb {E}}}\widetilde{f_i}(x)\quad \text{ for } \text{ a.e. }~x\in X\setminus \widetilde{\Omega }~\text{ and } \text{ for } \text{ all }~i. \end{aligned}$$
(3.17)

This is not hard to prove. Indeed, let \(x\in X{\setminus }\widetilde{\Omega }\). Consider an open set \(E_r(y)\in {\mathbb {E}}\) such that \(x\in E_r(y)\) and \(\frac{1}{\mu \bigl (E_r(y)\bigr )}\int _{E_r(y)}|f_{1,i}(z)|\,d\mu (z)>0\). Now, we compute the average of \(f_{1,i}\) over \(E_r(y)\). We have

$$\begin{aligned}{} & {} { \frac{1}{\mu \bigl (E_r(y)\bigr )}\int _{E_r(y)}|f_{1,i}(z)|\,d\mu (z) } \\{} & {} \quad = \frac{1}{\mu \bigl (E_r(y)\bigr )}\sum _{{\begin{array}{c} j\in J\\ E_r(y)\cap E_{r_j}(x_j)\not =\emptyset \end{array}}}\int _{E_r(y)\cap E_{r_j}(x_j)}|f_{1,i}(z)|\,d\mu (z) \\{} & {} \quad \le \frac{1}{\mu \bigl (E_r(y)\bigr )}\sum _{{\begin{array}{c} j\in J\\ E_r(y)\cap E_{r_j}(x_j)\not =\emptyset \end{array}}}\int _{E_{r_j}(x_j)}|f_{1,i}(z)|\,d\mu (z) \\{} & {} \quad = \frac{1}{\mu \bigl (E_r(y)\bigr )}\sum _{{\begin{array}{c} j\in J\\ E_r(y)\cap E_{r_j}(x_j)\not =\emptyset \end{array}}}\int _{E_{r_j}(x_j)}|\widetilde{f_i}(z)|\,d\mu (z). \end{aligned}$$

If \(j\in J\) is such that \(E_r(y)\cap E_{r_j}(x_j)\not =\emptyset \), then we claim that \(r_j\le r\). Suppose \(r<r_j\). Then, \(E_{r_j}(y)\cap E_{r_j}(x_j)\not =\emptyset \) so that \(E_{r_j}(y)\subset E_{\theta r_j}(x_j)\) by property (F\('\)). This would mean that \(x\in E_r(y)\subset E_{r_j}(y)\subset E_{\theta r_j}(x_j)\subset \widetilde{\Omega }\), a contradiction. Therefore, \(E_r(y)\cap E_{r_j}(x_j)\not =\emptyset \) implies that \(E_{r_j}(x_j)\subset E_r(x_j)\subset E_{\theta r}(y)\). Using this fact in the last integral, we get

$$\begin{aligned} \frac{1}{\mu \bigl (E_r(y)\bigr )}\int _{E_r(y)}|f_{1,i}(z)|\,d\mu (z)\le & {} \frac{1}{\mu \bigl (E_r(y)\bigr )}\int _{E_{\theta r}(y)}|\widetilde{f_i}(z)|\,d\mu (z) \\\le & {} \frac{2C_\mu \theta ^\alpha }{\mu \bigl (E_{\theta r}(y)\bigr )}\int _{E_{\theta r}(y)}|\widetilde{f_i}(z)|\,d\mu (z) \\\le & {} 2C_\mu \theta ^\alpha M_{{\mathbb {E}}}\widetilde{f_i}(x). \end{aligned}$$

Taking the supremum over all such \(E_r(y)\), we see that (3.17) holds. But the proof of (3.13) was reduced to this inequality. Hence, we have proved the theorem for the case \(p=1\).
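For completeness, here is a sketch of how (3.17) yields (3.16); C denotes the constant in (3.17), which the computation above shows can be taken to be \(2C_\mu \theta ^\alpha \). Taking \(\ell ^q\) norms in (3.17) gives \(\Vert M_{{\mathbb {E}}}f^1(x)\Vert _{\ell ^q}\le C\Vert M_{{\mathbb {E}}}\widetilde{f}(x)\Vert _{\ell ^q}\) for a.e. \(x\in X\setminus \widetilde{\Omega }\). Hence, using Chebyshev’s inequality, (3.12), the bound \(\Vert \widetilde{f}\Vert _{\ell ^q}\le 2C_\mu \theta ^\alpha \lambda \) on \(\Omega \) (a consequence of (a) and Minkowski’s integral inequality) and (b),

$$\begin{aligned} \mu \bigl (\bigl \{x\in X\setminus \widetilde{\Omega }:\Vert M_{{\mathbb {E}}}f^1(x)\Vert _{\ell ^q}>\lambda \bigr \}\bigr )\le & {} \mu \bigl (\bigl \{x\in X:\Vert M_{{\mathbb {E}}}\widetilde{f}(x)\Vert _{\ell ^q}>\lambda /C\bigr \}\bigr ) \\\le & {} \frac{C^qA_q}{\lambda ^q}\int _\Omega \Vert \widetilde{f}(x)\Vert _{\ell ^q}^q\,d\mu (x) \\\le & {} C^qA_q(2C_\mu \theta ^\alpha )^q\,\mu (\Omega ) \\\le & {} \frac{C^qA_q(2C_\mu \theta ^\alpha )^q}{\lambda }\int _X\Vert f(x)\Vert _{\ell ^q}\,d\mu (x). \end{aligned}$$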

For the exponents \(1<p<q\), we use the Marcinkiewicz interpolation theorem. We interpolate between the estimates for \(p=1\) and \(p=q\) to conclude that

$$\begin{aligned} \int _X\Vert M_{{\mathbb {E}}}f(x)\Vert _{\ell ^q}^p\,d\mu (x)\le A_q\int _X\Vert f(x)\Vert _{\ell ^q}^p\,d\mu (x), \quad 1<p<q. \end{aligned}$$

For the remaining values of the exponent p, namely for \(1<q<p<\infty \), we may use the duality argument presented in [13]. A careful reading of the proof reveals that the arguments given there only use Hölder’s inequality, the Hahn–Banach theorem and an analogue of the inequality (1.13) as the main ingredients. Therefore, a similar argument gives us the strong type (p, p) inequality for this case as well. This completes the proof of the theorem. \(\square \)