1 Introduction

Given a weight (that is, a non-negative locally integrable function) w and a cube \(Q\subset{\mathbb{R}}^{n}\), let

$$A_p(w;Q)=\Big(\frac{1}{|Q|}\int_Qw\Big)\Big(\frac{1}{|Q|}\int_Qw^{-\frac{1}{p-1}}\Big)^{p-1}\quad(1<p<\infty)$$

and

$$\|w\|_{A_p}=\sup_{Q\subset{\mathbb{R}}^n}A_p(w;Q).$$
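
As a quick illustration of these definitions (a worked example added here, not taken from the cited results), take n=1, the power weight \(w(x)=|x|^{a}\) with \(-1<a<p-1\), and Q=[0,1]. The exponent a is a parameter introduced only for this sketch. Both averages are elementary integrals:

$$A_p(w;[0,1])=\Big(\int_0^1x^{a}dx\Big)\Big(\int_0^1x^{-\frac{a}{p-1}}dx\Big)^{p-1}=\frac{1}{1+a}\Big(\frac{p-1}{p-1-a}\Big)^{p-1}.$$

In particular, this quantity blows up as \(a\to p-1\), which is the mechanism behind the examples in Sect. 4.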

Sharp weighted norm inequalities in terms of \(\|w\|_{A_{p}}\) have been obtained recently for the Calderón–Zygmund operators and for a large class of the Littlewood–Paley operators. To be more precise, if T is a Calderón–Zygmund operator, then

$$ \|T\|_{L^p(w)}\le c(T,p,n)\|w\|_{A_p}^{\max(1,\frac{1}{p-1})}\quad(1<p<\infty).$$
(1)

This result in its full generality is due to T. Hytönen [5]; we also refer to this work for a very detailed history of closely related results and particular cases. Soon after [5] appeared, a somewhat simplified approach to (1) was found in [9].

If S is a Littlewood–Paley operator (in particular, any typical square function), then (see [12] and the references therein)

$$ \|S\|_{L^p(w)}\le c(S,p,n)\|w\|_{A_p}^{\max(\frac{1}{2},\frac{1}{p-1})}\quad (1<p<\infty).$$
(2)

Observe that the exponents in (1) and (2) are sharp for any 1<p<∞. However, it turns out that this is not the end of the story. Very recently, T. Hytönen and C. Pérez [8] have studied mixed \(A_{p}\)-\(A_{\infty}\) estimates that improve many of the known sharp \(A_{p}\) estimates. Denote

$$\|w\|_{A_{\infty}}=\sup_{Q\subset{\mathbb{R}}^n}A_{\infty}(w;Q)=\sup_{Q\subset{\mathbb{R}}^n}\Big(\frac{1}{|Q|}\int_Qw\Big)\exp \Big (\frac{1}{|Q|}\int_Q\log w^{-1}\Big).$$

Set also

$$\|w\|_{A_{\infty}}'=\sup_{Q\subset{\mathbb{R}}^n}\frac{1}{w(Q)}\int _QM(w\chi_Q),$$

where M is the Hardy–Littlewood maximal operator. Observe that

$$c_n\|w\|_{A_{\infty}}'\le\|w\|_{A_{\infty}}\le\|w\|_{A_p}\quad (1<p<\infty),$$

and the first inequality here cannot be reversed (see [8] for the details).

One of the main results in [8] is the following improvement of (1) in the case p=2:

$$ \|T\|_{L^2(w)}\le c(T,n)\|w\|_{A_2}^{1/2}\max(\|w\|_{A_{\infty}}',\|w^{-1}\|_{A_{\infty }}')^{1/2}.$$
(3)

It is well known that the case p=2 is crucial for inequality (1). Indeed, (1) for any \(p\not=2\) follows from the linear \(L^{2}(w)\) bound and the sharp version of the Rubio de Francia extrapolation theorem. Adapting such an approach, the authors in [8] extended (3) to any \(p\not=2\). For example, it was shown that for p>2,

$$ \|T\|_{L^p(w)}\le c\|w\|_{A_p}^{\frac{2}{p}-\frac{1}{2(p-1)}}\big(\|w\|_{A_{\infty }}^{\frac{1}{2(p-1)}}+\|\sigma \|_{A_{\infty}}^{\frac{1}{2}}\big)(\|w\|_{A_{\infty }}')^{1-\frac{2}{p}},$$
(4)

where \(\sigma =w^{-\frac{1}{p-1}}\).

It turns out that while the extrapolation method is powerful for (1), it is not so effective for mixed \(A_{p}\)-\(A_{\infty}\) inequalities. Indeed, T. Hytönen et al. [7] improved (4) (at least for p>4) without the use of extrapolation; namely, it is proved in [7] that

$$ \|T^*\|_{L^p(w)}\le c(T,p,n)\Big(\|w\|_{A_p}^{1/p}(\|w\|_{A_{\infty}}')^{1/p'}+\|w\|_{A_p}^{\frac{1}{p-1}}\Big)\quad(p>1),$$
(5)

where \(T^{*}\) is the maximal Calderón–Zygmund operator.

Soon after that, M. Lacey [10] improved (5) and (4) for several classical singular integrals:

$$ \|T^*\|_{L^p(w)}\le c(T,p,n)\|w\|_{A_p}^{1/p}\max\Big((\|w\|_{A_{\infty}}')^{1/p'},(\|\sigma \|_{A_{\infty}}')^{1/p}\Big),$$
(6)

and it was conjectured in [10] that (6) holds for any Calderón–Zygmund operator.

More precisely, (6) was proved for the Hilbert, Riesz, and Beurling operators and for any one-dimensional convolution Calderón–Zygmund operator with odd \(C^{2}\) kernel. All these operators are unified by the fact that they can be represented as a suitable average of the so-called Haar shift operators \({\mathbb{S}}\) with bounded complexity. In order to handle such operators, a “local mean oscillation” decomposition was used in [10]. The latter decomposition was obtained by the author in [11]. Then, its various applications (in particular, to the Haar shift operators) were found by D. Cruz-Uribe, J. Martell and C. Pérez in [2].

After an application of the decomposition to \({\mathbb{S}}\), the proof of (6) is reduced to showing that this estimate is true for

$${\mathcal{A}}_{\gamma}f(x)=\sum_{j,k}\Big(\frac{1}{|\gamma Q_j^k|}\int_{\gamma Q_j^k}|f|\Big)\chi_{Q_j^k}(x),$$

where \(Q_{j}^{k}\) are the dyadic cubes with good overlapping properties. This is done in [10] by means of a number of interesting tricks. It is mentioned in [10] that a more elementary approach to \({\mathcal{A}}_{\gamma}\) (used in [2] in order to prove (1) for the classical singular operators mentioned above) does not allow us to get (6).

In this paper we show, however, that a variation of the approach to \({\mathcal{A}}_{\gamma}\) from [2] allows us to get mixed estimates of a different type, namely, we obtain \(L^{p}(w)\) bounds in terms of

$$\|w\|_{(A_p)^{\alpha }(A_r)^{\beta }}=\sup_{Q\subset{\mathbb{R}}^n}A_p(w;Q)^{\alpha }A_r(w;Q)^{\beta }$$

for suitable α and β. The key point in our results below is that r can be taken arbitrarily big (but with the implicit constant growing exponentially in r). Therefore, our estimates can also be considered as a kind of \(A_{p}\)-\(A_{\infty}\) estimates. An important feature of the expression defining \(\|w\|_{(A_{p})^{\alpha }(A_{r})^{\beta }}\) is that only one supremum is involved. We will show in simple examples that \(\|w\|_{(A_{p})^{\alpha }(A_{r})^{\beta }}\) is incomparable with the right-hand side of (6), that is, each of these expressions can be arbitrarily larger than the other. This fact indicates that the estimates of both types can be further improved.

In the next theorem we suppose that \(T^{*}\) is the same operator as in (6), namely,

$$T^*f(x)=\sup_{\varepsilon <\delta }\Big|\int_{\varepsilon <|x-y|<\delta }f(y)K(x-y)dy\Big|,$$

where K is one of the following kernels: (i) \(K(x)=\frac{1}{x},n=1\); (ii) \(K(x)=\frac{x_{j}}{|x|^{n+1}}, n\ge2\); (iii) \(K(z)=\frac{1}{z^{2}}, z\in{\mathbb{C}}\); (iv) K(x) is any odd, one-dimensional \(C^{2}\) kernel satisfying \(|K^{(i)}(x)|\le c|x|^{-1-i}\) (i=0,1,2).

Theorem 1.1

For any 2≤p≤r<∞,

$$\|T^*\|_{L^p(w)}\le c(T,p,r,n)\|w\|_{(A_p)^{\frac{1}{p-1}}(A_r)^{1-\frac{1}{p-1}}}.$$

A similar result holds for the Littlewood–Paley operators satisfying (2). In the next theorem, S is either the dyadic square function or the intrinsic square function (and hence the theorem is also true for the Lusin area integral S(f), the Littlewood–Paley function g(f), the continuous square functions \(S_{\psi}(f)\) and \(g_{\psi}(f)\)).

Theorem 1.2

For any 3≤p≤r<∞,

$$\|S\|_{L^p(w)}\le c(S,p,r,n)\|w\|_{(A_p)^{\frac{1}{p-1}}(A_r)^{\frac{1}{2}-\frac{1}{p-1}}}.$$

Also, the same type of result holds for the vector-valued maximal function (see Remark 3.2 below).

Observe that Theorem 1.1 for T (instead of \(T^{*}\)) can be extended by the standard duality argument to the case 1<p<2 as follows:

$$\|T\|_{L^p(w)}\le c(T,p,r,n)\sup_{Q\subset{\mathbb{R}}^n}A_p(w;Q)A_r(\sigma ;Q)^{2-p}\quad(1<p<2,\ r\ge p').$$

For r>p′ this improves (1) since

$$A_p(w;Q)A_r(\sigma ;Q)^{2-p}\le A_p(w;Q)A_{p'}(\sigma ;Q)^{2-p}=A_p(w;Q)^{\frac {1}{p-1}}.$$
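
The equality in the last display can be verified directly. The following is a short check, assuming only the definitions above and \(\sigma =w^{-\frac{1}{p-1}}\). Since \(\frac{1}{p'-1}=p-1\), we have \(\sigma ^{-\frac{1}{p'-1}}=w\), and hence

$$A_{p'}(\sigma ;Q)=\Big(\frac{1}{|Q|}\int_Q\sigma \Big)\Big(\frac{1}{|Q|}\int_Qw\Big)^{\frac{1}{p-1}}=A_p(w;Q)^{\frac{1}{p-1}},$$

so that

$$A_p(w;Q)A_{p'}(\sigma ;Q)^{2-p}=A_p(w;Q)^{1+\frac{2-p}{p-1}}=A_p(w;Q)^{\frac{1}{p-1}}.$$

The inequality \(A_{r}(\sigma ;Q)\le A_{p'}(\sigma ;Q)\) for \(r\ge p'\) is the usual monotonicity of these quantities in the index, a consequence of Hölder's inequality.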

In Sect. 4, we show the sharpness of the exponent \(\frac{1}{p-1}\) in Theorems 1.1 and 1.2. Also we show that the right-hand side in Theorem 1.1 is incomparable with the one in (6).

A natural question appearing here is whether the right-hand side in Theorem 1.1 can be replaced by

$$\|w\|_{(A_p)^{\frac{1}{p-1}}(A_{\infty})^{1-\frac{1}{p-1}}}=\sup_{Q}A_p(w;Q)^{\frac{1}{p-1}}A_{\infty}(w;Q)^{1-\frac{1}{p-1}}$$

or by

$$\|w\|_{(A_p)^{\frac{1}{p-1}}(A'_{\infty})^{1-\frac{1}{p-1}}}=\sup_{Q}A_p(w;Q)^{\frac{1}{p-1}}A'_{\infty}(w;Q)^{1-\frac{1}{p-1}},$$

where \(A'_{\infty}(w;Q)=\frac{1}{w(Q)}\int_{Q}M(w\chi_{Q})\).

2 Preliminaries

2.1 Haar Shift Operators

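Given a general dyadic grid \({\mathcal{D}}\) and m,k∈ℕ, we say that \({\mathbb{S}}\) is a (generalized) Haar shift operator with parameters m,k if

$${\mathbb{S}}f(x)=\sum_{Q\in{\mathcal{D}}}\ \sum_{\substack{Q',Q''\in{\mathcal{D}},\ Q',Q''\subset Q\\ \ell(Q')=2^{-m}\ell(Q),\ \ell(Q'')=2^{-k}\ell(Q)}}\frac{\langle f,h_{Q'}^{Q''}\rangle}{|Q|}h_{Q''}^{Q'}(x),$$

where \(\ell(Q)\) is the side length of Q, \(h_{Q'}^{Q''}\) is a (generalized) Haar function on Q′, and \(h_{Q''}^{Q'}\) is one on Q″ such that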

$$\|h_{Q'}^{Q''}\|_{L^{\infty}}\|h_{Q''}^{Q'}\|_{L^{\infty}}\le1.$$

The number max(m,k) is called the complexity of \({\mathbb{S}}\).

We refer to [5] for a more detailed explanation of this definition. Also, it is shown in [5] that any Calderón–Zygmund operator can be represented as a suitable average of \({\mathbb{S}}\) with respect to all dyadic grids and all m,k∈ℕ. In the case of the classical convolution operators mentioned in Theorem 1.1, such an average can be taken only of \({\mathbb{S}}\) with bounded complexity. This fact was proved in the works [3] (the Beurling operator), [13] (the Hilbert transform), [14] (the Riesz transforms), [15] (any one-dimensional singular integral with odd \(C^{2}\) kernel).

Similarly to the maximal singular integral \(T^{*}\), one can define the maximal Haar shift operator \({\mathbb{S}}^{*}\) and control \(T^{*}\) by \({\mathbb{S}}^{*}\) (see [7, Prop. 2.8]). In particular, it suffices to prove Theorem 1.1 for a single \({\mathbb{S}}^{*}\) instead of \(T^{*}\).

2.2 Littlewood–Paley Operators

The dyadic square function is defined by

$$S_df(x)=\left(\sum_{Q\in{\mathcal{D}}}(f_Q-f_{\widehat{Q}})^2\chi_Q(x)\right)^{1/2},$$

where the sum is taken over all dyadic cubes in \({\mathbb{R}}^{n}\).

Let \({\mathbb{R}}^{n+1}_{+}={\mathbb{R}}^{n}\times{\mathbb{R}}_{+}\) and \(\Gamma (x)=\{(y,t)\in{\mathbb{R}}^{n+1}_{+}:|y-x|<t\}\). For 0<α≤1, let \({\mathcal{C}}_{\alpha }\) be the family of functions \(\varphi \) supported in {x:|x|≤1}, satisfying \(\int\varphi =0\), and such that for all x and x′, \(|\varphi (x)-\varphi (x')|\le|x-x'|^{\alpha }\). If \(f\in L^{1}_{\text{loc}}({\mathbb{R}}^{n})\) and \((y,t)\in{\mathbb{R}}^{n+1}_{+}\), we define

$$A_{\alpha }(f)(y,t)=\sup_{\varphi \in{\mathcal{C}}_{\alpha }}|f*\varphi _t(y)|.$$

The intrinsic square function \(G_{\alpha }(f)\) is defined by

$$G_{\alpha }(f)(x)=\left(\int_{\Gamma (x)}\big(A_{\alpha }(f)(y,t)\big)^2\frac{dydt}{t^{n+1}}\right)^{1/2}.$$

This operator was introduced by M. Wilson [16]. On the one hand, \(G_{\alpha }\) pointwise dominates the classical and continuous S and g functions. On the other hand, it is not essentially larger than any one of them.

Denote

$$T(Q)=\{(y,t)\in{\mathbb{R}}^{n+1}_+:y\in Q, \ell(Q)/2\le t<\ell(Q)\}$$

and \(\gamma_{Q}(f)^{2}=\int_{T(Q)}(A_{\alpha }(f)(y,t))^{2}\frac{dydt}{t^{n+1}}\), and let

$$\widetilde{G}_{\alpha }(f)(x)=\Big(\sum_{Q\in{\mathcal{D}}}\gamma_Q(f)^2\chi_{3Q}(x)\Big)^{1/2}.$$

Then we have that (see [12])

$$G_{\alpha }(f)(x)\le\widetilde{G}_{\alpha }(f)(x)\le c(\alpha ,n)G_{\alpha }(f)(x).$$

2.3 A “Local Mean Oscillation” Decomposition

Given a measurable function f on \({\mathbb{R}}^{n}\) and a cube Q, define the local mean oscillation of f on Q by

$$\omega_{\lambda }(f;Q)=\inf_{c\in{\mathbb{R}}}\big((f-c)\chi_{Q}\big)^*\big(\lambda |Q|\big)\quad(0<\lambda <1),$$

where \(f^{*}\) denotes the non-increasing rearrangement of f.

By a median value of f over Q we mean a possibly nonunique real number \(m_{f}(Q)\) such that

$$\max\big(|\{x\in Q: f(x)>m_f(Q)\}|,|\{x\in Q: f(x)<m_f(Q)\}|\big)\le|Q|/2.$$

Given a cube \(Q_{0}\), denote by \({\mathcal{D}}(Q_{0})\) the set of all dyadic cubes with respect to \(Q_{0}\). If \(Q\in{\mathcal{D}}(Q_{0})\) and \(Q\not=Q_{0}\), we denote by \(\widehat{Q}\) its dyadic parent, that is, the unique cube from \({\mathcal{D}}(Q_{0})\) containing Q and such that \(|\widehat{Q}|=2^{n}|Q|\).

The dyadic local sharp maximal function \(M^{\#,d}_{\lambda ;Q_{0}}f\) is defined by

$$M^{\#,d}_{\lambda ;Q_0}f(x)=\sup_{x\in Q'\in {\mathcal{D}}(Q_0)}\omega_{\lambda }(f;Q').$$

The following theorem was proved in [11].

Theorem 2.1

Let f be a measurable function on \({\mathbb{R}}^{n}\) and let \(Q_{0}\) be a fixed cube. Then there exists a (possibly empty) collection of cubes \(Q_{j}^{k}\in{\mathcal{D}}(Q_{0})\) such that

  1. (i)

    for a.e. \(x\in Q_{0}\),

    $$|f(x)-m_f(Q_0)|\le 4M_{1/4;Q_0}^{\#,d}f(x)+4\sum_{k=1}^{\infty}\sum_j\omega_{\frac{1}{2^{n+2}}}(f;\widehat{Q}_j^k)\chi_{Q_j^k}(x);$$
  2. (ii)

    for each fixed k the cubes \(Q_{j}^{k}\) are pairwise disjoint;

  3. (iii)

    if \(\Omega _{k}=\cup_{j}Q_{j}^{k}\), then \(\Omega _{k+1}\subset\Omega _{k}\);

  4. (iv)

    \(|\Omega _{k+1}\cap Q_{j}^{k}|\le\frac{1}{2}|Q_{j}^{k}|\).

We shall use below the standard fact following from the above properties (ii)–(iv), namely, that the sets \(E_{j}^{k}=Q_{j}^{k}\setminus \Omega _{k+1}\) are pairwise disjoint and \(|E_{j}^{k}|\ge\frac{1}{2}|Q_{j}^{k}|\).
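
For completeness, here is a sketch of why this holds (the argument is standard and is spelled out here only as an illustration). By (iv),

$$|E_j^k|=|Q_j^k|-|\Omega _{k+1}\cap Q_j^k|\ge\frac{1}{2}|Q_j^k|.$$

For fixed k the sets \(E_{j}^{k}\subset Q_{j}^{k}\) are pairwise disjoint by (ii). If k<l, then iterating (iii) gives

$$E_i^l\subset\Omega _l\subset\Omega _{k+1},\qquad E_j^k\cap\Omega _{k+1}=\emptyset,$$

so that \(E_{i}^{l}\cap E_{j}^{k}=\emptyset\).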

3 Proof of Theorems 1.1 and 1.2

The key result implying both Theorems 1.1 and 1.2 can be described as follows.

Theorem 3.1

Let T be a sublinear operator satisfying

$$ \omega_{\lambda }(|Tf|^{\nu};Q)\le c\Big(\frac{1}{|\gamma Q|}\int_{\gamma Q}|f|dx\Big)^{\nu}$$
(7)

for any dyadic cube \(Q\subset{\mathbb{R}}^{n}\), where ν,γ≥1, and the constant c does not depend on Q. Then for any ν+1≤p≤r<∞ and for all f with \((Tf)^{*}(+\infty)=0\),

$$ \|Tf\|_{L^p(w)}\le c\|w\|_{(A_p)^{\frac{1}{p-1}}(A_r)^{\frac{1}{\nu}-\frac{1}{p-1}}}\|f\|_{L^p(w)},$$
(8)

where c=c(T,p,r,ν,γ,n).

If it is known additionally that T is, for example, of weak type (1,1) (which is the case for any operator from Theorems 1.1 and 1.2), then \((Tf)^{*}(+\infty)=0\) for any \(f\in L^{1}\). Hence, we first get (8) for \(f\in L^{1}\cap L^{p}(w)\), and then by the standard argument it is extended to any \(f\in L^{p}(w)\).

Condition (7) for the maximal Haar shift operator \({\mathbb{S}}^{*}f\) was proved in [2] (see also [10]) with ν=1 and γ depending on the complexity. Hence, by the above discussion in Sect. 2.1, Theorem 3.1 implies Theorem 1.1.

Further, in the case ν=2, condition (7) holds for the dyadic square function \(S_{d}\) with γ=1 (this fact was proved in [2]), and for the intrinsic square function \(\widetilde{G}_{\alpha }\) with γ=15 (this was proved in [12]). From this and from Theorem 3.1 we get Theorem 1.2.

Proof of Theorem 3.1

Combining (7) with Theorem 2.1, we get that for a.e. \(x\in Q_{0}\),

$$||Tf(x)|^{\nu}-m_{|Tf|^{\nu}}(Q_0)|^{1/\nu}\le c\big (Mf(x)+{\mathcal{A}}_{3\gamma ,\nu}f(x)\big),$$

where

$${\mathcal{A}}_{\gamma ,\nu}f(x)=\left(\sum_{j,k}\Big(\frac{1}{|\gamma Q_j^k|}\int_{\gamma Q_j^k}|f|dx\Big)^{\nu}\chi_{Q_j^k}(x)\right)^{1/\nu}.$$

Therefore, the proof will follow from the corresponding bounds for M and \({\mathcal{A}}_{\gamma ,\nu}\). After that, letting \(Q_{0}\) tend to any one of the \(2^{n}\) quadrants, we get that \(m_{|Tf|^{\nu}}(Q_{0})\to0\) (since \((Tf)^{*}(+\infty)=0\)), and Fatou’s lemma completes the proof.

By Buckley’s theorem [1], \(\|M\|_{L^{p}(w)}\le c(p,n)\|w\|_{A_{p}}^{\frac{1}{p-1}}\), which implies trivially the desired bound for M. Therefore, the proof is reduced to showing that for any ν+1≤p≤r<∞,

$$ \|{\mathcal{A}}_{\gamma ,\nu}f\|_{L^p(w)}\le c\|w\|_{(A_p)^{\frac{1}{p-1}}(A_r)^{\frac{1}{\nu}-\frac{1}{p-1}}}\|f\|_{L^p(w)},$$
(9)

where c=c(p,r,ν,γ,n).

In order to handle \({\mathcal{A}}_{\gamma ,\nu}f\), following [2], we use the duality. There exists a function h≥0 with \(\|h\|_{L^{(p/\nu)'}(w)}=1\) such that

$$\|{\mathcal{A}}_{\gamma ,\nu}f\|_{L^p(w)}=\|{\mathcal{A}}_{\gamma ,\nu}f\|_{L^{\nu}(hw)}.$$

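Further,

$$ \int_{{\mathbb{R}}^n}({\mathcal{A}}_{\gamma ,\nu}f)^{\nu}hw\,dx=\sum_{j,k}\Big(\frac{1}{|\gamma Q_j^k|}\int_{\gamma Q_j^k}|f|dx\Big)^{\nu}\Big(\frac{1}{w(Q_j^k)}\int_{Q_j^k}hw\,dx\Big)w(Q_j^k).$$
(10)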

It is well known that, by Hölder’s inequality, \(1\le A_{r}(w;E)\) for any measurable set E with |E|>0. From this, for any \(E\subset Q\) with |E|≥ξ|Q|,

$$w(Q)\le\xi ^{-r}A_r(w;Q)w(E).$$

Therefore, \(w(Q_{j}^{k})\le2^{r}A_{r}(w;Q_{j}^{k})w(E_{j}^{k})\) (the sets \(E_{j}^{k}\) are defined after Theorem 2.1). Combining this with (10), we get

Since

we obtain

By Hölder’s inequality,

Let \(M_{w}^{c}\) and \(M_{w}^{d}\) be the weighted centered and dyadic maximal operators, respectively. We will use below the well-known fact that these operators are bounded on \(L^{p}(w)\), p>1, with the corresponding bounds independent of w.

Since \(1\le A_{p}(w;E_{j}^{k})\), we get from this that

$$|Q_j^k|^{\frac{p}{p-1}}w(E_j^k)^{-\frac{1}{p-1}}\le 2^{\frac{p}{p-1}}|E_j^k|^{\frac{p}{p-1}}w(E_j^k)^{-\frac{1}{p-1}}\le 2^{\frac{p}{p-1}}\sigma (E_j^k),$$

and therefore,

Similarly,

Combining the previous estimates yields

$$\int_{{\mathbb{R}}^n}({\mathcal{A}}_{\gamma ,\nu}f)^{\nu}hw \le c\|w\|_{(A_p)^{\frac{\nu}{p-1}}(A_r)^{1-\frac{\nu}{p-1}}}\Big(\int _{{\mathbb{R}}^n}|f|^pw dx\Big)^{\nu/p},$$

which implies (9), and therefore, the proof is complete. □
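
The elementary inequality \(w(Q_{j}^{k})\le2^{r}A_{r}(w;Q_{j}^{k})w(E_{j}^{k})\) used in the proof can be checked as follows. This is only a sketch, in which \(\rho =w^{-\frac{1}{r-1}}\) is notation introduced here and \(E\subset Q\) is any set with \(|E|\ge\frac{1}{2}|Q|\). From \(1\le A_{r}(w;E)\) and \(\rho (E)\le\rho (Q)\),

$$|E|^{r}\le w(E)\rho (E)^{r-1}\le w(E)\rho (Q)^{r-1},$$

and therefore

$$w(Q)=\frac{A_r(w;Q)|Q|^{r}}{\rho (Q)^{r-1}}\le A_r(w;Q)\Big(\frac{|Q|}{|E|}\Big)^{r}w(E)\le2^{r}A_r(w;Q)w(E).$$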

Remark 3.2

Theorem 3.1 can also be used to get a new bound for the vector-valued maximal operator \(\overline{M}_{q}\) defined for \(f=\{f_{i}\}\) and q, 1<q<∞, by

$$\overline{M}_qf(x)=\left(\sum_{i=1}^{\infty}Mf_i(x)^q\right)^{1/q}.$$

It was proved in [2] that for any 1<p,q<∞,

$$ \|\overline{M}_qf\|_{L^p(w)}\le c\|w\|_{A_p}^{\max(\frac{1}{q},\frac{1}{p-1})}\Big(\int_{{\mathbb{R}}^n}\|f(x)\|_{\ell^q}^pwdx\Big)^{1/p}.$$
(11)

The proof of this inequality is based on the following variant of (7) for the vector-valued dyadic maximal operator \(\overline{M}_{q}^{d}\) and any dyadic Q:

$$\omega_{\lambda }((\overline{M}_q^df)^q;Q)\le c\Big(\frac{1}{|Q|}\int_{Q}\|f(x)\|_{\ell^q}dx\Big)^q.$$

Therefore, using the same argument as above, we obtain an improvement of (11) for q+1<p≤r<∞:

$$\|\overline{M}_qf\|_{L^p(w)}\le c\|w\|_{(A_p)^{\frac{1}{p-1}}(A_r)^{\frac{1}{q}-\frac{1}{p-1}}}\Big(\int_{{\mathbb{R}}^n}\|f(x)\|_{\ell^q}^pwdx\Big)^{1/p}.$$

4 Examples

4.1 The Sharpness of the Exponent \(\frac{1}{p-1}\)

First we note that the exponent \(\frac{1}{p-1}\) in Theorem 1.1 is sharp in the sense that \(\|w\|_{(A_{p})^{\frac{1}{p-1}}(A_{r})^{1-\frac{1}{p-1}}}\) cannot be replaced by \(\|w\|_{(A_{p})^{\alpha }(A_{r})^{1-\alpha }}\) for \(\alpha <\frac{1}{p-1}\). Indeed, it suffices to consider the same example as in [1]. Let T=H be the Hilbert transform. Let \(w(x)=|x|^{(p-1)(1-\delta )}\) and \(f=|x|^{-1+\delta }\chi_{[0,1]}\). Then on the one hand we have that \(\|H\|_{L^{p}(w)}\ge c\delta ^{-1}\), and on the other hand, if r>p, then \(\|w\|_{(A_{p})^{\alpha }(A_{r})^{1-\alpha }}\le c\delta ^{-\alpha (p-1)}\). Therefore, \(\alpha \ge\frac{1}{p-1}\).

The same observation applies to Theorem 1.2. For instance, in the case of the dyadic square function, exactly the same example as above (see [4]) shows that \(\|w\|_{(A_{p})^{\frac{1}{p-1}}(A_{r})^{\frac{1}{2}-\frac{1}{p-1}}}\) cannot be replaced by \(\|w\|_{(A_{p})^{\alpha }(A_{r})^{\frac{1}{2}-\alpha }}\) for \(\alpha <\frac{1}{p-1}\).

4.2 A Comparison with M. Lacey’s Bound

Let p>2. We show that the right-hand sides in (6) and in Theorem 1.1 are incomparable.

Let \(w=t\chi_{[0,1]}+\chi_{{\mathbb{R}}\setminus[0,1]}\). It is easy to see that

$$\|w\|_{A_p}\sim \|w\|_{(A_p)^{\frac{1}{p-1}}(A_r)^{1-\frac{1}{p-1}}}\sim t.$$

Further, it was shown in [8] that for any measurable set E,

$$ \|t\chi_{E}+\chi_{{\mathbb{R}}\setminus E}\|_{A_{\infty}}'\le4\log t\quad(t\ge3).$$
(12)

Hence, \(\|w\|_{A_{\infty}}'\le4\log t\) and

$$\|\sigma \|_{A_{\infty}}'=\|t^{\frac{1}{p-1}}\sigma \|_{A_{\infty}}'\le\frac{4}{p-1}\log t\quad(t\ge3^{p-1}).$$

Therefore,

$$\|w\|_{A_p}^{1/p}\max\Big((\|w\|_{A_{\infty}}')^{1/p'},(\|\sigma \|_{A_{\infty}}')^{1/p}\Big)\le ct^{1/p}(\log t)^{1/p'},$$

which shows that the right-hand side in (6) can be arbitrarily smaller than the one in Theorem 1.1.

On the other hand, for N big enough let

$$w(x) =\left\{\begin{array}{l@{\quad}l}|x|^{(p-1)(1-\delta )}, & x\in[-1,1], \\|x-N|^{\delta -1}, & x\in[N-1,N+1],\\1,& \mbox{otherwise}.\end{array}\right.$$

Then we have \(\|w\|_{A_{p}}\ge c\delta ^{-(p-1)}\) (take I=[0,1]). Also, \(\|w\|_{A_{\infty}}'\ge c\delta ^{-1}\) (take I=[N,N+1]). Therefore,

$$\|w\|_{A_p}^{1/p}\max\Big((\|w\|_{A_{\infty}}')^{1/p'},(\|\sigma \|_{A_{\infty}}')^{1/p}\Big)\ge c\delta ^{-2/p'}.$$

But for N big enough the supremum defining \(\|w\|_{(A_{p})^{\frac{1}{p-1}}(A_{r})^{1-\frac{1}{p-1}}}\) can be attained on small intervals containing either 0 or N. If r>p, then for any such interval

$$A_p(w;I)^{\frac{1}{p-1}}A_r(w;I)^{1-\frac{1}{p-1}}\le c/\delta ,$$

and hence \(\|w\|_{(A_{p})^{\frac{1}{p-1}}(A_{r})^{1-\frac{1}{p-1}}}\le c/\delta \). This shows that the right-hand side in Theorem 1.1 can be arbitrarily smaller than the one in (6).
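
The bound \(c/\delta \) can be checked by hand on two model intervals. The computation below is an illustrative sketch for I=[0,1] and I=[N,N+1] only, assuming r>p>2 and 0<δ<1. On [0,1] we have \(w=|x|^{(p-1)(1-\delta )}\), so

$$A_p(w;[0,1])=\frac{\delta ^{-(p-1)}}{1+(p-1)(1-\delta )},\qquad A_r(w;[0,1])\le\Big(\frac{r-1}{r-p}\Big)^{r-1},$$

and hence \(A_{p}(w;[0,1])^{\frac{1}{p-1}}A_{r}(w;[0,1])^{1-\frac{1}{p-1}}\le c(p,r)/\delta \). On [N,N+1] we have \(w=|x-N|^{\delta -1}\), and for s=p or s=r,

$$\int_N^{N+1}w\,dx=\frac{1}{\delta },\qquad\int_N^{N+1}w^{-\frac{1}{s-1}}dx=\frac{1}{1+\frac{1-\delta }{s-1}}\le1,$$

so that both \(A_{p}(w;[N,N+1])\) and \(A_{r}(w;[N,N+1])\) are at most \(1/\delta \), and the product \(A_{p}^{\frac{1}{p-1}}A_{r}^{1-\frac{1}{p-1}}\) on this interval is at most \(1/\delta \) as well.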

Added in proof.

The conjecture mentioned in the Introduction that (6) holds for any Calderón–Zygmund operator has recently been settled in [6].