1 Introduction

In a very recent article [31], to study the compactness of bilinear commutators of certain bilinear Calderón–Zygmund operators which include (inhomogeneous) Coifman–Meyer bilinear Fourier multipliers and bilinear pseudodifferential operators as special examples, Torres and Xue introduced a new subspace of BMO\(\,(\mathbb {R}^n)\), denoted by XMO\(\,(\mathbb {R}^n)\), and conjectured that it is just the space VMO\(\,(\mathbb {R}^n)\) introduced by Sarason [28]. In this article, we give a negative answer to this conjecture by establishing an equivalent characterization of XMO\(\,(\mathbb {R}^n)\), which further clarifies that XMO\(\,(\mathbb {R}^n)\) is a proper subspace of VMO\(\,(\mathbb {R}^n)\). This equivalent characterization of XMO\(\,(\mathbb {R}^n)\) is formally similar to the corresponding one of CMO\(\,(\mathbb {R}^n)\) obtained by Uchiyama [33], but its proof needs some essential new techniques on dyadic cubes as well as some exquisite geometrical observations. As an application, we also obtain a weighted compactness result on such bilinear commutators, which optimizes the corresponding result of Torres and Xue [31] in the unweighted setting.

In what follows, we use \(L_{\mathrm{c}}^\infty (\mathbb {R}^n)\) to denote the set of all essentially bounded functions on \(\mathbb {R}^n\) with compact support. The theory of commutators of pointwise multiplication with Calderón–Zygmund operators has attracted considerable attention, and many works have appeared since Coifman et al. [12] first studied the boundedness characterization of the commutator \([b,T]\), which is defined by setting, for any \(f\in L_{\mathrm{c}}^\infty (\mathbb {R}^n)\),

$$\begin{aligned}{}[b,T](f):=bT(f)-T(bf), \end{aligned}$$

where T is any classical Calderón–Zygmund operator with smooth kernel and \(b\in {\mathrm{\,BMO\,}}(\mathbb {R}^n)\). Among those achievements are the celebrated boundedness and compactness results of Coifman et al. [12], Cordes [13], Uchiyama [33] and Janson [23] in the linear situation. In [33], Uchiyama established a characterization of \(\mathrm{\,CMO\,}(\mathbb {R}^n)\) (see Proposition 2.4 below), which was further used to show that, for any given \(p\in (1,\infty )\) and any Calderón–Zygmund operator T with smooth kernel, \([b,T]\) is compact on \(L^p(\mathbb {R}^n)\) if and only if b is in \(\mathrm{\,CMO\,}(\mathbb {R}^n)\), where \(\mathrm{\,CMO\,}(\mathbb {R}^n)\) denotes the closure in \({\mathrm{\,BMO\,}}(\mathbb {R}^n)\) of infinitely differentiable functions with compact support.

In the bilinear setting, recall that the boundedness on \(L^p(\mathbb {R}^n)\) of the commutators of more general bilinear Calderón–Zygmund operators with \(b\in {\mathrm{\,BMO\,}}(\mathbb {R}^n)\) was established by Pérez and Torres [27] for any given \(p\in (1,\infty )\), and by Tang [29] and Lerner et al. [24] for any given \(p\in (1/2, 1]\). The compactness in \(L^p(\mathbb {R}^n)\) of the commutators multiplying functions in \(\mathrm{\,CMO\,}(\mathbb {R}^n)\) was demonstrated by Bényi and Torres [2] for any given \(p\in (1,\infty )\), and by Torres et al. [32] for any given \(p\in (1/2, 1]\). Moreover, Chaffee et al. [8] showed that the compactness result for certain homogeneous bilinear Calderón–Zygmund operators holds true if and only if \(b\in \mathrm{\,CMO\,}(\mathbb {R}^n)\); see also Remark 1.5(iv) below. For more related works, we refer the reader to [9, 14, 18, 19, 25, 26] and their references. Very recently, a series of works [6, 7, 20,21,22] on extrapolation theorems has appeared, which establishes a general framework for the weighted compactness of multilinear operators whenever the weighted compactness is verified for a certain index. In addition, when talking about the compactness of the commutators of multilinear Calderón–Zygmund operators, usually the function b should belong to \(\mathrm{\,CMO\,}(\mathbb {R}^n)\).

In order to investigate the possible versions in the bilinear setting of the compactness result of Cordes [13], Torres and Xue in [31] uncovered two subspaces of \({\mathrm{\,BMO\,}}(\mathbb {R}^n)\), which were denoted, respectively, by \(\mathrm{\,MMO\,}(\mathbb {R}^n)\) and \(\mathrm{\,XMO\,}(\mathbb {R}^n)\). It is known that

$$\begin{aligned} \mathrm{\,CMO\,}(\mathbb {R}^n) \subsetneqq \mathrm{\,MMO\,}(\mathbb {R}^n) \subsetneqq \mathrm{\,XMO\,}(\mathbb {R}^n) \subset \mathrm{\,VMO\,}(\mathbb {R}^n), \end{aligned}$$

where \(\mathrm{\,VMO\,}(\mathbb {R}^n) \subsetneqq {\mathrm{\,BMO\,}}(\mathbb {R}^n)\) denotes the space of functions with “vanishing mean oscillation”. The main results in [31] state that the compactness result still holds true for the commutators of pointwise multiplication with certain bilinear Calderón–Zygmund operators whenever \(b\in \mathrm{\,XMO\,}(\mathbb {R}^n)\). This means, of course, that, for the compactness of these commutators, b does not need to be in \(\mathrm{\,CMO\,}(\mathbb {R}^n)\); it suffices that b belongs to the larger subspace \(\mathrm{\,XMO\,}(\mathbb {R}^n)\).

In what follows, let \(\mathbb {N}:=\{1,\,2,\ldots \}\), \(\mathbb {Z}_{+}:=\mathbb {N}\cup \{0\}\), \(\mathbb {Z}_{+}^{n}:=(\mathbb {Z}_+)^n\), and \(\mathbb {Z}_{+}^{3n}:=(\mathbb {Z}_+)^{3n}\). In this article, we consider the following particular type of bilinear Calderón–Zygmund operator T, whose kernel K satisfies the following conditions:

  1. (i)

    The standard size and regularity conditions: for any given multi-indices \(\alpha :=(\alpha _1,\ldots ,\alpha _{3n})\in \mathbb {Z}_+^{3n}\) with \(|\alpha |:=\alpha _1+\cdots +\alpha _{3n}\le 1\), there exists a positive constant \(C_{(\alpha )}\), depending on \(\alpha \), such that, for any \(x,\ y,\ z\in \mathbb {R}^n\) with \(x\ne y\) or \(x\ne z\),

    $$\begin{aligned} |D^\alpha K(x,y,z)|\le C_{(\alpha )} (|x-y|+|x-z|)^{-2n-|\alpha |}. \end{aligned}$$
    (1.1)

    Here and thereafter, \(D^\alpha :=\big (\frac{\partial }{\partial x_1}\big )^{\alpha _1}\cdots \big (\frac{\partial }{\partial x_{3n}}\big )^{\alpha _{3n}}\), and \(C_{(\alpha )}\) is called the kernel constant.

  2. (ii)

    The additional decay condition: there exist positive constants \(\delta \) and C such that, for any \(x,\ y,\ z\in \mathbb {R}^n\) with \(|x-y|+|x-z|>1\),

    $$\begin{aligned} |K(x,y,z)|\le C (|x-y|+|x-z|)^{-2n-2-\delta } \end{aligned}$$
    (1.2)

and, for any \(f,\ g\in L_{\mathrm{c}}^\infty (\mathbb {R}^n)\) and \(x\notin {\mathrm{\,supp\,}}(f)\cap {\mathrm{\,supp\,}}(g)\), T is supposed to have the following usual representation:

$$\begin{aligned} T(f,g)(x)=\int _{\mathbb {R}^{2n}}K(x,y,z)f(y)g(z)\,dy\,dz, \end{aligned}$$

here and thereafter, \({\mathrm{\,supp\,}}(f):=\{x\in \mathbb {R}^n:\ f(x)\ne 0\}\). The (inhomogeneous) Coifman–Meyer bilinear Fourier multipliers and the bilinear pseudodifferential operators with certain symbols satisfy the above conditions (see, for instance, [31]). Therefore, they are typical examples of the bilinear Calderón–Zygmund operators as above. We refer the reader also to [1, 5, 11, 15,16,17, 31] for the boundedness and more history of multilinear Fourier multipliers and pseudodifferential operators. The original motivation of [31] is to prove that, if the kernel of the modified Calderón–Zygmund operator T in the considered commutator \([b,T]\) has some better decay properties than the classical one, then \([b,T]\) should be compact for b being in a larger subspace of \({\mathrm{\,BMO\,}}(\mathbb {R}^n)\) than \(\mathrm{\,CMO\,}(\mathbb {R}^n)\), which was indeed proved true in [31, Theorem 1.1].

Throughout this article, a cube Q always has finite side length and all its sides parallel to the coordinate axes, but Q is not necessarily open or closed; for any cube Q of \(\mathbb {R}^n\) and \(f\in L_{{\mathrm{\,loc\,}}}^1(\mathbb {R}^n)\) (the set of all locally integrable functions), the mean oscillation \({\mathcal {O}}(f;Q)\) is defined by setting

$$\begin{aligned} {\mathcal {O}}(f;Q):=\frac{1}{|Q|}\int _Q\left| f(x)-\frac{1}{|Q|}\int _Q f(y)\,dy\right| \,dx. \end{aligned}$$
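As a purely illustrative numerical aside, the definition of \({\mathcal {O}}(f;Q)\) can be sketched in Python when \(n=1\) and Q is an interval \([a,b]\). The helper name `mean_oscillation`, the midpoint rule, and the number of sample points are assumptions made here for illustration and are not part of the article. For \(f(x):=x\) on \(Q:=[0,1]\), the average of f over Q is \(\frac{1}{2}\) and hence \({\mathcal {O}}(f;Q)=\int _0^1|x-\frac{1}{2}|\,dx=\frac{1}{4}\), which the sketch reproduces up to rounding.

```python
def mean_oscillation(f, a, b, n=20000):
    """Midpoint-rule approximation of O(f; [a, b]) for a real-valued f.

    Hypothetical helper for illustration; n is the number of sample cells.
    """
    h = (b - a) / n
    values = [f(a + (i + 0.5) * h) for i in range(n)]
    average = sum(values) / n  # approximates the average of f over [a, b]
    return sum(abs(v - average) for v in values) / n


# f(x) = x on Q = [0, 1]: the exact mean oscillation is 1/4.
print(mean_oscillation(lambda x: x, 0.0, 1.0))
```

The midpoint rule is chosen only for simplicity; any quadrature that resolves the kink of \(|f-f_Q|\) would serve equally well.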

Recall that the space \({\mathrm{\,BMO\,}}(\mathbb {R}^n)\) is defined by setting

$$\begin{aligned} {\mathrm{\,BMO\,}}(\mathbb {R}^n):=\left\{ f\in L^1_{{\mathrm{\,loc\,}}}(\mathbb {R}^n):\ \Vert f\Vert _{{\mathrm{\,BMO\,}}(\mathbb {R}^n)}:=\sup _{\mathrm{cube\ }Q\subset \mathbb {R}^n}{\mathcal {O}}(f;Q)<\infty \right\} , \end{aligned}$$

where the supremum is taken over all cubes Q of \(\mathbb {R}^n\), and the bilinear commutators \(\{[b,T]_i\}_{i=1}^2\), with \(b\in {\mathrm{\,BMO\,}}(\mathbb {R}^n)\) and T being a bilinear Calderón–Zygmund operator, are defined, respectively, by setting, for any \(f,\ g\in L_{\mathrm{c}}^\infty (\mathbb {R}^n)\) and \(x\notin {\mathrm{\,supp\,}}(f)\cap {\mathrm{\,supp\,}}(g)\),

$$\begin{aligned}{}[b,T]_1(f,g)(x):=&\left( bT(f,g)-T(bf,g)\right) (x)\nonumber \\ =&\int _{\mathbb {R}^{2n}}[b(x)-b(y)]K(x,y,z)f(y)g(z)\,dy\,dz \end{aligned}$$
(1.3)

and

$$\begin{aligned}{}[b,T]_2(f,g)(x):=&\left( bT(f,g)-T(f,bg)\right) (x)\nonumber \\ =&\int _{\mathbb {R}^{2n}}[b(x)-b(z)]K(x,y,z)f(y)g(z)\,dy\,dz. \end{aligned}$$
(1.4)

We now need to introduce several subspaces of the space \({\mathrm{\,BMO\,}}(\mathbb {R}^n)\). Recall that

$$\begin{aligned} \mathrm{\,CMO\,}(\mathbb {R}^n):=\overline{C_{\mathrm{c}}^\infty (\mathbb {R}^n)\cap {\mathrm{\,BMO\,}}(\mathbb {R}^n)}^{{\mathrm{\,BMO\,}}(\mathbb {R}^n)} \end{aligned}$$

and

$$\begin{aligned} \mathrm{\,VMO\,}(\mathbb {R}^n):=\overline{C_{\mathrm{u}}(\mathbb {R}^n)\cap {\mathrm{\,BMO\,}}(\mathbb {R}^n)}^{{\mathrm{\,BMO\,}}(\mathbb {R}^n)}, \end{aligned}$$

where \(C_{\mathrm{c}}^\infty (\mathbb {R}^n)\) denotes the set of all smooth functions on \(\mathbb {R}^n\) with compact support and \(C_{\mathrm{u}}(\mathbb {R}^n)\) the set of all functions on \(\mathbb {R}^n\) with uniform continuity. Here and thereafter, \(\overline{\mathcal X}^{{\mathrm{\,BMO\,}}(\mathbb {R}^n)}\) denotes the closure in \({\mathrm{\,BMO\,}}(\mathbb {R}^n)\) of the set \(\mathcal X\).

In what follows, we use \({\vec {0}}_n\) to denote the origin of \(\mathbb {R}^n\) and, for any \(\alpha :=(\alpha _1,\ldots ,\alpha _n)\in \mathbb {Z}_+^n\), we let \(D^\alpha :=\big (\frac{\partial }{\partial x_1}\big )^{\alpha _1}\cdots \big (\frac{\partial }{\partial x_n}\big )^{\alpha _n}\). We also use \(C^\infty (\mathbb {R}^n)\) to denote the set of all infinitely differentiable functions on \(\mathbb {R}^n\) and \(L^\infty (\mathbb {R}^n)\) the set of all essentially bounded functions on \(\mathbb {R}^n\). The spaces \(\mathrm{\,MMO\,}(\mathbb {R}^n)\) and \(\mathrm{\,XMO\,}(\mathbb {R}^n)\) in [31] were defined in the way that

$$\begin{aligned} \mathrm{\,MMO\,}(\mathbb {R}^n):=\overline{A_\infty (\mathbb {R}^n)}^{{\mathrm{\,BMO\,}}(\mathbb {R}^n)}, \end{aligned}$$

where

$$\begin{aligned} A_\infty (\mathbb {R}^n):=\left\{ b\in C^\infty (\mathbb {R}^n)\cap L^\infty (\mathbb {R}^n): \,\,\forall \ \alpha \in \mathbb {Z}_+^n\setminus \{{\vec {0}}_n\}, \lim _{|x|\rightarrow \infty }D^\alpha b(x)=0\right\} , \end{aligned}$$

and

$$\begin{aligned} \mathrm{\,XMO\,}(\mathbb {R}^n):=\overline{B_\infty (\mathbb {R}^n)}^{{\mathrm{\,BMO\,}}(\mathbb {R}^n)}, \end{aligned}$$

where

$$\begin{aligned} B_\infty (\mathbb {R}^n):=\left\{ b\in C^\infty (\mathbb {R}^n)\cap {\mathrm{\,BMO\,}}(\mathbb {R}^n): \,\,\forall \ \alpha \in \mathbb {Z}_+^n\setminus \{{\vec {0}}_n\}, \lim _{|x|\rightarrow \infty }D^\alpha b(x)=0\right\} . \end{aligned}$$

Furthermore, we use the following set

$$\begin{aligned} B_1(\mathbb {R}^n):=\left\{ b\in C^1(\mathbb {R}^n)\cap {\mathrm{\,BMO\,}}(\mathbb {R}^n):\,\, \lim _{|x|\rightarrow \infty }|\nabla b(x)|=0\right\} \end{aligned}$$

to define

$$\begin{aligned} \mathrm{\,X_{1}MO\,}(\mathbb {R}^n):=\overline{B_1(\mathbb {R}^n)}^{{\mathrm{\,BMO\,}}(\mathbb {R}^n)}, \end{aligned}$$

where \(C^1(\mathbb {R}^n)\) denotes the set of all functions f on \(\mathbb {R}^n\) whose gradients \(\nabla f:=\big (\frac{\partial f}{\partial x_1},\ldots ,\frac{\partial f}{\partial x_n}\big )\) are continuous. By the observation \(C_{\mathrm{c}}^\infty (\mathbb {R}^n)\subset B_\infty (\mathbb {R}^n)\subset B_1(\mathbb {R}^n)\subset C_{\mathrm{u}}(\mathbb {R}^n)\), we easily conclude that

$$\begin{aligned} \mathrm{\,CMO\,}(\mathbb {R}^n)\subset \mathrm{\,XMO\,}(\mathbb {R}^n)\subset \mathrm{\,X_{1}MO\,}(\mathbb {R}^n)\subset \mathrm{\,VMO\,}(\mathbb {R}^n). \end{aligned}$$

Moreover, it was shown in [31] that

$$\begin{aligned} \mathrm{\,CMO\,}(\mathbb {R}^n)\subsetneqq \mathrm{\,MMO\,}(\mathbb {R}^n)\subsetneqq \mathrm{\,XMO\,}(\mathbb {R}^n). \end{aligned}$$

Meanwhile, an open question was posed by Torres and Xue in [31] as follows:

Question 1.1

Which one of the following two possibilities

$$\begin{aligned} \mathrm{\,XMO\,}(\mathbb {R}^n)\subsetneqq \mathrm{\,VMO\,}(\mathbb {R}^n)\quad {\mathrm {or}}\quad \mathrm{\,XMO\,}(\mathbb {R}^n)= \mathrm{\,VMO\,}(\mathbb {R}^n) \end{aligned}$$

holds true?

Torres and Xue in [31] conjectured that the latter might be true. However, in this article, we show that the relationship \(\mathrm{\,XMO\,}(\mathbb {R}^n)\subsetneqq \mathrm{\,VMO\,}(\mathbb {R}^n)\) holds true, which gives a complete answer to Question 1.1. Indeed, we have

$$\begin{aligned} \mathrm{\,CMO\,}(\mathbb {R}^n)\subsetneqq \mathrm{\,XMO\,}(\mathbb {R}^n)=\mathrm{\,X_{1}MO\,}(\mathbb {R}^n)\subsetneqq \mathrm{\,VMO\,}(\mathbb {R}^n), \end{aligned}$$

where \(\mathrm{\,XMO\,}(\mathbb {R}^n)\supseteqq \mathrm{\,X_{1}MO\,}(\mathbb {R}^n)\) is quite surprising. To show this, we establish the following equivalent characterization, which is the first main result of this article. In what follows, the symbol \(a\rightarrow 0^+\) means that \(a\in (0,\infty )\) and \(a\rightarrow 0\); for any \(x\in \mathbb {R}^n\) and any cube Q of \(\mathbb {R}^n\), \(Q+x:=\{y+x:\ y\in Q\}\).

Theorem 1.2

The following statements are mutually equivalent:

  1. (i)

    \(f\in \mathrm{\,X_{1}MO\,}(\mathbb {R}^n)\);

  2. (ii)

    \(f\in {\mathrm{\,BMO\,}}(\mathbb {R}^n)\) and enjoys the properties that

    (ii)\(_1\):
    $$\begin{aligned} \lim _{a\rightarrow 0^+}\sup _{|Q|=a}{\mathcal {O}}(f; Q)=0; \end{aligned}$$
    (ii)\(_2\):

    for any cube \(Q\subset \mathbb {R}^n\),

    $$\begin{aligned} \lim _{|x|\rightarrow \infty }{\mathcal {O}}(f; Q+x)=0. \end{aligned}$$
  3. (iii)

    \(f\in \mathrm{\,XMO\,}(\mathbb {R}^n)\).
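To illustrate condition (ii)\(_2\) when \(n=1\), the following Python sketch compares two bounded smooth functions. The function `sample_b1`, given by \(x\mapsto \sin (\log (1+x^2))\), is an example chosen here (not taken from the article); its derivative is bounded in absolute value by \(\frac{2|x|}{1+x^2}\), so it belongs to \(B_1(\mathbb {R})\) and \({\mathcal {O}}(f;[x,x+1])\le \frac{2}{x}\) for any \(x\ge 1\). In contrast, for \(\sin \) the mean oscillation over the translates \([-\frac{\pi }{2},\frac{\pi }{2}]+2k\pi \) equals \(\frac{2}{\pi }\) for every k. The helper `mean_oscillation` is a hypothetical midpoint-rule approximation, and finitely many translates can of course only suggest, not prove, the behavior of the limit.

```python
import math


def mean_oscillation(f, a, b, n=20000):
    """Midpoint-rule approximation of O(f; [a, b]) (hypothetical helper)."""
    h = (b - a) / n
    values = [f(a + (i + 0.5) * h) for i in range(n)]
    average = sum(values) / n
    return sum(abs(v - average) for v in values) / n


def sample_b1(x):
    # Bounded and C^1, with derivative tending to 0 at infinity (illustrative choice).
    return math.sin(math.log(1.0 + x * x))


# Translates of Q = [0, 1]: the oscillation of sample_b1 is at most 2 / shift.
for shift in (10.0, 100.0, 1000.0):
    print(shift, mean_oscillation(sample_b1, shift, shift + 1.0), 2.0 / shift)

# Translates of Q = [-pi/2, pi/2] by 2*k*pi: the oscillation of sin does not decay.
for k in (1, 10, 100):
    c = 2 * k * math.pi
    print(k, mean_oscillation(math.sin, c - math.pi / 2, c + math.pi / 2))
```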

As a consequence of Theorem 1.2, we have the following conclusion.

Corollary 1.3

\(\mathrm{\,X_{1}MO\,}(\mathbb {R}^n)=\mathrm{\,XMO\,}(\mathbb {R}^n)\subsetneqq \mathrm{\,VMO\,}(\mathbb {R}^n)\).

Thus, Corollary 1.3 completely answers the open question asked by Torres and Xue in [31].

In order to state another main result of this article, we need to introduce a class of multiple weights. Recall that, usually, a non-negative measurable function w on \(\mathbb {R}^n\) is called a weight on \(\mathbb {R}^n\). For any given \(\mathbf{p }:=(p_1,p_2)\in (1,\infty )\times (1,\infty )\), let p satisfy \(\frac{1}{p}=\frac{1}{p_1}+\frac{1}{p_2}\). Following [3], we call \(\mathbf{w }:=(w_1,w_2)\) a vector \(\mathbf{A }_{\mathbf{p }}(\mathbb {R}^n)\) weight, denoted by \(\mathbf{w }:=(w_1,w_2)\in \mathbf{A }_{\mathbf{p }}(\mathbb {R}^n)\), if

$$\begin{aligned}{}[\mathbf{w }]_{\mathbf{A }_{\mathbf{p }}(\mathbb {R}^n)}&:=\sup _Q\left[ \frac{1}{|Q|}\int _Q w(x)\,dx\right] \left\{ \frac{1}{|Q|} \int _Q \left[ w_1(x)\right] ^{1-p_1'}\,dx\right\} ^{\frac{p}{p_1'}}\\&\quad \times \left\{ \frac{1}{|Q|}\int _Q \left[ w_2(x)\right] ^{1-p_2'}\,dx\right\} ^{\frac{p}{p_2'}} \end{aligned}$$

is finite, where \(w:=w_1^{p/p_1}w_2^{p/p_2}\), \(\frac{1}{p_1}+\frac{1}{p_1'}=1=\frac{1}{p_2}+\frac{1}{p_2'}\), and the supremum is taken over all cubes Q of \(\mathbb {R}^n\). In what follows, for any given weight w on \(\mathbb {R}^n\) and any given measurable subset \(E\subset \mathbb {R}^n\), the symbol \(L^p_w(E)\) denotes the set of all measurable functions f on E such that

$$\begin{aligned} \Vert f\Vert _{L^p_w(E)}:=\left[ \int _E |f(x)|^p w(x)\,dx\right] ^\frac{1}{p}<\infty . \end{aligned}$$

Now, we state our second main result of this article on an application of \(\mathrm{\,XMO\,}(\mathbb {R}^n)\) as follows.

Theorem 1.4

Let \(\mathbf{p }:=(p_1,p_2)\in (1,\infty )\times (1,\infty )\), \(p\in \big (\frac{1}{2},\infty \big )\) with \(\frac{1}{p}=\frac{1}{p_1}+\frac{1}{p_2}\), \(\mathbf{w }:=(w_1,w_2)\in \mathbf{A }_{\mathbf{p }}(\mathbb {R}^n)\), \(w:=w_1^{p/p_1}w_2^{p/p_2}\), \(b\in \mathrm{\,XMO\,}(\mathbb {R}^n)\), and T be a bilinear Calderón–Zygmund operator whose kernel satisfies (1.1) and (1.2). Then, for any \(i\in \{1,2\}\), the bilinear commutator \([b,T]_i\) as in (1.3) or (1.4) is compact from \({L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\) to \({L_w^p(\mathbb {R}^n)}\).

Remark 1.5

We have the following comments towards the conclusions of Theorem 1.4.

  1. (i)

    Although we state and prove Theorem 1.4 in the bilinear case, this theorem can indeed be extended to the linear or multilinear case, with the usual modifications and some notational complications. For instance, if \(b\in \mathrm{\,XMO\,}(\mathbb {R}^n)\) and T is a linear Calderón–Zygmund operator whose kernel K satisfies that, for any given \(\alpha :=(\alpha _1,\ldots ,\alpha _{2n})\in \mathbb {Z}_+^{2n}\) with \(|\alpha |:=\alpha _1+\cdots +\alpha _{2n}\le 1\), and any \(x,\ y\in \mathbb {R}^n\) with \(x\ne y\),

    $$\begin{aligned} |D^\alpha K(x,y)|\le C_{(\alpha )} |x-y|^{-n-|\alpha |} \end{aligned}$$

    and, for any \(x,\ y\in \mathbb {R}^n\) with \(|x-y|\ge 1\),

    $$\begin{aligned} |K(x,y)|\le C |x-y|^{-n-2-\delta }, \end{aligned}$$

    where \(D^\alpha :=\big (\frac{\partial }{\partial x_1}\big )^{\alpha _1}\cdots \big (\frac{\partial }{\partial x_{2n}}\big )^{\alpha _{2n}}\), and \(C_{(\alpha )}\), C, and \(\delta \) are some positive constants, then \([b,T]\) is compact on \(L^p_w(\mathbb {R}^n)\) for any given \(p\in (1,\infty )\) and \(w\in A_p(\mathbb {R}^n)\). Furthermore, observe that the proof of Theorem 1.4 mainly depends on the boundedness of Calderón–Zygmund operators and the Hardy–Littlewood maximal operator. Therefore, Theorem 1.4 can also be extended to Morrey spaces; see, for instance, [30].

  2. (ii)

    The corresponding compactness result in [31, Theorem 1.1] requires that the kernel K satisfies both (1.1) and the following additional estimates: for any given \(\alpha \in \mathbb {Z}_+^{3n}\), with \(|\alpha |\le 1\), and for any given \(N\in \{1,2,3\}\), there exists a positive constant \(C_{(\alpha , N)}\), depending on \(\alpha \) and N, such that, for any \(|x-y|+|x-z|>1\),

    $$\begin{aligned} |D^\alpha K(x,y,z)|\le C_{(\alpha , N)}(|x-y|+|x-z|)^{-2n-N}. \end{aligned}$$
    (1.5)

    But, the assumption (1.2) in Theorem 1.4 only needs \(\alpha ={\vec {0}}_{3n}\) and \(N=2+\delta \) in (1.5). Thus, in this sense, even the unweighted case of Theorem 1.4 also optimizes and hence improves the corresponding result in [31].

  3. (iii)

    Bényi et al. [3, Theorem 1.1] obtained the compactness of weighted compact bilinear operators via \(\mathrm{\,CMO\,}(\mathbb {R}^n)\), which states that, if \(b\in \mathrm{\,CMO\,}(\mathbb {R}^n)\) and T is a bilinear Calderón–Zygmund operator whose kernel K satisfies (1.1), then the bilinear commutators \(\{[b,T]_i\}_{i=1}^2\) are compact from \({L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\) to \({L_w^p(\mathbb {R}^n)}\). From this and Proposition 2.4 below, we deduce that

    $$\begin{aligned} {\left\{ \begin{array}{ll} T \mathrm{\ satisfies\ } (1.1)\\ b \mathrm{\ satisfies\ (i),\ (ii),\ and\ (iii)\ of\ Proposition\ 2.4} \end{array}\right. } \end{aligned}$$
    (1.6)

    implies that \(\{[b,T]_i\}_{i=1}^2\) are compact. On the other hand, by Theorems 1.2 and 1.4, and Proposition 2.4 below, we conclude that

    $$\begin{aligned} {\left\{ \begin{array}{ll} T \mathrm{\ satisfies\ (1.1)\mathrm{\ and\ }(1.2)}\\ b \mathrm{\ satisfies\ (i)\ and\ (ii)\ of\ Proposition\ 2.4} \end{array}\right. } \end{aligned}$$
    (1.7)

    implies that \(\{[b,T]_i\}_{i=1}^2\) are compact. Therefore, in (1.6), if we make an additional assumption (1.2) on T, and drop the condition (iii) of Proposition 2.4 on b, then it coincides with (1.7). Thus, the two results are consistent with each other. Besides, [3, Theorem 1.1] requires \(p:=\frac{p_1 p_2}{p_1+p_2}>1\) because they used the weighted Fréchet–Kolmogorov theorem on \(L^p_w(\mathbb {R}^n)\) with \(p\in (1,\infty )\). However, thanks to [34, Theorem 1.1], which is re-stated as Lemma 3.2 below, we can optimize and hence improve this range to \(p\in (\frac{1}{2},\infty )\) in Theorem 1.4.

  4. (iv)

    Chaffee et al. [8, Theorem 3.1] proved that, letting \(p_1,\ p_2\in (1,\infty )\), \(p:=\frac{p_1 p_2}{p_1+p_2}>\frac{1}{2}\), and \(\{{\mathcal {R}}_j^k:\ j\in \{1,2\}\ \mathrm{and}\ k\in \{1,\ldots ,n\}\}\) be the bilinear Riesz transforms defined by setting, for any given \(k\in \{1,\ldots ,n\}\) and any \(x:=(x_1,\ldots ,x_n)\in \mathbb {R}^n\),

    $$\begin{aligned} {\mathcal {R}}_1^k(f,g)(x) :=\mathrm{p.\,v.}\int _{\mathbb {R}^{2n}}\frac{x_k-y_k}{(|x-y|^2+|x-z|^2)^{n+\frac{1}{2}}} f(y)g(z)\,dy\,dz \end{aligned}$$

    and

    $$\begin{aligned} {\mathcal {R}}_2^k(f,g)(x) :=\mathrm{p.\,v.}\int _{\mathbb {R}^{2n}}\frac{x_k-z_k}{(|x-y|^2+|x-z|^2)^{n+\frac{1}{2}}} f(y)g(z)\,dy\,dz, \end{aligned}$$

    then, for any \(i,\, j\in \{1,2\}\) and \(k\in \{1,\ldots ,n\}\), \([b,{\mathcal {R}}_j^k]_i\) is compact from \(L^{p_1}(\mathbb {R}^n)\times L^{p_2}(\mathbb {R}^n)\) to \(L^{p}(\mathbb {R}^n)\) if and only if \(b\in \mathrm{\,CMO\,}(\mathbb {R}^n)\). Moreover, as a bilinear counterpart of [33, Theorem 2], Chaffee et al. [8, Remark 3.2] pointed out that [8, Theorem 3.1] also holds true if the bilinear Riesz transform is replaced by any more general bounded convolutional bilinear operator with the smooth kernel

    $$\begin{aligned} \frac{\Omega \Big (\frac{(y,z)}{|(y,z)|}\Big )}{(|y|^2+|z|^2)^n}, \end{aligned}$$
    (1.8)

    where \((y,z)\in \mathbb {R}^n\times \mathbb {R}^n\setminus \{{\vec {0}}_{2n}\}\), \(\Omega \) is a homogeneous function of degree zero defined on the unit sphere in \(\mathbb {R}^n\times \mathbb {R}^n\) and is sufficiently smooth. The main difference between the aforementioned results of Chaffee et al. and Theorem 1.4 is that the bilinear Riesz transform, or the Calderón–Zygmund operator with kernel of the form (1.8), does not satisfy (1.2) and, conversely, the operator T in Theorem 1.4 surely does not have the form (1.8). Thus, the operators considered, respectively, in the aforementioned results of Chaffee et al. and in Theorem 1.4 form two completely different classes of operators, and hence the corresponding theorems are also completely unrelated. Besides, it is still a challenging open problem to find a class of bilinear Calderón–Zygmund operators \({\widetilde{T}}\), whose kernels satisfy (1.1) and (1.2), such that \(\{[b,{\widetilde{T}}]_i\}_{i=1}^2\) are compact from \(L^{p_1}(\mathbb {R}^n)\times L^{p_2}(\mathbb {R}^n)\) to \(L^{p}(\mathbb {R}^n)\) if and only if \(b\in \mathrm{\,XMO\,}(\mathbb {R}^n)\), where \(p_1,\ p_2\in (1,\infty )\) and \(p\in \big (\frac{1}{2},\infty \big )\) satisfy \(\frac{1}{p}=\frac{1}{p_1}+\frac{1}{p_2}\).

The remainder of this article is organized as follows.

In Sect. 2, we first notice the nontriviality of \(\mathrm{\,XMO\,}(\mathbb {R}^n)\) when \(n=1\), namely,

$$\begin{aligned} \mathrm{\,XMO\,}(\mathbb {R})\subsetneqq \mathrm{\,VMO\,}(\mathbb {R}); \end{aligned}$$

see Proposition 2.1 below. Based on its calculation, we further show that \(\mathrm{\,XMO\,}(\mathbb {R}^n)\) has an equivalent characterization similar to those of \(\mathrm{\,VMO\,}(\mathbb {R}^n)\) and \(\mathrm{\,CMO\,}(\mathbb {R}^n)\); see Theorem 1.2. To achieve this, geometrically inspired by Uchiyama [33], we first approximate a given function \(f\in \mathrm{\,XMO\,}(\mathbb {R}^n)\) by an exceptional simple function \(g_\epsilon \) which is constructed based on a dyadic family \({\mathcal {F}}\), and some essential new techniques on dyadic cubes. These new techniques provide some exponential decay property of the mean oscillation \({\mathcal {O}}(f;Q)\) when Q is far away from the origin. Roughly speaking, \({\mathcal {F}}\) consists of numerous small equal-size dyadic cubes near the origin and, farther away from the origin, larger and larger dyadic cubes. Moreover, by the convolution of \(g_\epsilon \) and an even function \(\varphi \) with delicate dilation which strongly depends on \(\epsilon \) and some exquisite geometrical observations of \({\mathcal {F}}\), we construct an approximation element \(h_\epsilon \) of f in the \({\mathrm{\,BMO\,}}(\mathbb {R}^n)\) norm. To prove \(h_\epsilon \in B_\infty (\mathbb {R}^n)\), we use a key analytic technique, namely, we first prove \(\lim _{|x|\rightarrow \infty }D^\alpha h_\epsilon (x)=0\) whenever \(|\alpha |\) is odd via the aforementioned exponential decay property; from this and the Taylor remainder theorem, we then deduce \(\lim _{|x|\rightarrow \infty }D^\alpha h_\epsilon (x)=0\) whenever \(|\alpha |\) is even, which further implies that \(h_\epsilon \in B_\infty (\mathbb {R}^n)\) and finally completes the proof of Theorem 1.2. As a corollary, we obtain

$$\begin{aligned} \mathrm{\,X_{1}MO\,}(\mathbb {R}^n)=\mathrm{\,XMO\,}(\mathbb {R}^n)\subsetneqq \mathrm{\,VMO\,}(\mathbb {R}^n)\,\, \end{aligned}$$

in Corollary 1.3 below, which completely answers the open question raised in [31].

In Sect. 3, we give the proof of Theorem 1.4. Since a general \(A_p\) weight is not invariant under translations, the method in [31] cannot be applied to the weighted setting directly. Thus, to overcome this difficulty, a main new idea is to change the dominations of the translation-invariant positive operators in [31] into the dominations of both the maximal functions and the smooth truncated Calderón–Zygmund operators. To this end, we use several smooth truncation techniques and the density arguments of compact operators. In particular, using this method, we can also optimize [31, Theorem 1.1] from “K satisfies (1.5)” to “K satisfies (1.2)” even in the unweighted case.

Throughout this article, the origin of \(\mathbb {R}^n\) is denoted by \({\vec {0}}_n\). We denote by C and \({\widetilde{C}}\) positive constants which are independent of the main parameters, but they may vary from line to line. Moreover, we use \(C_{(\gamma ,\ \beta ,\ \ldots )}\) to denote a positive constant depending on the indicated parameters \(\gamma ,\ \beta ,\ \ldots \). Constants with subscripts, such as \(C_{0}\) and \(A_1\), do not change in different occurrences. Furthermore, the symbol \(f\lesssim g\) represents that \(f\le Cg\) for some positive constant C. If \(f\lesssim g\) and \(g\lesssim f\), we then write \(f\sim g\). If \(f\le Cg\) and \(g=h\) or \(g\le h\), we then write \(f\lesssim g\sim h\) or \(f\lesssim g\lesssim h\), rather than \(f\lesssim g=h\) or \(f\lesssim g\le h\). Let \(\mathbb {N}:=\{1,\,2,\ldots \}\) and \(\mathbb {Z}_+:=\mathbb {N}\cup \{0\}\). For any \(p\in [1,\infty ]\), let \(p'\) denote its conjugate index, that is, \(p'\) satisfies \(1/p+1/p'=1\). For any cube Q of \(\mathbb {R}^n\) and \(f\in L_{{\mathrm{\,loc\,}}}^1(\mathbb {R}^n)\), let

$$\begin{aligned} \fint _Q:=\frac{1}{|Q|}\int _Q\quad \mathrm{and}\quad f_Q:=\fint _Q f(y)\,dy; \end{aligned}$$

moreover, the mean oscillation \({\mathcal {O}}(f;Q)\) is defined by setting

$$\begin{aligned} {\mathcal {O}}(f;Q):=\fint _Q\left| f(x)-f_Q\right| \,dx. \end{aligned}$$

2 Characterization and Non-triviality of \(\mathrm{\,XMO\,}(\mathbb {R}^n)\)

In this section, we investigate the equivalent characterization of \(\mathrm{\,XMO\,}(\mathbb {R}^n)\). To this end, we begin with the following concise counterexample on the real line.

Proposition 2.1

There exists some \(f\in \mathrm{\,VMO\,}(\mathbb {R})\setminus \mathrm{\,XMO\,}(\mathbb {R})\).

Proof

For any \(x\in \mathbb {R}\), let \(f(x):=\sin (x)\). Then f is uniformly continuous and \(f\in L^\infty (\mathbb {R})\subset {\mathrm{\,BMO\,}}(\mathbb {R})\). Thus, \(f\in \mathrm{\,VMO\,}(\mathbb {R})\). We claim that, for any \(g\in B_1(\mathbb {R})\),

$$\begin{aligned} \Vert f-g\Vert _{{\mathrm{\,BMO\,}}(\mathbb {R})}\ge \frac{1}{2\pi }. \end{aligned}$$
(2.1)

Indeed, for any \(k\in \mathbb {N}\), let

$$\begin{aligned} I_k:=\left[ 2k\pi -\frac{\pi }{2},2k\pi +\frac{\pi }{2}\right] . \end{aligned}$$

Since \(g\in B_1(\mathbb {R})\), it follows that \(\lim _{|x|\rightarrow \infty }g'(x)=0\) and hence we can choose k large enough such that, for any \(y\in I_k\),

$$\begin{aligned} |g'(y)|<\frac{4}{\pi ^2}<\frac{\sqrt{2}}{\pi }. \end{aligned}$$
(2.2)

Therefore, by the mean value theorem and the fundamental theorem of calculus, we have

$$\begin{aligned} {\mathcal {O}}(f-g;I_k)&=\frac{1}{|I_k|}\int _{I_k}\left| (f-g)(x)-(f-g)_{I_k}\right| \,dx\nonumber \\&=\frac{1}{|I_k|}\int _{I_k}\left| (f-g)(x)-(f-g)(\xi _k)\right| \,dx\nonumber \\&=\frac{1}{|I_k|}\int _{I_k}\left| \int _{\xi _k}^x(f-g)'(y)\,dy\right| \,dx\nonumber \\&=\frac{1}{|I_k|}\int _{I_k}\left| \int _{\xi _k}^x\left[ \cos (y)-g'(y)\right] \,dy\right| \,dx, \end{aligned}$$
(2.3)

where \(\xi _k\in I_k\) is independent of x, but it may depend on k. Without loss of generality, we may assume that \(\xi _k\in \big [2k\pi -\frac{\pi }{2},2k\pi \big ]\). Then, from (2.3), we deduce that

$$\begin{aligned} {\mathcal {O}}(f-g;I_k)\ge \frac{1}{\pi }\int _{2k\pi }^{2k\pi +\frac{\pi }{2}} \left| \int _{\xi _k}^x\left[ \cos (y)-g'(y)\right] \,dy\right| \,dx. \end{aligned}$$
(2.4)

By the fact that \(\xi _k\in \big [2k\pi -\frac{\pi }{2},2k\pi \big ]\) and \(x\in \big [2k\pi ,2k\pi +\frac{\pi }{2}\big ]\), we know that \(\frac{x+\xi _k}{2}\in \big [2k\pi -\frac{\pi }{4},2k\pi +\frac{\pi }{4}\big ]\) and \(\frac{x-\xi _k}{2}\in \big [0,\frac{\pi }{2}\big ]\). Therefore, we obtain

$$\begin{aligned} \cos \left( \frac{x+\xi _k}{2}\right) \ge \frac{\sqrt{2}}{2}\quad {\mathrm {and}}\quad \sin \left( \frac{x-\xi _k}{2}\right) \ge \frac{2}{\pi }\frac{x-\xi _k}{2}=\frac{x-\xi _k}{\pi }. \end{aligned}$$
(2.5)

From (2.2) and (2.5), it follows that

$$\begin{aligned}&\int _{\xi _k}^x\left[ \cos (y)-g'(y)\right] \,dy\nonumber \\&\qquad \qquad \quad >\int _{\xi _k}^x\left[ \cos (y)-\frac{\sqrt{2}}{\pi }\right] \,dy =\sin (x)-\sin (\xi _k)-\frac{\sqrt{2}}{\pi }(x-\xi _k)\nonumber \\&\qquad \qquad \quad =2\cos \left( \frac{x+\xi _k}{2}\right) \sin \left( \frac{x-\xi _k}{2}\right) -\frac{\sqrt{2}}{\pi }(x-\xi _k)\nonumber \\&\qquad \qquad \quad \ge 2\frac{\sqrt{2}}{2}\frac{x-\xi _k}{\pi }-\frac{\sqrt{2}}{\pi }(x-\xi _k) \ge 0. \end{aligned}$$
(2.6)

By (2.4), (2.6), \(\xi _k\in \big [2k\pi -\frac{\pi }{2},2k\pi \big ]\), and (2.2), we conclude that

$$\begin{aligned} {\mathcal {O}}(f-g;I_k)&\ge \frac{1}{\pi }\int _{2k\pi }^{2k\pi +\frac{\pi }{2}} \int _{\xi _k}^x\left[ \cos (y)-g'(y)\right] \,dy\,dx\\&\ge \frac{1}{\pi }\int _{2k\pi }^{2k\pi +\frac{\pi }{2}} \int _{2k\pi }^x\left[ \cos (y)-\frac{4}{\pi ^2}\right] \,dy\,dx\\&=\frac{1}{\pi }\int _{2k\pi }^{2k\pi +\frac{\pi }{2}} \left[ \sin (x)-\frac{4}{\pi ^2}(x-2k\pi )\right] \,dx\\&=\frac{1}{\pi }\left( 1-\frac{4}{\pi ^2}\int _0^\frac{\pi }{2}z\,dz\right) =\frac{1}{\pi }\left[ 1-\frac{4}{\pi ^2}\frac{1}{2}\left( \frac{\pi }{2}\right) ^2 \right] =\frac{1}{2\pi }. \end{aligned}$$

This implies that the inequality (2.1) holds true, which completes the proof of Proposition 2.1. \(\square \)
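The lower bound obtained in the above proof can be checked numerically for sample functions g. In the following Python sketch, the two choices of g, namely \(\arctan \) and \(x\mapsto \sin (\log (1+x^2))\), are illustrative assumptions made here: both are bounded and continuously differentiable with derivatives tending to 0 at infinity, so they belong to \(B_1(\mathbb {R})\), and both satisfy (2.2) on \(I_k\) for any \(k\ge 2\). The helper names are hypothetical, and a finite numerical check is only a sanity test of the estimate \({\mathcal {O}}(f-g;I_k)\ge \frac{1}{2\pi }\), not a substitute for the proof.

```python
import math


def mean_oscillation(f, a, b, n=4000):
    """Midpoint-rule approximation of O(f; [a, b]) (hypothetical helper)."""
    h = (b - a) / n
    values = [f(a + (i + 0.5) * h) for i in range(n)]
    average = sum(values) / n
    return sum(abs(v - average) for v in values) / n


def lower_bound_holds(g, k):
    """Check O(sin - g; I_k) >= 1/(2*pi) on I_k = [2k*pi - pi/2, 2k*pi + pi/2]."""
    a = 2 * k * math.pi - math.pi / 2
    b = 2 * k * math.pi + math.pi / 2
    osc = mean_oscillation(lambda x: math.sin(x) - g(x), a, b)
    return osc >= 1 / (2 * math.pi)


# Illustrative elements of B_1(R); neither choice comes from the article.
samples = {
    "arctan": math.atan,
    "sin(log(1+x^2))": lambda x: math.sin(math.log(1.0 + x * x)),
}

for name, g in samples.items():
    print(name, all(lower_bound_holds(g, k) for k in range(2, 21)))
```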

Remark 2.2

One can modify the above calculation from \(\mathbb {R}\) to \(\mathbb {R}^n\), but this process may be tedious. However, if, for any \((x_1,\ldots ,x_n)\in \mathbb {R}^n\), we let

$$\begin{aligned} f(x_1,\ldots ,x_n):=\prod _{k=1}^n\sin (x_k) \end{aligned}$$

then, by Theorem 1.2, we immediately know that

$$\begin{aligned} f\in \mathrm{\,VMO\,}(\mathbb {R}^n)\setminus \mathrm{\,XMO\,}(\mathbb {R}^n); \end{aligned}$$

see the proof of Corollary 1.3 below.
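For \(n=2\), the failure of condition (ii)\(_2\) of Theorem 1.2 for this product can be observed numerically. On the cube \(Q:=[-\frac{\pi }{2},\frac{\pi }{2}]^2\) translated by \((2k\pi ,2k\pi )\), the average of f vanishes by oddness and the mean oscillation equals \((\frac{2}{\pi })^2\) for every k. The following Python sketch approximates it on a midpoint grid; the helper name `mean_oscillation_2d` and the grid size are assumptions made here for illustration only.

```python
import math


def mean_oscillation_2d(f, center, half_side, n=200):
    """Midpoint-grid approximation of O(f; Q) for the square Q with the given
    center and half side length (hypothetical helper)."""
    h = 2 * half_side / n
    xs = [center[0] - half_side + (i + 0.5) * h for i in range(n)]
    ys = [center[1] - half_side + (j + 0.5) * h for j in range(n)]
    values = [f(x, y) for x in xs for y in ys]
    average = sum(values) / len(values)
    return sum(abs(v - average) for v in values) / len(values)


def product_of_sines(x, y):
    return math.sin(x) * math.sin(y)


# Translates of Q by (2*k*pi, 2*k*pi): the oscillation stays near (2/pi)^2.
for k in (1, 5, 25):
    c = 2 * k * math.pi
    print(k, mean_oscillation_2d(product_of_sines, (c, c), math.pi / 2))
print("reference value (2/pi)^2:", (2 / math.pi) ** 2)
```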

In what follows, we need to use the following equivalent characterizations of \(\mathrm{\,VMO\,}(\mathbb {R}^n)\) and \(\mathrm{\,CMO\,}(\mathbb {R}^n)\) established by Sarason [28] and Uchiyama [33], respectively.

Proposition 2.3

([28, Theorem 1]) Let \(f\in {\mathrm{\,BMO\,}}(\mathbb {R}^n)\). Then \(f\in \mathrm{\,VMO\,}(\mathbb {R}^n)\) if and only if

$$\begin{aligned} \lim _{a\rightarrow 0^+}\sup _{|Q|=a}{\mathcal {O}}(f; Q)=0. \end{aligned}$$

Proposition 2.4

([33, p. 166]) Let \(f\in {\mathrm{\,BMO\,}}(\mathbb {R}^n)\). Then \(f\in \mathrm{\,CMO\,}(\mathbb {R}^n)\) if and only if f satisfies the following three conditions:

  1. (i)
    $$\begin{aligned} \lim _{a\rightarrow 0^+}\sup _{|Q|=a}{\mathcal {O}}(f; Q)=0; \end{aligned}$$
  2. (ii)

    for any cube \(Q\subset \mathbb {R}^n\),

    $$\begin{aligned} \lim _{|x|\rightarrow \infty }{\mathcal {O}}(f; Q+x)=0; \end{aligned}$$
  3. (iii)
    $$\begin{aligned} \lim _{a\rightarrow \infty }\sup _{|Q|=a}{\mathcal {O}}(f; Q)=0. \end{aligned}$$

Observe that, in the proof of Proposition 2.1, the mean oscillations \(\{{\mathcal {O}}(f;I_k)\}_{k\in \mathbb {N}}\) violate Proposition 2.4(ii), which leads us to consider the limit condition (ii)\(_2\) of Theorem 1.2(ii).

Now, we are in a position to give the proof of Theorem 1.2. In what follows, for any complex number \(z:=a+ib\) with \(a,\ b\in \mathbb {R}\), let \(\mathrm{Re}z:=a\) and \(\mathrm{Im}z:=b\).

Proof of Theorem 1.2

We first prove (i) \(\Longrightarrow \) (ii). By the density argument, it suffices to show that, for any \(f\in B_1(\mathbb {R}^n)\), both (ii)\(_1\) and (ii)\(_2\) of Theorem 1.2(ii) hold true. Indeed, for any \(x,\ y\in \mathbb {R}^n\), by the mean value theorem, we obtain

$$\begin{aligned} |f(x)-f(y)|&=\left| \mathrm{Re}f(x)-\mathrm{Re}f(y)+i\left[ \mathrm{Im}f(x)-\mathrm{Im}f(y)\right] \right| \\&=\left| \nabla ( \mathrm{Re}f)(\xi _1)\cdot (x-y) +i\nabla (\mathrm{Im}f)(\xi _2)\cdot (x-y)\right| \\&\le \left[ \left\| \nabla (\mathrm{Re}f)\right\| _{L^\infty (\mathbb {R}^n)} +\left\| \nabla (\mathrm{Im}f)\right\| _{L^\infty (\mathbb {R}^n)}\right] |x-y|\\&\le 2\Vert \nabla f\Vert _{L^\infty (\mathbb {R}^n)}|x-y|, \end{aligned}$$

where \(\xi _1,\ \xi _2\) belong to the segment \({\overline{xy}}\) connecting x and y, and \(\Vert \nabla f\Vert _{L^\infty (\mathbb {R}^n)}<\infty \) because \(f\in B_1(\mathbb {R}^n)\). This implies that \(f\in C_{\mathrm{u}}(\mathbb {R}^n)\) and hence \(f\in \mathrm{\,VMO\,}(\mathbb {R}^n)\). From this and Proposition 2.3, it follows that f satisfies (ii)\(_1\) of Theorem 1.2(ii). Moreover, for any fixed cube Q of \(\mathbb {R}^n\), by the mean value theorem again, we conclude that

$$\begin{aligned} {\mathcal {O}}(f;Q)&=\frac{1}{|Q|}\int _Q\left| f(x)-\frac{1}{|Q|}\int _Q f(y)\,dy\right| \,dx\\&\le \frac{1}{|Q|^2}\int _Q\int _Q|f(x)-f(y)|\,dy\,dx\\&=\frac{1}{|Q|^2}\int _Q\int _Q\left| \nabla ( \mathrm{Re}f)(\xi _1)\cdot (x-y) +i\nabla (\mathrm{Im}f)(\xi _2)\cdot (x-y)\right| \,dy\,dx\\&\lesssim \frac{1}{|Q|^2}\int _Q\int _Q\left[ \sup _{z\in Q}|\nabla f(z)|\right] |Q|^\frac{1}{n}\,dy\,dx \sim \left[ \sup _{z\in Q}|\nabla f(z)|\right] |Q|^\frac{1}{n}. \end{aligned}$$

Thus, for the given cube Q of \(\mathbb {R}^n\) and any \(x\in \mathbb {R}^n\), we have

$$\begin{aligned} {\mathcal {O}}(f;Q+x)\lesssim \left[ \sup _{z\in Q+x}|\nabla f(z)|\right] |Q+x|^\frac{1}{n} \sim \left[ \sup _{z\in Q+x}|\nabla f(z)|\right] |Q|^\frac{1}{n}\rightarrow 0 \end{aligned}$$

as \(|x|\rightarrow \infty \), which shows that f satisfies (ii)\(_2\) of Theorem 1.2(ii). This finishes the proof that (i) \(\Longrightarrow \) (ii).

Notice that one can always apply the mean value theorem separately to the real and the imaginary parts of a complex-valued function. Therefore, for simplicity of presentation and without loss of generality, in what follows, we may always assume that the function under consideration is real valued when we apply the mean value theorem.

Next, we prove (ii) \(\Longrightarrow \) (iii). Let \(f\in {\mathrm{\,BMO\,}}(\mathbb {R}^n)\) satisfy both (ii)\(_1\) and (ii)\(_2\) of Theorem 1.2(ii). To prove \(f\in \mathrm{\,XMO\,}(\mathbb {R}^n)\), for any fixed \(\epsilon \in (0,\infty )\), it suffices to show that there exist a simple function \(g_\epsilon \) satisfying

$$\begin{aligned} \Vert f-g_\epsilon \Vert _{{\mathrm{\,BMO\,}}(\mathbb {R}^n)}\lesssim \epsilon , \end{aligned}$$
(2.7)

and a function \(h_\epsilon \in B_\infty (\mathbb {R}^n)\) satisfying

$$\begin{aligned} \Vert g_\epsilon -h_\epsilon \Vert _{{\mathrm{\,BMO\,}}(\mathbb {R}^n)}\lesssim \epsilon . \end{aligned}$$
(2.8)

The remainder of the proof that (ii) \(\Longrightarrow \) (iii) consists of the following three steps.

Step (i):

Construct a family \({\mathcal {F}}\) of disjoint dyadic cubes and introduce a simple function \(g_\epsilon \) via \({\mathcal {F}}\).

Step (ii):

Show that (2.7) holds true.

Step (iii):

Define \(h_\epsilon \) via \(g_\epsilon \), and then show that (2.8) holds true and \(h_\epsilon \in B_\infty (\mathbb {R}^n)\).

We proceed in order and begin with Step (i). For the above given \(\epsilon \in (0,\infty )\), by (ii)\(_1\) of Theorem 1.2(ii), we know that there exists a negative integer \(j(\epsilon ;0)\in \mathbb {Z}_-:=\{-1,-2,\ldots \}\) such that, for any cube Q with the side length \(\ell (Q)<2^{j(\epsilon ;0)+1}\),

$$\begin{aligned} {\mathcal {O}}(f;Q)<\epsilon . \end{aligned}$$
(2.9)

Here and thereafter, we denote the side length of any cube Q by \(\ell (Q)\). Besides, we always use \(Q(x,r)\) to denote the cube centered at x with the side length 2r, and \({\mathcal {D}}\) to denote the family of all classical dyadic cubes in \(\mathbb {R}^n\). By (ii)\(_2\) of Theorem 1.2(ii), we find that there exists some \(j(\epsilon ;1)\in \mathbb {Z}\) with \(j(\epsilon ;1)>j(\epsilon ;0)\) such that, for any \(x\in \mathbb {R}^n\) with \(|x|\ge 2^{j(\epsilon ;1)}\),

$$\begin{aligned} {\mathcal {O}}(f;Q(x,2^{j(\epsilon ;0)+1}))<2^{j(\epsilon ;0)}\epsilon \le 2^{-1}\epsilon <\epsilon . \end{aligned}$$
(2.10)

Repeating the above procedure, we find that, for any \(k\in \mathbb {N}\), there exists some \(j(\epsilon ;k)\in \mathbb {Z}\) with \(j(\epsilon ;k)>j(\epsilon ;k-1)>\cdots >j(\epsilon ;0)\) such that, for any \(x\in \mathbb {R}^n\) with \(|x|\ge 2^{j(\epsilon ;k)}\),

$$\begin{aligned} {\mathcal {O}}(f;Q(x,2^{j(\epsilon ;0)+k}))<2^{kj(\epsilon ;0)}\epsilon <\epsilon . \end{aligned}$$
(2.11)

Now, define \(\{{\mathcal {F}}_k\}_{k\in \mathbb {N}}\) and \({\mathcal {F}}\), respectively, by setting

$$\begin{aligned} {\mathcal {F}}_1:=&\left\{ Q\subset \overline{Q({\vec {0}}_n,2^{j(\epsilon ;1)})}: \,\,Q\in {\mathcal {D}}\mathrm {\,\,with\,\,}\ell (Q)=2^{j(\epsilon ;0)} \right\} ,\\ {\mathcal {F}}_2:=&\left\{ Q\subset \overline{Q({\vec {0}}_n,2^{j(\epsilon ;2)}) \setminus Q({\vec {0}}_n,2^{j(\epsilon ;1)})}: \,\,Q\in {\mathcal {D}}\mathrm {\,\,with\,\,}\ell (Q)=2^{j(\epsilon ;0)+1} \right\} ,\\ \vdots \\ {\mathcal {F}}_k:=&\left\{ Q\subset \overline{Q({\vec {0}}_n,2^{j(\epsilon ;k)}) \setminus Q({\vec {0}}_n,2^{j(\epsilon ;k-1)})}: \,\,Q\in {\mathcal {D}}\mathrm {\,\,with\,\,}\ell (Q)=2^{j(\epsilon ;0)+k-1} \right\} ,\\ \vdots \end{aligned}$$

and

$$\begin{aligned} {\mathcal {F}}:=\bigcup _{k\in \mathbb {N}}{\mathcal {F}}_k, \end{aligned}$$

here and thereafter, for any subset A of \(\mathbb {R}^n\), we use \(\overline{A}\) to denote its closure in \(\mathbb {R}^n\). Then, for any \(k\in \mathbb {N}\), \({\mathcal {F}}_k\) contains disjoint cubes with the same side length and hence \({\mathcal {F}}\) is a family of disjoint dyadic cubes. Next, we introduce the simple function \(g_\epsilon \) associated with \({\mathcal {F}}\) as follows. Since the cubes in \({\mathcal {F}}\) are disjoint, it follows that, for any \(x\in \mathbb {R}^n\), there exists a unique cube \(Q_{(x)}\in {\mathcal {F}}\) such that \(Q_{(x)}\ni x\); let

$$\begin{aligned} g_\epsilon (x):=f_{Q_{(x)}}:=\frac{1}{|Q_{(x)}|}\int _{Q_{(x)}}f(y)\,dy. \end{aligned}$$

Then \(g_\epsilon \) is a simple function on \(\mathbb {R}^n\). This finishes the proof of Step (i).
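To make the construction of Step (i) concrete, the following Python sketch builds the family \({\mathcal {F}}\) in dimension one and evaluates the associated simple function. It is only an illustration under stated assumptions: the list `j` is a hypothetical stand-in for the integers \(j(\epsilon ;0)<j(\epsilon ;1)<\cdots \) (which in the proof are produced by (2.9) through (2.11), not chosen by hand), the family is truncated after finitely many layers, and the names `dyadic_interval`, `cube_of`, and `g_eps` are ours.

```python
import math

def dyadic_interval(x, level):
    """Return the half-open dyadic interval [m*2**level, (m+1)*2**level) containing x."""
    side = 2.0 ** level
    m = math.floor(x / side)
    return (m * side, (m + 1) * side)

def cube_of(x, j):
    """Return the interval of the truncated family F that contains x (1-D case).

    j = [j0, j1, ..., jK] is a strictly increasing list of integers with j0 < 0;
    these are hypothetical stand-ins for j(eps;0), j(eps;1), ... in the proof.
    """
    for k in range(1, len(j)):
        radius = 2.0 ** j[k]
        if -radius <= x < radius:
            # First layer containing x: it is tiled by dyadic
            # intervals of side length 2**(j0 + k - 1).
            return dyadic_interval(x, j[0] + k - 1)
    raise ValueError("x lies outside the truncated family")

def g_eps(f, x, j, samples=1000):
    """Average of f over the interval of F containing x (midpoint rule)."""
    a, b = cube_of(x, j)
    h = (b - a) / samples
    return sum(f(a + (i + 0.5) * h) for i in range(samples)) / samples

if __name__ == "__main__":
    j = [-2, 1, 3, 4]            # illustrative thresholds only
    print(cube_of(0.3, j))       # interval of side 2**-2 near the origin
    print(cube_of(9.0, j))       # interval of side 2**0 in the third layer
    print(g_eps(math.sin, 9.0, j))
```

Because the integers in `j` are strictly increasing, each radius \(2^{j_k}\) is a multiple of the side length used in the adjacent layers, so the selected dyadic interval never straddles two layers; this mirrors the disjointness of the cubes in \({\mathcal {F}}\).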

Step (ii) To estimate \(\Vert f-g_\epsilon \Vert _{{\mathrm{\,BMO\,}}(\mathbb {R}^n)}\), we first claim that, for any \(x,\ y\in \mathbb {R}^n\) with \(\overline{Q_{(x)}}\cap \overline{Q_{(y)}}\ne \emptyset \),

$$\begin{aligned} |g_\epsilon (x)-g_\epsilon (y)|\lesssim \epsilon . \end{aligned}$$
(2.12)

Indeed, if both x and y lie in the same cube \(Q\in {\mathcal {F}}\), then, by the definition of \(g_\epsilon \), we know that \(g_\epsilon (x)=g_\epsilon (y)\) and hence (2.12) holds true trivially. If x and y lie, respectively, in different dyadic cubes \(Q_{(x)}\) and \(Q_{(y)}\), then, from the construction of \({\mathcal {F}}\), it follows that \(Q_{(x)}\) and \(Q_{(y)}\) must be adjacent, namely, \(\overline{Q_{(x)}}\cap \overline{Q_{(y)}}\) is a point, segment, or surface. Anyhow, \(|Q_{(x)}|\) and \(|Q_{(y)}|\) are comparable and hence there exists a larger cube \(Q_{(x,y)}\) of \(\mathbb {R}^n\) such that

  1. (i)

    the center of \(Q_{(x,y)}\) belongs to \(\overline{Q_{(x)}}\cap \overline{Q_{(y)}}\);

  2. (ii)

    \(Q_{(x)}\subset Q_{(x,y)}\) and \( Q_{(y)}\subset Q_{(x,y)}\);

  3. (iii)

    the side length \( \ell (Q_{(x,y)})\) of \(Q_{(x,y)}\) satisfies that

    $$\begin{aligned} \ell (Q_{(x,y)})=2\max \{\ell (Q_{(x)}),\ell (Q_{(y)})\} \sim \ell (Q_{(x)})\sim \ell (Q_{(y)}). \end{aligned}$$

From the definition of \({\mathcal {F}}\) and (2.11), we deduce that

$$\begin{aligned} {\mathcal {O}}(f;Q_{(x,y)})<\epsilon \end{aligned}$$

and hence

$$\begin{aligned} |g_\epsilon (x)-g_\epsilon (y)|&\le \left| f_{Q_{(x)}}-f_{Q_{(x,y)}}\right| +\left| f_{Q_{(y)}}-f_{Q_{(x,y)}}\right| \nonumber \\&\le 2\left[ \frac{|Q_{(x,y)}|}{|Q_{(x)}|}+\frac{|Q_{(x,y)}|}{|Q_{(y)}|}\right] {\mathcal {O}}(f;Q_{(x,y)})\nonumber \\&\lesssim {\mathcal {O}}(f;Q_{(x,y)})\lesssim \epsilon . \end{aligned}$$
(2.13)

Thus, (2.12) also holds true in this case. This finishes the proof of the above claim.

Next, we estimate \(\Vert f-g_\epsilon \Vert _{{\mathrm{\,BMO\,}}(\mathbb {R}^n)}:=\sup _{Q}{\mathcal {O}}(f-g_\epsilon ;Q)\) via considering different side lengths \(\ell (Q)\) in the supremum.

When \(\ell (Q)\in (0,2^{j(\epsilon ;0)})\), by the definition of \({\mathcal {F}}\), we know that Q intersects at most \(2^n\) different cubes in \({\mathcal {F}}\). From this, the definition of \(g_\epsilon \), and (2.12), we deduce that

$$\begin{aligned} \frac{1}{|Q|^2}\int _Q\int _Q|g_\epsilon (x)-g_\epsilon (y)|\,dxdy\lesssim \epsilon . \end{aligned}$$
(2.14)

Combining (2.9) with (2.14), we obtain

$$\begin{aligned} {\mathcal {O}}(f-g_\epsilon ;Q)&\le {\mathcal {O}}(f;Q)+{\mathcal {O}}(g_\epsilon ;Q)\\&<\epsilon +\frac{1}{|Q|^2}\int _Q\int _Q|g_\epsilon (x)-g_\epsilon (y)|\,dxdy\lesssim \epsilon . \end{aligned}$$

When \(\ell (Q)\in [2^{j(\epsilon ;0)},2^{j(\epsilon ;0)+1})\), we consider the following two cases:

Case (i) \(Q\cap Q({\vec {0}}_n,2^{j(\epsilon ;1)})= Q\), namely, \(Q\subset Q({\vec {0}}_n,2^{j(\epsilon ;1)})\). In this case, by the definition of \({\mathcal {F}}_1\), we know that Q intersects at most \(3^n\) different cubes in \({\mathcal {F}}_1\). This, together with the definition of \(g_\epsilon \) and (2.9), implies that

$$\begin{aligned} {\mathcal {O}}(f-g_\epsilon ;Q)&\le \frac{2}{|Q|}\int _Q \left| f(x)-g_\epsilon (x)\right| \,dx\\&=\sum _{\{Q_{*}\in {\mathcal {F}}_1:\,\,Q\cap Q_{*}\ne \emptyset \}}\frac{2}{|Q|} \int _{Q_{*}} \left| f(x)-g_\epsilon (x)\right| \,dx\\&=2\sum _{\{Q_{*}\in {\mathcal {F}}_1:\,\,Q\cap Q_{*}\ne \emptyset \}} \frac{|Q_{*}|}{|Q|}\fint _{Q_{*}}\left| f(x)-f_{Q_*}\right| \,dx\\&<2\epsilon \sum _{\{Q_{*}\in {\mathcal {F}}_1:\,\,Q\cap Q_{*}\ne \emptyset \}} \frac{|Q_{*}|}{|Q|}\lesssim \epsilon . \end{aligned}$$

Case (ii) \(Q\cap Q({\vec {0}}_n,2^{j(\epsilon ;1)})\ne Q\). In this case, we claim that there exists some \(x_Q\in \mathbb {R}^n\) such that

$$\begin{aligned} Q\subset Q(x_Q,2^{j(\epsilon ;0)+1})\quad \mathrm{and}\quad |x_Q|>2^{j(\epsilon ;1)}. \end{aligned}$$
(2.15)

Indeed, if \(Q\cap Q({\vec {0}}_n,2^{j(\epsilon ;1)})=\emptyset \), we can apparently choose \(x_Q\) to be the center of Q and, if \(Q\cap Q({\vec {0}}_n,2^{j(\epsilon ;1)})\ne \emptyset \), the existence of \(x_Q\) is guaranteed by the fact that the distance, between the center of Q and the boundary of \(Q({\vec {0}}_n,2^{j(\epsilon ;1)})\), is less than \(\frac{1}{2}\ell (Q)<2^{j(\epsilon ;0)}\). Thus, the above claim holds true. By (2.15) and (2.10), we find that

$$\begin{aligned} {\mathcal {O}}(f;Q)\le 2\frac{\big [2^{j(\epsilon ;0)+2}\big ]^n}{|Q|}{\mathcal {O}}(f;Q(x_Q,2^{j(\epsilon ;0)+1})) \lesssim \epsilon . \end{aligned}$$
(2.16)

Meanwhile, by the definition of \({\mathcal {F}}\), we know that Q intersects at most \(3^n\) different cubes in \({\mathcal {F}}\). Therefore, we can similarly show that (2.14) still holds true in this case. Combining (2.16) and (2.14), we obtain

$$\begin{aligned} {\mathcal {O}}(f-g_\epsilon ;Q)&\le {\mathcal {O}}(f;Q)+{\mathcal {O}}(g_\epsilon ;Q)\\&\lesssim \epsilon +\frac{1}{|Q|^2}\int _Q\int _Q|g_\epsilon (x)-g_\epsilon (y)|\,dxdy\lesssim \epsilon . \end{aligned}$$

Combining Case (i) and Case (ii), we finally conclude that \({\mathcal {O}}(f-g_\epsilon ;Q)\lesssim \epsilon \) when \(\ell (Q)\in [2^{j(\epsilon ;0)},2^{j(\epsilon ;0)+1})\).

Observe that, by the geometrical property of \({\mathcal {F}}\), for any \(k\in \mathbb {N}\), the above estimates for \(\ell (Q)\in [2^{j(\epsilon ;0)},2^{j(\epsilon ;0)+1})\) can be adapted to the case \(\ell (Q)\in [2^{j(\epsilon ;0)+k-1},2^{j(\epsilon ;0)+k})\) with the implicit positive constant depending only on the dimension n. This finishes the proof of Step (ii).

Step (iii) Let \(\varphi \in C_{\mathrm{c}}^\infty (\mathbb {R}^n)\) be a non-negative even function with \(\int _{\mathbb {R}^n}\varphi (x)\,dx=1\) and

$$\begin{aligned} {\mathrm{\,supp\,}}(\varphi )\subset B({\vec {0}}_n,1):=\{x\in \mathbb {R}^n:\,\,|x|\le 1\}. \end{aligned}$$

Let \(h_\epsilon :=g_\epsilon *\varphi _{2^{j(\epsilon ;0)}}\), where \(\varphi _{2^{j(\epsilon ;0)}}(\cdot ) :=2^{-nj(\epsilon ;0)}\varphi (2^{-j(\epsilon ;0)}\cdot )\). Notice that, for any \(x,\ y\in \mathbb {R}^n\) with \(|x-y|\le 2^{j(\epsilon ;0)}\), by the definition of \({\mathcal {F}}\), we know that \(\overline{Q_{(x)}}\cap \overline{Q_{(y)}}\ne \emptyset \). Then, for any \(x\in \mathbb {R}^n\), by (2.12), we have

$$\begin{aligned} \left| g_\epsilon (x)-h_\epsilon (x)\right|&=\left| \int _{\mathbb {R}^n}[g_\epsilon (x)-g_\epsilon (y)]\varphi _{2^{j(\epsilon ;0)}}(x-y)\,dy\right| \\&\le \int _{B(x,2^{j(\epsilon ;0)})}|g_\epsilon (x)-g_\epsilon (y)| |\varphi _{2^{j(\epsilon ;0)}}(x-y)|\,dy\\&\lesssim \epsilon \int _{B(x,2^{j(\epsilon ;0)})}|\varphi _{2^{j(\epsilon ;0)}}(x-y)|\,dy \sim \epsilon , \end{aligned}$$

where \(B(x,2^{j(\epsilon ;0)})\) denotes the ball centered at x with radius \(2^{j(\epsilon ;0)}\). Thus,

$$\begin{aligned} \left\| g_\epsilon -h_\epsilon \right\| _{{\mathrm{\,BMO\,}}(\mathbb {R}^n)} \le 2\left\| g_\epsilon -h_\epsilon \right\| _{L^\infty (\mathbb {R}^n)}\lesssim \epsilon , \end{aligned}$$

which shows that (2.8) holds true.

It remains to prove that \(h_\epsilon \in B_\infty (\mathbb {R}^n)\). Indeed, by \(\varphi \in C_{\mathrm{c}}^\infty (\mathbb {R}^n)\), (2.7), (2.8), and \(f\in {\mathrm{\,BMO\,}}(\mathbb {R}^n)\), we know that \(h_\epsilon \in C^\infty (\mathbb {R}^n)\) and \(h_\epsilon \in {\mathrm{\,BMO\,}}(\mathbb {R}^n)\). Thus, to show \(h_\epsilon \in B_\infty (\mathbb {R}^n)\), it suffices to prove that, for any given \({\widetilde{\epsilon }}\in (0,\epsilon )\) and \(\alpha \in \mathbb {Z}_+^n\setminus \{{\vec {0}}_n\}\), and any \(x\in \mathbb {R}^n\) satisfying \(|x|>E_{(\alpha ,n)}\) with \(E_{(\alpha ,n)}\in (0,\infty )\) being determined later,

$$\begin{aligned} \left| D^\alpha h_\epsilon (x)\right| \lesssim {\widetilde{\epsilon }}. \end{aligned}$$

Indeed, when \(\alpha \in \mathbb {Z}_+^n\) and \(|\alpha |\in \{2m-1\}_{m\in \mathbb {N}}\), by \(j(\epsilon ;0)<0\) and \({\widetilde{\epsilon }}<\epsilon \), we can choose \(k_{|\alpha |}\) to be the smallest positive integer such that

$$\begin{aligned} 2^{(|\alpha |+k_{|\alpha |})j(\epsilon ;0)}\epsilon \le 2^{|\alpha |j(\epsilon ;0)}{\widetilde{\epsilon }} \end{aligned}$$
(2.17)

and \(\{k_{|\alpha |}\}_{|\alpha |\in \{2m-1\}_{m\in \mathbb {N}}}\) is increasing, namely,

$$\begin{aligned} k_1\le \cdots \le k_{|\alpha |}\le k_{|\alpha |+2}\le \cdots . \end{aligned}$$
(2.18)

Meanwhile, from the fact that \(\varphi \) is even, we deduce that \(D^\alpha \varphi \) is odd and hence

$$\begin{aligned} \int _{\mathbb {R}^n}D^\alpha \varphi (x)\,dx=0. \end{aligned}$$
(2.19)

Also, in this case, for any \(x\in \mathbb {R}^n\) with \(|x|>E_{(\alpha ,n)}:=\sqrt{n}2^{j(\epsilon ;|\alpha |+k_{|\alpha |})}\), and any \(y\in B(x,2^{j(\epsilon ;0)})\), by the definition of \({\mathcal {F}}\), we have \(\overline{Q_{(x)}}\cap Q({\vec {0}}_n,2^{j(\epsilon ;|\alpha |+k_{|\alpha |})})=\emptyset \) and \(\overline{Q_{(x)}}\cap \overline{Q_{(y)}}\ne \emptyset \), which, combined with (2.13), the definition of \({\mathcal {F}}\), (2.11), and (2.17), further implies that

$$\begin{aligned} \left| g_\epsilon (y)-g_\epsilon (x)\right| \lesssim {\mathcal {O}}(f;Q_{(x,y)}) \lesssim 2^{(|\alpha |+k_{|\alpha |})j(\epsilon ;0)}\epsilon \lesssim 2^{|\alpha |j(\epsilon ;0)}{\widetilde{\epsilon }}, \end{aligned}$$
(2.20)

where \(Q_{(x,y)}\supset (Q_{(x)}\cup Q_{(y)})\) is the cube comparable with both \(Q_{(x)}\) and \(Q_{(y)}\) [see the first paragraph of the above proof of Step (ii)] and the implicit positive constant only depends on n. By (2.19) and (2.20), we conclude that, for any \(\alpha \in \mathbb {Z}_+^n\) with \(|\alpha |\in \{2m-1\}_{m\in \mathbb {N}}\), and any \(x\in \mathbb {R}^n\) with \(|x|>E_{(\alpha ,n)}\),

$$\begin{aligned} |D^\alpha h_\epsilon (x)|&=\left| \int _{\mathbb {R}^n}g_\epsilon (y) D^\alpha \varphi _{2^{j(\epsilon ;0)}}(x-y)\,dy\right| \nonumber \\&=\left| \int _{B(x,2^{j(\epsilon ;0)})}g_\epsilon (y) D^\alpha \varphi _{2^{j(\epsilon ;0)}}(x-y)\,dy\right| \nonumber \\&=\left| \int _{B(x,2^{j(\epsilon ;0)})}[g_\epsilon (y)-f_{Q_{(x)}}] D^\alpha \varphi _{2^{j(\epsilon ;0)}}(x-y)\,dy\right| \nonumber \\&=\left| \int _{B(x,2^{j(\epsilon ;0)})}[g_\epsilon (y)-g_\epsilon (x)] D^\alpha \varphi _{2^{j(\epsilon ;0)}}(x-y)\,dy\right| \nonumber \\&\lesssim \int _{B(x,2^{j(\epsilon ;0)})}{\mathcal {O}}(f;Q_{(x,y)}) |D^\alpha \varphi _{2^{j(\epsilon ;0)}}(x-y)|\,dy\nonumber \\&\lesssim 2^{|\alpha |j(\epsilon ;0)}{\widetilde{\epsilon }} \int _{B(x,2^{j(\epsilon ;0)})}|D^\alpha \varphi _{2^{j(\epsilon ;0)}}(x-y)|\,dy\nonumber \\&\lesssim 2^{|\alpha |j(\epsilon ;0)}{\widetilde{\epsilon }} 2^{-|\alpha |j(\epsilon ;0)}\Vert D^\alpha \varphi \Vert _{L^1(\mathbb {R}^n)} \lesssim {\widetilde{\epsilon }}, \end{aligned}$$
(2.21)

where the implicit positive constants are independent of \({\widetilde{\epsilon }}\) and x.

When \(\alpha \in \mathbb {Z}_+^n\) and \(|\alpha |\in \{2m\}_{m\in \mathbb {N}}\), we claim that

$$\begin{aligned} |D^\alpha h_\epsilon (x)|\lesssim {\widetilde{\epsilon }} \end{aligned}$$

as well for any \(|x|>E_{(\alpha ,n)}:=\sqrt{n}2^{j(\epsilon ;|\alpha |+1+k_{|\alpha |+1})}\), with the implicit positive constant independent of \(\widetilde{\epsilon }\) and x. Indeed, let \(\psi \in C^\infty (\mathbb {R}^n)\) and M be a positive constant. By the Taylor remainder theorem, we conclude that, for any \(x:=(x_1,\ldots ,x_n)\in \mathbb {R}^n\) with \(|x|>M\), and any \(y\in {\mathcal {R}}_x:=\{y:=(y_1,\ldots ,y_n)\in \mathbb {R}^n:\,\,x_iy_i\ge 0,\quad \forall \,i\in \{1,\ldots ,n\}\}\),

$$\begin{aligned} \psi (x+y)=\psi (x)+\sum _{i=1}^n\frac{\partial }{\partial {x_i}}\psi (x)y_i +\sum _{\{\beta \in \mathbb {Z}^n_+:\,\,|\beta |=2\}} R_\beta (x,y) y^\beta , \end{aligned}$$
(2.22)

where, for any \(\beta :=(\beta _1,\ldots ,\beta _n)\in \mathbb {Z}^n_+\) and \(|\beta |=2\),

$$\begin{aligned} R_\beta (x,y):=\frac{|\beta |}{\beta !}\int _0^1(1-t)^{|\beta |-1} D^\beta \psi (x+ty)\,dt, \end{aligned}$$

with \(\beta !:=\beta _1!\cdots \beta _n!\), satisfies

$$\begin{aligned} \left| R_\beta (x,y) \right| \le \max _{\{\beta \in \mathbb {Z}^n_+:\,\,|\beta |=2\}} \frac{1}{\beta !}\sup _{|z|>M}\left| D^\beta \psi (z) \right| \le \max _{\{\beta \in \mathbb {Z}^n_+:\,\,|\beta |=2\}}\Vert D^\beta \psi \Vert _{L^\infty (\{|z|>M\})}, \end{aligned}$$

by using the observation that, for any \(t\in [0,1]\), \(|x+ty|\ge |x|>M\). To estimate \(\big \Vert \frac{\partial }{\partial x_1}\psi \big \Vert _{L^\infty (\{|z|>M\})}\), let

$$\begin{aligned} y\in {\mathcal {R}}_x^{(1)}:= \left\{ y:=(y_1,\ldots ,y_n)\in \mathbb {R}^n:\,\,x_1y_1\ge 0,\, y_1\ne 0,\, y_i=0,\quad \forall \,i\in \{2,\ldots ,n\}\right\} . \end{aligned}$$

Then \(|x+y|\ge |x|>M\), and (2.22) becomes

$$\begin{aligned} \psi (x+y)=\psi (x)+\frac{\partial }{\partial {x_1}}\psi (x)y_1+ R_{(2,0,\ldots ,0)}(x,y) y_1^{2}, \end{aligned}$$

which implies that

$$\begin{aligned} \left| \frac{\partial }{\partial x_1}\psi (x)\right|&\le |\psi (x+y)-\psi (x)||y_1|^{-1} +\left| R_{(2,0,\ldots ,0)}(x,y)\right| |y_1|\\&\le 2\Vert \psi \Vert _{L^\infty (\{|z|>M\})}|y_1|^{-1} +\max _{\{\beta \in \mathbb {Z}^n_+:\,\,|\beta |=2\}}\Vert D^\beta \psi \Vert _{L^\infty (\{|z|>M\})}|y_1|. \end{aligned}$$

From the arbitrariness of both \(x\in \{z\in \mathbb {R}^n:\,\,|z|>M\}\) and \(|y_1|\in (0,\infty )\), and the AM-GM inequality (namely, the inequality of arithmetic and geometric means, that is, for any a, \(b\in [0,\infty )\), \(\frac{a+b}{2}\ge \sqrt{ab}\) and, moreover, the equality holds true when \(a=b\)), we then deduce that

$$\begin{aligned}&\left\| \frac{\partial }{\partial x_1}\psi \right\| _{L^\infty (\{|z|>M\})} \nonumber \\&\qquad \le \inf _{|y_1|>0}\left[ 2\Vert \psi \Vert _{L^\infty (\{|z|>M\})}|y_1|^{-1} +\max _{\{\beta \in \mathbb {Z}^n_+:\,\,|\beta |=2\}}\Vert D^\beta \psi \Vert _{L^\infty (\{|z|>M\})}|y_1|\right] \nonumber \\&\qquad = 2\sqrt{2\Vert \psi \Vert _{L^\infty (\{|z|>M\})} \max _{\{\beta \in \mathbb {Z}^n_+:\,\,|\beta |=2\}}\Vert D^\beta \psi \Vert _{L^\infty (\{|z|>M\})}}. \end{aligned}$$
(2.23)

By the same technique, we know that (2.23) also holds true with \(\frac{\partial }{\partial x_1}\psi \) replaced by \(\frac{\partial }{\partial x_i}\psi \) for any \(i\in \{2,\ldots ,n\}\). Based on this, we can now estimate \(\Vert D^\alpha h_\epsilon \Vert _{L^\infty (\{|x|>E_{(\alpha ,n)}\})}\) for any given \(\alpha :=(\alpha _1,\ldots ,\alpha _n)\in \mathbb {Z}_+^n\) with \(|\alpha |\in \{2m\}_{m\in \mathbb {N}}\). Without loss of generality, we may assume that \(\alpha _1\ne 0\), and let \({\widetilde{\alpha }}:=(\alpha _1-1,\alpha _2,\ldots ,\alpha _n)\). Applying (2.23) with \(\psi :=D^{{\widetilde{\alpha }}}h_\epsilon \) and \(M:=E_{(\alpha ,n)}=\sqrt{n}2^{j(\epsilon ;|\alpha |+1+k_{|\alpha |+1})}\), we have

$$\begin{aligned}&\left\| D^\alpha h_\epsilon \right\| _{L^\infty (\{|x|>E_{(\alpha ,n)}\})}\\&\quad =\left\| \frac{\partial }{\partial x_1}D^{{\widetilde{\alpha }}} h_\epsilon \right\| _{L^\infty (\{|x|>E_{(\alpha ,n)}\})} \\&\quad \le 2\sqrt{2\Vert D^{{\widetilde{\alpha }}}h_\epsilon \Vert _{L^\infty (\{|x|>E_{(\alpha ,n)}\})} \max _{\{\beta \in \mathbb {Z}^n_+:\,\,|\beta |=|\alpha |+1\}} \Vert D^\beta h_\epsilon \Vert _{L^\infty (\{|x|>E_{(\alpha ,n)}\})}}\\&\quad = 2\sqrt{2\Vert D^{{\widetilde{\alpha }}}h_\epsilon \Vert _{L^\infty (\{|x|>\sqrt{n}2^{j(\epsilon ;|\alpha |+1+k_{|\alpha |+1})}\})} }\\&\qquad \times \sqrt{\max _{\{\beta \in \mathbb {Z}^n_+:\,\,|\beta |=|\alpha |+1\}} \Vert D^\beta h_\epsilon \Vert _{L^\infty (\{|x|>\sqrt{n}2^{j(\epsilon ;|\alpha |+1+k_{|\alpha |+1})}\})}}\\&\quad \le 2\sqrt{2\Vert D^{{\widetilde{\alpha }}}h_\epsilon \Vert _{L^\infty (\{|x|>\sqrt{n}2^{j(\epsilon ;|\alpha |-1+k_{|\alpha |-1})}\})} }\\&\qquad \times \sqrt{\max _{\{\beta \in \mathbb {Z}^n_+:\,\,|\beta |=|\alpha |+1\}} \Vert D^\beta h_\epsilon \Vert _{L^\infty (\{|x|>\sqrt{n}2^{j(\epsilon ;|\alpha |+1+k_{|\alpha |+1})}\})} }\\&\quad = 2\sqrt{2\Vert D^{{\widetilde{\alpha }}}h_\epsilon \Vert _{L^\infty (\{|x|>E_{({\widetilde{\alpha }},n)}\})} \max _{\{\beta \in \mathbb {Z}^n_+:\,\,|\beta |=|\alpha |+1\}} \Vert D^\beta h_\epsilon \Vert _{L^\infty (\{|x|>E_{(\beta ,n)}\})}}, \end{aligned}$$

where we used (2.18) in the last inequality. From this and (2.21), we deduce that, for any \(\alpha :=(\alpha _1,\ldots ,\alpha _n)\in \mathbb {Z}_+^n\) with \(|\alpha |\in \{2m\}_{m\in \mathbb {N}}\),

$$\begin{aligned} \left\| D^\alpha h_\epsilon \right\| _{L^\infty (\{|x|>E_{(\alpha ,n)}\})}\lesssim {\widetilde{\epsilon }}, \end{aligned}$$
(2.24)

where the implicit positive constant is independent of \({\widetilde{\epsilon }}\). This finishes the proof of the above claim.

Combining (2.21) and (2.24), we conclude that \(h_\epsilon \in B_\infty (\mathbb {R}^n)\), which completes the proof of Step (iii) and hence that (ii) \(\Longrightarrow \) (iii).

The proof that (iii) \(\Longrightarrow \) (i) is obvious from the definitions of \(\mathrm{\,XMO\,}(\mathbb {R}^n)\) and \(\mathrm{\,X_{1}MO\,}(\mathbb {R}^n)\).

This finishes the proof of Theorem 1.2. \(\square \)

By Theorem 1.2, we can now completely answer the open question posed in [31].

Proof of Corollary 1.3

To show this corollary, it suffices to prove that there exists some \(f\in \mathrm{\,VMO\,}(\mathbb {R}^n)\setminus \mathrm{\,XMO\,}(\mathbb {R}^n)\). For any \(x:=(x_1,\ldots ,x_n)\in \mathbb {R}^n\), define

$$\begin{aligned} f(x_1,\ldots ,x_n):=\prod _{k=1}^n\sin (x_k). \end{aligned}$$

Then it is easy to show that f is uniformly continuous and bounded on \(\mathbb {R}^n\), which implies that \(f\in \mathrm{\,VMO\,}(\mathbb {R}^n)\).

Now, we claim that f violates (ii)\(_2\) of Theorem 1.2(ii) and hence \(f\notin \mathrm{\,XMO\,}(\mathbb {R}^n)\). Indeed, let \(Q_0:=\big [-\frac{\pi }{2},\frac{\pi }{2}\big ]^n\) and, for any \(k\in \mathbb {N}\), let \(Q_k:=2k\pi +Q_0\). Then, for any \(k\in \mathbb {N}\), we have

$$\begin{aligned} \int _{Q_k}f(x_1,\ldots ,x_n)\,dx_1\cdots dx_n&=\int _{2k\pi -\frac{\pi }{2}}^{2k\pi +\frac{\pi }{2}}\sin (x_1)\,dx_1 \cdots \int _{2k\pi -\frac{\pi }{2}}^{2k\pi +\frac{\pi }{2}}\sin (x_n)\,dx_n\\&=0 \end{aligned}$$

and hence

$$\begin{aligned} {\mathcal {O}}(f;Q_0+2k\pi )&={\mathcal {O}}(f;Q_k)=\fint _{Q_k}|f(x)-f_{Q_k}|\,dx=\fint _{Q_k}|f(x)|\,dx\\&=\left[ \frac{1}{\pi }\int _{2k\pi -\frac{\pi }{2}}^{2k\pi +\frac{\pi }{2}}|\sin (t)|\,dt\right] ^n =\left( \frac{2}{\pi }\right) ^n, \end{aligned}$$

which does not tend to 0 as \(k\rightarrow \infty \). Thus, the above claim holds true, which completes the proof of Corollary 1.3. \(\square \)
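The value \((2/\pi )^n\) obtained in the above proof can be checked numerically. The following Python sketch approximates \({\mathcal {O}}(f;Q_k)\) for \(f(x_1,\ldots ,x_n)=\prod _{i=1}^n\sin (x_i)\) on a midpoint grid; the function name, the default dimension, and the grid size are illustrative assumptions, and the agreement is only up to discretization error.

```python
import math

def mean_oscillation_product_sine(k, n=2, steps=200):
    """Midpoint-grid approximation of O(f; Q_k) for f(x) = prod_i sin(x_i),
    where Q_k = [2*k*pi - pi/2, 2*k*pi + pi/2]^n."""
    left = 2 * k * math.pi - math.pi / 2
    h = math.pi / steps
    sines = [math.sin(left + (i + 0.5) * h) for i in range(steps)]
    # Tensor-product grid values of f, built one coordinate at a time.
    values = [1.0]
    for _ in range(n):
        values = [v * s for v in values for s in sines]
    mean = sum(values) / len(values)            # approximates f_{Q_k}, which is 0
    return sum(abs(v - mean) for v in values) / len(values)

if __name__ == "__main__":
    for k in (1, 5, 50):
        print(k, mean_oscillation_product_sine(k), (2 / math.pi) ** 2)
```

The printed oscillations stay close to \((2/\pi )^2\) for every k tested, in line with the failure of (ii)\(_2\) of Theorem 1.2(ii) for this f.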

Proposition 2.5

Proposition 2.4(ii) can be replaced by

(ii’):
$$\begin{aligned} \lim _{R\rightarrow \infty }\sup _{Q\cap Q({\vec {0}}_n,R)=\emptyset }{\mathcal {O}}(f;Q)=0, \end{aligned}$$

where \(Q({\vec {0}}_n,R)\) denotes the cube centered at \({\vec {0}}_n\) with the side length 2R. However, \(\mathrm{(ii)_2}\) of Theorem 1.2(ii) cannot be replaced by (ii’).

Proof

Recall that Uchiyama [33] stated Proposition 2.4 via (i), (ii), and (iii), while, in his proof, he proved that Proposition 2.4 with (ii) replaced by (ii’) is true. Indeed, this equivalence is a direct consequence of the following observation:

$$\begin{aligned} (\mathrm{i})+(\mathrm{ii})+(\mathrm{iii}) \text { of Proposition } 2.4&\Longrightarrow \text {Proposition } 2.5\mathrm{(ii')}\\&\Longrightarrow \text {Proposition } 2.4\mathrm{(ii)}. \end{aligned}$$

To show that (ii)\(_2\) of Theorem 1.2(ii) can not be replaced by Proposition 2.5(ii’), for simplicity, we only calculate a typical example in \(\mathbb {R}\). Indeed, let \(f(x):=\log (|x|)\) for any \(x\in \mathbb {R}\) with \(|x|\ge 1\), and extend f to \(\mathbb {R}\) smoothly. Then \(f\in C^1(\mathbb {R})\cap {\mathrm{\,BMO\,}}(\mathbb {R})\) and \(\lim _{|x|\rightarrow \infty }f'(x)=0\), which implies \(f\in B_1(\mathbb {R})\subset \mathrm{\,X_{1}MO\,}(\mathbb {R})\). On the other hand, for any \(k\in \mathbb {Z}_+\) and any interval \(I_k:=[e^k,e^{k+1}]\), we have

$$\begin{aligned} f_{I_k}&=\fint _{I_k}f(x)\,dx =\frac{1}{e^{k+1}-e^{k}}\int _{e^{k}}^{e^{k+1}}\log (x)\,dx\\&=\frac{1}{(e-1)e^k}\left[ (k+1)e^{k+1}-e^{k+1}-(ke^k-e^k)\right] \\&=\frac{1}{e-1}\left[ (k+1)e-e-(k-1)\right] =\frac{1}{e-1}\left( ke-k+1\right) =k+\frac{1}{e-1} \end{aligned}$$

and hence

$$\begin{aligned} {\mathcal {O}}(f;I_k)&=\fint _{I_k}|f(x)-f_{I_k}|\,dx =\frac{1}{(e-1)e^k}\int _{e^{k}}^{e^{k+1}}\left| \log (x)-\left( k+\frac{1}{e-1}\right) \right| \,dx\\&=\frac{1}{(e-1)e^k}\left\{ \int _{e^{k}}^{e^{k+\frac{1}{e-1}}} \left[ -\log (x)+\left( k+\frac{1}{e-1}\right) \right] \,dx\right. \\&\quad \left. +\int _{e^{k+\frac{1}{e-1}}}^{e^{k+1}}\left[ \log (x)-\left( k+\frac{1}{e-1}\right) \right] \,dx\right\} \\&=\frac{1}{(e-1)e^k}\left\{ (k+1)e^{k+1}-e^{k+1}+ke^k-e^k \right. \\&\quad -2\left[ \left( k+\frac{1}{e-1}\right) e^{k+\frac{1}{e-1}}-e^{k+\frac{1}{e-1}}\right] \\&\left. \quad +\left( k+\frac{1}{e-1}\right) \left( 2e^{k+\frac{1}{e-1}}-e^k-e^{k+1}\right) \right\} \\&=\frac{1}{e-1}\left\{ (k+1)e-e+k-1 -2\left[ \left( k+\frac{1}{e-1}\right) e^{\frac{1}{e-1}}-e^{\frac{1}{e-1}}\right] \right. \\&\quad \left. +\left( k+\frac{1}{e-1}\right) \left( 2e^{\frac{1}{e-1}}-1-e\right) \right\} \\&=\frac{1}{e-1}\left( -1+2e^{\frac{1}{e-1}}-\frac{e+1}{e-1}\right) =\frac{2}{e-1}\left( e^{\frac{1}{e-1}}-\frac{e}{e-1}\right) , \end{aligned}$$

which is a positive constant independent of k. Since \(I_k\cap Q(0,R)=\emptyset \) whenever \(e^k>R\), it follows that f violates (ii’). This finishes the proof of Proposition 2.5. \(\square \)
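The closed form obtained above for \({\mathcal {O}}(f;I_k)\) with \(f=\log \) and \(I_k=[e^k,e^{k+1}]\) can also be checked numerically. The following Python sketch, whose function name and step count are our own illustrative choices, approximates the mean oscillation by the midpoint rule and compares it with \(\frac{2}{e-1}\big (e^{\frac{1}{e-1}}-\frac{e}{e-1}\big )\), which is roughly 0.24.

```python
import math

def log_oscillation(k, steps=20_000):
    """Midpoint-rule approximation of O(log; I_k) with I_k = [e^k, e^(k+1)]."""
    a, b = math.exp(k), math.exp(k + 1)
    h = (b - a) / steps
    values = [math.log(a + (i + 0.5) * h) for i in range(steps)]
    mean = sum(values) / steps                  # approximates k + 1/(e - 1)
    return sum(abs(v - mean) for v in values) / steps

# Closed form derived in the text; it does not depend on k.
closed_form = 2 / (math.e - 1) * (math.exp(1 / (math.e - 1)) - math.e / (math.e - 1))

if __name__ == "__main__":
    for k in (0, 3, 10):
        print(k, log_oscillation(k), closed_form)
```

The independence of k reflects the scaling \(\log (e^kx)=k+\log (x)\): the additive constant k does not affect the mean oscillation, and mean oscillations are invariant under dilation of the interval.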

Remark 2.6

Observe that the counterexample in Proposition 2.5 is unbounded. It is still unknown whether or not (ii)\(_1\) of Theorem 1.2(ii) \(+\) Proposition 2.5(ii’) is an equivalent characterization of \(\mathrm{\,MMO\,}(\mathbb {R}^n)\).

3 Proof of Theorem 1.4

In this section, we prove Theorem 1.4 via several smooth truncation techniques. Some of the ideas come from [10]; see also [30]. To begin with, we introduce the following smooth truncation function. Let \(\varphi _1\in C^\infty ([0,\infty ))\) satisfy

$$\begin{aligned} 0\le \varphi _1\le 1\quad {\mathrm {and}}\quad \varphi _1(x)= {\left\{ \begin{array}{ll} 1,\quad x\in [0,1],\\ 0,\quad x\in [2,\infty ). \end{array}\right. } \end{aligned}$$
(3.1)

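Moreover, for any \(\eta \in (0,\infty )\) and \(x,\ y,\ z\in \mathbb {R}^n\), let

$$\begin{aligned} K_\eta (x,y,z):=K(x,y,z)\left[ 1-\varphi _1\left( \frac{2}{\eta }[|x-y|+|x-z|]\right) \right] \end{aligned}$$
(3.1x)

and, for any \(f,\ g\in C_{\mathrm{c}}^\infty (\mathbb {R}^n)\) and \(x\notin {\mathrm{\,supp\,}}(f)\cap {\mathrm{\,supp\,}}(g)\),

$$\begin{aligned} T_\eta (f,g)(x):=\int _{\mathbb {R}^{2n}}K_\eta (x,y,z)f(y)g(z)\,dy\,dz. \end{aligned}$$
(3.1y)

Then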

$$\begin{aligned} \left[ b,T_\eta \right] _1(f,g)(x)=\int _{\mathbb {R}^{2n}}[b(x)-b(y)]K_\eta (x,y,z)f(y)g(z)\,dy\,dz. \end{aligned}$$
(3.2)

Recall that the bilinear Hardy–Littlewood maximal operator \({\mathcal {M}}\) is defined by setting, for any \(f,\ g\in L^1_{{\mathrm{\,loc\,}}}(\mathbb {R}^n)\) and \(x\in \mathbb {R}^n\),

$$\begin{aligned} {\mathcal {M}}(f,g)(x):=\sup _{\mathrm {cube}\,Q\ni x}\fint _Q|f(y)|\,dy \fint _Q|g(z)|\,dz, \end{aligned}$$

where the supremum is taken over all the cubes Q of \(\mathbb {R}^n\) containing x. On \([b,T]_1\) and \([b,T_\eta ]_1\), we have the following estimate via \({\mathcal {M}}\).

Lemma 3.1

There exists a positive constant C such that, for any \(b\in B_\infty (\mathbb {R}^n)\), \(\eta \in (0,\infty )\), \(f,\; g\in L^1_{{\mathrm{\,loc\,}}}(\mathbb {R}^n)\), and \(x\in \mathbb {R}^n\),

$$\begin{aligned} \left| [b,T]_1(f,g)(x)-\left[ b,T_\eta \right] _1(f,g)(x) \right| \le C \eta \left\| \nabla b\right\| _{L^\infty (\mathbb {R}^n)} {\mathcal {M}}(f,g)(x). \end{aligned}$$

Proof

For any \(x\in \mathbb {R}^n\), by (3.1), (3.1x), (3.1y), (3.2), and (1.1), we have

$$\begin{aligned}&\left| [b,T]_1(f,g)(x)-\left[ b,T_\eta \right] _1(f,g)(x) \right| \\&\quad =\left| \int _{\mathbb {R}^{2n}}[b(x)-b(y)]K(x,y,z) \varphi _1\left( \frac{2}{\eta }[|x-y|+|x-z|]\right) f(y)g(z)\,dy\,dz\right| \\&\quad \le \int _{|x-y|+|x-z|\le \eta }|b(x)-b(y)||K(x,y,z)||f(y)g(z)|\,dy\,dz\\&\quad \lesssim \left\| \nabla b\right\| _{L^\infty (\mathbb {R}^n)}\sum _{j=0}^\infty \int _{\frac{\eta }{2^{j+1}}<|x-y|+|x-z|\le \frac{\eta }{2^j}} \frac{|x-y|}{(|x-y|+|x-z|)^{2n}}|f(y)g(z)|\,dy\,dz\\&\quad \lesssim \left\| \nabla b\right\| _{L^\infty (\mathbb {R}^n)}\sum _{j=0}^\infty \frac{\eta }{2^j}\frac{1}{(\frac{\eta }{2^{j+1}})^{2n}} \int _{Q(x,\frac{\eta }{2^j})\times Q(x,\frac{\eta }{2^j})}|f(y)g(z)|\,dy\,dz\\&\quad \sim \left\| \nabla b\right\| _{L^\infty (\mathbb {R}^n)} \eta \sum _{j=0}^\infty \frac{1}{2^j}\fint _{Q(x,\frac{\eta }{2^j})}|f(y)|\,dy\fint _{Q(x,\frac{\eta }{2^j})}|g(z)|\,dz\\&\quad \lesssim \left\| \nabla b\right\| _{L^\infty (\mathbb {R}^n)} \eta \sum _{j=0}^\infty \frac{1}{2^j}{\mathcal {M}}(f,g)(x) \sim \left\| \nabla b\right\| _{L^\infty (\mathbb {R}^n)} \eta {\mathcal {M}}(f,g)(x), \end{aligned}$$

where \(Q(x,\frac{\eta }{2^j})\) denotes the cube centered at x with the side length \(2\frac{\eta }{2^j}\). This finishes the proof of Lemma 3.1. \(\square \)
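
For the reader's convenience, we sketch the bookkeeping behind the third to fifth steps of the above chain; the explicit constant \(16^n\) below is our own computation and is not stated in the original argument. The third step uses the mean value theorem in the form \(|b(x)-b(y)|\le \Vert \nabla b\Vert _{L^\infty (\mathbb {R}^n)}|x-y|\) together with the size condition in (1.1). On the j-th annulus, we have \(|x-y|\le \frac{\eta }{2^j}\) and \((|x-y|+|x-z|)^{-2n}<(\frac{\eta }{2^{j+1}})^{-2n}\), and both y and z lie in \(Q(x,\frac{\eta }{2^j})\), whose Lebesgue measure is \((2\frac{\eta }{2^j})^n\). Hence

$$\begin{aligned} \frac{\eta }{2^j}\frac{1}{(\frac{\eta }{2^{j+1}})^{2n}}\left| Q\left( x,\frac{\eta }{2^j}\right) \right| ^2 =\frac{\eta }{2^j}\cdot \frac{2^{2n(j+1)}}{\eta ^{2n}}\cdot \frac{2^{2n}\eta ^{2n}}{2^{2nj}} =16^n\,\frac{\eta }{2^j} \quad \mathrm{and}\quad \sum _{j=0}^\infty \frac{1}{2^j}=2, \end{aligned}$$

so that the implicit positive constant in Lemma 3.1 can be taken to depend only on n and the kernel constant in (1.1).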

We also need the following result on the relative compactness of a set in weighted Lebesgue spaces, which is just [34, Theorem 1.1].

Lemma 3.2

Let w be a weight on \(\mathbb {R}^n\). Assume that \(w^{-1/(p_0-1)}\) is also a weight on \(\mathbb {R}^n\) for some \(p_0\in (1,\infty )\). Let \(p\in (0,\infty )\) and \({\mathcal {E}}\) be a subset of \(L_w^p(\mathbb {R}^n)\). Then \({\mathcal {E}}\) is relatively compact in \(L_w^p(\mathbb {R}^n)\) if the set \({\mathcal {E}}\) satisfies the following three conditions:

(i) \({\mathcal {E}}\) is bounded, namely,

$$\begin{aligned} \sup _{f\in {\mathcal {E}}}\Vert f\Vert _{{L_w^p(\mathbb {R}^n)}}<\infty ; \end{aligned}$$

(ii) \({\mathcal {E}}\) uniformly vanishes at infinity, namely, for any \(\epsilon \in (0,\infty )\), there exists some positive constant A such that, for any \(f\in {\mathcal {E}}\),

$$\begin{aligned} \left\| f\right\| _{L^p_w(\{|x|>A\})}<\epsilon ; \end{aligned}$$

(iii) \({\mathcal {E}}\) is uniformly equicontinuous, namely, for any \(\epsilon \in (0,\infty )\), there exists some positive constant \(\rho \) such that, for any \(f\in {\mathcal {E}}\) and \(t \in \mathbb {R}^n\) with \(|t|\in [0,\rho )\),

$$\begin{aligned} \Vert f(\cdot +t)-f(\cdot )\Vert _{{L_w^p(\mathbb {R}^n)}}<\epsilon . \end{aligned}$$

Remark 3.3

If w is a classical \(A_p(\mathbb {R}^n)\) weight for some \(p\in (1,\infty )\), then the sufficiency of Lemma 3.2 was first obtained in [10, Theorem 5], which is needed in the proof of Theorem 1.4.

Let \(\mathbf{w }:=(w_1,w_2)\in \mathbf{A }_{\mathbf{p }}(\mathbb {R}^n)\) and \(w:=w_1^{p/p_1}w_2^{p/p_2}\) be as in Theorem 1.4. From [24, Theorem 3.6], it follows that \(w\in A_{2p}(\mathbb {R}^n)\). By this and Remark 3.3, we are now able to use Lemma 3.2 to prove Theorem 1.4 as follows.

Proof of Theorem 1.4

Without loss of generality, it suffices to prove this theorem for the first entry \([b,T]_1\). When \(b\in \mathrm{\,XMO\,}(\mathbb {R}^n)=\mathrm{\,X_{1}MO\,}(\mathbb {R}^n)\) (see Theorem 1.2), from the definition of \(\mathrm{\,X_{1}MO\,}(\mathbb {R}^n)\), we deduce that, for any \(\epsilon \in (0,\infty )\), there exists a \(b^{(\epsilon )}\in B_1(\mathbb {R}^n)\) such that \(\Vert b-b^{(\epsilon )}\Vert _{{\mathrm{\,BMO\,}}(\mathbb {R}^n)}<\epsilon .\) Then, by the boundedness of \([b,T]_1\) from \({L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\) to \({L_w^p(\mathbb {R}^n)}\) (see [24, Theorem 3.18] and also [4] for more general results), we obtain, for any \((f,g)\in {L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\),

$$\begin{aligned}&\left\| [b,T]_1(f,g)-\left[ b^{(\epsilon )},T\right] _1(f,g)\right\| _{{L_w^p(\mathbb {R}^n)}}\\&\quad =\left\| \left[ b-b^{(\epsilon )},T\right] _1(f,g)\right\| _{{L_w^p(\mathbb {R}^n)}} \lesssim \left\| b-b^{(\epsilon )}\right\| _{{\mathrm{\,BMO\,}}(\mathbb {R}^n)}\Vert f\Vert _{{L_{w_1}^{p_1}(\mathbb {R}^n)}}\Vert g\Vert _{{L_{w_2}^{p_2}(\mathbb {R}^n)}}\\&\quad \lesssim \epsilon \Vert f\Vert _{{L_{w_1}^{p_1}(\mathbb {R}^n)}}\Vert g\Vert _{{L_{w_2}^{p_2}(\mathbb {R}^n)}}. \end{aligned}$$

Moreover, using Lemma 3.1 and the boundedness of \({\mathcal {M}}\) from \({L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\) to \({L_w^p(\mathbb {R}^n)}\) ([24, Theorem 3.3]), we conclude that

$$\begin{aligned} \lim _{\eta \rightarrow 0}\left\| [b,T]_1-\left[ b,T_\eta \right] _1\right\| _{{L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\rightarrow {L_w^p(\mathbb {R}^n)}}=0, \end{aligned}$$

where \(\Vert [b,T]_1-[b,T_\eta ]_1\Vert _{{L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\rightarrow {L_w^p(\mathbb {R}^n)}}\) denotes the operator norm of \([b,T]_1-[b,T_\eta ]_1\) from \({L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\) to \({L_w^p(\mathbb {R}^n)}\). Thus, to prove that \([b,T]_1\) is compact for any \(b\in \mathrm{\,XMO\,}(\mathbb {R}^n)\), by [35, p. 278, Theorem (iii)], it suffices to show that \([b,T_\eta ]_1\) is compact for any \(b\in B_1(\mathbb {R}^n)\) and \(\eta \in (0,\infty )\) small enough. To this end, by the definition of compact operators, Lemma 3.2, and Remark 3.3, we know that it suffices to show that, for any fixed \(b\in B_1(\mathbb {R}^n)\), \(\eta \in (0,\infty )\) small enough, and bounded \({\mathcal {E}}\subset {L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\), \([b,T_\eta ]_1{\mathcal {E}}\) satisfies (i), (ii), and (iii) of Lemma 3.2. In what follows, we show these in order.
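
The reduction just described can be summarized by a single triangle inequality, which we record here as a sketch. It assumes that Lemma 3.1 is applicable to \(b^{(\epsilon )}\) and that \(\Vert \nabla b^{(\epsilon )}\Vert _{L^\infty (\mathbb {R}^n)}<\infty \); the abbreviation \(\Vert \cdot \Vert _{\mathrm{op}}\) for the operator norm from \({L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\) to \({L_w^p(\mathbb {R}^n)}\) is ours, and, when \(p<1\), the quasi-triangle inequality only contributes an additional constant depending on p.

$$\begin{aligned} \left\| [b,T]_1-\left[ b^{(\epsilon )},T_\eta \right] _1\right\| _{\mathrm{op}}&\lesssim \left\| [b,T]_1-\left[ b^{(\epsilon )},T\right] _1\right\| _{\mathrm{op}} +\left\| \left[ b^{(\epsilon )},T\right] _1-\left[ b^{(\epsilon )},T_\eta \right] _1\right\| _{\mathrm{op}}\\&\lesssim \epsilon +\eta \left\| \nabla b^{(\epsilon )}\right\| _{L^\infty (\mathbb {R}^n)}. \end{aligned}$$

Choosing first \(\epsilon \) and then \(\eta \) (depending on \(b^{(\epsilon )}\)) small, we see that \([b,T]_1\) is a limit, in the operator norm, of operators of the form \([b^{(\epsilon )},T_\eta ]_1\) with \(b^{(\epsilon )}\in B_1(\mathbb {R}^n)\), which is the situation covered by the closedness property quoted from [35].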

To begin with, we show that \([b,T_\eta ]_1{\mathcal {E}}\) satisfies Lemma 3.2(i). To this end, we first claim that \(T_\eta \) is a Calderón–Zygmund operator, namely, \(K_\eta \) satisfies (1.1). Indeed, from \(0\le \varphi \le 1\) and hence \(|K_\eta |\le |K|\), it follows that the size condition [\(\alpha :={\vec {0}}_{3n}\) in (1.1)] holds true trivially. When \(|\alpha |=1\), without loss of generality, we may assume that \(\alpha :=(1,\overbrace{0,\ldots ,0}^{3n-1\ \mathrm{times}})\). Then, for any \(x,\ y,\ z\in \mathbb {R}^n\) with \(x\ne y\) or \(x\ne z\), by (3.1) and (1.1), we have

$$\begin{aligned}&\left| \frac{\partial }{\partial x_1} K_\eta (x,y,z)\right| \\&\quad \le \left| \frac{\partial }{\partial x_1} K(x,y,z)\right| \left| 1-\varphi _1\left( \frac{2}{\eta }[|x-y|+|x-z|]\right) \right| \\&\qquad +|K(x,y,z)|\left| \frac{\partial }{\partial x_1} \left[ 1-\varphi _1\left( \frac{2}{\eta }[|x-y|+|x-z|]\right) \right] \right| \\&\quad \le \left| \frac{\partial }{\partial x_1} K(x,y,z)\right| \\&\qquad +|K(x,y,z)|\left| \varphi _1'\left( \frac{2}{\eta }[|x-y|+|x-z|]\right) \right| \frac{2}{\eta }\left| \frac{\partial }{\partial x_1}(|x-y|+|x-z|)\right| \\&\quad \lesssim \frac{1}{(|x-y|+|x-z|)^{2n+1}}\\&\qquad +\frac{\frac{2}{\eta }(|x-y|+|x-z|)}{(|x-y|+|x-z|)^{2n+1}} \left| \varphi _1'\left( \frac{2}{\eta }[|x-y|+|x-z|]\right) \right| \\&\qquad \times \left| \frac{x_1-y_1}{|x-y|}+\frac{x_1-z_1}{|x-z|}\right| \\&\quad \lesssim \frac{1}{(|x-y|+|x-z|)^{2n+1}}. \end{aligned}$$

Therefore, \(K_\eta \) satisfies (1.1) and hence \(T_\eta \) is a Calderón–Zygmund operator and, moreover, the kernel constant is independent of \(\eta \). From this, \(b\in \mathrm{\,XMO\,}(\mathbb {R}^n)\subset {\mathrm{\,BMO\,}}(\mathbb {R}^n)\), the boundedness of Calderón–Zygmund commutators on weighted Lebesgue spaces ([24, Theorem 3.18]), and \((f,g)\in {\mathcal {E}}\), we deduce that

$$\begin{aligned} \left\| \left[ b,T_\eta \right] _1(f,g)\right\| _{{L_w^p(\mathbb {R}^n)}}\lesssim \Vert f\Vert _{{L_{w_1}^{p_1}(\mathbb {R}^n)}}\Vert g\Vert _{{L_{w_2}^{p_2}(\mathbb {R}^n)}}<\infty , \end{aligned}$$

which implies that \([b,T_\eta ]_1{\mathcal {E}}\) satisfies Lemma 3.2(i).
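
The last step in the above estimate of \(\frac{\partial }{\partial x_1}K_\eta \) rests on two elementary observations, which we spell out as a sketch. We assume here that \(\varphi _1\equiv 1\) on [0, 1], which is consistent with \({\mathrm{\,supp\,}}(\varphi _1)\subset [0,2]\) and with the way \(1-\varphi _1\) is split later in this proof, so that \(\varphi _1'\) is supported in [1, 2]; the abbreviation \(s:=|x-y|+|x-z|\) is ours. For any \(x,\ y,\ z\in \mathbb {R}^n\) with \(x\ne y\) and \(x\ne z\),

$$\begin{aligned} \left| \frac{\partial }{\partial x_1}(|x-y|+|x-z|)\right| =\left| \frac{x_1-y_1}{|x-y|}+\frac{x_1-z_1}{|x-z|}\right| \le 2 \quad \mathrm{and}\quad \frac{2s}{\eta }\left| \varphi _1'\left( \frac{2s}{\eta }\right) \right| \le 2\left\| \varphi _1'\right\| _{L^\infty ([0,\infty ))}. \end{aligned}$$

Neither bound depends on \(\eta \), which is why the kernel constant of \(K_\eta \) can be chosen independent of \(\eta \).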

Next, we show that \([b,T_\eta ]_1{\mathcal {E}}\) satisfies Lemma 3.2(ii). In what follows, we use the symbol \(E\gg D\) to denote that E is much larger than D. For fixed \(A\gg 1\) and fixed \(x\in \mathbb {R}^n\) with \(|x|>A\), we first split the truncated function \(1-\varphi _1\) into the following two parts. Let \(\varphi _2,\ \varphi _3\in C^\infty ([0,\infty ))\) satisfy

$$\begin{aligned} 0&\le \varphi _2,\ \varphi _3\le 1, \quad \varphi _2+\varphi _3=1-\varphi _1, \end{aligned}$$
(3.3)
$$\begin{aligned} \varphi _2(x)&= {\left\{ \begin{array}{ll} 1, \quad x\in \left[ 2,\displaystyle \frac{A}{2}\right] ,\\ 0, \quad x\in [0,1]\bigcup \left[ \displaystyle \frac{A}{2}+1,\infty \right) , \end{array}\right. } \end{aligned}$$
(3.4)

and

$$\begin{aligned} \varphi _3(x)= {\left\{ \begin{array}{ll} 0, \quad x\in \left[ 0,\displaystyle \frac{A}{2}\right] ,\\ 1, \quad x\in \left[ \displaystyle \frac{A}{2}+1,\infty \right) . \end{array}\right. } \end{aligned}$$
(3.5)

Accordingly, for any \((f,g)\in {L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\) and \(x\in \mathbb {R}^n\), we split \([b,T_\eta ]_1(f,g)(x)\) into the following three parts:

$$\begin{aligned}&\left| \left[ b,T_\eta \right] _1(f,g)(x)\right| \nonumber \\&\quad =\left| \int _{\mathbb {R}^{2n}}[b(x)-b(y)]K_\eta (x,y,z)f(y)g(z)\,dy\,dz \right| \nonumber \\&\quad =\left| \int _{\mathbb {R}^{2n}}\nabla b(\xi )\cdot (x-y) K_\eta (x,y,z)\right. \nonumber \\&\qquad \left. \times \left[ (\varphi _1+\varphi _2+\varphi _3)(|x-y|+|x-z|)\right] f(y)g(z)\,dy\,dz \right| \nonumber \\&\quad \le \int _{\mathbb {R}^{2n}}|\nabla b(\xi )||x-y|\left| K_\eta (x,y,z)\right| \varphi _1(|x-y|+|x-z|)|f(y)g(z)|\,dy\,dz \nonumber \\&\qquad +\int _{\mathbb {R}^{2n}}|\nabla b(\xi )||x-y|\left| K_\eta (x,y,z)\right| \varphi _2(|x-y|+|x-z|)|f(y)g(z)|\,dy\,dz \nonumber \\&\qquad +\int _{\mathbb {R}^{2n}}|\nabla b(\xi )||x-y|\left| K_\eta (x,y,z)\right| \varphi _3(|x-y|+|x-z|)|f(y)g(z)|\,dy\,dz \nonumber \\&\quad =:{\mathrm {L}}_1(x)+\mathrm {L}_2(x)+{\mathrm {L}}_3(x), \end{aligned}$$
(3.6)

where we applied the mean value theorem to \(b(x)-b(y)\), and \(\xi \) lies on the segment \({\overline{xy}}\) connecting x and y. We then estimate \({\mathrm {L}}_1(x)\) to \({\mathrm {L}}_3(x)\) in order.

To estimate \({\mathrm {L}}_1(x)\) as well as \(\Vert {\mathrm {L}}_1\Vert _{L^p_w(\{|x|>A\})}\), we notice that \(\xi \in {\overline{xy}}\) and hence

$$\begin{aligned} |x-\xi |\le |x-y|. \end{aligned}$$
(3.7)

Meanwhile, by (3.1), we know that \({\mathrm{\,supp\,}}(\varphi _1)\subset [0,2]\) and hence

$$\begin{aligned} |x-y|+|x-z|\le 2. \end{aligned}$$
(3.8)

From (3.7), (3.8), and \(|x|>A\gg 1\), it follows that

$$\begin{aligned} |\xi |&\ge |x|-|x-\xi |\ge |x|-|x-y| \nonumber \\&\ge |x|-(|x-y|+|x-z|)\ge |x|-2>\frac{A}{2}. \end{aligned}$$
(3.9)
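
The requirement \(A\gg 1\) in (3.9) can be made quantitative; the threshold \(A\ge 4\) below is our own remark, not a condition stated in the original argument. Indeed, for any \(x\in \mathbb {R}^n\) with \(|x|>A\),

$$\begin{aligned} |x|-2>A-2=\frac{A}{2}+\left( \frac{A}{2}-2\right) \ge \frac{A}{2} \quad \mathrm{whenever}\quad A\ge 4. \end{aligned}$$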

By (3.9), (1.1), and \(0\le \varphi \le 1\) (and hence \(|K_\eta |\le |K|\)), we conclude that, for any \((f,g)\in {L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\) and \(x\in \mathbb {R}^n\),

$$\begin{aligned} {\mathrm {L}}_1(x)&\lesssim \sup _{|\xi |>\frac{A}{2}}|\nabla b(\xi )| \int _{\mathbb {R}^{2n}}\frac{|x-y|\varphi _1(|x-y|+|x-z|)}{(|x-y|+|x-z|)^{2n}}|f(y)g(z)|\,dy\,dz\nonumber \\&\lesssim \sup _{|\xi |>\frac{A}{2}}|\nabla b(\xi )| \int _{\mathbb {R}^{2n}}\frac{\varphi _1(|x-y|+|x-z|)}{(|x-y|+|x-z|)^{2n-1}}|f(y)g(z)|\,dy\,dz\nonumber \\&\sim \sup _{|\xi |>\frac{A}{2}}|\nabla b(\xi )| \int _{\mathbb {R}^{2n}}{\mathcal {K}}_1(x,y,z)|f(y)g(z)|\,dy\,dz \nonumber \\&\sim \sup _{|\xi |>\frac{A}{2}}|\nabla b(\xi )| \mathrm {J}_1(|f|,|g|)(x), \end{aligned}$$
(3.10)

where

$$\mathrm {J}_1(|f|,|g|)(x):=\int _{\mathbb {R}^{2n}}\mathcal {K}_1(x,y,z)|f(y)g(z)|\,dy\,dz.$$

We now claim that \({\mathrm {J}}_1\) is a Calderón–Zygmund operator, namely, \({\mathcal {K}}_1\) satisfies (1.1). Indeed, by (3.1), we have, for any \(x,\ y,\ z\in \mathbb {R}^n\) with \(x\ne y\) or \(x\ne z\),

$$\begin{aligned} |{\mathcal {K}}_1(x,y,z)|&=\frac{\varphi _1(|x-y|+|x-z|)}{(|x-y|+|x-z|)^{2n-1}}\\&=\frac{(|x-y|+|x-z|)\varphi _1(|x-y|+|x-z|)}{(|x-y|+|x-z|)^{2n}}\\&\le \frac{2}{(|x-y|+|x-z|)^{2n}} \end{aligned}$$

and hence the size condition [\(\alpha ={\vec {0}}_{3n}\) in (1.1)] holds true. When \(|\alpha |=1\), without loss of generality, we may assume that \(\alpha :=(1,\overbrace{0,\ldots ,0}^{3n-1\ \mathrm{times}})\). Then, for any \(x,\ y,\ z\in \mathbb {R}^n\) with \(x\ne y\) or \(x\ne z\), we obtain

$$\begin{aligned}&\left| \frac{\partial }{\partial x_1}{\mathcal {K}}_1(x,y,z)\right| \\&\quad \lesssim \frac{\left| \varphi _1'(|x-y|+|x-z|)\Big (\frac{x_1-y_1}{|x-y|}+\frac{x_1-z_1}{|x-z|}\Big ) (|x-y|+|x-z|)^2\right| }{(|x-y|+|x-z|)^{2n+1}}\\&\qquad +\frac{\left| \varphi _1(|x-y|+|x-z|)(2n-1)(|x-y|+|x-z|)\Big (\frac{x_1-y_1}{|x-y|}+\frac{x_1-z_1}{|x-z|}\Big )\right| }{(|x-y|+|x-z|)^{2n+1}}\\&\quad \lesssim \frac{\Vert \varphi _1'\Vert _{L^\infty (\mathbb {R}^n)}+1}{(|x-y|+|x-z|)^{2n+1}} \lesssim \frac{1}{(|x-y|+|x-z|)^{2n+1}}. \end{aligned}$$

Therefore, \({\mathcal {K}}_1\) satisfies (1.1) and hence \({\mathrm {J}}_1\) is a Calderón–Zygmund operator, which shows the above claim. From this claim, the boundedness of Calderón–Zygmund operators on weighted Lebesgue spaces ([24, Corollary 3.9]), and (3.10), we deduce that, for any \((f,g)\in {L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\),

$$\begin{aligned} \Vert {\mathrm {L}}_1\Vert _{L^p_w(\{|x|>A\})}&\lesssim \sup _{|\xi |>\frac{A}{2}}|\nabla b(\xi )|\Vert {\mathrm {J}}_1\Vert _{{L_w^p(\mathbb {R}^n)}} \nonumber \\&\lesssim \sup _{|\xi |>\frac{A}{2}}|\nabla b(\xi )|\Vert f\Vert _{{L_{w_1}^{p_1}(\mathbb {R}^n)}}\Vert g\Vert _{{L_{w_2}^{p_2}(\mathbb {R}^n)}}. \end{aligned}$$
(3.11)

Next, we estimate \({\mathrm {L}}_2(x)\) as well as \(\Vert {\mathrm {L}}_2\Vert _{L^p_w(\{|x|>A\})}\). By (3.3) and (3.4), we know that \({\mathrm{\,supp\,}}(\varphi _2)\subset [1,\frac{A}{2}]\) and hence

$$\begin{aligned} |x-y|+|x-z|\le \frac{A}{2}. \end{aligned}$$
(3.12)

From (3.7), (3.12), and \(|x|>A\gg 1\), we deduce that

$$\begin{aligned} |\xi |\ge |x|-|x-\xi |\ge |x|-|x-y|\ge |x|-(|x-y|+|x-z|)\ge |x|-\frac{A}{2}>\frac{A}{2}, \end{aligned}$$

and hence (3.9) still holds true. By (3.9), (1.2), and \(0\le \varphi \le 1\) (and hence \(|K_\eta |\le |K|\)), we further conclude that, for any \((f,g)\in {L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\) and \(x\in \mathbb {R}^n\),

$$\begin{aligned} {\mathrm {L}}_2(x)&\lesssim \sup _{|\xi |>\frac{A}{2}}|\nabla b(\xi )| \int _{\mathbb {R}^{2n}}\frac{|x-y|\varphi _2(|x-y|+|x-z|)}{(|x-y|+|x-z|)^{2n+2}}|f(y)g(z)|\,dy\,dz\nonumber \\&\lesssim \sup _{|\xi |>\frac{A}{2}}|\nabla b(\xi )| \int _{\mathbb {R}^{2n}}\frac{\varphi _2(|x-y|+|x-z|)}{(|x-y|+|x-z|)^{2n+1}}|f(y)g(z)|\,dy\,dz\nonumber \\&\sim \sup _{|\xi |>\frac{A}{2}}|\nabla b(\xi )| \int _{\mathbb {R}^{2n}}{\mathcal {K}}_2(x,y,z)|f(y)g(z)|\,dy\,dz\nonumber \\&\sim \sup _{|\xi |>\frac{A}{2}}|\nabla b(\xi )| \mathrm {J}_2(|f|,|g|)(x), \end{aligned}$$
(3.13)

where

$$ \mathrm {J}_2(|f|,|g|)(x):=\int _{\mathbb {R}^{2n}}\mathcal {K}_2(x,y,z)|f(y)g(z)|\,dy\,dz.$$

We now claim that \({\mathrm {J}}_2\) is also a Calderón–Zygmund operator, namely, \({\mathcal {K}}_2\) satisfies (1.1). Indeed, by (3.3) and (3.4), we obtain, for any \(x,\ y,\ z\in \mathbb {R}^n\) with \(x\ne y\) or \(x\ne z\),

$$\begin{aligned} |{\mathcal {K}}_2(x,y,z)|=\frac{\varphi _2(|x-y|+|x-z|)}{(|x-y|+|x-z|)^{2n+1}} \le \frac{1}{(|x-y|+|x-z|)^{2n}} \end{aligned}$$

and hence the size condition [\(\alpha ={\vec {0}}_{3n}\) in (1.1)] holds true. When \(|\alpha |=1\), without loss of generality, we may assume that \(\alpha :=(1,\overbrace{0,\ldots ,0}^{3n-1\ \mathrm{times}})\). Then, for any \(x,\ y,\ z\in \mathbb {R}^n\) with \(x\ne y\) or \(x\ne z\), it holds true that

$$\begin{aligned}&\left| \frac{\partial }{\partial x_1}{\mathcal {K}}_2(x,y,z)\right| \\&\quad \lesssim \frac{\left| \varphi _2'(|x-y|+|x-z|)\Big (\frac{x_1-y_1}{|x-y|}+\frac{x_1-z_1}{|x-z|}\Big )\right| }{(|x-y|+|x-z|)^{2n+1}}\\&\qquad +\frac{\left| \varphi _2(|x-y|+|x-z|)(2n+1)(|x-y|+|x-z|)^{-1} \Big (\frac{x_1-y_1}{|x-y|}+\frac{x_1-z_1}{|x-z|}\Big )\right| }{(|x-y|+|x-z|)^{2n+1}}\\&\quad \lesssim \frac{\Vert \varphi _2'\Vert _{L^\infty (\mathbb {R}^n)}+1}{(|x-y|+|x-z|)^{2n+1}} \lesssim \frac{1}{(|x-y|+|x-z|)^{2n+1}}, \end{aligned}$$

where the implicit positive constant is independent of A. Therefore, \({\mathcal {K}}_2\) satisfies (1.1), and hence \({\mathrm {J}}_2\) is a Calderón–Zygmund operator, which shows the above claim. Using this claim, the boundedness of Calderón–Zygmund operators on weighted Lebesgue spaces ([24, Corollary 3.9]), and (3.13), we find that, for any \((f,g)\in {L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\),

$$\begin{aligned} \Vert {\mathrm {L}}_2\Vert _{L^p_w(\{|x|>A\})}&\lesssim \sup _{|\xi |>\frac{A}{2}}|\nabla b(\xi )|\Vert {\mathrm {J}}_2\Vert _{{L_w^p(\mathbb {R}^n)}} \nonumber \\&\lesssim \sup _{|\xi |>\frac{A}{2}}|\nabla b(\xi )|\Vert f\Vert _{{L_{w_1}^{p_1}(\mathbb {R}^n)}}\Vert g\Vert _{{L_{w_2}^{p_2}(\mathbb {R}^n)}} \end{aligned}$$
(3.14)

with the implicit positive constant independent of A.

Next, we estimate \({\mathrm {L}}_3(x)\) as well as \(\Vert {\mathrm {L}}_3\Vert _{L^p_w(\{|x|>A\})}\). From (1.2), \(0\le \varphi \le 1\) (and hence \(|K_\eta |\le |K|\)), and (3.5), we deduce that, for any \((f,g)\in {L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\) and \(x\in \mathbb {R}^n\),

$$\begin{aligned} {\mathrm {L}}_3(x)&\lesssim \Vert \nabla b\Vert _{L^\infty (\mathbb {R}^n)} \int _{\mathbb {R}^{2n}}\frac{|x-y|\varphi _3(|x-y|+|x-z|)}{(|x-y|+|x-z|)^{2n+2+\delta }} |f(y)g(z)|\,dy\,dz\nonumber \\&\lesssim \Vert \nabla b\Vert _{L^\infty (\mathbb {R}^n)} \int _{\mathbb {R}^{2n}}\frac{A^{-\delta }\varphi _3(|x-y|+|x-z|)}{(|x-y|+|x-z|)^{2n+1}} |f(y)g(z)|\,dy\,dz\nonumber \\&\sim \Vert \nabla b\Vert _{L^\infty (\mathbb {R}^n)} \int _{\mathbb {R}^{2n}}{\mathcal {K}}_3(x,y,z)|f(y)g(z)|\,dy\,dz \nonumber \\&\sim \Vert \nabla b\Vert _{L^\infty (\mathbb {R}^n)}\mathrm {J}_3(|f|,|g|)(x), \end{aligned}$$
(3.15)

where

$$\mathrm {J}_3(|f|,|g|)(x):=\int _{\mathbb {R}^{2n}}\mathcal {K}_3(x,y,z)|f(y)g(z)|\,dy\,dz.$$
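
The second inequality in (3.15) only uses the support of \(\varphi _3\); we make this explicit as a sketch, writing \(s:=|x-y|+|x-z|\) (our abbreviation). By (3.5), \(\varphi _3(s)\ne 0\) implies \(s>\frac{A}{2}\), and clearly \(|x-y|\le s\). Thus,

$$\begin{aligned} \frac{|x-y|\varphi _3(s)}{s^{2n+2+\delta }} \le \frac{\varphi _3(s)}{s^{2n+1+\delta }} \le \left( \frac{A}{2}\right) ^{-\delta }\frac{\varphi _3(s)}{s^{2n+1}} =2^{\delta }\,\frac{A^{-\delta }\varphi _3(s)}{s^{2n+1}}, \end{aligned}$$

so that the factor \(A^{-\delta }\) is gained with a constant depending only on \(\delta \).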

We now claim that \({\mathrm {J}}_3\) is also a Calderón–Zygmund operator, namely, \({\mathcal {K}}_3\) satisfies (1.1) with the kernel constant \(O(A^{-\delta })\) with \(\delta \in (0,\infty )\) as in (1.2). Indeed, by (3.3) and (3.5), we have, for any \(x,\ y,\ z\in \mathbb {R}^n\) with \(x\ne y\) or \(x\ne z\),

$$\begin{aligned} |{\mathcal {K}}_3(x,y,z)|=\frac{A^{-\delta }\varphi _3(|x-y|+|x-z|)}{(|x-y|+|x-z|)^{2n+1}} \lesssim \frac{A^{-\delta }}{(|x-y|+|x-z|)^{2n}} \end{aligned}$$

and hence the size condition [\(\alpha ={\vec {0}}_{3n}\) in (1.1)] holds true with the kernel constant \(O(A^{-\delta })\). When \(|\alpha |=1\), without loss of generality, we may assume that \(\alpha :=(1,\overbrace{0,\ldots ,0}^{3n-1\ \mathrm{times}})\). Then, for any \(x,\ y,\ z\in \mathbb {R}^n\) with \(x\ne y\) or \(x\ne z\),

$$\begin{aligned}&\left| \frac{\partial }{\partial x_1}{\mathcal {K}}_3(x,y,z)\right| \\&\quad \lesssim A^{-\delta }\frac{\left| \varphi _3'(|x-y|+|x-z|)\Big (\frac{x_1-y_1}{|x-y|}+\frac{x_1-z_1}{|x-z|}\Big )\right| }{(|x-y|+|x-z|)^{2n+1}}\\&\qquad +A^{-\delta }\frac{\left| \varphi _3(|x-y|+|x-z|)(2n+1)(|x-y|+|x-z|)^{-1}\Big (\frac{x_1-y_1}{|x-y|}+\frac{x_1-z_1}{|x-z|}\Big )\right| }{(|x-y|+|x-z|)^{2n+1}}\\&\quad \lesssim A^{-\delta }\frac{\Vert \varphi _3'\Vert _{L^\infty (\mathbb {R}^n)}+\big (\frac{A}{2}\big )^{-1}}{(|x-y|+|x-z|)^{2n+1}} \lesssim \frac{A^{-\delta }}{(|x-y|+|x-z|)^{2n+1}}. \end{aligned}$$

Thus, \({\mathcal {K}}_3\) satisfies (1.1) with the kernel constant \(O(A^{-\delta })\) and hence \({\mathrm {J}}_3\) is a Calderón–Zygmund operator, which shows the above claim. By this claim, the boundedness of Calderón–Zygmund operators on weighted Lebesgue spaces ([24, Corollary 3.9]), and (3.15), we conclude that, for any \((f,g)\in {L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\),

$$\begin{aligned} \Vert {\mathrm {L}}_3\Vert _{L^p_w(\{|x|>A\})}&\lesssim \Vert \nabla b\Vert _{L^\infty (\mathbb {R}^n)}\Vert {\mathrm {J}}_3\Vert _{{L_w^p(\mathbb {R}^n)}} \nonumber \\&\lesssim \frac{\Vert \nabla b\Vert _{L^\infty (\mathbb {R}^n)}}{A^\delta }\Vert f\Vert _{{L_{w_1}^{p_1}(\mathbb {R}^n)}}\Vert g\Vert _{{L_{w_2}^{p_2}(\mathbb {R}^n)}}. \end{aligned}$$
(3.16)

To sum up, for any given \(\epsilon \in (0,\infty )\), there exists a positive constant A large enough such that

$$\begin{aligned} \sup _{|\xi |>\frac{A}{2}}|\nabla b(\xi )|\lesssim \epsilon \quad \mathrm{and}\quad \frac{\Vert \nabla b\Vert _{L^\infty (\mathbb {R}^n)}}{A^\delta }\lesssim \epsilon \end{aligned}$$

hold true, where the implicit positive constants are independent of A and \(\epsilon \). From this, (3.6), (3.11), (3.14), (3.16), \(b\in B_1(\mathbb {R}^n)\), and \((f,g)\in {\mathcal {E}}\) bounded, we deduce that

$$\begin{aligned} \left\| \left[ b,T_\eta \right] _1(f,g)\right\| _{L^p_w(\{|x|>A\})} \le \sum _{k=1}^3\Vert {\mathrm {L}}_k\Vert _{L^p_w(\{|x|>A\})}<\epsilon , \end{aligned}$$

which implies that \([b,T_\eta ]_1{\mathcal {E}}\) satisfies Lemma 3.2(ii).
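
For clarity, the way (3.11), (3.14), and (3.16) combine can be displayed in one line. This is a sketch which assumes, as the use of \(b\in B_1(\mathbb {R}^n)\) above suggests, that \(\nabla b\) is bounded and \(|\nabla b(\xi )|\rightarrow 0\) as \(|\xi |\rightarrow \infty \); when \(p<1\), the quasi-triangle inequality in \(L^p_w\) only adds a constant depending on p.

$$\begin{aligned} \sum _{k=1}^3\Vert {\mathrm {L}}_k\Vert _{L^p_w(\{|x|>A\})} \lesssim \left[ \sup _{|\xi |>\frac{A}{2}}|\nabla b(\xi )|+\frac{\Vert \nabla b\Vert _{L^\infty (\mathbb {R}^n)}}{A^\delta }\right] \sup _{(f,g)\in {\mathcal {E}}}\Vert f\Vert _{{L_{w_1}^{p_1}(\mathbb {R}^n)}}\Vert g\Vert _{{L_{w_2}^{p_2}(\mathbb {R}^n)}}. \end{aligned}$$

Since the implicit positive constant is independent of A and the bracket tends to 0 as \(A\rightarrow \infty \), the right-hand side is smaller than any given \(\epsilon \) for A large enough, uniformly in \((f,g)\in {\mathcal {E}}\).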

It remains to prove that \([b,T_\eta ]_1{\mathcal {E}}\) also satisfies Lemma 3.2(iii). Recall that \(\eta \) is a fixed positive constant small enough. Let \(t\in \mathbb {R}^n\) satisfy \(|t|\in (0,\eta /16)\), so that both smallness conditions on \(|t|\) used below hold true. Then, for any \((f,g)\in {L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\) and \(x\in \mathbb {R}^n\), we have

$$\begin{aligned}&\left[ b,T_\eta \right] _1(f,g)(x)-\left[ b,T_\eta \right] _1(f,g)(x+t)\nonumber \\&\quad =\int _{\mathbb {R}^{2n}}[b(x)-b(y)]K_\eta (x,y,z)f(y)g(z)\,dy\,dz\nonumber \\&\qquad -\int _{\mathbb {R}^{2n}}[b(x+t)-b(y)]K_\eta (x+t,y,z)f(y)g(z)\,dy\,dz\nonumber \\&\quad =[b(x)-b(x+t)]\int _{\mathbb {R}^{2n}}K_\eta (x,y,z)f(y)g(z)\,dy\,dz\nonumber \\&\qquad +\int _{\mathbb {R}^{2n}}[b(x+t)-b(y)]\left[ K_\eta (x,y,z)-K_\eta (x+t,y,z)\right] f(y)g(z)\,dy\,dz\nonumber \\&\quad =:{\mathrm {L}}_4(x)+\mathrm {L}_5(x). \end{aligned}$$
(3.17)

Observe that we have shown that \(K_\eta \) satisfies (1.1) with the kernel constant independent of \(\eta \) [see the above proof of \([b,T_\eta ]_1{\mathcal {E}}\) satisfying Lemma 3.2(i)]. From this, the mean value theorem, and the boundedness of Calderón–Zygmund operators on weighted Lebesgue spaces ([24, Corollary 3.9]), it follows that, for any \((f,g)\in {L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\),

$$\begin{aligned} \Vert {\mathrm {L}}_4\Vert _{{L_w^p(\mathbb {R}^n)}}\lesssim |t|\Vert \nabla b\Vert _{L^\infty (\mathbb {R}^n)}\Vert f\Vert _{{L_{w_1}^{p_1}(\mathbb {R}^n)}}\Vert g\Vert _{{L_{w_2}^{p_2}(\mathbb {R}^n)}}. \end{aligned}$$
(3.18)

To estimate \({\mathrm {L}}_5\), we first observe that, for any \(x,\ y,\ z,\ t\in \mathbb {R}^n\) with \(|x-y|+|x-z|<\frac{\eta }{4}\) and \(|t|<\frac{\eta }{8}\),

$$\begin{aligned} 1-\varphi _1\left( \frac{2}{\eta }[|x-y|+|x-z|]\right) =0 =1-\varphi _1\left( \frac{2}{\eta }[|x+t-y|+|x+t-z|]\right) \end{aligned}$$

and hence

$$\begin{aligned} K_\eta (x,y,z)=0=K_\eta (x+t,y,z). \end{aligned}$$
(3.19)

Besides, for any \(x,\ y,\ z,\ t\in \mathbb {R}^n\) with \(|x-y|+|x-z|\ge \frac{\eta }{4}\) and \(|t|<\frac{\eta }{16}\), we have

$$\begin{aligned} |t|\le \frac{1}{4}(|x-y|+|x-z|). \end{aligned}$$

This, together with the mean value theorem and (1.1), implies that

$$\begin{aligned}&\left| K_\eta (x,y,z)-K_\eta (x+t,y,z) \right| \nonumber \\&\quad =\left| t\nabla _x K_\eta (\zeta ,y,z)\right| \lesssim \frac{|t|}{(|\zeta -y|+|\zeta -z|)^{2n+1}}\nonumber \\&\quad \lesssim \frac{|t|}{(|x-y|+|x-z|-2|x-\zeta |)^{2n+1}}\nonumber \\&\quad \lesssim \frac{|t|}{(|x-y|+|x-z|-2|t|)^{2n+1}} \lesssim \frac{|t|}{(|x-y|+|x-z|)^{2n+1}}, \end{aligned}$$
(3.20)

where \(\nabla _x K_\eta (\cdot ,y,z)\) denotes the gradient of \(K_\eta (\cdot ,y,z)\) on the first variable with y, z fixed, and \(\zeta \) lies on the segment connecting x and \(x+t\). By (3.17), the mean value theorem, (3.19), and (3.20), we have, for any \((f,g)\in {L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\) and \(x\in \mathbb {R}^n\),

$$\begin{aligned} |{\mathrm {L}}_5(x)|&\lesssim |t|^2\Vert \nabla b\Vert _{L^\infty (\mathbb {R}^n)}\int _{|x-y|+|x-z|>\frac{\eta }{4}} \frac{|f(y)g(z)|}{(|x-y|+|x-z|)^{2n+1}}\,dy\,dz\\&\lesssim |t|^2\Vert \nabla b\Vert _{L^\infty (\mathbb {R}^n)}\sum _{k=0}^\infty \frac{1}{(2^k\eta )^{2n+1}} \int _{2^k\frac{\eta }{4}<|x-y|+|x-z|\le 2^{k+1}\frac{\eta }{4}}|f(y)g(z)|\,dy\,dz\\&\lesssim |t|^2\Vert \nabla b\Vert _{L^\infty (\mathbb {R}^n)}\sum _{k=0}^\infty \frac{1}{2^k\eta } \fint _{Q(x,2^{k+1}\frac{\eta }{4})}|f(y)|\,dy\fint _{Q(x,2^{k+1}\frac{\eta }{4})}|g(z)|\,dz\\&\lesssim \frac{|t|^2}{\eta }\Vert \nabla b\Vert _{L^\infty (\mathbb {R}^n)} {\mathcal {M}}(f,g)(x) \end{aligned}$$

and hence, by the boundedness of \({\mathcal {M}}\) from \({L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\) to \({L_w^p(\mathbb {R}^n)}\) ([24, Theorem 3.3]), we further conclude that, for any \((f,g)\in {L_{w_1}^{p_1}(\mathbb {R}^n)}\times {L_{w_2}^{p_2}(\mathbb {R}^n)}\),

$$\begin{aligned} \Vert {\mathrm {L}}_5\Vert _{{L_w^p(\mathbb {R}^n)}}\lesssim \frac{|t|^2}{\eta }\Vert \nabla b\Vert _{L^\infty (\mathbb {R}^n)}\Vert f\Vert _{{L_{w_1}^{p_1}(\mathbb {R}^n)}}\Vert g\Vert _{{L_{w_2}^{p_2}(\mathbb {R}^n)}}. \end{aligned}$$
(3.21)
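
The elementary arithmetic behind (3.19), (3.20), and the dyadic sum in the estimate of \({\mathrm {L}}_5\) can be sketched as follows. Here \(s:=|x-y|+|x-z|\) is our abbreviation, and we assume that \(Q(x,r)\) denotes the cube centered at x with side length 2r, as in the proof of Lemma 3.1. First, if \(s<\frac{\eta }{4}\) and \(|t|<\frac{\eta }{8}\), then the rescaled argument of \(\varphi _1\) at the shifted point stays in [0, 1); second, if \(|t|\le \frac{s}{4}\) and \(\zeta \) lies on the segment connecting x and \(x+t\), then the denominator in (3.20) is comparable to s:

$$\begin{aligned} \frac{2}{\eta }[|x+t-y|+|x+t-z|]\le \frac{2}{\eta }(s+2|t|)<\frac{2}{\eta }\left( \frac{\eta }{4}+\frac{\eta }{4}\right) =1 \quad \mathrm{and}\quad |\zeta -y|+|\zeta -z|\ge s-2|x-\zeta |\ge s-2|t|\ge \frac{s}{2}. \end{aligned}$$

Finally, since \(Q(x,2^{k+1}\frac{\eta }{4})\) has side length \(2^k\eta \),

$$\begin{aligned} \frac{1}{(2^k\eta )^{2n+1}}\left| Q\left( x,2^{k+1}\frac{\eta }{4}\right) \right| ^2 =\frac{(2^k\eta )^{2n}}{(2^k\eta )^{2n+1}}=\frac{1}{2^k\eta } \quad \mathrm{and}\quad \sum _{k=0}^\infty \frac{1}{2^k\eta }=\frac{2}{\eta }, \end{aligned}$$

which accounts for the factor \(\frac{1}{\eta }\) in the pointwise bound of \({\mathrm {L}}_5\) by \({\mathcal {M}}(f,g)\).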

Combining (3.17), (3.18), and (3.21), we conclude that, for any \(b\in B_1(\mathbb {R}^n)\) and \((f,g)\in {\mathcal {E}}\) bounded,

$$\begin{aligned} \lim _{|t|\rightarrow 0}\left\| \left[ b,T_\eta \right] _1(f,g)(\cdot )-\left[ b,T_\eta \right] _1(f,g)(\cdot +t) \right\| _{{L_w^p(\mathbb {R}^n)}}=0, \end{aligned}$$

which implies that \([b,T_\eta ]_1{\mathcal {E}}\) also satisfies Lemma 3.2(iii). Thus, \([b,T_\eta ]_1{\mathcal {E}}\) satisfies (i), (ii), and (iii) of Lemma 3.2 and hence \([b,T_\eta ]_1\) is a compact operator for any \(b\in B_1(\mathbb {R}^n)\) and \(\eta \in (0,\infty )\) small enough. This finishes the proof of Theorem 1.4. \(\square \)