1 Motivation, Preliminaries and Statements of Main Results

The work of Calderón and Zygmund on singular integrals and Calderón’s ideas [8, 9] about improving a pseudodifferential calculus, where the smoothness assumptions on the coefficients are minimal, have greatly affected research in quasilinear and nonlinear PDEs. The subsequent investigations about multilinear operators initiated by Coifman and Meyer [13] in the late 70s have added to the success of Calderón’s work on commutators. A classical bilinear estimate, the so-called Kato-Ponce commutator estimate [26], is crucial in the study of the Navier-Stokes equations. Its original formulation is the following: if 1<r<∞, s>0, and f,g are Schwartz functions, then

$$ \|[J^s, f](g)\|_{L^r}\lesssim\|\nabla f\|_{L^\infty}\|J^{s-1}g\| _{L^r}+\|J^s f\|_{L^r}\|g\|_{L^\infty} $$
(1.1)

where \(J^{s}:=(I-\Delta)^{s/2}\) denotes the Bessel potential of order s and \([J^{s},f](\cdot):=J^{s}(f\,\cdot)-f(J^{s}\,\cdot)\) is the commutator of \(J^{s}\) with f.

The previous estimate (1.1) has been recast later on into a general Leibniz-type rule (still commonly known as Kato-Ponce’s inequality) which takes the form

$$ \|D^\alpha(fg)\|_{L^r}\lesssim\|D^\alpha f\|_{L^p}\|g\|_{L^q}+\|f\| _{L^p}\|D^\alpha g\|_{L^q}, $$
(1.2)

for 1<p,q≤∞, 1<r<∞, 1/p+1/q=1/r and α>0. An extension of the estimate (1.2) to the range 1/2<r<∞ can be found in the recent work of Grafakos and S. Oh [17], while the ∞-end point result is explored by Grafakos, Maldonado and Naibo in [16]. More general Leibniz-type rules that apply to bilinear pseudodifferential operators with symbols in the bilinear Hörmander classes \(BS_{\rho, \delta}^{m}\) (see (1.8) below for their definition) can be found, for example, in the works of Bényi et al. [14] and Bernicot et al. [7]. Interestingly, for α=1 and in dimension one, Kato-Ponce’s inequality (1.2) is closely related to the boundedness of Calderón’s first commutator. Given a Lipschitz function a and \(f\in L^{2}\), define C(a,f) by

$$C (a, f)= p.v.\int_{\mathbb {R}} \frac{a(x)-a(y)}{(x-y)^2}f(y)\,dy. $$

Then, denoting by H the classical Hilbert transform, we can identify the operator C(a,⋅) with the commutator of \(T=H\partial_{x}\) and the multiplication by the Lipschitz function a; that is, C(a,f)=[T,a](f):=T(af)−aT(f). While we have no hope of controlling each of the individual terms defining [T,a], the commutator itself does behave nicely; Calderón showed [9] that \(\big\|[T, a](f)\big\|_{L^{2}}\lesssim\|a'\| _{L^{\infty}}\|f\|_{L^{2}}\), effectively producing the bilinear boundedness of the operator \(C:\mathrm{Lip}^{1}\times L^{2}\to L^{2}\). Moreover, the boundedness of the first commutator can be extended to give the following result, see [28, Theorem 4 on p. 90]:

Theorem A

Let \(T_{\sigma}\) be a linear pseudodifferential operator with symbol \(\sigma\in S_{1, 0}^{1}\) and a be a Lipschitz function such that \(\nabla a\in L^{\infty}\). Then, \([T_{\sigma},a]\) is a linear Calderón-Zygmund operator. In particular, \([T_{\sigma},a]\) is bounded on \(L^{p}\), 1<p<∞. Conversely, if \([D_{j},a]\) is bounded on \(L^{2}\), j=1,…,n, then \(\nabla a\in L^{\infty}\).

The statement of Theorem A is the very manifestation of the so-called commutator smoothing effect: while the Hörmander class of symbols \(S_{1, 0}^{1}\) does not yield bounded pseudodifferential operators on \(L^{p}\), the commutator with a sufficiently smooth function (Lipschitz in our case) fixes this issue. An application of this result can be found in the work of Kenig, Ponce and Vega [27] on nonlinear Schrödinger equations.

The smoothing effect of commutators gets better when we commute with special multiplicative functions. For example, the result of Coifman, Rochberg and Weiss [12] gives the boundedness on \(L^{p} (\mathbb {R}^{n})\), 1<p<∞, of linear commutators of Calderón-Zygmund operators and pointwise multiplication, when the multiplicative function (or symbol) is in the John-Nirenberg space BMO. Uchiyama [34] improved the boundedness to compactness if the multiplicative function is in CMO; here, CMO denotes the closure of \(C^{\infty}\)-functions with compact supports under the BMO-norm. The CMO in our context stands for “continuous mean oscillation” and is not to be confused with other versions of CMO (such as “central mean oscillation”). In fact, the CMO we are considering coincides with VMO, the space of functions of “vanishing mean oscillation” studied by Coifman and Weiss in [14], but also differs from other versions of VMO found in the literature; see, for example, [6] for further comments on the relation between CMO and VMO. An application of this compactness to deriving a Fredholm alternative for equations with CMO coefficients in all \(L^{p}\) spaces with 1<p<∞ was given by Iwaniec and Sbordone [25]. Other important applications appear in the theory of compensated compactness of Coifman, Lions, Meyer and Semmes [11] and in the integrability theory of Jacobians, see Iwaniec [24].

In this work, we seek to extend such results for linear commutators to the multilinear setting. For ease of notation and comprehension, we restrict ourselves to the bilinear case. The bilinear Calderón-Zygmund theory is nowadays well understood; for example, the work of Grafakos and Torres [18] makes available a bilinear T(1) theorem for such operators. As an application of their T(1) result, we can obtain the boundedness of bilinear pseudodifferential operators with symbols in appropriate Hörmander classes of bilinear pseudodifferential symbols. Moreover, the bilinear Hörmander pseudodifferential theory has nowadays a similarly solid foundation, see again [13] and the work of Bényi and Torres [5].

Our discussion on the study of such classes of bilinear operators, on the one hand, exploits the characteristics of their kernels in the spatial domain and, on the other hand, makes use of the properties of their symbols in the frequency domain. First, consider bilinear operators a priori defined from \(\mathcal{S}\times\mathcal{S}\) into \(\mathcal{S}'\) of the form

$$ T (f, g)(x)=\int_{\mathbb {R}^n}\int_{\mathbb {R}^n} K(x, y, z)f(y)g(z)\,dydz. $$
(1.3)

Here, we assume that, away from the diagonal \(\varOmega= \{(x,y,z) \in \mathbb {R}^{3n}: x=y=z \}\), the distributional kernel K coincides with a function K(x,y,z) locally integrable in \(\mathbb {R}^{3n}\setminus\varOmega\) satisfying the following size and regularity conditions in \(\mathbb{R}^{3n}\setminus \varOmega\):

$$ |K(x,y,z)| \lesssim\big(|x-y| + |x-z| + |y-z|\big)^{-2n}, $$
(1.4)

and

$$ |K(x,y,z) - K(x',y,z)| \lesssim\frac{|x-x'|}{ (|x-y| + |x-z| + |y-z| )^{2n + 1}}, $$
(1.5)

whenever \(|x - x'| \leq\frac{1}{2} \max\{|x-y|,|x-z|\}\). While the condition (1.5) is not the most general that one can impose in such theory, see [18], we prefer to work with this simplified formulation in order to avoid unnecessary further technicalities. For symmetry and interpolation purposes we also require that the formal transpose kernels \(K^{*1}, K^{*2}\) (of the transpose operators \(T^{*1}, T^{*2}\), respectively), given by

$$K^{*1}(x,y,z) = K(y,x,z) \quad\text{and}\quad K^{*2}(x,y,z)=K(z,y,x), $$

also satisfy (1.5). Moreover, for an additional simplification, in the following we will replace the regularity conditions (1.5) on \(K, K^{*1}\) and \(K^{*2}\) with the natural conditions on the gradient ∇K:

$$ |\nabla K(x,y,z)| \lesssim\big(|x-y| + |x-z| + |y-z|\big)^{-2n -1}, $$
(1.6)

for \((x, y, z)\in \mathbb {R}^{3n} \setminus\varOmega\). We say that such a kernel K(x,y,z) is a bilinear Calderón-Zygmund kernel. Moreover, given a bilinear operator T defined in (1.3) with a Calderón-Zygmund kernel K (which satisfies (1.4) and (1.6)), we say that T is a bilinear Calderón-Zygmund operator if it extends to a bounded operator from \(L^{p_{0}} \times L^{q_{0}}\) into \(L^{r_{0}}\) for some \(1<p_{0},q_{0}<\infty\) and \(1/p_{0}+1/q_{0}=1/r_{0}\leq1\).

The crux of bilinear Calderón-Zygmund theory is the following statement, see [18].

Theorem B

Let T be a bilinear Calderón-Zygmund operator. Then, T maps \(L^{p}\times L^{q}\) into \(L^{r}\) for all p,q,r such that 1<p,q<∞ and 1/p+1/q=1/r≤1. Moreover, we also have the following end-point boundedness results:

  (a) When p=1 or q=1, then T maps \(L^{p}\times L^{q}\) into \(L^{r,\infty}\);

  (b) When p=q=∞, then T maps \(L^{\infty}\times L^{\infty}\) into BMO.

Theorem B assumes the boundedness \(L^{p_{0}}\times L^{q_{0}}\to L^{r_{0}}\) of the operator T for some Hölder triple \((p_{0},q_{0},r_{0})\). Obtaining one such boundedness via appropriate cancelation conditions is another topic of interest in the theory of linear and multilinear operators with Calderón-Zygmund kernels. A satisfactory answer is provided by the T(1) theorem; the following bilinear version, as stated by Hart [22], is equivalent to the formulation in [18] and is strongly influenced by the fundamental work of David and Journé [15] in the linear case. For the proof of Theorem C below, see [22, Theorem 2.4], [20, Theorem 1.1], and [21] for the correct formulation of [20, Remark 2.1]. It is worthwhile noting that the proof via continuous decompositions in [22] can be simplified if one uses an appropriate discrete formulation; this argument is essentially contained in [23], if slightly obfuscated by the notation needed there to work with accretive functions.

Theorem C

Let \(T: \mathcal{S}\times\mathcal{S}\to\mathcal{S}'\) be a bilinear singular integral operator with Calderón-Zygmund kernel K. Then, T can be extended to a bounded operator from \(L^{p_{0}} \times L^{q_{0}}\) into \(L^{r_{0}}\) for some \(1<p_{0},q_{0}<\infty\) and \(1/p_{0}+1/q_{0}=1/r_{0}\leq1\) if and only if T satisfies the following two conditions:

  (i) T has the weak boundedness property,

  (ii) \(T(1,1)\), \(T^{*1}(1,1)\) and \(T^{*2}(1,1)\) are in BMO.

For the definition of the weak boundedness property and the precise meaning for an expression such as T(1,1), see Sects. 2.4 and 2.5.

Now, we turn our attention to the relation between bilinear Calderón-Zygmund operators and bilinear pseudodifferential operators. A bilinear pseudodifferential operator \(T_{\sigma}\) with a symbol σ, a priori defined from \(\mathcal{S} \times\mathcal{S}\) into \(\mathcal{S}'\), is given by

$$ T_\sigma(f,g)(x)=\int_{\mathbb {R}^n}\int_{\mathbb {R}^n} \sigma(x, \xi, \eta )\widehat{f}(\xi)\widehat{g}(\eta) e^{ix\cdot(\xi+ \eta)}d\xi d\eta. $$
(1.7)

We say that a symbol σ belongs to the bilinear class \(BS^{m}_{\rho, \delta}\) if

$$ |\partial_x^\alpha\partial_\xi^\beta\partial _\eta ^\gamma\sigma(x, \xi,\eta)|\lesssim\big(1+|\xi |+|\eta|\big) ^{m+\delta|\alpha|-\rho(|\beta|+|\gamma|)} $$
(1.8)

for all \((x, \xi, \eta)\in{ \mathbb {R}}^{3n}\) and all multi-indices α, β and γ. Such symbols are commonly referred to as bilinear Hörmander pseudodifferential symbols. The collection of bilinear pseudodifferential operators with symbols in \(BS_{\rho, \delta}^{m}\) will be denoted by \(\mathcal{O}p \, BS^{m}_{\rho, \delta }\). Note that, for example, operators in \(\mathcal{O}p \, BS^{m}_{\rho, \delta}\) model the product of two functions and their derivatives; see Remark 1 at the end of this section.

It is a known fact that bilinear Calderón-Zygmund kernels correspond to bilinear pseudodifferential symbols in the class \(BS_{1, 1}^{0}\), see [18]. Moreover, Calderón-Zygmund operators are “essentially the same” as pseudodifferential operators with symbols in the subclass \(BS_{1, \delta}^{0}\), 0≤δ<1, a fact that in turn is tightly connected to the existence of a symbolic calculus for \(BS_{1, \delta }^{0}\), see [1].

Theorem D

Let \(\sigma\in BS_{1, \delta}^{0}\), 0≤δ<1. Then, \(T_{\sigma }^{*j}=T_{\sigma^{*j}}\) with \(\sigma^{*j}\in BS_{1, \delta}^{0}\), j=1,2, and \(T_{\sigma}\) is a bilinear Calderón-Zygmund operator.

Thus, we can view bilinear Calderón-Zygmund operators on the frequency side as operators given by (1.7) with symbols \(\sigma\in BS^{m}_{\rho, \delta}\), where ρ=1,0≤δ<1 and m=0.

Our main interest is to consider the previously defined bilinear operators under the additional operation of commutation. For a bilinear operator T, and (multiplicative) functions \(b, b_{1}\), and \(b_{2}\), we consider the following three bilinear commutators:

$$\begin{aligned}{} [T, b]_1(f, g)&= T(bf,g)-bT(f,g),\\ [T, b]_2 (f, g)&=T(f,bg)-bT(f,g),\\ [[T, b_1]_{_1}, b_2]_2 (f, g)&= [T, b_1]_{_1}(f,b_2g) - b_2[T, b_1]_{_1}(f,g). \end{aligned}$$

First, we consider the case when T is a bilinear Calderón-Zygmund operator with kernel K and \(b, b_{1}, b_{2}\) belong to \(\mathit{BMO}(\mathbb {R}^{n})\). Then, the three bilinear commutators can formally be written as

$$\begin{aligned}{} [T, b]_1(f, g)(x)&=\int_{\mathbb {R}^n}\int_{\mathbb {R}^n} K(x, y, z)\big(b(y)-b(x)\big)f(y)g(z)\, dydz,\\ [T, b]_2 (f, g)(x)&=\int_{\mathbb {R}^n}\int_{\mathbb {R}^n} K(x, y, z)\big(b(z)-b(x)\big)f(y)g(z)\, dydz,\\ [[T, b_1]_{_1}, b_2]_2 (f, g)(x)&=\int_{\mathbb {R}^n}\int_{\mathbb {R}^n} K(x, y, z)\\&\quad{}\times\big(b_1(y)-b_1(x)\big) \big(b_2(z)-b_2(x)\big)f(y)g(z)\, dydz. \end{aligned}$$

As in the linear case, these operators are bounded from \(L^{p}\times L^{q}\) into \(L^{r}\) with 1/p+1/q=1/r for all 1<p,q<∞, see Grafakos and Torres [19], Perez and Torres [30], Perez et al. [31] and Tang [33], with estimates of the form

$$\begin{aligned} & \big\| [T, b]_1(f, g)\big\| _{L^r}, \|[T, b]_2(f, g)\|_{L^r}\lesssim\| b\| _{\mathit{BMO}}\|f\|_{L^p}\|g\|_{L^q},\\ & \vphantom{\Big|} \big\| [[T, b_1]_{_1}, b_2]_2 (f, g)\big\| _{L^r}\lesssim\|b_1\| _{\mathit{BMO}}\| b_2\|_{\mathit{BMO}}\|f\|_{L^p}\|g\|_{L^q}. \end{aligned}$$

However, the bilinear commutators obey a “smoothing effect” and are, in fact, even better behaved if we allow the symbols b to be slightly smoother. The following theorem of Bényi and Torres [6] should be regarded as the bilinear counterpart of the result of Uchiyama [34] mentioned before.

Theorem E

Let T be a bilinear Calderón-Zygmund operator. If \(b\in\mathit{CMO}\), 1/p+1/q=1/r, 1<p,q<∞ and 1≤r<∞, then \([T,b]_{1}:L^{p}\times L^{q}\to L^{r}\) is a bilinear compact operator. Similarly, if \(b_{1},b_{2}\in\mathit{CMO}\), then \([T,b_{2}]_{2}\) and \([[T, b_{1}]_{_{1}}, b_{2}]_{2}\) are bilinear compact operators for the same range of exponents.

Interestingly, the notion of compactness in the multilinear setting alluded to in Theorem E can be traced back to the foundational article of Calderón [10]. Given three normed spaces X,Y,Z, a bilinear operator \(T:X\times Y\to Z\) is called (jointly) compact if the set {T(x,y):∥x∥,∥y∥≤1} is precompact in Z. Clearly, any compact bilinear operator T is continuous; for further connections between this and other notions of compactness, see again [6]. An immediate consequence of Theorems D and E is the following compactness result for commutators of bilinear pseudodifferential operators.

Corollary F

Let \(\sigma\in BS_{1, \delta}^{0}\), 0≤δ<1, and \(b,b_{1},b_{2}\in\mathit{CMO}\). Then, \([T_{\sigma},b]_{i}\), i=1,2, and \([[T_{\sigma}, b_{1}]_{_{1}}, b_{2}]_{2}\) are bilinear compact operators from \(L^{p}\times L^{q}\) into \(L^{r}\) for 1/p+1/q=1/r, 1<p,q<∞ and 1≤r<∞.

Varying the parameters ρ,δ and m in the definition of the bilinear Hörmander classes \(BS_{\rho, \delta}^{m}\) is a way of escaping the realm of bilinear Calderón-Zygmund theory. In this context, it is useful to recall the following statement from [2].

Theorem G

Let 0≤δ≤ρ≤1, δ<1, 1≤p,q≤∞, 0<r<∞ be such that 1/p+1/q=1/r,

$$m<m(p, q):=n(\rho-1)\left(\max\Big(\frac{1}{2},\,\frac{1}{p},\, \frac {1}{q},\, 1-\frac{1}{r}\Big)+\max\Big(\frac{1}{r}-1,0\Big)\right), $$

and \(\sigma\in BS^{m}_{\rho,\delta}(\mathbb {R}^{n})\). Then, \(T_{\sigma}\) extends to a bounded operator from \(L^{p}\times L^{q}\) into \(L^{r}\).

See also Miyachi and Tomita [29] for the optimality of the order m and the extension of the result in [2] below r=1.
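As a quick sanity check on the threshold in Theorem G (a worked specialization that we add for illustration, not part of the statement in [2]): the factor n(ρ−1) vanishes at ρ=1, so for every admissible pair (p,q) we get

$$ m(p,q)\Big|_{\rho=1}=n\cdot 0\cdot\left(\max\Big(\frac{1}{2},\,\frac{1}{p},\,\frac{1}{q},\,1-\frac{1}{r}\Big)+\max\Big(\frac{1}{r}-1,0\Big)\right)=0, $$

so that Theorem G applies to \(BS^{m}_{1,\delta}\) only when m<0. At the other extreme, taking ρ=δ=0 and p=q=2, r=1 gives \(m(2,2)=-n\big(\tfrac{1}{2}+0\big)=-\tfrac{n}{2}\).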

Clearly, the class \(BS_{1, 0}^{1}\) falls outside the scope of Theorem G; since ρ=1, the only way to make the class \(BS_{1, \delta}^{m}\), 0≤δ<1, produce bounded operators through that theorem is to require the order m<0. However, guided by the experience we gained in the linear case, it is natural to hope that the phenomenon of smoothing of bilinear commutators manifests itself again in the bilinear context of pseudodifferential operators. This is confirmed by our main results, Theorems 1 and 2, which we now state.

Theorem 1

Let \(T_{\sigma}\in\mathcal{O}p \, BS^{1}_{1, 0}\) and a be a Lipschitz function such that \(\nabla a\in L^{\infty}\). Then, \([T_{\sigma},a]_{i}\), i=1,2, are bilinear Calderón-Zygmund operators. In particular, \([T_{\sigma},a]_{i}\), i=1,2, are bounded from \(L^{p}\times L^{q}\) into \(L^{r}\) for 1/p+1/q=1/r, 1<p,q<∞ and 1≤r<∞.

Once we prove that the commutators \([T_{\sigma},a]_{i}\), i=1,2, are bilinear Calderón-Zygmund operators, the end-point boundedness results directly follow from Theorem B. Theorem 1 also admits a natural converse, see the remark at the end of this paper, which makes Theorem 1 the natural bilinear extension of Theorem A.

Combining Theorem 1 with Theorem E, we immediately obtain the following compactness result for the iteration of commutators.

Theorem 2

Let \(T_{\sigma}\in\mathcal{O}p \, BS^{1}_{1, 0}\), a be a Lipschitz function such that \(\nabla a\in L^{\infty}\), and \(b,b_{1},b_{2}\in\mathit{CMO}\). Then, \([[T_{\sigma},a]_{i},b]_{j}\), i,j=1,2, and \([[[T_{\sigma},a]_{i},b_{1}]_{1},b_{2}]_{2}\), i=1,2, are bilinear compact operators from \(L^{p}\times L^{q}\) into \(L^{r}\) for 1/p+1/q=1/r, 1<p,q<∞ and 1≤r<∞.

Remark 1

We briefly point out how our main results are related to the Kato-Ponce estimates (1.1), (1.2). First of all, the bilinear classes of symbols \(BS_{1, 0}^{m}\) arise naturally in the study of bilinear partial differential operators with variable coefficients. More precisely, let

$$D_{k,l}(f, g)=\sum_{|\beta|\leq k}\sum_{|\gamma|\leq l}c_{\beta \gamma }(x)\frac{\partial^\beta f}{\partial x^\beta}\frac{\partial^\gamma g}{\partial x^\gamma}, $$

where the coefficients \(c_{\beta\gamma}\) have bounded derivatives. Then \(D_{k, l}=T_{\sigma_{k,l}}\), the bilinear symbol being given by

$$\sigma_{k,l} (x, \xi, \eta)=(2\pi)^{-2n}\sum_{\beta, \gamma }c_{\beta \gamma}(x)(i\xi)^\beta(i\eta)^\gamma\in BS_{1, 0}^{k+l}. $$

The symbols \(\sigma_{k,l}\) are almost equivalent to multipliers of the form

$$\sigma_m(\xi, \eta)=(1+|\xi|^2+|\eta|^2)^{m/2}. $$

We can think of \(\sigma_{m}\) as the bilinear counterpart of the multiplier \((1+|\xi|^{2})^{m/2}\) that defines the linear operator \(J^{m}\); indeed, it is easy to check that this symbol belongs to \(BS_{1, 0}^{m}\). In general, one can also check that symbols of the form

$$\sigma_{k+l}(\xi, \eta)=\xi^k\eta^l\sigma_{-1}(\xi, \eta) $$

will also belong to \(BS_{1, 0}^{k+l}\).
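To indicate why \(\sigma_{m}\) belongs to \(BS_{1, 0}^{m}\), we sketch the first-order case only; higher derivatives follow the same pattern by induction, which we do not carry out here. Since \(1+|\xi|^{2}+|\eta|^{2}\) is comparable to \((1+|\xi|+|\eta|)^{2}\), we have

$$\begin{aligned} |\sigma_m(\xi,\eta)| &\sim \big(1+|\xi|+|\eta|\big)^{m},\\ |\partial_{\xi_j}\sigma_m(\xi,\eta)| &= |m|\,|\xi_j|\,\big(1+|\xi|^2+|\eta|^2\big)^{\frac{m}{2}-1} \lesssim \big(1+|\xi|+|\eta|\big)^{m-1}, \end{aligned}$$

and the same bound holds for \(\partial_{\eta_j}\sigma_m\), in agreement with (1.8) for ρ=1, δ=0 and |β|+|γ|=1.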

Letting now k+l=m=1 in any of the examples above, we have the corresponding bilinear pseudodifferential symbols σ belonging to the class \(BS_{1, 0}^{1}\). For such symbols σ, a Lipschitz, and (p,q,r) a Hölder triple, Theorem 1 then yields, in particular, the following bilinear commutator estimate:

$$\|T_\sigma(af,g)-aT_\sigma(f,g)\|_{L^r}\lesssim\|f\|_{L^p}\|g\|_{L^q}. $$

Moreover, an additional commutation with a function \(b\in\mathit{CMO}\) produces compact bilinear operators, which are of course bounded; hence producing, for example, bilinear estimates of the form

$$\|T_\sigma(abf, g)-aT_\sigma(bf, g)-bT_\sigma(af, g)+abT_\sigma(f, g)\| _{L^r}\lesssim\|f\|_{L^p}\|g\|_{L^q}. $$

In the previous two bilinear commutator estimates, the constants depend on the natural parameters of the pseudodifferential operator, such as the dimension n and the implicit growth constants that appear in the definition (1.8) of \(\sigma\in BS_{1, 0}^{1}\), but, more importantly, also on \(\|\nabla a\|_{L^{\infty}}\) and \(\|b\|_{\mathit{BMO}}\).
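The four-term expression in the second estimate is simply the iterated commutator \([[T_{\sigma},a]_{1},b]_{1}\) written out. As a short check, using only the definition of \([\cdot,\cdot]_{1}\):

$$\begin{aligned} {}[[T_\sigma, a]_1, b]_1(f,g) &= [T_\sigma, a]_1(bf, g) - b\,[T_\sigma, a]_1(f,g)\\ &= T_\sigma(abf, g) - aT_\sigma(bf, g) - bT_\sigma(af, g) + abT_\sigma(f,g). \end{aligned}$$

The right-hand side is symmetric in a and b, so \([[T_{\sigma},b]_{1},a]_{1}\) yields the same operator.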

The remainder of our paper is devoted to the proof of Theorem 1. While the argument we present is influenced by Coifman and Meyer’s exposition of the linear case, see [28, Theorem 4, Chap. 9], there are several technical obstacles in the bilinear setting that must be overcome.

2 Proof of Theorem 1

The proof can be summarized in the following statement: the kernels of the commutators are indeed bilinear Calderón-Zygmund and the commutators verify the conditions (i) and (ii) in the T(1) theorem (Theorem C) from the bilinear Calderón-Zygmund theory.

We divide the proof of Theorem 1 into several subsections. In Sect. 2.1, we show that the kernels of the commutators \([T_{\sigma},a]_{i}\), i=1,2, are Calderón-Zygmund. Sections 2.2–2.4 are devoted to proving that the commutators satisfy the cancelation condition (ii) in Theorem C. Finally, in Sect. 2.5, we prove that the commutators verify the bilinear weak boundedness property.

In the following, a denotes a Lipschitz function such that \(\nabla a\in L^{\infty}\) and \(T=T_{\sigma}\) is the bilinear pseudodifferential operator associated to a symbol \(\sigma \in BS_{1, 0}^{1}\), that is, σ satisfies

$$ |\partial ^\alpha _x \partial ^\beta _\xi \partial _\eta^\gamma \sigma (x, \xi, \eta) | \lesssim \big(1 + |\xi | + |\eta|\big)^{1 - |\beta |- |\gamma |}, $$
(2.1)

for all \(x, \xi, \eta\in \mathbb {R}^{n}\) and all multi-indices α,β,γ.

2.1 Bilinear Calderón-Zygmund kernels

Let \(K_{j}\) be the kernel of \([T,a]_{j}\), j=1,2. Then, we have

$$\begin{aligned} K_1(x, y, z) & = \big(a(y) - a(x)\big) K(x, y, z),\\ K_2(x, y, z) & = \big(a(z) - a(x)\big) K(x, y, z), \end{aligned}$$

where K is the kernel of T. Note that K can be written (up to a multiplicative constant) as

$$ K(x, y, z) = \iint e^{i\xi\cdot(x - y)} e^{i\eta\cdot(x - z)} \sigma (x, \xi, \eta) d \xi d \eta. $$
(2.2)

There are certain decay estimates on \(\partial ^{\alpha }_{x} \partial ^{\beta }_{y} \partial ^{\gamma }_{z} K(x, y, z)\), when x≠y or x≠z.

Lemma 3

The kernel K satisfies

$$|\partial _x^\alpha \partial _y^\beta \partial _z^\gamma K (x, y, z)| \leq C(\alpha , \beta , \gamma ) \big(|x - y|+ |x-z|\big)^{-2n-1 -|\alpha |-|\beta | - |\gamma |} $$

when x≠y or x≠z.

Assuming Lemma 3, we can show the desired result about the kernels \(K_{1}\) and \(K_{2}\).

Lemma 4

\(K_{1}\) and \(K_{2}\) are bilinear Calderón-Zygmund kernels.

Proof

By Lemma 3 and noting that |x−y|+|x−z|+|y−z|∼|x−y|+|x−z|, we have

$$\begin{aligned} |K_1(x, y, z) |, |K_2(x, y, z) | & \lesssim \|\nabla a\|_{L^\infty} \big(|x-y| + |x- z|+ |y -z|\big)^{-2n}, \\ |\nabla K_1(x, y, z) |, |\nabla K_2(x, y, z) | & \lesssim \|\nabla a\|_{L^\infty} \big(|x-y| + |x-z| + |y -z|\big)^{-2n - 1}, \end{aligned}$$

on \(\mathbb{R}^{3n}\setminus\varOmega\), where \(\varOmega= \{(x,y,z) \in \mathbb {R}^{3n}: x=y=z \}\). □
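For the reader's convenience, we sketch the computation behind these bounds for \(K_{1}\); the case of \(K_{2}\) is identical with z in place of y. It uses only the Lipschitz bound \(|a(y)-a(x)|\leq\|\nabla a\|_{L^{\infty}}|x-y|\) and Lemma 3 with |α|+|β|+|γ|≤1. Writing d=|x−y|+|x−z| (a shorthand introduced only for this sketch) and taking ∇ in all of the variables (x,y,z), we have

$$\begin{aligned} |K_1(x,y,z)| &\leq \|\nabla a\|_{L^\infty}\,|x-y|\,|K(x,y,z)| \lesssim \|\nabla a\|_{L^\infty}\, d\cdot d^{-2n-1} = \|\nabla a\|_{L^\infty}\, d^{-2n},\\ |\nabla K_1(x,y,z)| &\lesssim \|\nabla a\|_{L^\infty}\,|K(x,y,z)| + \|\nabla a\|_{L^\infty}\,|x-y|\,|\nabla K(x,y,z)|\\ &\lesssim \|\nabla a\|_{L^\infty}\big(d^{-2n-1} + d\cdot d^{-2n-2}\big) \lesssim \|\nabla a\|_{L^\infty}\, d^{-2n-1}. \end{aligned}$$

Since d is comparable to |x−y|+|x−z|+|y−z|, these are exactly the size and gradient conditions (1.4) and (1.6).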

The remainder of this subsection is devoted to the proof of Lemma 3.

Proof of Lemma 3

Let ψ be a smooth cutoff function supported on \(\{\xi\in \mathbb {R}^{n}: |\xi | \leq2\}\) such that ψ(ξ)=1 for |ξ|≤1. For \(N \in\mathbb{N}\), let \(\psi_{N}(\xi, \eta) = \psi(\frac{\xi }{N})\psi (\frac{\eta}{N})\). Note that

$$ |\partial ^\beta _\xi \partial _\eta^\gamma \psi_N(\xi, \eta)| = \begin{cases} O( N^{-|\beta |-|\gamma |}) = O \big((|\xi|+|\eta|)^{-|\beta | - |\gamma |}\big), & N \leq|\xi|, |\eta| \leq2N,\\ 0, & \text{otherwise}. \end{cases} $$
(2.3)

for (β,γ)≠(0,0). Moreover, for β≠0, we have

$$ |\partial ^\beta _\xi\psi_N(\xi, \eta) |= \begin{cases} O( N^{-|\beta |}) = O \big((|\xi|+|\eta|)^{-|\beta |}\big), & N \leq |\xi| \leq2N,\\ 0, & \text{otherwise}, \end{cases} $$
(2.4)

since \(\psi_{N}\) is non-trivial only if |η|≤2N. A similar estimate holds for \(|\partial _{\eta}^{\gamma }\psi_{N} (\xi, \eta)|\), γ≠0. Hence, we have

$$ \sigma _N (x, \xi, \eta) := \sigma (x, \xi, \eta) \psi_N (\xi, \eta) \in B S^1_{1, 0} $$
(2.5)

and, moreover, we have \(|\partial ^{\alpha }_{x} \partial ^{\beta }_{\xi} \partial _{\eta}^{\gamma }\sigma _{N}(x, \xi, \eta) | \lesssim (1 + |\xi | + |\eta|)^{1 - |\beta | - |\gamma |}\), where the implicit constant is independent of N. Now, let

$$K_N(x, y, z) = \iint e^{i\xi\cdot(x - y)} e^{i\eta\cdot(x - z)} \sigma _N (x, \xi, \eta) d \xi d \eta. $$

In the following, we show that

$$ | \partial _x^\alpha \partial _y^\beta \partial _z^\gamma K_N(x, y, z)| \leq C(\alpha , \beta , \gamma ) \big(|x - y|+ |x-z|\big)^{-2n -1 -|\alpha |-|\beta | - |\gamma |} $$
(2.6)

uniformly in N. Since \(\sigma_{N}(x,\xi,\eta)\) converges pointwise to σ(x,ξ,η), it follows that \(K_{N}\) converges to K in the sense of distributions. This in turn shows that the estimates in (2.6) hold for K(x,y,z) as well, yielding our lemma. The remainder of the proof is therefore concerned with (2.6).

First, we consider the case α=β=γ=0, that is, we estimate \(K_{N}(x,y,z)\). Without loss of generality, let us assume that |x−y|≥|x−z|; in particular, we have |x−y|∼|x−y|+|x−z|.

Case (i): |x−y|≥1.

Note that \(e^{i\xi\cdot(x - y) } = - \displaystyle\frac{1}{|x-y|^2} \Delta _\xi e^{i\xi\cdot(x - y) }\). Let \(m\in\mathbb{N}\) be such that 2m−1>2n. Then, integrating by parts, we have

$$\begin{aligned} | K_N(x, y, z) | & = \frac{1}{|x-y|^{2m}} \bigg|\iint e^{i\xi\cdot(x - y)} e^{i\eta\cdot(x - z)} \Delta _\xi^m \sigma _N(x, \xi, \eta) d \xi d \eta\bigg| \\ & \lesssim \frac{1}{|x-y|^{2m}} \iint \frac{1}{(1+|\xi|+|\eta|)^{2m-1} } d \xi d \eta \\ & \leq \frac{1}{|x-y|^{2m}} \int\frac{1}{(1+|\xi|)^{m - \frac{1}{2}} } d \xi \int \frac{1}{(1+|\eta|)^{m-\frac{1}{2}} } d \eta \\ & \lesssim |x - y|^{-2m} \leq|x - y|^{-2n - 1}. \end{aligned}$$

Hence, (2.6) holds in this case.
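Two elementary facts are used in the display above, which we record here as a sketch:

$$ \Delta_\xi e^{i\xi\cdot(x-y)} = -|x-y|^{2}\, e^{i\xi\cdot(x-y)}, \qquad \big(1+|\xi|+|\eta|\big)^{2m-1} \geq \big(1+|\xi|\big)^{m-\frac{1}{2}}\big(1+|\eta|\big)^{m-\frac{1}{2}}. $$

The first identity is applied m times, and no boundary terms arise because \(\sigma_{N}\) has compact support in (ξ,η). The second holds since 1+|ξ|+|η| dominates both 1+|ξ| and 1+|η|; each of the resulting integrals over \(\mathbb{R}^{n}\) converges exactly when m−1/2>n, that is, when 2m−1>2n.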

Case (ii): |x−y|<1.

Fix x,y with x≠y and let r=|x−y|∼|x−y|+|x−z|. Then, write x−y as

$$x - y = r u $$

for some unit vector u. With the smooth cutoff function ψ supported on \(\{\xi\in \mathbb {R}^{n}: |\xi |\leq2\}\) as above, define \(\widetilde{\psi}= 1 - \psi\). Then, by a change of variables, we have

$$\begin{aligned} K_N(x, y, z) & = \frac{1}{r^{2n}} \iint e^{i\xi\cdot u} e^{i r^{-1}\eta\cdot(x - z)} \sigma _N(x, r^{-1} \xi, r^{-1}\eta) d \xi d\eta \\ & = \frac{1}{r^{2n}} \iint e^{i\xi\cdot u} e^{i r^{-1}\eta\cdot(x - z)} \sigma _N(x, r^{-1} \xi, r^{-1}\eta) \psi(\eta)d \xi d\eta \\ & \quad{} + \frac{1}{r^{2n}} \iint e^{i\xi\cdot u} e^{i r^{-1}\eta\cdot(x - z)} \sigma _N(x,r^{-1} \xi, r^{-1}\eta) \widetilde{\psi}(\eta)d \xi d\eta \\ & = : K^0_N(x, y, z) + K^1_N(x, y, z). \end{aligned}$$
(2.7)

Then, by inserting another cutoff in ξ, we write \(K^{0}_{N}\) as

$$\begin{aligned} K_N^0(x, y, z) & = \frac{1}{r^{2n}} \iint e^{i\xi\cdot u} e^{i r^{-1}\eta\cdot(x - z)} \sigma _N(x, r^{-1} \xi, r^{-1}\eta) \psi(\xi) \psi(\eta)d \xi d\eta \\ & \quad{} + \frac{1}{r^{2n}} \iint e^{i\xi\cdot u} e^{i r^{-1}\eta\cdot(x - z)} \sigma _N(x, r^{-1} \xi, r^{-1}\eta) \widetilde{\psi}(\xi) \psi(\eta)d \xi d\eta \\ & = : K^2_N(x, y, z) + K^3_N(x, y, z) . \end{aligned}$$
(2.8)

We begin by estimating \(K^{2}_{N}\). Since \(|\sigma_{N}(x,r^{-1}\xi,r^{-1}\eta)|\lesssim r^{-1}\) on {|ξ|,|η|≤2}, we have

$$\begin{aligned} | K^2_N(x, y, z) | \lesssim r^{-2n -1} \sim\big( |x-y|+ |x - z|\big)^{-2n-1}. \end{aligned}$$
(2.9)

Note now that

$$\begin{aligned} |\partial _\xi^\beta \partial _\eta^\gamma \sigma _N(x, r^{-1} \xi, r^{-1} \eta) | & = r^{-|\beta |-|\gamma |}|\partial _2^\beta \partial _3^\gamma \sigma _N(x, r^{-1} \xi, r^{-1}\eta) | \\ & \lesssim r^{-1} (r + |\xi| + |\eta|)^{1-|\beta |-|\gamma |} \\ & \lesssim r^{-1} (1 + |\xi| + |\eta|)^{1-|\beta |-|\gamma |}, \end{aligned}$$
(2.10)

where the last inequality holds if |ξ|≥1 or |η|≥1. Then, proceeding as before with integration by parts and using (2.10), we have

$$\begin{aligned} |K_N^1(x, y, z) | & = \frac{1}{r^{2n}} \bigg|\iint e^{i\xi\cdot u} e^{i r^{-1} \eta\cdot(x - z)} \Delta _\xi^m \sigma _N(x, r^{-1} \xi, r^{-1}\eta) \widetilde{\psi}(\eta) d \xi d \eta \bigg| \\ & \lesssim r^{-2n-1} \iint \frac{1}{(1+|\xi|+|\eta|)^{2m-1} } d \xi d \eta \\ & \lesssim r^{-2n-1}, \end{aligned}$$
(2.11)

as long as 2m−1>2n. Similarly, integrating by parts with (2.10) and noting that, for β≠0, we have \(\partial _{\xi}^{\beta} \widetilde{\psi}(\xi) = 0\) unless |ξ|∈[1,2], we have

$$\begin{aligned} |K_N^3(x, y, z) | & = \frac{1}{r^{2n}} \bigg| \iint e^{i\xi\cdot u} e^{i r^{-1}\eta\cdot(x - z)} \Delta _\xi^m \big( \sigma _N(x, r^{-1} \xi, r^{-1}\eta) \widetilde{\psi}(\xi )\big) \psi (\eta)d \xi d\eta\bigg| \\ & \lesssim r^{-2n-1} + \frac{1}{r^{2n}} \bigg|\operatorname*{\iint}_{|\xi|\geq1, |\eta|\leq2} e^{i\xi\cdot u} \\&\quad{}\times e^{i r^{-1}\eta\cdot(x - z)} \Delta _\xi^m \big( \sigma _N(x, r^{-1} \xi, r^{-1}\eta) \big) \widetilde{\psi}(\xi) \psi(\eta) d \xi d\eta\bigg| \\ & \lesssim r^{-2n-1}, \end{aligned}$$
(2.12)

as long as 2m−1>n in this case. Finally, combining the estimates (2.9), (2.11), and (2.12) yields (2.6).
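For completeness, the two inequalities in (2.10) can be spelled out as follows; this is only a sketch, with M=|β|+|γ| a shorthand introduced here. The first uses the uniform \(BS^{1}_{1,0}\) bounds on \(\sigma_{N}\) together with a rescaling, and the second uses 0<r<1:

$$\begin{aligned} r^{-M}\big(1+r^{-1}|\xi|+r^{-1}|\eta|\big)^{1-M} &= r^{-M}\, r^{-(1-M)}\big(r+|\xi|+|\eta|\big)^{1-M} = r^{-1}\big(r+|\xi|+|\eta|\big)^{1-M},\\ \tfrac{1}{2}\big(1+|\xi|+|\eta|\big) &\leq r+|\xi|+|\eta| \leq 1+|\xi|+|\eta| \quad \text{if } |\xi|+|\eta|\geq1 \text{ and } 0<r<1. \end{aligned}$$

Because the two quantities in the second line are comparable, the replacement of r by 1 is harmless whatever the sign of the exponent 1−M.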

Next, we consider the case (α,β,γ)≠(0,0,0). Note that \(\xi^{\widetilde{\beta }} \eta^{\widetilde{\gamma }}\partial _{x}^{\theta} \sigma _{N} \in BS^{1+|\widetilde{\beta }| + |\widetilde{\gamma }|}_{1, 0}\) where the implicit constant on the bounds of the derivatives of \(\xi^{\widetilde{\beta }} \eta^{\widetilde{\gamma }} \partial _{x}^{\theta} \sigma _{N} \) is independent of N and θ. Then, we have

$$\partial _x^\alpha \partial _y^\beta \partial _z^\gamma K_N(x, y, z) = \iint e^{i\xi\cdot(x - y)} e^{i \eta\cdot(x - z)} \widetilde{\sigma }_N(x, \xi, \eta) d \xi d \eta, $$

for some \(\widetilde{\sigma }_{N} \in BS^{1+ |\alpha |+|\beta |+|\gamma |}_{1, 0}\).

When |x−y|≥1, we can repeat the computation in Case (i) and obtain (2.6) by choosing 2m−1−|α|−|β|−|γ|>2n. Now, assume |x−y|<1. For \(K^{2}_{N}\), it suffices to note that \(|\widetilde{\sigma }_{N}(x, r^{-1} \xi, r^{-1} \eta) | \lesssim r^{-1 - |\alpha | - |\beta | - |\gamma |}\) on {|ξ|,|η|≤2}. For \(K^{1}_{N}\) and \(K^{3}_{N}\), we note that

$$\begin{aligned} |\partial _\xi^{\widetilde{\beta }} \partial _\eta^{\widetilde{\gamma }} \widetilde{\sigma }_N(x, r^{-1} \xi, r^{-1} \eta) | & = r^{-|\widetilde{\beta }|-|\widetilde{\gamma }|}|\partial _2^{\widetilde{\beta }} \partial _3^{\widetilde{\gamma }} \widetilde{\sigma }_N(x, r^{-1} \xi, r^{-1}\eta) | \\ & \lesssim r^{-1-|\alpha | - |\beta |-|\gamma |} (r + |\xi| + |\eta|)^{1+ |\alpha |+|\beta |+|\gamma |-|\widetilde{\beta }|-|\widetilde{\gamma }|} \\ & \lesssim r^{-1-|\alpha | - |\beta |-|\gamma |} (1 + |\xi| + |\eta|)^{1+ |\alpha |+|\beta |+|\gamma |-|\widetilde{\beta }|-|\widetilde{\gamma }|}, \end{aligned}$$

where the last inequality holds if |ξ|≥1 or |η|≥1. The rest follows as in Case (ii). □

2.2 A Representation of the Class \(BS_{1, 0}^{1}\) via \(BS_{1, 0}^{0}\)

Without loss of generality, we will assume that σ(x,0,0)=0. This is possible because even if we replace σ by \(\sigma_{0}\), where \(\sigma_{0}(x,\xi,\eta)=\sigma(x,\xi,\eta)-\sigma(x,0,0)\), the commutators are unchanged. Namely, \([T_{\sigma }, a]_{j} = [T_{\sigma _{0}}, a]_{j}\) for j=1,2. Note that \(\sigma_{0}(x,0,0)=0\) and \(\sigma _{0}\in BS_{1, 0}^{1}\). We can further assume that σ has compact support; this justifies the manipulations in the following. A standard limiting argument then removes this additional assumption; see, for example, the discussion about loosely convergent sequences of \(BS_{\rho, \delta}^{m}\) symbols in [5], also Stein [32, pp. 232–233].
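To see why subtracting σ(x,0,0) does not affect the commutators, note that a symbol independent of (ξ,η) gives, by (1.7) and Fourier inversion, a pointwise product. In the following sketch, \(c_{n}\) denotes a constant depending only on the normalization of the Fourier transform, which we leave unspecified:

$$\begin{aligned} T_{\sigma(\cdot,0,0)}(f,g)(x) &= c_n\, \sigma(x,0,0)\, f(x)\, g(x),\\ [T_{\sigma(\cdot,0,0)}, a]_1(f,g)(x) &= c_n\, \sigma(x,0,0)\big(a(x)f(x)g(x) - a(x)f(x)g(x)\big) = 0, \end{aligned}$$

and likewise for \([T_{\sigma(\cdot,0,0)}, a]_{2}\). Since the map \(\sigma\mapsto T_{\sigma}\) is linear, the commutators of \(T_{\sigma}\) and of the operator with symbol σ(x,ξ,η)−σ(x,0,0) coincide.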

Lemma 5

The symbol \(\sigma \in BS^{1}_{1, 0}\) has the representation \(\sigma = \sum_{j = 1}^{n} (\xi_{j} \sigma _{j} + \eta_{j} \widetilde {\sigma }_{j})\), where \(\sigma _{j}, \widetilde {\sigma }_{j} \in BS^{0}_{1, 0}\). In particular, if \(T_{j}\) and \(\widetilde{T}_{j}\) are the bilinear pseudodifferential operators corresponding to \(\sigma_{j}\) and \(\widetilde{\sigma }_{j}\), respectively, then we have

$$T(f, g) = \sum_{j = 1}^n \big[T_j (D_j f, g) + \widetilde{T}_j (f, D_jg)\big], $$

where \(T = T_{\sigma }\in\mathcal{O}p BS^{1}_{1, 0}\).

Proof

By the Fundamental Theorem of Calculus with ζ=(ξ,η), we have

$$\begin{aligned} \sigma (x, \xi, \eta) & = \sigma (x, \xi, \eta) - \sigma (x, 0, 0) = \zeta\cdot\int_0^1 \nabla_{ \zeta'} \sigma (x,\zeta' )\Big|_{\zeta ' = t \zeta} dt\\ & = \sum_{j = 1}^n \big[\xi_j \sigma _j(x, \xi, \eta) + \eta_j \widetilde{\sigma }_j(x, \xi, \eta) \big], \end{aligned}$$

where the symbols σ j and \(\widetilde{\sigma }_{j}\) are given by

$$\begin{aligned} &\sigma _j(x, \xi, \eta) = \int_0^1 \partial _{\xi'_j} \sigma (x, \xi' , t \eta) \Big|_{\xi' = t \xi} dt \quad\text{and} \\ & \widetilde{\sigma }_j(x, \xi, \eta) = \int_0^1 \partial _{\eta'_j} \sigma (x, t\xi, \eta' ) \Big|_{\eta' = t \eta} dt. \end{aligned}$$

It remains to show that \(\sigma _{j}, \widetilde{\sigma }_{j} \in BS^{0}_{1, 0}\). First, note that, for t∈[0,1], we have

$$\begin{aligned} t \big(1+t( |\xi|+ |\eta|) \big)^{-1} \lesssim (1+ |\xi| + |\eta| )^{-1} . \end{aligned}$$
(2.13)
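The inequality (2.13) in fact holds with constant 1. As a one-line check for 0<t≤1 (the case t=0 being trivial), writing A=|ξ|+|η|≥0 as a shorthand used only here:

$$ \frac{t}{1+tA}\leq\frac{1}{1+A} \iff t(1+A)\leq 1+tA \iff t\leq 1. $$

Raising both sides to the power |β|+|γ| gives the pointwise bound on the integrand that is used next.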

By exchanging the differentiation with integration and applying (2.13), we have

$$\begin{aligned} |\partial _x^\alpha \partial _\xi^\beta \partial _\eta^\gamma \sigma _j(x, \xi, \eta)| & = \bigg| \int_0^1 t^{|\beta |+ |\gamma |} \partial _x^\alpha \partial _{\xi'}^\beta \partial _{\eta'}^\gamma \partial _{\xi'_j} \sigma (x, \xi ', \eta' )\Big|_{(\xi', \eta') = t(\xi, \eta)} dt\bigg|\\ & \lesssim \int_0^1 t^{|\beta |+ |\gamma |} \big(1+t (|\xi|+|\eta|)\big)^{-(|\beta |+|\gamma |)} dt\\ & \lesssim (1+|\xi|+ |\eta|)^{-(|\beta |+|\gamma |)}. \end{aligned}$$

Therefore, \(\sigma _{j} \in BS^{0}_{1, 0}\). A similar argument shows that \(\widetilde{\sigma }_{j} \in BS^{0}_{1, 0}\). □
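The passage from the symbol identity to the operator identity in the statement can be sketched as follows. We assume here the usual convention \(T_{\sigma}(f, g)(x) = \iint \sigma(x, \xi, \eta) \widehat{f}(\xi) \widehat{g}(\eta) e^{i x \cdot (\xi + \eta)} \, d\xi \, d\eta\), up to normalizing constants, and that \(D_{j}\) is normalized so that \(\widehat{D_{j} f}(\xi) = \xi_{j} \widehat{f}(\xi)\); under a different normalization only constants change. Then, for each j,

$$\begin{aligned} \iint \xi_j \sigma_j(x, \xi, \eta) \widehat{f}(\xi) \widehat{g}(\eta) e^{i x \cdot (\xi + \eta)} \, d\xi \, d\eta & = \iint \sigma_j(x, \xi, \eta) \widehat{D_j f}(\xi) \widehat{g}(\eta) e^{i x \cdot (\xi + \eta)} \, d\xi \, d\eta \\ & = T_j(D_j f, g)(x), \end{aligned}$$

and the terms \(\eta_{j} \widetilde{\sigma}_{j}\) are handled in the same way with \(\widehat{D_{j} g}(\eta) = \eta_{j} \widehat{g}(\eta)\). Summing over j gives the stated decomposition of T(f,g).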

2.3 Transposes of Bilinear Commutators

Recall that the commutators \([T,a]_{1}\) and \([T,a]_{2}\) are defined as

$$\begin{aligned}{} [T, a]_1(f, g) & = T(af, g) - aT(f, g), \end{aligned}$$
(2.14)
$$\begin{aligned}{} [T, a]_2(f, g) & = T(f, ag) - aT(f, g). \end{aligned}$$
(2.15)

Given a bilinear operator T, the transposes \(T^{*1}\) and \(T^{*2}\) are defined by

$$ \langle T(f, g), h \rangle = \langle T^{*1}(h, g), f \rangle = \langle T^{*2}(f, h), g \rangle , $$

where 〈⋅,⋅〉 denotes the dual pairing.

Lemma 6

We have the following identities:

$$\begin{aligned} \big([T, a]_1\big)^{*1}& = - [T^{*1}, a]_1, \end{aligned}$$
(2.16)
$$\begin{aligned} \big([T, a]_1\big)^{*2}& = [T^{*2}, a]_1-[T^{*2}, a]_2. \end{aligned}$$
(2.17)

Similarly, we have

$$\begin{aligned} \big([T, a]_2\big)^{*1}& = [T^{*1}, a]_2-[T^{*1}, a]_1, \end{aligned}$$
(2.18)
$$\begin{aligned} \big([T, a]_2\big)^{*2}& = -[T^{*2}, a]_2. \end{aligned}$$
(2.19)

Proof

We briefly indicate the calculations that give (2.16) and (2.17). The following sequence of equalities yields (2.16):

$$\begin{aligned} \langle [T, a]_1(f, g), h \rangle & = \langle T(af, g), h \rangle - \langle aT(f, g), h \rangle = \langle T^{*1}(h, g), af \rangle - \langle T(f, g), a h \rangle \\ & = \langle aT^{*1}(h, g), f \rangle - \langle T^{*1} (ah, g), f \rangle = \langle - [T^{*1}, a]_1(h, g), f \rangle . \end{aligned}$$

We also have

$$\begin{aligned} \langle [T, a]_1(f, g), h \rangle & = \langle T^{*2}(af, h), g \rangle - \langle T^{*2} (f, ah), g \rangle \\ & = \langle T^{*2}(af, h), g \rangle - \langle a T^{*2}(f, h), g \rangle \\&\quad{} - \big( \langle T^{*2} (f, ah), g \rangle - \langle a T^{*2}(f, h), g \rangle \big) \\ & = \langle [T^{*2}, a]_1(f, h), g \rangle - \langle [T^{*2}, a]_2(f, h), g \rangle , \end{aligned}$$

thus proving (2.17). The identities (2.18) and (2.19) follow in a similar manner. □
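For completeness, the omitted computations may be sketched along the same lines, using only (2.14), (2.15) and the definition of the transposes. For (2.19), we have

$$\begin{aligned} \langle [T, a]_2(f, g), h \rangle & = \langle T(f, ag), h \rangle - \langle T(f, g), ah \rangle = \langle T^{*2}(f, h), ag \rangle - \langle T^{*2}(f, ah), g \rangle \\ & = \langle aT^{*2}(f, h), g \rangle - \langle T^{*2}(f, ah), g \rangle = \langle -[T^{*2}, a]_2(f, h), g \rangle . \end{aligned}$$

For (2.18), adding and subtracting \(\langle aT^{*1}(h, g), f \rangle\) gives

$$\begin{aligned} \langle [T, a]_2(f, g), h \rangle & = \langle T^{*1}(h, ag), f \rangle - \langle T^{*1}(ah, g), f \rangle \\ & = \langle T^{*1}(h, ag), f \rangle - \langle aT^{*1}(h, g), f \rangle \\&\quad{} - \big( \langle T^{*1}(ah, g), f \rangle - \langle aT^{*1}(h, g), f \rangle \big) \\ & = \langle [T^{*1}, a]_2(h, g), f \rangle - \langle [T^{*1}, a]_1(h, g), f \rangle . \end{aligned}$$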

2.4 Cancelation Conditions for Bilinear Commutators

We will prove here that the commutators satisfy the BMO bounds in the bilinear T(1) theorem (Theorem C). Given a bilinear operator S with Calderón-Zygmund kernel k, one can define S(f,g) for \(f, g \in C^{\infty}\cap L^{\infty}\) via a standard procedure; in particular, this definition gives the exact meaning of the expression S(1,1). In Lemmas 7 and 8, the role of such S is played by any of the commutators \([T_{\sigma}, a]_{j}\), j=1,2, or their transposes.

Let \(\varphi\in C_{c}^{\infty}(\mathbb {R}^{n})\) be a non-negative function such that φ(x)=1 for |x|<1 and let \(\varphi_{R}(x)=\varphi(R^{-1}x)\). For \(f, g \in C^{\infty}\cap L^{\infty}\), we define (in the sense of distributions)

$$S(f, g):=\lim_{R\to\infty} S(f\varphi_R, g\varphi_R)-\int _{|y|>1}\int _{|z|>1}k(0, y, z)f(y)g(z)\varphi_R(y)\varphi_R(z)\,dydz. $$

Thus, when computing S(f,g) we must restrict to pairing \(S(f\varphi_{R}, g\varphi_{R})\) with functions in \(C_{c}^{\infty}\) that have integral zero; see [18] for further details.

Lemma 7

Let \(T \in\mathcal{O}p BS^{1}_{1, 0}\) and a be a Lipschitz function. Then, we have \([T,a]_{j}(1,1)\in \mathit{BMO}\), j=1,2.

Proof

By Lemma 5, we have

$$\begin{aligned}{} [T, a]_1 (1, 1)& = T(a, 1) - \underbrace{aT(1, 1)}_{= 0} = \sum_{j = 1}^n \big[ T_j (D_j a, 1) + \underbrace{\widetilde{T}_j (a, D_j 1)}_{ =0} \big] \\ & = \sum_{j = 1}^n T_j (D_j a, 1). \end{aligned}$$

It follows from Theorem D that \(T_{j} \in\mathcal{O}p\,BS^{0}_{1, 0}\) are bilinear Calderón-Zygmund operators. Then, by Theorem B, we obtain that \(T_{j}(D_{j}a, 1)\in \mathit{BMO}\), since \(D_{j}a\in L^{\infty}\). Therefore, we conclude that \([T,a]_{1}(1,1)\in \mathit{BMO}\).

Similarly, we have

$$\begin{aligned}{} [T, a]_2(1, 1) & = T(1, a) - \underbrace{aT(1, 1)}_{= 0} = \sum_{j = 1}^n \big[ \underbrace{T_j (D_j 1, a)}_{= 0} + \widetilde{T}_j (1, D_j a) \big] \\ & = \sum_{j = 1}^n \widetilde{T}_j (1, D_j a)\in \mathit{BMO}, \end{aligned}$$

since \(D_{j}a\in L^{\infty}\) and \(\widetilde{T}_{j} \in\mathcal{O}p\,BS^{0}_{1, 0}\). □

Lemma 8

Let T and a be as in Lemma 7. Then, we have \([T, a]_{j}^{*i}(1,1)\in \mathit{BMO}, \, i, j=1,2\).

Proof

From Theorem 2.1 in [1], we know that if \(T \in\mathcal{O}p\, BS^{1}_{1, 0}\), then \(T^{*1}, T^{*2} \in\mathcal{O}p\,BS^{1}_{1, 0}\) as well. By Lemma 6, for i=1,2, the transposes \([T, a]_{1}^{*i}\) and \([T, a]_{2}^{*i}\) consist of commutators of \(T^{*1}\) and \(T^{*2}\) with the Lipschitz function a. The conclusion now follows from Lemma 7. □

2.5 The Weak Boundedness Property for Bilinear Commutators

A function \(\phi\in\mathcal{D}\) is called a normalized bump function of order M if \(\operatorname*{supp}\phi\subset B_{0}(1)\) and \(\|\partial ^{\alpha }\phi\|_{L^{\infty}} \leq1\) for all multi-indices α with |α|≤M. Here, \(B_{x}(r)\) denotes the ball of radius r centered at x.

We say that a bilinear singular integral operator \(T:\mathcal{S} \times\mathcal{S}\to\mathcal{S}'\) has the (bilinear) weak boundedness property if there exists \(M \in\mathbb{N}\cup\{0\}\) such that for all normalized bump functions \(\phi_{1}\), \(\phi_{2}\), and \(\phi_{3}\) of order M, \(x_{1}, x_{2}, x_{3} \in \mathbb {R}^{n}\) and t>0, we have

$$ \big|\langle T(\phi_1^{x_1, t}, \phi_2^{x_2, t}), \phi_3^{x_3, t} \rangle \big| \lesssim t^n, $$
(2.20)

where \(\phi_{j}^{x_{j}, t}(x) = \phi_{j}\big(\frac{x - x_{j}}{t}\big)\). Note that

$$ \| \partial_x^\alpha \phi_j^{x_j, t}\|_{L^p} \lesssim t^{\frac{n}{p} - |\alpha |}. $$
(2.21)
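The estimate (2.21) is a scaling computation, valid for |α|≤M and 1≤p≤∞; a brief sketch is as follows. By the chain rule and the change of variables \(y = (x - x_{j})/t\), we have

$$ \partial_x^\alpha \phi_j^{x_j, t}(x) = t^{-|\alpha|} (\partial^\alpha \phi_j)\big(\tfrac{x - x_j}{t}\big), \qquad \| \partial_x^\alpha \phi_j^{x_j, t}\|_{L^p} = t^{\frac{n}{p} - |\alpha|} \|\partial^\alpha \phi_j\|_{L^p} \leq |B_0(1)|^{\frac{1}{p}} \, t^{\frac{n}{p} - |\alpha|}, $$

where the last step uses \(\operatorname*{supp}\phi_{j}\subset B_{0}(1)\) and \(\|\partial^{\alpha}\phi_{j}\|_{L^{\infty}}\leq 1\). The implicit constant in (2.21) therefore depends only on n and p.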

The following lemma provides a simplification of the condition (2.20).

Lemma 9

Let T be a bilinear operator defined by (1.3) with a bilinear Calderón-Zygmund kernel K, satisfying (1.4). Then, the weak boundedness property holds if there exists \(M \in\mathbb{N}\cup\{0\}\) such that

$$ \big|\langle T(\phi_1^{x_0, t}, \phi_2^{x_0, t}), \phi_3^{x_0, t} \rangle \big| \lesssim t^n, $$
(2.22)

for all normalized bump functions \(\phi_{1}\), \(\phi_{2}\), and \(\phi_{3}\) of order M, \(x_{0} \in \mathbb {R}^{n}\) and t>0.

Proof

Suppose that T satisfies (2.22) for some fixed M. In the following, we fix t>0 and normalized bump functions \(\phi_{1}\), \(\phi_{2}\), and \(\phi_{3}\) of order M.

Case (i): Suppose that \(|x_{1}-x_{3}|, |x_{2}-x_{3}|\leq 3t\).

For j=1,2, we define \(\psi_{j}\) by setting

$$ \psi_j^{x_3, 4t}(x) = \psi_j\big(\tfrac{x - x_3}{4t}\big) := \begin{cases} 4^{-M} \phi_j^{x_j, t}(x), & \text{if } x \in B_{x_j}(t), \\ 0, & \text{otherwise}. \end{cases} $$

Note that \(\psi_{j}\) is a normalized bump function of order M. For j=3, let \(\psi_{3}(x)=4^{-M}\phi_{3}(4x)\), so that \(\psi_{3}^{x_{3}, 4t} = 4^{-M}\phi_{3}^{x_{3}, t}\). Note that \(\psi_{3}\) is also a normalized bump function of order M. Then, by (2.22), we have

$$\begin{aligned} \big|\langle T(\phi_1^{x_1, t}, \phi_2^{x_2, t}), \phi_3^{x_3, t} \rangle \big| = 4^{3M} \big|\langle T( \psi_1^{ x_3, 4t}, \psi_2^{x_3, 4t}), \psi _3^{x_3,4t} \rangle \big| \lesssim 4^{3M + n} t^n \sim t^n. \end{aligned}$$

Case (ii): Suppose that \(\max(|x_{1}-x_{3}|, |x_{2}-x_{3}|)>3t\).

For the sake of the argument, suppose that \(|x_{1}-x_{3}|>3t\). Then, by the triangle inequality, we have \(|x-y|\geq|x_{1}-x_{3}|-|x-x_{3}|-|y-x_{1}|>t\) for all \(x\in B_{x_{3}}(t)\) and \(y\in B_{x_{1}}(t)\). A similar calculation shows that if \(|x_{2}-x_{3}|>3t\), then we have \(|x-z|>t\) for all \(x\in B_{x_{3}}(t)\) and \(z\in B_{x_{2}}(t)\). Hence, we have

$$\max\big(|x-y|, |x-z|\big) > t $$

for all \(x\in B_{x_{3}}(t)\), \(y\in B_{x_{1}}(t)\) and \(z\in B_{x_{2}}(t)\) in this case. Then, by (1.3), (1.4) and (2.21), we have

$$\begin{aligned} \big|\langle T(\phi_1^{x_1, t}, \phi_2^{x_2, t}), \phi_3^{x_3, t} \rangle \big| & \lesssim t^{-2n} \iiint | \phi_1^{x_1, t}(y) \phi_2^{x_2, t}(z) \phi^{x_3, t}_3(x)| dy dz dx\\ & \lesssim t^{-2n} \prod_{j = 1}^3 \| \phi_j^{x_j, t}\|_{L^1} \lesssim t^n. \end{aligned}$$

Hence, (2.20) holds in both cases, thus completing the proof of the lemma. □
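The factor \(t^{-2n}\) in Case (ii) can be traced as follows; this is a sketch under the assumption that (1.4) contains the standard size estimate \(|K(x, y, z)| \lesssim (|x-y| + |x-z|)^{-2n}\) for bilinear Calderón-Zygmund kernels. For x, y, z in the supports of \(\phi_{3}^{x_{3}, t}\), \(\phi_{1}^{x_{1}, t}\), \(\phi_{2}^{x_{2}, t}\), respectively, we have

$$ |K(x, y, z)| \lesssim \big(|x-y| + |x-z|\big)^{-2n} \leq \max\big(|x-y|, |x-z|\big)^{-2n} < t^{-2n}, $$

while (2.21) with α=0 and p=1 gives \(\|\phi_{j}^{x_{j}, t}\|_{L^{1}} \lesssim t^{n}\). Multiplying, we obtain \(t^{-2n}\cdot(t^{n})^{3} = t^{n}\).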

Now, we are ready to prove the weak boundedness property of the commutators.

Lemma 10

Let \(T \in\mathcal{O}p BS^{1}_{1, 0}\) and a be a Lipschitz function. Then, the bilinear commutators \([T,a]_{j}\), j=1,2, satisfy the weak boundedness property.

Proof

We only show that the weak boundedness property holds for \([T,a]_{1}\). A similar argument holds for \([T,a]_{2}\). By Lemma 9, it suffices to prove (2.22). First, note that we can assume that \(a(x_{0})=0\), since replacing a by \(a-a(x_{0})\) does not change the commutator. Then, by the Fundamental Theorem of Calculus, we have

$$\begin{aligned} \|a\|_{L^\infty(B_{x_0}(t))} \lesssim t\|\nabla a\|_{L^\infty}. \end{aligned}$$
(2.23)
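Both the reduction to the case where a vanishes at \(x_{0}\) and the bound (2.23) are elementary; the following sketch spells them out, with c denoting an arbitrary constant (a symbol used only here). By linearity of T in its first argument,

$$ [T, a - c]_1(f, g) = T(af, g) - c\,T(f, g) - aT(f, g) + c\,T(f, g) = [T, a]_1(f, g). $$

Moreover, once a vanishes at \(x_{0}\), the Lipschitz bound gives, for every \(x \in B_{x_{0}}(t)\),

$$ |a(x)| = |a(x) - a(x_0)| \leq \|\nabla a\|_{L^\infty} |x - x_0| \leq t \|\nabla a\|_{L^\infty}, $$

which is (2.23).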

By writing

$$\begin{aligned} &\big|\langle [T, a]_1 (\phi_1^{x_0, t}, \phi_2^{x_0, t}), \phi_3^{x_0, t} \rangle \big| \\ &\quad \leq \big|\langle T(a\phi_1^{x_0, t}, \phi_2^{x_0, t}), \phi_3^{x_0, t} \rangle \big| + \big|\langle aT(\phi_1^{x_0, t}, \phi_2^{x_0, t}), \phi_3^{x_0, t} \rangle \big| =: \mathrm {I}+ \mathrm {II} , \end{aligned}$$

it suffices to estimate I and II separately.

First, we estimate II. By (2.21), (2.23) and Lemma 5, we have

$$\begin{aligned} \mathrm {II} & \leq\|aT(\phi_1^{x_0, t}, \phi_2^{x_0, t}) \|_{L^2(B_{x_0}(t))} \| \phi_3^{x_0, t}\|_{L^2} \\ & \lesssim t^{\frac{n}{2}} \|a\|_{L^\infty(B_{x_0}(t))} \| T(\phi_1^{x_0, t}, \phi_2^{x_0, t}) \|_{L^2(B_{x_0}(t))} \\ & \lesssim t^{\frac{n}{2} + 1} \|\nabla a\|_{L^\infty} \bigg\| \sum_{j = 1}^n \big[ T_j (D_j \phi_1^{x_0, t}, \phi_2^{x_0, t}) + \widetilde {T}_j ( \phi_1^{x_0, t},D_j \phi_2^{x_0, t})\big] \bigg\| _{L^2}. \end{aligned}$$

By the fact that \(T_{j}, \widetilde{T}_{j} \in\mathcal{O}p\, BS^{0}_{1, 0}\) and (2.21), we have

$$\begin{aligned} \mathrm {II} & \lesssim t^{\frac{n}{2} + 1} \|\nabla a\|_{L^\infty} \sum_{j = 1}^n \Big[ \| D_j \phi_1^{x_0, t}\|_{L^4}\| \phi_2^{x_0, t}\|_{L^4} + \| \phi_1^{x_0, t}\|_{L^4}\|D_j \phi_2^{x_0, t}\|_{L^4} \Big] \\ & \lesssim t^{n}\|\nabla a\|_{L^\infty}. \end{aligned}$$
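The powers of t in the last step can be checked directly from (2.21) with p=4; as a quick sketch,

$$ \| D_j \phi_k^{x_0, t}\|_{L^4} \lesssim t^{\frac{n}{4} - 1}, \qquad \| \phi_k^{x_0, t}\|_{L^4} \lesssim t^{\frac{n}{4}}, \qquad k = 1, 2, $$

so that each summand contributes

$$ t^{\frac{n}{2} + 1} \cdot t^{\frac{n}{4} - 1} \cdot t^{\frac{n}{4}} = t^{n}. $$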

Next, we estimate I. As before, by Lemma 5, (2.21) and (2.23), we have

$$\begin{aligned} \mathrm {I}& \lesssim t^{\frac{n}{2}} \bigg\| \sum_{j = 1}^n \big[ T_j (D_j (a \phi_1^{x_0, t}), \phi_2^{x_0, t}) + \widetilde {T}_j ( a \phi_1^{x_0, t},D_j \phi_2^{x_0, t})\big] \bigg\| _{L^2} \\ & \lesssim t^{\frac{n}{2}} \sum_{j = 1}^n \Big[ \|D_j (a \phi_1^{x_0, t})\|_{L^4}\| \phi_2^{x_0, t}\|_{L^4} + \| a \phi_1^{x_0, t}\|_{L^4} \|D_j \phi_2^{x_0, t}\|_{L^4} \Big] \\ & \lesssim t^{\frac{n}{2}} \sum_{j = 1}^n \Big[ \|D_j (a) \phi_1^{x_0, t}\|_{L^4}\| \phi_2^{x_0, t}\|_{L^4} + \| a D_j\phi_1^{x_0, t}\|_{L^4}\| \phi_2^{x_0, t}\|_{L^4} \\ & \quad{} + \| a \phi_1^{x_0, t}\|_{L^4} \|D_j \phi_2^{x_0, t}\|_{L^4} \Big] \\ & \lesssim t^{\frac{n}{2}} \sum_{j = 1}^n \Big[ t ^{\frac{n}{2}} \|\nabla a\|_{L^\infty} + t^{\frac{n}{2}-1} \|a\|_{L^\infty(B_{x_0}(t))} \Big] \lesssim t^n \|\nabla a\|_{L^\infty}. \end{aligned}$$

This completes the proof of Lemma 10 and thus the proof of Theorem 1. □

Remark 2

We wish to end this work by observing that the converse of Theorem 1 also holds. Let \(T_{j} \in\mathcal{O}pBS^{1}_{1, 0}\), j=1,…,n, be defined by \(T_{j}(f,g)=(D_{j}f)g\). Suppose that \([T_{j},a]_{1}\) is bounded from \(L^{4}\times L^{4}\) into \(L^{2}\), j=1,…,n. Then, a is a Lipschitz function. See Theorem A for the converse statement in the linear setting.

The proof is immediate. Noting that \([T_{j},a]_{1}(f,g)=(D_{j}a)fg\), the boundedness of \([T_{j},a]_{1}\) then forces \(D_{j}a\in L^{\infty}\) (say, by taking f=g to be a bump function localized near the maximum of \(D_{j}a\)). Since this is true for all 1≤j≤n, a must be Lipschitz.
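The commutator identity used in this argument is the Leibniz rule in disguise; as a sketch, assuming only that \(D_{j}\) is a constant multiple of \(\partial_{x_{j}}\) (so that the product rule applies), we have

$$ [T_j, a]_1(f, g) = T_j(af, g) - aT_j(f, g) = D_j(af)\, g - a (D_j f) g = (D_j a) f g. $$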

In particular, if we assume that \([T,a]_{1}\) is bounded from \(L^{4}\times L^{4}\) into \(L^{2}\) for all \(T\in\mathcal{O}p\, BS^{1}_{1, 0}\), then a must be a Lipschitz function. Of course, the boundedness \([T,a]_{1}: L^{4}\times L^{4}\to L^{2}\) can be exchanged with a more general one \(L^{p}\times L^{q}\to L^{r}\) for some Hölder triple \((p,q,r)\in[1,\infty)^{3}\). An analogous statement applies to the second commutator \([T,a]_{2}\).