1 Introduction

A family \(\{\psi _{\gamma }\}_{\gamma \in \Gamma } \subset L^{2}(\mathbf{R}^{d})\) is called almost-orthogonal if there is a finite R so that, for all finite subsets \(\mathcal{F}\subset \Gamma \) and all linear sums \(\sum _{\gamma \in \mathcal{F}}\lambda _{\gamma }\psi _{\gamma }\),

$$\displaystyle{ \left \Vert \sum _{\gamma \in \mathcal{F}}\lambda _{\gamma }\psi _{\gamma }\right \Vert _{2} \leq R\left (\sum _{\mathcal{F}}\vert \lambda _{\gamma }\vert ^{2}\right )^{1/2}. }$$
(1)

If R is the least such constant for which (1) holds, we say the family is almost-orthogonal with constant R.

“Almost-orthogonal” is a mild misnomer: “almost-orthonormal” may be more accurate. We recall that a family \(\{\psi _{\gamma }\}_{\gamma \in \Gamma }\) is orthonormal if, for all γ and γ′ in \(\Gamma \),

$$\displaystyle{\langle \psi _{\gamma },\psi _{\gamma '}\rangle \equiv \int _{\mathbf{R}^{d}}\psi _{\gamma }(x)\,\overline{\psi _{\gamma '}(x)}\,dx = \left \{\begin{array}{@{}l@{\quad }l@{}} 1\quad &\text{if}\ \gamma =\gamma ';\\ 0\quad &\text{otherwise.}\\ \quad \end{array} \right.}$$

The family \(\{\psi _{\gamma }\}_{\gamma \in \Gamma }\) is orthonormal if and only if, for all finite sums as in (1), we have equality, with R = 1.

A duality argument shows that \(\{\psi _{\gamma }\}_{\gamma \in \Gamma } \subset L^{2}(\mathbf{R}^{d})\) satisfies (1) if and only if, for all f ∈ L 2(R d),

$$\displaystyle{ \left (\sum _{\Gamma }\vert \langle \,f,\psi _{\gamma }\rangle \vert ^{2}\right )^{1/2} \leq R\Vert \,f\Vert _{ L^{2}}. }$$
(2)

Combining (1) and (2), we see that, if \(\{\psi _{\gamma }^{(1)}\}_{\gamma \in \Gamma }\) and \(\{\psi _{\gamma }^{(2)}\}_{\gamma \in \Gamma }\) are two almost-orthogonal families in L 2(R d), with respective constants R 1 and R 2, then, for all f ∈ L 2(R d),

$$\displaystyle{\sum _{\Gamma }\langle \,f,\psi _{\gamma }^{(1)}\rangle \psi _{ \gamma }^{(2)}}$$

converges unconditionally (in the sense of Definition 1 below) to define a linear operator T: L 2 → L 2 with bound ≤ R 1 R 2. The canonical example of such an operator is the identity, where \(\{\psi _{\gamma }^{(1)}\}_{\gamma \in \Gamma }\) and \(\{\psi _{\gamma }^{(2)}\}_{\gamma \in \Gamma }\) are both the same complete orthonormal family, such as the classical Haar functions [3]. Recall that an interval I is dyadic if I = [j2^k, (j + 1)2^k) for some integers j and k. For each such I we set

$$\displaystyle{h^{(I)}(x) \equiv \chi _{ I_{l}}(x) -\chi _{I_{r}}(x),}$$

where I l is I’s left half and I r is I’s right half. (We also use this notation for non-dyadic intervals.) The Haar function associated to I is h (I)(x)∕ | I | 1∕2, where, here and henceforth, | E | is a set E’s Lebesgue measure (of varying dimension!).

One can define “Haar functions” adapted to dyadic cubes in R d. A cube Q ⊂ R d is a Cartesian product of d intervals I i (Q) of equal length: \(Q =\prod _{1}^{d}I_{i}(Q)\). We call their common length Q’s sidelength, denoted ℓ(Q). The cube is dyadic if each I i (Q) is a dyadic interval. The set of all dyadic cubes in R d is \(\mathcal{D}\). The dimension d will vary but will be clear from the context. We get d-dimensional Haar functions for the Qs in \(\mathcal{D}\) by taking products

$$\displaystyle{\mu _{1}(x_{1}) \times \mu _{2}(x_{2}) \times \cdots \times \mu _{d}(x_{d})}$$

where x = (x 1, x 2, … , x d ) ∈ R d and each μ j is \(h^{(I_{j}(Q))}\) or \(\chi _{I_{j}(Q)}\). We run over all such products except the one for which every μ j equals \(\chi _{I_{j}(Q)}\). This yields, for each \(Q \in \mathcal{D}\), an orthogonal set of 2^d − 1 functions, \(\{h_{(i)}^{(Q)}\}_{1}^{2^{d}-1 }\). Each h (i) (Q) is supported on Q (where it only takes on the values ± 1), has integral equal to 0, and is constant on Q’s immediate dyadic subcubes. We normalize the set by dividing each h (i) (Q) by | Q | 1∕2. The resulting “Haar functions”,

$$\displaystyle{ \left \{\frac{h_{(i)}^{(Q)}} {\vert Q\vert ^{1/2}} \right \}_{Q\in \mathcal{D},\,1\leq i<2^{d}} }$$
(3)

make up a complete orthonormal family for L 2(R d), letting us write

$$\displaystyle{ f =\sum _{Q,i}\frac{\langle \,f,h_{(i)}^{(Q)}\rangle } {\vert Q\vert } h_{(i)}^{(Q)}, }$$
(4)

for any f ∈ L 2(R d).
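
For readers who wish to experiment with (4), here is a minimal numerical sketch (Python; the names and the setup are ours, not the paper’s). It expands a step function on [0, 1) over the dyadic subintervals of [0, 1) only; the omitted coarser scales account exactly for the function’s mean, so the truncated sum recovers f minus its average on [0, 1).

```python
import numpy as np

J = 8                        # finest dyadic level used
n = 2 ** J                   # one sample of f per interval [m/n, (m+1)/n)
rng = np.random.default_rng(1)
f = rng.standard_normal(n)   # f is the step function with value f[m] on [m/n, (m+1)/n)

recon = np.zeros(n)
for j in range(J):                          # dyadic I inside [0,1) with |I| = 2**(-j)
    width = n >> j
    for k in range(2 ** j):
        left = slice(k * width, k * width + width // 2)
        right = slice(k * width + width // 2, (k + 1) * width)
        coeff = (f[left].sum() - f[right].sum()) / n     # <f, h^(I)>, with dx = 1/n
        recon[left] += coeff / (width / n)               # (<f, h^(I)>/|I|) h^(I)
        recon[right] -= coeff / (width / n)

# Only scales inside [0,1) were used, so the sum reproduces f minus its mean.
assert np.allclose(recon, f - f.mean())
```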

Formula (4) is true, but is it stable? If we want to use (4) to investigate f, we have to estimate integrals

$$\displaystyle{\langle \,f,h_{(i)}^{(Q)}\rangle =\int _{\mathbf{ R}^{d}}f(x)\,h_{(i)}^{(Q)}(x)\,dx,}$$

which are likely to have small errors. We might make translation errors: instead of f(x) we have \(f(x +\vec{\tau }_{ 1}^{i}(Q))\), where (we hope) \(\vert \vec{\tau }_{1}^{i}(Q)\vert <\ell (Q)\); the computed inner product is then

$$\displaystyle{\int _{\mathbf{R}^{d}}f(x)\,h_{(i)}^{(Q)}(x -\vec{\tau }_{ 1}^{i}(Q))\,dx \equiv \langle \, f,h1_{ (i)}^{(Q)}\rangle.}$$

We can expect similar translation errors—call them \(\vec{\tau }_{2}^{i}(Q)\)—in the other h (i) (Q)s occurring in (4), resulting in “perturbed” Haar functions h2(i) (Q). If we try to add up part of (4), we face

$$\displaystyle{ \sum _{Q,i}\frac{\langle \,f,h1_{(i)}^{(Q)}\rangle } {\vert Q\vert } h2_{(i)}^{(Q)}. }$$
(5)

If the \(\vec{\tau }_{k}^{i}(Q)\) s have norms ≤ η ℓ(Q), where η is small, then we hope that

$$\displaystyle{ \left \Vert \,f -\sum _{Q,i}\frac{\langle \,f,h1_{(i)}^{(Q)}\rangle } {\vert Q\vert } h2_{(i)}^{(Q)}\right \Vert _{ 2} \leq C(\eta )\Vert \,f\Vert _{2} }$$
(6)

for some function C(η) going to 0 as η → 0.

But it is not clear that the families {hk (i) (Q)∕ | Q | 1∕2} Q, i (k = 1, 2) are even almost-orthogonal. The problem comes from the Haar functions’ jumps. We can fix this by working with a smoother family. Let 0 < α ≤ 1. Suppose that, for each \(Q \in \mathcal{D}\), we have a function ϕ (Q): R d → C such that:

  (a) supp ϕ (Q) ⊂ Q;

  (b) | ϕ (Q)(x) − ϕ (Q)(x′) | ≤ ( | x − x′ | ∕ℓ(Q))^α for all x and x′;

  (c) ∫ ϕ (Q) dx = 0.
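
One admissible example, for d = 1, Q = [0, 1), and α = 1 (chosen here purely for illustration; the paper does not single it out), is

$$\displaystyle{\phi ^{(Q)}(x) = \frac{1} {4\pi }\sin (4\pi x)\,\chi _{[1/4,3/4]}(x),}$$

which is supported inside Q, has integral 0, and is Lipschitz with constant 1 = 1∕ℓ(Q), so (a)–(c) hold.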

It is well known that \(\{\phi ^{(Q)}/\vert Q\vert ^{1/2}\}_{Q\in \mathcal{D}}\) is almost-orthogonal in L 2(R d) [3, 4]. If \(\{\phi _{(1)}^{(Q)}/\vert Q\vert ^{1/2}\}_{Q\in \mathcal{D}}\) and \(\{\phi _{(2)}^{(Q)}/\vert Q\vert ^{1/2}\}_{Q\in \mathcal{D}}\) are two such families then the unconditionally convergent sum

$$\displaystyle{ \sum _{Q\in \mathcal{D}}\frac{\langle \,f,\phi _{(1)}^{(Q)}\rangle } {\vert Q\vert } \phi _{(2)}^{(Q)}(x), }$$
(7)

defines a bounded linear operator T: L 2 → L 2. This sum is also stable. Let 0 < η < 1∕2 and let \(\{\vec{\tau }_{i}(Q)\}_{Q\in \mathcal{D}}\) (i = 1, 2) be two families of vectors in R d such that \(\vert \vec{\tau }_{i}(Q)\vert \leq \eta \ell (Q)\). Define \(\widetilde{\phi _{(i)}^{(Q)}}(x) =\phi _{ (i)}^{(Q)}(x -\vec{\tau }_{i}(Q))\) (i = 1, 2). The families \(\{\widetilde{\phi _{(i)}^{(Q)}}/\vert Q\vert ^{1/2}\}_{Q\in \mathcal{D}}\) are almost-orthogonal, with constants ≤ C(α, d) [3, 4], implying that

$$\displaystyle{\widetilde{T}(\,f) \equiv \sum _{Q\in \mathcal{D}}\frac{\langle \,f,\widetilde{\phi _{(1)}^{(Q)}}\rangle } {\vert Q\vert } \widetilde{\phi _{(2)}^{(Q)}}}$$

defines a bounded linear operator on L 2. Moreover, for every 0 < r < α, there is a constant C = C(α, r, d) so that, for all f ∈ L 2(R d) [4],

$$\displaystyle{ \left \Vert T(\,f) -\widetilde{ T}(\,f)\right \Vert _{2} \leq C\eta ^{r}\Vert \,f\Vert _{ 2}; }$$
(8)

and analogous results hold in L p(R d) if 1 < p < ∞ [4]. The ϕ (i) (Q)s’ smoothness is crucial here. But with the hk (i) (Q)s, “α is 0”, and so the Hölder-smooth ϕ (Q)s seem better suited for working with wavelet representations of operators. This superiority is somewhat specious. In the real world, (7) is discretized: the ϕ (Q)s are replaced by discontinuous, piecewise constant functions. Sums like (4) provide a model for understanding their sensitivity to errors.

It turns out that the perturbed Haar systems are almost-orthogonal in L 2(R d) (Theorems 1 and 2) and series like (4) are stable: they satisfy (6) with C(η) equal to a dimensional constant times η 1∕2 (Theorem 3). The almost-orthogonality and stability results hold for much more general systems, perturbations, and operators than those discussed above, and the exponent on η is sharp.

Our proofs of these facts start from a familiar concept. A function f: [a, b] → C is said to be of bounded variation on [a, b] (written f ∈ BV [a, b]) [1] if there is a finite M so that, for all partitions P = { a = x 0 < ⋯ < x n  = b} of [a, b],

$$\displaystyle{\sum _{1}^{n}\vert \,f(x_{ k}) - f(x_{k-1})\vert \leq M.}$$

The supremum over all such sums is called f’s total variation over [a, b] and is denoted V f [a, b]. (When we write V f (I) and I = [a, b], we mean V f [a, b].) If f ∈ BV [a, b] then f ∈ BV [c, d] for every [c, d] ⊂ [a, b], and, for every partition P as above,

$$\displaystyle{\sum _{1}^{n}V _{ f}[x_{k-1},x_{k}] = V _{f}(\cup _{1}^{n}[x_{ k-1},x_{k}]) = V _{f}[a,b].}$$

We say that a function is of bounded variation on R if the supremum of the preceding expression, over all closed bounded intervals, is finite; and we call that supremum the function’s total variation on R.

For every cube Q ⊂ R d, let NBV (Q) be the set of f: R d → C such that: (a) f is measurable; (b) f’s support is a subset of \(\overline{Q}\) (the closure of Q); (c) for each 1 ≤ i ≤ d, f is of bounded variation with respect to x i on R, with total variation on R being ≤ 1; (d) ∫ fdx = 0.

Condition (c) means: If we fix the x 2, x 3, … , x d components of x = (x 1, … , x d ) ∈ R d, then the function f(⋅ , x 2, x 3, … , x d ) has total variation ≤ 1 on R; the analogous statements hold for x 2, x 3, etc.
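
For example, when d = 1 and I is a dyadic interval, the normalized jump function

$$\displaystyle{\frac{h^{(I)}} {4} = \frac{1} {4}\left (\chi _{I_{l}} -\chi _{I_{r}}\right )}$$

belongs to NBV (I): it is supported on \(\overline{I}\), has integral 0, and its total variation on R is (1 + 2 + 1)∕4 = 1 (jumps at I’s left endpoint, midpoint, and right endpoint). Compare the factors 2N and 4 in Corollaries 1 and 2 below.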

Our fundamental result is:

Theorem 1

If f (Q) ∈ NBV (Q) for every \(Q \in \mathcal{D}\) then

$$\displaystyle{\left \{ \frac{f^{(Q)}} {\vert Q\vert ^{1/2}}\right \}_{Q\in \mathcal{D}}}$$

is almost-orthogonal in L 2 ( R d ), with constant \(\leq \left (1 + \frac{1} {\sqrt{2}}\right )d\) .
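
The constant in Theorem 1 can be checked empirically. The sketch below (Python with NumPy; the random construction is ours and only illustrates the hypotheses) builds one random step function f (Q) ∈ NBV (Q) for each dyadic subinterval Q of [0, 1) down to a fixed level and computes the almost-orthogonality constant of the normalized family as the square root of the largest eigenvalue of its Gram matrix. By Theorem 1 with d = 1, that value should never exceed \(1 + \frac{1} {\sqrt{2}} \approx 1.71\).

```python
import numpy as np

rng = np.random.default_rng(0)
J, L = 4, 10                  # dyadic levels used; grid of 2**L points on [0, 1)
n = 2 ** L

def random_nbv(j, k):
    """Random step function in NBV([k/2^j, (k+1)/2^j)), sampled on the grid:
    integral zero over the interval, total variation on R normalized to 1."""
    width = n >> j
    v = rng.standard_normal(width)
    v -= v.mean()                                           # integral = 0
    tv = abs(v[0]) + np.abs(np.diff(v)).sum() + abs(v[-1])  # variation, boundary jumps included
    v /= tv
    out = np.zeros(n)
    out[k * width:(k + 1) * width] = v
    return out

rows = [random_nbv(j, k) / np.sqrt(2.0 ** (-j))             # f^(Q) / |Q|^(1/2)
        for j in range(J + 1) for k in range(2 ** j)]
M = np.array(rows)

gram = (M @ M.T) / n                                        # inner products, dx = 1/n
R = np.sqrt(np.linalg.eigvalsh(gram).max())                 # almost-orthogonality constant
print(R, 1 + 1 / np.sqrt(2))                                # R stays below 1 + 1/sqrt(2)
```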

Theorem 1 immediately implies the fact stated in the abstract:

Corollary 1

Let N ≥ 2. Suppose that, for every dyadic cube Q ⊂ R d , we have N convex regions \(\{R_{i}(Q)\}_{1}^{N}\) , subsets of Q, and N complex numbers \(\{c_{i}(Q)\}_{1}^{N}\) such that |c i (Q)|≤ 1 and \(\sum _{1}^{N}c_{i}(Q)\vert R_{i}(Q)\vert = 0\) . Define, for every \(Q \in \mathcal{D}\) ,

$$\displaystyle{\tilde{h}_{(Q)}(x) \equiv \vert Q\vert ^{-1/2}\left (\sum _{ 1}^{N}c_{ i}(Q)\chi _{R_{i}(Q)}(x)\right ).}$$

Then, for every finite linear combination \(\sum _{Q\in \mathcal{D}}\lambda _{Q}\tilde{h}_{(Q)}\) ,

$$\displaystyle{\left \Vert \sum _{Q\in \mathcal{D}}\lambda _{Q}\tilde{h}_{(Q)}\right \Vert _{2} \leq \left (2 + \sqrt{2}\right )Nd\left (\sum _{Q\in \mathcal{D}}\vert \lambda _{Q}\vert ^{2}\right )^{1/2}.}$$

Proof

Each function \(\sum _{1}^{N}c_{i}(Q)\chi _{R_{i}(Q)}\) equals 2N times some f (Q) ∈ NBV (Q): any line parallel to a coordinate axis meets each convex region R i (Q) in an interval, so each \(c_{i}(Q)\chi _{R_{i}(Q)}\) contributes total variation at most 2 in each variable, and the whole sum has integral 0 by hypothesis.

Corollary 1 holds no matter what the convex bodies are (cones, spheres, parallelepipeds, cylinders, etc.) or how they are placed (overlapping, disjoint, etc.). Careful placement gives a better constant.

Corollary 2

Suppose that, for every dyadic cube Q ⊂ R d , we have 2^d convex regions \(\{R_{i}(Q)\}_{1}^{2^{d} }\) , where each R i (Q) is a subset of a unique immediate dyadic subcube of Q, and that we have complex numbers \(\{c_{i}(Q)\}_{1}^{2^{d} }\) such that |c i (Q)|≤ 1 and \(\sum _{1}^{2^{d} }c_{i}(Q)\vert R_{i}(Q)\vert = 0\) . Define, for every \(Q \in \mathcal{D}\) ,

$$\displaystyle{\tilde{h}_{(Q)}(x) \equiv \vert Q\vert ^{-1/2}\left (\sum _{ 1}^{2^{d} }c_{i}(Q)\chi _{R_{i}(Q)}(x)\right ).}$$

For every finite linear combination \(\sum _{Q\in \mathcal{D}}\lambda _{Q}\tilde{h}_{(Q)}\) ,

$$\displaystyle{\left \Vert \sum _{Q\in \mathcal{D}}\lambda _{Q}\tilde{h}_{(Q)}\right \Vert _{2} \leq \left (4 + 2\sqrt{2}\right )d\left (\sum _{Q\in \mathcal{D}}\vert \lambda _{Q}\vert ^{2}\right )^{1/2}.}$$

Again, it’s simple: because each R i (Q) sits in its own immediate dyadic subcube of Q, any line parallel to a coordinate axis meets at most two of the R i (Q)s, so each function \(\sum _{1}^{2^{d} }c_{i}(Q)\chi _{R_{i}(Q)}\) equals 4 times some f (Q) ∈ NBV (Q). (In d = 1, taking R 1(Q) and R 2(Q) to be Q’s left and right halves and c 1(Q) = 1, c 2(Q) = −1 recovers the classical Haar system, whose almost-orthogonality constant is of course 1.)

After proving Theorem 1 we look at the stability of almost-orthogonal expansions of the form

$$\displaystyle{ T(g) \equiv \sum _{Q\in \mathcal{D}}\frac{\langle g,f_{1}^{(Q)}\rangle } {\vert Q\vert } f_{2}^{(Q)}, }$$
(9)

where each f i (Q) ∈ NBV (Q). Corollary 3 shows that, for any g ∈ L 2, the series in (9) converges unconditionally to define T as a bounded linear operator on L 2. In Theorem 3 we show that the operator defined by (9) is L 2-stable with respect to small dilation and translation errors in the functions f i (Q). We now say precisely what those small errors are.

Given a family of functions \(\{\,f^{(Q)}\}_{Q\in \mathcal{D}}\), where each f (Q) ∈ NBV (Q), we suppose we have two sequences of vectors \(\{\vec{\delta }(Q)\}_{Q\in \mathcal{D}}\) and \(\{\vec{\tau }(Q)\}_{Q\in \mathcal{D}}\) in R d. The vectors \(\vec{\tau }(Q)\) are assumed to be small and the vectors \(\vec{\delta }(Q)\) are assumed to be close to \(\vec{1} \equiv (1,1,1,\ldots,1)\). Precisely, for some 0 < η < 1∕2, \(\vert \vec{1} -\vec{\delta } (Q)\vert +\vert \vec{\tau } (Q)\vert <\eta\) for all \(Q \in \mathcal{D}\). If \(\vec{\delta }(Q) = (\delta _{1},\delta _{2},\ldots,\delta _{d})\) and x = (x 1, x 2, … , x d ) ∈ R d we shall set \(\vec{\delta }(Q)x \equiv (\delta _{1}x_{1},\delta _{2}x_{2},\delta _{3}x_{3},\ldots,\delta _{d}x_{d})\).

We define the perturbed form of f (Q) by

$$\displaystyle{ \widetilde{f^{(Q)}}(x) \equiv f^{(Q)}\left (\vec{\delta }(Q)(x - x_{ Q} +\ell (Q)\vec{\tau }(Q)) + x_{Q}\right ). }$$
(10)

The effect of replacing x with \(\vec{\delta }(Q)(x - x_{Q} +\ell (Q)\vec{\tau }(Q)) + x_{Q}\) is to shift f (Q)’s “center” a bit and dilate it slightly “relative to x Q ”. For example, if

$$\displaystyle{g(x) =\chi _{B(x_{Q};\ell(Q))}(x),}$$

the characteristic function of a ball roughly comparable to Q, and \(\vec{\delta }(Q) = (\delta,\delta,\delta,\ldots,\delta )\), then

$$\displaystyle{g(\vec{\delta }(Q)(x - x_{Q} +\ell (Q)\vec{\tau }(Q)) + x_{Q}) =\chi _{B(x_{Q}-\ell(Q)\vec{\tau }(Q);\ell(Q)/\delta )}(x):}$$

the center shifts by a small multiple of ℓ(Q) and the radius gets multiplied by δ −1. (For a general \(\vec{\delta }(Q)\), the ball becomes an ellipsoid.) For an operator T like (9) built from two families \(\{\,f_{j}^{(Q)}\}_{Q\in \mathcal{D}}\) ( j = 1, 2), we assume we have sequences of vectors \(\{\vec{\delta }_{j}(Q)\}_{Q\in \mathcal{D}}\) and vectors \(\{\vec{\tau }_{j}(Q)\}_{Q\in \mathcal{D}}\) such that \(\vert \vec{1} -\vec{\delta }_{j}(Q)\vert +\vert \vec{\tau } _{j}(Q)\vert <\eta\), from which we define the analogous \(\widetilde{f_{j}^{(Q)}}\) s as given by formula (10). We define a perturbation of T in the obvious way:

$$\displaystyle{ \widetilde{T}(g) \equiv \sum _{Q\in \mathcal{D}}\frac{\langle g,\widetilde{f_{1}^{(Q)}}\rangle } {\vert Q\vert } \widetilde{f_{2}^{(Q)}}. }$$
(11)
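
In code, the change of variable in (10) is a one-line affine map. The following sketch (Python; the function and argument names are ours, not the paper’s) produces the perturbed function \(\widetilde{f^{(Q)}}\) from f (Q), the reference point x Q , the sidelength ℓ(Q), and the vectors \(\vec{\delta }(Q)\) and \(\vec{\tau }(Q)\):

```python
import numpy as np

def perturb(f, x_Q, side, delta, tau):
    """Perturbation (10): x -> f(delta * (x - x_Q + side * tau) + x_Q), componentwise.
    f     : callable on points of R^d (1-D NumPy arrays of length d)
    x_Q   : reference point of the cube Q
    side  : sidelength ell(Q)
    delta : componentwise dilation vector, close to (1, ..., 1)
    tau   : translation vector, small
    """
    x_Q, delta, tau = (np.asarray(v, dtype=float) for v in (x_Q, delta, tau))
    return lambda x: f(delta * (np.asarray(x, dtype=float) - x_Q + side * tau) + x_Q)
```

Applied to the indicator of the ball B(x Q ; ℓ(Q)), this reproduces the shifted, dilated indicator described after (10).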

In Sect. 3 we prove:

Theorem 2

The operator defined by ( 11 ) is L 2 bounded, with norm ≤ C(d).

Theorem 3

There is a constant C = C(d), independent of η, so that, for all operators T and \(\widetilde{T}\) (as defined by ( 9 ) and ( 11 ), respectively), and all g ∈ L 2 ( R d ),

$$\displaystyle{\Vert T(g) -\widetilde{ T}(g)\Vert _{2} \leq C(d)\eta ^{1/2}\Vert g\Vert _{ 2}.}$$

The exponent 1∕2 is the best possible. Let \(\{\,f_{j}^{(Q)}/\vert Q\vert ^{1/2}\}_{Q\in \mathcal{D}}\) ( j = 1, 2) be the Haar functions on R and let g = h [0, 1). Leave the f 1 (Q)s alone but shift h [0, 1) in the f 2 (Q) system to the right by 0 < η < 1∕10. Then T(g)(x) = h [0, 1)(x), \(\widetilde{T}(g)(x) = h_{[0,1)}(x-\eta )\), and \(\Vert T(g) -\widetilde{ T}(g)\Vert _{2} \sim \eta ^{1/2}\).
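
Spelling this out: the difference h [0, 1)(x) − h [0, 1)(x −η) equals 1 on [0, η), −2 on [1∕2, 1∕2 +η), 1 on [1, 1 +η), and 0 elsewhere, so

$$\displaystyle{\left \Vert T(g) -\widetilde{ T}(g)\right \Vert _{2}^{2} =\int _{\mathbf{R}}\vert h_{[0,1)}(x) - h_{[0,1)}(x-\eta )\vert ^{2}\,dx =\eta +4\eta +\eta = 6\eta,}$$

that is, \(\Vert T(g) -\widetilde{ T}(g)\Vert _{2} = (6\eta )^{1/2}\).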

At two places the reader may wonder why we are doing things certain ways when others seem simpler. Remarks there (labeled “Point 1” and “Point 2”) direct the reader to an appendix (Sect. 4) for motivations. Originally we tried to put these in the introduction, but attempts to motivate the motivations (before stating the proofs) made the paper too long and confusing. We removed them, thinking nobody would care about them anyway, but the referee asked about precisely those issues. We then had the idea of addressing them in an appendix. We are grateful to the referee for getting us to explain ourselves, and helping to make the paper not too long and just confusing enough. The “points” remarks occur, respectively, after the proofs of Lemma 1 and Theorem 1.

We write A ∼ B—where A and B are positive quantities depending on some parameters—to mean that there are positive numbers c 1 and c 2 (“comparability constants”) so that

$$\displaystyle{ c_{1}A \leq B \leq c_{2}A; }$$
(12)

and, if c 1 and c 2 happen to depend on parameters, they do so in a way that keeps (12) from becoming trivial. We will often use ‘C’ to denote a constant which might change from one occurrence to the next. We will not always state the parameters C depends on. If E and F are sets, we write E ⊂ F to express E ⊆ F.

We indicate the end of the proof with the symbol □.

2 The Proof of Theorem 1

We begin with two lemmas.

Lemma 1

Let I be a closed, bounded interval. Suppose that f: I → C is of bounded variation, with V f (I) ≤ 1, b: I → R is integrable, and ∫b dx = 0. Then:

$$\displaystyle{ \left \vert \int _{I}f\,b\,dx\right \vert \leq (1/2)\Vert b\Vert _{1}. }$$
(13)

Proof of Lemma 1

We may take ∥ b ∥ 1 = 1. Assume first that f is real. If b + and b − are b’s positive and negative parts then ∫ b +dx = ∫ b −dx = 1∕2, implying

$$\displaystyle{\int f(x)\,b^{+}(x)\,dx = (1/2)s_{ 1}}$$

and

$$\displaystyle{\int f(x)\,b^{-}(x)\,dx = (1/2)s_{ 2},}$$

where s 1 and s 2 are two numbers lying in [inf I f, sup I f]. Therefore

$$\displaystyle{\left \vert \int f(x)\,(b^{+}(x) - b^{-}(x))\,dx\right \vert = (1/2)\vert s_{ 1} - s_{2}\vert \leq (1/2)\sup _{x,y\in I}\vert \,f(x) - f(\,y)\vert \leq 1/2.}$$

If f is not real, let α be a complex number with modulus equal to 1 such that

$$\displaystyle{\left \vert \int _{I}f\,b\,dx\right \vert =\int (\alpha f(x))\,b(x)\,dx =\int (\mathfrak{R}(\alpha f(x)))\,b(x)\,dx,}$$

and apply the same argument to \(\mathfrak{R}(\alpha f)\).

Point 1. Using bounded variation seems like overkill. For f defined on I we can set

$$\displaystyle{\Omega _{f}(I) \equiv \sup \{\vert \, f(x) - f(\,y)\vert:\ x,y \in I\}.}$$

If \(\Omega _{f}(I) \leq 1\) we’ll get

$$\displaystyle{\left \vert \int _{I}f\,b\,dx\right \vert \leq (1/2)\Vert b\Vert _{1}.}$$

Why use V f (I)? See the appendix.

The second lemma lets us prove Theorem 1 by induction on d.

Lemma 2

Suppose that d ≥ 2, Q ⊂ R d is a cube, and f: R d → C lies in NBV (Q). Write Q ≡ I 1 (Q) × K(Q), where \(K(Q) =\prod _{2}^{d}I_{i}(Q)\). For y ∈ R d−1 define

$$\displaystyle{\phi (\,y) \equiv \ell (Q)^{-1}\int _{ I_{1}(Q)}f(t,y)\,dt.}$$

Then ϕ ∈ NBV (K(Q)).

Proof of Lemma 2

It is trivial that supp \(\phi \subset \overline{K(Q)}\) and ∫ ϕ dy = 0. For 2 ≤ j ≤ d, let \(\{y_{k}\}_{0}^{n}\) be points in R d−1 differing only in their x j coordinates, where these increase with k. Then:

$$\displaystyle{\sum _{1}^{n}\vert \phi (\,y_{ k}) -\phi (\,y_{k-1})\vert \leq \ell (Q)^{-1}\int _{ I_{1}(Q)}\left (\sum _{1}^{n}\vert \,f(t,y_{ k}) - f(t,y_{k-1})\vert \right )\,dt \leq 1,}$$

because f ∈ NBV (Q).

We now prove Theorem 1.

Let d = 1. Take Q and I, dyadic intervals. Consider the inner product

$$\displaystyle{ \left \langle \frac{f^{(Q)}} {\vert Q\vert ^{1/2}}, \frac{h^{(I)}} {\vert I\vert ^{1/2}}\right \rangle, }$$
(14)

where f (Q) ∈ NBV (Q) and h (I)∕ | I | 1∕2 is the classical Haar function associated to I. If Q ∩ I = ∅ or Q is properly contained in I then (14) is 0. If I ⊂ Q then, by Lemma 1,

$$\displaystyle{\left \vert \left \langle \frac{f^{(Q)}} {\vert Q\vert ^{1/2}}, \frac{h^{(I)}} {\vert I\vert ^{1/2}}\right \rangle \right \vert \leq (1/2)V _{f^{(Q)}}(\overline{I})\left ( \frac{\vert I\vert } {\vert Q\vert }\right )^{1/2}.}$$

Therefore, for each j ≥ 0,

$$\displaystyle\begin{array}{rcl} \sum _{{ I\subset Q \atop \ell(I)=2^{-j}\ell(Q)} }\left \vert \left \langle \frac{f^{(Q)}} {\vert Q\vert ^{1/2}}, \frac{h^{(I)}} {\vert I\vert ^{1/2}}\right \rangle \right \vert & \leq & (1/2)2^{-j/2}\sum _{{ I\subset Q \atop \ell(I)=2^{-j}\ell(Q)} }V _{f^{(Q)}}(\overline{I}) \\ & =& (1/2)2^{-j/2}V _{ f^{(Q)}}(\overline{Q}) \\ & \leq & (1/2)2^{-j/2}. {}\end{array}$$
(15)

For each \(Q \in \mathcal{D}\),

$$\displaystyle\begin{array}{rcl} \sum _{I\in \mathcal{D}}\left \vert \left \langle \frac{f^{(Q)}} {\vert Q\vert ^{1/2}}, \frac{h^{(I)}} {\vert I\vert ^{1/2}}\right \rangle \right \vert & =& \sum _{{ I\in \mathcal{D} \atop I\subset Q} }\left \vert \left \langle \frac{f^{(Q)}} {\vert Q\vert ^{1/2}}, \frac{h^{(I)}} {\vert I\vert ^{1/2}}\right \rangle \right \vert {}\\ & \leq & (1/2)\sum _{0}^{\infty }2^{-j/2} {}\\ & =& 1 + \frac{1} {\sqrt{2}}. {}\\ \end{array}$$

For every \(I \in \mathcal{D}\),

$$\displaystyle\begin{array}{rcl} \sum _{Q\in \mathcal{D}}\left \vert \left \langle \frac{f^{(Q)}} {\vert Q\vert ^{1/2}}, \frac{h^{(I)}} {\vert I\vert ^{1/2}}\right \rangle \right \vert & =& \sum _{{ Q\in \mathcal{D} \atop I\subset Q} }\left \vert \left \langle \frac{f^{(Q)}} {\vert Q\vert ^{1/2}}, \frac{h^{(I)}} {\vert I\vert ^{1/2}}\right \rangle \right \vert {}\\ & \leq & (1/2)\sum _{0}^{\infty }2^{-j/2} {}\\ & =& 1 + \frac{1} {\sqrt{2}}. {}\\ \end{array}$$
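
The version of the Schur Test we invoke is the standard one: if a matrix \((a(Q,I))_{Q,I\in \mathcal{D}}\) satisfies

$$\displaystyle{\sup _{Q\in \mathcal{D}}\sum _{I\in \mathcal{D}}\vert a(Q,I)\vert \leq R_{1}\quad \text{and}\quad \sup _{I\in \mathcal{D}}\sum _{Q\in \mathcal{D}}\vert a(Q,I)\vert \leq R_{2},}$$

then it defines a bounded operator on \(\ell ^{2}(\mathcal{D})\) with norm ≤ (R 1 R 2) 1∕2. The two estimates just proved give \(R_{1} = R_{2} = 1 + \frac{1} {\sqrt{2}}\).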

By the Schur Test, the linear mapping \(L:\ell ^{2}(\mathcal{D}) \rightarrow \ell^{2}(\mathcal{D})\) defined by

$$\displaystyle{L\left (\{\lambda _{Q}\}_{Q\in \mathcal{D}}\right ) \equiv \left \{\sum _{Q\in \mathcal{D}}\lambda _{Q}\left \langle \frac{f^{(Q)}} {\vert Q\vert ^{1/2}}, \frac{h^{(I)}} {\vert I\vert ^{1/2}}\right \rangle \right \}_{I\in \mathcal{D}}}$$

has a bound less than or equal to \(1 + \frac{1} {\sqrt{2}}\). Let \(g =\sum _{Q\in \mathcal{D}}\lambda _{Q} \frac{f^{(Q)}} {\vert Q\vert ^{1/2}}\) be a finite linear sum. The classical Haar functions form a complete orthonormal set in L 2(R). Therefore,

$$\displaystyle\begin{array}{rcl} \int \vert g\vert ^{2}\,dx& =& \sum _{ I}\left \vert \left \langle g, \frac{h^{(I)}} {\vert I\vert ^{1/2}}\right \rangle \right \vert ^{2} {}\\ & =& \sum _{I}\left \vert \sum _{Q\in \mathcal{D}}\lambda _{Q}\left \langle \frac{f^{(Q)}} {\vert Q\vert ^{1/2}}, \frac{h^{(I)}} {\vert I\vert ^{1/2}}\right \rangle \right \vert ^{2} {}\\ & \leq & \left (1 + \frac{1} {\sqrt{2}}\right )^{2}\sum _{ Q\in \mathcal{D}}\vert \lambda _{Q}\vert ^{2}, {}\\ \end{array}$$

proving Theorem 1 when d = 1.

Assume the result for d − 1 ≥ 1, with constant C(d − 1); i.e., assume that if f (Q) ∈ NBV (Q) for every (d − 1)-dimensional dyadic cube Q, and \(\sum _{Q\in \mathcal{D}}\lambda _{Q} \frac{f^{(Q)}} {\vert Q\vert ^{1/2}}\) is any finite linear combination, then

$$\displaystyle{\left (\int _{\mathbf{R}^{d-1}}\left \vert \sum _{Q\in \mathcal{D}}\lambda _{Q} \frac{f^{(Q)}} {\vert Q\vert ^{1/2}}\right \vert ^{2}\,dx\right )^{1/2} \leq C(d - 1)\left (\sum _{ Q\in \mathcal{D}}\vert \lambda _{Q}\vert ^{2}\right )^{1/2}.}$$

Consider the family

$$\displaystyle{\left \{ \frac{f^{(Q)}} {\vert Q\vert ^{1/2}}\right \}_{Q\in \mathcal{D}},}$$

where every Q is a d-dimensional dyadic cube and each f (Q) ∈ NBV (Q). Put f (Q)(x) = f (Q)(x′, y), where x′ ∈ R and y ∈ R d−1. Write

$$\displaystyle{f^{(Q)}(x',y) = f_{ 1}^{(Q)}(x',y) + f_{ 2}^{(Q)}(x',y),}$$

where

$$\displaystyle\begin{array}{rcl} f_{1}^{(Q)}(x',y)& =& \left (\,f^{(Q)}(x',y) -\ell (Q)^{-1}\int _{ I_{1}(Q)}f^{(Q)}(t,y)\,dt\right )\chi _{ \overline{I_{1}(Q)}}(x')\chi _{\overline{K(Q)}}(\,y) {}\\ f_{2}^{(Q)}(x',y)& =& \left (\ell(Q)^{-1}\int _{ I_{1}(Q)}f^{(Q)}(t,y)\,dt\right )\chi _{ \overline{I_{1}(Q)}}(x')\chi _{\overline{K(Q)}}(\,y), {}\\ \end{array}$$

and I 1(Q) and K(Q) are as in the statement of Lemma 2, so that Q = I 1(Q) × K(Q).

By our d = 1 result, for each fixed y ∈ R d−1, the family \(\{\ell(Q)^{-1/2}f_{1}^{(Q)}(x',y)\}_{Q\in \mathcal{D}}\) is almost-orthogonal in L 2(R), with constant ≤ C(1). This is because, for each fixed y, the function (Q)−1∕2 f 1 (Q)(x′, y) is either identically 0 (with respect to x′) or it’s a suitably scaled, uniformly bounded-variation function, with integral 0, adapted to a unique dyadic interval I 1(Q). Note that subtracting a term of the form \(c\chi _{\overline{I_{ 1}(Q)}}(x')\) does not change f (Q)’s total variation in x′ on \(\overline{I_{1}(Q)}\), and so does not affect the relevant Schur Test estimates. (See the proof of Lemma 1.)

For each fixed y ∈ R d−1,

$$\displaystyle{\int _{\mathbf{R}}\left \vert \sum \lambda _{Q}\vert Q\vert ^{-1/2}f_{ 1}^{(Q)}(x',y)\right \vert ^{2}\,dx' \leq C(1)^{2}\sum \vert \lambda _{ Q}\vert ^{2}\ell(Q)^{-(d-1)}\chi _{ \overline{K(Q)}}(\,y).}$$

Since \(\vert \overline{K(Q)}\vert =\ell (Q)^{d-1}\), integrating in y yields

$$\displaystyle{ \int _{\mathbf{R}^{d}}\left \vert \sum \lambda _{Q}\vert Q\vert ^{-1/2}f_{ 1}^{(Q)}(x',y)\right \vert ^{2}\,dx'\,dy \leq C(1)^{2}\sum \vert \lambda _{ Q}\vert ^{2}. }$$
(16)

By induction (and because of Lemma 2), for each fixed x′ ∈ R, the family \(\{\ell(Q)^{-(d-1)/2}f_{2}^{(Q)}(x',y)\}_{Q\in \mathcal{D}}\) is almost-orthogonal in L 2(R d−1), with constant ≤ C(d − 1). (As with the f 1 (Q)s, for some x′, f 2 (Q)(x′, y) is identically 0 in y—which is fine.) Hence, for each fixed x′ ∈ R,

$$\displaystyle{\int _{\mathbf{R}^{d-1}}\left \vert \sum \lambda _{Q}\vert Q\vert ^{-1/2}f_{ 2}^{(Q)}(x',y)\right \vert ^{2}\,dy \leq C(d - 1)^{2}\sum \vert \lambda _{ Q}\vert ^{2}\ell(Q)^{-1}\chi _{ \overline{I_{1}(Q)}}(x').}$$

Now integrating in x′ yields:

$$\displaystyle{ \int _{\mathbf{R}^{d}}\left \vert \sum \lambda _{Q}\vert Q\vert ^{-1/2}f_{ 2}^{(Q)}(x',y)\right \vert ^{2}\,dx'\,dy \leq C(d - 1)^{2}\sum \vert \lambda _{ Q}\vert ^{2}. }$$
(17)

Combining (16) and (17) yields

$$\displaystyle{\left \Vert \sum \lambda _{Q} \frac{f^{(Q)}} {\vert Q\vert ^{1/2}}\right \Vert _{2} \leq (C(1) + C(d - 1))\left (\sum \vert \lambda _{Q}\vert ^{2}\right )^{1/2},}$$

which implies

$$\displaystyle{C(d) \leq d\,C(1).}$$

The Schur Test gives \(C(1) \leq 1 + \frac{1} {\sqrt{2}}\). We have Theorem 1.

Point 2. Why do induction? We have d-dimensional Haar functions (3). Why not get Schur test estimates directly from inner products between them and the functions in

$$\displaystyle{\left \{ \frac{f^{(Q)}} {\vert Q\vert ^{1/2}}\right \}_{Q\in \mathcal{D}}?}$$

See the appendix.

Theorem 1 implies the L 2 boundedness of a certain “rough” operator (see the introduction), defined as a limit of finite sums. We need to specify in what way this limit is taken.

Definition 1

We say that a sequence \(\{\mathcal{E}_{k}\}_{1}^{\infty }\) of finite subsets of \(\mathcal{D}\) fills up \(\mathcal{D}\) if every \(Q \in \mathcal{D}\) is in all but finitely many \(\mathcal{E}_{k}\) s. (This holds if the \(\mathcal{E}_{k}\) s are increasing and \(\cup _{k}\mathcal{E}_{k} = \mathcal{D}\).) Let \(\{\lambda _{Q}\}_{Q\in \mathcal{D}}\) be a sequence of complex numbers, and \(\{g_{(Q)}\}_{Q\in \mathcal{D}}\) a sequence of functions in L 2(R d), each indexed over the family of dyadic cubes \(\mathcal{D}\). We say that

$$\displaystyle{\sum _{Q\in \mathcal{D}}\lambda _{Q}g_{(Q)}}$$

converges unconditionally to h ∈ L 2(R d) if, for every sequence of finite subsets \(\{\mathcal{E}_{k}\}_{1}^{\infty }\) that fills up \(\mathcal{D}\),

$$\displaystyle{\lim _{k\rightarrow \infty }\left \Vert h -\sum _{Q\in \mathcal{E}_{k}}\lambda _{Q}g_{(Q)}\right \Vert _{2} = 0.}$$

Corollary 3

Let \(\{\,f_{1}^{(Q)}\}_{Q\in \mathcal{D}}\) and \(\{\,f_{2}^{(Q)}\}_{Q\in \mathcal{D}}\) be two families such that f i (Q) ∈ NBV (Q) for all \(Q \in \mathcal{D}\) and i = 1, 2. If g ∈ L 2 ( R d ) then the series

$$\displaystyle{ \sum _{Q\in \mathcal{D}}\frac{\langle g,f_{1}^{(Q)}\rangle } {\vert Q\vert } f_{2}^{(Q)} }$$
(18)

converges unconditionally to some h in L 2 . Moreover,

$$\displaystyle{\Vert h\Vert _{2} \leq \left (\left (1 + \frac{1} {\sqrt{2}}\right )d\right )^{2}\Vert g\Vert _{ 2}.}$$

In other words, ( 18 ) defines a linear operator T: L 2 → L 2 with norm \(\leq ((1 + \frac{1} {\sqrt{2}})d)^{2}\) .

Proof of Corollary 3

Let g ∈ L 2(R d), and suppose that \(\mathcal{E}\subset \mathcal{D}\) is a finite subset. Define

$$\displaystyle{T_{\mathcal{E}}(g) \equiv \sum _{\mathcal{E}}\frac{\langle g,f_{1}^{(Q)}\rangle } {\vert Q\vert } f_{2}^{(Q)}.}$$

By Theorem 1,

$$\displaystyle{ \Vert T_{\mathcal{E}}(g)\Vert _{2} \leq \left (1 + \frac{1} {\sqrt{2}}\right )d\,\left (\sum _{\mathcal{E}}\frac{\vert \langle g,f_{1}^{(Q)}\rangle \vert ^{2}} {\vert Q\vert } \right )^{1/2} \leq \left (\left (1 + \frac{1} {\sqrt{2}}\right )d\right )^{2}\Vert g\Vert _{ 2} < \infty. }$$
(19)

If \(\{\mathcal{E}_{k}\}_{1}^{\infty }\) is a sequence of finite subsets that fills up \(\mathcal{D}\) then, for any m and n,

$$\displaystyle{\Vert T_{\mathcal{E}_{m}}(g) - T_{\mathcal{E}_{n}}(g)\Vert _{2} \leq \left (1 + \frac{1} {\sqrt{2}}\right )d\,\left (\sum _{\mathcal{E}_{m}\Delta \mathcal{E}_{n}}\frac{\vert \langle g,f_{1}^{(Q)}\rangle \vert ^{2}} {\vert Q\vert } \right )^{1/2};}$$

which, because of (19), goes to 0 as m and n go to infinity. (Apply dominated convergence to the sums over the symmetric differences \(\mathcal{E}_{m}\Delta \mathcal{E}_{n}\).) Therefore \(\{T_{\mathcal{E}_{k}}(g)\}_{k}\) is Cauchy in L 2(R d) and converges to an h with norm \(\leq ((1 + \frac{1} {\sqrt{2}})d)^{2}\Vert g\Vert _{2}\). The function h is unique because, if \(\{\mathcal{E}_{k}\}_{1}^{\infty }\) and \(\{\mathcal{E}'_{k}\}_{1}^{\infty }\) fill up \(\mathcal{D}\), so does \(\{\mathcal{E}_{1},\,\mathcal{E}'_{1},\,\mathcal{E}_{2},\,\mathcal{E}'_{2},\ldots \}\).

3 The Proofs of Theorems 2 and 3

As with Theorem 1, we will first work in one dimension, where we will sometimes call dyadic intervals I or J, and sometimes Q.

Both proofs make use of a simple fact whose proof can be found in [2] and [3].

Lemma 3

If \(\tilde{\mathcal{D}}\) denotes the family of concentric triples of dyadic intervals in R then \(\tilde{\mathcal{D}}\) can be decomposed into 3 disjoint families,

$$\displaystyle{\tilde{\mathcal{D}} = \cup _{1}^{3}\mathcal{G}_{ i},}$$

such that, for each 1 ≤ i ≤ 3: a) \(\forall I,\,J \in \mathcal{G}_{i}\) , either I ∩ J = ∅ or one is a subset of the other; b) every \(I \in \mathcal{G}_{i}\) is the right or left half of a \(J \in \mathcal{G}_{i}\) ; c) \(\forall I \in \mathcal{G}_{i}\) , I’s right and left halves belong to \(\mathcal{G}_{i}\) ; d) R is covered by the set of \(I \in \mathcal{G}_{i}\) of length 3; and therefore, for any k, R is covered by the set of \(I \in \mathcal{G}_{i}\) of length 3 ⋅ 2^k .

As an immediate corollary of Lemma 3, the set of concentric triples of dyadic cubes in R d (also denoted \(\tilde{\mathcal{D}}\)) can be split into 3^d disjoint families, each one having the analogous inclusion/exclusion and relative size properties as the set of dyadic cubes. The proof is trivial: for every \(\vec{a} = (a_{1},\,\ldots \,,a_{d}) \in \{ 1,2,3\}^{d}\), let \(\mathcal{G}_{\vec{a}}\) be the set of cubes \(Q =\prod _{1}^{d}I_{i}(Q)\) such that each \(I_{i}(Q) \in \mathcal{G}_{a_{i}}\).


The Proof of Theorem  2 If I is a dyadic interval, we use \(\tilde{I}\) to denote I’s concentric triple, and we define \(h^{(\tilde{I})}\) by

$$\displaystyle{h^{(\tilde{I})}(x) =\chi _{\tilde{ I}_{l}}(x) -\chi _{\tilde{I}_{r}}(x).}$$

Then \(h^{(\tilde{I})}/\vert \tilde{I}\vert ^{1/2}\) is the “Haar function” associated to \(\tilde{I}\). Because of Lemma 3, for each 1 ≤ i ≤ 3, \(\{h^{(\tilde{I})}/\vert \tilde{I}\vert ^{1/2}:\ \tilde{ I} \in \mathcal{G}_{i}\}\) forms a complete orthonormal basis for L 2(R).

For each 1 ≤ i ≤ 3 we let \(\mathcal{F}_{i}\) be the set of dyadic intervals Q such that \(\tilde{Q} \in \mathcal{G}_{i}\). We note that if \(Q \in \mathcal{F}_{i}\) and f (Q) ∈ NBV (Q) then \(\widetilde{f^{(Q)}} \in NBV (\tilde{Q})\), where \(\tilde{Q} \in \mathcal{G}_{i}\). We claim that if 1 ≤ i ≤ 3 and \(\{\,f^{(Q)}\}_{Q\in \mathcal{F}_{i}}\) is any family such that each f (Q) ∈ NBV (Q) then

$$\displaystyle{\left \{ \frac{\widetilde{f^{(Q)}}} {\vert Q\vert ^{1/2}}\right \}_{Q\in \mathcal{F}_{i}}}$$

is almost-orthogonal in L 2(R), with a constant less than or equal to an absolute C. The proof is easy. We only need to bound

$$\displaystyle{ \left \vert \left \langle \frac{\widetilde{f^{(Q)}}} {\vert Q\vert ^{1/2}}, \frac{h^{(\tilde{I})}} {\vert \tilde{I}\vert ^{1/2}}\right \rangle \right \vert }$$
(20)

for Q and I both lying in \(\mathcal{F}_{i}\). But we have already seen this sort of thing. If \(\tilde{Q} \cap \tilde{ I} =\emptyset\) or \(\tilde{Q}\) is strictly contained in \(\tilde{I}\) then the inner product is 0. Otherwise \(\tilde{I} \subset \tilde{ Q}\), with \(\vert \tilde{I}\vert = 2^{-j}\vert \tilde{Q}\vert\) for some j ≥ 0, and (20) is less than or equal to

$$\displaystyle{\left ( \frac{\vert \tilde{I}\vert } {\vert Q\vert }\right )^{1/2}V _{\widetilde{ f^{(Q)}}}(\tilde{I}) = 3^{1/2}2^{-j/2}V _{\widetilde{ f^{(Q)}}}(\tilde{I}).}$$

For every \(Q \in \mathcal{F}_{i}\) and j ≥ 0,

$$\displaystyle{\sum _{{ I\in \mathcal{F}_{i}:\ \tilde{I}\subset \tilde{Q} \atop \vert \tilde{I}\vert =2^{-j}\vert \tilde{Q}\vert } }\left \vert \left \langle \frac{\widetilde{f^{(Q)}}} {\vert Q\vert ^{1/2}}, \frac{h^{(\tilde{I})}} {\vert \tilde{I}\vert ^{1/2}}\right \rangle \right \vert }$$

is less than or equal to a constant times

$$\displaystyle{2^{-j/2}V _{\widetilde{ f^{(Q)}}}(\tilde{Q}) \leq 2^{-j/2},}$$

implying that, for every \(Q \in \mathcal{F}_{i}\),

$$\displaystyle{\sum _{I\in \mathcal{F}_{i}}\left \vert \left \langle \frac{\widetilde{f^{(Q)}}} {\vert Q\vert ^{1/2}}, \frac{h^{(\tilde{I})}} {\vert \tilde{I}\vert ^{1/2}}\right \rangle \right \vert \leq C(1 + \frac{1} {\sqrt{2}}) \leq C.}$$

Similarly, for every \(I \in \mathcal{F}_{i}\),

$$\displaystyle{\sum _{Q\in \mathcal{F}_{i}}\left \vert \left \langle \frac{\widetilde{f^{(Q)}}} {\vert Q\vert ^{1/2}}, \frac{h^{(\tilde{I})}} {\vert \tilde{I}\vert ^{1/2}}\right \rangle \right \vert \leq C(1 + \frac{1} {\sqrt{2}}) \leq C.}$$

Combining the two inequalities proves our claim.

For every \(\vec{a} \in \{ 1,2,3\}^{d}\), let \(\mathcal{F}_{\vec{a}}\) be the family of dyadic cubes Q such that \(\tilde{Q} \in \mathcal{G}_{\vec{a}}\). Fix an \(\vec{a} \in \{ 1,2,3\}^{d}\). If \(Q \in \mathcal{F}_{\vec{a}}\) then \(\widetilde{f^{(Q)}} \in NBV (\tilde{Q})\). We can now repeat the inductive argument from the proof of Theorem 1 to get that

$$\displaystyle{\left \{ \frac{\widetilde{f^{(Q)}}} {\vert Q\vert ^{1/2}}\right \}_{Q\in \mathcal{F}_{\vec{a}}}}$$

is almost-orthogonal in L 2(R d), with constant ≤ Cd, where C is the constant we get for d = 1. We get the same estimate for every \(\vec{a} \in \{ 1,2,3\}^{d}\), implying that

$$\displaystyle{\left \{ \frac{\widetilde{f^{(Q)}}} {\vert Q\vert ^{1/2}}\right \}_{Q\in \mathcal{D}}}$$

is almost-orthogonal in L 2(R d), with constant ≤ C 3^d d ≡ C(d). A repetition of the argument in the proof of Corollary 3 shows that, for any g ∈ L 2(R d),

$$\displaystyle{\sum _{Q\in \mathcal{D}}\frac{\langle g,\widetilde{f_{1}^{(Q)}}\rangle } {\vert Q\vert } \widetilde{f_{2}^{(Q)}}}$$

converges unconditionally in L 2 to define a bounded linear operator \(\widetilde{T}: L^{2} \rightarrow L^{2}\) with norm ≤ C(d)^2.

The Proof of Theorem  3 Write \(T(g) -\widetilde{ T}(g)\) as S 1(g) + S 2(g), where

$$\displaystyle\begin{array}{rcl} S_{1}(g)& =& \sum _{Q\in \mathcal{D}}\frac{\langle g,f_{1}^{(Q)} -\widetilde{ f_{1}^{(Q)}}\rangle } {\vert Q\vert } f_{2}^{(Q)} {}\\ S_{2}(g)& =& \sum _{Q\in \mathcal{D}}\frac{\langle g,\widetilde{f_{1}^{(Q)}}\rangle } {\vert Q\vert } \left (\,f_{2}^{(Q)} -\widetilde{ f_{ 2}^{(Q)}}\right ). {}\\ \end{array}$$

Because of (2) and Theorem 2, Theorem 3 will follow once we show that, for all finite linear sums

$$\displaystyle{\sum _{Q\in \mathcal{D}}\lambda _{Q}\left (\frac{f_{i}^{(Q)} -\widetilde{ f_{i}^{(Q)}}} {\vert Q\vert ^{1/2}} \right )}$$

(i = 1, 2), we have

$$\displaystyle{ \left \Vert \sum _{Q\in \mathcal{D}}\lambda _{Q}\left (\frac{f_{i}^{(Q)} -\widetilde{ f_{i}^{(Q)}}} {\vert Q\vert ^{1/2}} \right )\right \Vert _{2} \leq C\eta ^{1/2}\left (\sum _{ Q\in \mathcal{D}}\vert \lambda _{Q}\vert ^{2}\right )^{1/2} }$$
(21)

for a constant C only depending on d. Inequality (21) will follow from Theorem 2 and a technical, one-dimensional lemma (Lemma 4). We prove Lemma 4 first. We warn the reader that its proof requires an additional (fortunately very easy) lemma (Lemma 5).

Since the f i (Q)s’ subscripts are now irrelevant, we no longer write them.

Until otherwise stated, \(\mathcal{D}\), \(\tilde{\mathcal{D}}\), \(\mathcal{F}_{i}\), and \(\mathcal{G}_{i}\) refer to families of intervals.

Lemma 4

For each \(Q \in \mathcal{F}_{i}\) , let g (Q) : R → R have support contained in \(\overline{Q}\) and be of bounded variation, with total variation ≤ 1. (Note: we do not require that ∫g (Q)  dx = 0.) Let \(\{\delta (Q)\}_{Q\in \mathcal{D}}\) and \(\{\tau (Q)\}_{Q\in \mathcal{D}}\) be two sequences of real numbers indexed over \(\mathcal{D}\) , such that |1 −δ(Q)| + |τ(Q)| < η < 1∕2 for all \(Q \in \mathcal{D}\) . Define \(\widetilde{g^{(Q)}}(x) \equiv g^{(Q)}(\delta (Q)(x - x_{Q} +\ell (Q)\tau (Q)) + x_{Q})\) and, for each \(\tilde{I} \in \mathcal{G}_{i}\) , set

$$\displaystyle{a(Q,\tilde{I}) \equiv \left \langle \frac{g^{(Q)} -\delta (Q)\widetilde{g^{(Q)}}} {\vert Q\vert ^{1/2}}, \frac{h^{(\tilde{I})}} {\vert \tilde{I}\vert ^{1/2}}\right \rangle.}$$

There is an absolute C such that, for all \(Q \in \mathcal{F}_{i}\) and \(\tilde{I} \in \mathcal{G}_{i}\) ,

$$\displaystyle{ \sum _{\tilde{I}\in \mathcal{G}_{i}}\vert a(Q,\tilde{I})\vert }$$
(22)

and

$$\displaystyle{ \sum _{Q\in \mathcal{F}_{i}}\vert a(Q,\tilde{I})\vert }$$
(23)

are both bounded by Cη 1∕2 .

Proof of Lemma 4

If \(Q \in \mathcal{F}_{i}\) and \(\tilde{I} \in \mathcal{G}_{i}\) then

$$\displaystyle\begin{array}{rcl} & \delta (Q)\int \widetilde{g^{(Q)}}(x)\,h^{(\tilde{I})}(x)\,dx & {}\\ & =\delta (Q)\int g^{(Q)}\left (\delta (Q)(x - x_{Q} +\ell (Q)\tau (Q)) + x_{Q}\right )\,h^{(\tilde{I})}(x)\,dx & {}\\ & =\int g^{(Q)}(u)\,h^{(\tilde{I})}\left (\delta (Q)^{-1}(u - x_{Q} -\delta (Q)\ell(Q)\tau (Q)) + x_{Q}\right )\,du,& {}\\ \end{array}$$

after substituting u = δ(Q)(x − x Q + ℓ(Q)τ(Q)) + x Q . Therefore,

$$\displaystyle{ \int \left (g^{(Q)}(x) -\delta (Q)\widetilde{g^{(Q)}}(x)\right )\,h^{(\tilde{I})}(x)\,dx =\int g^{(Q)}(x)\,\gamma ^{(\tilde{I})}(x)\,dx, }$$
(24)

where \(\gamma ^{(\tilde{I})}(x)\) equals

$$\displaystyle{ h^{(\tilde{I})}(x) - h^{(\tilde{I})}\left (\delta (Q)^{-1}(x - x_{ Q} -\delta (Q)\ell(Q)\tau (Q)) + x_{Q}\right ). }$$
(25)

We note a fact which will be important soon. Although we do not assume that ∫ g (Q)dx = 0, we do have \(\int \left (g^{(Q)}(x) -\delta (Q)\widetilde{g^{(Q)}}(x)\right )\,dx = 0\) (by the substitution above, \(\delta (Q)\int \widetilde{g^{(Q)}}\,dx =\int g^{(Q)}\,dx\)), ensuring that (24) equals 0 if \(\tilde{I}\not\subset \tilde{Q}\): if \(\tilde{I}\not\subset \tilde{Q}\) and \(\tilde{I} \cap \tilde{ Q}\not =\emptyset\), the support of \(g^{(Q)} -\delta (Q)\widetilde{g^{(Q)}}\) is entirely contained in either the right or the left half of \(\tilde{I}\), across which \(h^{(\tilde{I})}\) is constant.

The key to the proof of Lemma 4 is a good estimate for the right-hand side of (24), which follows from Lemma 1 and a bound on \(\Vert \gamma ^{(\tilde{I})}\Vert _{1}\). For the latter we need the simple lemma mentioned above.

Lemma 5

If I is a bounded interval, with endpoints a < b, and I′ is another bounded interval, with endpoints a′ < b′, then

$$\displaystyle{ \int \vert \chi _{I}(x) -\chi _{I'}(x)\vert \,dx \leq \vert a - a'\vert +\vert b - b'\vert. }$$
(26)

Proof of Lemma 5

Assume that b − a ≤ b′ − a′. If b ≤ a′ the left-hand side of (26) is b − a + b′ − a′, while | a − a′ | ≥ b − a and | b − b′ | ≥ b′ − a′. If a ≤ a′ < b the left-hand side of (26) is exactly a′ − a + b′ − b (because b ≤ b′), and if a′ < a < b ≤ b′ it is a − a′ + b′ − b. The other cases follow from symmetry.

We continue the proof of Lemma 4. Recall that \(h^{(\tilde{I})}\) has the form χ [a, b) − χ [a′, b′), where \([a,b) =\tilde{ I}_{l}\) and \([a',b') =\tilde{ I}_{r}\). If ϕ: R → R is a strictly increasing bijection then

$$\displaystyle{\chi _{[\alpha,\beta )}(\phi (x)) =\chi _{[\phi ^{-1}(\alpha ),\phi ^{-1}(\beta ))}(x).}$$

Set \(\phi (x) =\delta (Q)^{-1}(x - x_{Q} -\delta (Q)\ell (Q)\tau (Q)) + x_{Q}\). Then \(\phi ^{-1}(x) =\delta (Q)(x - x_{Q} +\ell (Q)\tau (Q)) + x_{Q}\). For ease of reading we will refer to ϕ −1 as ψ. We can write

$$\displaystyle\begin{array}{rcl} & & h^{(\tilde{I})}\left (\delta (Q)^{-1}(x - x_{ Q} -\delta (Q)\ell(Q)\tau (Q)) + x_{Q}\right ) {}\\ & & =\chi _{[\psi (a),\psi (b))}(x) -\chi _{[\psi (a'),\psi (b'))}(x), {}\\ \end{array}$$

and therefore the L 1 norm of

$$\displaystyle{\gamma ^{(\tilde{I})}(x) \equiv h^{(\tilde{I})}(x) - h^{(\tilde{I})}\left (\delta (Q)^{-1}(x - x_{ Q} -\delta (Q)\ell(Q)\tau (Q)) + x_{Q}\right )}$$

is less than or equal to

$$\displaystyle{ \vert a -\psi (a)\vert +\vert b -\psi (b)\vert +\vert a' -\psi (a')\vert +\vert b' -\psi (b')\vert. }$$
(27)

A quick calculation yields

$$\displaystyle{ a -\psi (a) = (a - x_{Q})(1 -\delta (Q)) -\delta (Q)\ell(Q)\tau (Q), }$$
(28)

with similar expressions for the other terms.

We recall that \(Q \in \mathcal{F}_{i}\), \(\tilde{I} \in \mathcal{G}_{i}\), and that the inner product (24) is zero unless \(\tilde{I} \subset \tilde{ Q}\); thus, for the only cases of interest, \(\ell(\tilde{I}) = 2^{-j}\ell(\tilde{Q})\) for some j ≥ 0. Given 0 < η < 1∕2, let N be the unique natural number such that η ∈ [2^{−N−1}, 2^{−N}). For such \(\tilde{I}\), the absolute value of (28)—and thus \(\Vert \gamma ^{(\tilde{I})}\Vert _{1}\)—is less than or equal to a constant times 2^{−N}ℓ(Q).
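
Indeed, since \(\tilde{I} \subset \tilde{ Q}\), the four endpoints appearing in (27) all lie in \(\overline{\tilde{Q}}\) and hence within 3ℓ(Q) of x Q , so, for the endpoint a say,

$$\displaystyle{\vert a -\psi (a)\vert \leq \vert a - x_{Q}\vert \,\vert 1 -\delta (Q)\vert +\delta (Q)\ell (Q)\vert \tau (Q)\vert \leq 3\ell (Q)\eta + \tfrac{3} {2}\ell (Q)\eta \leq \tfrac{9} {2}\,\ell (Q)\,2^{-N},}$$

with the same bound for the other three terms.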

We will give two bounds on the absolute value of (24), depending on whether j ≤ N or j > N. We only use (28) for the j ≤ N estimate.

If j ≤ N (so that \(\tilde{I}\) is not too small compared to \(\tilde{Q}\)), then the absolute value of (24) is less than or equal to a constant times \(2^{-N}\ell(\tilde{Q})V _{g^{(Q)}}(\overline{\tilde{I}})\).

If j > N (meaning that \(\tilde{I}\) is very small compared to \(\tilde{Q}\)) then the absolute value of (24) is less than or equal to

$$\displaystyle{\left (V _{g^{(Q)}}(\overline{\tilde{I}}) + V _{\widetilde{g^{(Q)}}}(\overline{\tilde{I}})\right )\Vert h^{(\tilde{I})}\Vert _{ 1},}$$

which is the same as

$$\displaystyle{ 2^{-j}\ell(\tilde{Q})\left (V _{ g^{(Q)}}(\overline{\tilde{I}}) + V _{\widetilde{g^{(Q)}}}(\overline{\tilde{I}})\right ). }$$
(29)

Of course, what we need to bound is not the absolute value of (24), but the same divided by \(\vert Q\vert ^{1/2}\vert \tilde{I}\vert ^{1/2} \sim 2^{-j/2}\ell(Q)\). (Recall that we are still working in d = 1.) If j ≤ N, the quotient is less than or equal to a constant times \(2^{j/2}2^{-N}V _{g^{(Q)}}(\overline{\tilde{I}})\). If j > N the corresponding estimate is \(2^{-j/2}\left (V _{g^{(Q)}}(\overline{\tilde{I}}) + V _{\delta (Q)\widetilde{g^{(Q)}}}(\overline{\tilde{I}})\right )\). Therefore, if \(Q \in \mathcal{F}_{i}\), \(\tilde{I} \in \mathcal{G}_{i}\), \(\tilde{I} \subset \tilde{ Q}\), and \(\ell(\tilde{I}) = 2^{-j}\ell(\tilde{Q})\), then

$$\displaystyle{\vert a(Q,\tilde{I})\vert \leq \left \{\begin{array}{@{}l@{\quad }l@{}} C2^{j/2}2^{-N}V _{g^{(Q)}}(\overline{\tilde{I}}) \quad &\text{if}\ j \leq N; \\ C2^{-j/2}\left (V _{g^{(Q)}}(\overline{\tilde{I}}) + V _{\delta (Q)\widetilde{g^{(Q)}}}(\overline{\tilde{I}})\right )\quad &\text{if}\ j > N;\\ \quad \end{array} \right.}$$

while \(a(Q,\tilde{I}) = 0\) if \(\tilde{I}\not\subset \tilde{Q}\).

We now estimate (22)

$$\displaystyle{\sum _{\tilde{I}}\vert a(Q,\tilde{I})\vert =\sum _{\tilde{I}:\ \tilde{I}\subset \tilde{Q}}\vert a(Q,\tilde{I})\vert }$$

and (23)

$$\displaystyle{\sum _{Q}\vert a(Q,\tilde{I})\vert =\sum _{Q:\ \tilde{I}\subset \tilde{Q}}\vert a(Q,\tilde{I})\vert.}$$

Estimate of (22):

$$\displaystyle\begin{array}{rcl} \sum _{\tilde{I}:\ \tilde{I}\subset \tilde{Q}}\vert a(Q,\tilde{I})\vert & \leq & C\sum _{j\geq 0}\sum _{\tilde{I}:\ell(\tilde{I})=2^{-j}\ell(\tilde{Q})}\vert a(Q,\tilde{I})\vert {}\\ & =& C2^{-N}\sum _{ 0\leq j\leq N}2^{j/2}\sum _{ \tilde{I}:\ell(\tilde{I})=2^{-j}\ell(\tilde{Q})}V _{g^{(Q)}}(\overline{\tilde{I}}) {}\\ & +& C\sum _{j>N}2^{-j/2}\sum _{ \tilde{I}:\ell(\tilde{I})=2^{-j}\ell(\tilde{Q})}\left (V _{g^{(Q)}}(\overline{\tilde{I}}) + V _{\delta (Q)\widetilde{g^{(Q)}}}(\overline{\tilde{I}})\right ) {}\\ & =& (I) + (II), {}\\ \end{array}$$

where

$$\displaystyle\begin{array}{rcl} (I)& =& C2^{-N}\sum _{ 0\leq j\leq N}2^{j/2}\sum _{ \tilde{I}:\ell(\tilde{I})=2^{-j}\ell(\tilde{Q})}V _{g^{(Q)}}(\overline{\tilde{I}}) {}\\ (II)& =& C\sum _{j>N}2^{-j/2}\sum _{ \tilde{I}:\ell(\tilde{I})=2^{-j}\ell(\tilde{Q})}\left (V _{g^{(Q)}}(\overline{\tilde{I}}) + V _{\delta (Q)\widetilde{g^{(Q)}}}(\overline{\tilde{I}})\right ). {}\\ \end{array}$$

For each Q and j ≥ 0,

$$\displaystyle{\sum _{\tilde{I}:\ell(\tilde{I})=2^{-j}\ell(\tilde{Q})}V _{g^{(Q)}}(\overline{\tilde{I}}) \leq V _{g^{(Q)}}(\overline{\tilde{Q}}) \leq 1}$$

and

$$\displaystyle{\sum _{\tilde{I}:\ell(\tilde{I})=2^{-j}\ell(\tilde{Q})}\left (V _{g^{(Q)}}(\overline{\tilde{I}}) + V _{\delta (Q)\widetilde{g^{(Q)}}}(\overline{\tilde{I}})\right ) \leq V _{g^{(Q)}}(\overline{\tilde{Q}}) + V _{\delta (Q)\widetilde{g^{(Q)}}}(\overline{\tilde{Q}}) \leq 5/2,}$$

because the change of variable does not affect the total variation and | δ(Q) | ≤ 3∕2. Therefore

$$\displaystyle\begin{array}{rcl} (I)& \leq & C2^{-N}\sum _{ 0\leq j\leq N}2^{j/2} \leq C2^{-N/2} {}\\ (II)& \leq & C\sum _{j>N}2^{-j/2} \leq C2^{-N/2}, {}\\ \end{array}$$

implying that \(\sum _{\tilde{I}:\ \tilde{I}\subset \tilde{Q}}\vert a(Q,\tilde{I})\vert \leq C2^{-N/2}\).

Estimate of (23): This is like the estimate of (22), but simpler, because, for each j ≥ 0 and \(\tilde{I}\), there is only one \(\tilde{Q}\) such that \(\tilde{I} \subset \tilde{ Q}\) and \(\ell(\tilde{I}) = 2^{-j}\ell(\tilde{Q})\). We get the same estimate: ≤ C2^{−N∕2}.

That proves Lemma 4, since η ∼ 2^{−N}.

The Schur Test and Lemma 4 imply that if \(\{g^{(Q)}\}_{Q\in \mathcal{D}}\) and \(\{\widetilde{g^{(Q)}}\}_{Q\in \mathcal{D}}\) are two families as given in Lemma 4’s hypotheses, then, for any finite linear sum

$$\displaystyle{\sum _{Q\in \mathcal{D}}\lambda _{Q}\left (\frac{g^{(Q)} -\delta (Q)\widetilde{g^{(Q)}}} {\vert Q\vert ^{1/2}} \right ),}$$

we have

$$\displaystyle{ \left \Vert \sum _{Q\in \mathcal{D}}\lambda _{Q}\left (\frac{g^{(Q)} -\delta (Q)\widetilde{g^{(Q)}}} {\vert Q\vert ^{1/2}} \right )\right \Vert _{2} \leq C\eta ^{1/2}\left (\sum _{ Q\in \mathcal{D}}\vert \lambda _{Q}\vert ^{2}\right )^{1/2}, }$$
(30)

with C an absolute constant. This implies Theorem 3 when d = 1: we can write

$$\displaystyle{f^{(Q)} -\widetilde{ f^{(Q)}} = \left (\,f^{(Q)} -\delta (Q)\widetilde{f^{(Q)}}\right ) + (\delta (Q) - 1)\widetilde{f^{(Q)}}.}$$

Lemma 4 implies that

$$\displaystyle{ \left \Vert \sum _{Q\in \mathcal{D}}\lambda _{Q}\left (\frac{f^{(Q)} -\delta (Q)\widetilde{f^{(Q)}}} {\vert Q\vert ^{1/2}} \right )\right \Vert _{2} \leq C\eta ^{1/2}\left (\sum _{ Q\in \mathcal{D}}\vert \lambda _{Q}\vert ^{2}\right )^{1/2}; }$$
(31)

while, by Theorem 2,

$$\displaystyle{ \left \Vert \sum _{Q\in \mathcal{D}}\lambda _{Q}\left ( \frac{\widetilde{f^{(Q)}}} {\vert Q\vert ^{1/2}}\right )\right \Vert _{2} \leq C\left (\sum _{Q\in \mathcal{D}}\vert \lambda _{Q}\vert ^{2}\right )^{1/2}. }$$
(32)

Since | 1 −δ(Q) | ≤ η ≤ η 1∕2 for all Q, we get (21) and thus Theorem 3 in one dimension.

We now prove (21) for general d. From here on we work in R d: \(\mathcal{F}_{\vec{a}}\), \(\mathcal{G}_{\vec{a}}\), \(\mathcal{D}\), and \(\tilde{\mathcal{D}}\) are families of cubes; \(\vec{\delta }(Q)\) and \(\vec{\tau }(Q)\) are vectors.

Fix \(\vec{a} \in \{ 1,2,3\}^{d}\). For \(Q \in \mathcal{F}_{\vec{a}}\) we write \(\vec{\delta }(Q)\) as (δ 1(Q), δ 2(Q), … , δ d (Q)) and \(\vec{\tau }(Q)\) as (τ 1(Q), τ 2(Q), … , τ d (Q)). Associated to each \(\vec{\delta }(Q)\) and \(\vec{\tau }(Q)\) will be two finite sequences of vectors \(\{\tilde{\delta }_{j}(Q)\}_{0}^{d}\) and \(\{\tilde{\tau }_{j}(Q)\}_{0}^{d}\), defined by

$$\displaystyle\begin{array}{rcl} \tilde{\delta }_{0}(Q)& \equiv & \vec{1} {}\\ \tilde{\delta }_{1}(Q)& \equiv & (\delta _{1}(Q),1,1,\ldots,1) {}\\ \tilde{\delta }_{2}(Q)& \equiv & (\delta _{1}(Q),\delta _{2}(Q),1,1,\ldots,1) {}\\ \tilde{\delta }_{3}(Q)& \equiv & (\delta _{1}(Q),\delta _{2}(Q),\delta _{3}(Q),1,1,\ldots,1) {}\\ & \ldots & {}\\ \tilde{\delta }_{d}(Q)& =& \delta (Q) {}\\ \end{array}$$

and

$$\displaystyle\begin{array}{rcl} \tilde{\tau }_{0}(Q)& =& 0 {}\\ \tilde{\tau }_{1}(Q)& =& (\tau _{1}(Q),0,0,\ldots,0) {}\\ \tilde{\tau }_{2}(Q)& =& (\tau _{1}(Q),\tau _{2}(Q),0,0,\ldots,0) {}\\ \tilde{\tau }_{3}(Q)& =& (\tau _{1}(Q),\tau _{2}(Q),\tau _{3}(Q),0,0,\ldots,0) {}\\ & \ldots & {}\\ \tilde{\tau }_{d}(Q)& =& \tau (Q). {}\\ \end{array}$$

In other words, considered as a dilation operator, \(\tilde{\delta }_{0}(Q)\) starts as the identity, and then, as j advances, morphs—one variable at a time—into \(\vec{\delta }(Q)\); while \(\tilde{\tau }_{j}(Q)\) similarly morphs from the identity into \(\vec{\tau }(Q)\), but now considered as a sequence of translation operators. Keep in mind that δ j (Q) and τ j (Q) are numbers (components of the vectors \(\vec{\delta }(Q)\) and \(\vec{\tau }(Q)\) ) while \(\tilde{\delta }_{j}(Q)\) and \(\tilde{\tau }_{j}(Q)\) are vectors.

Define ζ 0 (Q)(x) ≡ f (Q)(x) and, for 1 ≤ k ≤ d,

$$\displaystyle{\zeta _{k}^{(Q)}(x) = \left (\prod _{ 1}^{k}\delta _{ j}(Q)\right )f^{(Q)}(\tilde{\delta }_{ k}(Q)(x - x_{Q} +\ell (Q)\tilde{\tau }_{k}(Q)) + x_{Q}).}$$

After noticing that \(\zeta _{d}^{(Q)}(x) = \left (\prod _{1}^{d}\delta _{k}(Q)\right )\widetilde{f^{(Q)}}(x)\), we write

$$\displaystyle\begin{array}{rcl} f^{(Q)}(x) -\widetilde{ f^{(Q)}}(x)& =& f^{(Q)}(x) -\left (\prod _{ 1}^{d}\delta _{ k}(Q)\right )\widetilde{f^{(Q)}}(x) + \left (\left (\prod _{ 1}^{d}\delta _{ k}(Q)\right ) - 1\right )\widetilde{f^{(Q)}}(x) {}\\ & =& \zeta _{0}^{(Q)}(x) -\zeta _{ d}^{(Q)}(x) + \left (\left (\prod _{ 1}^{d}\delta _{ k}(Q)\right ) - 1\right )\widetilde{f^{(Q)}}(x) {}\\ & =& \left [\sum _{k=1}^{d}\left (\zeta _{ k-1}^{(Q)}(x) -\zeta _{ k}^{(Q)}(x)\right )\right ] + \left [\left (\left (\prod _{ 1}^{d}\delta _{ k}(Q)\right ) - 1\right )\widetilde{f^{(Q)}}(x)\right ] {}\\ & \equiv & \left [I\right ] + \left [II\right ]. {}\\ \end{array}$$

The term [II] is no problem, because

$$\displaystyle{\left \vert \left (\left (\prod _{1}^{d}\delta _{ k}(Q)\right ) - 1\right )\right \vert \leq C(d)\eta }$$

and Theorem 2 controls the almost-orthogonal “norm” of \(\{\widetilde{\,f^{(Q)}}/\vert Q\vert ^{1/2}\}_{\mathcal{F}_{\vec{a}}}\).

To see what is going on with [I], we look at the first term in the sum,

$$\displaystyle{ \zeta _{0}^{(Q)}(x) -\zeta _{ 1}^{(Q)}(x) = f^{(Q)}(x) -\delta _{ 1}(Q)f^{(Q)}(\tilde{\delta }_{ 1}(Q)(x - x_{Q} +\ell (Q)\tilde{\tau }_{1}(Q)) + x_{Q}). }$$
(33)

Write x = (x 1, x 2, … , x d ) as (x 1, x ∗), where x 1 ∈ R and x ∗ ∈ R d−1. For fixed x ∗, (33) is

$$\displaystyle{ f^{(Q)}(x_{ 1},x^{{\ast}}) -\delta _{ 1}(Q)f^{(Q)}(\delta _{ 1}(Q)(x_{1} - (x_{Q})_{1} +\ell (Q)\tau _{1}(Q)) + (x_{Q})_{1},x^{{\ast}}) }$$
(34)

(note the absence of tildes), because the (respective) dilation and translation operators \(\tilde{\delta }_{1}(Q)\) and \(\tilde{\tau }_{1}(Q)\) do not affect the x ∗ components at all.

To ease reading we refer to (34) as ω (Q)(x).

For \(\tilde{Q} \in \mathcal{G}_{\vec{a}}\), write \(\tilde{Q} = I_{1}(\tilde{Q}) \times K(\tilde{Q})\), as in the statement of Lemma 2. Then

$$\displaystyle{\omega ^{(Q)}(x_{ 1},x^{{\ast}}) =\omega ^{(Q)}(x_{ 1},x^{{\ast}})\chi _{ I_{1}(\tilde{Q})}(x_{1})\chi _{K(\tilde{Q})}(x^{{\ast}})}$$

and, for every fixed x  ∈ R d−1 and every finite linear sum

$$\displaystyle{\sum _{Q\in \mathcal{F}_{\vec{a}}}\lambda _{Q}\left ( \frac{\omega ^{(Q)}} {\vert Q\vert ^{1/2}}\right ) =\sum _{Q\in \mathcal{F}_{\vec{a}}}\lambda _{Q}\left (\frac{\omega ^{(Q)}(x_{1},x^{{\ast}})\chi _{I_{1}(\tilde{Q})}(x_{1})\chi _{K(\tilde{Q})}(x^{{\ast}})} {\vert Q\vert ^{1/2}} \right ),}$$

we have, by the one-dimensional version of Theorem 3,

$$\displaystyle{ \int _{\mathbf{R}}\left \vert \sum _{Q\in \mathcal{F}_{\vec{a}}}\lambda _{Q}\left (\frac{\omega ^{(Q)}(x_{1},x^{{\ast}})} {\vert Q\vert ^{1/2}} \right )\right \vert ^{2}\,dx_{ 1} \leq C\eta \sum _{Q\in \mathcal{F}_{\vec{a}}}\vert \lambda _{Q}\vert ^{2}\vert K(\tilde{Q})\vert ^{-1}\chi _{ K(\tilde{Q})}(x^{{\ast}}). }$$
(35)

Here we are arguing just as we did in estimating (16), but incorporating the ‘ ≤ C η’ bound we have from the one-dimensional Theorem 3 (see (30)–(32)). We get η this time, and not η 1∕2, because we are not taking the square root of the integral. When we integrate (35) in x ∗ we get, for every \(\vec{a} \in \{ 1,2,3\}^{d}\),

$$\displaystyle\begin{array}{rcl} \int _{\mathbf{R}^{d}}\left \vert \sum _{Q\in \mathcal{F}_{\vec{a}}}\lambda _{Q}\left (\frac{\omega ^{(Q)}(x)} {\vert Q\vert ^{1/2}} \right )\right \vert ^{2}\,dx& =& \int _{\mathbf{ R}\times \mathbf{R}^{d-1}}\left \vert \sum _{Q\in \mathcal{F}_{\vec{a}}}\lambda _{Q}\left (\frac{\omega ^{(Q)}(x_{1},x^{{\ast}})} {\vert Q\vert ^{1/2}} \right )\right \vert ^{2}\,dx_{ 1}\,dx^{{\ast}} {}\\ &\leq & C\eta \sum _{Q\in \mathcal{F}_{\vec{a}}}\vert \lambda _{Q}\vert ^{2}. {}\\ \end{array}$$

The other summands in [I] are handled in a similar fashion, successively treating the variables x 2, … , x d as we did x 1. For example, ζ 1 (Q)(x) −ζ 2 (Q)(x) equals δ 1(Q) times

$$\displaystyle{f^{(Q)}(\tilde{\delta }_{ 1}(Q)(x-x_{Q}+\ell(Q)\tilde{\tau }_{1}(Q))+x_{Q})-\delta _{2}(Q)f^{(Q)}(\tilde{\delta }_{ 2}(Q)(x-x_{Q}+\ell(Q)\tilde{\tau }_{2}(Q))+x_{Q}),}$$

where the functions’ two arguments, respectively

$$\displaystyle{ \tilde{\delta }_{1}(Q)(x - x_{Q} +\ell (Q)\tilde{\tau }_{1}(Q)) + x_{Q} }$$
(36)

and

$$\displaystyle{ \tilde{\delta }_{2}(Q)(x - x_{Q} +\ell (Q)\tilde{\tau }_{2}(Q)) + x_{Q}, }$$
(37)

differ only in their second components. The second component of (36) is x 2, and that of (37) is

$$\displaystyle{\delta _{2}(Q)(x_{2} - (x_{Q})_{2} +\ell (Q)\tau _{2}(Q)) + (x_{Q})_{2}.}$$

But their first components both equal

$$\displaystyle{\delta _{1}(Q)(x_{1} - (x_{Q})_{1} +\ell (Q)\tau _{1}(Q)) + (x_{Q})_{1};}$$

and, for 3 ≤ k ≤ d, each kth component for both functions equals x k .

If we now define, more or less as before,

$$\displaystyle{\omega ^{(Q)}(x) \equiv \zeta _{ 1}^{(Q)}(x) -\zeta _{ 2}^{(Q)}(x),}$$

then the preceding argument applies virtually verbatim to yield

$$\displaystyle{\int _{\mathbf{R}^{d}}\left \vert \sum _{Q\in \mathcal{F}_{\vec{a}}}\lambda _{Q}\left (\frac{\omega ^{(Q)}(x)} {\vert Q\vert ^{1/2}} \right )\right \vert ^{2}\,dx \leq C\eta \sum _{ Q\in \mathcal{F}_{\vec{a}}}\vert \lambda _{Q}\vert ^{2}}$$

for every \(\vec{a} \in \{ 1,2,3\}^{d}\). (Recall that δ 1(Q) is essentially 1.) The same argument applies to the other summands ζ k−1 (Q) − ζ k (Q) for 3 ≤ k ≤ d to yield the same estimates. When we add up over all k and all \(\vec{a} \in \{ 1,2,3\}^{d}\), and include the term [II], we get

$$\displaystyle{\left \Vert \sum _{Q\in \mathcal{D}}\lambda _{Q}\left (\frac{f^{(Q)} -\widetilde{ f^{(Q)}}} {\vert Q\vert ^{1/2}} \right )\right \Vert _{2} \leq C\eta ^{1/2}\left (\sum _{ Q\in \mathcal{D}}\vert \lambda _{Q}\vert ^{2}\right )^{1/2}}$$

for all finite linear sums,

$$\displaystyle{\sum _{Q\in \mathcal{D}}\lambda _{Q}\left (\frac{f^{(Q)} -\widetilde{ f^{(Q)}}} {\vert Q\vert ^{1/2}} \right ),}$$

where C depends on d. That’s (21). Theorem 3 is proved.