1 Background

In 1972, Derridj and Zuily [4] proved \(G^s\) hypoellipticity (\(Pu \in G^s \implies u \in G^s\)) for

$$\begin{aligned} P = \sum _{j=1}^m X_j^2 + X_0 + c \end{aligned}$$

satisfying

$$\begin{aligned} \Vert v\Vert _{\varepsilon }^2 \le C\left( |(Pv, v)| + \Vert v\Vert _0^2\right) \qquad \forall v\in C_0^\infty \end{aligned}$$
(1.1)

whenever \(s>1/\varepsilon =q/p\) with \(p,q \in \mathbb {N}^+.\) Very recently, for P with \(G^k\) coefficients, \(k\in \mathbb {N}^+,\) by studying Gevrey vectors for such operators (see below), Derridj was able to sharpen this result to include \(s=1/\varepsilon =q/p,\) but still with rational \(\varepsilon \) (announced in [3] and proven in [2]).

Consider a linear partial differential operator P of order 2 with real analytic coefficients. An analytic vector for P is a distribution u that behaves analytically when differentiated by powers of P alone: locally, \(\Vert P^ju\Vert \le C^{j+1}(2j)!.\) That is, not all derivatives of u are assumed to behave as though u were analytic, only those combinations of derivatives occurring together precisely as in P.

Similarly, a Gevrey-s vector u for P (with P now only assumed to have Gevrey-s coefficients) satisfies (locally) \(\Vert P^ju\Vert \le C^{j+1}(2j)!^s,\) or more precisely,

$$\begin{aligned} \forall K\Subset \Omega _0, \;\exists C_K: \Vert P^j u\Vert _{L^2(K)}\le C_K^{j+1} (2j)!^s, \;\forall j. \end{aligned}$$
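
A simple illustration (ours, for orientation): in one variable, with \(P = -d^2/dx^2,\) one has \(\Vert P^j u\Vert = \Vert u^{(2j)}\Vert ,\) so the Gevrey-s vector condition reads

$$\begin{aligned} \Vert u^{(2j)}\Vert _{L^2(K)} \le C_K^{j+1}(2j)!^s, \quad \forall j, \end{aligned}$$

with no a priori information about the odd-order derivatives; the content of the results discussed here is that full Gevrey control of all derivatives nevertheless follows (this is the elliptic case \(\varepsilon = 1\)).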

Derridj proved that Gevrey-s vectors for P under (1.1) belong to \(G^{s/\varepsilon }\) (for \(s>1/\varepsilon \) if \(\varepsilon \) is rational) and, to accomplish this, followed the classical method of adding a variable and showing (local) Gevrey hypoellipticity in \(G^{1,s/\varepsilon }_{t,x}\) for the operator

$$\begin{aligned} Q=-D_t^{2}-P. \end{aligned}$$
(1.2)

This yields the result since the (convergent) homogeneous solution

$$\begin{aligned} U(t,x) = \sum _{\ell \ge 0} (-1)^{\ell }\frac{t^{2\ell }}{(2\ell )!} P^\ell u(x) \end{aligned}$$

for Q is just equal to u(x) when \(t=0.\)
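
Indeed, differentiating the series term by term,

$$\begin{aligned} \partial _t^2 U = \sum _{\ell \ge 1}(-1)^{\ell }\frac{t^{2\ell -2}}{(2\ell -2)!}P^{\ell }u = -\sum _{\ell \ge 0}(-1)^{\ell }\frac{t^{2\ell }}{(2\ell )!}P^{\ell +1}u = -PU, \end{aligned}$$

so that \(QU = -\partial _t^2 U - PU = PU - PU = 0,\) while \(U(0,x)=u(x).\)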

Slightly earlier, Rodrigues et al. [7] had obtained a (global) result on a torus for a restricted subclass of such operators P.

The methods we use also apply to prove the anisotropic hypoellipticity for (1.2) even for non-rational \(\varepsilon .\)

Note that \(G^s\) functions are always Gevrey-s vectors for any P.

2 General considerations

There are two main results in this paper: first, the subellipticity index \(\varepsilon \) need no longer be rational, and second, we are able to let s equal \(1/\varepsilon .\) From a technical point of view, the proof is no harder for Gevrey-k coefficients than for analytic coefficients, so we take the vector fields to have analytic coefficients.

And from a more personal point of view, in reading Derridj’s preprint [3] we could not find a reason why the result should not follow from the direct lines we have established over many decades, lines which in fact avoid the need to add a variable and deal with (1.2), despite the historical significance of that approach, which in some sense handles iterates of P in a less obvious way.

In the elliptic case (\(\varepsilon =1\) in (2.1) just below), we recover the celebrated Kotake-Narasimhan theorem [6].

The only hypothesis, aside from Gevrey smoothness of the coefficients of P near \(\overline{\Omega _0}\), is that for some real \(0<\varepsilon <1,\)

$$\begin{aligned} \Vert v\Vert _{\varepsilon }^2 \;\left( +\sum _1^m \Vert X_j v\Vert _{L^2}^2\right) \le C\{|(Pv, v)_{L^2}| + \Vert v\Vert _{L^2}^2\}, \quad \forall v\in C_0^\infty (\Omega _0). \end{aligned}$$
(2.1)

3 Smoothness

From the basic a priori estimate (2.1) and those that will follow from it, we have \(u\in C^\infty \): from \(Pu \in L^2_{loc}\) it will follow from (2.1) that \(u\in H^\varepsilon _{loc}.\) From our estimate (4.2) below (for \(\Vert u\Vert _{2\varepsilon }^2\)), it will follow that \(Pu \in H^\varepsilon _{loc}\) (since \(P^2u\in L^2_{loc}\)), and hence that \(u\in H^{2\varepsilon }_{loc};\) similarly, from \(P^n u \in L^2_{loc},\) that \(P^{n-1} u \in H^\varepsilon _{loc},\) ..., and finally that \(u\in H^{(n+1)\varepsilon }_{loc}\) (for all n, and hence \(u \in C^\infty \)).

We will henceforth assume that u is smooth.

And furthermore, there is no difference in the proof if one assumes that the coefficients of P are real analytic functions rather than merely Gevrey functions; thus we will not mention the smoothness of the coefficients again.

4 Estimates

Unless otherwise specified, norms and inner products are in \(L^2.\) We use fractional powers \(\Lambda ^\mu \) of \((1-\Delta )^{1/2},\) defined by

$$\begin{aligned} \widehat{\Lambda ^{\mu } w} (\xi ) = (1+|\xi |^2)^{\mu /2}\hat{w}(\xi ). \end{aligned}$$

In order to obtain estimates at higher and higher levels, we want to replace v by \(\varphi (x)\Lambda ^{\varepsilon }v\) above, with \(\varphi \in C_0^\infty (K^{0}),\) \(\varphi \equiv 1\) on \(K',\) so that we are inserting suitably supported functions into the norm. We denote by ‘(AB)’ both terms, with AB and with BA (i.e., the order of A and B is unspecified), so that \(\Vert (X_j \varphi \Lambda ^\varepsilon ) v\Vert _{L^2}^2 (=\Vert (X_j (\varphi \Lambda ^\varepsilon )) v\Vert _{L^2}^2)\) is shorthand for \(\Vert X_j \varphi \Lambda ^\varepsilon v\Vert _{L^2}^2 + \Vert \varphi \Lambda ^\varepsilon X_j v\Vert _{L^2}^2.\) We also will not need to distinguish the various \(X_j\) or explicitly sum over them:

$$\begin{aligned} \Vert \varphi \Lambda ^\varepsilon v\Vert _{\varepsilon }^2 + \Vert (X \varphi \Lambda ^\varepsilon ) v\Vert _{L^2}^2 \le C_0\{ |(P\varphi \Lambda ^{\varepsilon }v, \varphi \Lambda ^{\varepsilon }v)_{L^2}| +\Vert [X, \varphi \Lambda ^\varepsilon ]v\Vert _{L^2}^2\}. \end{aligned}$$
(4.1)

Finally, a right hand side with \(C_0\) in front will be taken to mean that there may be a uniformly controlled ‘junk’ term on the right, in this case \(\Vert \varphi \Lambda ^\varepsilon v\Vert _{L^2}^2\) from (2.1). The constant \(C_0\) may take various, but finitely many, values, independent of \(\varepsilon .\)

Thus, keeping both norms and inner products for the moment,

$$\begin{aligned}&\Vert \varphi \Lambda ^\varepsilon v\Vert _{\varepsilon }^2 +\Vert (X \varphi \Lambda ^\varepsilon ) v\Vert _{L^2}^2\nonumber \\&\quad \le C_0\{|(\varphi \Lambda ^{\varepsilon }Pv, \varphi \Lambda ^{\varepsilon }v)| + |([P, \varphi \Lambda ^{\varepsilon }]v, \varphi \Lambda ^{\varepsilon }v)| +\Vert [X, \varphi \Lambda ^\varepsilon ]v\Vert _{L^2}^2\}.\qquad \end{aligned}$$
(4.2)

To expand the brackets, we denote, generically,

$$\begin{aligned}{}[P, \varphi \Lambda ^\varepsilon ]= & {} [X^2,\varphi \Lambda ^\varepsilon ] = X[X,\varphi \Lambda ^\varepsilon ] + [X,\varphi \Lambda ^\varepsilon ]X\\= & {} X[X,\varphi \Lambda ^\varepsilon ] + \varphi ' \Lambda ^\varepsilon X + \varphi [X,\Lambda ^\varepsilon ]X \end{aligned}$$

and

$$\begin{aligned} \varphi [X,\Lambda ^\varepsilon ]X= X\varphi [X,\Lambda ^\varepsilon ] - \varphi '[X,\Lambda ^\varepsilon ] + \varphi [[X,\Lambda ^\varepsilon ],X] \end{aligned}$$

so that expanding the second term on the right in (4.2), integrating by parts and interchanging \(\varphi \) and \(\varphi ',\)

$$\begin{aligned}&([P, \varphi \Lambda ^{\varepsilon }]v, \varphi \Lambda ^{\varepsilon }v) \sim -([X, \varphi \Lambda ^{\varepsilon }]v, X\varphi \Lambda ^{\varepsilon }v) + (\varphi \Lambda ^{\varepsilon }Xv, \varphi '\Lambda ^{\varepsilon }v)\\&\quad - (\varphi [X,\Lambda ^\varepsilon ]v,X \varphi \Lambda ^\varepsilon v) + (\varphi [X,\Lambda ^\varepsilon ]v, \varphi ' \Lambda ^\varepsilon v) + (\varphi [[X,\Lambda ^\varepsilon ],X]v, \varphi \Lambda ^\varepsilon v) \end{aligned}$$

and so, after the usual weighted Schwarz inequalities, (4.2) reads

$$\begin{aligned}&\Vert \varphi \Lambda ^\varepsilon v\Vert _{\varepsilon }^2 +\Vert (X \varphi \Lambda ^\varepsilon ) v\Vert ^2_{L^2}\le C_0\{|(\varphi \Lambda ^{\varepsilon }Pv, \varphi \Lambda ^{\varepsilon }v)|\nonumber \\&\quad (+\Vert [X, \varphi \Lambda ^\varepsilon ]v\Vert _{L^2}^2)+ \Vert \varphi '\Lambda ^\varepsilon v\Vert _{L^2}^2 + \Vert \varphi \Lambda _1^\varepsilon v\Vert _{L^2}^2 + \Vert \varphi \Lambda _2^\varepsilon v\Vert _{-\varepsilon }^2\} \end{aligned}$$
(4.3)

where \(\Lambda _1^\varepsilon \) stands for \([X, \Lambda ^\varepsilon ]\) and \(\Lambda _2^\varepsilon \) for \([[X,\Lambda ^\varepsilon ],X],\) pseudodifferential operators of order \(\varepsilon .\) We have suppressed the term \(\Vert \varphi [X, \Lambda ^\varepsilon ]v\Vert _{L^2}^2,\) since \(\varphi [X, \Lambda ^\varepsilon ] = [X, \varphi \Lambda ^\varepsilon ] - X(\varphi )\Lambda ^\varepsilon ,\) both of whose terms already appear above. We could now also omit the term \(\Vert [X,\varphi \Lambda ^\varepsilon ]v\Vert _{L^2}^2,\) since the last two terms contain it, though we will preserve it for now because it is suggestive and helps make sense of the second term on the left.

Everything at this point is well defined. Things become somewhat more complicated as we seek to obtain estimates for higher derivatives. In the end we shall not write everything down explicitly, but for a while it will be important to keep the reader grounded.

Some features of (4.3) are that a gain of \(\varepsilon \) results in at most one derivative on \(\varphi ,\) and clearly this will be important. It is for this reason that we have retained the inner product with P since when an extra derivative threatens, we are able to exchange the two \(\varphi \)’s on the two sides of the inner product and avoid a second derivative on \(\varphi \) when we have gained only one \(\varepsilon \) power of \(\Lambda .\) And while v is a test function of compact support, our ‘solution’ u will not have compact support. We will introduce a ‘largest’ localizing function, denoted \(\Psi ,\) which will reside beside u everywhere but in the end be removable modulo infinitely smoothing brackets with precise bounds since there will be other functions of smaller support, such as \(\varphi ,\) to render \(\Psi \) unnecessary.

5 Personal heuristics

This paper has an unusual formulation.

It has become my conviction over the years that a mathematical paper in which every symbol, and every derivative of a localizing function, is explicitly notated becomes unreadable. I personally require more guidance in reading a technical paper to aid me in following the formulas. Perhaps, to paraphrase Frege in [5], anyone who understands the flow of the argument and the justification of that flow well enough probably does not actually need all the detailed calculations.

I would not go that far. But following every bracket and every derivative and writing each one down would challenge the stomach of the strongest physique, and I prefer to omit that much detail and ask the reader to honor the author’s honesty, track record, and precision, and to let the flow suffice in many places.

I took this approach in my previous paper, Analytic Hypoellipticity for a New Class of Sums of Squares of Vector Fields in \({\mathcal {R}}^3\) [9], and in fact the referee wrote that “I guess the author is trying to explain the ideas in his technical calculations by describing them in words with a minimum of symbols, but the words pile on to the point where one needs to be almost as familiar with the calculations as the author himself for them to make sense. A reader might wonder if the author is trying to pull a fast one by substituting a lot of hand-waving for honest computation if it weren’t for some of the subsequent pages where the symbols swamp the words. Can’t one strike a better balance?” But I have tried for many years to find a better balance and have concluded that in this material, and for this author, the answer is “Sadly, no.”

6 Derivatives in terms of powers of P

The algorithm we will use to achieve estimates in terms of pure powers of P on u is as follows (as above, but now of order \(\beta ,\) and modulo uniform, lower-order errors), with \(\Vert (X \varphi \Lambda ^{\beta })v\Vert _{L^2}^2 \,{\mathop {\equiv }\limits _{\text {def}}}\, \Vert X \varphi \Lambda ^{\beta }v\Vert _{L^2}^2 + \Vert \varphi \Lambda ^{\beta } Xv\Vert _{L^2}^2\):

  (1)

    First estimate, for general \(\beta \) (and \(v\in C_0^\infty (\Omega )\)),

    $$\begin{aligned}&\Vert \varphi \Lambda ^{\beta }v\Vert _{\varepsilon }^2 + \Vert (X \varphi \Lambda ^{\beta })v\Vert _{L^2}^2\\&\quad \le C |(P\varphi \Lambda ^{\beta }v, \varphi \Lambda ^{\beta }v)_{L^2}| \quad (+\; \Vert [X, \varphi \Lambda ^\beta ]v\Vert _{L^2}^2) \end{aligned}$$
  (2)

    Then commute P past \(\varphi \Lambda ^{\beta }\) until it lands beside v, to obtain \((\varphi \Lambda ^{\beta }Pv,\varphi \Lambda ^{\beta }v)_{L^2},\) thus requiring treatment of the bracket \(([P,\varphi \Lambda ^{\beta }]v,\varphi \Lambda ^{\beta }v)_{L^2}.\)

  (3)

    Next, expand the second inner product of item (2) by writing \(P=X^2\) generically, so with \(\varphi ' = \pm [X, \varphi ]\)

    $$\begin{aligned}{}[P, \varphi \Lambda ^{\beta }] = \{\varphi ' X + X\circ \varphi ' \}\Lambda ^\beta + 2\varphi X[X, \Lambda ^{\beta }] + \varphi [[X,\Lambda ^{\beta }],X] \end{aligned}$$

    and thus, integrating X by parts and/or switching \(\varphi \) and \(\varphi ',\) and using a weighted Schwarz inequality, uniformly in \(\beta ,\) and modulo a small constant times the LHS in (1),

    $$\begin{aligned} |([P,\varphi \Lambda ^{\beta }] v,\varphi \Lambda ^{\beta } v)| \sim \Vert \varphi '\Lambda ^{\beta } v\Vert _{L^2}^2 + \Vert \varphi \Lambda ^{\beta }_{1} v\Vert _{L^2}^2 + \Vert \varphi \Lambda ^{\beta }_{2} v\Vert _{-\varepsilon }^2 \end{aligned}$$

    where we recall the notation

    $$\begin{aligned} \Lambda _1^\beta = [X, \Lambda ^\beta ] \text { and } \Lambda _2^\beta = [[X, \Lambda ^\beta ], X], \end{aligned}$$

    both of which are of order \(\beta \).
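
    For the reader keeping score, the bracket formula in item (3) is in fact an exact operator identity; a verification (ours), writing \(\varphi ' = X(\varphi ),\) runs as follows:

    $$\begin{aligned} {}[X^2,\varphi \Lambda ^{\beta }]&= X[X,\varphi \Lambda ^{\beta }] + [X,\varphi \Lambda ^{\beta }]X, \qquad [X,\varphi \Lambda ^{\beta }] = \varphi '\Lambda ^{\beta } + \varphi [X,\Lambda ^{\beta }],\\ X\circ \varphi [X,\Lambda ^{\beta }]&= \varphi X[X,\Lambda ^{\beta }] + \varphi '[X,\Lambda ^{\beta }], \qquad \varphi [X,\Lambda ^{\beta }]X = \varphi X[X,\Lambda ^{\beta }] + \varphi [[X,\Lambda ^{\beta }],X],\\ \varphi '\Lambda ^{\beta }X&= \varphi 'X\Lambda ^{\beta } - \varphi '[X,\Lambda ^{\beta }], \end{aligned}$$

    and on summing, the two \(\varphi '[X,\Lambda ^{\beta }]\) terms cancel, leaving precisely \(\{\varphi 'X + X\circ \varphi '\}\Lambda ^{\beta } + 2\varphi X[X,\Lambda ^{\beta }] + \varphi [[X,\Lambda ^{\beta }],X].\)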

  (4)

    We gather these steps and freely move \(\varphi \) past powers of \(\Lambda ,\) since any bracket (whether applied to v or Pv) will introduce one or more derivatives on \(\varphi \) but also decrease the power of \(\Lambda \) by at least the same number (not just by that number times \(\varepsilon \ll 1\)), a trade that will be acceptable (together with the corresponding remainders) and that we will not write explicitly:

    $$\begin{aligned}&\Vert \varphi \Lambda ^{\beta +\varepsilon }v\Vert _{L^2}^2 + \Vert (X \varphi \Lambda ^{\beta })v\Vert _{L^2}^2 \sim \Vert \varphi \Lambda ^{\beta }v\Vert _{\varepsilon }^2 + \Vert (X \varphi \Lambda ^{\beta })v\Vert _{L^2}^2\\&\quad \le C \Vert \varphi \Lambda ^{\beta -\varepsilon }Pv\Vert _{L^2}^2 + \Vert \varphi '\Lambda ^{\beta } v\Vert _{L^2}^2 + \Vert \varphi \Lambda ^{\beta }_{1} v\Vert _{L^2}^2 + \Vert \varphi \Lambda ^{\beta }_{2} v\Vert _{-\varepsilon }^2. \end{aligned}$$
  (5)

    These last two terms are of order \(\beta \) and will be expanded in the section Expanding the Brackets below. Looking ahead to (7.2), however, for the moment with \(\mu = \beta \) and any r

    $$\begin{aligned} \varphi \Lambda ^{\beta }_{1} v = \varphi \sum _{\ell =1}^{r-1}\frac{1}{\ell !}a^{(\ell )}{(\Lambda ^{\beta })}^{(\ell )}Dv+ {{}_1R_{\,r} v} \end{aligned}$$

    so that (with Lemma 7.1, for \(X_j\) with analytic coefficients)

    $$\begin{aligned} \Vert \varphi \Lambda ^{\beta }_{1} v\Vert _{L^2}\le & {} \sum _{\ell =1}^{r-1}\Vert \varphi \frac{a^{(\ell )}}{\ell !}(\Lambda ^{\beta })^{(\ell )}Dv\Vert _{L^2} + \Vert {{}_1R_{\,r} v}\Vert _{L^2}\\\le & {} \sum _{\ell =1}^{r-1}C_a^\ell \, \beta ^\ell \Vert \varphi (\Lambda ^{\beta -\ell })Dv\Vert _{L^2} + \Vert {{}_1R_{\,r} v}\Vert _{L^2} \end{aligned}$$

    and the similar but slightly more complicated expression for

    $$\begin{aligned} \Lambda ^{-\varepsilon }\varphi \Lambda _2^\beta v= & {} \Lambda ^{-\varepsilon }\varphi [[aD,\Lambda ^{\beta }],aD]v = \Lambda ^{-\varepsilon }\varphi [[a,\Lambda ^{\beta }]D,aD]v\\= & {} \Lambda ^{-\varepsilon }\varphi ([a,\Lambda ^{\beta }]a'D + [[a,\Lambda ^\beta ],aD]D)v\\= & {} \Lambda ^{-\varepsilon }\varphi ([a,\Lambda ^{\beta }]a'D + [[a,\Lambda ^\beta ],a]D^2+ a[a',\Lambda ^\beta ]D)v\\&\quad \sim \Lambda ^{-\varepsilon }\varphi ([[a,\Lambda ^\beta ],a]D^2+ 2a[a',\Lambda ^\beta ]D)v\\&\quad \sim \Lambda ^{-\varepsilon }\varphi \sum _{\ell =1}^{r-1} \sum _{\ell '=1}^{r'-1}\frac{1}{\ell !\,\ell '!}a^{(\ell )}a^{(\ell ')} {(\Lambda ^{\beta })}^{(\ell +\ell ')}D^2v\\&\quad + \Lambda ^{-\varepsilon }\varphi \sum _{\ell =1}^{r-1}\frac{1}{\ell !} a^{(\ell +1)}a{(\Lambda ^{\beta })}^{(\ell )}Dv \end{aligned}$$

    so that, bringing the coefficients out of the norm at the expense of additional brackets, as though it were the \(L^2\) norm,

    $$\begin{aligned} \Vert \Lambda ^{-\varepsilon }\varphi \Lambda _2^\beta v\Vert _{L^2} \le \sum _{\tilde{\ell }=2}^{r-1}C_a^{\tilde{\ell }}\beta ^{\tilde{\ell }}\Vert \varphi \, \Lambda ^{\beta -\tilde{\ell }}D^2v\Vert _{-\varepsilon } + \sum _{\ell =1}^{r-1}C_a^{\ell }\beta ^{\ell }\Vert \varphi \,\Lambda ^{\beta -\ell }Dv\Vert _{-\varepsilon } \end{aligned}$$

    or

    $$\begin{aligned} \Vert \Lambda ^{-\varepsilon }\varphi \Lambda _2^\beta v\Vert _{L^2} \le \sum _{\ell =0}^{r-1}C_a^{\ell }\beta ^{\ell }\Vert \varphi \,\Lambda ^{\beta -\ell }v\Vert _{-\varepsilon }. \end{aligned}$$

    As always with pseudodifferential operators, there will be a sum of terms of lower and lower order, as dictated by the Leibniz formula for brackets, together with remainders.

  (6)

    We repeat the above steps by applying the estimate in (4) to the terms on the right in (4), producing \(\varphi \Lambda ^{\beta -3\varepsilon }P^2v,\) \(\varphi ' \Lambda ^{\beta -2\varepsilon }Pv\) and \(\varphi ''\Lambda ^{\beta -\varepsilon } v\), etc. On the right hand side, each of the four terms will lead to a ‘spray’ of additional terms, about four times as many at each step. The resulting paradigm may be simplified to read

    $$\begin{aligned}&\Vert \varphi \Lambda ^{\beta +\varepsilon }v\Vert _{L^2}^2 \leadsto \Vert \varphi \Lambda ^{\beta -\varepsilon }Pv\Vert _{L^2}^2 + \Vert \varphi '\Lambda ^{\beta } v\Vert _{L^2}^2\\&\quad \leadsto \Vert \varphi \Lambda ^{\beta -3\varepsilon }P^2v\Vert _{L^2}^2 + \Vert \varphi ' \Lambda ^{\beta -2\varepsilon }Pv\Vert _{L^2}^2 + \Vert \varphi ''\Lambda ^{\beta -\varepsilon } v\Vert _{L^2}^2, \end{aligned}$$

    and in general, after k iterations, there will be \(C^k\) terms of the form

    $$\begin{aligned} \Vert \varphi ^{(k_1)}\Lambda ^{\beta +\varepsilon -(k_1+2k_2)\varepsilon }P^{k_2}v\Vert _{L^2}^2 \end{aligned}$$

    with \(k=k_1+k_2.\)

  (7)

    Continue each iteration until we get to \(\beta +\varepsilon -(k_1+2k_2)\varepsilon \le 0,\) but at the previous step, \(\beta +\varepsilon -(k_1+2k_2)\varepsilon \ge 0,\) i.e., \(k_1+2k_2=\lceil \frac{\beta +\varepsilon }{\varepsilon } \rceil ,\) so that

    $$\begin{aligned} \Vert \varphi \Lambda ^{\beta +\varepsilon }v\Vert _{L^2}^2 \le C^k \Vert \varphi ^{(k_1)}\Lambda ^{\beta +\varepsilon -(k_1+2k_2)\varepsilon }P^{k_2}v\Vert _{L^2}^2 \end{aligned}$$

    where the power of \(\Lambda \) in each term is non-positive.
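
The bookkeeping in items (6) and (7) can be checked mechanically. The sketch below (ours, purely illustrative; the function name is not from the paper) models one step of the paradigm as sending a term with \(k_1\) derivatives on \(\varphi \) and \(k_2\) powers of P to the two terms \((k_1, k_2+1)\) and \((k_1+1, k_2),\) and verifies the invariant that the accompanying power of \(\Lambda \) is \(\beta +\varepsilon -(k_1+2k_2)\varepsilon ,\) together with the stopping count \(\lceil (\beta +\varepsilon )/\varepsilon \rceil \) of item (7).

```python
import math

def iterate(beta, eps, steps):
    """One 'spray' step per the paradigm in item (6): each term carries
    (k1, k2, mu) = (derivatives on phi, powers of P, power of Lambda);
    commuting out a P costs 2*eps of regularity, a derivative on phi costs eps."""
    terms = [(0, 0, beta + eps)]  # start from ||phi Lambda^{beta+eps} v||^2
    for _ in range(steps):
        terms = [t for (k1, k2, mu) in terms
                 for t in ((k1, k2 + 1, mu - 2 * eps),   # P commuted out
                           (k1 + 1, k2, mu - eps))]      # derivative lands on phi
    return terms

beta, eps, k = 3.0, 0.3, 5
terms = iterate(beta, eps, k)
assert len(terms) == 2 ** k  # the C^k 'spray' of terms after k iterations
for k1, k2, mu in terms:
    assert k1 + k2 == k  # k iterations give k = k1 + k2
    assert abs(mu - (beta + eps - (k1 + 2 * k2) * eps)) < 1e-9  # the invariant

# Item (7): the power of Lambda first becomes non-positive when exactly
# k1 + 2*k2 = ceil((beta + eps)/eps) 'eps-units' have been spent.
units = math.ceil((beta + eps) / eps)
assert beta + eps - units * eps <= 1e-9 < beta + eps - (units - 1) * eps
```

The invariant is simply that each of the two moves spends one or two \(\varepsilon \)-units while raising \(k_1\) or \(k_2\) accordingly, which is all that the displayed form of the general term asserts.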

  (8)

    It remains to apply all of this to our ‘solution’ u, which is subject only to the growth of \(P^ku\) and which, unlike the ‘test’ functions v, need not have compact support:

    $$\begin{aligned} \Vert P^j u\Vert _{L^2(K)}\le C_K^{2j+1} (2j)!^s, \;\forall j \text { for suitable } C_K. \end{aligned}$$

    But we are free to replace v in this estimate by \(\Psi u\) where \(\Psi \equiv 1\) near the support of \(\varphi ,\) since any error committed in then bringing \(\Psi \) out of the norm will be of order \(-\infty .\) Modulo this error, then,

    $$\begin{aligned} \Vert \varphi \Lambda ^{\beta +\varepsilon }\Psi u\Vert _{L^2}^2 \le C^k \Vert \varphi ^{(k_1)}\Lambda ^{\beta +\varepsilon -(k_1+2k_2)\varepsilon }P^{k_2}u\Vert _{L^2(K)}^2 \end{aligned}$$

    Our conclusion will be that for any \(K'\Subset \Omega _0, \;\exists \,C_{K'}: \Vert D^m u\Vert _{L^2(K')} \le \tilde{C}^{m+1} m!^{s/\varepsilon }, \;\forall m.\) Taking \(\beta +\varepsilon = m,\) we have

    $$\begin{aligned} \Vert D^m u\Vert _{L^2(K')} \le \tilde{C}^{m+1} \sup _{k_1+2k_2=\lceil \frac{m}{\varepsilon } \rceil }\Vert \varphi ^{(k_1)}\Vert _\infty \Vert P^{k_2}u\Vert _{L^2(K)}, \end{aligned}$$

    in particular, with \(\varphi \in G^s,\)

    $$\begin{aligned}&\Vert D^m u\Vert _{L^2(K')} \le \tilde{C}^{m+1} \sup _{k_1+2k_2=\lceil \frac{m}{\varepsilon } \rceil }k_1!^s\Vert P^{k_2}u\Vert _{L^2(K)}\\&\quad \le \tilde{C}^{m+1} {\lceil \frac{m}{\varepsilon } \rceil }!^s \le C^m(\frac{m}{\varepsilon }+1)!^s\le C_1^{m/\varepsilon }(\frac{m}{\varepsilon })!^s \end{aligned}$$
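
    To pass from \((\frac{m}{\varepsilon })!^s\) to the Gevrey bound \(m!^{s/\varepsilon }\) claimed above, a standard Stirling comparison suffices (a sketch, ours):

    $$\begin{aligned} \log \left( \frac{m}{\varepsilon }\right) ! = \frac{m}{\varepsilon }\log \frac{m}{\varepsilon } - \frac{m}{\varepsilon } + O(\log m) = \frac{1}{\varepsilon }\log m! + \frac{m}{\varepsilon }\log \frac{1}{\varepsilon } + O(\log m), \end{aligned}$$

    so that \((\frac{m}{\varepsilon })!^s \le C_2^{m+1}\, m!^{s/\varepsilon }\) and hence \(\Vert D^m u\Vert _{L^2(K')} \le C_3^{m+1}\, m!^{s/\varepsilon };\) that is, u belongs to \(G^{s/\varepsilon }\) locally.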

7 Expanding the brackets

In order to write out concretely the brackets of the previous section, we use a Taylor expansion of the symbol \(\lambda ^\mu (\xi )\) of \(\Lambda ^\mu \) (for any \(\mu \) and r), and write, with \(f=a\) (a coefficient of one of the X’s, which will always be accompanied by \(\varphi \)) or with \(f=\varphi (x)\) itself,

$$\begin{aligned} ([f,\Lambda ^{\mu }]v)^\wedge (\xi )= & {} \int \hat{f}(\xi -\eta )\sum _{\ell =1}^{r-1} \frac{(\xi -\eta )^{\ell }{\lambda ^{\mu }}^{(\ell )}(\eta )}{\ell !}\hat{v}(\eta )d\eta + \widehat{{}_f R_{\,r}v}(\xi )\\= & {} \sum _{\ell =1}^{r-1} \int \frac{\widehat{f^{(\ell )}}(\xi -\eta )}{\ell !}{\lambda ^{\mu }}^{(\ell )}(\eta )\hat{v}(\eta )d\eta + \widehat{{}_f R_{\,r}v}(\xi ) \end{aligned}$$

where

$$\begin{aligned} \widehat{{}_f R_{\,r}v}(\xi )=\int \frac{\widehat{f^{(r)}}(\xi -\eta )}{r!}\underbrace{\int _0^1dp \cdots \int _0^1dt}_r {\;\lambda ^{\mu }}^{(r)}(\eta + t\cdots p(\xi -\eta ))\hat{v}(\eta )d\eta \end{aligned}$$

so that (if we write \(({\Lambda ^{\mu }})^{(\ell )}\) for the operator with symbol \({(\lambda ^\mu )}^{(\ell )}\)), with \(f=\varphi :\)

$$\begin{aligned} \Vert [\varphi (x),\Lambda ^{\mu }]v\Vert _{L^2} \le \sum _{\ell =1}^{r-1}\frac{1}{\ell !}\Vert \varphi ^{(\ell )}{(\Lambda ^{\mu })}^{(\ell )}v\Vert _{L^2}+ \Vert {{}_\varphi R_{\,r} v}\Vert _{L^2} \end{aligned}$$
(7.1)

and, recalling that we write \(X=aD,\) with \(f=a\) (localized):

$$\begin{aligned} \Vert \varphi [a,\Lambda ^{\mu }]Dv\Vert _{L^2} \le \sum _{\ell =1}^{r-1}\frac{1}{\ell !}\Vert \varphi a^{(\ell )}{(\Lambda ^{\mu })}^{(\ell )}Dv\Vert _{L^2}+ \Vert \varphi \,{{}_a R_{\,r} v}\Vert _{L^2}. \end{aligned}$$
(7.2)

And for the last term in (4) above, \(\Vert \varphi \Lambda ^{\beta }_{2} v\Vert _{-\varepsilon }^2,\) we write

$$\begin{aligned} \Lambda ^{-\varepsilon }\varphi \Lambda ^{\mu }_{2} v= & {} \Lambda ^{-\varepsilon }\varphi [[a,\Lambda ^{\mu }]D,aD]v\\= & {} \Lambda ^{-\varepsilon }\varphi ([a,\Lambda ^{\mu }]a'D + [[a,\Lambda ^\mu ],aD]D)v\\= & {} \Lambda ^{-\varepsilon }\varphi ([a,\Lambda ^{\mu }]a'D + [[a,\Lambda ^\mu ],a]D^2+ a[a',\Lambda ^\mu ]D)v \\&\quad \sim \Lambda ^{-\varepsilon }\varphi ([[a,\Lambda ^\mu ],a]D^2+ 2a[a',\Lambda ^\mu ]D)v \\&\quad \sim \Lambda ^{-\varepsilon }\varphi \sum _{\ell =1}^{r-1}\frac{1}{\ell !} \sum _{\ell '=1}^{r'-1}\frac{1}{\ell '!}a^{(\ell )}a^{(\ell ')}{(\Lambda ^{\mu })}^{(\ell +\ell ')}D^2v \\&\quad + \Lambda ^{-\varepsilon }\varphi \sum _{\ell =1}^{r-1}\frac{1}{\ell !} a^{(\ell +1)}a{(\Lambda ^{\mu })}^{(\ell )}Dv \end{aligned}$$

Lemma 7.1

For any \(\mu \ge 0\) and any \(k,\)

$$\begin{aligned}{(\lambda ^{\mu })}^{(k)}(\rho )= \sum _j\underline{(3\mu )^k} \left\{ \begin{array}{l} \lambda ^{\mu -k -2j}(\rho ), \quad 0\le j \le \frac{k}{2}, \;k {\textit{ even }}\\ \rho \,\lambda ^{\mu -k -1 -2j}(\rho ), \quad 0\le j \le \frac{k-1}{2}, \;k {\textit{ odd }} \end{array} \right. \end{aligned}$$

where underlining the coefficient before the brace indicates the number of terms of the form which follow that are present.

Proof

The simplest proof we have found of this result is to denote by L the expression \((1+\rho ^2)^{1/2},\) since the pleasant fact that \(L'(\rho ) = \rho L^{-1}(\rho )\) makes the calculations suggestive and transparent. We omit the details, but the precise dependence on \(\mu \) and r above is important.
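
For instance (our computation), the first two derivatives, with \(\lambda ^\mu = L^\mu \) and \(\rho ^2 = L^2 - 1,\) are

$$\begin{aligned} (\lambda ^{\mu })'(\rho ) = \mu L^{\mu -1}L'(\rho ) = \mu \,\rho \,\lambda ^{\mu -2}(\rho ), \qquad (\lambda ^{\mu })''(\rho ) = \mu (\mu -1)\lambda ^{\mu -2}(\rho ) - \mu (\mu -2)\lambda ^{\mu -4}(\rho ), \end{aligned}$$

in agreement with the lemma’s odd pattern for \(k=1\) and even pattern for \(k=2.\)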

To treat the remainders, we divide the region of integration, as we did in [8], into two parts: the first where \(|\xi -\eta |\le \frac{1}{10}|\eta |,\) where the action of \(R_{\,r}\) is bounded by the \(L^1\) norm of derivatives of the coefficients of total order r times \(\Vert \Lambda ^{\mu -r}v\Vert _{L^2},\) and the region where \(|\xi |\) (and hence \(|\eta |\)) is bounded by a multiple of \(|\xi -\eta |,\) so that \(|\lambda ^\mu (\xi )-\lambda ^\mu (\eta )| \le C^{\mu }|\xi -\eta |^\mu ,\) whence for any M

$$\begin{aligned} |([\Lambda ^\mu , a(x)] v)^\wedge (\xi )|= & {} |((\lambda ^\mu \hat{a}) *\hat{v})(\xi ) - (\hat{a} *(\lambda ^\mu \hat{v}))(\xi )|\\= & {} |\lambda ^\mu (\xi )\int \hat{a}(\xi -\eta ) \hat{v}(\eta )d\eta - \int \hat{a}(\xi -\eta )\lambda ^\mu (\eta ) \hat{v}(\eta )\,d\eta |\\= & {} |\int \hat{a}(\xi -\eta )[\lambda ^\mu (\xi )-\lambda ^\mu (\eta )]\hat{v}(\eta ) d\eta |\\\le & {} C^M|\int \widehat{a^{(M+\mu )}}(\xi -\eta )(1+|\eta |^2)^{-M/2}\hat{v}(\eta ) d\eta |. \end{aligned}$$

\(\square \)

8 Adding a variable

Previous proofs concerning Gevrey vectors have often, as in Derridj’s paper, proved and then used the Gevrey hypoellipticity of the operator

$$\begin{aligned} Q=-\frac{\partial ^{2}}{\partial t^{2}} - P \end{aligned}$$

The proof that a homogeneous solution for Q satisfies \(U\in G^{1,s}_{t,x}\) locally for \(s\ge 1/\varepsilon \) follows using the above techniques and the evident a priori inequality

$$\begin{aligned}&\Vert W(t,x)\Vert _{L^2(t),\varepsilon (x)}^2 + \sum \Vert X_j W(t,x)\Vert _{L^2(t,x)}^2 + \Vert W(t,x)\Vert _{1(t),L^2(x)}^2\\&\quad \le C\{|(QW,W)_{L^2}| + \Vert W(t,x)\Vert _{L^2(t,x)}^2\} \end{aligned}$$

for smooth W of small support, since the variables are completely separated.

Then one observes that, under our hypothesis on the iterates of P on u, the convergent series

$$\begin{aligned} U(t,x)=\sum _{\ell \ge 0}(-1)^{\ell } \frac{t^{2\ell }}{(2\ell )!}P^\ell u(x) \end{aligned}$$

satisfies \(QU=0\) in some interval about \(t=0;\) hence its restriction to \(t=0,\) which is equal to u, will have the desired regularity in Gevrey class.

Finally, since the variables t and x are totally separated in the problem, localizing functions may be taken as products \(\varphi _1(t) \varphi _2(x)\) with \(\varphi _1\) of Ehrenpreis type or built using nested open sets in t, while in x Gevrey localization is familiar (and the fact that the coefficients now depend on t as well as x presents no new obstacles, even in brackets with \(D_t\) or \(\Lambda ^\beta _2\)).
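
For orientation, we recall (a standard construction, only sketched here) the key property of Ehrenpreis-type localizing functions nested between \(K'\Subset K\): for every N there exists \(\varphi _N\) with

$$\begin{aligned} \varphi _N \in C_0^\infty (K), \quad \varphi _N \equiv 1 \text { on } K', \quad |D^k\varphi _N| \le (CN)^k \quad \text {for } k\le N, \end{aligned}$$

obtained, for instance, by convolving the characteristic function of an intermediate set with N copies of a fixed bump of width comparable to \(\mathrm {dist}(K',\partial K)/N;\) choosing N comparable to the order of differentiation yields the factorial bounds on the derivatives of the localizing functions used above.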