1 Introduction and Main Result

Consider the following forced nonlinear wave equations (NLW) with derivative nonlinearity:

$$\begin{aligned} u_{tt}-u_{xx}+\varepsilon f(\omega t, x, u, u_x, u_t)=0,\,\,\varepsilon >0, \end{aligned}$$
(1.1)

satisfying Dirichlet boundary conditions

$$\begin{aligned} u(t,0)=0,\quad u\left( t,\frac{\pi }{\mu }\right) =0,\quad \mu >0. \end{aligned}$$
(1.2)

The forcing frequency is \(\omega =(1, \alpha )\) with \(\alpha \in \mathbb {R}{\setminus } \mathbb {Q}\). The forcing term f is real analytic and satisfies

$$\begin{aligned}{} & {} f(\theta , x, y, z, w)=f(-\theta ,x, y, z, -w), \end{aligned}$$
(1.3)
$$\begin{aligned}{} & {} f(\theta , -x, -y, z, -w)=-f(\theta , x, y, z, w) \end{aligned}$$
(1.4)

and

$$\begin{aligned} f(\theta , x, 0, 0, 0)\ne 0. \end{aligned}$$
(1.5)

The goal of the present paper is to prove the existence of response solutions of Eq. (1.1) via KAM theory.

The existence of periodic and quasi-periodic solutions for partial differential equations (PDEs) is an interesting and difficult problem in mathematics, mechanics and physics. Various methods have been used and developed for this problem, for example, variational methods, Lyapunov–Schmidt decomposition, KAM theory and Nash–Moser iteration techniques. The case of periodic solutions was the first to be widely studied. The first breakthrough was due to Rabinowitz [43,44,45] for the forced dissipative derivative NLW with rational forcing frequency \(\omega =1\) under Dirichlet boundary conditions:

$$\begin{aligned} u_{tt}-u_{xx}+\alpha u_t+\varepsilon F( t, x, u, u_x, u_t)=0,\,\,\alpha >0,\,\,x\in [0, \pi ], \end{aligned}$$

and

$$\begin{aligned} u_{tt}-u_{xx}+\alpha u_t+\varepsilon F( t, x, u, u_x, u_t, u_{tt}, u_{tx}, u_{xx})=0,\,\,\alpha >0,\,\,\,\,x\in [0, \pi ]. \end{aligned}$$

By variational methods, Rabinowitz [46] also considered the autonomous NLW on \([0,\pi ]\) which had periodic solutions whenever the time period was a rational multiple of \(\pi \). Later on, based on perturbation methods (mainly Newton-like methods), Wayne [51], Craig and Wayne [25] proved the existence of periodic solutions of the NLW on \([0,\pi ]\) under Dirichlet or periodic boundary conditions. The periods of such periodic solutions were irrational multiples of \(\pi \). Bourgain [19] and Craig [24] also proved the existence of small-amplitude periodic solutions for autonomous Hamiltonian and reversible derivative NLW.

For the case of quasi-periodic solutions, a small divisor problem arises, and infinite dimensional KAM theory is a very powerful tool for solving it. Kuksin [34] and Wayne [51] first studied the existence of quasi-periodic solutions for Hamiltonian PDEs by KAM methods. One can also refer to [23, 26, 28, 37, 41, 42] and references therein. Although all these works are concerned with autonomous equations, their methods can also be applied to the forced ones. In the proof of a KAM theorem, to handle the small divisors, the following non-resonance conditions are required: for some constants \(\tau >n-1\) and \(\gamma >0\),

  • (Diophantine conditions) \(|\langle k, \omega \rangle |\ge \frac{\gamma }{|k|^\tau },\,\hbox {for all}\,\,k\in \mathbb {Z}^n\setminus \{0\};\)

  • (the first Melnikov conditions) \(|\langle k, \omega \rangle \pm \Omega _i|\ge \frac{\gamma }{\langle k\rangle ^\tau };\)

  • (the second Melnikov conditions) \(|\langle k, \omega \rangle +\Omega _i \pm \Omega _j|\ge \frac{\gamma }{\langle k\rangle ^\tau },\)

where \(\langle k\rangle =\max \{1, |k|\},\) \(\omega \in \mathbb {R}^n\) is the tangent frequency, or the forcing frequency in the forced case, \(\Omega _i\in \mathbb {R}\) (\(i\in \mathbb {Z}\)) are the normal frequencies, and \(\langle k, \omega \rangle :=\sum ^n_{i=1}k_i\omega _i.\) Another way to study quasi-periodic solutions for PDEs is the CWB (Craig–Wayne–Bourgain) method, based on Lyapunov–Schmidt reduction and Nash–Moser implicit function techniques. It only needs the Diophantine and the first Melnikov conditions in the proof. One can refer to [18, 20, 49, 50] for details. By the improved CWB method, [12, 13, 17] also considered quasi-periodically forced nonlinear Schrödinger equations (NLS) and NLW on \(\mathbb {T}^d\) and on compact Lie groups and symmetric spaces, respectively. Using a variational Lyapunov–Schmidt reduction, Berti and Procesi [15, 16] proved the existence of quasi-periodic solutions of the following wave equations under periodic forcing:

$$\begin{aligned} {\left\{ \begin{array}{ll} u_{tt}-u_{xx}+ f(\omega _1 t, u)=0, \\ u(t,x)= u(t,x+2\pi ), \end{array}\right. } \end{aligned}$$

where the nonlinear forcing term \(f(\omega _1 t, u)=a(\omega _1 t)u^{2d-1}+O(u^{2d}), d>1\) is \(2\pi /\omega _1\)-periodic in time (\(\omega _1\in \mathbb {Q}\) or \(\omega _1\in \mathbb {R}-\mathbb {Q}\) ) and satisfies some analyticity assumptions. More recently, Calleja, Celletti, Corsi and de la Llave [21] obtained response solutions for the following four classes of quasi-periodically forced, dissipative wave equations

$$\begin{aligned}{} & {} u_{tt}-\Delta _{x}u+\varepsilon ^{-1}u_{t}+h(x,u)=f(\omega t,x); \\{} & {} u_{tt}-\Delta _{x}u\pm \varepsilon ^{-1}\partial _t(\Delta _{x}u)+h(x,u)=f(\omega t,x); \\{} & {} \varepsilon ^{2}u_{tt}-\Delta _{x}u+u_{t}+h(x,u)=f(\omega t,x); \\{} & {} \varepsilon ^{2}u_{tt}-\Delta _{x}u+u_{t}+\varepsilon h(x,u)=f(\omega t,x), \end{aligned}$$

where the assumption on the forcing frequency \(\omega \) is weaker than the usual Brjuno condition, i.e., the requirement that the following Brjuno function \(\mathcal {B}(\omega )\) be finite:

$$\begin{aligned} \mathcal {B}(\omega )=\sum \limits ^{\infty }_{j=0}\frac{1}{2^j}\log \frac{1}{\alpha _j(\omega )},\,\hbox {where}\ \alpha _j(\omega )=\inf \limits _{\begin{array}{c} {k\in \mathbb {Z}^n,}\\ {0<|k|<2^j} \end{array}}|\langle k, \omega \rangle |, \end{aligned}$$
(1.6)

which is itself slightly weaker than the Diophantine condition. The proof relies on the Lindstedt series method and the contraction mapping principle.
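The quantities \(\alpha _j(\omega )\) in (1.6) can be explored numerically for two frequencies. The following is a minimal brute-force sketch, not part of the paper's argument. It assumes \(n=2\), the norm \(|k|=|k_1|+|k_2|\) (the convention fixed later in Sect. 3.1), and the sample value \(\alpha =\sqrt{2}-1\); the function name `smallest_divisor` is hypothetical, and floating-point arithmetic limits the range of \(j\) for which the output is meaningful.

```python
import math

def smallest_divisor(omega, j):
    """Brute-force alpha_j(omega) = min |<k, omega>| over k in Z^2 with 0 < |k| < 2**j.

    Assumption: |k| = |k_1| + |k_2|.
    """
    bound = 2 ** j
    best = math.inf
    for k1 in range(-bound + 1, bound):
        for k2 in range(-bound + 1, bound):
            size = abs(k1) + abs(k2)
            if 0 < size < bound:
                best = min(best, abs(k1 * omega[0] + k2 * omega[1]))
    return best

# Sample frequency omega = (1, alpha) with alpha = sqrt(2) - 1 (illustrative choice).
omega = (1.0, math.sqrt(2.0) - 1.0)
divisors = [smallest_divisor(omega, j) for j in range(1, 7)]
for j, value in enumerate(divisors, start=1):
    print(f"j = {j}: alpha_j = {value:.3e}")
```

Since the index set grows with \(j\), the computed values are non-increasing; how fast they decay is exactly what the Diophantine and Brjuno conditions control.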

A frequency \(\omega \in \mathbb {R}^n\) is called Liouvillean if it is rationally independent but not Diophantine. The Liouvillean condition is weaker than the Diophantine and Brjuno conditions. As we have mentioned above, Diophantine or Brjuno conditions play an essential role in the persistence of invariant tori of Hamiltonian and reversible systems. Nevertheless, it is still possible to establish Liouvillean KAM theory for some special systems like Eq. (1.1) with only two frequencies. Such a possibility was first shown by Avila et al. [1] and by Hou and You [31] in the reducibility theory for linear quasi-periodic skew-products

$$\begin{aligned} {\left\{ \begin{array}{ll} \dot{\theta }=\omega =(1,\alpha ),\quad \\ \dot{x}=A(\theta )x. \end{array}\right. } \end{aligned}$$

Wang, You and Zhou [48] and Wang and You [47] generalized the above results to the finite dimensional nonlinear Hamiltonian case. They proved the existence of response solutions for the quasi-periodically forced harmonic oscillators

$$\begin{aligned} \ddot{x}+\lambda ^2x=\varepsilon f(\omega t,x), \end{aligned}$$
(1.7)

where the parameter \(\lambda \in \mathbb {R},\) \(\omega =(1,\alpha )\) is rationally independent and f is a real analytic function. In [39], the authors of the present paper proved the existence of smooth response solutions in forced reversible systems with Liouvillean frequencies. Krikorian, Wang, You and Zhou [33] revealed the possibility of studying Liouvillean quasi-periodic dynamics with improved KAM tools in the nonlinear skew-product setting. More recently, Xu, You and Zhou [52] first established an infinite dimensional Hamiltonian KAM theorem with Liouvillean frequency. As an application, they proved the existence of response solutions for the forced NLS with Dirichlet boundary condition:

$$\begin{aligned} \textrm{i}u_t-u_{xx}+V(x)u+f(\omega t,x, u, \bar{u}; \xi )=0 \end{aligned}$$
(1.8)

where \(\omega =(\omega _1, \omega _2)\) with \( \omega _1=(1,\alpha ), \alpha \in \mathbb {R}{\setminus } \mathbb {Q},\,\,\omega _2\in \mathbb {R}^d. \) The tangent frequency \(\omega \) satisfies \(|\langle k, \omega _1\rangle +\langle l,\omega _2\rangle |\ge \frac{\gamma }{(|k|+|l|)^\tau } \) for all \(k\in \mathbb {Z}^2,l\in \mathbb {Z}^d{\setminus }\{0\}.\) The forcing term f is a real analytic function. Chang, Geng and Lou [22] proved the existence of bounded non-response solutions for a class of Hamiltonian wave equations with Liouvillean forced frequencies. Their result shows that one cannot obtain quasi-periodic solutions with Liouvillean frequencies for the nonlinear autonomous PDEs.

In this paper, we consider a class of non-Hamiltonian but reversible forced wave equations (1.1). We would like to make some comments on it.

(1). Let us explain why we use the \(\mu \) in the length of the interval as a parameter. In the measure estimates, we need the condition

$$\begin{aligned} \left| \frac{d}{d\mu }(\langle k,\omega \rangle +\Omega _i(\mu )-\Omega _j(\mu ))\right| \ge c>0. \end{aligned}$$
(1.9)

If \(\omega \) is Diophantine, the inequality (1.9) is easily satisfied even if there is no parameter \(\mu \). However, \(\omega \) is now fixed and could be Liouvillean. To guarantee that the inequality (1.9) still holds, we use the parameter \(\mu \), since \(\Omega _i-\Omega _j=O(\mu (i-j)).\) Note that Eq. (1.1) with Dirichlet boundary conditions (1.2) is equivalent to the following more natural form

$$\begin{aligned} u_{tt}-\mu ^2u_{xx}+\varepsilon f(\omega t, x, u, u_x, u_t)=0, \end{aligned}$$
(1.10)

with \(u(t,0)=0=u(t,\, \pi ).\) Here the role played by the parameter \(\mu \) is the same as that of the parameter \(\lambda \) in (1.7) or the parameter \(\xi \) in (1.8).

(2). The presence of the derivative nonlinearity in (1.1) leads to a lack of smoothing effect of the perturbation vector field. This brings some difficulty in the measure estimates. KAM theory for derivative nonlinear PDEs was developed by Kuksin [36] and Kappeler and Pöschel [32] for KdV-type equations. See also [8, 38, 53] for the unbounded perturbations of Schrödinger equations. Berti, Biasco and Procesi [10, 11] studied the following Hamiltonian and reversible derivative NLW, respectively:

$$\begin{aligned} u_{tt}-u_{xx}+mu+f(Du)=0,\,m>0,\,D:=\sqrt{-\partial _{xx}+m},\,x\in \mathbb {T}, \end{aligned}$$

and

$$\begin{aligned} u_{tt}-u_{xx}+mu=g(x, u, u_x, u_t),\,m>0,\,\,x\in \mathbb {T}, \end{aligned}$$

with

$$\begin{aligned} g(x, u, u_x, -u_t)=g(x, u, u_x, u_t)\,\,\hbox {and }\,\,g(-x, u, -u_x, u_t)=g(x, u, u_x, u_t). \end{aligned}$$

More recently, Baldi, Berti and Montalto [3] obtained KAM results for quasi-linear and fully nonlinear forced perturbations of the linear Airy equation. The proof is based on a combination of KAM reducibility, a regularization procedure and Nash–Moser iteration. These methods have been extended and applied to quasi-linear KdV [4], fully nonlinear forced reversible NLS [27] and quasi-linear water waves [2, 14]. See also recent reducibility results in [6, 7, 9] for Schrödinger equations with time quasi-periodic unbounded perturbations. We point out that the frequencies \(\omega \) in the above works are required to be Diophantine, and are not allowed to satisfy only the weaker Liouvillean condition as in Eq. (1.1). This is the main difference between Eq. (1.1) and those in the above papers.
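To illustrate comment (1) above on the role of the parameter \(\mu \), the following sketch measures the set of \(\mu \in [1,2]\) on which a single second Melnikov condition fails. It is only an illustration under simplifying assumptions: the normal frequencies are taken to be the unperturbed ones \(\Omega _j(\mu )=\mu j\), so that the \(\mu \)-derivative in (1.9) equals \(i-j\), and the values of \(k\), \(i\), \(j\), \(\gamma \), \(\tau \) are arbitrary samples. The function name `resonant_measure` is hypothetical.

```python
def resonant_measure(c, d, delta, lo=1.0, hi=2.0):
    """Lebesgue measure of {mu in [lo, hi] : |c + mu * d| < delta}, for d != 0.

    Here c plays the role of <k, omega> and d the role of i - j.
    """
    if d == 0:
        raise ValueError("d must be nonzero")
    center = -c / d                 # the exact resonance mu = -c / d
    half_width = delta / abs(d)     # the bad set is an interval of this half-width
    left = max(lo, center - half_width)
    right = min(hi, center + half_width)
    return max(0.0, right - left)

# Illustrative sample data (assumptions, not values from the paper).
alpha = 2.0 ** 0.5 - 1.0
k = (-3, 0)
i, j = 3, 1
gamma, tau = 0.1, 2.0

c = k[0] * 1.0 + k[1] * alpha                   # <k, omega> with omega = (1, alpha)
bracket_k = max(1, abs(k[0]) + abs(k[1]))       # <k> = max{1, |k|}
delta = gamma / bracket_k ** tau
measure = resonant_measure(c, i - j, delta)
print(measure, 2 * delta / abs(i - j))
```

The excluded interval has length at most \(2\delta /|i-j|\) regardless of the arithmetic nature of \(\omega \), which is the mechanism behind a lower bound such as (1.9).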

Before stating our main result, we first introduce some notation. For the frequency \(\omega =(1,\alpha ),\,\,\alpha \in \mathbb {R}{\setminus }\mathbb {Q}\) in Eq. (1.1), denote by \(\frac{p_n}{q_n}\) the continued fraction approximants to \(\alpha \). As in [47], we will use the quantity

$$\begin{aligned} \tilde{U}(\alpha ):=\sup \limits _{n>0} \frac{\ln \ln q_{n+1}}{\ln q_n}. \end{aligned}$$
(1.11)

The number \(\alpha \) is called not super-Liouvillean if \(\tilde{U}(\alpha )<\infty .\) The set of \(\omega \) satisfying \(\tilde{U}(\alpha )<\infty \) is not empty (see Remark 3.1), and it includes many Liouvillean frequencies. Our main result is stated as follows.

Theorem 1.1

Suppose \(\omega =(1, \alpha ),\,\,\alpha \in \mathbb {R}{\setminus } \mathbb {Q}\) is fixed and \(\tilde{U}(\alpha )<\infty .\) Let \(\mu \in \mathcal {O}=[1,2].\) The function f satisfies (1.3)–(1.5). Then for any sufficiently small \(\gamma >0,\) there exist \(\varepsilon _0>0\) and a Cantor subset \(\mathcal {O}_\gamma \subseteq \mathcal {O}\) with Lebesgue measure \({{\,\textrm{meas}\,}}(\mathcal {O}\setminus \mathcal {O}_\gamma )=O(\gamma )\) such that if \(0<\varepsilon <\varepsilon _0,\) for each \(\mu \in \mathcal {O}_\gamma ,\) the above Eq. (1.1) admits a small amplitude time quasi-periodic solution of the form \(u(t, x; \mu )=U(\omega t, x; \mu ),\) where \(U(\theta , x; \mu ):\mathbb {T}^2\times \mathbb {R} \rightarrow \mathbb {R}\) is smooth (\(C^\infty \)) in \(\theta \) and real analytic in x.

Let us make comments on the three hypotheses (1.3)–(1.5) in Theorem 1.1.

Remark 1.1

The reversible condition (1.3) for Eq. (1.1) is very natural in KAM theory. It guarantees that the corresponding normal frequencies during the KAM iteration are elliptic. Hamiltonian perturbations can play the same role as reversible ones. Following the ideas of our paper, one can build a Liouvillean KAM theorem similar to Theorem 4.1 for forced Hamiltonian derivative wave equations. In the present paper we restrict our attention to the reversible case, since it actually contains all the difficulties that appear in the Hamiltonian case during the KAM iteration.

Remark 1.2

For the oddness condition (1.4), on one hand, it is natural for Eq. (1.1) on \([0, \frac{\pi }{\mu }]\) under Dirichlet boundary conditions because \(\{\sqrt{\frac{2\mu }{\pi }}\sin \mu jx,\,j\ge 1\}\) form a complete orthogonal basis of the subspace consisting of all odd functions in \(L^2[0, \frac{\pi }{\mu }]\). On the other hand, we note that the following simplest equations

$$\begin{aligned} u_{tt}-u_{xx}+\varepsilon f(\omega t)=0,\,\,\varepsilon >0, f\ne 0, \end{aligned}$$
(1.12)

have no response solutions. The oddness condition (1.4) excludes such perturbations \(\varepsilon f(\omega t),\) and thus it is also necessary.

2 Outline of the Proof

The proof of Theorem 1.1 is based on the abstract KAM Theorem 4.1 for infinite dimensional forced reversible systems. Below let us explain the main ideas and techniques of proving Theorems 1.1 and 4.1.

  • Reversible systems formulation. Let \(\lambda _j=\mu ^2j^2\) and \(\phi _j(x)=\sqrt{\frac{2\mu }{\pi }}\sin \mu jx,\,(j\ge 1)\) be the eigenvalues and eigenfunctions of the operator

    $$\begin{aligned} -\frac{d^2}{dx^2}y=\lambda y,\quad \,y(0)=0=y\left( \frac{\pi }{\mu }\right) . \end{aligned}$$

    We introduce infinitely many coordinates by \(u=\mathcal {S} q=\sum _{j\ge 1}q_j\phi _j,\) then Eq. (1.1) becomes

    $$\begin{aligned} \ddot{q}_j+\lambda _jq_j+\varepsilon g_j(\omega t, q, \dot{q})=0,\quad j\ge 1 \end{aligned}$$
    (2.1)

    where \(g_j(\omega t, q, \dot{q})=\int ^{\frac{\pi }{\mu }}_0 f(\omega t, x, \mathcal {S}q, (\mathcal {S}q)_x, (\mathcal {S}q)_t)\phi _jdx\) and the reversible condition (1.3) becomes \(g_j(\omega t, q, \dot{q})=g_j(-\omega t, q, -\dot{q}).\) Let \(\theta =\omega t\), \(z_j=-\sqrt{\lambda _j} q_j+\textrm{i}\dot{q}_j, \bar{z}_j=-\sqrt{\lambda _j} q_j-\textrm{i}\dot{q}_j\), then system (2.1) becomes an autonomous reversible one

    $$\begin{aligned} {\left\{ \begin{array}{ll} \dot{\theta }=\omega ,\quad \\ \dot{z}_j=\textrm{i}\sqrt{\lambda _j} z_j-\textrm{i}\varepsilon g_j(\theta , \ldots ,-\frac{z_i+\bar{z}_i}{2\sqrt{\lambda _i}},\ldots , \frac{z_i-\bar{z}_i}{2\textrm{i}},\ldots ),\quad \\ \dot{\bar{z}}_j=-\textrm{i}\sqrt{\lambda _j} \bar{z}_j+\textrm{i}\varepsilon g_j(\theta , \ldots ,-\frac{z_i+\bar{z}_i}{2\sqrt{\lambda _i}},\ldots , \frac{z_i-\bar{z}_i}{2\textrm{i}},\ldots ), \end{array}\right. } \end{aligned}$$
    (2.2)

    with respect to the involution \(S(\theta , z,\bar{z})=(-\theta , \bar{z},z)\). The corresponding \(S\)-reversible vector field of system (2.2) is

    $$\begin{aligned} \begin{aligned} X(\theta , z,\bar{z}; \mu )=&\omega \frac{\partial }{\partial \theta }+\textrm{i}\Omega (\mu ) z\frac{\partial }{\partial z}-\textrm{i}\Omega (\mu ) \bar{z}\frac{\partial }{\partial \bar{z}}\\&+\sum \limits _{j\ge 1}-\textrm{i}\varepsilon g_j\frac{\partial }{\partial z_j}+\sum \limits _{j\ge 1}\textrm{i}\varepsilon g_j\frac{\partial }{\partial \bar{z}_j}, \end{aligned} \end{aligned}$$
    (2.3)

    where \(\Omega _j(\mu )=\mu j, j\ge 1,\) \(\mu \in [1,2]\). One can verify that the vector field (2.3) satisfies all the conditions in KAM Theorem 4.1; see Sect. 8 for details.

  • Solving homological equations. In infinite dimensional KAM theory, the most difficult homological equation is

    $$\begin{aligned} \partial _{\omega } F_{ij}(\theta ;\xi ) +\textrm{i}(\Omega _i(\xi ) -\Omega _j(\xi ))F_{ij}(\theta ;\xi )=R_{ij}(\theta ;\xi ), \quad \,i,j\ge 1, \end{aligned}$$
    (2.4)

    which can be solved by the non-resonance condition

    $$\begin{aligned} |\langle k, \omega \rangle +\Omega _i -\Omega _j|\ge \frac{\gamma }{\langle k\rangle ^\tau }, \quad \,\hbox {for}\, k\ne 0\, \hbox {or}\, i\ne j. \end{aligned}$$

    However in the present paper, when \(i=j\) equation (2.4) is unsolvable due to the lack of Diophantine restriction on \(\omega .\) Therefore we have to put the whole \(R_{jj}(\theta )\) rather than its average \([R_{jj}]\) into \(\Omega _j.\) This leads to the following \(\theta \)-dependent homological equations

    $$\begin{aligned} \partial _{\omega } F_{ij}(\theta ;\xi ) +\textrm{i}(\Omega _i(\theta ;\xi ) -\Omega _j(\theta ;\xi ))F_{ij}(\theta ;\xi )=R_{ij}(\theta ;\xi ), \quad \,i\ne j \in \mathbb {N}. \end{aligned}$$
    (2.5)

    This kind of variable coefficient homological equation also appears in the KAM theory for unbounded perturbations [32, 35]. In [32, 35], a Diophantine condition on \(\omega \) is still necessary to solve (2.5). In this paper, \(\omega \) is no longer Diophantine but can be Liouvillean. To deal with this case, we will use a method based on the CD-bridge technique introduced in [1]. Then, by a rotation transformation, we obtain new variable coefficient homological equations which can be solved by a diagonally dominant method. Note that, for (2.5), there are two main differences between our paper and [39, 48] for finite dimensional reversible and Hamiltonian systems: (i) For each \(|k|<K\), we need infinitely many non-resonance conditions

    $$\begin{aligned} |\langle k, \omega \rangle +\Omega _i -\Omega _j|\ge \frac{\gamma }{\langle k\rangle ^\tau },\quad i,j\ge 1 \end{aligned}$$

    instead of only finitely many ones as in [39, 48]. In [39, 48], one can take a special large truncation K due to \(\sup \limits _i|\Omega _i|<+\infty .\) Then \(|\langle k, \omega \rangle +\Omega _i -\Omega _j|\) has a uniform lower bound for all \(|k|<K.\) Hence [39, 48] obtain real analytic response solutions for all Liouvillean frequencies \(\omega .\) However, in this paper, \(|\Omega _i|\rightarrow +\infty \) as \(i\rightarrow +\infty ,\) and we cannot take such a special truncation K. Thus we cannot obtain real analytic solutions for all Liouvillean \(\omega ,\) but only smooth (\(C^\infty \)) solutions for not super-Liouvillean \(\omega \). A similar situation also occurs in [33, 47, 52]. (ii) We need to verify the Töplitz–Lipschitz property of the solutions for the measure estimates, which is the biggest difference from [39, 47, 48, 52]. The details are given in Proposition 5.2.

  • Constructing the KAM scheme. Though the above variable coefficient homological equations (2.5) are solvable under non-Diophantine conditions, the upper bound in the estimate of their solutions would be so large that the usual KAM iteration cannot converge. To overcome this, we will perform finitely many normal form transformations at each KAM step. See Sect. 6.1 for details.

  • Töplitz–Lipschitz property and measure estimates. Due to the presence of derivatives in the nonlinearities of Eq. (1.1), there is no smoothing effect for the corresponding perturbation vector field P in (4.1). Therefore, one cannot control the shift of the normal frequencies, which is necessary in the measure estimates. To give the measure estimates, we introduce a new class of Töplitz–Lipschitz vector fields. The idea of the Töplitz–Lipschitz property was first introduced by Eliasson–Kuksin [26] and then developed in [10, 11, 28, 30, 42]. It can compensate for the lack of smoothing effect of the perturbation vector field P. See assumption (A4) in Sect. 4 for more details.
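The passage from (2.1) to (2.2) in the reversible formulation above is a direct computation, which can be sanity-checked numerically for a single mode. The sketch below is only such a check: the numerical values of \(\mu \), \(j\), \(q_j\), \(\dot{q}_j\), \(\varepsilon \) and \(g_j\) are arbitrary samples, and all function names are hypothetical.

```python
import math

def zdot_from_definition(lam, q, qdot, eps, g):
    """Differentiate z = -sqrt(lam) * q + i * qdot, using (2.1): qddot = -lam*q - eps*g."""
    qddot = -lam * q - eps * g
    return -math.sqrt(lam) * qdot + 1j * qddot

def zdot_from_system(lam, q, qdot, eps, g):
    """Right-hand side of the z_j equation in (2.2): i*sqrt(lam)*z - i*eps*g."""
    z = -math.sqrt(lam) * q + 1j * qdot
    return 1j * math.sqrt(lam) * z - 1j * eps * g

def recover(lam, z):
    """Invert the change of variables: q = -(z + zbar)/(2 sqrt(lam)), qdot = (z - zbar)/(2i)."""
    zbar = z.conjugate()
    q = -(z + zbar).real / (2.0 * math.sqrt(lam))
    qdot = ((z - zbar) / 2j).real
    return q, qdot

# Arbitrary sample data for one mode (assumptions for illustration only).
mu, j = 1.5, 3
lam = (mu * j) ** 2            # lambda_j = mu^2 j^2
q, qdot, eps, g = 0.3, -0.7, 0.01, 2.0

lhs = zdot_from_definition(lam, q, qdot, eps, g)
rhs = zdot_from_system(lam, q, qdot, eps, g)
print(abs(lhs - rhs))
```

Both expressions agree up to rounding, and `recover` returns the original pair \((q_j,\dot{q}_j)\), matching the arguments of \(g_j\) displayed in (2.2).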

The rest of the paper is organized as follows. In Sect. 3, we introduce the definitions of weighted norms for functions and vector fields and give some arithmetical properties of irrational numbers. In Sect. 4, we state an abstract KAM theorem (Theorem 4.1) for infinite dimensional reversible systems with non-Diophantine frequencies. In Sect. 5, we solve homological equations for vector field with Töplitz–Lipschitz property and prove that their solutions still admit Töplitz–Lipschitz property. In Sect. 6, we describe the details of proving KAM Theorem 4.1. The proof of convergence of the iteration and measure estimates are given in Sect. 7. In Sect. 8, we use the KAM theorem to prove Theorem 1.1. In Appendix we list some technical lemmas.

3 Preliminary

3.1 Functional Setting

Let \(\mathcal {O} \subset \mathbb {R}^n\) be a parameter set of positive Lebesgue measure. Throughout the paper, for any real or complex valued function depending on parameters \(\xi \in \mathcal {O},\) its derivatives with respect to \(\xi \) are understood in the sense of Whitney. We denote by \(C^1_W(\mathcal {O})\) the class of \(C^1\) Whitney differentiable functions on \(\mathcal {O}.\)

Suppose \(f\in C^1_W(\mathcal {O}),\) we define its norm as

$$\begin{aligned} \begin{aligned} \Vert f\Vert _{\mathcal {O}}:=&\sup \limits _{\xi \in \mathcal {O}}(|f(\xi )|+|\frac{\partial f}{\partial \xi }(\xi )|),\\ \end{aligned} \end{aligned}$$

where \(|\cdot |\) denotes the sup-norm of complex vectors.

Consider an \(\text {n}\)-torus \(\mathbb {T}^\text {n}=\mathbb {R}^\text {n}/{(2\pi \mathbb {Z})^\text {n}}\) and its complex neighborhood

$$\begin{aligned} D(r)=\{\theta \in \mathbb {C}^\text {n}: |\hbox {Im} \theta |<r\} \end{aligned}$$

(\(r>0\)).

Suppose \(f(\theta ;\xi ),\) (\( \theta \in D(r),\, \xi \in \mathcal {O}\)), is real analytic in \(\theta \in D(r)\) and \(C^1_W\) in \(\xi \in \mathcal {O}.\) We define its norm as

$$\begin{aligned} \Vert f\Vert _{D(r)\times \mathcal {O}}:=\sup \limits _{(\theta ,\xi )\in D(r)\times \mathcal {O}}\left( |f(\theta ,\xi )|+\left| \frac{\partial }{\partial \xi }f(\theta ,\xi )\right| \right) . \end{aligned}$$

For \(f(\theta ;\xi )=\sum \limits _{k\in \mathbb {Z}^\text {n}}\widehat{f}(k;\xi )\textrm{e}^{\textrm{i}\langle k, \theta \rangle }\) on \(D(r)\), its Fourier truncation \(\mathcal {T}_Kf\) of order \(K>0\) is defined as follows:

$$\begin{aligned} (\mathcal {T}_Kf)(\theta ):=\sum \limits _{k\in \mathbb {Z}^\text {n},\,|k|< K}\widehat{f}(k)\textrm{e}^{\textrm{i}\langle k, \theta \rangle }, \end{aligned}$$

where \(\langle k,\theta \rangle =\sum \limits ^{\text {n}}_{i=1}k_i\theta _i\) and \(|k|=\sum \limits ^{\text {n}}_{i=1}|k_i|\).

The remainder \(\mathcal {R}_Kf\) of f is denoted by \((\mathcal {R}_Kf)(\theta ):=f(\theta )-\mathcal {T}_Kf(\theta ).\) Suppose \(0<2\sigma <r,\) we have the following estimate for \(\mathcal {R}_Kf:\)

$$\begin{aligned} \Vert \mathcal {R}_Kf\Vert _{D(r-2\sigma )\times \mathcal {O}}\le 32\sigma ^{-2}\textrm{e}^{-K\sigma }\Vert f\Vert _{D(r)\times \mathcal {O}}. \end{aligned}$$
(3.1)
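The splitting \(f=\mathcal {T}_Kf+\mathcal {R}_Kf\) can be made concrete for a function with finitely many Fourier modes. The following sketch is illustrative only: it ignores the parameter \(\xi \), stores the coefficients \(\widehat{f}(k)\) in a dictionary, and uses the elementary majorant \(\sum _k|\widehat{f}(k)|\textrm{e}^{|k|\rho }\) for the sup-norm on \(|\hbox {Im}\,\theta |<\rho \) rather than the constant in (3.1). All names are hypothetical.

```python
import math

def l1_norm(k):
    """|k| = sum of |k_i|, as in the definition of the truncation."""
    return sum(abs(ki) for ki in k)

def truncate(coeffs, K):
    """T_K f: keep the Fourier modes with |k| < K."""
    return {k: c for k, c in coeffs.items() if l1_norm(k) < K}

def remainder(coeffs, K):
    """R_K f = f - T_K f: keep the Fourier modes with |k| >= K."""
    return {k: c for k, c in coeffs.items() if l1_norm(k) >= K}

def majorant(coeffs, rho):
    """Upper bound sum |c_k| e^{|k| rho} for sup |f| on the strip |Im theta| < rho."""
    return sum(abs(c) * math.exp(l1_norm(k) * rho) for k, c in coeffs.items())

# Sample coefficients on Z^2 decaying like e^{-|k|} (an illustrative assumption).
coeffs = {(k1, k2): math.exp(-(abs(k1) + abs(k2)))
          for k1 in range(-5, 6) for k2 in range(-5, 6)}

low = truncate(coeffs, 4)
high = remainder(coeffs, 4)
for K in (2, 4, 6, 8):
    print(K, majorant(remainder(coeffs, K), 0.5))
```

The printed majorants of \(\mathcal {R}_Kf\) on the smaller strip decrease as \(K\) grows, which is the qualitative content of (3.1).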

The average [f] of f on \(\mathbb {T}^\text {n}\) is defined as

$$\begin{aligned}{}[f]:=\widehat{f}(0)=\frac{1}{(2\pi )^\text {n}}\int _{\mathbb {T}^\text {n}}f(\theta )d\theta . \end{aligned}$$

Let \(p> 0\), we introduce the Banach space \(\ell ^{p}\) of all real or complex sequences \(z=(z_j)_{j\ge 1}\) with

$$\begin{aligned} \Vert z\Vert _p=\sum \limits ^{\infty }_{j=1}|z_j|\textrm{e}^{jp}<\infty . \end{aligned}$$
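As a quick illustration of the exponential weight in this norm, the sketch below evaluates \(\Vert z\Vert _p\) for finitely supported sequences. The function name `weighted_norm` and the sample sequence \(z_j=\textrm{e}^{-2j}\) are assumptions made for the example.

```python
import math

def weighted_norm(z, p):
    """||z||_p = sum_{j >= 1} |z_j| e^{j p} for a finite list z, where z[0] stands for z_1."""
    return sum(abs(zj) * math.exp(j * p) for j, zj in enumerate(z, start=1))

# z_j = e^{-2j} truncated to 60 terms; with p = 1 each summand equals e^{-j}.
z = [math.exp(-2.0 * j) for j in range(1, 61)]
norm = weighted_norm(z, 1.0)
print(norm, 1.0 / (math.e - 1.0))
```

For this sample the sum is geometric with limit \(1/(\textrm{e}-1)\), whereas a sequence decaying only like \(\textrm{e}^{-j/2}\) would not belong to \(\ell ^{p}\) for \(p=1\).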

For \(r, s>0,\) we define the phase space

$$\begin{aligned} \mathcal {P}^{p}:=\mathbb {T}^\text {n} \times \ell ^{p}\times \ell ^{p}\ni w:=(\theta , z, \bar{z}) \end{aligned}$$

and a complex neighborhood

$$\begin{aligned} D(r,s)\equiv D_p(r,s):=\{w:|\hbox {Im} \theta |<r, \Vert z\Vert _{p}<s, \Vert \bar{z}\Vert _{p}<s\} \end{aligned}$$

of \(\mathcal {T}^{\text {n}}_0:=\mathbb {T}^{\text {n}}\times \{z=0\}\times \{\bar{z}=0\}\) in \(\mathcal {P}^{p}_{\mathbb {C}}:=\mathbb {C}^\text {n} \times \ell ^{p}\times \ell ^{p}.\)

Let \(\alpha =(\alpha _j)_{j\ge 1},\) \(\beta =(\beta _j)_{j\ge 1}\) with \(\alpha _j,\beta _j\in \mathbb {N}.\) \(\alpha \) and \(\beta \) have only finitely many nonzero components. Suppose

$$\begin{aligned} \begin{aligned} f(\theta ,z,\bar{z}; \xi )=&\sum \limits _{\alpha , \beta }f_{\alpha \beta }(\theta ;\xi ) z^{\alpha }\bar{z}^{\beta }\\ =&\sum \limits _{k\in \mathbb {Z}^\text {n}, \alpha , \beta }\widehat{f}_{\alpha \beta }(k;\xi )\textrm{e}^{\textrm{i}\langle k, \theta \rangle } z^{\alpha }\bar{z}^{\beta }, \end{aligned} \end{aligned}$$

is real analytic on \(D(r,s)\) and \(C^1_W\)-smooth on \( \mathcal {O},\) where the notation \(z^{\alpha }\bar{z}^{\beta }=\prod \nolimits _{j\ge 1}z^{\alpha _j}_j\bar{z}^{\beta _j}_j.\) We define

$$\begin{aligned} \begin{aligned} \Vert f\Vert _{D(r)\times \mathcal {O}}\equiv \Vert f(\cdot , z, \bar{z};\cdot )\Vert _{D(r)\times \mathcal {O}}:=\sum \limits _{\alpha , \beta }\Vert f_{\alpha \beta }\Vert _{D(r)\times \mathcal {O}}|z^{\alpha }||\bar{z}^{\beta }| \end{aligned} \end{aligned}$$

and the weighted norm of f as follows:

$$\begin{aligned} \begin{aligned} \Vert f\Vert _{D(r,s)\times \mathcal {O}}:=&\sup \limits _{\begin{array}{c} {\Vert z\Vert _{p}<s,}\\ {\Vert \bar{z}\Vert _{p}<s} \end{array}} \Vert f\Vert _{D(r)\times \mathcal {O}}\\ =&\sup \limits _{\begin{array}{c} {\Vert z\Vert _{p}<s,}\\ {\Vert \bar{z}\Vert _{p}<s} \end{array}} \sum \limits _{\alpha , \beta }\Vert f_{\alpha \beta }\Vert _{D(r)\times \mathcal {O}}|z^{\alpha }||\bar{z}^{\beta }|.\\ \end{aligned} \end{aligned}$$

Consider an infinite dimensional dynamical system on \(D(r,s)\):

$$\begin{aligned} \dot{w}=X(w),\,\,w=(\theta , z,\bar{z})\in D(r, s), \end{aligned}$$

where the vector field

$$\begin{aligned} \begin{aligned} X(w)=&(X^{(\theta )}(w),X^{(z)}(w), X^{(\bar{z})}(w))\\ =&(X^{(\textsf {v})}(w))_{\textsf {v}\in \mathscr {V}}\in \mathcal {P}^{p}_{\mathbb {C}}, \end{aligned} \end{aligned}$$

where \(\mathscr {V}=\{\theta _1, \ldots ,\theta _\text {n},\,z_j, \bar{z}_j:j\ge 1\}.\) In the paper, we will write the vector field X(w) in the form of a differential operator

$$\begin{aligned} \begin{aligned} X(w)=&X^{(\theta )}(w)\frac{\partial }{\partial \theta }+ X^{(z)}(w)\frac{\partial }{\partial z}+X^{(\bar{z})}(w)\frac{\partial }{\partial \bar{z}}\\ =&\sum \limits _{\textsf {v}\in \mathscr {V}}X^{(\textsf {v})}(w)\frac{\partial }{\partial \textsf {v}}. \end{aligned} \end{aligned}$$
(3.2)

Definition 3.1

An analytic vector field \(X: D(r, s)\rightarrow \mathcal {P}^{p}_{\mathbb {C}}\) is said to be real analytic, if it satisfies

$$\begin{aligned} \overline{X^{(\theta )}}=X^{(\theta )},\,\,\overline{X^{(z)}}=X^{(\bar{z})}\,\,\hbox {on}\,\, D^{re}(r,s), \end{aligned}$$

where

$$\begin{aligned} D^{re}(r,s)=\{(\theta , z, \bar{z})\in D(r,s):\theta \in \mathbb {T}^n ,\,\,\bar{z}\ \hbox {is the complex conjugate of}\ z\}. \end{aligned}$$

Suppose the vector field \(X(w;\xi )\) is real analytic on \(D(r,s)\) and \(C^1_W\) smooth on \(\mathcal {O}.\) We define the weighted norm of X as follows:

$$\begin{aligned} \begin{aligned}&\Vert X\Vert _{s;D(r,s)\times \mathcal {O}}\\&\quad =\sum \limits ^\text {n}_{i=1}\Vert X^{(\theta _i)}\Vert _{D(r,s)\times \mathcal {O}} +\frac{1}{s}\sup \limits _{\begin{array}{c} {\Vert z\Vert _{p}<s,}\\ {\Vert \bar{z}\Vert _{p}<s} \end{array}}\sum \limits ^{\infty }_{i=1}\textrm{e}^{i p}\left( \Vert X^{(z_i)}\Vert _{D(r)\times \mathcal {O}} +\Vert X^{(\bar{z}_i)}\Vert _{D(r)\times \mathcal {O}}\right) \\&\quad =\sup \limits _{\begin{array}{c} {\Vert z\Vert _{p}<s,}\\ {\Vert \bar{z}\Vert _{p}<s} \end{array}}\sum \limits ^\text {n}_{i=1} \sum \limits _{\alpha , \beta }\Vert X^{(\theta _i)}_{\alpha \beta }\Vert _{D(r)\times \mathcal {O}}|z^{\alpha }||\bar{z}^{\beta }|\\&\quad \quad +\frac{1}{s}\sup \limits _{\begin{array}{c} {\Vert z\Vert _{p}<s,}\\ {\Vert \bar{z}\Vert _{p}<s} \end{array}}\sum \limits ^{\infty }_{i=1}\textrm{e}^{ip}\sum \limits _{\alpha , \beta }\left( \Vert X^{(z_i)}_{\alpha \beta }\Vert _{D(r)\times \mathcal {O}} +\Vert X^{(\bar{z}_i)}_{\alpha \beta }\Vert _{D(r)\times \mathcal {O}}\right) |z^{\alpha }||\bar{z}^{\beta }|. \end{aligned} \end{aligned}$$

Definition 3.2

The Lie bracket of two vector fields X and Y on \(D(r,s)\) is defined as

$$\begin{aligned}{}[X,Y](w)=\mathcal {D}X(w)\cdot Y(w)-\mathcal {D}Y(w)\cdot X(w),\,\,\hbox {for}\,\,w\in D(r,s), \end{aligned}$$

where \(\mathcal {D}X(w)\) is the differential of X at w, namely, its \( \textsf {v} \)-component is

$$\begin{aligned}{}[X,Y]^{(\textsf {v})}=\sum _{\textsf {u} \in \mathscr {V}}\left( \frac{\partial X^{(\textsf {v})}}{\partial \textsf {u}}Y^{(\textsf {u})}-\frac{\partial Y^{(\textsf {v})}}{\partial \textsf {u}}X^{(\textsf {u})}\right) . \end{aligned}$$
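For orientation, the sign convention of Definition 3.2 can be checked on a finite dimensional toy case. The sketch below assumes linear vector fields \(X(w)=Aw\) and \(Y(w)=Bw\) on \(\mathbb {R}^2\), for which \(\mathcal {D}X=A\) and \(\mathcal {D}Y=B\), so that \([X,Y](w)=(AB-BA)w\). The matrices and function names are illustrative assumptions, not objects from the paper.

```python
def matvec(A, w):
    """Matrix-vector product for matrices stored as lists of rows."""
    return [sum(a * x for a, x in zip(row, w)) for row in A]

def lie_bracket_linear(A, B, w):
    """[X, Y](w) = DX(w) Y(w) - DY(w) X(w) for the linear fields X(w) = A w, Y(w) = B w."""
    Xw = matvec(A, w)
    Yw = matvec(B, w)
    first = matvec(A, Yw)     # DX(w) . Y(w) = A (B w)
    second = matvec(B, Xw)    # DY(w) . X(w) = B (A w)
    return [f - s for f, s in zip(first, second)]

A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
w = [1, 2]
bracket = lie_bracket_linear(A, B, w)
print(bracket)
```

With these samples \(AB-BA=\mathrm {diag}(1,-1)\), and exchanging the two fields flips the sign of the result, as the component formula requires.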

3.2 Some Arithmetical Properties of Irrational Numbers

The main purpose of this subsection is to recall some arithmetical properties of irrational numbers required in the paper.

3.2.1 Continued Fraction Expansion

Let \(\alpha \in (0, 1)\) be an irrational number. We define inductively the following sequences:

$$\begin{aligned}{} & {} a_0=0,\,\, \alpha _0=\alpha ,\\{} & {} a_k=\lfloor \alpha ^{-1}_{k-1}\rfloor ,\,\, \alpha _k=\alpha ^{-1}_{k-1}-a_k,\,\,k\ge 1, \end{aligned}$$

where \(\lfloor x\rfloor = \max \{l\in \mathbb {Z}:l\le x\}.\)

We set

$$\begin{aligned} p_0=0,\,\,p_1=1,\,\,q_0=1,\,\,q_1=a_1, \end{aligned}$$

and define inductively, for \(k\ge 2\),

$$\begin{aligned}{} & {} p_k=a_k p_{k-1}+ p_{k-2}, \\{} & {} q_k=a_k q_{k-1}+ q_{k-2}. \end{aligned}$$

Then \(\{q_n\}\) is the sequence of denominators of the best rational approximations for \(\alpha .\) It satisfies

$$\begin{aligned} \Vert k\alpha \Vert _{\mathbb {T}}\ge \Vert q_{n-1}\alpha \Vert _{\mathbb {T}},\,\,\,\,\forall \, 1\le k<q_n, \end{aligned}$$
(3.3)

and

$$\begin{aligned} \frac{1}{q_{n}+q_{n+1}}<\Vert q_{n}\alpha \Vert _{\mathbb {T}}\le \frac{1}{q_{n+1}}, \end{aligned}$$
(3.4)

where \(\Vert x\Vert _{\mathbb {T}}:=\inf \limits _{p\in \mathbb {Z}}|x-p|.\)
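The recursions above and the inequalities (3.3) and (3.4) can be checked on an example. The sketch below is a minimal illustration under stated assumptions: it takes \(\alpha =\sqrt{2}-1\), replaced by a rational approximation accurate to about 40 decimal digits so that exact `Fraction` arithmetic can be used, and it computes only the first 15 partial quotients, for which this approximation is harmless. The function names are hypothetical.

```python
from fractions import Fraction
from math import isqrt

def continued_fraction(alpha, n_terms):
    """Return (a, p, q) with a_0..a_n, p_0..p_n, q_0..q_n for alpha in (0, 1), n = n_terms >= 1."""
    a = [0]
    x = alpha                                    # alpha_0 = alpha
    for _ in range(n_terms):
        inv = 1 / x
        a_k = inv.numerator // inv.denominator   # floor of alpha_{k-1}^{-1}
        a.append(a_k)
        x = inv - a_k                            # alpha_k
    p, q = [0, 1], [1, a[1]]
    for k in range(2, n_terms + 1):
        p.append(a[k] * p[k - 1] + p[k - 2])
        q.append(a[k] * q[k - 1] + q[k - 2])
    return a, p, q

def dist_to_integers(x):
    """||x||_T = distance from x to the nearest integer."""
    return abs(x - round(x))

# Rational stand-in for sqrt(2) - 1 with error below 10^{-40} (an assumption of this sketch).
alpha = Fraction(isqrt(2 * 10 ** 80), 10 ** 40) - 1
a, p, q = continued_fraction(alpha, 15)
print(a[1:6], q[:6])
```

For this \(\alpha \) all partial quotients \(a_k\), \(k\ge 1\), equal 2 and the denominators are \(1, 2, 5, 12, 29, 70, \ldots \)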

3.2.2 CD Bridge

Now we choose a special subsequence \(\{q_{n_k}\}\) of denominators of the best rational approximations for irrational number \(\alpha .\) For simplicity, we denote the subsequences \(\{q_{n_k}\}\) and \(\{q_{n_k+1}\}\) by \(\{Q_{k}\}\) and \(\{\overline{Q}_{k}\},\) respectively.

The concept of CD bridge was first used in [1].

Definition 3.3

(CD bridge, [1]) Let \(0<\mathcal {A}\le \mathcal {B}\le \mathcal {C}.\) We say that the pair of denominators \((q_l, q_n)\) forms a CD\((\mathcal {A}, \mathcal {B}, \mathcal {C})\) bridge if

  (1) \(q_{i+1} \le q^{\mathcal {A}}_{i},\,\,\forall \,\,i=l,\ldots ,n-1;\)

  (2) \(q^{\mathcal {C}}_l\ge q_n\ge q^{\mathcal {B}}_l.\)
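The two conditions of Definition 3.3 translate directly into a check on a list of denominators. The sketch below is an illustration under assumptions: the helper name `is_cd_bridge` is hypothetical, the comparisons are done on logarithms in floating point (so borderline equalities may be misjudged), and the sample denominators are those of \(\sqrt{2}-1\).

```python
import math

def is_cd_bridge(q, l, n, A, B, C):
    """Check whether (q[l], q[n]) forms a CD(A, B, C) bridge in the sense of Definition 3.3."""
    if not (0 < A <= B <= C):
        return False
    # Condition (1): q_{i+1} <= q_i^A for i = l, ..., n-1.
    growth_ok = all(math.log(q[i + 1]) <= A * math.log(q[i]) for i in range(l, n))
    # Condition (2): q_l^C >= q_n >= q_l^B.
    span_ok = B * math.log(q[l]) <= math.log(q[n]) <= C * math.log(q[l])
    return growth_ok and span_ok

# Denominators q_0, ..., q_6 of the convergents of sqrt(2) - 1 (sample data).
pell = [1, 2, 5, 12, 29, 70, 169]
print(is_cd_bridge(pell, 1, 3, 3, 3, 4))   # 2^3 <= 12 <= 2^4, and each step grows at most cubically
print(is_cd_bridge(pell, 1, 2, 3, 3, 4))   # fails: q_2 = 5 < 2^3
```

Condition (1) limits how fast consecutive denominators may grow along the bridge, while condition (2) fixes the overall growth between its two ends.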

Lemma 3.1

([1]) For any \(\mathcal {A}\ge 1\) there exists a subsequence \(\{Q_{k}\}\) such that \(Q_0=1\) and for each \(k\ge 0,\) \(Q_{k+1}\le \overline{Q}^{\mathcal {A}^4}_{k},\) and either \(\overline{Q}_{k}\ge Q^\mathcal {A}_{k}\) or the pairs \((\overline{Q}_{k-1}, Q_{k})\) and \((Q_{k}, Q_{k+1})\) are both CD\((\mathcal {A}, \mathcal {A}, \mathcal {A}^3)\) bridges.

Definition 3.4

(Not super-Liouvillean numbers, [47]) The irrational number \(\alpha \) is called not super-Liouvillean if the quantity

$$\begin{aligned} \tilde{U}(\alpha ):=\sup \limits _{n>0} \frac{\ln \ln q_{n+1}}{\ln q_n}<\infty . \end{aligned}$$
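The ratios entering \(\tilde{U}(\alpha )\) are easy to tabulate from a finite list of partial quotients. The sketch below is illustrative only: a finite list yields a lower bound for the supremum and never a proof that it is finite, indices with \(q_n=1\) are skipped to avoid dividing by \(\ln 1=0\) (an assumption of this sketch), and the function names are hypothetical.

```python
import math

def denominators(partial_quotients):
    """q_0, q_1, ... built from a_1, a_2, ... via q_k = a_k q_{k-1} + q_{k-2}, q_0 = 1, q_1 = a_1."""
    q = [1, partial_quotients[0]]
    for a_k in partial_quotients[1:]:
        q.append(a_k * q[-1] + q[-2])
    return q

def u_tilde_ratios(q):
    """Ratios ln ln q_{n+1} / ln q_n for the available n >= 1 with q_n > 1."""
    return [math.log(math.log(q[n + 1])) / math.log(q[n])
            for n in range(1, len(q) - 1) if q[n] > 1]

# Partial quotients of sqrt(2) - 1: all equal to 2 (sample data).
q = denominators([2] * 20)
ratios = u_tilde_ratios(q)
partial_sup = max(ratios)
print(round(partial_sup, 4), round(ratios[-1], 4))
```

For bounded partial quotients the ratios tend to zero, so the supremum is attained early; large values can only come from indices where \(q_{n+1}\) is doubly exponentially larger than \(q_n\).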

In the sequel, we assume \(\mathcal {A}\ge 14\). Then we have the following conclusion.

Lemma 3.2

([47]) If \(\tilde{U}(\alpha )<\infty \), then \(Q_{n}\ge Q^\mathcal {A}_{n-1}\) for any \(n\ge 1\). Furthermore, one has

$$\begin{aligned} \sup \limits _{n>0} \frac{\ln \ln Q_{n+1}}{\ln Q_n}\le U(\alpha ),\qquad \ln Q_{n+1}\le Q^U_{n}, \end{aligned}$$

where \(U(\alpha )=\tilde{U}(\alpha )+4\frac{\ln \mathcal {A}}{\ln 2}<\infty .\)

Remark 3.1

Notice that if

$$\begin{aligned} \beta (\alpha ):=\limsup \limits _{n>0} \frac{\ln \ln q_{n+1}}{\ln q_n}<\infty , \end{aligned}$$

then \(\tilde{U}(\alpha )<\infty \). In the case \(n=2\), if \(\mathcal {B}(\omega )<\infty \) (see (1.6)), then \(\beta (\alpha )=0.\) Hence if \(\omega =(1, \alpha )\) is Brjuno, then it is also not super-Liouvillean; that is, the not super-Liouvillean frequencies form a larger set than the Brjuno ones.
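To get a feeling for the quantity in Definition 3.4, one can evaluate the ratio \(\ln \ln q_{n+1}/\ln q_n\) on a finite list of denominators. The sketch below is illustrative only: the function name is hypothetical, and a finite prefix can of course neither prove nor disprove that the supremum over all \(n\) is finite.

```python
import math

def liouville_ratio_prefix(qs):
    """Largest value of ln ln q_{n+1} / ln q_n over the supplied prefix.

    qs[i] = q_i; only indices n > 0 with q_n > 1 are used, so that
    both logarithms are well defined.
    """
    ratios = [
        math.log(math.log(qs[n + 1])) / math.log(qs[n])
        for n in range(1, len(qs) - 1)
        if qs[n] > 1
    ]
    return max(ratios)
```

For the denominators 1, 2, 5, 12, 29, 70 of \(\sqrt{2}-1\) the ratios decrease with \(n\), and the largest one is attained at \(n=1\), namely \(\ln \ln 5/\ln 2.\)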

4 An Infinite Dimensional Reversible KAM Theorem Without Diophantine Condition

Throughout the rest of the paper, we work on the space \(\mathcal {P}^{p}_{\mathbb {C}}:=\mathbb {C}^2 \times \ell ^{p}\times \ell ^{p}.\) Denote

$$\begin{aligned} z^\varrho _j= {\left\{ \begin{array}{ll} z_j,\quad &{} \varrho =+,\\ \bar{z}_j,\quad &{} \varrho =-, \end{array}\right. } \end{aligned}$$

and similarly for \(z^\varrho =(z^\varrho _j)_{j\ge 1}\).

Given \(s,r>0,\) a domain \(D(r,s)\) in \(\mathcal {P}^{p}_{\mathbb {C}}\) and a compact subset \(\mathcal {O}\subset \mathbb {R}^n\) of positive Lebesgue measure, we begin with a family of real analytic vector fields of the form

$$\begin{aligned} X(\theta , z, \bar{z}; \xi ) =N(\theta , z, \bar{z}; \xi )+P(\theta , z, \bar{z}; \xi ),\,(\theta , z, \bar{z})\in D(r,s),\,\xi \in \mathcal {O}, \end{aligned}$$
(4.1)
$$\begin{aligned} N=\omega \frac{\partial }{\partial \theta }+\textrm{i}\Omega (\xi )z\frac{\partial }{\partial z}-\textrm{i}\Omega (\xi )\bar{z}\frac{\partial }{\partial \bar{z}}, \end{aligned}$$

the perturbation

$$\begin{aligned} P=\sum \limits _{\textsf {v}\in \{z_j, \bar{z}_j,j\ge 1\}}\sum \limits _{\alpha , \beta }P^{(\textsf {v})}_{\alpha \beta }(\theta ; \xi )z^{\alpha }\bar{z}^{\beta }\frac{\partial }{\partial \textsf {v}}, \end{aligned}$$

is reversible with respect to the involution \(S: (\theta , z, \bar{z})\mapsto (-\theta , \bar{z}, z).\) The forcing frequency \(\omega \in \mathbb {R}^2\) is a fixed non-resonant frequency vector. Without loss of generality, let \(\omega =(1, \alpha ),\) where \(\alpha \in \mathbb {R}{\setminus }\mathbb {Q}\) is not super-Liouvillean. The normal frequencies \(\Omega _j(\xi )\in \mathbb {R}\) (\(j\ge 1\)) are \(C^1_W\) on \(\mathcal {O}.\)

Suppose the above S-reversible vector field X satisfies the following four assumptions:

  1. (A1)

    Asymptotics of normal frequencies:

    $$\begin{aligned} \Omega _j=d(\xi )j+\widetilde{\Omega }_j,\,\,j\ge 1, \end{aligned}$$
    (4.2)

    where \(d(\xi ), \widetilde{\Omega }_j\in C^1_W(\mathcal {O}).\) Moreover, there exist positive constants \(A_0,\) \(A_1\) and \(A_2\) with \(A_0>\frac{3A_1}{4},\) \(A_2>A_1\) such that \(\forall \xi \in \mathcal {O},\) \(| d(\xi )|\ge A_0,\) \(A_1\le |\partial _\xi d(\xi )|\le A_2\) and \(|\widetilde{\Omega }|_\mathcal {O}\le \frac{A_1}{4}.\)

  2. (A2)

    Melnikov non-resonance conditions: For \(\tau \ge 10,\) \(0<\gamma \le 1,\) \(\xi \in \mathcal {O},\)

    $$\begin{aligned} |\langle k, \omega \rangle +\langle l, \Omega (\xi )\rangle |\ge \frac{\gamma }{\langle k\rangle ^\tau },\,\forall k\in \mathbb {Z}^2,\, 1\le |l|\le 2, \end{aligned}$$

    where \(\langle k\rangle =\max \{1, |k|\}.\)

  3. (A3)

    Regularity: The reversible perturbation P defines a map

    $$\begin{aligned} P: D(r,s)\times \mathcal {O}\rightarrow \mathcal {P}^{p}_\mathbb {C}, \end{aligned}$$

    \(P(\cdot ,\xi )\) is real analytic on \(D(r,s)\) for each \(\xi \in \mathcal {O},\) and \(P(w,\cdot )\) is \(C^1_W-\)smooth on \(\mathcal {O}\) for each \(w\in D(r,s).\) Moreover, for some \(\varepsilon _0>0,\) \(\Vert P\Vert _{s;D(r,s)\times \mathcal {O}}\le \varepsilon _0.\)
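As a rough numerical illustration of the Melnikov condition in (A2), the following sketch scans a finite window of indices and reports where the lower bound fails. The truncation to finitely many normal frequencies, the choice \(|k|=|k_1|+|k_2|,\) and all function and parameter names are assumptions made only for this example; a finite scan cannot verify (A2) itself, which concerns all \(k\in \mathbb {Z}^2.\)

```python
def melnikov_violations(alpha, Omega, gamma, tau, K):
    """List the pairs (k, <l, Omega>) with |k| <= K violating the bound in (A2).

    Omega is a finite list of normal frequencies (a truncation), and l ranges
    over integer vectors with 1 <= |l| <= 2 supported on that list.
    """
    J = len(Omega)
    combos = []
    # |l| = 1: plus or minus a single frequency.
    for i in range(J):
        combos.extend([Omega[i], -Omega[i]])
    # |l| = 2: signed sums of two frequencies (l = 0 is excluded).
    for i in range(J):
        for j in range(i, J):
            for s1 in (1, -1):
                for s2 in (1, -1):
                    if i == j and s1 != s2:
                        continue
                    combos.append(s1 * Omega[i] + s2 * Omega[j])
    bad = []
    for k1 in range(-K, K + 1):
        for k2 in range(-K, K + 1):
            size = abs(k1) + abs(k2)
            if size > K:
                continue
            bound = gamma / max(1, size) ** tau
            for c in combos:
                if abs(k1 + k2 * alpha + c) < bound:
                    bad.append(((k1, k2), c))
    return bad
```

With \(\omega =(1,\sqrt{2}-1)\) and the sample frequencies \(\Omega _j=\sqrt{3}\,j,\) \(j\le 3,\) no violation occurs in the window \(|k|\le 5\) for \(\gamma =0.01,\) \(\tau =10,\) whereas integer frequencies \(\Omega _j=j\) are immediately resonant with \(k=(-1,0).\)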

To compensate for the lack of smoothing effect of P due to the derivative nonlinearity, we need some additional conditions on the derivatives of P. In [10, 11], to deal with this, the authors introduced the quasi-Töplitz property of functions and vector fields (first used in [42] for NLS). However, for the non-diagonal variable coefficient homological equations considered here, it is not easy to verify the quasi-Töplitz property of solutions. In this paper, we introduce a Töplitz–Lipschitz property of vector fields, which plays a role similar to the quasi-Töplitz property but is easier to handle. A property of this type was first used in [26] and then in [28] for higher dimensional Hamiltonian NLS.

  4. (A4)

    Töplitz–Lipschitz property: There exists \(\rho >0\) such that the following limits exist and satisfy:

    $$\begin{aligned}{} & {} \left\| \lim _{t\rightarrow \infty }\widetilde{\Omega }_{j+t}\right\| _{\mathcal {O}} \le \varepsilon _0, \end{aligned}$$
    (4.3)
    $$\begin{aligned}{} & {} \left\| \lim _{t\rightarrow \infty }\frac{\partial P^{(z_{i+t})}}{\partial z_{j+ t}}\right\| _{D(r,s)\times \mathcal {O}} \le \varepsilon _0\textrm{e}^{-\rho |i- j|}. \end{aligned}$$
    (4.4)
    $$\begin{aligned}{} & {} \left\| \lim _{t\rightarrow \infty }\frac{\partial P^{(z_{i+t})}}{\partial \bar{z}_{j- t}}\right\| _{D(r,s)\times \mathcal {O}} \le \varepsilon _0\textrm{e}^{-\rho |i+ j|}. \end{aligned}$$
    (4.5)
    $$\begin{aligned}{} & {} \left\| \lim _{t\rightarrow \infty }\frac{\partial P^{(\bar{z}_{i+t})}}{\partial \bar{z}_{j+ t}}\right\| _{D(r,s)\times \mathcal {O}} \le \varepsilon _0\textrm{e}^{-\rho |i-j|}. \end{aligned}$$
    (4.6)
    $$\begin{aligned}{} & {} \left\| \lim _{t\rightarrow \infty }\frac{\partial P^{(\bar{z}_{i+t})}}{\partial z_{j- t}}\right\| _{D(r,s)\times \mathcal {O}} \le \varepsilon _0\textrm{e}^{-\rho |i+ j|}. \end{aligned}$$
    (4.7)

    Furthermore, there exists \(K>0\) such that when \(|t|>K,\) the following estimates hold.

    $$\begin{aligned}{} & {} \left\| \widetilde{\Omega }_{j+t}-\lim _{t\rightarrow \infty }\widetilde{\Omega }_{j+t}\right\| _{\mathcal {O}}\le \frac{\varepsilon _0}{|t|}, \end{aligned}$$
    (4.8)
    $$\begin{aligned}{} & {} \left\| \frac{\partial P^{(z_{i+t})}}{\partial z_{j+ t}}-\lim _{t\rightarrow \infty }\frac{\partial P^{(z_{i+t})}}{\partial z_{j+ t}}\right\| _{D(r,s)\times \mathcal {O}}\le \frac{\varepsilon _0}{|t|}\textrm{e}^{-\rho |i- j|}, \end{aligned}$$
    (4.9)
    $$\begin{aligned}{} & {} \left\| \frac{\partial P^{(z_{i+t})}}{\partial \bar{z}_{j- t}}-\lim _{t\rightarrow \infty }\frac{\partial P^{(z_{i+t})}}{\partial \bar{z}_{j- t}}\right\| _{D(r,s)\times \mathcal {O}}\le \frac{\varepsilon _0}{|t|}\textrm{e}^{-\rho |i+ j|}, \end{aligned}$$
    (4.10)
    $$\begin{aligned}{} & {} \left\| \frac{\partial P^{(\bar{z}_{i+t})}}{\partial \bar{z}_{j+ t}}-\lim _{t\rightarrow \infty }\frac{\partial P^{(\bar{z}_{i+t})}}{\partial \bar{z}_{j+ t}}\right\| _{D(r,s)\times \mathcal {O}}\le \frac{\varepsilon _0}{|t|}\textrm{e}^{-\rho |i- j|}, \end{aligned}$$
    (4.11)
    $$\begin{aligned}{} & {} \left\| \frac{\partial P^{(\bar{z}_{i+t})}}{\partial z_{j- t}}-\lim _{t\rightarrow \infty }\frac{\partial P^{(\bar{z}_{i+t})}}{\partial z_{j- t}}\right\| _{D(r,s)\times \mathcal {O}}\le \frac{\varepsilon _0}{|t|}\textrm{e}^{-\rho |i+ j|}, \end{aligned}$$
    (4.12)

    here when \(i\le 0\) or \(j\le 0,\) \(\frac{\partial P^{(z_{i})}}{\partial z_{j}}, \frac{\partial P^{(z_{i})}}{\partial \bar{z}_{j}}, \frac{\partial P^{(\bar{z}_{i})}}{\partial \bar{z}_{j}}, \frac{\partial P^{(\bar{z}_{i})}}{\partial z_{j}}\equiv 0.\)
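The frequency part of (A4), namely (4.3) and (4.8), can be illustrated on a model sequence. The sketch below is a toy example under stated assumptions: the sequence \(\widetilde{\Omega }_j=a+b/j\) is hypothetical (it is not taken from the wave equation), only positive shifts \(t\) are sampled, and the function name is illustrative.

```python
def toplitz_lipschitz_defect(omega_tilde, limit, j_max, t_min, t_max):
    """Largest value of t * |omega_tilde(j + t) - limit| over the sampled range.

    If this quantity stays below eps0, the Lipschitz-type estimate (4.8)
    holds on the sampled indices 1 <= j <= j_max, t_min <= t <= t_max.
    """
    return max(
        t * abs(omega_tilde(j + t) - limit)
        for j in range(1, j_max + 1)
        for t in range(t_min, t_max + 1)
    )
```

For \(\widetilde{\Omega }_j=a+b/j\) the limit is \(a\) and \(|t|\,|\widetilde{\Omega }_{j+t}-a|=|b|\,t/(j+t)<|b|\) for \(t>0,\) so (4.3) and (4.8) hold with any \(\varepsilon _0\ge \max \{|a|,|b|\}.\)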

Remark 4.1

If a vector field P satisfies properties (4.4)–(4.7) and (4.9)–(4.12), then it is called a Töplitz–Lipschitz vector field. Here we only give the definition of Töplitz–Lipschitz vector fields. One can prove that the Lie bracket of two Töplitz–Lipschitz vector fields and the solution of the homological equations still satisfy the Töplitz–Lipschitz property. This means that the Töplitz–Lipschitz property is preserved along the KAM iteration. We will prove these basic properties in Proposition 5.2 and Lemma 6.3 below.

Our KAM theorem is stated as follows.

Theorem 4.1

Assume that the real analytic \(S-\)reversible vector field (4.1) satisfies assumptions \((A1){-}(A4)\) above. Then for every sufficiently small \(\gamma >0,\) there exists \(\varepsilon >0\) depending on \( \tau ,\) \(\gamma ,\) \(A_0,\) \(A_1,\) \(A_2,\) \(r,\) \(s,\) \(\alpha ,\) and \(\rho ,\) such that if \(\Vert P\Vert _{s; D(r,s)\times \mathcal {O}}\le \varepsilon ,\) then there is a non-empty subset \(\mathcal {O}_\gamma \subseteq \mathcal {O}\) of positive Lebesgue measure, and an \(S-\)invariant transformation \(\Phi \) of the form

$$\begin{aligned} (\theta , z, \bar{z};\xi )\mapsto (\theta , W(\theta , z, \bar{z};\xi ), \overline{W}(\theta , z, \bar{z};\xi )),\,\xi \in \mathcal {O}_\gamma , \end{aligned}$$

where W and \(\overline{W}\) are \(C^\infty \) in \(\theta \) and affine in \((z, \bar{z}),\) such that \(\Phi \) transforms the vector field (4.1) into

$$\begin{aligned} \Phi ^*X=N_*+P_* \end{aligned}$$

where

$$\begin{aligned} N_*=\omega \frac{\partial }{\partial \theta }+\textrm{i}\left( \Omega (\xi )+B_*(\theta ;\xi )\right) z\frac{\partial }{\partial z}-\textrm{i}\left( \Omega (\xi )+\overline{B_*(\theta ;\xi )}\right) \bar{z}\frac{\partial }{\partial \bar{z}}, \end{aligned}$$

\(B_*\in C^\infty (\mathbb {T}^2, \mathbb {R})\), \(\overline{B_*(\theta )}=B_*(-\theta ),\) and

$$\begin{aligned} P_*=\sum \limits _{\textsf {v}\in \{z_j, \bar{z}_j\}}\sum \limits _{|\alpha |+|\beta |\ge 2}P^{(\textsf {v})}_{*\alpha \beta }(\theta ; \xi )z^{\alpha }\bar{z}^{\beta }\frac{\partial }{\partial \textsf {v}},\,\xi \in \mathcal {O}_\gamma . \end{aligned}$$

Moreover, \({{\,\textrm{meas}\,}}(\mathcal {O}\setminus \mathcal {O}_\gamma )=O(\gamma ).\)

The proof of the theorem is given in Sect. 6.

5 Homological Equation and Töplitz–Lipschitz Property of Solutions

Consider the homological equation (the unknown is F)

$$\begin{aligned}{}[N, F]+R=\llbracket R\rrbracket \end{aligned}$$
(5.1)

on \(D_\rho (r,s)\times \mathcal {O},\) where

$$\begin{aligned} N=\omega \frac{\partial }{\partial \theta }+\textrm{i}\Omega (\theta ;\xi )z\frac{\partial }{\partial z}-\textrm{i}\overline{\Omega (\theta ;\xi )}\bar{z}\frac{\partial }{\partial \bar{z}} \end{aligned}$$

with fixed \(\omega =(1,\alpha )\) (\(\alpha \in \mathbb {R}{\setminus }\mathbb {Q}\)) and \(\Omega (\theta ;\xi )=\Omega (\xi )+B(\theta ;\xi )+b(\theta ;\xi )\). The normal frequencies \(\Omega _j(\xi ),\,j\ge 1,\) satisfy (4.2). The frequency drifts \(B(\theta )\) and \(b(\theta )\) are both real analytic on D(r) and satisfy \(\overline{B(\theta )}=B(-\theta ),\) \(\overline{b(\theta )}=b(-\theta ).\) This implies that N is a reversible vector field with respect to the involution \(S:(\theta , z, \bar{z})\mapsto (-\theta , \bar{z}, z)\). R is also an \(S-\)reversible vector field of the form:

$$\begin{aligned} \begin{aligned} R=&(R^{z}(\theta ;\xi )+R^{zz}(\theta ;\xi )z+R^{z\bar{z}}(\theta ;\xi )\bar{z})\frac{\partial }{\partial z}\\&+(R^{\bar{z}}(\theta ;\xi )+R^{\bar{z}z}(\theta ;\xi )z+R^{\bar{z}\bar{z}}(\theta ;\xi )\bar{z})\frac{\partial }{\partial \bar{z}}. \end{aligned} \end{aligned}$$
(5.2)

\(\llbracket R\rrbracket \) is the \(\theta \)-dependent normal form of R:

$$\begin{aligned} \begin{aligned} \llbracket R\rrbracket =&diag R^{zz}(\theta )z\frac{\partial }{\partial z}+diag R^{\bar{z}\bar{z}}(\theta )\bar{z}\frac{\partial }{\partial \bar{z}}\\ =&\sum _{j\ge 1}R^{z_jz_j}(\theta )z_j\frac{\partial }{\partial z_j}+\sum _{j\ge 1}R^{\bar{z}_j\bar{z}_j}(\theta )\bar{z}_j\frac{\partial }{\partial \bar{z}_j}. \end{aligned} \end{aligned}$$

For \(f(\theta )=(f_j(\theta ):j\ge 1)\) on D(r),  we define the norm

$$\begin{aligned} \Vert f\Vert _{\infty ,D(r)}=\sup \limits _{j}\Vert f_j\Vert _{D(r)}. \end{aligned}$$

Moreover, suppose \(\Omega (\xi )+[B(\theta )]\in \mathcal{M}\mathcal{C}_\omega \left( \gamma , \tau , K, \mathcal {O}\right) ,\) where for \(\tau \ge 10,\) \(0<\gamma \le 1,\) \(K>0,\) the non-resonance set

$$\begin{aligned} \begin{aligned}&\mathcal{M}\mathcal{C}_\omega \left( \gamma , \tau , K, \mathcal {O}\right) \\ {}&\quad := \left\{ \acute{\Omega }(\xi ):\xi \in \mathcal {O}, |\langle k, \omega \rangle +\langle l, \acute{\Omega }(\xi )\rangle |\ge \frac{\gamma }{\langle k\rangle ^\tau },\, \forall |k|\le K,\, 1\le |l|\le 2\right\} . \end{aligned} \end{aligned}$$
(5.3)

During the KAM iteration, it is enough to obtain an approximate solution of the homological equation (5.1) above. The following proposition gives the existence and an estimate of such an approximate solution. A similar proposition was given in [52].

Proposition 5.1

Let \(\{ Q_k\}\) be the selected subsequence of \(\alpha \) in Lemmas 3.1 and 3.2 with respect to \(\mathcal {A}\ge 14,\) also let \(\gamma >0,\,0<\zeta ,\,\tilde{\zeta }<1, \) and \(0<5\sigma<\tilde{r}<r.\) If all above assumptions on N and R are satisfied and

$$\begin{aligned} \Vert B\Vert _{\infty ,D(r)\times \mathcal {O}}\le \zeta , \end{aligned}$$
(5.4)
$$\begin{aligned} \Vert b\Vert _{\infty ,D(r)\times \mathcal {O}}\le \tilde{\zeta }. \end{aligned}$$
(5.5)

Furthermore for some \(n\ge 1,\) the following three assumptions are satisfied:

  1. (i)
    $$\begin{aligned} 360\tilde{r}Q_{n+1}\zeta \le (r-\tilde{r})^{3}, \end{aligned}$$
    (5.6)
  2. (ii)
    $$\begin{aligned} 512\textrm{e}^{-\frac{r-\tilde{r}}{2}Q_{n+1}}\zeta \le \tilde{\zeta }^{\frac{1}{2}}(r-\tilde{r})^2, \end{aligned}$$
    (5.7)
  3. (iii)
    $$\begin{aligned} 2C_0K^{2\tau +1} \tilde{\zeta }^{\frac{1}{2}}\le \gamma ^2\sigma ^2, \end{aligned}$$
    (5.8)

where \(C_0>0\) is a constant depending only on \(A_0,\, A_1, A_2\), then the homological equation (5.1) above has a unique approximate solution F of the same form as R satisfying \(\llbracket F \rrbracket =0,\) \(S^*F=F\) and the estimate

$$\begin{aligned} \Vert F\Vert _{s;D_{\rho }(\tilde{r}-2\sigma ,s)\times \mathcal {O}}\le \frac{C_0\zeta ^2Q^2_{n+1}K^{2\tau +1}}{\gamma ^{2}\sigma ^2(r-\tilde{r})^4}\Vert R\Vert _{s;D_{\rho }(r,s)\times \mathcal {O}}. \end{aligned}$$
(5.9)

Moreover, the S-reversible error term \(\breve{R}\) satisfies

$$\begin{aligned} \begin{aligned}&\Vert \breve{R}\Vert _{s;D_{\rho }(\tilde{r}-4\sigma ,s)\times \mathcal {O}}\\&\quad \le \frac{C_0 \zeta ^2 Q^2_{n+1}}{ \sigma ^{2} (r-\tilde{r})^{2}}\textrm{e}^{-K\sigma } \left( \Vert R\Vert _{s;D_{\rho }(r,s)\times \mathcal {O}}+\tilde{\zeta }^{\frac{1}{2}}\Vert F\Vert _{s;D_{\rho }(\tilde{r}-2\sigma ,s)\times \mathcal {O}}\right) .\\ \end{aligned} \end{aligned}$$
(5.10)

Proof

Suppose F has the same form as R. By the definition of Lie bracket, Eq. (5.1) can be rewritten as the following scalar form

$$\begin{aligned}{} & {} R^{z_i}- \partial _{\omega }F^{z_i}+\textrm{i}\Omega _i(\theta )F^{z_i}=0, \end{aligned}$$
(5.11)
$$\begin{aligned}{} & {} R^{\bar{z}_i}- \partial _{\omega }F^{\bar{z}_i}-\textrm{i}\overline{\Omega _i(\theta )}F^{\bar{z}_i}=0, \end{aligned}$$
(5.12)
$$\begin{aligned}{} & {} R^{z_i \bar{z}_j}- \partial _{\omega }F^{z_i \bar{z}_j}+ \textrm{i}\Omega _i(\theta )F^{z_i \bar{z}_j}+\textrm{i}F^{z_i \bar{z}_j}\overline{\Omega _j(\theta )}=0, \end{aligned}$$
(5.13)
$$\begin{aligned}{} & {} R^{\bar{z}_i z_j}- \partial _{\omega }F^{\bar{z}_i z_j}-\textrm{i}\overline{\Omega _i(\theta )}F^{\bar{z}_i z_j}-\textrm{i}F^{\bar{z}_i z_j}\Omega _j(\theta )=0, \end{aligned}$$
(5.14)
$$\begin{aligned}{} & {} R^{z_i z_j}- \partial _{\omega }F^{z_i z_j}+\textrm{i}\Omega _i(\theta )F^{z_i z_j}-\textrm{i}F^{z_i z_j}\Omega _j(\theta )=\delta _{ij} R^{z_i z_j}, \end{aligned}$$
(5.15)
$$\begin{aligned}{} & {} R^{\bar{z}_i \bar{z}_j}- \partial _{\omega }F^{\bar{z}_i \bar{z}_j}-\textrm{i}\overline{\Omega _i(\theta )}F^{\bar{z}_i \bar{z}_j}+\textrm{i}F^{\bar{z}_i \bar{z}_j}\overline{\Omega _j(\theta )}=\delta _{ij} R^{\bar{z}_i \bar{z}_j}, \end{aligned}$$
(5.16)

where \(\delta _{ij}\) is the Kronecker delta symbol.

In what follows, we only give the details of solving Eq. (5.15). The other five equations can be treated in the same way, and the details are omitted.

Consider Eq. (5.15). If \(i=j,\) let \(F^{z_j z_j}(\theta )=0.\) If \(i\ne j,\) we solve

$$\begin{aligned} \partial _{\omega }F^{z_i z_j}-\textrm{i}\Omega _i(\theta )F^{z_i z_j}+\textrm{i}F^{z_i z_j}\Omega _j(\theta )=R^{z_i z_j}. \end{aligned}$$
(5.17)

In the proof, denote \(\Omega _{ij}(\theta ):=\Omega _{i}(\theta )-\Omega _{j}(\theta ),\) and similarly for \(\Omega _{ij}(\xi ), B_{ij}(\theta )\) and \( b_{ij}(\theta ).\) Then we rewrite the equation above as

$$\begin{aligned} \partial _{\omega }F^{z_i z_j}-\textrm{i}(\Omega _{ij}(\xi )+[B_{ij}(\theta )])F^{z_i z_j}-\textrm{i}(B_{ij}(\theta )-[B_{ij}(\theta )])F^{z_i z_j}-\textrm{i}b_{ij}(\theta )F^{z_i z_j}=R^{z_i z_j}, \end{aligned}$$
(5.18)

Let \(\partial _\omega \beta _{ij}=\mathcal {T}_{ Q_{n+1}}B_{ij}(\theta )-[B_{ij}],\) then one can verify that

$$\begin{aligned} \beta _{ij}(\theta )=\sum \limits _{0<|k|\le Q_{n+1}}\frac{\widehat{B_{ij}}(k)}{\textrm{i}\langle k, \omega \rangle }\textrm{e}^{\textrm{i}\langle k, \theta \rangle }, \end{aligned}$$
(5.19)

is its unique solution. We have the estimate for \(\beta _{ij}\):

$$\begin{aligned} \begin{aligned} \Vert \beta _{ij}\Vert _{D(\tilde{r})\times \mathcal {O}} \le&\sum \limits _{0<|k|\le Q_{n+1}}\frac{|\widehat{B_{ij}}(k)|_{\mathcal {O}}}{|\langle k, \omega \rangle |}\textrm{e}^{|k|\tilde{r}}\\ \le&2Q_{n+1}\sum \limits _{0<|k|\le Q_{n+1}}2\Vert B_{ij}\Vert _{D(r)\times \mathcal {O}}\textrm{e}^{-|k|(r-\tilde{r})}\\ \le&\frac{256Q_{n+1}}{(r-\tilde{r})^2}\Vert B_{ij}\Vert _{D(r)\times \mathcal {O}}, \end{aligned} \end{aligned}$$
(5.20)

here we use (3.3), (3.4) and the inequality: for \(\sigma >0,\) \(\sum \limits _{k\in \mathbb {Z}^2}\textrm{e}^{-2|k|\sigma }\le (1+\textrm{e})^2\sigma ^{-2}.\)
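The construction of \(\beta _{ij}\) in (5.19) is a division of Fourier coefficients by the small divisors \(\textrm{i}\langle k, \omega \rangle .\) The following sketch illustrates it for a trigonometric polynomial given by a dictionary of coefficients. It assumes the convention \(|k|=|k_1|+|k_2|;\) the data layout and the function names are choices made for this example only.

```python
import cmath

def solve_beta(B_hat, omega, Q):
    """Fourier coefficients of beta as in (5.19), for modes with 0 < |k| <= Q."""
    beta_hat = {}
    for (k1, k2), coeff in B_hat.items():
        if 0 < abs(k1) + abs(k2) <= Q:
            # <k, omega> is non-zero for k != 0 because alpha is irrational.
            beta_hat[(k1, k2)] = coeff / (1j * (k1 * omega[0] + k2 * omega[1]))
    return beta_hat

def evaluate(coeffs, theta):
    """Value at theta of the trigonometric polynomial with the given coefficients."""
    return sum(
        c * cmath.exp(1j * (k[0] * theta[0] + k[1] * theta[1]))
        for k, c in coeffs.items()
    )

def omega_derivative(coeffs, omega, theta):
    """Value at theta of the derivative along omega of the same polynomial."""
    return sum(
        1j * (k[0] * omega[0] + k[1] * omega[1]) * c
        * cmath.exp(1j * (k[0] * theta[0] + k[1] * theta[1]))
        for k, c in coeffs.items()
    )
```

Evaluating `omega_derivative` on the output of `solve_beta` reproduces the truncated function with its mean removed, which is exactly the defining relation \(\partial _\omega \beta _{ij}=\mathcal {T}_{ Q_{n+1}}B_{ij}-[B_{ij}].\)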

Let

$$\begin{aligned}{} & {} u_{ij}=F^{z_i z_j}\textrm{e}^{-\textrm{i}\beta _{ij}},\,\,v_{ij}=R^{z_i z_j}\textrm{e}^{-\textrm{i}\beta _{ij}}, \\{} & {} \underline{b}_{ij}(\theta )=\mathcal {R}_{ Q_{n+1}}B_{ij}(\theta )+b_{ij}(\theta ). \end{aligned}$$

Then Eq. (5.18) is transformed into an equation for \(u_{ij}.\) However, it is difficult to solve this equation exactly, so we solve instead its approximate equation, i.e.,

$$\begin{aligned} \partial _{\omega }u_{ij}-\textrm{i}(\Omega _{ij}(\xi )+[B_{ij}(\theta )])u_{ij}-\textrm{i}\mathcal {T}_K (\underline{b}_{ij}u_{ij})=\mathcal {T}_K v_{ij}, \end{aligned}$$
(5.21)

and the error term

$$\begin{aligned} \breve{R}_3=\sum _{\varrho =\pm }\sum _{i}\sum _{j\ne i}\textrm{e}^{\varrho \textrm{i}\beta ^\varrho _{ij}}\mathcal {R}_K\left( \textrm{e}^{-\varrho \textrm{i}\beta ^\varrho _{ij}}R^{z^\varrho _i z^\varrho _j}+\varrho \textrm{i}\textrm{e}^{-\varrho \textrm{i}\beta ^\varrho _{ij}} \underline{b}^\varrho _{ij} F^{z^\varrho _i z^\varrho _j}\right) z^{\varrho }_j\frac{\partial }{\partial z^\varrho _i}, \end{aligned}$$
(5.22)

here \(\beta ^\varrho _{ij}\) is a function determined by \(B^\varrho _{ij}\) and can be defined as in (5.19).

Let \(u_{ij}=\sum \limits _{|k|\le K, k\in \mathbb {Z}^2}\hat{u}_{ij}(k)\textrm{e}^{\textrm{i}\langle k, \theta \rangle },\) \(v_{ij}=\sum \limits _{k\in \mathbb {Z}^2}\hat{v}_{ij}(k)\textrm{e}^{\textrm{i}\langle k, \theta \rangle },\) \(\underline{b}_{ij}=\sum \limits _{ k\in \mathbb {Z}^2}\widehat{\underline{b}_{ij}}(k)\textrm{e}^{\textrm{i}\langle k, \theta \rangle }.\) We have, for \(|k|\le K,\)

$$\begin{aligned} \textrm{i}\left( \langle k, \omega \rangle -(\Omega _{ij}(\xi )+[B_{ij}(\theta )])\right) \hat{u}_{ij}(k)-\textrm{i}\sum \limits _{|l|\le K}\widehat{\underline{b}_{ij}}(k-l)\hat{u}_{ij}(l)=\hat{v}_{ij}(k). \end{aligned}$$

Rewrite it as a vector equation,

$$\begin{aligned} (A_{ij}+D_{ij})\mathfrak {X}_{ij}=\mathcal {B}_{ij} \end{aligned}$$
(5.23)

where

$$\begin{aligned}{} & {} \mathfrak {X}_{ij}=(\hat{u}_{ij}(k): |k|\le K)^T, \\{} & {} A_{ij}=diag\left( \textrm{i}\langle k, \omega \rangle -\textrm{i}(\Omega _{ij}(\xi )+[B_{ij}(\theta )]): |k|\le K\right) , \\{} & {} D_{ij}=(-\textrm{i}\widehat{\underline{b}_{ij}}(k-l): |k|, |l|\le K), \\{} & {} \mathcal {B}_{ij}=(\hat{v}_{ij}(k): |k|\le K)^T, \end{aligned}$$

Denote \(\Lambda _{\tilde{r}-2\sigma }:=diag\left( \textrm{e}^{|k|(\tilde{r}-2\sigma )}: |k|\le K\right) ,\) then

$$\begin{aligned} (A_{ij}+\Lambda _{\tilde{r}-2\sigma } D_{ij}\Lambda ^{-1}_{\tilde{r}-2\sigma })\Lambda _{\tilde{r}-2\sigma }\mathfrak {X}_{ij}=\Lambda _{\tilde{r}-2\sigma }\mathcal {B}_{ij} \end{aligned}$$
(5.24)
$$\begin{aligned} \Lambda _{\tilde{r}-2\sigma } D_{ij}\Lambda ^{-1}_{\tilde{r}-2\sigma }=(-\textrm{i}\textrm{e}^{(|l|-|k|)(\tilde{r}-2\sigma )}\widehat{\underline{b}_{ij}}(l-k): |k|, |l|\le K). \end{aligned}$$

To solve Eq. (5.24), we will use the non-resonance condition (5.3): \(\Omega (\xi )+[B(\theta )]\in \mathcal{M}\mathcal{C}_\omega (\gamma , \tau , K, \mathcal {O}).\)

Recall that

$$\begin{aligned}{} & {} \Omega _j(\xi )=d(\xi )j+\tilde{\Omega }_j(\xi ),\,j\ge 1, \\{} & {} \Omega (\xi )=(\Omega _j(\xi ):j\ge 1),\, B(\theta ;\xi )=(B_j(\theta ;\xi ):j\ge 1). \end{aligned}$$

Assume \(\zeta < \frac{A_1}{8}\) and \(\mathcal {C}>\frac{4}{4A_0-3A_1}\) is a constant. It is not difficult to prove that there exists a constant \(C_0>0\) depending only on \(A_0,\, A_1, A_2\) such that

$$\begin{aligned} \Vert A^{-1}_{ij}\Vert _{\mathcal {O}}\le \frac{C_0K^{2\tau +1}}{\gamma ^{2}}. \end{aligned}$$
(5.25)

For \(\Lambda _{\tilde{r}-2\sigma } D_{ij}\Lambda ^{-1}_{\tilde{r}-2\sigma },\) we have \(\Vert \Lambda _{\tilde{r}-2\sigma } D_{ij}\Lambda ^{-1}_{\tilde{r}-2\sigma }\Vert _{\mathcal {O}} \le 64\sigma ^{-2}\Vert \underline{b}\Vert _{\infty ,D(\tilde{r})\times \mathcal {O}}.\) By assumption (ii) in (5.7),

$$\begin{aligned} \begin{aligned}&\Vert \underline{b}\Vert _{\infty ,D(\tilde{r})\times \mathcal {O}} \le \frac{32}{(\frac{r-\tilde{r}}{2})^2}\textrm{e}^{-\frac{r-\tilde{r}}{2}Q_{n+1}}\Vert B\Vert _{\infty ,D(\tilde{r})\times \mathcal {O}}+2\Vert b\Vert _{\infty ,D(\tilde{r})\times \mathcal {O}}\\&\quad \le \frac{256}{(r-\tilde{r})^2}\textrm{e}^{-\frac{r-\tilde{r}}{2}Q_{n+1}}\zeta +2\tilde{\zeta } \le \tilde{\zeta }^{\frac{1}{2}}, \end{aligned} \end{aligned}$$

and by (5.25) and assumption (iii) in (5.8),

$$\begin{aligned} \begin{aligned} \Vert A^{-1}_{ij}\Lambda _{\tilde{r}-2\sigma } D_{ij}\Lambda ^{-1}_{\tilde{r}-2\sigma }\Vert _{\mathcal {O}} \le&\frac{C_0K^{2\tau +1}}{\gamma ^{2}\sigma ^{2}} \tilde{\zeta }^{\frac{1}{2}}\le 1/2. \end{aligned} \end{aligned}$$

This implies that \(A_{ij}+\Lambda _{\tilde{r}-2\sigma } D_{ij}\Lambda ^{-1}_{\tilde{r}-2\sigma }\) has a bounded inverse, with

$$\begin{aligned} \begin{aligned} \Vert (A_{ij}+\Lambda _{\tilde{r}-2\sigma } D_{ij}\Lambda ^{-1}_{\tilde{r}-2\sigma })^{-1}\Vert _{\mathcal {O}} \le&\Vert A^{-1}_{ij}\Vert _{\mathcal {O}}\frac{1}{1-\Vert A^{-1}_{ij}\Lambda _{\tilde{r}-2\sigma } D_{ij}\Lambda ^{-1}_{\tilde{r}-2\sigma }\Vert _{\mathcal {O}}}\\ \le&\frac{C_0K^{2\tau +1}}{\gamma ^{2}}. \end{aligned} \end{aligned}$$
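The invertibility argument above is the usual Neumann series bound for a small perturbation of an invertible diagonal matrix. The sketch below illustrates it on a toy system; the fixed-point iteration, the function names and the sample \(3\times 3\) data are assumptions of this example and are not the construction used in the proof.

```python
def solve_perturbed_diagonal(a_diag, D, b, iterations=60):
    """Solve (A + D) x = b with A = diag(a_diag) by iterating x <- A^{-1}(b - D x).

    The iteration converges when the maximal row sum of |D[i][j]| / |a_diag[i]|
    is below 1, which mirrors the smallness condition ||A^{-1} D|| <= 1/2.
    """
    n = len(b)
    x = [b[i] / a_diag[i] for i in range(n)]
    for _ in range(iterations):
        x = [
            (b[i] - sum(D[i][j] * x[j] for j in range(n))) / a_diag[i]
            for i in range(n)
        ]
    return x

def residual(a_diag, D, x, b):
    """Sup norm of (A + D) x - b."""
    n = len(b)
    return max(
        abs(a_diag[i] * x[i] + sum(D[i][j] * x[j] for j in range(n)) - b[i])
        for i in range(n)
    )
```

When \(\Vert A^{-1}D\Vert \le 1/2\) in the sup norm, the solution satisfies \(\Vert x\Vert \le 2\Vert A^{-1}\Vert \Vert b\Vert ,\) which is the same factor 2 that is absorbed into \(C_0\) in the estimate above.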

Then

$$\begin{aligned} \begin{aligned} \Vert u_{ij}\Vert _{D(\tilde{r}-2\sigma )\times \mathcal {O}} \le&\Vert \Lambda _{\tilde{r}-2\sigma }\mathfrak {X}_{ij}\Vert _{\mathcal {O}}\\ \le&\Vert (A_{ij}+\Lambda _{\tilde{r}-2\sigma } D_{ij}\Lambda ^{-1}_{\tilde{r}-2\sigma })^{-1}\Vert _{\mathcal {O}}\Vert \Lambda _{\tilde{r}-2\sigma }\mathcal {B}_{ij}\Vert _{\mathcal {O}}\\ \le&\frac{C_0K^{2\tau +1}}{\gamma ^{2}\sigma ^2}\Vert v_{ij}\Vert _{D(\tilde{r}-\sigma )\times \mathcal {O}} \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} \Vert v_{ij}\Vert _{D(\tilde{r}-\sigma )\times \mathcal {O}}\le&\Vert \textrm{e}^{-\textrm{i}\beta _{ij}}\Vert _{D(\tilde{r}-\sigma )\times \mathcal {O}}\Vert R^{z_iz_j}\Vert _{D(\tilde{r}-\sigma )\times \mathcal {O}}.\\ \end{aligned} \end{aligned}$$

Since \(F^{z_i z_j}=u_{ij}\textrm{e}^{\textrm{i}\beta _{ij}},\) \(R^{z_i z_j}=v_{ij}\textrm{e}^{\textrm{i}\beta _{ij}},\)

$$\begin{aligned} \begin{aligned} \Vert F^{z_i z_j}\Vert _{D(\tilde{r}-2\sigma )\times \mathcal {O}}\le&\Vert \textrm{e}^{\textrm{i}\beta _{ij}}\Vert _{D(\tilde{r}-2\sigma )\times \mathcal {O}}\Vert u_{ij}\Vert _{D(\tilde{r}-2\sigma )\times \mathcal {O}}\\ \le&\Vert \textrm{e}^{\textrm{i}\beta _{ij}}\Vert _{D(\tilde{r})\times \mathcal {O}}\Vert \textrm{e}^{-\textrm{i}\beta _{ij}}\Vert _{D(\tilde{r})\times \mathcal {O}}\frac{C_0K^{2\tau +1}}{\gamma ^{2}\sigma ^2}\Vert R^{z_iz_j}\Vert _{D(\tilde{r}-\sigma )\times \mathcal {O}}\\ \end{aligned} \end{aligned}$$

Now we estimate \(\Vert \textrm{e}^{\pm \textrm{i}\beta _{ij}}\Vert _{D(\tilde{r})\times \mathcal {O}}.\) By assumptions (5.4): \(\Vert B\Vert _{\infty ,D(r)\times \mathcal {O}}\le \zeta \) and (i): \( 360\tilde{r}Q_{n+1}\zeta \le (r-\tilde{r})^{3}\) in (5.6), we have

$$\begin{aligned} \begin{aligned}&\Vert \textrm{e}^{\pm \textrm{i}\beta _{ij}}\Vert _{D(\tilde{r})\times \mathcal {O}}\\&\quad \le \left( 1+\frac{Q_{n+1}}{(r-\tilde{r})^2}\right) \Vert B_{ij}\Vert _{D(r)\times \mathcal {O}}\textrm{e}^{90\tilde{r}Q_{n+1}(r-\tilde{r})^{-3}\Vert B_{ij}\Vert _{D(r)\times \mathcal {O}}}\\&\quad \le \frac{4Q_{n+1}}{(r-\tilde{r})^2}\Vert B\Vert _{D(r)\times \mathcal {O}}\textrm{e}^{180\tilde{r}Q_{n+1}(r-\tilde{r})^{-3}\zeta }\\&\quad \le \frac{8Q_{n+1}}{(r-\tilde{r})^2}\zeta . \end{aligned} \end{aligned}$$
(5.26)

Then

$$\begin{aligned} \begin{aligned} \Vert F^{z_i z_j}\Vert _{D(\tilde{r}-2\sigma )\times \mathcal {O}}&\le \Vert \textrm{e}^{\textrm{i}\beta _{ij}}\Vert _{D(\tilde{r})\times \mathcal {O}}\Vert \textrm{e}^{-\textrm{i}\beta _{ij}}\Vert _{D(\tilde{r})\times \mathcal {O}}\frac{C_0K^{2\tau +1}}{\gamma ^{2}\sigma ^2}\Vert R^{z_iz_j}\Vert _{D(\tilde{r}-\sigma )\times \mathcal {O}}\\&\le \left( \frac{8Q_{n+1}}{(r-\tilde{r})^2}\zeta \right) ^2\frac{C_0K^{2\tau +1}}{\gamma ^{2}\sigma ^2}\Vert R^{z_iz_j}\Vert _{D(\tilde{r}-\sigma )\times \mathcal {O}}\\&\le \frac{C_0\zeta ^2Q^2_{n+1}K^{2\tau +1}}{\gamma ^{2}\sigma ^2(r-\tilde{r})^4}\Vert R^{z_iz_j}\Vert _{D(r)\times \mathcal {O}}.\\ \end{aligned} \end{aligned}$$

We now estimate the vector field \(F=F^{(z)}\frac{\partial }{\partial z}+F^{(\bar{z})}\frac{\partial }{\partial \bar{z}}.\) It suffices to estimate \(F^{(z)}\), since \(F^{(\bar{z})}\) is treated in the same way. It follows from the above analysis that

$$\begin{aligned} \begin{aligned}&\Vert F^{(z_i)}\Vert _{D(\tilde{r}-2\sigma )\times \mathcal {O}}\\&\quad =\Vert F^{z_i}\Vert _{D(\tilde{r}-2\sigma )\times \mathcal {O}}+\sum \limits _{j}\Vert F^{z_iz_j}\Vert _{D(\tilde{r}-2\sigma )\times \mathcal {O}}|z_j|+\sum \limits _{j}\Vert F^{z_i\bar{z}_j}\Vert _{D(\tilde{r}-2\sigma )\times \mathcal {O}}|\bar{z}_j|\\&\quad \le \frac{C_0\zeta ^2Q^2_{n+1}K^{2\tau +1}}{\gamma ^{2}\sigma ^2(r-\tilde{r})^4}\left( \Vert R^{z_i}\Vert _{D(\tilde{r})\times \mathcal {O}}+\sum \limits _{j}\Vert R^{z_iz_j}\Vert _{D(\tilde{r})\times \mathcal {O}}|z_j| +\sum \limits _{j}\Vert R^{z_i\bar{z}_j}\Vert _{D(\tilde{r})\times \mathcal {O}}|\bar{z}_j|\right) \\&\quad =\frac{C_0\zeta ^2Q^2_{n+1}K^{2\tau +1}}{\gamma ^{2}\sigma ^2(r-\tilde{r})^4}\Vert R^{(z_i)}\Vert _{D(r)\times \mathcal {O}}, \end{aligned} \end{aligned}$$

then

$$\begin{aligned} \Vert F\Vert _{s;D(\tilde{r}-2\sigma ,s)\times \mathcal {O}}\le \frac{C_0\zeta ^2Q^2_{n+1}K^{2\tau +1}}{\gamma ^{2}\sigma ^2(r-\tilde{r})^4}\Vert R\Vert _{s;D(r,s)\times \mathcal {O}}. \end{aligned}$$

We turn to the estimate for the error term \( \breve{R}\) in (5.22). By (3.1) and (5.26),

$$\begin{aligned} \begin{aligned}&\Vert \breve{R}^{z_iz_j}\Vert _{D(\tilde{r}-4\sigma )\times \mathcal {O}}\\&\quad =\Vert \textrm{e}^{\textrm{i}\beta _{ij}}\mathcal {R}_K\left( \textrm{e}^{-\textrm{i}\beta _{ij}}R^{z_iz_j}+\textrm{i}\textrm{e}^{-\textrm{i}\beta _{ij}} \underline{b}_{ij}F^{z_iz_j}\right) \Vert _{D(\tilde{r}-4\sigma )\times \mathcal {O}}\\&\quad \le \Vert \textrm{e}^{\textrm{i}\beta _{ij}}\Vert _{D(\tilde{r}-4\sigma )\times \mathcal {O}}32\sigma ^{-2}\textrm{e}^{-K\sigma }\Vert \textrm{e}^{-\textrm{i}\beta _{ij}}\Vert _{D(\tilde{r}-2\sigma )\times \mathcal {O}}\Vert R^{z_iz_j}+\textrm{i}\underline{b}_{ij} F^{z_iz_j}\Vert _{D(\tilde{r}-2\sigma )\times \mathcal {O}}\\&\quad \le \frac{C_0\zeta ^2Q^2_{n+1}}{\sigma ^{2}(r-\tilde{r})^2}\textrm{e}^{-K\sigma }\left( \Vert R^{z_iz_j}\Vert _{D(\tilde{r}-2\sigma )\times \mathcal {O}}+\Vert \underline{b}_{ij}\Vert _{D(\tilde{r})\times \mathcal {O}}\Vert F^{z_iz_j}\Vert _{D(\tilde{r}-2\sigma )\times \mathcal {O}}\right) ,\\ \end{aligned} \end{aligned}$$

then

$$\begin{aligned} \begin{aligned} \Vert \breve{R}\Vert _{s;D(\tilde{r}-4\sigma ,s)\times \mathcal {O}} \le \frac{C_0\zeta ^2Q^2_{n+1}}{\sigma ^{2}(r-\tilde{r})^2}\textrm{e}^{-K\sigma } \left( \Vert R\Vert _{s;D(r,s)\times \mathcal {O}}+\tilde{\zeta }^{\frac{1}{2}}\Vert F\Vert _{s;D(\tilde{r}-2\sigma ,s)\times \mathcal {O}}\right) .\\ \end{aligned} \end{aligned}$$

It follows from the definition of reversibility that \(S^*F=F\) and \(S^*\breve{R}=-\breve{R}.\) \(\square \)

Finally, we verify the Töplitz–Lipschitz property of solutions.

Proposition 5.2

(Töplitz–Lipschitz Property of Solutions) Let \(K, Q_{n+1}, C_0, \mathcal {O}\) and the parameters \(\tau , \sigma , r, \tilde{r}, s\) be as in Proposition 5.1. Suppose \(0<\varepsilon <1\), \(\tilde{\varepsilon }>0\) and the following inequality holds:

$$\begin{aligned} \left( \frac{32}{\sigma ^2}\textrm{e}^{-K\sigma }\right) ^{\frac{4}{3}} \le \tilde{\varepsilon } \le \min \bigg \{\sigma ^{12(2\tau +3)}, \left( \frac{(r-\tilde{r})^2}{Q_{n+1}}\right) ^{60(2\tau +3)}, \left( \frac{\gamma ^2}{C_0K^{2\tau +1}}\right) ^{\frac{50}{9}}\bigg \}.\qquad \end{aligned}$$
(5.27)

If the limits \(\lim \limits _{t\rightarrow \infty }B_{i+t}(\theta ),\) \(\lim \limits _{t\rightarrow \infty }b_{i+t}(\theta ),\) \(i\ge 1\) exist and

$$\begin{aligned} \begin{aligned}&\Vert \lim \limits _{t\rightarrow \infty }B_{i+t}(\theta )\Vert _{D(r)\times \mathcal {O}}\le 2\varepsilon ^{\frac{1}{2}},\\&\Vert B_{i+t}(\theta )-\lim \limits _{t\rightarrow \infty }B_{i+t}(\theta )\Vert _{D(r)\times \mathcal {O}}\le \frac{2\varepsilon ^{\frac{1}{2}}}{|t|}, \end{aligned} \end{aligned}$$
(5.28)
$$\begin{aligned} \begin{aligned}&\Vert \lim \limits _{t\rightarrow \infty }b_{i+t}(\theta )\Vert _{D(r)\times \mathcal {O}}\le 2\varepsilon ,\\&\Vert b_{i+t}(\theta )-\lim \limits _{t\rightarrow \infty }b_{i+t}(\theta )\Vert _{D(r)\times \mathcal {O}}\le \frac{2\varepsilon }{|t|}, \end{aligned} \end{aligned}$$
(5.29)

and the vector field R above satisfies the Töplitz–Lipschitz property (A4) with \(\tilde{\varepsilon }\) in place of \(\varepsilon _0\) on \(D(\tilde{r},s),\) then the vector field F (resp. the error term \(\breve{R}\)) obtained in Proposition 5.1 also satisfies the property (A4) with \(\tilde{\varepsilon }^{\frac{3}{5}}\) (resp. \(\tilde{\varepsilon }^{\frac{4}{3}}\)) in place of \(\varepsilon _0\) on \(D(\tilde{r}-2\sigma ,s)\) (resp. \(D(\tilde{r}-4\sigma , s)\)).

Proof

In the proof, we only verify the cases \(\frac{\partial F^{(z_i)}}{\partial z_{j}}\) and \(\frac{\partial \breve{R}^{(z_i)}}{\partial z_{j}};\) the other cases can be treated in the same way.

From the proof of Proposition 5.1, we obtain that

$$\begin{aligned} \frac{\partial F^{(z_i)}}{\partial z_{j}}=F^{z_i z_j}=u_{ij}\textrm{e}^{\textrm{i}\beta _{ij}(\theta )} \end{aligned}$$

and

$$\begin{aligned} \frac{\partial R^{(z_i)}}{\partial z_{j}}=R^{z_i z_j}=v_{ij}\textrm{e}^{\textrm{i} \beta _{ij}(\theta )}, \end{aligned}$$

where \(\beta _{ij}(\theta )=\sum \limits _{0<|k|\le Q_{n+1}}\frac{\widehat{B_{ij}}(k)}{\textrm{i}\langle k,\omega \rangle }\textrm{e}^{\textrm{i}\langle k,\theta \rangle }\) as in (5.19).

We first verify Töplitz–Lipschitz property for \(\beta _{i,j}(\theta )\) and \(\textrm{e}^{\textrm{i}\beta _{i,j}(\theta )}.\) Denote \(\beta _{i,j,\infty }(\theta ):=\lim \limits _{t\rightarrow \infty }\beta _{i+t,j+t}(\theta ).\) Then as in (5.20),

$$\begin{aligned} \begin{aligned}&\Vert \beta _{i+t,j+t}(\theta )-\beta _{i,j,\infty }(\theta )\Vert _{D(\tilde{r}),\mathcal {O}}\\&\quad \le \sum \limits _{0<|k|<Q_{n+1}}\frac{|\widehat{B_{i+t}}(k)-\widehat{B_{i,\infty }}(k)| _{\mathcal {O}}+|\widehat{B_{j+t}}(k)-\widehat{B_{j,\infty }}(k)|_{\mathcal {O}}}{|\langle k,\omega \rangle |}\textrm{e}^{|k|\tilde{r}}\\&\quad \le 256Q_{n+1}(r-\tilde{r})^{-2}\frac{2\varepsilon ^{\frac{1}{2}}}{|t|} \le \frac{\tilde{\varepsilon }^{-\frac{1}{1380}}}{|t|}. \end{aligned} \end{aligned}$$

As in (5.26), one has

$$\begin{aligned} \begin{aligned}&\Vert \textrm{e}^{\textrm{i}\beta _{i+t,j+t}(\theta )}-\textrm{e}^{\textrm{i}\beta _{i,j,\infty }(\theta )}\Vert _{D(\tilde{r}),\mathcal {O}} \\&\quad \le \left( \frac{8Q_{n+1}}{(r-\tilde{r})^2}\zeta \right) ^{2}\Vert \beta _{i+t,j+t}(\theta )-\beta _{i,j,\infty }(\theta )\Vert _{D(\tilde{r}),\mathcal {O}}\\&\quad \le \tilde{\varepsilon }^{-\frac{2}{690}}\frac{\tilde{\varepsilon }^{-\frac{1}{1380}}}{|t|} \le \frac{\tilde{\varepsilon }^{-\frac{1}{276}}}{|t|}. \end{aligned} \end{aligned}$$
(5.30)

We then prove that \(v_{ij}\) satisfies the Töplitz–Lipschitz property. Denote \(R_{i,j,\infty }^{zz}:=\lim \nolimits _{t\rightarrow \infty }R^{z_{i+t},z_{j+t}}\) and \(v_{i,j,\infty }(\theta ):=\lim \nolimits _{t\rightarrow \infty }v_{i+t,j+t}(\theta ).\) Below we use similar notation for \(F^{z_{i},z_{j}}\) and \(u_{ij}\).

$$\begin{aligned} \begin{aligned}&\Vert v_{i,j,\infty }\Vert _{D(\tilde{r}-\sigma ),\mathcal {O}}\\&\quad \le \Vert R_{i,j,\infty }^{zz}\Vert _{D(\tilde{r}-\sigma ),\mathcal {O}}\Vert \textrm{e}^{-\beta _{i,j,\infty }}\Vert _{D(\tilde{r}-\sigma ),\mathcal {O}}\\&\quad \le \tilde{\varepsilon }\textrm{e}^{-\rho |i-j|}\Vert \textrm{e}^{-\beta _{i,j,\infty }}\Vert _{D(\tilde{r}-\sigma ),\mathcal {O}}\\&\quad \le \tilde{\varepsilon }^{\frac{683}{690}}\textrm{e}^{-\rho |i-j|}. \end{aligned} \end{aligned}$$

Using (5.30), we have

$$\begin{aligned} \begin{aligned}&\Vert v_{i+t,j+t}-v_{i,j,\infty }\Vert _{D(\tilde{r}-\sigma )\times \mathcal {O}}\\&\quad \le \Vert R^{z_{i+t}z_{j+t}}-R_{i,j,\infty }^{zz}\Vert _{D(\tilde{r}-\sigma ),\mathcal {O}} \Vert \textrm{e}^{-\textrm{i}\beta _{i+t,j+t}}\Vert _{D(\tilde{r}-\sigma ),\mathcal {O}}\\&\quad \quad +\Vert R_{i,j,\infty }^{zz}\Vert _{D(\tilde{r}-\sigma ),\mathcal {O}}\Vert \textrm{e}^{-\textrm{i}\beta _{i+t,j+t}}-\textrm{e}^{-\textrm{i}\beta _{i,j,\infty }}\Vert _{D(\tilde{r}-\sigma ),\mathcal {O}}\\&\quad \le \frac{\tilde{\varepsilon }}{|t|}\textrm{e}^{-\rho |i-j|}\frac{32}{\sigma ^2}\Vert \textrm{e}^{-\textrm{i}\beta _{i+t,j+t}}\Vert _{D(\tilde{r}),\mathcal {O}} +\tilde{\varepsilon }\textrm{e}^{-\rho |i-j|}\frac{32}{\sigma ^2}\Vert \textrm{e}^{-\textrm{i}\beta _{i+t,j+t}}-\textrm{e}^{-\textrm{i}\beta _{i,j,\infty }}\Vert _{D(\tilde{r}),\mathcal {O}}\\&\quad \le \frac{\tilde{\varepsilon }^{\frac{68}{69}}}{|t|}\textrm{e}^{-\rho |i-j|}. \end{aligned} \end{aligned}$$

Let \(A_{i,j,\infty }:=\lim \limits _{t\rightarrow \infty }A_{i+t,j+t}\) and similarly for \(D_{i,j,\infty }\), \(\mathfrak {X}_{i,j,\infty }\) and \(\mathcal {B}_{i,j,\infty }\). Note that as in the proof of Proposition 5.1, one has

$$\begin{aligned} \begin{aligned}&(A_{i,j,\infty }+D_{i,j,\infty })\mathfrak {X}_{i,j,\infty }=\mathcal {B}_{i,j,\infty },\\&(A_{i,j,\infty }+\Lambda _{\tilde{r}}D_{i,j,\infty }\Lambda ^{-1}_{\tilde{r}})\Lambda _{\tilde{r}} \mathfrak {X}_{i,j,\infty }=\Lambda _{\tilde{r}}\mathcal {B}_{i,j,\infty }, \end{aligned} \end{aligned}$$

and then \(\Vert A_{i,j,\infty }^{-1}\Lambda _{\tilde{r}} D_{i,j,\infty }\Lambda ^{-1}_{\tilde{r}}\Vert _{\mathcal {O}}\le \frac{1}{2}.\) This implies \(A_{i,j,\infty }+\Lambda _{\tilde{r}} D_{i,j,\infty }\Lambda ^{-1}_{\tilde{r}}\) has a bounded inverse. Moreover, one has

$$\begin{aligned} \begin{aligned} \Vert (A_{i,j,\infty }+\Lambda _{\tilde{r}}D_{i,j,\infty }\Lambda ^{-1}_{\tilde{r}})^{-1}\Vert _{\mathcal {O}} \le \frac{C_0K^{2\tau +1}}{\gamma ^2} \le \tilde{\varepsilon }^{-\frac{18}{100}}. \end{aligned} \end{aligned}$$
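The bounded invertibility used here is the standard Neumann series argument. As a brief sketch, write \(A:=A_{i,j,\infty }\) and \(E:=\Lambda _{\tilde{r}}D_{i,j,\infty }\Lambda ^{-1}_{\tilde{r}}\) (shorthand introduced only for this remark), and assume that \(A\) is invertible and that the norm is submultiplicative. Then \(\Vert A^{-1}E\Vert _{\mathcal {O}}\le \frac{1}{2}\) gives

$$\begin{aligned} \begin{aligned}&(A+E)^{-1}=(I+A^{-1}E)^{-1}A^{-1}=\sum \limits _{k\ge 0}(-A^{-1}E)^{k}A^{-1},\\&\Vert (A+E)^{-1}\Vert _{\mathcal {O}}\le \sum \limits _{k\ge 0}2^{-k}\Vert A^{-1}\Vert _{\mathcal {O}}=2\Vert A^{-1}\Vert _{\mathcal {O}}. \end{aligned} \end{aligned}$$

The displayed bound then follows once \(\Vert A^{-1}\Vert _{\mathcal {O}}\) is controlled as in the proof of Proposition 5.1; we read the factor 2 as being absorbed into \(C_0\).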

Therefore,

$$\begin{aligned} \begin{aligned}&\Vert u_{i,j,\infty }\Vert _{D(\tilde{r}-2\sigma )\times \mathcal {O}}\\&\quad \le \Vert (A_{i,j,\infty }+\Lambda _{\tilde{r}-2\sigma }D_{i,j,\infty }\Lambda ^{-1}_{\tilde{r}-2\sigma })^{-1}\Vert _{\mathcal {O}}\Vert \Lambda _{\tilde{r}-2\sigma }\mathcal {B}_{i,j,\infty }\Vert _{\mathcal {O}}\\&\quad \le \Vert (A_{i,j,\infty }+\Lambda _{\tilde{r}-2\sigma }D_{i,j,\infty }\Lambda ^{-1}_{\tilde{r}-2\sigma })^{-1}\Vert _{\mathcal {O}}\frac{32}{\sigma ^2}\Vert v_{i,j,\infty }\Vert _{D(\tilde{r}-\sigma ),\mathcal {O}}\\&\quad \le \frac{C_0K^{2\tau +1}}{\gamma ^2}\frac{32}{\sigma ^2} \Vert v_{i,j,\infty }\Vert _{D(\tilde{r}-\sigma ),\mathcal {O}} \le \tilde{\varepsilon }^{\frac{8}{10}}\textrm{e}^{-\rho |i-j|}. \end{aligned} \end{aligned}$$

To study the Töplitz–Lipschitz property of \(u_{ij}\), in the following, we first consider

$$\begin{aligned} \left( A_{i+t,j+t}+\Lambda _{\tilde{r}-2\sigma } D_{i+t,j+t}\Lambda ^{-1}_{\tilde{r}-2\sigma }\right) \Lambda _{\tilde{r}-2\sigma }\left( \mathfrak {X}_{i+t,j+t}-\mathfrak {X}_{i,j,\infty }\right) =H, \end{aligned}$$

where we denote

$$\begin{aligned} \begin{aligned} H&:=\Lambda _{\tilde{r}-2\sigma }\mathcal {B}_{i+t,j+t}-\Lambda _{\tilde{r}-2\sigma }\mathcal {B}_{i,j,\infty }\\&\quad -\left( (A_{i+t,j+t}-A_{i,j,\infty })+\Lambda _{\tilde{r}-2\sigma } (D_{i+t,j+t}-D_{i,j,\infty })\Lambda ^{-1}_{\tilde{r}-2\sigma }\right) \Lambda _{\tilde{r}-2\sigma }\mathfrak {X}_{i,j,\infty }. \end{aligned} \end{aligned}$$
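For orientation, here is a sketch of where this identity comes from. It assumes that the conjugated linear systems hold with the weight \(\Lambda :=\Lambda _{\tilde{r}-2\sigma }\), as is used in the text, and it abbreviates \(M_t:=A_{i+t,j+t}+\Lambda D_{i+t,j+t}\Lambda ^{-1}\) and \(M_\infty :=A_{i,j,\infty }+\Lambda D_{i,j,\infty }\Lambda ^{-1}\) (shorthand introduced only for this remark). From \(M_t\Lambda \mathfrak {X}_{i+t,j+t}=\Lambda \mathcal {B}_{i+t,j+t}\) and \(M_\infty \Lambda \mathfrak {X}_{i,j,\infty }=\Lambda \mathcal {B}_{i,j,\infty }\) one gets

$$\begin{aligned} \begin{aligned} M_t\Lambda (\mathfrak {X}_{i+t,j+t}-\mathfrak {X}_{i,j,\infty })&=\Lambda \mathcal {B}_{i+t,j+t}-M_t\Lambda \mathfrak {X}_{i,j,\infty }\\&=(\Lambda \mathcal {B}_{i+t,j+t}-\Lambda \mathcal {B}_{i,j,\infty })-(M_t-M_\infty )\Lambda \mathfrak {X}_{i,j,\infty }, \end{aligned} \end{aligned}$$

with \(M_t-M_\infty =(A_{i+t,j+t}-A_{i,j,\infty })+\Lambda (D_{i+t,j+t}-D_{i,j,\infty })\Lambda ^{-1}\). Only the norms of the three differences on the right-hand side enter the estimate of \(H\) that follows.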

Now we consider H.

$$\begin{aligned} \begin{aligned}&\Vert \Lambda _{\tilde{r}-2\sigma }\mathcal {B}_{i+t,j+t}-\Lambda _{\tilde{r}-2\sigma }\mathcal {B}_{i,j,\infty }\Vert _{\mathcal {O}}\\&\quad \le \frac{32}{\sigma ^2}\Vert v_{i+t,j+t}-v_{i,j,\infty }\Vert _{D(\tilde{r}-\sigma ),\mathcal {O}} \le \frac{\tilde{\varepsilon }^{\frac{674}{690}}}{|t|}\textrm{e}^{-\rho |i-j|}. \end{aligned} \end{aligned}$$

Recall that \( \Omega _j(\xi )=d(\xi )j+\tilde{\Omega }_j(\xi ),\,j\ge 1\) (see (4.2)). Then

$$\begin{aligned} \begin{aligned}&\Vert A_{i+t,j+t}-A_{i,j,\infty }\Vert _{\mathcal {O}}\\&\quad \le \Vert \tilde{\Omega }_{i+t,j+t}-\tilde{\Omega }_{i,j,\infty }\Vert _{\mathcal {O}}+\Vert [B_{i+t,j+t}]-[B_{i,j,\infty }]\Vert _{\mathcal {O}}\\&\quad \le \frac{8\varepsilon ^{\frac{1}{2}}}{|t|}. \end{aligned} \end{aligned}$$

Note that the matrix

$$\begin{aligned} \begin{aligned}&\Lambda _{\tilde{r}-2\sigma }(D_{i+t,j+t}-D_{i,j,\infty })\Lambda _{\tilde{r}-2\sigma }^{-1}\\&\quad =\left( -\textrm{i}\textrm{e}^{(|l|-|k|)(\tilde{r}-2\sigma )}(\widehat{\underline{b}}_{i+t,j+t}(l-k) -\widehat{\underline{b}}_{i,j,\infty }(l-k))\right) _{|k|,|l|\le K}, \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned} \underline{b}_{i+t}-\underline{b}_{i,\infty } =\mathcal {R}_{Q_{n+1}}(B_{i+t}-B_{i,\infty })+(b_{i+t}-b_{i,\infty }). \end{aligned} \end{aligned}$$

Moreover,

$$\begin{aligned} \begin{aligned}&\Vert \underline{b}_{i+t}-\underline{b}_{i,\infty }\Vert _{D(\tilde{r}-\sigma ),\mathcal {O}}\\&\quad \le \frac{32}{(\frac{r-\tilde{r}}{2})^2}\textrm{e}^{-\frac{r-\tilde{r}}{2}Q_{n+1}}\Vert B_{i+t}-B_{i,\infty }\Vert _{D(r),\mathcal {O}}+\Vert b_{i+t}-b_{i,\infty }\Vert _{D(r),\mathcal {O}}\\&\quad \le \frac{128}{(r-\tilde{r})^2}\textrm{e}^{-\frac{r-\tilde{r}}{2}Q_{n+1}}\frac{2\varepsilon ^{\frac{1}{2}}}{|t|}+\frac{2\varepsilon }{|t|}. \end{aligned} \end{aligned}$$

It follows from the above estimates that

$$\begin{aligned} \begin{aligned}&\Vert \Lambda _{\tilde{r}-2\sigma }(D_{i+t,j+t}-D_{i,j,\infty })\Lambda ^{-1}_{\tilde{r}-2\sigma }\Vert _{\mathcal {O}}\\&\quad \le \sum \limits _{k}\textrm{e}^{|k|(\tilde{r}-2\sigma )} |\widehat{\underline{b}}_{i+t,j+t}(k)-\widehat{\underline{b}}_{i,j,\infty }(k))|_{\mathcal {O}}\\&\quad \le \frac{32}{\sigma ^2}\Vert \underline{b}_{i+t,j+t}-\underline{b}_{i,j,\infty }\Vert _{D(\tilde{r}-\sigma ),\mathcal {O}}\\&\quad \le \frac{32}{\sigma ^2} 2\left( \frac{128}{(r-\tilde{r})^2} \textrm{e}^{-\frac{r-\tilde{r}}{2}Q_{n+1}}\frac{2\varepsilon ^{\frac{1}{2}}}{|t|}+\frac{2\varepsilon }{|t|}\right) \\&\quad \le \tilde{\varepsilon }^{-\frac{1}{100}}\frac{\varepsilon ^{\frac{1}{2}}}{|t|}. \end{aligned} \end{aligned}$$

We have obtained \(\Vert \Lambda _{\tilde{r}-2\sigma }\mathfrak {X}_{i,j,\infty }\Vert _{\mathcal {O}} \le \tilde{\varepsilon }^{\frac{8}{10}}\textrm{e}^{-\rho |i-j|}.\)

By the estimates obtained above,

$$\begin{aligned} \begin{aligned}&\Vert H\Vert _{\mathcal {O}}\\&\quad \le \Vert \Lambda _{\tilde{r}-2\sigma }\mathcal {B}_{i+t,j+t}-\Lambda _{\tilde{r}-2\sigma }\mathcal {B}_{i,j,\infty }\Vert _{\mathcal {O}}+\Vert A_{i+t,j+t}-A_{i,j,\infty }\Vert _{\mathcal {O}} \Vert \Lambda _{\tilde{r}-2\sigma }\mathfrak {X}_{i,j,\infty }\Vert _{\mathcal {O}}\\&\qquad +\Vert \Lambda _{\tilde{r}-2\sigma }(D_{i+t,j+t}-D_{i,j,\infty })\Lambda ^{-1}_{\tilde{r}-2\sigma }\Vert _{\mathcal {O}} \Vert \Lambda _{\tilde{r}-2\sigma }\mathfrak {X}_{i,j,\infty }\Vert _{\mathcal {O}}\\&\quad \le \frac{\tilde{\varepsilon }^{\frac{674}{690}}}{|t|}\textrm{e}^{-\rho |i-j|} +\frac{8\varepsilon ^{\frac{1}{2}}}{|t|}\tilde{\varepsilon }^{\frac{8}{10}}\textrm{e}^{-\rho |i-j|} +\varepsilon ^{\frac{1}{2}}\frac{\tilde{\varepsilon }^{\frac{79}{100}}}{|t|}\textrm{e}^{-\rho |i-j|}\\&\quad \le \frac{\tilde{\varepsilon }^{\frac{79}{100}}}{|t|}\textrm{e}^{-\rho |i-j|}. \end{aligned} \end{aligned}$$

This implies

$$\begin{aligned} \begin{aligned}&\Vert u_{i+t,j+t}-u_{i,j,\infty }\Vert _{D(\tilde{r}-2\sigma ),\mathcal {O}}\\&\quad =\Vert (A_{i+t,j+t}+\Lambda _{\tilde{r}-2\sigma }D_{i+t,j+t}\Lambda ^{-1}_{\tilde{r}-2\sigma })^{-1}H\Vert _{\mathcal {O}}\\&\quad \le \frac{C_0K^{2\tau +1}}{\gamma ^2} \Vert H\Vert _{\mathcal {O}} \le \frac{\tilde{\varepsilon }^{\frac{61}{100}}}{|t|}\textrm{e}^{-\rho |i-j|}. \end{aligned} \end{aligned}$$

Then one has

$$\begin{aligned} \begin{aligned}&\Vert F_{i,j,\infty }^{zz}\Vert _{D(\tilde{r}-2\sigma ),\mathcal {O}}\\&\quad \le \Vert u_{i,j,\infty }\Vert _{D(\tilde{r}-2\sigma ),\mathcal {O}}\Vert \textrm{e}^{\textrm{i}\beta _{i,j,\infty }}\Vert _{D(\tilde{r}-2\sigma ),\mathcal {O}}\\&\quad \le 8\tilde{\varepsilon }^{\frac{8}{10}}\textrm{e}^{-\rho |i-j|}\tilde{\varepsilon }^{-\frac{2}{276}}\tilde{\varepsilon }^{-\frac{1}{690}} \le \tilde{\varepsilon }^{\frac{3}{5}}\textrm{e}^{-\rho |i-j|} \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned}&\Vert F^{z_{i+t} z_{j+t}}-F_{i,j,\infty }^{zz}\Vert _{D(\tilde{r}-2\sigma ),\mathcal {O}}\\&\quad \le \Vert (u_{i+t,j+t}-u_{i,j,\infty })\textrm{e}^{\textrm{i}\beta _{i+t,j+t}}\Vert _{D(\tilde{r}-2\sigma ),\mathcal {O}} +\Vert u_{i,j,\infty }(\textrm{e}^{\textrm{i}\beta _{i+t,j+t}}-\textrm{e}^{\textrm{i}\beta _{i,j,\infty }})\Vert _{D(\tilde{r}-2\sigma ),\mathcal {O}}\\&\quad \le 8\frac{\tilde{\varepsilon }^{\frac{61}{100}}}{|t|}\textrm{e}^{-\rho |i-j|}\tilde{\varepsilon }^{-\frac{2}{276}}\tilde{\varepsilon }^{-\frac{1}{690}} +8\tilde{\varepsilon }^{\frac{8}{10}}\textrm{e}^{-\rho |i-j|}\tilde{\varepsilon }^{-\frac{2}{276}}\frac{\tilde{\varepsilon }^{-\frac{1}{276}}}{|t|}\\&\quad \le \frac{\tilde{\varepsilon }^{\frac{3}{5}}}{|t|}\textrm{e}^{-\rho |i-j|}. \end{aligned} \end{aligned}$$

Finally, we consider the error term

$$\begin{aligned} \breve{R}_3=\sum _{\varrho =\pm }\sum _{i}\sum _{j\ne i}\textrm{e}^{\varrho \textrm{i}\beta _{ij}}\mathcal {R}_K\left( \textrm{e}^{-\varrho \textrm{i}\beta _{ij}}R^{z^\varrho _i z^\varrho _j}+\varrho \textrm{i}\textrm{e}^{-\varrho \textrm{i}\beta _{ij}}\underline{b}^\varrho _{ij} F^{z^\varrho _i z^\varrho _j}\right) z^{\varrho }_j\frac{\partial }{\partial z^\varrho _i}, \end{aligned}$$
$$\begin{aligned} \frac{\partial \breve{R}_3^{(z_i)}}{\partial z_j} =\textrm{e}^{\textrm{i}\beta _{ij}}\mathcal {R}_K \left( \textrm{e}^{-\textrm{i}\beta _{ij}}R^{z_iz_j}+\textrm{i}\textrm{e}^{-\textrm{i}\beta _{ij}}\underline{b}_{ij}F^{z_i z_j}\right) . \end{aligned}$$

Denote

$$\begin{aligned} g_{ij} =\mathcal {R}_K\left( \textrm{e}^{-\textrm{i}\beta _{ij}}R^{z_iz_j}+\textrm{i}\textrm{e}^{-\textrm{i}\beta _{ij}}\underline{b}_{ij}F^{z_iz_j}\right) =\mathcal {R}_K\left( v_{ij}+\textrm{i}\underline{b}_{ij}u_{ij}\right) . \end{aligned}$$

We have

$$\begin{aligned} \begin{aligned}&\Vert g_{i,j,\infty }\Vert _{D(\tilde{r}-4\sigma ),\mathcal {O}}\\&\quad \le \frac{32}{\sigma ^2}\textrm{e}^{-K\sigma }\left( \Vert v_{i,j,\infty }\Vert _{D(\tilde{r}-2\sigma ),\mathcal {O}}+\Vert \underline{b}_{i,j,\infty }\Vert _{D(\tilde{r}-2\sigma ),\mathcal {O}}\Vert u_{i,j,\infty }\Vert _{D(\tilde{r}-2\sigma ),\mathcal {O}}\right) \\&\quad \le \tilde{\varepsilon }^{\frac{3}{4}}\left( \frac{32}{\sigma ^2}\tilde{\varepsilon }^{\frac{683}{690}}\textrm{e}^{-\rho |i-j|} +\frac{32}{4\sigma ^2}4\varepsilon \tilde{\varepsilon }^{\frac{8}{10}}\textrm{e}^{-\rho |i-j|}\right) \\&\quad \le \tilde{\varepsilon }^{\frac{77}{50}}\textrm{e}^{-\rho |i-j|}. \end{aligned} \end{aligned}$$

Then, using the above estimates, we obtain

$$\begin{aligned} \begin{aligned}&\Vert g_{i+t,j+t}-g_{i,j,\infty }\Vert _{D(\tilde{r}-4\sigma ),\mathcal {O}}\\&\quad \le \frac{32}{\sigma ^2}\textrm{e}^{-K\sigma }\big (\Vert v_{i+t,j+t}-v_{i,j,\infty }\Vert _{D(\tilde{r}-2\sigma ),\mathcal {O}} +\Vert \underline{b}_{i+t,j+t}\Vert _{D(\tilde{r}-2\sigma ),\mathcal {O}}\Vert u_{i+t,j+t}-u_{i,j,\infty }\Vert _{D(\tilde{r}-2\sigma ),\mathcal {O}}\\&\qquad +\Vert u_{i,j,\infty }\Vert _{D(\tilde{r}-2\sigma ),\mathcal {O}} \Vert \underline{b}_{i+t,j+t}-\underline{b}_{i,j,\infty }\Vert _{D(\tilde{r}-2\sigma ),\mathcal {O}}\big )\\&\quad \le \tilde{\varepsilon }^{\frac{3}{4}}\Big (\frac{32}{\sigma ^2}\frac{\tilde{\varepsilon }^{\frac{68}{69}}}{|t|}\textrm{e}^{-\rho |i-j|} +\frac{32}{4\sigma ^2}2\varepsilon \frac{\tilde{\varepsilon }^{\frac{61}{100}}}{|t|}\textrm{e}^{-\rho |i-j|} +\tilde{\varepsilon }^{\frac{8}{10}}\textrm{e}^{-\rho |i-j|} 2\left( \tilde{\varepsilon }^{-\frac{1}{920}}\frac{1}{|t|}+\frac{1}{|t|}\right) \Big )\\&\quad \le \frac{\tilde{\varepsilon }^{\frac{135}{100}}}{|t|}\textrm{e}^{-\rho |i-j|}. \end{aligned} \end{aligned}$$

Therefore, the Töplitz–Lipschitz property for the error term \(\breve{R}_3\) can be verified as follows:

$$\begin{aligned} \begin{aligned} \left\| \lim \limits _{t\rightarrow \infty }\frac{\partial \breve{R}^{(z_{i+t})}}{\partial z_{j+t}}\right\| _{D(\tilde{r}-4\sigma ),\mathcal {O}}&=\Vert g_{i,j,\infty }\textrm{e}^{\textrm{i}\beta _{i,j,\infty }}\Vert _{D(\tilde{r}-4\sigma ),\mathcal {O}}\\&\quad \le \tilde{\varepsilon }^{\frac{77}{50}}\textrm{e}^{-\rho |i-j|}\frac{32}{16\sigma ^2}\tilde{\varepsilon }^{-\frac{1}{690}}\\&\quad \le \tilde{\varepsilon }^{\frac{4}{3}}\textrm{e}^{-\rho |i-j|}, \end{aligned} \end{aligned}$$

and

$$\begin{aligned} \begin{aligned}&\left\| \frac{\partial \breve{R}^{(z_{i+t})}}{\partial z_{j+t}}-\lim \limits _{t\rightarrow \infty } \frac{\partial \breve{R}^{(z_{i+t})}}{\partial z_{j+t}}\right\| _{D(\tilde{r}-4\sigma ),\mathcal {O}}\\&\quad \le \Vert (g_{i+t,j+t}-g_{i,j,\infty })\textrm{e}^{\textrm{i}\beta _{i+t,j+t}}\Vert _{D(\tilde{r}-4\sigma )\times \mathcal {O}} +\Vert g_{i,j,\infty }(\textrm{e}^{\textrm{i}\beta _{i+t,j+t}}-\textrm{e}^{\textrm{i}\beta _{i,j,\infty }})\Vert _{D(\tilde{r}-4\sigma ),\mathcal {O}}\\&\quad \le 2\frac{\tilde{\varepsilon }^{\frac{135}{100}}}{|t|}\textrm{e}^{-\rho |i-j|} \tilde{\varepsilon }^{-\frac{2}{276}}\tilde{\varepsilon }^{-\frac{1}{690}} +2\tilde{\varepsilon }^{\frac{77}{50}}\textrm{e}^{-\rho |i-j|} \tilde{\varepsilon }^{-\frac{2}{276}}\frac{\tilde{\varepsilon }^{-\frac{1}{276}}}{|t|}\\&\quad \le \frac{\tilde{\varepsilon }^{\frac{4}{3}}}{|t|}\textrm{e}^{-\rho |i-j|}. \end{aligned} \end{aligned}$$
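The last step of each of the estimates above for \(F^{zz}\) and \(\breve{R}\) is pure exponent bookkeeping in \(\tilde{\varepsilon }<1\). The following sketch checks that bookkeeping with exact rational arithmetic. The helper `margin` and the dictionary `MARGINS` are illustrative names, not objects from the paper; numerical constants and the factor \(\sigma ^{-2}\) are not tracked, so a positive margin only indicates how much room is left to absorb them.

```python
from fractions import Fraction as Fr


def margin(target, *exponents):
    """Return (sum of exponents) - target.

    For 0 < eps < 1, a positive value means that the product of the
    corresponding powers of eps is at most eps**target.
    """
    return sum(exponents, Fr(0)) - Fr(target)


# Exponents are copied from the displayed estimates; constants are ignored.
MARGINS = {
    # estimates for F^{zz}: target exponent 3/5
    "F limit": margin(Fr(3, 5), Fr(8, 10), Fr(-2, 276), Fr(-1, 690)),
    "F difference, first term": margin(
        Fr(3, 5), Fr(61, 100), Fr(-2, 276), Fr(-1, 690)),
    "F difference, second term": margin(
        Fr(3, 5), Fr(8, 10), Fr(-2, 276), Fr(-1, 276)),
    # estimates for the error term: target exponent 4/3
    "R limit": margin(Fr(4, 3), Fr(77, 50), Fr(-1, 690)),
    "R difference, first term": margin(
        Fr(4, 3), Fr(135, 100), Fr(-2, 276), Fr(-1, 690)),
    "R difference, second term": margin(
        Fr(4, 3), Fr(77, 50), Fr(-2, 276), Fr(-1, 276)),
}
```

All six margins are positive. The tightest one is the first term of the difference estimate for \(F^{zz}\), which is why the exponent \(\frac{61}{100}\) obtained for \(u_{i+t,j+t}-u_{i,j,\infty }\) cannot be weakened much if the target \(\frac{3}{5}\) is to be kept.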

\(\square \)

6 KAM Step

In this section, we give the proof of Theorem 4.1. Throughout this section, C denotes a global constant that is independent of the iterative step but may depend on \(\tau \), \(A_0\), \(A_1\) and \(A_2\).

Suppose we have arrived at the \(\nu ^{th}\) iterative step with the \(S-\)reversible vector field \(X_\nu =N_\nu +P_\nu \) on \(D(r_\nu , s_\nu )\times \mathcal {O}_\nu \), where the normal form vector field is

$$\begin{aligned} \begin{aligned} N_\nu =&\omega \frac{\partial }{\partial \theta }+\textrm{i}\Omega _\nu (\theta ;\xi )z\frac{\partial }{\partial z}-\textrm{i}\overline{\Omega _\nu (\theta ;\xi )}\bar{z}\frac{\partial }{\partial \bar{z}}\\ =&\omega \frac{\partial }{\partial \theta }+\textrm{i}(\Omega (\xi )+B_\nu (\theta ;\xi ))z\frac{\partial }{\partial z}-\textrm{i}(\Omega (\xi )+\overline{B_\nu (\theta ;\xi )})\bar{z}\frac{\partial }{\partial \bar{z}} \end{aligned} \end{aligned}$$
(6.1)

with \(\Omega _{j}(\xi )=d(\xi )j+\tilde{\Omega }_{j}(\xi )\) and \(d(\xi ),\,\tilde{\Omega }_{j}\in C^1_W(\mathcal {O}),\) \(\overline{B_\nu (\theta )}=B_\nu (-\theta )\).

The \(S-\)reversible perturbation \(P_\nu \) has the Töplitz–Lipschitz property (A4) on \(D(r_\nu , s_\nu )\) with \(\varepsilon _\nu , \rho _\nu \) in place of \(\varepsilon , \rho .\)

Our aim is to find an \(S-\)invariant transformation

$$\begin{aligned} \Phi _\nu : D(r_{\nu +1}, s_{\nu +1})\times \mathcal {O}_{\nu }\rightarrow D(r_\nu , s_\nu )\times \mathcal {O}_\nu \end{aligned}$$

such that \(\Phi ^*_\nu X_\nu =N_{\nu +1}+P_{\nu +1}\) with new normal form \(N_{\nu +1}\) and a much smaller perturbation term \(P_{\nu +1}.\)

For notational convenience, below we denote

$$\begin{aligned} Q_{n+1}:=Q_{n_0+\nu }, \end{aligned}$$
(6.2)

where \(n_0\in \mathbb {N}\) is some suitable fixed positive integer. Similar to the usual KAM literature, for other sequences, we drop the subscript \(\nu \), write the symbol ‘\(+\)’ for ‘\(\nu +1\)’ and write the symbol ‘−’ for ‘\(\nu -1\)’. Then the goal is to find an \(S-\)invariant transformation \(\Phi : D(r_{+}, s_{+})\times \mathcal {O}\rightarrow D(r, s)\times \mathcal {O}\) such that it transforms

$$\begin{aligned} \begin{aligned} X=&N+P\\ =&\omega \frac{\partial }{\partial \theta }+\textrm{i}(\Omega (\xi )+B(\theta ;\xi ))z\frac{\partial }{\partial z}-\textrm{i}(\Omega (\xi )+\overline{B(\theta ;\xi )})\bar{z}\frac{\partial }{\partial \bar{z}}+P\\ \end{aligned} \end{aligned}$$
(6.3)

into

$$\begin{aligned} \begin{aligned} X_{+}=&N_{+}+P_{+}\\ =&\omega \frac{\partial }{\partial \theta }+\textrm{i}(\Omega (\xi )+B_{+}(\theta ;\xi ))z\frac{\partial }{\partial z}-\textrm{i}(\Omega (\xi )+\overline{B_{+}(\theta ;\xi )})\bar{z}\frac{\partial }{\partial \bar{z}}+P_{+}.\\ \end{aligned} \end{aligned}$$

6.1 A Finite Induction

In this subsection, we perform a finite induction procedure to compensate for the lack of a Diophantine condition on \(\omega \).

Let

$$\begin{aligned} \begin{aligned} \widetilde{X}_0=&\widetilde{N}_0+\widetilde{P}_0\\ =&\omega \frac{\partial }{\partial \theta }+\textrm{i}(\Omega (\xi )+B(\theta ,\xi ))z\frac{\partial }{\partial z}-\textrm{i}(\Omega (\xi )+\overline{B(\theta ,\xi )})\bar{z}\frac{\partial }{\partial \bar{z}}+P. \end{aligned} \end{aligned}$$
(6.4)

According to previous assumptions, we have \(\Vert \widetilde{P}_0\Vert _{s;D(r,s)\times \mathcal {O}}\le \varepsilon .\)

Let \(0<r_+<r\) and \(0<\rho _+<\rho .\) To set up the finite induction, we let

$$\begin{aligned} \tilde{\varepsilon }_0=\varepsilon , \tilde{r}_0=2r_{+}, \tilde{s}_0=s, \tilde{\rho }_{0}=\rho . \\ K=\lfloor \left( \frac{\gamma ^{2}}{2\varepsilon ^{\frac{1}{2}}}\right) ^{\frac{1}{4\tau +6}}\rfloor +1. \end{aligned}$$

Now let \(n\ge n_{0}\) and define

$$\begin{aligned}{} & {} L=2+\left\lfloor \frac{2^{n+2-n_{0}}c\tau U \ln Q_{n+1}}{2(24\tau +36)\ln \frac{5}{2}}\right\rfloor , \\{} & {} r=\frac{r_{0}}{4Q_{n}^{4}},\quad \quad \varepsilon =\varepsilon _{-}Q_{n+1}^{-2^{n+2-n_{0}}c\tau U}, \\{} & {} \zeta =\sum \limits _{i=0}^{n-n_{0}}\varepsilon _{i}^{\frac{1}{2}},\quad \quad \gamma =\gamma _{0}-3\sum \limits _{i=0}^{n-n_{0}}\varepsilon _{i}^{\frac{1}{2}}, \end{aligned}$$

where the constant \(U=U(\mathcal {A})\) is defined in Lemma 3.2. For \(m=1,2,3,\ldots ,L,\) we define the following sequences:

$$\begin{aligned} \begin{aligned}&\tilde{\varepsilon }_{m}=\tilde{\varepsilon }_{m-1}^{\frac{5}{4}} =\tilde{\varepsilon }_{0}^{(\frac{5}{4})^{m}},\quad \quad \tilde{\zeta }_{m}=\sum \limits _{q=0}^{m-1} \tilde{\varepsilon }_{q},\\&\tilde{\sigma }_{m}=\frac{\tilde{r}_{0}}{5\cdot 2^{m+2}}, \quad \quad K^{(m)}=\left( \frac{\gamma ^{2}\tilde{\sigma }^{2}_{m}}{2C_3\tilde{\zeta }_{m}^{\frac{1}{2}}}\right) ^{\frac{1}{2\tau +1}}, \\&\tilde{K}_{m}=\left\lfloor \tilde{\sigma }_{m}^{-1} \ln \tilde{\varepsilon }_{m-1}^{-1}\right\rfloor +1, \quad \quad \tilde{r}_{m}=\tilde{r}_{m-1}-5\tilde{\sigma }_{m-1}, \\&\tilde{\eta }_{m}=\tilde{\varepsilon }_{m-1}^\frac{1}{3}, \quad \quad \tilde{s}_{m}=\tilde{\eta }_{m}\tilde{s}_{m-1},\\&\tilde{\rho }_{m}=\frac{L-m}{L}\tilde{\rho }_{0}+\frac{m}{L}\rho _{+}. \\ \end{aligned} \end{aligned}$$
(6.5)
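As a sanity check on (6.5), the following sketch generates \(\tilde{\sigma }_m\), \(\tilde{r}_m\) and the exponent \((\frac{5}{4})^m\) of \(\tilde{\varepsilon }_0\) with exact rational arithmetic. The function name `finite_sequences` and the sample inputs are illustrative choices, not values from the paper. Since \(5\tilde{\sigma }_{m-1}=\tilde{r}_0 2^{-(m+1)}\), the recursion for \(\tilde{r}_m\) telescopes to \(\tilde{r}_m=\tilde{r}_0(\frac{1}{2}+2^{-(m+1)})\), so the radii stay above \(\frac{1}{2}\tilde{r}_0=r_+\) for every \(m\).

```python
from fractions import Fraction


def finite_sequences(r0_tilde, L):
    """Exact sigma~_m, r~_m and the exponent (5/4)**m of eps~_0, for m = 0..L.

    Uses sigma~_m = r~_0 / (5 * 2**(m + 2)) and
    r~_m = r~_{m-1} - 5 * sigma~_{m-1}, as in (6.5).
    """
    r0 = Fraction(r0_tilde)
    sigma = [r0 / (5 * 2 ** (m + 2)) for m in range(L + 1)]
    r = [r0]
    for m in range(1, L + 1):
        r.append(r[m - 1] - 5 * sigma[m - 1])
    # eps~_m = eps~_0 ** ((5/4)**m); only the exponent is stored, to avoid underflow.
    eps_exponent = [Fraction(5, 4) ** m for m in range(L + 1)]
    return sigma, r, eps_exponent
```

For instance, `finite_sequences(1, 6)` returns radii that decrease from 1 towards \(\frac{1}{2}\) without reaching it, which matches the target domain \(D(r_+, s_+)\) of the KAM step when \(\tilde{r}_0=2r_+\).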

Lemma 6.1

Let \(0<\varepsilon _0<1\) and \(n_0\) be a positive integer such that

$$\begin{aligned} Q_{n_0+1}^{-4c\tau U}<\varepsilon _0<\min \{2^{-32}r^4_0, 2^{-18}\gamma ^{12}_0\},\,\,\,\,\ln \varepsilon ^{-1}_0\le \varepsilon ^{-\frac{1}{24\tau +36}}_0. \end{aligned}$$
(6.6)

Suppose \(Q_{n}\) (see (6.2)) satisfies

$$\begin{aligned} Q_{n}\ge (n+2-n_{0})2^{n+2-n_{0}}c\tau U,\,\,\, c>\frac{30(2\tau +3)}{\tau U}, \end{aligned}$$
(6.7)

then for finite sequences defined above, the following three inequalities hold:

  (1) \(360\tilde{r}_{m}Q_{n+1}\zeta \le (r-\tilde{r}_{m})^3\);

  (2) \(\frac{256}{(r-\tilde{r}_m)^2}e^{-\frac{r-\tilde{r}_m}{2}Q_{n+1}}\zeta \le \frac{1}{2}\tilde{\zeta }_m^{\frac{1}{2}}\);

  (3) \(\tilde{K}_m\le \min \{K, K^{(m)}\}\).

The proof of Lemma 6.1 is postponed to the appendix.

Proposition 6.2

Suppose all the assumptions in Lemma 6.1 are still satisfied, \(0<\varepsilon _0\le \frac{(r_0s_0\gamma _0)^{12\tau +36}}{Q_{n_0}^{2c\tau U}}\) and

$$\begin{aligned} \Vert B\Vert _{\infty ,D(\tilde{r}_{0})\times \mathcal {O}}\le \zeta , \end{aligned}$$
(6.8)

then for all \(\Omega (\xi )+[B(\cdot , \xi )]\in \mathcal{M}\mathcal{C}_\omega (\gamma , \tau , K, \mathcal {O}),\) the following holds: For \(0\le m\le L-1,\) there is an \(S-\)invariant coordinate transformation

$$\begin{aligned} \tilde{\Phi }_m: D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}\rightarrow D(\tilde{r}_{m},\tilde{s}_{m})\times \mathcal {O} \end{aligned}$$

such that the \(S-\)reversible vector field

$$\begin{aligned} \begin{aligned} \tilde{X}_{m+1}&=(\tilde{\Phi }_m)^*\tilde{X}_m=\tilde{N}_{m+1}+\tilde{P}_{m+1}\\&=\omega \frac{\partial }{\partial \theta }+\textrm{i}(\Omega (\xi )+B(\theta ,\xi )+b_{m+1}(\theta ,\xi ))z\frac{\partial }{\partial z}\\&\quad -\textrm{i}(\Omega (\xi )+\overline{B(\theta ,\xi )}+\overline{b_{m+1}(\theta ,\xi )})\bar{z}\frac{\partial }{\partial \bar{z}}+\tilde{P}_{m+1} \end{aligned} \end{aligned}$$
(6.9)

with

$$\begin{aligned}{} & {} \Vert b_{m+1}\Vert _{\infty ,D(\tilde{r}_{m+1})\times \mathcal {O}}\le \tilde{\zeta }_{m+1}, \end{aligned}$$
(6.10)
$$\begin{aligned}{} & {} \Vert \tilde{P}_{m+1}\Vert _{\tilde{s}_{m+1};D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}} \le \tilde{\varepsilon }_{m+1}. \end{aligned}$$
(6.11)

Moreover, \(\tilde{\Phi }_m\) satisfies

$$\begin{aligned}{} & {} \Vert \tilde{\Phi }_m-id\Vert _{\tilde{s}_{m+1};D(\tilde{r}_{m+1},\tilde{s}_{m+1}) \times \mathcal {O}}\le C\tilde{\varepsilon }^{\frac{81}{100}}_m, \end{aligned}$$
(6.12)
$$\begin{aligned}{} & {} \Vert \mathcal {D}\tilde{\Phi }_m-I d\Vert _{\tilde{s}_{m+1};D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}} \le C\tilde{\varepsilon }^{\frac{47}{100}}_m. \end{aligned}$$
(6.13)

Proof

Suppose

$$\begin{aligned} \widetilde{X}_{m}=\widetilde{N}_{m}+\widetilde{P}_{m}, \end{aligned}$$

where \(\widetilde{N}_{m}\) and \(\widetilde{P}_{m}\) satisfy (6.8), (6.9)–(6.11) with m in place of \(m+1.\) \(\widetilde{P}_m\) can be rewritten as

$$\begin{aligned} \widetilde{P}_m=\widetilde{R}_m+(\widetilde{P}_m-\widetilde{R}_m), \end{aligned}$$

with

$$\begin{aligned} \begin{aligned} \widetilde{R}_m =\sum \limits _{\textsf {v}\in \{z_j, \bar{z}_j\}} \sum \limits _{\begin{array}{c} {|k|\le \tilde{K}_m,}\\ {|\alpha |+|\beta |\le 1} \end{array}} \widetilde{P}^{(\textsf {v})}_{m, k \alpha \beta }\textrm{e}^{\textrm{i}\langle k, \theta \rangle }z^{\alpha }\bar{z}^{\beta }\frac{\partial }{\partial \textsf {v}}\\ \end{aligned} \end{aligned}$$
(6.14)

We set the parameters \((r, \tilde{r}, \zeta , \tilde{\zeta }, \gamma , \sigma , K)\) in Section 5 to be \((r, \tilde{r}_m, \zeta , \tilde{\zeta }_m, \gamma , \tilde{\sigma }_m, \tilde{K}_m)\) here. Then, by Lemma 6.1, the assumptions (i)–(iii) in (5.6)–(5.8) are satisfied. It follows that the homological equation

$$\begin{aligned}{}[\tilde{N}_m, F_m]+\tilde{R}_m=\llbracket \tilde{R}_m\rrbracket \end{aligned}$$

admits an \(S-\)invariant approximate solution \(F_m\) on \(D(\tilde{r}_m-2\tilde{\sigma }_m,\tilde{s}_m)\times \mathcal {O}\) satisfying the estimate

$$\begin{aligned} \begin{aligned} \Vert F_m\Vert _{\tilde{s}_m,D(\tilde{r}_m-2\tilde{\sigma }_m,\tilde{s}_m)\times \mathcal {O}}&\le \frac{C_0\zeta ^{2}Q_{n+1}^{2}\tilde{K}_{m}^{2\tau +1}}{\gamma ^{2}\tilde{\sigma }_{m}^{2}(r-\tilde{r}_m)^{4}}\Vert \tilde{R}_m\Vert _{\tilde{s}_m,D(\tilde{r}_m,\tilde{s}_m)\times \mathcal {O}}. \end{aligned} \end{aligned}$$
(6.15)

By the definition of \(\varepsilon \), \(n\ge n_{0}\) and \(c>\frac{5(12\tau +18)}{2\tau U},\)

$$\begin{aligned}{} & {} \tilde{\varepsilon }_m\le \tilde{\varepsilon }_0=\varepsilon =\varepsilon _{-}Q_{n+1}^{-2^{n+2-n_{0}}c\tau U} \le Q_{n+1}^{-120(2\tau +3)}, \\{} & {} Q_{n+1}\le \tilde{\varepsilon }_{m}^{-\frac{1}{120(2\tau +3)}}. \end{aligned}$$
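A brief sketch of the exponent count behind the first line, assuming \(\varepsilon _{-}\le 1\) and \(Q_{n+1}\ge 1\), and using the lower bound \(c\tau U>30(2\tau +3)\) from (6.7): since \(n\ge n_0\),

$$\begin{aligned} 2^{n+2-n_{0}}c\tau U\ge 4c\tau U>4\cdot 30(2\tau +3)=120(2\tau +3), \end{aligned}$$

so that \(Q_{n+1}^{-2^{n+2-n_{0}}c\tau U}\le Q_{n+1}^{-120(2\tau +3)}\). The second line is the same inequality solved for \(Q_{n+1}\), combined with \(\tilde{\varepsilon }_m\le \varepsilon \).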

In the proof of Lemma 6.1, we have obtained

$$\begin{aligned} \tilde{K}_{m}\le \left( \frac{1}{\tilde{\varepsilon }_0}\right) ^{\frac{1}{12\tau +18}} \le \tilde{\varepsilon }_m^{-\frac{1}{12\tau +18}}. \end{aligned}$$

By the definition of \(\tilde{\sigma }_{m},\)

$$\begin{aligned}{} & {} \tilde{\sigma }_{m}=\frac{r_0}{5\cdot 2^{m+3}}Q_{n+1}^{-4}\ge Q_{n+1}^{-5}\varepsilon ^{\frac{1}{24(2\tau +3)}}, \\{} & {} \frac{1}{\tilde{\sigma }_{m}}\le Q_{n+1}^{5}\varepsilon ^{-\frac{1}{2(24\tau +36)}}\le \tilde{\varepsilon }_{m}^{-\frac{1}{24\tau +36}}. \end{aligned}$$

Owing to the assumptions on \(\varepsilon _0\), one can obtain

$$\begin{aligned}{} & {} \varepsilon _0\le \frac{(r_0s_0\gamma _0)^{12\tau +36}}{Q_{n_0}^{2c\tau U}}\le \gamma _{0}^{12\tau +36}, \\{} & {} \frac{1}{\gamma _0}\le \left( \frac{1}{\varepsilon _0}\right) ^{\frac{1}{12\tau +36}} \le \left( \frac{1}{\varepsilon }\right) ^{\frac{1}{12\tau +36}} \end{aligned}$$

and

$$\begin{aligned} \frac{1}{\gamma }\le \frac{2}{\gamma _0}\le 2\left( \frac{1}{\varepsilon }\right) ^{\frac{1}{12\tau +36}} \le 2\tilde{\varepsilon }_m^{-\frac{1}{12\tau +36}}. \end{aligned}$$

It is easy to see that

$$\begin{aligned} \frac{1}{(r-\tilde{r}_{m})^4}\le \left( \frac{8Q_{n}^{4}}{r_0}\right) ^4 \le Q_{n+1}^{2}\le \tilde{\varepsilon }_m^{-\frac{1}{60(2\tau +3)}}. \end{aligned}$$

Using the inequalities above, we have

$$\begin{aligned} \begin{aligned}&\Vert F_m\Vert _{\tilde{s}_m,D(\tilde{r}_m-2\tilde{\sigma }_m,\tilde{s}_m)\times \mathcal {O}}\\&\quad \le \frac{C_0\zeta ^{2}Q_{n+1}^{2}\tilde{K}_{m}^{2\tau +1}}{\gamma ^{2}\tilde{\sigma }_{m}^{2}(r-\tilde{r}_m)^{4}}\Vert \tilde{R}_m\Vert _{\tilde{s}_m,D(\tilde{r}_m,\tilde{s}_m)\times \mathcal {O}}\\&\quad \le \frac{C_0\zeta ^{2}Q_{n+1}^{2}\tilde{K}_{m}^{2\tau +1}}{\gamma ^{2}\tilde{\sigma }_{m}^{2}(r-\tilde{r}_m)^{4}}\tilde{\varepsilon }_m\\&\quad \le 16C_0\varepsilon _0 \tilde{\varepsilon }_{m}^{1-\frac{2}{10(24\tau +36)}-\frac{2\tau +1}{12\tau +18} -\frac{2}{12\tau +36}-\frac{20}{10(24\tau +36)}-\frac{2}{10(24\tau +36)}}\\&\quad \le \tilde{\varepsilon }_{m}^{\frac{81}{100}}. \end{aligned} \end{aligned}$$
(6.16)
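The final inequality in (6.16) relies on the exponent in the preceding line being larger than \(\frac{81}{100}\). The sketch below evaluates that exponent exactly on a sample grid of \(\tau \). The grid \(0\le \tau \le 50\) is an assumption made for illustration (the admissible range of \(\tau \) is fixed elsewhere in the paper), and the names `exponent_6_16`, `SAMPLE_TAUS` and `WORST` are hypothetical.

```python
from fractions import Fraction


def exponent_6_16(tau):
    """Exponent of eps~_m in the third line of (6.16), as an exact fraction."""
    tau = Fraction(tau)
    a = 10 * (24 * tau + 36)
    return (1
            - Fraction(2) / a
            - (2 * tau + 1) / (12 * tau + 18)
            - Fraction(2) / (12 * tau + 36)
            - Fraction(20) / a
            - Fraction(2) / a)


# Illustrative grid of tau values in [0, 50] with step 1/10 (an assumption).
SAMPLE_TAUS = [Fraction(k, 10) for k in range(0, 501)]
WORST = min(exponent_6_16(t) for t in SAMPLE_TAUS)
```

On this grid the smallest value stays above \(\frac{81}{100}\); for example, at \(\tau =1\) the exponent equals \(\frac{491}{600}\). The remaining gap is what is available to absorb the prefactor \(16C_0\varepsilon _0\).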

Let \(\tilde{\Phi }_m=\phi ^1_{F_m},\) then

$$\begin{aligned} \begin{aligned} \tilde{X}_{m+1}&=(\tilde{\Phi }_m)^*\tilde{X}_{m}\\&=(\tilde{N}_{m}+\tilde{R}_{m})\circ X_{F}^{t}+\tilde{R}_{m}\circ X_{F}^{t}+ (\tilde{P}_{m}-\tilde{R}_{m})\circ X_{F}^{t}\\&=\tilde{N}_{m}+\llbracket \tilde{R}_m\rrbracket +\breve{R}_{m}+\int \limits _{0}^{1}[\tilde{R}_{m}(t),F_m] \circ \phi ^{t}_{F_m}dt+(\tilde{P}_{m}-\tilde{R}_{m})\circ \phi _{F_m}^{1}\\&=\tilde{N}_{m+1}+\tilde{P}_{m+1}, \end{aligned} \end{aligned}$$
(6.17)

where

$$\begin{aligned} \tilde{N}_{m+1}= \tilde{N}_{m}+\llbracket \tilde{R}_m \rrbracket , \end{aligned}$$
$$\begin{aligned} \tilde{P}_{m+1}=\breve{R}_{m}+\int ^1_0(\phi ^t_{F_m})^*[\tilde{R}_m(t),F_m]dt+(\phi ^1_{F_m})^*(\tilde{P}_m-\tilde{R}_m), \end{aligned}$$
(6.18)

where \(\tilde{R}_m(t)=(1-t)(\llbracket \tilde{R}_m \rrbracket +\breve{R}_m)+t\tilde{R}_m.\)

Now

$$\begin{aligned} b_{m+1,j}(\theta ,\xi )=b_{m,j}(\theta ,\xi )+\textrm{i}\widetilde{R}^{z_jz_j}_m(\theta ,\xi ), \end{aligned}$$

for which we have

$$\begin{aligned} \begin{aligned} \Vert b_{m+1}\Vert _{\infty ,D(\tilde{r}_{m+1})\times \mathcal {O}}&\le \Vert b_{m}\Vert _{\infty ,D(\tilde{r}_{m})\times \mathcal {O}}+ \Vert \textrm{diag}\,\tilde{R}^{zz}_m\Vert _{\infty ,D(\tilde{r}_{m})\times \mathcal {O}} \\&\le \Vert b_{m}\Vert _{\infty ,D(\tilde{r}_{m},\tilde{s}_{m})\times \mathcal {O}}+ \Vert \tilde{P}_m\Vert _{\tilde{s}_{m};D(\tilde{r}_{m},\tilde{s}_{m})\times \mathcal {O}}\\&\le \tilde{\zeta }_m+\tilde{\varepsilon }_m \le \tilde{\zeta }_{m+1}. \end{aligned} \end{aligned}$$

Now we consider the estimate for coordinate transformation \(\tilde{\Phi }_m:=\phi ^1_{F_m}.\) To this end, we first prove

$$\begin{aligned} \Vert DF_m\Vert _{\tilde{s}_{m+1},D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}} \le \tilde{\varepsilon }_{m}^{\frac{47}{100}}. \end{aligned}$$
(6.19)

Indeed, by Cauchy’s estimate and (6.16), one has

$$\begin{aligned} \begin{aligned}&\Vert DF_m\Vert _{\tilde{s}_{m+1},D_{\tilde{\rho }_{m+1}}(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}}\\&\quad =\Vert DF_m\Vert _{\tilde{s}_{m+1},D_{\tilde{\rho }_{m+1}}(\tilde{r}_{m}-5\tilde{\sigma }_{m},\tilde{s}_{m+1})\times \mathcal {O}}\\&\quad \le \frac{1}{\tilde{\eta }_{m+1}}\frac{C}{3\tilde{\sigma }_m}\Vert F_m\Vert _{\tilde{s}_{m}, D_{\tilde{\rho }_{m}}(\tilde{r}_{m}-2\tilde{\sigma }_{m},\tilde{s}_{m})\times \mathcal {O}}\\&\quad \le \frac{C}{3}\tilde{\varepsilon }_{m}^{(\frac{81}{100}-\frac{1}{3}-\frac{1}{24\tau +36})} \le \tilde{\varepsilon }_{m}^{\frac{47}{100}}. \end{aligned} \end{aligned}$$

As a consequence, for every \(-1\le t\le 1\) the flow \(\phi ^t_{F_m}\) generated by vector field \(F_m\) defines an \(S-\)invariant coordinate transformation

$$\begin{aligned} \phi ^t_{F_m}: D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O} \rightarrow D(\tilde{r}_{m},\tilde{s}_m)\times \mathcal {O} \end{aligned}$$

and, by Gronwall’s inequality together with (6.16) and (6.19),

$$\begin{aligned}{} & {} \begin{aligned} \Vert \phi _{F_{m}}^{t}-id\Vert _{\tilde{s}_{m+1},D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}} \le C\Vert F_m\Vert _{\tilde{s}_{m+1},D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}} \le C\tilde{\varepsilon }_{m}^\frac{81}{100}, \end{aligned} \\{} & {} \begin{aligned} \Vert D\phi _{F_{m}}^{t}-Id\Vert _{\tilde{s}_{m+1},D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}} \le C\Vert DF_m\Vert _{\tilde{s}_{m+1},D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}} \le C\tilde{\varepsilon }_{m}^\frac{47}{100}. \end{aligned} \end{aligned}$$

Finally, we give the estimate for \(\tilde{P}_{m+1}\) in (6.18). It follows from the definition of \(\tilde{K}_{m}\) that

$$\begin{aligned} e^{-\tilde{K}_{m}\tilde{\sigma }_{m}}\le \tilde{\varepsilon }_{m-1}= \tilde{\varepsilon }_{m}^{\frac{4}{5}}. \end{aligned}$$

By the conclusion in (5.10) and (6.16),

$$\begin{aligned} \begin{aligned}&\Vert \breve{R}_m\Vert _{\tilde{s}_m,D(\tilde{r}_m-4\tilde{\sigma }_m,\tilde{s}_m)\times \mathcal {O}}\\&\quad \le C_0\zeta ^{2}Q_{n+1}^{2}\tilde{\sigma }_{m}^{-2}(r-\tilde{r}_m)^{-2}e^{-\tilde{K}_{m}\tilde{\sigma }_{m}} \Big (\Vert \tilde{R}_m\Vert _{\tilde{s}_m,D(\tilde{r}_m,\tilde{s}_m)\times \mathcal {O}}\\&\qquad +\Vert \underline{b}_m\Vert _{D(\tilde{r}_m)}\Vert F_m\Vert _{\tilde{s}_m,D(\tilde{r}_m-2\tilde{\sigma }_m,\tilde{s}_m)\times \mathcal {O}}\Big )\\&\quad \le C_0\cdot 4\varepsilon _0\tilde{\varepsilon }_{m}^{-\frac{2}{10(24\tau +36)}-\frac{20}{10(24\tau +36)} -\frac{1}{10(24\tau +36)}+\frac{4}{5}} (\tilde{\varepsilon }_m+\tilde{\zeta }_{m}^{\frac{1}{2}}\tilde{\varepsilon }_m^{\frac{81}{100}})\\&\quad \le C_0\cdot 4\varepsilon _0\tilde{\varepsilon }_{m}^{-\frac{2}{10(24\tau +36)}-\frac{20}{10(24\tau +36)} -\frac{1}{10(24\tau +36)}+\frac{4}{5}+\frac{81}{100}}\\&\quad \le \tilde{\varepsilon }_{m}^{\frac{8}{5}}. \end{aligned} \end{aligned}$$
(6.20)
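
As a quick check of the last step in (6.20), assuming \(\tau \ge 10\) (the standing assumption stated in Corollary 6.1) and that the constant \(4C_0\varepsilon _0\) is absorbed by the smallness of \(\tilde{\varepsilon }_m\), the exponents compare as follows:

$$\begin{aligned} \frac{4}{5}+\frac{81}{100}-\frac{2+20+1}{10(24\tau +36)} =\frac{161}{100}-\frac{23}{10(24\tau +36)} \ge \frac{161}{100}-\frac{23}{2760} =\frac{8}{5}+\frac{1}{600}. \end{aligned}$$

Under these assumptions the spare factor \(\tilde{\varepsilon }_m^{\frac{1}{600}}\) is what remains to control the constant in front.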

Using Cauchy’s estimate and the estimate above, we have

$$\begin{aligned} \begin{aligned}&\Vert \breve{R}_m\Vert _{\tilde{s}_{m+1},D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}}\\&\quad \le \frac{1}{\tilde{\eta }_{m+1}}\Vert \breve{R}_m\Vert _{\tilde{s}_{m},D(\tilde{r}_{m}-4\tilde{\sigma }_{m}, \tilde{s}_{m})\times \mathcal {O}}\\&\quad \le \tilde{\varepsilon }_{m}^{-\frac{1}{3}}\tilde{\varepsilon }_{m}^{\frac{8}{5}} \le \frac{1}{4}\tilde{\varepsilon }_{m+1}. \end{aligned} \end{aligned}$$
(6.21)
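
The final inequality in (6.21) can be read as follows. This is only a sketch, assuming the recursion \(\tilde{\varepsilon }_{m+1}=\tilde{\varepsilon }_m^{\frac{5}{4}}\) (equivalent to the relation \(\tilde{\varepsilon }_{m-1}=\tilde{\varepsilon }_m^{\frac{4}{5}}\) used above) and \(\tilde{\eta }_{m+1}^{-1}=\tilde{\varepsilon }_m^{-\frac{1}{3}}\), as in the displayed step:

$$\begin{aligned} \tilde{\varepsilon }_{m}^{-\frac{1}{3}}\tilde{\varepsilon }_{m}^{\frac{8}{5}} =\tilde{\varepsilon }_{m}^{\frac{19}{15}} =\tilde{\varepsilon }_{m}^{\frac{1}{60}}\,\tilde{\varepsilon }_{m}^{\frac{5}{4}} =\tilde{\varepsilon }_{m}^{\frac{1}{60}}\,\tilde{\varepsilon }_{m+1}. \end{aligned}$$

Hence the bound by \(\frac{1}{4}\tilde{\varepsilon }_{m+1}\) holds as soon as \(\tilde{\varepsilon }_m^{\frac{1}{60}}\le \frac{1}{4}\), a smallness condition that we take to be part of the standing assumptions.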

Then we get the estimate for \(\tilde{R}_m(t)=(1-t)(\llbracket \tilde{R}_m \rrbracket +\breve{R}_m)+t\tilde{R}_m\) obtained above,

$$\begin{aligned} \begin{aligned}&\Vert \tilde{R}_m(t)\Vert _{\tilde{s}_{m},D(\tilde{r}_{m}-4\tilde{\sigma }_{m},4\tilde{s}_{m+1})\times \mathcal {O}}\\&\quad \le \Vert \breve{R}_m\Vert _{\tilde{s}_{m},D(\tilde{r}_{m}-4\tilde{\sigma }_{m},\tilde{s}_{m})\times \mathcal {O}} +2\Vert \tilde{P}_m\Vert _{\tilde{s}_{m},D(\tilde{r}_{m},\tilde{s}_{m})\times \mathcal {O}}\\&\quad \le \tilde{\varepsilon }_{m}^{\frac{8}{5}}+2\tilde{\varepsilon }_{m}. \end{aligned} \end{aligned}$$
(6.22)

The estimates for \(\tilde{R}_{m}(t)\) and \(F_m\) imply that

$$\begin{aligned} \begin{aligned}&\Vert (\phi ^{t}_{F_m})^*[\tilde{R}_{m}(t),F_m]\Vert _{\tilde{s}_{m+1},D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}}\\&\quad \le \frac{2}{\tilde{\eta }_{m+1}\tilde{\sigma }_m}\Vert \tilde{R}_{m}(t)\Vert _{\tilde{s}_{m},D(\tilde{r}_m-4\tilde{\sigma }_m,4\tilde{s}_{m+1})\times \mathcal {O}} \Vert F_m\Vert _{\tilde{s}_{m},D(\tilde{r}_m-4\tilde{\sigma }_m,4\tilde{s}_{m+1})\times \mathcal {O}}\\&\quad \le 2\tilde{\varepsilon }_{m}^{-\frac{1}{3}}\tilde{\varepsilon }_m^{-\frac{1}{24\tau +36}} (2\tilde{\varepsilon }_{m}+\tilde{\varepsilon }_{m}^{\frac{8}{5}}) \tilde{\varepsilon }_{m}^{\frac{81}{100}} \le \frac{1}{4}\tilde{\varepsilon }_{m+1}. \end{aligned} \end{aligned}$$
(6.23)

Consider the estimate for \(\Vert (\phi ^1_{F_m})^*(\tilde{P}_m-\tilde{R}_m)\Vert _{\tilde{s}_{m+1};D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}}.\) We rewrite \(\tilde{P}_m-\tilde{R}_m\) as \(\tilde{P}_m-\tilde{R}_m=\tilde{P}_{(1)m}+\tilde{P}_{(2)m}\) where

$$\begin{aligned} \tilde{P}_{(1)m}=\sum \limits _{\textsf {v}\in \{z_j, \bar{z}_j: j\ge 1\}} \sum \limits _{\begin{array}{c} {|k|> \tilde{K}_m,}\\ {|\alpha |+|\beta |\le 1} \end{array}}\tilde{P}^{(\textsf {v})}_{m, k \alpha \beta }\textrm{e}^{\textrm{i}\langle k, \theta \rangle }z^{\alpha }\bar{z}^{\beta }\frac{\partial }{\partial \textsf {v}} \end{aligned}$$

and

$$\begin{aligned} \tilde{P}_{(2)m}=\sum \limits _{\textsf {v}\in \{z_j, \bar{z}_j:j\ge 1\}} \sum \limits _{\begin{array}{c} {k\in \mathbb {Z}^2,}\\ {|\alpha |+|\beta |\ge 2} \end{array}}\tilde{P}^{(\textsf {v})}_{m, k \alpha \beta }\textrm{e}^{\textrm{i}\langle k, \theta \rangle }z^{\alpha }\bar{z}^{\beta }\frac{\partial }{\partial \textsf {v}}. \end{aligned}$$

Then

$$\begin{aligned} \begin{aligned}&\Vert (\phi ^1_{F_m})^*(\tilde{P}_m-\tilde{R}_m)\Vert _{\tilde{s}_{m+1};D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}}\\&\quad \le 2\Vert \tilde{P}_{(1)m}\Vert _{\tilde{s}_{m+1}; D(\tilde{r}_{m}-5\tilde{\sigma }_m,\tilde{s}_{m+1})\times \mathcal {O}} +2\Vert \tilde{P}_{(2)m}\Vert _{\tilde{s}_{m+1}; D(\tilde{r}_{m}-5\tilde{\sigma }_m,\tilde{s}_{m+1})\times \mathcal {O}}\\&\quad \le 2\tilde{\eta }_{m+1}^{-1}\frac{32}{(5\tilde{\sigma }_m)^2}\textrm{e}^{-5\tilde{K}_m\tilde{\sigma }_{m}} \Vert \tilde{P}_{m}\Vert _{\tilde{s}_{m};D(\tilde{r}_{m},\tilde{s}_{m})\times \mathcal {O}} +2\tilde{\eta }_{m+1}\Vert \tilde{P}_{m}\Vert _{\tilde{s}_{m};D(\tilde{r}_{m},\tilde{s}_{m})\times \mathcal {O}}\\&\quad \le \frac{64}{25}\tilde{\varepsilon }_{m}^{-\frac{1}{3}}\tilde{\varepsilon }_{m}^{4} \tilde{\varepsilon }_{m}^{-\frac{1}{12\tau +18}}\tilde{\varepsilon }_{m}+ 2\tilde{\varepsilon }_{m}^{\frac{1}{3}}\tilde{\varepsilon }_{m} \le \frac{1}{2}\tilde{\varepsilon }_{m+1}. \end{aligned} \end{aligned}$$

Together, these estimates yield

$$\begin{aligned} \begin{aligned}&\Vert \tilde{P}_{m+1}\Vert _{\tilde{s}_{m+1};D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}}\\&\quad \le \Vert \breve{R}_m\Vert _{\tilde{s}_{m+1},D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}} +\Vert (\phi ^{t}_{F_m})^*[\tilde{R}_{m}(t),F_m]\Vert _{\tilde{s}_{m+1},D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}}\\&\qquad +\Vert (\phi ^1_{F_m})^*(\tilde{P}_m-\tilde{R}_m)\Vert _{\tilde{s}_{m+1};D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}}\\&\quad \le \tilde{\varepsilon }_{m+1}. \end{aligned} \end{aligned}$$
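
For the reader's convenience, the last inequality is the sum of the three bounds obtained above, namely (6.21), (6.23) and the estimate for \((\phi ^1_{F_m})^*(\tilde{P}_m-\tilde{R}_m)\):

$$\begin{aligned} \frac{1}{4}\tilde{\varepsilon }_{m+1}+\frac{1}{4}\tilde{\varepsilon }_{m+1}+\frac{1}{2}\tilde{\varepsilon }_{m+1}=\tilde{\varepsilon }_{m+1}. \end{aligned}$$

In the third bound the dominant contribution is \(2\tilde{\varepsilon }_m^{\frac{4}{3}}=2\tilde{\varepsilon }_m^{\frac{1}{12}}\tilde{\varepsilon }_{m+1}\), assuming \(\tilde{\varepsilon }_{m+1}=\tilde{\varepsilon }_m^{\frac{5}{4}}\).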

\(\square \)

Now we verify the Töplitz–Lipschitz property of \( \widetilde{P}_{m+1}\).

Lemma 6.3

Suppose the vector field \(\widetilde{P}_{m}\) in \( \widetilde{X}_{m}=\widetilde{N}_{m}+\widetilde{P}_{m}\) above satisfies Töplitz–Lipschitz property (A4) on \(D(\tilde{r}_{m},\tilde{s}_{m})\) with \(\widetilde{\varepsilon }_{m},\, \tilde{\rho }_m\) in place of \(\varepsilon ,\,\rho \). Then the vector field \(\widetilde{P}_{m+1}\) in Proposition 6.2 also satisfies Töplitz–Lipschitz property (A4) on \(D(\tilde{r}_{m+1},\tilde{s}_{m+1})\) with \(\widetilde{\varepsilon }_{m+1},\, \tilde{\rho }_{m+1}\) in place of \(\varepsilon ,\,\rho \).

Proof

From the proof of Proposition 6.2, it is not difficult to verify that the following inequality holds:

$$\begin{aligned} \left( \frac{32}{\tilde{\sigma }^2_m}\textrm{e}^{-\tilde{K}_m\tilde{\sigma }_m}\right) ^{\frac{4}{3}}\le \tilde{\varepsilon }_m \le \min \bigg \{\tilde{\sigma }^{(24\tau +36)}_m, \left( \frac{(r-\tilde{r}_{m})^2}{Q_{n+1}}\right) ^{60(2\tau +3)}, \left( \frac{\gamma ^2}{C_0\tilde{K}_{m}^{2\tau +1}}\right) ^{\frac{50}{9}}\bigg \}. \end{aligned}$$

This is exactly assumption (5.27) in Proposition 5.2 with \((r, \tilde{r}, \gamma , \sigma , K)\) in Sect. 5 replaced by \((r, \tilde{r}_m, \gamma , \tilde{\sigma }_m, \tilde{K}_m)\). Hence, by Proposition 5.2, \(F_m\) and \(\breve{R}_{m}\) in Proposition 6.2 satisfy the Töplitz–Lipschitz property (A4) on \(D(\tilde{r}_m-5\tilde{\sigma }_m, \tilde{s}_m)\) with \(\tilde{\varepsilon }^{\frac{3}{5}}_m\) and \(\tilde{\varepsilon }^{\frac{4}{3}}_m\), respectively, in place of \(\varepsilon \).

Below we verify that \(\widetilde{P}_{m+1}\) satisfies the Töplitz–Lipschitz property (A4). Note that it can be rewritten as

$$\begin{aligned} \begin{aligned} \widetilde{P}_{m+1}&=\breve{R}_{m}+\widetilde{P}_m-\widetilde{R}_m+[\widetilde{P}_m,F_m]+\frac{1}{2!}[[\tilde{N}_m,F_m],F_m]+\frac{1}{2!}[[\widetilde{P}_m,F_m],F_m]\\&\quad +\cdots +\frac{1}{i!}[\cdots [\tilde{N}_m,\underbrace{F_m]\cdots ,F_m}_i]+\frac{1}{i!}[\cdots [\widetilde{P}_m,\underbrace{F_m]\cdots ,F_m}_i]+\cdots . \end{aligned} \end{aligned}$$

Thus it suffices to verify that \([\widetilde{P}_m,F_m]\) and \(\widetilde{P}_m-\widetilde{R}_m\) satisfy the Töplitz–Lipschitz property.

  • We first prove that \([\widetilde{P}_m,F_m]\) satisfies the Töplitz–Lipschitz property. By the definition of the Lie bracket, the \(z_i\)-component of \([\tilde{P}_m,F_m]\) is

    $$\begin{aligned}{}[\tilde{P}_m,F_m]^{(z_i)}=\sum \limits _{u\in \mathscr {V}}\left( \frac{\partial \tilde{P}_{m}^{(z_i)}}{\partial u}F_{m}^{(u)}+\frac{\partial F_{m}^{(z_i)}}{\partial u}\tilde{P}_{m}^{(u)}\right) . \end{aligned}$$

    In what follows, we only consider \(\frac{\partial }{\partial z_{j}}[\widetilde{P}_m,F_m]^{(z_{i})}\) and the derivatives with respect to the other components are similarly analyzed. To this end, it suffices to consider \(\sum \limits _{h}\frac{\partial ^2 \widetilde{P}_m^{(z_i)}}{\partial z_h \partial z_j}F_m^{(z_h)}\) and \(\sum \limits _{h}\frac{\partial \widetilde{P}_m^{(z_i)}}{\partial z_h }\frac{\partial F_m^{(z_h)}}{\partial z_j }\) in \(\frac{\partial }{\partial z_{j}}[\widetilde{P}_m, F_m]^{(z_{i})}\) since the other terms can be similarly studied. Let \( p^{zz}_{ij}:=\lim \limits _{t\rightarrow \infty }\frac{\partial \widetilde{P}_m^{(z_{i+t})}}{\partial z_{j+t}},\,\,f^{zz}_{ij}:=\lim \limits _{t\rightarrow \infty }\frac{\partial F_m^{(z_{i+t})}}{\partial z_{j+t}}.\) Then

    $$\begin{aligned} \begin{aligned}&\left\| \sum \limits _{h}\left( \frac{\partial ^2 \tilde{P}_m^{(z_{i+t})}}{\partial z_{h} \partial z_{j+t}} F_m^{(z_{h})}-\lim \limits _{t\rightarrow \infty }\frac{\partial ^2 \tilde{P}_m^{(z_{i+t})}}{\partial z_{h}\partial z_{j+t}}F_m^{(z_{h})}\right) \right\| _{D{(\tilde{r}_{m+1},\tilde{s}_{m+1})}}\\&\quad \le \Vert F_m\Vert _{D{(\tilde{r}_{m}-2\tilde{\sigma }_m,\tilde{s}_{m})}}\left\| \frac{\partial \tilde{P}_m^{(z_{i+t})}}{\partial z_{j+t}}- p^{zz}_{ij}\right\| _{D{(\tilde{r}_{m},\tilde{s}_{m})}}\\&\quad \le \frac{\tilde{\varepsilon }_{m}^{1+\frac{81}{100}}}{|t|}\textrm{e}^{-\tilde{\rho }_m|i-j|} \le \tilde{\varepsilon }_{m}^{\frac{14}{25}}\frac{\tilde{\varepsilon }_{m+1}}{|t|}\textrm{e}^{-\tilde{\rho }_{m+1}|i-j|}. \\&\left\| \sum \limits _{h}\left( \frac{\partial \tilde{P}_m^{(z_{i+t})}}{\partial z_{h+t}}\frac{\partial F_m^{(z_{h+t})}}{\partial z_{j+t} }-p^{zz}_{ih}f^{zz}_{hj}\right) \right\| _{D(\tilde{r}_{m+1},\tilde{s}_{m+1})}\\&\quad \le \sum \limits _{h}\Vert f^{zz}_{hj}\Vert _{D{(\tilde{r}_{m+1},\tilde{s}_{m+1})}}\left\| \frac{\partial \tilde{P}_m^{(z_{i+t})}}{\partial z_{h+t}}-p^{zz}_{ih}\right\| _{D(\tilde{r}_{m+1},\tilde{s}_{m+1})}\\&\qquad +\sum \limits _{h}\Vert p^{zz}_{ih}\Vert _{{D(\tilde{r}_{m+1},\tilde{s}_{m+1})}}\left\| \frac{\partial F_m^{(z_{h+t})}}{\partial z_{j+t} }-f^{zz}_{hj}\right\| _{D(\tilde{r}_{m+1},\tilde{s}_{m+1})}\\&\qquad +\sum \limits _{h}\left\| \frac{\partial \tilde{P}_m^{(z_{i+t})}}{\partial z_{h+t}}-p^{zz}_{ih}\right\| _{D(\tilde{r}_{m+1},\tilde{s}_{m+1})}\left\| \frac{\partial F_m^{(z_{h+t})}}{\partial z_{j+t}}-f^{zz}_{hj}\right\| _{D(\tilde{r}_{m+1},\tilde{s}_{m+1})}\\&\quad \le \sum \limits _{h}\Vert f^{zz}_{hj}\Vert _{D{(\tilde{r}_{m}-2\tilde{\sigma }_m,\tilde{s}_{m})}}\left\| \frac{\partial \tilde{P}_m^{(z_{i+t})}}{\partial z_{h+t}}-p^{zz}_{ih}\right\| _{D(\tilde{r}_{m},\tilde{s}_{m})}\\&\qquad +\sum \limits _{h}\Vert p^{zz}_{ih}\Vert _{{D(\tilde{r}_{m}-4\tilde{\sigma }_m,\tilde{s}_{m})}}\left\| \frac{\partial F_m^{(z_{h+t})}}{\partial z_{j+t} }-f^{zz}_{hj}\right\| _{D(\tilde{r}_{m},\tilde{s}_{m})}\\&\qquad +\sum \limits _{h}\left\| \frac{\partial \tilde{P}_m^{(z_{i+t})}}{\partial z_{h+t}}-p^{zz}_{ih}\right\| _{D(\tilde{r}_{m},\tilde{s}_{m})}\left\| \frac{\partial F_m^{(z_{h+t})}}{\partial z_{j+t}}-f^{zz}_{hj}\right\| _{D(\tilde{r}_{m}-2\tilde{\sigma }_m,\tilde{s}_{m})}\\&\quad \le \sum \limits _{h}\left( \frac{\tilde{\varepsilon }_{m}^{1+\frac{3}{5}}}{|t|} \textrm{e}^{-\tilde{\rho }_m|j-h|}\textrm{e}^{-\tilde{\rho }_m|i-h|} +\frac{\tilde{\varepsilon }_{m}^{1+\frac{3}{5}}}{|t|} \textrm{e}^{-\tilde{\rho }_m|i-h|}\textrm{e}^{-\tilde{\rho }_m|j-h|} +\frac{\tilde{\varepsilon }_{m}^{1+\frac{3}{5}}}{|t|} \textrm{e}^{-\tilde{\rho }_m|i-h|}\textrm{e}^{-\tilde{\rho }_m|j-h|}\right) \\&\quad \le \frac{3\tilde{\varepsilon }_{m}^{1+\frac{3}{5}}}{|t|}\textrm{e}^{-\tilde{\rho }_{m+1}|i-j|} \sum \limits _{h}\textrm{e}^{-(\tilde{\rho }_m-\tilde{\rho }_{m+1} )(|j-h|+|i-h|)}\\&\quad \le 3\tilde{\varepsilon }_{m}^{\frac{3}{10}}\frac{\tilde{\varepsilon }_{m+1}}{|t|}\textrm{e}^{-\tilde{\rho }_{m+1}|i-j|},\,\,\hbox {if}\,\,\, \tilde{\varepsilon }^{\frac{1}{20}}_m\frac{\textrm{e}^{(\tilde{\rho }_m-\tilde{\rho }_{m+1})}+1}{\textrm{e}^{(\tilde{\rho }_m-\tilde{\rho }_{m+1})}-1}\le 1. \end{aligned} \end{aligned}$$
  • Next we prove that \(\tilde{P}_m-\tilde{R}_m\) satisfies the Töplitz–Lipschitz property. Let

    $$\begin{aligned} \tilde{P}_m-\tilde{R}_m=\tilde{P}_{(1)m}+\tilde{P}_{(2)m} \end{aligned}$$

    where

    $$\begin{aligned}{} & {} \tilde{P}_{(1)m}=\sum \limits _{\textsf {v}\in \{z_j, \bar{z}_j:j\ge 1\}} \sum \limits _{\begin{array}{c} {|k|>\tilde{K}_m,}\\ {|\alpha |+|\beta |\le 1} \end{array}}\tilde{P}^{(\textsf {v})}_{m, k \alpha \beta }\textrm{e}^{\textrm{i}\langle k, \theta \rangle }z^{\alpha }\bar{z}^{\beta }\frac{\partial }{\partial \textsf {v}}, \\{} & {} \tilde{P}_{(2)m}=\sum \limits _{\textsf {v}\in \{z_j, \bar{z}_j:j\ge 1\}} \sum \limits _{\begin{array}{c} {k\in \mathbb {Z}^2,}\\ {|\alpha |+|\beta |\ge 2} \end{array}}\tilde{P}^{(\textsf {v})}_{m, k \alpha \beta }\textrm{e}^{\textrm{i}\langle k, \theta \rangle }z^{\alpha }\bar{z}^{\beta }\frac{\partial }{\partial \textsf {v}}. \end{aligned}$$

    We write

    $$\begin{aligned} \tilde{P}_{(1)m}^{(z_i)} =\sum \limits _{|k|> \tilde{K}_m}\tilde{P}^{(z_i)}_{m,k00}\textrm{e}^{\textrm{i}\langle k, \theta \rangle } +\sum \limits _{|k|> \tilde{K}_m,j}\tilde{P}^{(z_i)}_{m,k e_j 0}\textrm{e}^{\textrm{i}\langle k, \theta \rangle }z_j +\sum \limits _{|k|> \tilde{K}_m,j}\tilde{P}^{(z_i)}_{m,k0e_j}\textrm{e}^{\textrm{i}\langle k, \theta \rangle }\bar{z}_j, \end{aligned}$$

    and

    $$\begin{aligned} \frac{\partial \tilde{P}_{(1)m}^{(z_i)}}{\partial z_j} =\sum \limits _{|k|> \tilde{K}_m}\tilde{P}^{(z_i)}_{m, k e_j 0}\textrm{e}^{\textrm{i}\langle k, \theta \rangle } =\mathcal {R}_{\tilde{K}_m}\frac{\partial \tilde{P}^{(z_i)}_{m}}{\partial z_j}(\theta ,0,0). \end{aligned}$$

    Then

    $$\begin{aligned} \begin{aligned}&\left\| \frac{\partial \tilde{P}_{(1)m}^{(z_{i+t})}}{\partial z_{j+t}} -\lim \limits _{t\rightarrow \infty }\frac{\partial \tilde{P}_{(1)m}^{(z_{i+t})}}{\partial z_{j+t}}\right\| _{D{(\tilde{r}_{m+1},\tilde{s}_{m+1})}}\\&\quad \le \frac{32}{\tilde{\sigma }_{m}^2}\textrm{e}^{-\tilde{K}_m\tilde{\sigma }_{m}}\left\| \frac{\partial \tilde{P}_{m}^{(z_{i+t})}}{\partial z_{j+t}}-\lim \limits _{t\rightarrow \infty } \frac{\partial \tilde{P}_{m}^{(z_{i+t})}}{\partial z_{j+t}}\right\| _{D(\tilde{r}_{m},\tilde{s}_{m})}\\&\quad \le \tilde{\varepsilon }_{m}^{\frac{3}{4}}\frac{\tilde{\varepsilon }_m}{|t|}\textrm{e}^{-\tilde{\rho }_{m}|i-j|}\\&\quad \le \tilde{\varepsilon }_{m}^{\frac{1}{2}} \frac{\tilde{\varepsilon }_{m+1}}{|t|}\textrm{e}^{-\tilde{\rho }_{m+1}|i-j|}. \end{aligned} \end{aligned}$$

    For \(\tilde{P}^{(z_i)}_{(2)m},\) we note that

    $$\begin{aligned} \begin{aligned} \tilde{P}^{(z_i)}_{(2)m}&=\sum \limits _{|\alpha |+|\beta |\ge 2}\tilde{P}^{(z_i)}_{m,\alpha \beta }(\theta )z^{\alpha }\bar{z}^{\beta }\\&=\sum \limits _{l\le j}\tilde{P}^{(z_i)}_{m, e_{l}+e_{j},0}(\theta )z_lz_j+\sum \limits _{l,j}\tilde{P}^{(z_i)}_{m,e_{l}, e_{j}}(\theta )z_l\bar{z}_j+\sum \limits _{l\le j}\tilde{P}^{(z_i)}_{m,0,e_{l}+e_{j}}(\theta )\bar{z}_l\bar{z}_j+O(\Vert z\Vert ^3_p) \end{aligned} \end{aligned}$$

    and

    $$\begin{aligned} \frac{\partial \tilde{P}_{(2)m}^{(z_i)}}{\partial z_j} =O\left( \sum \limits _{l}z_l\frac{\partial ^2 \tilde{P}_{m}^{(z_i)}}{\partial z_l\partial z_j}(\theta ,0,0)\right) . \end{aligned}$$

    Then

    $$\begin{aligned}{} & {} \left\| \frac{\partial \tilde{P}_{(2)m}^{(z_{i+t})}}{\partial z_{j+t}} -\lim \limits _{t\rightarrow \infty }\frac{\partial \tilde{P}_{(2)m}^{(z_{i+t})}}{\partial z_{j+t}}\right\| _{D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}} \nonumber \\{} & {} \quad \le C\left\| \sum \limits _{l}z_l\frac{\partial ^2 \tilde{P}_{m}^{(z_{i+t})}}{\partial z_l\partial z_{j+t}}(\theta ,0,0)-\lim \limits _{t\rightarrow \infty }\sum \limits _{l}z_l\frac{\partial ^2 \tilde{P}_{m}^{(z_{i+t})}}{\partial z_l\partial z_{j+t}}(\theta ,0,0)\right\| _{D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}} \nonumber \\{} & {} \quad \le C\sup \limits _{\Vert z^{\pm }\Vert _p<\tilde{s}_{m+1}}\sum \limits _{l}|z_l|\left\| \frac{\partial }{\partial z_{l}}\left( \frac{\partial \tilde{P}_{m}^{(z_{i+t})}}{\partial z_{j+t}} -\lim \limits _{t\rightarrow \infty }\frac{\partial \tilde{P}_{m}^{(z_{i+t})}}{\partial z_{j+t}}\right) \right\| _{D(\tilde{r}_{m}, \frac{1}{2}\tilde{s}_{m})\times \mathcal {O}} \nonumber \\{} & {} \quad \le C\sup \limits _{\Vert z^{\pm }\Vert _p<\tilde{s}_{m+1}}\sum \limits _{l}|z_l|\frac{\textrm{e}^{p|l|}}{\tilde{s}_{m}}\left\| \frac{\partial \tilde{P}_{m} ^{(z_{i+t})}}{\partial z_{j+t}}-\lim \limits _{t\rightarrow \infty }\frac{\partial \tilde{P}_{m}^{(z_{i+t})}}{\partial z_{j+t}}\right\| _{D(\tilde{r}_{m},\tilde{s}_{m})\times \mathcal {O}}\nonumber \\{} & {} \quad \le \frac{C\tilde{s}_{m+1}}{\tilde{s}_{m}}\frac{\tilde{\varepsilon }_m}{|t|} \textrm{e}^{-\tilde{\rho }_{m}|i-j|}\nonumber \\{} & {} \quad \le C\tilde{\varepsilon }_{m}^{\frac{1}{12}}\frac{\tilde{\varepsilon }_{m+1}}{|t|}\textrm{e}^{-\tilde{\rho }_{m+1}|i-j|}. \end{aligned}$$
    (6.24)

    Similarly, one can verify that

    $$\begin{aligned} \left\| \lim \limits _{t\rightarrow \infty }\frac{\partial \tilde{P}_{(1)m}^{(z_{i+t})}}{\partial z_{j+t}}\right\| _{D{(\tilde{r}_{m+1},\tilde{s}_{m+1})}} \le \tilde{\varepsilon }_{m}^{\frac{1}{2}} \tilde{\varepsilon }_{m+1}\textrm{e}^{-\tilde{\rho }_{m+1}|i-j|} \end{aligned}$$
    (6.25)

    and

    $$\begin{aligned} \left\| \lim \limits _{t\rightarrow \infty }\frac{\partial \tilde{P}_{(2)m}^{(z_{i+t})}}{\partial z_{j+t}}\right\| _{D(\tilde{r}_{m+1},\tilde{s}_{m+1})}\le C\tilde{\varepsilon }_{m}^{\frac{1}{12}}\tilde{\varepsilon }_{m+1}\textrm{e}^{-\tilde{\rho }_{m+1}|i-j|}. \end{aligned}$$
    (6.26)

    Therefore, \(\tilde{P}_{m+1}\) satisfies Töplitz–Lipschitz property with \(\tilde{\varepsilon }_{m+1}, \tilde{\rho }_{m+1}\) in place of \(\varepsilon , \rho .\)
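
The summation over \(h\) in the first item can be made explicit. Write \(\delta :=\tilde{\rho }_m-\tilde{\rho }_{m+1}>0\) (a shorthand introduced only here) and assume that \(h\) ranges over \(\mathbb {Z}\); over a subset the sum is smaller. The triangle inequality \(|i-j|\le |i-h|+|j-h|\) then gives

$$\begin{aligned} \begin{aligned} \textrm{e}^{-\tilde{\rho }_m(|i-h|+|j-h|)}&\le \textrm{e}^{-\tilde{\rho }_{m+1}|i-j|}\,\textrm{e}^{-\delta (|i-h|+|j-h|)},\\ \sum \limits _{h\in \mathbb {Z}}\textrm{e}^{-\delta (|i-h|+|j-h|)}&\le \sum \limits _{h\in \mathbb {Z}}\textrm{e}^{-\delta |i-h|} =1+2\sum \limits _{n\ge 1}\textrm{e}^{-\delta n} =\frac{\textrm{e}^{\delta }+1}{\textrm{e}^{\delta }-1}, \end{aligned} \end{aligned}$$

which is the factor appearing in the smallness condition at the end of that estimate.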

\(\square \)

After performing L steps of finite iteration, we take \(r_{+}=\frac{\tilde{r}_0}{2}\) and \(s_{+}=\tilde{s}_{L}.\) It is obvious that

$$\begin{aligned}{} & {} r_{+}\le \tilde{r}_{L-1}-\frac{\tilde{r}_0}{2^{m+1}}=\tilde{r}_{L}, \end{aligned}$$
(6.27)
$$\begin{aligned}{} & {} s_{+} =\tilde{\varepsilon }_{0}^{\frac{4}{3}((\frac{5}{4})^L-1)}\tilde{s}_{0}. \end{aligned}$$
(6.28)
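
Formula (6.28) is a geometric sum. As a sketch, assume the recursions \(\tilde{s}_{m+1}=\tilde{\varepsilon }_m^{\frac{1}{3}}\tilde{s}_m\) and \(\tilde{\varepsilon }_m=\tilde{\varepsilon }_0^{(\frac{5}{4})^m}\), which is how the ratios \(\tilde{s}_{m+1}/\tilde{s}_m\) and \(\tilde{\varepsilon }_{m+1}/\tilde{\varepsilon }_m\) enter the estimates above. Then

$$\begin{aligned} \tilde{s}_{L}=\tilde{s}_0\prod \limits _{m=0}^{L-1}\tilde{\varepsilon }_m^{\frac{1}{3}} =\tilde{s}_0\,\tilde{\varepsilon }_0^{\frac{1}{3}\sum _{m=0}^{L-1}(\frac{5}{4})^m} =\tilde{s}_0\,\tilde{\varepsilon }_0^{\frac{1}{3}\cdot \frac{(\frac{5}{4})^L-1}{\frac{5}{4}-1}} =\tilde{\varepsilon }_{0}^{\frac{4}{3}((\frac{5}{4})^L-1)}\tilde{s}_{0}. \end{aligned}$$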

The analysis above yields the following corollary.

Corollary 6.1

Consider X in (6.3). For every \(0<\gamma <1,\) \(\tau \ge 10,\) \(s>0,\) \(r>0\), \(\rho >0\), \(\mathcal {A}\ge 14\), if

$$\begin{aligned} \Omega +[B]\in \mathcal{M}\mathcal{C}_\omega (\gamma , \tau , K, \mathcal {O}), \end{aligned}$$
$$\begin{aligned}{} & {} \Vert B\Vert _{\infty ,D(r)\times \mathcal {O}}\le \zeta , \end{aligned}$$
(6.29)
$$\begin{aligned}{} & {} \Vert P\Vert _{s; D(r,s)\times \mathcal {O}}\le \varepsilon \end{aligned}$$
(6.30)

and P has the Töplitz–Lipschitz property on \(D(r,s)\), then there exist \(s_{+}>0,\) \(r_{+}>0\), \(\rho _{+}>0\), and a real analytic, near-identity, \(S-\)invariant transformation

$$\begin{aligned} \Phi : D(r_{+}, s_{+})\times \mathcal {O}\rightarrow D(r, s)\times \mathcal {O} \end{aligned}$$

of the form

$$\begin{aligned} (\theta , z, \bar{z})\mapsto (\theta , W(\theta , z, \bar{z}), \overline{W}(\theta , z, \bar{z})) \end{aligned}$$

where W and \(\overline{W}\) are affine in \(z, \bar{z},\) which transforms the vector field X above into

$$\begin{aligned} \begin{aligned} X_{+}&=N_{+}+P_{+}\\&=\omega \frac{\partial }{\partial \theta }+\textrm{i}(\Omega (\xi )+B_{+}(\theta ,\xi ))z\frac{\partial }{\partial z}-\textrm{i}(\Omega (\xi )+\overline{B_{+}(\theta ,\xi )})\bar{z}\frac{\partial }{\partial \bar{z}}+P_{+}\\ \end{aligned} \end{aligned}$$
(6.31)

with

$$\begin{aligned} \Vert B_+\Vert _{\infty ,D(r_{+})\times \mathcal {O}}\le \zeta _+, \end{aligned}$$
(6.32)
$$\begin{aligned} \Vert P_+\Vert _{s_+; D(r_+,s_+)\times \mathcal {O}}\le \varepsilon _+, \end{aligned}$$
(6.33)

and \(P_+\) has the Töplitz–Lipschitz property on \( D(r_+,s_+).\) Moreover, \(\Phi \) satisfies

$$\begin{aligned} \Vert \Phi -id\Vert _{s_+; D(r_+,s_+)\times \mathcal {O}}\le \varepsilon ^{\frac{4}{5}}, \end{aligned}$$
(6.34)
$$\begin{aligned} \Vert \mathcal {D}\Phi -I d\Vert _{s_+;D(r_+,s_+)\times \mathcal {O}}\le \varepsilon ^{\frac{9}{20}}. \end{aligned}$$
(6.35)

Proof

Take

$$\begin{aligned} \Phi =\tilde{\Phi }_{0}\circ \tilde{\Phi }_{1}\circ \cdots \circ \tilde{\Phi }_{L-1}, \end{aligned}$$

then by Proposition 6.2, we have

$$\begin{aligned} X_+=\Phi ^*X=N_++P_+=\tilde{N}_L+\tilde{P}_L \end{aligned}$$

with \(B_+=B+b_L\) and thus

$$\begin{aligned} \begin{aligned} \Vert B_+\Vert _{\infty ,D(r_{+})\times \mathcal {O}}&\le \Vert B\Vert _{\infty ,D(\tilde{r}_{L})\times \mathcal {O}}+\Vert b_L\Vert _{\infty ,D(\tilde{r}_{L})\times \mathcal {O}}\\&\le \zeta +\tilde{\zeta }_L \\&\le \zeta _+. \end{aligned} \end{aligned}$$

By Proposition 6.2 and Lemma 6.3, (6.33) holds true and \(P_+\) has Töplitz–Lipschitz property.

Now we verify (6.34) and (6.35).

$$\begin{aligned} \begin{aligned} \Vert \Phi -id\Vert _{s_{+};D(r_{+},s_{+})\times \mathcal {O}} =&\Vert \tilde{\Phi }_{0}\circ \tilde{\Phi }_{1}\circ \cdots \circ \tilde{\Phi }_{L-1}-id\Vert _{s_{+};D(r_{+},s_{+})\times \mathcal {O}}\\ \le&\Vert \tilde{\Phi }_{0}-id\Vert _{\tilde{s}_{1};D(\tilde{r}_{1},\tilde{s}_{1}) \times \mathcal {O}}\\&+\sum \limits ^{L-1}_{j=1}\prod ^{j-1}_{b=0}\Vert \mathcal {D} \tilde{\Phi }_{b}\Vert _{\tilde{s}_{b+1}; D(\tilde{r}_{b+1},\tilde{s}_{b+1})\times \mathcal {O}} \Vert \tilde{\Phi }_{j}-id\Vert _{\tilde{s}_{j+1};D(\tilde{r}_{j+1},\tilde{s}_{j+1})\times \mathcal {O}}\\ \le&C\tilde{\varepsilon }^{\frac{81}{100}}_0+4C\tilde{\varepsilon }^{\frac{81}{100}}_1\\ \le&2C\tilde{\varepsilon }^{\frac{81}{100}}_0 \le \varepsilon ^{\frac{4}{5}}. \end{aligned} \end{aligned}$$
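
The second line of this estimate comes from a telescoping decomposition. With the shorthand \(\Psi _j:=\tilde{\Phi }_0\circ \cdots \circ \tilde{\Phi }_{j-1}\) and \(\Psi _0:=id\) (notation introduced only for this remark), and up to the conventions of the weighted norm,

$$\begin{aligned} \begin{aligned} \Phi -id&=\sum \limits _{j=0}^{L-1}\left( \Psi _j\circ \tilde{\Phi }_j-\Psi _j\right) ,\\ \Vert \Psi _j\circ \tilde{\Phi }_j-\Psi _j\Vert&\le \Vert \mathcal {D}\Psi _j\Vert \,\Vert \tilde{\Phi }_j-id\Vert \le \prod \limits _{b=0}^{j-1}\Vert \mathcal {D}\tilde{\Phi }_b\Vert \,\Vert \tilde{\Phi }_j-id\Vert . \end{aligned} \end{aligned}$$

The first identity holds because \(\Psi _j\circ \tilde{\Phi }_j=\Psi _{j+1}\) and \(\Psi _L=\Phi \); the middle inequality is the mean value theorem on the nested domains, and the last one is the chain rule.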

We first have

$$\begin{aligned} \begin{aligned} \Vert \mathcal {D}\tilde{\Phi }^{-1}_m-I d\Vert _{\tilde{s}_{m+1};D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}} \le&\sum \limits ^{\infty }_{j=1}\Vert \mathcal {D}\tilde{\Phi }_m-I d\Vert ^j_{\tilde{s}_{m+1};D(\tilde{r}_{m+1},\tilde{s}_{m+1})\times \mathcal {O}}\\ \le&\sum \limits ^{\infty }_{j=1}\left( C\tilde{\varepsilon }^{\frac{47}{100}}_m\right) ^{j} \le 2C\tilde{\varepsilon }^{\frac{47}{100}}_m, \end{aligned} \end{aligned}$$
(6.36)
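
Estimate (6.36) is the usual Neumann series bound. As a sketch, write \(A:=\mathcal {D}\tilde{\Phi }_m-Id\) (a shorthand used only here), read \(\mathcal {D}\tilde{\Phi }^{-1}_m\) as the inverse of \(\mathcal {D}\tilde{\Phi }_m\), and assume \(\Vert A\Vert \le \frac{1}{2}\) in a submultiplicative operator norm. Then

$$\begin{aligned} (Id+A)^{-1}-Id=\sum \limits _{j\ge 1}(-A)^j,\qquad \Vert (Id+A)^{-1}-Id\Vert \le \sum \limits _{j\ge 1}\Vert A\Vert ^j=\frac{\Vert A\Vert }{1-\Vert A\Vert }\le 2\Vert A\Vert . \end{aligned}$$

With \(\Vert A\Vert \le C\tilde{\varepsilon }_m^{\frac{47}{100}}\) this gives the stated bound.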

then

$$\begin{aligned} \begin{aligned}&\Vert \mathcal {D}\Phi -Id\Vert _{s_{+};D(r_{+},s_{+})\times \mathcal {O}}\\&\quad =\Vert \mathcal {D}(\tilde{\Phi }_{0}\circ \tilde{\Phi }_{1}\circ \cdots \circ \tilde{\Phi }_{L-1})-Id\Vert _{s_{+};D(r_{+},s_{+})\times \mathcal {O}}\\&\quad \le \prod ^{L-2}_{b=0}\Vert \mathcal {D}\tilde{\Phi }_{b}\Vert _{\tilde{s}_{b+1};D(\tilde{r}_{b+1}, \tilde{s}_{b+1})\times \mathcal {O}}\Vert \mathcal {D}\tilde{\Phi }_{L-1}-Id\Vert _{\tilde{s}_{L}; D(\tilde{r}_{L},\tilde{s}_{L})\times \mathcal {O}}\\&\qquad +\sum \limits ^{L-2}_{j=0}\prod ^{j}_{b=0}\Vert \mathcal {D}\tilde{\Phi }_{b}\Vert _{\tilde{s}_{b+1}; D(\tilde{r}_{b+1},\tilde{s}_{b+1})\times \mathcal {O}}\Vert \mathcal {D}\tilde{\Phi }^{-1}_{j}-Id\Vert _{\tilde{s}_{j+1};D(\tilde{r}_{j+1},\tilde{s}_{j+1})\times \mathcal {O}}\\&\quad \le 2C\tilde{\varepsilon }^{\frac{47}{100}}_{L-1} +2C\sum \limits ^{L-2}_{j=0}\tilde{\varepsilon }_{j}^{\frac{47}{100}} \le \varepsilon ^{\frac{9}{20}}. \end{aligned} \end{aligned}$$

\(\square \)

6.2 An Infinite Induction

Let \(r>0\), \(s>0\), \(\rho >0\), \(\tau \ge 10,\) \(0<\gamma <1\) and \(\mathcal {A}\ge 14\) be given, and let U be as defined in Sect. 5. Suppose c is a constant with \(c>\frac{30(2\tau +3)}{\tau U}.\) Let \(0<\varepsilon <1\) and \(n_0\in \mathbb {N}\) be such that

$$\begin{aligned} Q_{n_0+1}^{-4c\tau U}<\varepsilon <\min \left\{ \frac{(rs\gamma )^{12\tau +36}}{Q_{n_0}^{2c\tau U}},\,\,2^{-32}r^4_0,\,\,2^{-18}\gamma ^{12}_0\right\} \end{aligned}$$
(6.37)

and

$$\begin{aligned} \ln \varepsilon ^{-1}\le \varepsilon ^{-\frac{1}{24\tau +36}}. \end{aligned}$$
(6.38)

For \(\nu \ge 1,\) we first define

$$\begin{aligned} \varepsilon _{\nu }=\varepsilon _{\nu -1}\cdot Q_{n_0+\nu }^{-2^{\nu +1}c\tau U},\,\varepsilon _{0}=\varepsilon . \end{aligned}$$
(6.39)

We also define other sequences as follows.

$$\begin{aligned} \begin{aligned}&\zeta _{\nu }=\sum \limits _{i=0}^{\nu -1}\varepsilon _{i}^{\frac{1}{2}},\,\,\,\gamma _{\nu }=\gamma _{0}-3\sum \limits _{i=0}^{\nu -1}\varepsilon _{i}^{\frac{1}{2}},\,\gamma _{0}=\gamma , \\&K_\nu =\Big \lfloor \left( \frac{\gamma ^{2}_\nu }{2\varepsilon ^{\frac{1}{2}}_\nu }\right) ^{\frac{1}{4\tau +6}}\Big \rfloor +1,\, r_{\nu }=\frac{r_{0}}{4Q_{n_0+\nu -1}^{4}},\,r_{0}=r,\\&s_{\nu }=s_{\nu -1}\cdot \varepsilon _{\nu -1}^{\frac{4}{3}\Big (\left( \frac{5}{4}\right) ^{2+\lfloor \frac{2^{\nu }c\tau U \ln Q_{n_0+\nu -1}}{24(2\tau +3)\ln \frac{5}{2}}\rfloor }-1\Big )},\,s_{0}=s,\\&\rho _{\nu }=\rho _0\left( 1-\sum \limits _{i=2}^{\nu +1}2^{-i}\right) ,\,\rho _0=\rho ,\, D_{\nu }=D(r_{\nu },s_{\nu }). \\ \end{aligned} \end{aligned}$$
(6.40)
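
Two of the sequences in (6.39) and (6.40) have simple closed forms, recorded here as a convenience; they follow directly from the definitions:

$$\begin{aligned} \begin{aligned} \rho _{\nu }&=\rho _0\left( 1-\Big (\frac{1}{2}-2^{-(\nu +1)}\Big )\right) =\rho _0\left( \frac{1}{2}+2^{-(\nu +1)}\right) ,\qquad \rho _{\nu }-\rho _{\nu +1}=2^{-(\nu +2)}\rho _0,\\ \varepsilon _{\nu }&=\varepsilon \prod \limits _{i=1}^{\nu }Q_{n_0+i}^{-2^{i+1}c\tau U}. \end{aligned} \end{aligned}$$

In particular, \(\rho _\nu \) decreases to \(\frac{\rho _0}{2}\), so the decay rate stays bounded away from zero.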

Obviously, \(s_0>\cdots>s_\nu>s_{\nu +1}>\cdots \searrow 0\) and \(r_0>\cdots>r_\nu>r_{\nu +1}>\cdots \searrow 0.\)

According to the preceding analysis in Sect. 6.1, we obtain the following iterative lemma.

Lemma 6.4

(Iterative Lemma) Suppose \(\varepsilon \) satisfies (6.37) and (6.38), and the \(S-\)reversible vector field

$$\begin{aligned} \begin{aligned}&X_{\nu }=N_{\nu }+P_{\nu }\\&\quad =\omega \frac{\partial }{\partial \theta }+\textrm{i}(\Omega (\xi )+B_{\nu }(\theta ;\xi ))z\frac{\partial }{\partial z}-\textrm{i}(\Omega (\xi )+\overline{B_{\nu }(\theta ;\xi )})\bar{z}\frac{\partial }{\partial \bar{z}}+P_{\nu }(\theta , z, \bar{z}; \xi )\\ \end{aligned} \end{aligned}$$

on \(D_\nu \times \mathcal {O}_\nu \) satisfies

$$\begin{aligned}{} & {} \Vert B_{\nu }\Vert _{\infty ,D(r_{\nu })\times \mathcal {O}_{\nu }}\le \zeta _{\nu }. \end{aligned}$$
(6.41)
$$\begin{aligned}{} & {} \Omega (\xi )+[B_\nu (\cdot ;\xi )] \in \mathcal{M}\mathcal{C}_\omega (\gamma _\nu , \tau , K_\nu , \mathcal {O}_{\nu }), \end{aligned}$$
(6.42)
$$\begin{aligned}{} & {} \Vert P_\nu \Vert _{s_\nu ; D_\nu \times \mathcal {O}_\nu }\le \varepsilon _\nu \end{aligned}$$
(6.43)

and \(P_\nu \) has Töplitz–Lipschitz property (A4) with \(\varepsilon _\nu ,\, \rho _\nu \) in place of \(\varepsilon ,\, \rho .\) Then there exists a real analytic, \(S-\)invariant transformation

$$\begin{aligned} \Phi _\nu : D_{\nu +1}\times \mathcal {O}_\nu \rightarrow D_\nu \end{aligned}$$

of the form \((\theta , z, \bar{z};\xi )\mapsto (\theta , W_\nu (\theta , z, \bar{z};\xi ), \overline{W}_\nu (\theta , z, \bar{z};\xi )),\) where \(W_\nu \) and \(\overline{W}_\nu \) are affine in \(z, \bar{z},\) satisfying

$$\begin{aligned}{} & {} \Vert \Phi _\nu -id\Vert _{s_{\nu +1}; D_{\nu +1}\times \mathcal {O}_\nu }\le \varepsilon ^{\frac{4}{5}}_\nu , \\{} & {} \Vert \mathcal {D}\Phi _\nu -I d\Vert _{s_{\nu +1};D_{\nu +1}\times \mathcal {O}_\nu } \le \varepsilon ^{\frac{9}{20}}_\nu \end{aligned}$$

and a closed subset

$$\begin{aligned} \mathcal {O}_{\nu +1}=\mathcal {O}_\nu \setminus \bigcup _{K_\nu \le |k|< K_{\nu +1}}\bigcup _{0<|l|\le 2}\Gamma ^{\nu +1}_{kl}(\gamma _{\nu +1}), \end{aligned}$$
(6.44)

where

$$\begin{aligned} \Gamma ^{\nu +1}_{kl}(\gamma _{\nu +1})=\{\xi \in \mathcal {O}_\nu :|\langle k, \omega \rangle +\langle l, \Omega (\xi )+[B_{\nu +1}]\rangle |<\frac{\gamma _{\nu +1}}{\langle k\rangle ^\tau }\} \end{aligned}$$

such that \(X_{\nu +1}=(\Phi _{\nu })^*X_\nu =N_{\nu +1}+P_{\nu +1}\) satisfies the same assumptions as \(X_\nu \) with ‘\(\nu +1\)’ in place of ‘\(\nu \)’.

Proof

By the assumptions (6.41), (6.42) and (6.43), applying Corollary 6.1, we obtain a real analytic, \(S-\)invariant transformation \(\Phi _\nu \) as described in the lemma such that \(X_{\nu +1}=(\Phi _{\nu })^*X_\nu =N_{\nu +1}+P_{\nu +1}.\) Moreover, the estimate

$$\begin{aligned} \Vert B_{\nu +1}-B_\nu \Vert _{\infty ,D(r_{\nu +1})\times \mathcal {O}_{\nu }} \le 2\varepsilon _\nu \le \varepsilon _{\nu }^{\frac{1}{2}} \end{aligned}$$
(6.45)

holds.

The new parameter set \(\mathcal {O}_{\nu +1}\) can be constructed as follows. For all \(\xi \in \mathcal {O}_\nu \) and \(|k|< K_\nu ,\) by (6.45),

$$\begin{aligned} \begin{aligned}&|\langle k, \omega \rangle +\langle l, \Omega (\xi )+[B_{\nu +1}]\rangle |\\&\quad \ge |\langle k,\omega \rangle +\langle l,\Omega (\xi )+[B_{\nu }]\rangle |-|\langle l, [B_{\nu +1}]-[B_\nu ]\rangle |\\&\quad \ge \frac{\gamma _{\nu }}{\langle k\rangle ^\tau } -\frac{\varepsilon _{\nu }^{\frac{1}{2}}}{\langle k\rangle ^\tau } \ge \frac{\gamma _{\nu +1}}{\langle k\rangle ^\tau }. \end{aligned} \end{aligned}$$
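
Two elementary consequences of the definitions in (6.40) are used here and in checking (6.41) at step \(\nu +1\):

$$\begin{aligned} \begin{aligned} \gamma _{\nu }-\gamma _{\nu +1}&=3\varepsilon _{\nu }^{\frac{1}{2}}\ge \varepsilon _{\nu }^{\frac{1}{2}},\\ \Vert B_{\nu +1}\Vert _{\infty ,D(r_{\nu +1})\times \mathcal {O}_{\nu }}&\le \Vert B_{\nu }\Vert _{\infty ,D(r_{\nu })\times \mathcal {O}_{\nu }}+\Vert B_{\nu +1}-B_\nu \Vert _{\infty ,D(r_{\nu +1})\times \mathcal {O}_{\nu }} \le \zeta _{\nu }+\varepsilon _{\nu }^{\frac{1}{2}}=\zeta _{\nu +1}. \end{aligned} \end{aligned}$$

The first relation gives the last inequality in the display above, since \(\langle k\rangle ^{-\tau }>0\); the second combines (6.41) with (6.45).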

Then it remains to exclude the resonant sets \(\Gamma ^{\nu +1}_{kl}(\gamma _{\nu +1})\) for \(K_\nu \le |k|< K_{\nu +1}\) and \(0<|l|\le 2,\) and we thus obtain the desired parameter set \(\mathcal {O}_{\nu +1}\) in (6.44).

Applying Corollary 6.1 again, on \(D_{\nu +1}\times \mathcal {O}_{\nu +1},\) \(X_{\nu +1}\) has the same properties as those of \(X_{\nu }.\) \(\square \)

7 Convergence and Measure Estimates

7.1 Convergence

We begin with the \(S-\)reversible vector field

$$\begin{aligned} \begin{aligned} X_{0}&=N_{0}+P_{0}\\&=N+P\\&=\omega \frac{\partial }{\partial \theta }+\textrm{i}\Omega (\xi )z\frac{\partial }{\partial z}-\textrm{i}\Omega (\xi )\bar{z}\frac{\partial }{\partial \bar{z}}+P(\theta , z, \bar{z}; \xi ) \end{aligned} \end{aligned}$$
(7.1)

on \(D(r,s)\) with \(\xi \in \mathcal {O},\) where \(\Omega _j\in C^1_W(\mathcal {O})\) and \(B_0=0.\) The non-resonance condition

$$\begin{aligned} \Omega \in \mathcal{M}\mathcal{C}_\omega (\gamma _0, \tau , K_0, \mathcal {O}_0) \end{aligned}$$

is satisfied by setting

$$\begin{aligned} \mathcal {O}_{0}=\mathcal {O} \setminus \bigcup _{0<|k|\le K_{0}}\bigcup _{0<|l|\le 2}\Gamma ^{0}_{kl}(\gamma _{0}). \end{aligned}$$
(7.2)

The perturbation satisfies

$$\begin{aligned} \Vert P_0\Vert _{s_0; D_{0}\times \mathcal {O}_0}\le \varepsilon _0. \end{aligned}$$

We conclude from the Iterative Lemma 6.4 that there exists a decreasing sequence of domains \(D_\nu \times \mathcal {O}_\nu \) and a sequence of transformations

$$\begin{aligned} \Phi ^\nu :=\Phi _0\circ \cdots \circ \Phi _{\nu -1}:D_{\nu }\times \mathcal {O}_{\nu -1}\rightarrow D_0, \end{aligned}$$

such that \((\Phi ^\nu )^*X_0=N_\nu +P_\nu ,\,\,\nu \ge 1,\) which satisfies the properties in Lemma 6.4. As in [40], the maps \(\Phi ^\nu \) converge uniformly on \(D_\infty \times \mathcal {O}_\gamma \subseteq \bigcap \limits _{\nu \ge 0} D_\nu \times \mathcal {O}_\nu ,\) where \(D_\infty :=\mathbb {T}^2\times \{0\}\times \{0\}\), \(\mathcal {O}_\gamma :=\bigcap \limits _{\nu \ge 0}\mathcal {O}_\nu ,\) to a Whitney smooth family of smooth (\(C^\infty \)) torus embeddings

$$\begin{aligned} \Phi : \mathbb {T}^2\times \mathcal {O}_\gamma \rightarrow \mathcal {P}. \end{aligned}$$

Similarly, \(B_\nu \) converge uniformly on \(\mathbb {T}^2\times \mathcal {O}_\gamma \) to a limit \(B_*.\) Moreover,

$$\begin{aligned} \begin{aligned}&\Vert X\circ \Phi ^\nu -\mathcal {D}\Phi ^\nu \cdot N_\nu \Vert _{s_\nu ;D_{\nu }\times \mathcal {O}_\gamma } \\ \le&\Vert \mathcal {D}\Phi ^\nu \Vert _{s_\nu ;D_{\nu }\times \mathcal {O}_\gamma } \Vert (\Phi ^\nu )^*X - N_\nu \Vert _{s_\nu ;D_{\nu }\times \mathcal {O}_\gamma } \\ =&O(\varepsilon _\nu ). \end{aligned} \end{aligned}$$

Letting \(\nu \rightarrow \infty ,\) we have \(X\circ \Phi =\mathcal {D}\Phi \cdot N_*\) on \(D_\infty \) for each \(\xi \in \mathcal {O}_\gamma ,\) where

$$\begin{aligned} \begin{aligned} N_*=&\omega \frac{\partial }{\partial \theta }+\textrm{i}(\Omega (\xi )+B_{*}(\theta ;\xi ))z\frac{\partial }{\partial z}-\textrm{i}(\Omega (\xi )+\overline{B_{*}(\theta ;\xi )})\bar{z}\frac{\partial }{\partial \bar{z}}.\\ \end{aligned} \end{aligned}$$
(7.3)

As in [40], \(\Phi \) can be extended to \(D(0, \frac{s}{2})\times \mathcal {O}_\gamma \) since \(\Phi \) is affine in \(z,\,\bar{z}\). More precisely, uniformly on \(D(0, \frac{s}{2})\times \mathcal {O}_\gamma ,\) we have

$$\begin{aligned} (\Phi ^\nu )^*X - N_\nu \longrightarrow \Phi ^*X - N_*=:P_*,\,\,\hbox {as}\,\,\nu \rightarrow \infty , \end{aligned}$$

such that

$$\begin{aligned} P_*=\sum \limits _{\textsf {v}\in \{z_j, \bar{z}_j, j\ge 1\}}\sum \limits _{|\alpha |+|\beta |\ge 2}P^{(\textsf {v})}_{*\alpha \beta }(\theta ; \xi )z^{\alpha }\bar{z}^{\beta }\frac{\partial }{\partial \textsf {v}}. \end{aligned}$$

Finally, we verify the \(C^{\infty }\) smoothness of \(\Phi ^{\infty }\) in \(\theta \). For \(\Phi ^{\nu }\) defined above,

$$\begin{aligned} \Vert \mathcal {D}\Phi ^{\nu }\Vert _{s_{\nu },D_{\nu }\times \mathcal {O}} \le \prod _{j=0}^{\nu -1}\Vert \mathcal {D}\Phi _{j}\Vert _{s_{j},D_{j}\times \mathcal {O}} \le \prod _{j=0}^{\nu -1}\left( 1+\varepsilon _j^{\frac{9}{20}}\right) \le 2, \end{aligned}$$
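
The last inequality above is the standard product bound based on \(1+x\le \textrm{e}^{x}\). As a sketch, assume \(\sum _{j\ge 0}\varepsilon _j^{\frac{9}{20}}\le \ln 2\). We expect this from the rapid decay of \(\varepsilon _j\) in (6.39) together with the smallness of \(\varepsilon \), but it is an assumption of this remark. Then

$$\begin{aligned} \prod _{j=0}^{\nu -1}\left( 1+\varepsilon _j^{\frac{9}{20}}\right) \le \exp \Big (\sum _{j=0}^{\nu -1}\varepsilon _j^{\frac{9}{20}}\Big ) \le \textrm{e}^{\ln 2}=2. \end{aligned}$$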

then

$$\begin{aligned} \Vert \Phi ^{\nu +1}-\Phi ^{\nu }\Vert _{s_{\nu +1},D_{\nu +1}\times \mathcal {O}} \le \Vert \mathcal {D}\Phi ^{\nu }\Vert _{s_{\nu },D_{\nu }\times \mathcal {O}}\cdot \Vert \Phi _{\nu }-id\Vert _{s_{\nu },D_{\nu }\times \mathcal {O}} \le 2\varepsilon _{\nu }^{\frac{4}{5}}. \end{aligned}$$

By the definition of \(\varepsilon _\nu \) we know that for every \(b\in \mathbb {Z}^2\), there exists some \(\mathcal {N}\in \mathbb {N}\) so that for all \(\nu \ge \mathcal {N}\), we have \(2\left( \frac{4Q_{n_0+\nu }^{4}}{r_0}\right) ^{|b|}\le \varepsilon _{\nu }^{-\frac{2}{5}}.\)

Then, by the Cauchy estimate, we have

$$\begin{aligned} \begin{aligned} \left| \frac{\partial ^{|b|}}{\partial \theta ^b}\left( \Phi ^{\nu +1}-\Phi ^{\nu }\right) \right| \le&r_{\nu +1}^{-|b|}\Vert \Phi ^{\nu +1}-\Phi ^{\nu }\Vert _{s_{\nu +1},D_{\nu +1}\times \mathcal {O}}\\ \le&\left( \frac{4Q_{n_0+\nu }^{4}}{r_0}\right) ^{|b|}\cdot 2\varepsilon _{\nu }^{\frac{4}{5}}\\ \le&\varepsilon _{\nu }^{-\frac{2}{5}}\cdot \varepsilon _{\nu }^{\frac{4}{5}} =\varepsilon _{\nu }^{\frac{2}{5}}, \end{aligned} \end{aligned}$$

which implies that the limit \(\Phi ^\infty =\lim \limits _{\nu \rightarrow \infty }\Phi ^\nu \) is \(C^\infty \) smooth in \(\theta \).
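
The last step can be made explicit. The following is a sketch under the assumption, not restated in this subsection, that \(\sum _{\nu }\varepsilon _{\nu }^{\frac{2}{5}}<\infty \), as one expects for the rapidly decaying sequences used in KAM schemes. For fixed \(b\) and \(m>\nu \ge \mathcal {N}\), telescoping gives

$$\begin{aligned} \left| \frac{\partial ^{|b|}}{\partial \theta ^b}\left( \Phi ^{m}-\Phi ^{\nu }\right) \right| \le \sum _{j=\nu }^{m-1}\left| \frac{\partial ^{|b|}}{\partial \theta ^b}\left( \Phi ^{j+1}-\Phi ^{j}\right) \right| \le \sum _{j\ge \nu }\varepsilon _{j}^{\frac{2}{5}}\longrightarrow 0\,\,\hbox {as}\,\,\nu \rightarrow \infty , \end{aligned}$$

so each sequence of derivatives \(\frac{\partial ^{|b|}}{\partial \theta ^b}\Phi ^{\nu }\) is uniformly Cauchy for real \(\theta \), and its limit is the corresponding derivative of \(\Phi ^\infty \).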

7.2 Measure Estimates

In this subsection, we complete the Lebesgue measure estimate of the parameter set \(\mathcal {O} \setminus \mathcal {O}_\gamma .\) In the process of constructing iterative sequences, we obtain a decreasing sequence of closed sets \(\mathcal {O}_0\supset \mathcal {O}_1\supset \cdots \) such that \(\mathcal {O}_\gamma =\bigcap \limits _{\nu \ge 0}\mathcal {O}_\nu \) and

$$\begin{aligned} \mathcal {O} \setminus \mathcal {O}_\gamma =\bigcup _{\nu \ge 0} \bigcup _{\begin{array}{c} {K_{\nu -1}<|k|\le K_{\nu },}\\ {0<|l|\le 2} \end{array}}\Gamma ^{\nu }_{kl}(\gamma _{\nu }), \end{aligned}$$
(7.4)

where for \(\nu \ge 0,\)

$$\begin{aligned} \Gamma ^{\nu }_{kl}(\gamma _{\nu })=\left\{ \xi \in \mathcal {O}_{\nu -1}:|\langle k, \omega \rangle +\langle l, \Omega (\xi )+[B_{\nu }]\rangle |<\frac{\gamma _{\nu }}{\langle k\rangle ^\tau }\right\} \end{aligned}$$

Here \(B_{0}=0\) and \(K_{-1}=0.\)

In what follows, it remains to consider the case \(l=e_i-e_j,\,i\ne j,\) which is the most difficult one.

Note that \(\Omega _j(\xi )=d(\xi )j+\widetilde{\Omega }_j(\xi )\) and \(\Vert B_{\nu }\Vert _{\infty ,D(r_{\nu })\times \mathcal {O}_{\nu -1}}\le 2\varepsilon ^{\frac{1}{2}}_0.\)

Let

$$\begin{aligned}{} & {} \underline{\Omega }_\nu (\xi ):=\Omega (\xi )+[B_{\nu }], \\{} & {} \Gamma ^{\nu }_{knm}(\gamma _{\nu }):=\left\{ \xi \in \mathcal {O}_{\nu -1}:|\langle k, \omega \rangle +\underline{\Omega }_{\nu ,n}(\xi )-\underline{\Omega }_{\nu ,m}(\xi )|<\frac{\gamma _{\nu }}{K_\nu ^\tau }\right\} , \end{aligned}$$

and

$$\begin{aligned} M^\nu :=\langle k, \omega \rangle +\underline{\Omega }_{\nu ,n}(\xi )-\underline{\Omega }_{\nu ,m}(\xi ). \end{aligned}$$

Lemma 7.1

Suppose \(\mathcal {C}=\mathcal {C}(A_0, A_1, A_2)\) is a constant defined in Section 5. Then for any given \(n,m\in \mathbb {N}\) with \(|n-m|\le \mathcal {C}K_\nu \), there are \(n_0, m_0, t\ge 1\) with \(1\le n_0\le 2\mathcal {C}K_\nu , 1\le m_0\le 2\mathcal {C}K_\nu \) such that \(n=n_0+ t, m=m_0+ t.\) Thus

$$\begin{aligned} \bigcup \limits _{n,m\ge 1} \Gamma ^{\nu }_{knm}\subset \bigcup \limits _{1\le n_0, m_0\le 2\mathcal {C}K_\nu , t\ge 1} \Gamma ^{\nu }_{k,n_0+t,m_0+t}. \end{aligned}$$

Proof

It is easy to see that there exists a \(t_0 \ge 1\) such that \(|n-t_0|\le \mathcal {C} K_\nu .\) Take \(n_0=n-t_0\) and \(m_0=n_0+m-n\), then

$$\begin{aligned} |m_0|\le |n_0|+|m-n|\le 2\mathcal {C}K_\nu . \end{aligned}$$

Hence \(n=n_0+t_0\) and \(m=m_0+t_0,\) so the claim holds with \(t=t_0.\) \(\square \)

Lemma 7.2

For fixed \(k, n_0, m_0,\)

$$\begin{aligned} {{\,\textrm{meas}\,}}\left( \bigcup \limits _{ t \in \mathbb {N}} \Gamma ^{\nu }_{k,n_0+t,m_0+t}\right) \le c\frac{\gamma _\nu }{K^{\frac{\tau }{2}}_\nu }, \end{aligned}$$

where \(c\) is a constant depending on \(A_0, A_1, A_2\) and \({{\,\textrm{meas}\,}}(\mathcal {O})\).

Proof

Let \(\underline{\Omega }_{\nu ,j}=d(\xi )j+\Omega ^0_{\nu ,j},\) and \(M_\nu (t)=\langle k, \omega \rangle +\underline{\Omega }_{\nu ,n_0+t}-\underline{\Omega }_{\nu ,m_0+t}.\)

From the Töplitz–Lipschitz property of \(P_\nu \), we conclude that

$$\begin{aligned} |M_\nu (t)-\lim _{t\rightarrow \infty }M_\nu (t)|<\frac{\varepsilon _0}{|t|}. \end{aligned}$$

Let

$$\begin{aligned} \Gamma ^{\nu }_{k,n_0,m_0,\infty }:=\left\{ \xi \in \mathcal {O}_{\nu -1}:|\lim _{t\rightarrow \infty }M_\nu (t)|<\frac{\gamma _\nu }{K_\nu ^{\frac{\tau }{2}}}\right\} . \end{aligned}$$

For \(\xi \in \mathcal {O}_{\nu -1}{\setminus } \Gamma ^{\nu }_{k,n_0,m_0,\infty },\, |\lim _{t\rightarrow \infty }M_\nu (t)|\ge \frac{\gamma _\nu }{K_\nu ^{\frac{\tau }{2}}}.\)

When \(|t|>K_\nu ^{\frac{\tau }{2}},\) for \(\xi \in \mathcal {O}_{\nu -1}{\setminus } \Gamma ^{\nu }_{k,n_0,m_0,\infty },\) we have

$$\begin{aligned} \begin{aligned} |M_\nu (t)| \ge&|\lim _{t\rightarrow \infty }M_\nu (t)|-|M_\nu (t)-\lim _{t\rightarrow \infty }M_\nu (t)|\\ \ge&\frac{\gamma _\nu }{K_\nu ^{\frac{\tau }{2}}}-\frac{\varepsilon _0}{t}\\ \ge&\frac{\gamma _\nu }{K_\nu ^{\frac{\tau }{2}}}-\frac{\varepsilon _0}{K_\nu ^{\frac{\tau }{2}}} \ge \frac{\gamma _\nu }{K_\nu ^{\tau }}. \end{aligned} \end{aligned}$$

Thus

$$\begin{aligned} \mathcal {O}_{\nu -1}\setminus \Gamma ^{\nu }_{k,n_0,m_0,\infty }\subseteq \{\xi \in \mathcal {O}_{\nu -1}:|M_\nu (t)|\ge \frac{\gamma _{\nu }}{K_\nu ^\tau }\} \end{aligned}$$

then

$$\begin{aligned} \Gamma ^{\nu }_{k,n_0,m_0,\infty }\supseteq \bigcup \limits _{|t|> K_\nu ^{\tau /2}}\{\xi \in \mathcal {O}_{\nu -1}:|M_\nu (t)|<\frac{\gamma _{\nu }}{K_\nu ^\tau }\}. \end{aligned}$$

Notice that by Lemma 9.2, one has \(|\partial _\xi (M_\nu (t))|\ge \frac{A_1}{4},\) then

$$\begin{aligned} {{\,\textrm{meas}\,}}\left( \Gamma ^{\nu }_{k,n_0,m_0,\infty }\right) \le c\frac{\gamma _{\nu }}{K_\nu ^{\frac{\tau }{2}}} \end{aligned}$$

and, by the inclusion above,

$$\begin{aligned} {{\,\textrm{meas}\,}}\left( \bigcup \limits _{|t|> K_\nu ^{\frac{\tau }{2}}}\left\{ \xi \in \mathcal {O}_{\nu -1}:|M_\nu (t)|<\frac{\gamma _{\nu }}{K_\nu ^\tau }\right\} \right) \le c\frac{\gamma _{\nu }}{K_\nu ^{\frac{\tau }{2}}}. \end{aligned}$$

When \(|t|\le K_\nu ^{\frac{\tau }{2}},\) consider the resonant set

$$\begin{aligned} \Gamma ^{\nu }_{k,n_0,m_0,t}:=\{\xi \in \mathcal {O}_{\nu -1}:|M_\nu (t)|<\frac{\gamma _\nu }{K_\nu ^{\tau }}\}. \end{aligned}$$

We have

$$\begin{aligned} {{\,\textrm{meas}\,}}\left( \bigcup \limits _{|t|\le K_\nu ^{\frac{\tau }{2}}}\Gamma ^{\nu }_{k,n_0,m_0,t}\right) \le 2K_\nu ^{\frac{\tau }{2}}\frac{c\gamma _{\nu }}{K_\nu ^{\tau }} \le c\frac{\gamma _{\nu }}{K_\nu ^{\frac{\tau }{2}}}. \end{aligned}$$

Therefore,

$$\begin{aligned} {{\,\textrm{meas}\,}}\left( \bigcup \limits _{t\in \mathbb {N}}\Gamma ^{\nu }_{k,n_0+t,m_0+t}\right) \le c\frac{\gamma _{\nu }}{K_\nu ^{\frac{\tau }{2}}}. \end{aligned}$$

\(\square \)
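
The measure bounds in the proof follow the usual one-dimensional pattern. As a sketch, assume (these are assumptions of the illustration, not statements from the text) that \(\xi \) ranges over an interval \(I\subset \mathbb {R}\) and that \(g:I\rightarrow \mathbb {R}\) is \(C^1\) with \(|g'(\xi )|\ge a>0\) on \(I\). For \(\xi _1,\xi _2\) in the set \(\{|g|<\delta \}\), the mean value theorem gives \(a|\xi _1-\xi _2|\le |g(\xi _1)-g(\xi _2)|<2\delta \), so this set has diameter at most \(\frac{2\delta }{a}\) and

$$\begin{aligned} {{\,\textrm{meas}\,}}\left\{ \xi \in I:|g(\xi )|<\delta \right\} \le \frac{2\delta }{a}. \end{aligned}$$

With \(a=\frac{A_1}{4}\) the right-hand side reads \(\frac{8\delta }{A_1}\), where \(\delta \) is the threshold defining the resonant set in question.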

According to the above analysis, we obtain the following lemma.

Lemma 7.3

Let \(\tau >10.\) Then the total measure of the resonant sets that must be excluded during the KAM iteration satisfies

$$\begin{aligned} {{\,\textrm{meas}\,}}\left( \mathcal {O} \setminus \mathcal {O}_\gamma \right) =O(\gamma ). \end{aligned}$$

Proof

$$\begin{aligned} \mathcal {O} \setminus \mathcal {O}_\gamma =\bigcup _{\nu \ge 0} \bigcup _{\begin{array}{c} {K_{\nu -1}<|k|\le K_{\nu },}\\ {0<|l|\le 2} \end{array}} \Gamma ^{\nu }_{kl}(\gamma _{\nu }), \\ \begin{aligned} {{\,\textrm{meas}\,}}\left( \mathcal {O} \setminus \mathcal {O}_\gamma \right) \le&\sum \limits _{\nu \ge 0}{{\,\textrm{meas}\,}}\left( \bigcup _{\begin{array}{c} {K_{\nu -1}<|k|\le K_{\nu },}\\ {0<|l|\le 2} \end{array}} \Gamma ^{\nu }_{kl}(\gamma _{\nu })\right) \\ \le&\sum \limits _{\nu \ge 0}cK_\nu \frac{\gamma _{\nu }}{K_\nu ^{\frac{\tau }{2}}}=O(\gamma ). \end{aligned} \end{aligned}$$

\(\square \)
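
The convergence of the last sum can be checked as follows. This is a sketch under two illustrative assumptions that are not taken from this subsection: \(\gamma _\nu \le \gamma \) for all \(\nu \), and \(K_\nu \ge 2^{\nu }\). Since \(\tau >10\) gives \(1-\frac{\tau }{2}<-4\),

$$\begin{aligned} \sum \limits _{\nu \ge 0}cK_\nu \frac{\gamma _{\nu }}{K_\nu ^{\frac{\tau }{2}}} \le c\gamma \sum \limits _{\nu \ge 0}K_\nu ^{1-\frac{\tau }{2}} \le c\gamma \sum \limits _{\nu \ge 0}2^{-4\nu } =\frac{16}{15}c\gamma . \end{aligned}$$

Any other growth of \(K_\nu \) that makes \(\sum _\nu K_\nu ^{1-\frac{\tau }{2}}\) finite leads to the same \(O(\gamma )\) conclusion.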

8 Proof of Theorem 1.1

We prove Theorem 1.1 by applying Theorem 4.1.

Let \(\lambda _j=\mu ^2j^2\) and \(\phi _j(x)=\sqrt{\frac{2\mu }{\pi }}\sin \mu jx,\,(j\ge 1)\) be the eigenvalues and eigenfunctions of the operator \(-\frac{d^2}{dx^2}\) under the Dirichlet boundary conditions \(y(0)=0=y(\frac{\pi }{\mu }).\) We also denote \(\psi _j(x)=\sqrt{\frac{2\mu }{\pi }}\cos \mu jx\), so that \(\frac{d}{dx}\phi _j(x)=\mu j\psi _j(x).\)
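
As a quick check of these formulas, one can verify the eigenvalue equation, the boundary conditions and the normalization directly:

$$\begin{aligned} -\frac{d^2}{dx^2}\phi _j(x)&=\mu ^2j^2\sqrt{\frac{2\mu }{\pi }}\sin \mu jx=\lambda _j\phi _j(x),\\ \phi _j(0)&=0,\quad \phi _j\left( \frac{\pi }{\mu }\right) =\sqrt{\frac{2\mu }{\pi }}\sin j\pi =0,\\ \int ^{\frac{\pi }{\mu }}_0\phi _j^2(x)dx&=\frac{2\mu }{\pi }\int ^{\frac{\pi }{\mu }}_0\sin ^2 \mu jx\,dx=\frac{2\mu }{\pi }\cdot \frac{\pi }{2\mu }=1. \end{aligned}$$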

To write Eq. (1.1) as an infinite dimensional reversible system, we introduce for \(p>0\) the following two Banach spaces, consisting of odd functions

$$\begin{aligned} \mathcal {W}^{p}_{odd}=\{u=\sum \limits _{j\ge 1}q_j\phi _j: \Vert u\Vert _{p}=\sum \limits _{j\ge 1}\textrm{e}^{p j}|q_j|<\infty \}, \end{aligned}$$
(8.1)

and even functions

$$\begin{aligned} \mathcal {W}^{p}_{even}=\{u=\sum \limits _{j\ge 0}p_j\psi _j: \Vert u\Vert _{p}=\sum \limits _{j\ge 0}\textrm{e}^{p j}|p_j|<\infty \}. \end{aligned}$$
(8.2)

Through the inverse discrete Fourier transform \(\mathcal {S}:\ell ^{p}\rightarrow \mathcal {W}^{p}_{odd}\) (resp. \(\mathcal {W}^{p}_{even}\)), \(\mathcal {W}^{p}_{odd}\) (resp. \(\mathcal {W}^{p}_{even}\)) may be identified with the space \(\ell ^{p}.\)

Let \(u=\mathcal {S} q=\sum \limits _{j\ge 1}q_j\phi _j\in \mathcal {W}^{p}_{odd}.\) We write

$$\begin{aligned} f(\theta , x, u, u_x, u_t)=\sum _{i,j,h\ge 0}f_{ijh}(\theta ,x)u^iu^j_x u^h_t. \end{aligned}$$

By conditions (1.3)–(1.5), \(f_{ijh}(\theta ,x)\) satisfies

$$\begin{aligned} f_{ijh}(-\theta ,x)=(-1)^hf_{ijh}(\theta ,x) \end{aligned}$$

and

$$\begin{aligned} f_{ijh}(\theta ,-x)=(-1)^{i+h+1}f_{ijh}(\theta ,x). \end{aligned}$$
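
These two relations can be read off by comparing coefficients. As a sketch, substitute the expansion \(f(\theta ,x,y,z,w)=\sum _{i,j,h\ge 0}f_{ijh}(\theta ,x)y^iz^jw^h\) into the left-hand sides of (1.3) and (1.4):

$$\begin{aligned} f(-\theta ,x,y,z,-w)&=\sum _{i,j,h\ge 0}(-1)^{h}f_{ijh}(-\theta ,x)y^iz^jw^h,\\ f(\theta ,-x,-y,z,-w)&=\sum _{i,j,h\ge 0}(-1)^{i+h}f_{ijh}(\theta ,-x)y^iz^jw^h. \end{aligned}$$

Equating the first line with \(f(\theta ,x,y,z,w)\) and the second with \(-f(\theta ,x,y,z,w)\), and comparing the coefficients of \(y^iz^jw^h\), gives \((-1)^{h}f_{ijh}(-\theta ,x)=f_{ijh}(\theta ,x)\) and \((-1)^{i+h}f_{ijh}(\theta ,-x)=-f_{ijh}(\theta ,x)\), which are the stated identities.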

For every \(x\) and \(i, j, h\ge 0\), \(f_{ijh}(\cdot , x)\) is real analytic in the strip \(\{\theta : |\hbox {Im} \theta |<r\}\) for some \(r>0.\) For every \(\theta ,\) \(f_{ijh}(\theta ,\cdot )\in \mathcal {W}^{p}_{odd}\cup \mathcal {W}^{p}_{even}.\)

Then Eq. (1.1) is written as

$$\begin{aligned} \ddot{q}_j+\lambda _jq_j+\varepsilon g_j(\omega t, q, \dot{q})=0,\,j\ge 1 \end{aligned}$$
(8.3)

where \(g_j(\omega t, q, \dot{q})=\int ^{\frac{\pi }{\mu }}_0 f(\omega t, x, \mathcal {S}q, (\mathcal {S}q)_x, (\mathcal {S}q)_t)\phi _jdx\) and the reversibility condition (1.3) becomes \(g_j(\omega t, q, \dot{q})=g_j(-\omega t, q, -\dot{q}).\)

Let \(z_j=-\sqrt{\lambda _j} q_j+\textrm{i}\dot{q}_j, \bar{z}_j=-\sqrt{\lambda _j} q_j-\textrm{i}\dot{q}_j.\) Then Eq. (8.3) can be rewritten as

$$\begin{aligned} {\left\{ \begin{array}{ll} \dot{\theta }=\omega ,\quad \\ \dot{z}_j=\textrm{i}\sqrt{\lambda _j} z_j-\textrm{i}\varepsilon g_j(\theta , \ldots ,-\frac{z_i+\bar{z}_i}{2\sqrt{\lambda _i}},\ldots , \frac{z_i-\bar{z}_i}{2\textrm{i}},\ldots ),\quad \\ \dot{\bar{z}}_j=-\textrm{i}\sqrt{\lambda _j} \bar{z}_j+\textrm{i}\varepsilon g_j(\theta , \ldots ,-\frac{z_i+\bar{z}_i}{2\sqrt{\lambda _i}},\ldots , \frac{z_i-\bar{z}_i}{2\textrm{i}},\ldots ),\,j\ge 1 \end{array}\right. } \end{aligned}$$
(8.4)

which is reversible with respect to the involution \(S(\theta , z,\bar{z})=(-\theta , \bar{z},z)\).
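
The passage from (8.3) to (8.4) is a direct computation. As a sketch, differentiating \(z_j=-\sqrt{\lambda _j}q_j+\textrm{i}\dot{q}_j\) and inserting \(\ddot{q}_j=-\lambda _jq_j-\varepsilon g_j\) from (8.3) gives

$$\begin{aligned} \dot{z}_j&=-\sqrt{\lambda _j}\dot{q}_j+\textrm{i}\ddot{q}_j =-\sqrt{\lambda _j}\dot{q}_j-\textrm{i}\lambda _jq_j-\textrm{i}\varepsilon g_j\\&=\textrm{i}\sqrt{\lambda _j}\left( -\sqrt{\lambda _j}q_j+\textrm{i}\dot{q}_j\right) -\textrm{i}\varepsilon g_j =\textrm{i}\sqrt{\lambda _j}z_j-\textrm{i}\varepsilon g_j, \end{aligned}$$

and the arguments of \(g_j\) are recovered from the inverse relations

$$\begin{aligned} q_j=-\frac{z_j+\bar{z}_j}{2\sqrt{\lambda _j}},\quad \dot{q}_j=\frac{z_j-\bar{z}_j}{2\textrm{i}}. \end{aligned}$$

The equation for \(\bar{z}_j\) follows in the same way from \(\bar{z}_j=-\sqrt{\lambda _j}q_j-\textrm{i}\dot{q}_j.\)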

Let \(s>0.\) Then on \(D(r,s)\) the corresponding \(S\)-reversible vector field of system (8.4) is

$$\begin{aligned} X(\theta , z,\bar{z}; \mu )=N(\theta , z,\bar{z}; \mu )+P(\theta , z,\bar{z}; \mu ), \end{aligned}$$
(8.5)

where

$$\begin{aligned}{} & {} N=\omega \frac{\partial }{\partial \theta }+\textrm{i}\Omega (\mu ) z\frac{\partial }{\partial z}-\textrm{i}\Omega (\mu ) \bar{z}\frac{\partial }{\partial \bar{z}}, \\{} & {} P=\sum \limits _{j\ge 1}-\textrm{i}\varepsilon g_j\frac{\partial }{\partial z_j}+\sum \limits _{j\ge 1}\textrm{i}\varepsilon g_j\frac{\partial }{\partial \bar{z}_j}. \end{aligned}$$

Here \(\Omega _j(\mu )=\sqrt{\lambda _j}=\mu j\).

Now we give the verification of assumptions (A1)–(A4) for (8.5).

Verifying assumptions (A1) and (A2): We take \(\xi \equiv \mu \in [1,2]\) as the parameter, so that \(\Omega _j(\xi )=d(\xi ) j +\tilde{\Omega }_j(\xi )\) with \(d(\xi )=\xi \) and \(\tilde{\Omega }_j(\xi )=0.\) With \(A_1=1,\) (A1) is obviously satisfied. Moreover, \(\langle k, \omega \rangle +\langle l, \Omega (\xi )\rangle =\langle k, \omega \rangle +\xi \langle l\rangle \not \equiv 0\) on [1, 2], and

$$\begin{aligned} |\partial _\xi (\langle k, \omega \rangle +\langle l, \Omega (\xi )\rangle )|=|\langle l\rangle |\ge 1. \end{aligned}$$

Then there is a subset \(\mathcal {O}\subset [1,2]\) of positive Lebesgue measure such that (A2) holds.

Verifying assumptions (A3) and (A4):

We first verify (A3). Write

$$\begin{aligned} f(\theta , x, u, u_x, u_t)=b_{0}(\theta ,x)+O(|u|), \end{aligned}$$

where \(b_{0}(-\theta ,x)=b_{0}(\theta ,x),\) \(b_{0}(\theta ,-x)=-b_{0}(\theta ,x)\) and \(b_{0}(\theta ,x)=\sum _{i\ge 1}\hat{b}_{0i}(\theta )\phi _i(x)\in \mathcal {W}^{p}_{odd}.\) Suppose \(\sup \limits _{|\hbox {Im}\, \theta |<r}\Vert b_{0}(\theta ,\cdot )\Vert _p<s.\)

In the following, let \(C=C(r)>0\) be some appropriate large constant and take \(\varepsilon _0=C\varepsilon .\)

Note that \(P^{(\theta )}=0\) and \(P^{(z^{\pm }_i)}=\mp \textrm{i}\varepsilon g_i=\mp \textrm{i}\varepsilon \int ^{\frac{\pi }{\mu }}_0( b_{0}(\theta ,x)\phi _i+O(|u|))dx.\) Then

$$\begin{aligned} \begin{aligned}&\Vert P\Vert _{s;D(r,s)}\\&\quad =\frac{1}{s}\sup \limits _{\Vert z^{\pm }\Vert _p<s}\sum \limits _{i\ge 1}\textrm{e}^{ip}\left( \Vert P^{(z_i)}\Vert _{D(r)\times \mathcal {O}} +\Vert P^{(\bar{z}_i)}\Vert _{D(r)\times \mathcal {O}}\right) \\&\quad \le \frac{C\varepsilon }{s}(\Vert b_{0}(\theta ,\cdot )\Vert _p+\Vert z\Vert _p)\le C\varepsilon =\varepsilon _0. \end{aligned} \end{aligned}$$

We then verify (A4). Without loss of generality, we only verify the case of

$$\begin{aligned} f(\theta , x, u, u_x, u_t)=b_{0}(\theta ,x)+b_{1}(\theta ,x)u+b_{2}(\theta ,x)u_x+b_{3}(\theta ,x)u_t, \end{aligned}$$

and the higher-order terms can be treated similarly without any essential difficulty. Here

$$\begin{aligned}{} & {} b_{0}(-\theta ,x)=b_{0}(\theta ,x),\,\,b_{0}(\theta ,-x)=-b_{0}(\theta ,x),\\{} & {} b_{1}(-\theta ,x)=b_{1}(\theta ,x),\,\,b_{1}(\theta ,-x)=b_{1}(\theta ,x),\\{} & {} b_{2}(-\theta ,x)=b_{2}(\theta ,x),\,\,b_{2}(\theta ,-x)=-b_{2}(\theta ,x),\\{} & {} b_{3}(-\theta ,x)=-b_{3}(\theta ,x),\,\,b_{3}(\theta ,-x)=b_{3}(\theta ,x). \end{aligned}$$

One can expand \(b_{l}(\theta ,x)\) \((l=0, 1, 2, 3)\) as follows: for \(l=0,2,\)

$$\begin{aligned} b_{l}(\theta ,x)=\sum _{k\ge 1}\hat{b}_{lk}(\theta )\phi _k(x)\in \mathcal {W}^{p}_{odd}, \end{aligned}$$

and for \(l=1,3,\)

$$\begin{aligned} b_{l}(\theta ,x)=\sum _{k\ge 1}\hat{b}_{lk}(\theta )\psi _k(x)\in \mathcal {W}^{p}_{even}. \end{aligned}$$

Suppose

$$\begin{aligned} |\hat{b}_{lk}(\theta )|\le \textrm{e}^{-pk}\sup \limits _{|\hbox {Im}\, \theta |<r}\Vert b_l(\theta ,\cdot )\Vert _p\le C\textrm{e}^{-p k},\, l=1,2,3. \end{aligned}$$

Denote \(c=\frac{1}{4}(\sqrt{\frac{2\mu }{\pi }})^3.\) We have

$$\begin{aligned} \begin{aligned} g_i(\omega t, q, \dot{q})&=\int ^{\frac{\pi }{\mu }}_0 f(\omega t, x,\mathcal {S}q, (\mathcal {S}q)_x, (\mathcal {S}q)_t)\phi _idx\\&=\int ^{\frac{\pi }{\mu }}_0b_{0}(\theta ,x)\phi _idx \pm \sum _{k\pm j=\pm i}\frac{c}{\mu }q_j\hat{b}_{1k}(\theta ) \pm \sum _{k\pm j=\pm i}cjq_j\hat{b}_{2k}(\theta ) \pm \sum _{k\pm j=\pm i}\frac{c}{\mu }\dot{q}_j\hat{b}_{3k}(\theta ).\\ \end{aligned} \end{aligned}$$
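
The index relations \(k\pm j=\pm i\) in these sums come from a product-to-sum identity. As a sketch with unnormalized trigonometric functions (the normalizing constants are deliberately left out, and \(\delta _{a,b}\) denotes the Kronecker symbol, a notation introduced only for this illustration), for integers \(i,j,k\ge 1\) one has

$$\begin{aligned} \int ^{\frac{\pi }{\mu }}_0\cos (\mu kx)\sin (\mu jx)\sin (\mu ix)dx&=\frac{1}{4}\int ^{\frac{\pi }{\mu }}_0\big (\cos \mu (k+j-i)x+\cos \mu (k-j+i)x\\&\quad -\cos \mu (k+j+i)x-\cos \mu (k-j-i)x\big )dx\\&=\frac{\pi }{4\mu }\left( \delta _{k+j,i}+\delta _{k-j,-i}-\delta _{k-j,i}\right) , \end{aligned}$$

since \(\int ^{\frac{\pi }{\mu }}_0\cos (\mu mx)dx\) equals \(\frac{\pi }{\mu }\) for \(m=0\) and vanishes for every nonzero integer \(m\). The same mechanism selects the indices in the terms coming from \(b_2u_x\) and \(b_3u_t\).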

Then

$$\begin{aligned} \begin{aligned} P^{(z_{i})}&=-\textrm{i}\varepsilon g_i(\theta , \ldots ,-\frac{z_j+\bar{z}_j}{2\mu j},\ldots , \frac{z_j-\bar{z}_j}{2\textrm{i}},\ldots )\\&=-\textrm{i}\varepsilon \int ^{\frac{\pi }{\mu }}_0b_{0}(\theta ,x)\phi _idx\\&\quad \pm \sum _{k\pm j=\pm i}\frac{\textrm{i}\varepsilon c}{2\mu ^2j}\hat{b}_{1k}(\theta )(z_j+\bar{z}_j) \pm \sum _{k\pm j=\pm i}\frac{\textrm{i}\varepsilon c}{2\mu }\hat{b}_{2k}(\theta )(z_j+\bar{z}_j) \pm \sum _{k\pm j=\pm i}\frac{\varepsilon c}{2\mu }\hat{b}_{3k}(\theta )(z_j-\bar{z}_j). \end{aligned} \end{aligned}$$

Hence

$$\begin{aligned} \begin{aligned} \frac{\partial P^{(z_{i})}}{\partial z_{j}}&=\sum _{k-j=i}\left( -\frac{\textrm{i}\varepsilon c}{2\mu ^2j}\hat{b}_{1k}(\theta )+\frac{\textrm{i}\varepsilon c}{2\mu }\hat{b}_{2k}(\theta )+\frac{\varepsilon c}{2\mu }\hat{b}_{3k}(\theta )\right) \\&\quad +\sum _{k-j=-i}\left( \frac{\textrm{i}\varepsilon c}{2\mu ^2j}\hat{b}_{1k}(\theta )-\frac{\textrm{i}\varepsilon c}{2\mu }\hat{b}_{2k}(\theta )-\frac{\varepsilon c}{2\mu }\hat{b}_{3k}(\theta )\right) \\&\quad +\sum _{k+j=i}\left( \frac{\textrm{i}\varepsilon c}{2\mu ^2j}\hat{b}_{1k}(\theta )+\frac{\textrm{i}\varepsilon c}{2\mu }\hat{b}_{2k}(\theta )-\frac{\varepsilon c}{2\mu }\hat{b}_{3k}(\theta )\right) .\\ \end{aligned} \end{aligned}$$

In particular, replacing \((i,j)\) by \((i+t,j+t)\) and collecting the values of \(k\) selected by each sum, we get

$$\begin{aligned} \begin{aligned} \frac{\partial P^{(z_{i+t})}}{\partial z_{j+t}}&=-\frac{\textrm{i}\varepsilon c}{2\mu ^2(j+t)}\hat{b}_{1(i+j+2t)}(\theta )+\frac{\textrm{i}\varepsilon c}{2\mu }\hat{b}_{2(i+j+2t)}(\theta )+\frac{\varepsilon c}{2\mu }\hat{b}_{3(i+j+2t)}(\theta )\\&\quad +\frac{\textrm{i}\varepsilon c}{2\mu ^2(j+t)}\hat{b}_{1(j-i)}(\theta )-\frac{\textrm{i}\varepsilon c}{2\mu }\hat{b}_{2(j-i)}(\theta )-\frac{\varepsilon c}{2\mu }\hat{b}_{3(j-i)}(\theta )\\&\quad +\frac{\textrm{i}\varepsilon c}{2\mu ^2(j+t)}\hat{b}_{1(i-j)}(\theta )+\frac{\textrm{i}\varepsilon c}{2\mu }\hat{b}_{2(i-j)}(\theta )-\frac{\varepsilon c}{2\mu }\hat{b}_{3(i-j)}(\theta ).\\ \end{aligned} \end{aligned}$$

Taking \(\varepsilon _0=C\varepsilon \) and \(\rho =p,\) we get

$$\begin{aligned} \begin{aligned} \left\| \lim _{t\rightarrow \infty }\frac{\partial P^{(z_{i+t})}}{\partial z_{j+ t}}\right\| _{D(r,s)\times \mathcal {O}} \le C\varepsilon \textrm{e}^{-|i-j|\rho } =\varepsilon _0\textrm{e}^{-|i-j|\rho }. \end{aligned} \end{aligned}$$

Since \(\textrm{e}^{-\rho |i+j+2t|}=\textrm{e}^{-\rho (i+j)}\textrm{e}^{-2\rho t}\le \frac{1}{t}\textrm{e}^{-\rho |i-j|},\) we obtain

$$\begin{aligned} \begin{aligned}&\left\| \frac{\partial P^{(z_{i+t})}}{\partial z_{j+ t}}-\lim _{t\rightarrow \infty }\frac{\partial P^{(z_{i+t})}}{\partial z_{j+ t}}\right\| _{D(r,s)\times \mathcal {O}}\\&\quad =\Big \Vert -\frac{\textrm{i}\varepsilon c}{2\mu ^2(j+t)}\hat{b}_{1(i+j+2t)}(\theta )+\frac{\textrm{i}\varepsilon c}{2\mu }\hat{b}_{2(i+j+2t)}(\theta )+\frac{\varepsilon c}{2\mu }\hat{b}_{3(i+j+2t)}(\theta )\\&\qquad +\frac{\textrm{i}\varepsilon c}{2\mu ^2(j+t)}\hat{b}_{1(j-i)}(\theta )+\frac{\textrm{i}\varepsilon c}{2\mu ^2(j+t)}\hat{b}_{1(i-j)}(\theta )\Big \Vert _{D(r)\times \mathcal {O}}\\&\quad \le \frac{C\varepsilon }{t}\Vert \hat{b}_{1(i+j+2t)}(\theta )\Vert _{D(r)}+C\varepsilon \Vert \hat{b}_{2(i+j+2t)}(\theta )\Vert _{D(r)} +C\varepsilon \Vert \hat{b}_{3(i+j+2t)}(\theta )\Vert _{D(r)}\\&\qquad +\frac{C\varepsilon }{t}\Vert \hat{b}_{1(j-i)}(\theta )\Vert _{D(r)} +\frac{C\varepsilon }{t}\Vert \hat{b}_{1(i-j)}(\theta )\Vert _{D(r)}\\&\quad \le \frac{\varepsilon _0}{|t|}\textrm{e}^{-|i- j|\rho }. \end{aligned} \end{aligned}$$
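
The elementary inequality used before the last estimate holds up to a constant depending on \(\rho \). A short check, for \(t>0\) and \(i,j\ge 1\): the function \(t\mapsto t\textrm{e}^{-2\rho t}\) attains its maximum \(\frac{1}{2\textrm{e}\rho }\) at \(t=\frac{1}{2\rho }\), and \(i+j\ge |i-j|\), hence

$$\begin{aligned} \textrm{e}^{-\rho (i+j+2t)}=\textrm{e}^{-\rho (i+j)}\textrm{e}^{-2\rho t}\le \frac{1}{2\textrm{e}\rho }\cdot \frac{1}{t}\textrm{e}^{-\rho |i-j|}. \end{aligned}$$

The factor \(\frac{1}{2\textrm{e}\rho }\) can be absorbed into the constant in \(\varepsilon _0=C\varepsilon \), provided \(C\) is allowed to depend on \(\rho \) (an assumption of this sketch).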