1 The Yomdin–Gromov Lemma

For a \(C^r\)-smooth function \(f\) on a domain \(U\subset \mathbb {R}^m\) we denote by \(\Vert f\Vert \) the maximum norm on \(U\) and set

$$\begin{aligned} \Vert f\Vert _r := \max _{|{\varvec{\alpha }}|\leqslant r} \frac{\Vert D^{\varvec{\alpha }}f\Vert }{{\varvec{\alpha }}!}. \end{aligned}$$
(1)
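
As a quick illustration of the normalization in (1) (a worked example added for orientation, not taken from the references), consider the monomial \(f(x)=x^k\) on \(U=(0,1)\) with \(k\in \mathbb {N}\). Dividing by \(j!\) turns derivatives into Taylor coefficients:

$$\begin{aligned} \frac{f^{(j)}(x)}{j!} = \binom{k}{j} x^{k-j}, \qquad \Vert x^k\Vert _r = \max _{0\leqslant j\leqslant \min (r,k)} \binom{k}{j}. \end{aligned}$$

For instance \(\Vert x^4\Vert _2=\binom{4}{2}=6\). In general, a bound \(\Vert f\Vert _r\leqslant 1\) controls the Taylor coefficients of \(f\) of order at most r at every point of \(U\).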

In his work on Shub’s entropy conjecture, Yomdin [9, 10] proved a lemma on \(C^r\)-smooth parametrizations of semialgebraic sets. This was further refined by Gromov in [5], see also [2], with the following formulation now known as the Yomdin–Gromov algebraic lemma.

Theorem 1

(Yomdin–Gromov algebraic lemma). Let \(X\subset [0,1]^n\) be a semialgebraic set of dimension \(\mu \) defined by conditions \(p_j(\mathbf {x})=0\) or \(p_j(\mathbf {x})<0\), where \(p_j\) are polynomials and \(\sum \deg p_j=\beta \). Let \(r\in \mathbb {N}\). There exist a constant \(C=C(n,\mu ,r,\beta )\) and semialgebraic maps \(\phi _1,\ldots ,\phi _C:(0,1)^\mu \rightarrow X\) such that their images cover \(X\) and \(\Vert \phi _j\Vert _r\leqslant 1\) for \(j=1,\ldots ,C\).

Pila and Wilkie later realized that this theorem has remarkable applications in the seemingly unrelated area of Diophantine approximation. For the generality required by these applications, they stated and proved an analog of the algebraic lemma for general o-minimal structures [8, Theorem 2.3] (see [4] for general background on o-minimal geometry).

Theorem

(Pila–Wilkie’s version of Yomdin–Gromov). Let \(X=\{X_p\subset [0,1]^n\}\) be a definable family of sets in an o-minimal structure, with \(\dim X_p\leqslant \mu \). There exists a constant \(C=C(X,r)\) such that for any p there exist definable maps \(\phi _1,\ldots ,\phi _C:(0,1)^\mu \rightarrow X_p\) such that their images cover \(X_p\) and \(\Vert \phi _j\Vert _r\leqslant 1\) for \(j=1,\ldots ,C\).

In addition to Pila–Wilkie’s proof, Burguet [2] gave a proof in the semialgebraic setting around the same time. Both of these proofs roughly follow Gromov’s presentation, but the technical details are significantly more involved. This is due to an issue with potentially unbounded derivatives that was not explicitly treated in Gromov’s text, see the first paragraph of [8, Sect. 4]. In both Pila–Wilkie’s and Burguet’s papers, the problem is resolved by an additional approximation argument on \(C^r\)-smooth maps. We also remark that Kocel–Cynk, Pawłucki and Valette have given a proof based on a somewhat different approach in the general o-minimal setting [6].

In this paper, we give a formal treatment of Gromov’s original proof. In particular, we introduce a slightly stronger notion of cellular parametrizations in Definition 7, and prove the algebraic lemma with the additional requirement that the parametrizing maps are cellular. This, in combination with some elementary lemmas on differentiable functions in o-minimal structures (see Sect. 3.4), allows us to recover Gromov’s original inductive argument without any technical complications.

Remark 1

(On the asymptotic constants). The constants \(C(X,r)\) and \(C(n,\mu ,r,\beta )\) in these statements are purely existential, and one could ask about their dependence on r and on the complexity \(\beta \) in the semialgebraic case or, more generally, whenever this complexity can be defined (e.g. Pfaffian sets). A good understanding of these constants plays a crucial role in some potential applications of the algebraic lemma, both in dynamics and in Diophantine approximation. We refer the reader to [1, Sect. 1] for a discussion of these applications.

We briefly summarize the current state of the art. Gromov’s presentation [5] gives the polynomial dependence on \(\beta \) in the semialgebraic case (but no explicit dependence on r). Cluckers, Pila and Wilkie [3] give polynomial dependence on r for globally subanalytic (and slightly more general) sets, with no explicit dependence on \(\beta \). In [1] we give a result with polynomial dependence on both r and \(\beta \) in the semialgebraic case: this is the statement which is most useful in the potential applications. We also give polynomial dependence on r in the globally subanalytic case. In a work in progress of the first author and Jones, Schmidt and Thomas, a bound polynomial in \(\beta \) (but not in r) is established for sets definable using restricted-Pfaffian functions. This is based on a suitable adaptation of the approach presented in the present paper to the restricted Pfaffian structure.

1.1 Statement of the Main Result

We prove a refined version of the Yomdin–Gromov algebraic lemma for general o-minimal structures using the notion of cellular parametrizations introduced below. To simplify the terminology for readers not familiar with o-minimal structures, we will assume everywhere below that we are working with an o-minimal structure over the reals \(\mathbb {R}\). However, all the proofs carry over to the general case without change.

We denote \(I:=(0,1)\). For a vector \(\mathbf {x}_{1..\ell }\in \mathbb {R}^\ell \), we denote by \(\mathbf {x}_{1...i}\) the vector consisting of its first i coordinates.

Definition 2

Let \(X,Y\) be sets and \(F:X\rightarrow 2^Y\) be a map taking points of \(X\) to subsets of \(Y\). Then we denote

$$\begin{aligned} X\odot F := \{(x,y) : x \in X,\ y \in F(x)\}\subset X\times Y. \end{aligned}$$

Definition 3

A cell \(\mathcal {C}\) of length zero is the point \(\mathbb {R}^0\).

A cell \(\mathcal {C}\subset \mathbb {R}^{\ell +1}\) of length \(\ell +1\) is defined as \(\mathcal {C}=\mathcal {C}_{1\dots \ell }\odot \mathcal {F}\), where

  1. the base \(\mathcal {C}_{1\dots \ell }\subset \mathbb {R}^\ell \) is a cell of length \(\ell \),

  2. the map \(\mathcal {F}:\mathcal {C}_{1\dots \ell }\rightarrow 2^{\mathbb {R}}\) is defined as either \(\mathcal {F}(\mathbf {x}_{1\dots \ell })=\{a(\mathbf {x}_{1\dots \ell })\}\) or \(\mathcal {F}(\mathbf {x}_{1\dots \ell })=\left( a_1(\mathbf {x}_{1\dots \ell }),a_2(\mathbf {x}_{1\dots \ell })\right) \), and

  3. the map \(\mathcal {F}\) is continuous. Equivalently, the function \(a(\mathbf {x}_{1\dots \ell })\) is continuous on \(\mathcal {C}_{1\dots \ell }\) in the first case, and the functions \(a_1(\mathbf {x}_{1\dots \ell }), a_2(\mathbf {x}_{1\dots \ell })\) are continuous on \(\mathcal {C}_{1\dots \ell }\) with \(a_1(\mathbf {x}_{1\dots \ell })<a_2(\mathbf {x}_{1\dots \ell })\) for every \( \mathbf {x}_{1\dots \ell }\in \mathcal {C}_{1\dots \ell }\) in the second case.

The set \(\mathcal {F}(x)\) is called the fiber of \(\mathcal {C}\), i.e. of the natural projection \(\mathcal {C}\rightarrow \mathcal {C}_{1\dots \ell }\).
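
For orientation, here is a small example of Definitions 2 and 3 (an illustration added here; the subscripts on \(\mathcal {C}_1\) and \(x_1,x_2\) are our notation). The open triangle is a cell of length 2, built in two steps starting from the point \(\mathbb {R}^0\):

$$\begin{aligned} \mathcal {C}_1 = \mathbb {R}^0\odot (0,1) = (0,1), \qquad \mathcal {C} = \mathcal {C}_1\odot \mathcal {F} = \{(x_1,x_2) : 0<x_1<1,\ 0<x_2<x_1\}, \qquad \mathcal {F}(x_1)=(0,x_1). \end{aligned}$$

Here \(a_1\equiv 0\) and \(a_2(x_1)=x_1\) are continuous with \(a_1<a_2\) on \(\mathcal {C}_1\). Taking \(\mathcal {F}(x_1)=\{x_1\}\) instead gives the diagonal segment, which is a cell of length 2 and dimension 1.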

Definition 4

A cell of length zero is a basic cell. A basic cell \(\mathcal {C}\subset \mathbb {R}^\ell \) of length \(\ell \) is a cell with a basic cell of length \(\ell -1\) as a base and either the interval I or the singleton \(\{0\}\) as the (constant over the base) fiber.

Remark 5

The classical definition allows infinite intervals as fibers, i.e. the functions \(a_i(\mathbf {x}_{1\dots \ell })\) are allowed to take infinite values. We consider only bounded sets, so we do not need this generality (and avoid it to simplify the notation).

Definition 6

A continuous map \(f=(f_1,\dots ,f_{\ell }):\mathcal {C}\rightarrow \mathbb {R}^\ell \), where \(\mathcal {C}\) is a cell of length \(\ell \), is called cellular if for every \(i=1,\dots ,\ell \)

  • \(f_i(\mathbf {x}_{1...\ell })=f_i(\mathbf {x}_{1...i})\), i.e. \(f_i\) depends only on the first i coordinates of \(\mathbf {x}\), and

  • \(f_i(\mathbf {x}_{1...i-1},\cdot )\) is strictly increasing for every \(\mathbf {x}_{1...i-1}\in \mathcal {C}_{1...i-1}\) (where the cell \(\mathcal {C}_{1...i-1}\) is the coordinate projection of \(\mathcal {C}\) to \(\mathbb {R}^{i-1}=\{x_i=\cdots =x_\ell =0\}\subset \mathbb {R}^\ell \)).

Note in particular that cellular maps preserve dimension and the composition of cellular maps is cellular.

Definition 7

A cellular r-parametrization of a definable set \(X\subset \mathbb {R}^\ell \) is a collection \(\Phi =\{\phi _\alpha :\mathcal {C}_\alpha \rightarrow X\}\) of definable cellular \(C^r\)-smooth maps \(\phi _\alpha \) defined on basic cells \(\mathcal {C}_\alpha \) with \(\Vert \phi _\alpha \Vert _r\leqslant 1\) such that \(X=\cup _\alpha \phi _\alpha (\mathcal {C}_\alpha )\).

A cellular r-parametrization of a definable map \(F:X\rightarrow Y\) is a cellular r-parametrization \(\Phi \) of X satisfying \(\Vert \phi _\alpha ^*F\Vert _r\leqslant 1\) for every \(\phi _\alpha \in \Phi \).
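
A minimal example of Definitions 6 and 7 (an illustration added here, with the triangle chosen for convenience): let \(X=\{(y_1,y_2) : 0<y_2<y_1<1\}\) and consider on the basic cell \(I^2\) the map

$$\begin{aligned} \phi (x_1,x_2) = (x_1,\ x_1x_2), \qquad \frac{\partial \phi _1}{\partial x_1}=1, \quad \frac{\partial \phi _2}{\partial x_1}=x_2, \quad \frac{\partial \phi _2}{\partial x_2}=x_1, \quad \frac{\partial ^2\phi _2}{\partial x_1\partial x_2}=1. \end{aligned}$$

The first coordinate depends only on \(x_1\) and is strictly increasing, and the second is strictly increasing in \(x_2\) for each fixed \(x_1\in I\), so \(\phi \) is cellular and \(\phi (I^2)=X\). The derivatives listed are the only nonzero ones of positive order, each has \({\varvec{\alpha }}!=1\) and is bounded by 1 on \(I^2\), so \(\Vert \phi \Vert _r\leqslant 1\) for every r and \(\{\phi \}\) is a cellular r-parametrization of \(X\). By contrast, the swap \((x_1,x_2)\mapsto (x_2,x_1)\) is not cellular, since its first coordinate depends on \(x_2\).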

Remark 8

Let \(X\subset \mathbb {R}^\ell \), \(Y\subset \mathbb {R}^q\) and let \(F:X\rightarrow Y\) be a definable map. Then \(\{\phi _\alpha :\mathcal {C}_\alpha \rightarrow X\}\) is a cellular r-parametrization of \(F\) if and only if \(\{(\phi _\alpha ,F\circ \phi _\alpha ):\mathcal {C}_\alpha \times \{0\}^q\rightarrow {{\,\mathrm{gr}\,}}F\}\) is a cellular r-parametrization of the graph \({{\,\mathrm{gr}\,}}F\).

We will prove the Yomdin–Gromov lemma in the following form.

Theorem 2

Let \(\ell ,r\in \mathbb {N}\). Then

\({\mathrm {S}}_\ell \):

Every definable set \(X\subset I^\ell \) admits a cellular r-parametrization.

\({\mathrm {F}}_\ell \):

Every definable function \(F:X\rightarrow Y\) with \(X\subset I^\ell \) and \(Y\subset I^q\) (for any q) admits a cellular r-parametrization.

We remark that the cellular formulation of the Yomdin–Gromov lemma makes it automatically uniform over parameters: a cellular parametrization of a family with the parameters placed as the initial variables gives a cellular parametrization of each fiber by restriction. This uniformity is essential in the applications.
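
To spell out this restriction argument (a sketch in our notation: k parameters \(\mathbf {p}\in I^k\), fibers \(X_{\mathbf {p}}\subset I^n\), and total space \(X=\{(\mathbf {p},\mathbf {x}) : \mathbf {x}\in X_{\mathbf {p}}\}\subset I^{k+n}\)), let \(\{\phi _\alpha :\mathcal {C}_\alpha \rightarrow X\}\) be a cellular r-parametrization of \(X\) and split each map according to the first k and the last n coordinates:

$$\begin{aligned} \phi _\alpha (\mathbf {s},\mathbf {t}) = \big (\phi '_\alpha (\mathbf {s}),\ \phi ''_\alpha (\mathbf {s},\mathbf {t})\big ), \qquad X_{\mathbf {p}} = \bigcup _{\alpha \,:\, \phi '_\alpha (\mathbf {s}_\alpha )=\mathbf {p}} \phi ''_\alpha (\mathbf {s}_\alpha ,\cdot )\big (\text {fiber of } \mathcal {C}_\alpha \text { over its first } k \text { coordinates}\big ). \end{aligned}$$

Here \(\phi '_\alpha \) depends only on \(\mathbf {s}\) by cellularity and is injective, being strictly increasing in each successive coordinate, so the point \(\mathbf {s}_\alpha =\mathbf {s}_\alpha (\mathbf {p})\) is unique when it exists. Each map \(\mathbf {t}\mapsto \phi ''_\alpha (\mathbf {s}_\alpha ,\mathbf {t})\) is cellular on a basic cell of length n, its derivatives in \(\mathbf {t}\) are among those of \(\phi _\alpha \) and are therefore bounded as in Definition 7, and the number of maps is at most the size of the collection, independently of \(\mathbf {p}\).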

Remark 9

This exposition first appeared as part of the course “Tame geometry and applications” given by the authors at the Weizmann Institute of Science in Fall 2018.

2 Why \(C^r\)-Smooth?

Before going into the proof of the Yomdin–Gromov lemma we will address a natural question. Semialgebraic sets are analytic objects. Why would one, starting with such tame objects, venture into the far less rigid smooth category? It would certainly seem natural to expect a far more rigid parametrization, say by holomorphic maps with respect to some suitable norm. It turns out that there are deep obstructions hiding in the background.

Ideally, one would like to replace the finite smoothness order \(r\in \mathbb {N}\) by a bound on all derivatives,

$$\begin{aligned} \Vert f\Vert _\infty := \sup _{{\varvec{\alpha }}} \frac{\Vert D^{\varvec{\alpha }}f\Vert }{{\varvec{\alpha }}!}. \end{aligned}$$
(2)

If \(U\subset \mathbb {R}^m\) and \(f:U\rightarrow \mathbb {R}^n\) has \(\Vert f\Vert _\infty <\infty \) then \(f\) continues holomorphically to a 1-neighborhood \(N_1(U)\subset \mathbb {C}^m\) of \(U\). Moreover, in \(N_{1/2}(U)\), we have

$$\begin{aligned} \max _{N_{1/2}(U)} |f| \leqslant {\text {const}}\Vert f\Vert _\infty . \end{aligned}$$
(3)
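
The constant in (3) can be made explicit under a mild assumption on notation: suppose that \(N_\rho (U)\) is read as the union of the polydiscs of polyradius \(\rho \) centered at points of \(U\) (if Euclidean balls are meant, the neighborhood is smaller and the same bound applies). For \(\mathbf {x}_0\in U\) and \(|z_i-x_{0,i}|\leqslant 1/2\), the Taylor series at \(\mathbf {x}_0\) gives

$$\begin{aligned} |f(\mathbf {z})| \leqslant \sum _{{\varvec{\alpha }}\in \mathbb {N}^m} \frac{|D^{\varvec{\alpha }}f(\mathbf {x}_0)|}{{\varvec{\alpha }}!}\, |(\mathbf {z}-\mathbf {x}_0)^{\varvec{\alpha }}| \leqslant \Vert f\Vert _\infty \sum _{{\varvec{\alpha }}\in \mathbb {N}^m} 2^{-|{\varvec{\alpha }}|} = 2^m\Vert f\Vert _\infty . \end{aligned}$$

The same series converges absolutely whenever \(|z_i-x_{0,i}|<1\) for all i, which is the source of the continuation to \(N_1(U)\).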

So, instead of this \(\infty \)-norm we might as well use the norm given by the maximum of the analytic continuation of f to a neighborhood of some fixed radius. Below we will write

$$\begin{aligned} \Vert f\Vert _\omega := \max _{N_1(U)} |f|. \end{aligned}$$
(4)

One would ideally like to prove the Yomdin–Gromov lemma with the maps \(\phi _i\) extendable to a 1-neighborhood of \((0,1)^k\) and with this stronger norm. Unfortunately this is impossible already for the simple family of semialgebraic sets (originally considered in this context by Yomdin in [11]),

$$\begin{aligned} X_\varepsilon = [(-1,1)\times (-1,1)] \cap \{xy=\varepsilon \}. \end{aligned}$$
(5)

We will show that an \(\omega \)-parametrization of \(X_\varepsilon \) will require at least \(\log |\log \varepsilon |\) maps so cannot be uniform over the family \(\left\{ X_\varepsilon \right\} \). To explain this we take a brief detour to the geometry of hyperbolic Riemann surfaces.

2.1 Hyperbolic Geometry

Recall that the upper half-plane \(\mathbb {H}\) admits a unique hyperbolic metric of constant curvature \(-4\) given by \(|\,\mathrm dz|/(2y)\). A Riemann surface \(U\) is called hyperbolic if its universal cover is the upper half-plane \(\mathbb {H}\). In this case, \(U\) inherits from \(\mathbb {H}\) a unique metric of constant curvature \(-4\) which we denote by \({\text {dist}}(\cdot ,\cdot ;U)\) (we sometimes omit \(U\) from this notation if it is clear from the context). By the uniformization theorem, a domain \(U\subset \mathbb {C}\) is hyperbolic if and only if its complement contains at least two points.

The following is a straightforward consequence of the classical Schwarz lemma obtained by lifting the map to universal covers.

Lemma 10

(Schwarz–Pick [7, Theorem 2.11]). If \(f:S\rightarrow S'\) is a holomorphic map between hyperbolic domains \(S,S'\) then

$$\begin{aligned} {\text {dist}}(f(p),f(q);S') \leqslant {\text {dist}}(p,q;S) \qquad \forall p,q\in S. \end{aligned}$$
(6)

2.2 The Obstruction

Suppose \(f:(0,1)\rightarrow X_\varepsilon \) is a map with \(\Vert f\Vert _\omega \leqslant 2\). Then f extends analytically to the 1-neighborhood of (0, 1) in \(\mathbb {C}\) and is bounded by 2 there in absolute value. By analytic continuation, f continues to satisfy \(xy=\varepsilon \) in \(N_1(0,1)\), so

$$\begin{aligned} f: N_1(0,1) \rightarrow \{xy=\varepsilon \} \cap \{|x|,|y|<2\}. \end{aligned}$$
(7)

Consider the projection \(\pi (x,y)=x\). Then the composition gives a map

$$\begin{aligned} \pi \circ f : N_1(0,1) \rightarrow \{ \varepsilon /2<|x|<2 \}=A(\varepsilon /2,2). \end{aligned}$$
(8)

The domain and the range are hyperbolic domains. So by the Schwarz–Pick lemma (Lemma 10), we have

$$\begin{aligned} {\text {diam}}([\pi \circ f](0,1);A(\varepsilon /2,2)) \leqslant {\text {diam}}((0,1);N_1(0,1)) = {\text {const}}. \end{aligned}$$
(9)

We see that the set of values of \(x\) covered by \(f\) has bounded hyperbolic diameter in \(A(\varepsilon /2,2)\). We would eventually like to cover every \(x\in (\varepsilon ,1)\subset \pi (X_\varepsilon )\). A simple computation gives

$$\begin{aligned} {\text {diam}}((\varepsilon ,1);A(\varepsilon /2,2)) \sim \log |\log \varepsilon | \end{aligned}$$
(10)

so indeed at least \(\log |\log \varepsilon |\) maps will be needed to cover \(X_\varepsilon \).
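
The computation behind (10) can be sketched as follows (our reconstruction, using the standard formula for the hyperbolic metric of a strip, normalized to curvature \(-4\) as in Sect. 2.1; the symbols \(W,u,\theta ,\theta _0\) are ours). In the chart \(t=\log x\) the annulus \(A(\varepsilon /2,2)\) is covered by the strip \(\{\log (\varepsilon /2)<{\text {Re}}\,t<\log 2\}\) of width \(W=\log (4/\varepsilon )\), whose metric is

$$\begin{aligned} \frac{\pi }{2W}\cdot \frac{|\mathrm dt|}{\sin (\pi u/W)}, \qquad u={\text {Re}}\,t-\log (\varepsilon /2). \end{aligned}$$

Since the density depends only on \({\text {Re}}\,t\), the real segment is a shortest path between its endpoints. The interval \((\varepsilon ,1)\) corresponds to \(\log 2<u<W-\log 2\), so with \(\theta =\pi u/W\) and \(\theta _0=\pi \log 2/W\),

$$\begin{aligned} {\text {diam}}((\varepsilon ,1);A(\varepsilon /2,2)) = \frac{1}{2}\int _{\theta _0}^{\pi -\theta _0}\frac{\mathrm d\theta }{\sin \theta } = \log \cot \frac{\theta _0}{2} = \log W+O(1) = \log |\log \varepsilon |+O(1). \end{aligned}$$

Each map covers a piece of hyperbolic diameter at most the constant in (9), so the number of maps is bounded below by a constant multiple of \(\log |\log \varepsilon |\).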

Remark 11

One can show that the bound above is asymptotically sharp, i.e. \(X_\varepsilon \) can indeed be covered by \(O(\log |\log \varepsilon |)\) maps of unit \(\omega \)-norm. Indeed, it suffices to find a collection of such maps from (0, 1) to \(X_\varepsilon \) which extend analytically to the complex disc D(2), with both coordinates bounded by 2 on this disc. Equivalently, by considering only the x-coordinate, we may look for a collection of maps from (0, 1) into \((\varepsilon ,1)\) which extend to maps \(D(2)\rightarrow A(\varepsilon /2,2)\). Passing to the logarithmic chart, we seek maps from D(2) to the strip

$$\begin{aligned} S_{\varepsilon /2} = \{ \log \varepsilon -1< {\text {Re}}t < 1 \} \end{aligned}$$
(11)

such that the images of (0, 1) cover \((\log \varepsilon ,0)\). This is easily achieved using affine maps, where the radius of the image is taken to be proportional to the distance from the boundary of \(S_{\varepsilon /2}\), and we leave it to the reader to verify that in this manner one does obtain a covering using \(O(\log |\log \varepsilon |)\) maps.

3 Proof of Theorem 2

We start with a trivial transitivity remark. Assume that \(\Phi =\{\phi _\alpha :\mathcal {C}_\alpha \rightarrow X\}\) is a cellular r-parametrization of X and \(\Phi _\alpha =\{\phi _{\alpha ,\beta }:\mathcal {C}_{\alpha ,\beta }\rightarrow \mathcal {C}_\alpha \}\) is a cellular r-parametrization of \(\mathcal {C}_\alpha \). Then the collection \(\{\phi _\alpha \circ \phi _{\alpha ,\beta }\}\) is “almost” a cellular r-parametrization of X: by the chain rule \(\Vert \phi _\alpha \circ \phi _{\alpha ,\beta }\Vert _r=O_{\ell ,r}(1)\) and a linear subdivision reduces the norms to 1. We will use this reduction freely.
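
The linear subdivision can be made explicit in the model case \(\mathcal {C}=I^\ell \) (a sketch under this simplifying assumption; \(g\), \(K\), \(N\), \(\mathbf {k}\) and \(\lambda _{\mathbf {k}}\) are our notation). Suppose \(g:I^\ell \rightarrow I^q\) is \(C^r\)-smooth with \(\Vert g\Vert _r\leqslant K\), and for an integer \(N\geqslant K\) and \(\mathbf {k}\in \{0,\ldots ,N-1\}^\ell \) set

$$\begin{aligned} \lambda _{\mathbf {k}}(\mathbf {x}) = \frac{\mathbf {x}+\mathbf {k}}{N}, \qquad D^{\varvec{\alpha }}(g\circ \lambda _{\mathbf {k}}) = N^{-|{\varvec{\alpha }}|}\,(D^{\varvec{\alpha }}g)\circ \lambda _{\mathbf {k}}. \end{aligned}$$

Then \(|D^{\varvec{\alpha }}(g\circ \lambda _{\mathbf {k}})|/{\varvec{\alpha }}!\leqslant K/N\leqslant 1\) for \(1\leqslant |{\varvec{\alpha }}|\leqslant r\), while \(|g\circ \lambda _{\mathbf {k}}|\leqslant 1\) because \(g\) takes values in \(I^q\). Each \(\lambda _{\mathbf {k}}\) is cellular, and the \(N^\ell \) cubes \(\lambda _{\mathbf {k}}(I^\ell )\) cover \(I^\ell \) up to the hyperplanes \(x_i=k_i/N\), which are covered by the same formula on basic cells having \(\{0\}\) as the i-th fiber.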

A similar remark holds for \(\Phi =\{\phi _\alpha :\mathcal {C}_\alpha \rightarrow X\}\) a cellular r-parametrization of \(F:X\rightarrow Y\) and \(\Phi _\alpha =\{\phi _{\alpha ,\beta }:\mathcal {C}_{\alpha ,\beta }\rightarrow \mathcal {C}_\alpha \}\) a cellular r-parametrization of \(\phi _\alpha ^*F:\mathcal {C}_\alpha \rightarrow Y\).

We record the following simple lemma.

Lemma 12

Let \(n\in \mathbb {N}\) and assume that every definable map \(F:X\rightarrow I\) with \(\dim X=n\) admits a cellular r-parametrization. Then the same is true for every definable map \(F:X\rightarrow Y\) with \(Y\subset I^m\).

Proof

Write \(F=(F_1,\ldots ,F_m)\) and let \(\Phi \) be a cellular r-parametrization of \(F_1\). It will be enough to find a cellular r-parametrization for \(F\circ \phi _1\) for each \(\phi _1\in \Phi \). In other words, we may reduce to the case \(\Vert F_1\Vert _r=O_r(1)\). We now do the same for \(F_2\), noting that after the composition we still have \(\Vert F_1\circ \phi _2\Vert _r=O_r(1)\) by the chain rule, and now also \(\Vert F_2\Vert _r=O_r(1)\). Repeating this for each coordinate, we finally get \(\Vert F_i\Vert _r=O_{m,r}(1)\) for every \(F_i\) and an additional linear subdivision finishes the proof. \(\square \)

The proof of the Yomdin–Gromov lemma is by induction on \(\ell \). Statement \({\mathrm {S}}_1\) is trivial. We establish \({\mathrm {F}}_1\) as a base case, and then show \({\mathrm {S}}_{\leqslant \ell }+{\mathrm {F}}_{\leqslant \ell }\implies {\mathrm {S}}_{\ell +1}\) and \({\mathrm {F}}_{<\ell }+{\mathrm {S}}_{\leqslant \ell }\implies {\mathrm {F}}_\ell \).

3.1 Proof of \({\mathrm {F}}_1\)

We will start with a simple lemma due to Gromov about dampening derivatives of univariate functions.

Lemma 13

Let \(r\geqslant 2\). Suppose that \(f:I\rightarrow I\) is a definable function with \(\Vert f\Vert _{r-1}\leqslant 1\). Then f has a cellular r-parametrization.

Proof

By o-minimality we may divide I into finitely many subintervals where \(f^{(r)}\) is monotone and has constant sign. Thus, we assume without loss of generality that \(f^{(r)}\) is positive and monotone decreasing on I. For any \(x\in I\)

$$\begin{aligned} \frac{2(r-1)!}{x} \geqslant \frac{f^{(r-1)}(x)-f^{(r-1)}(0)}{x} = f^{(r)}(c_x) \geqslant f^{(r)}(x), \end{aligned}$$
(12)

where \(c_x\in (0,x)\) is chosen by the mean-value theorem. Let \(\tilde{f}(x)=f(x^2)\). When computing \(\tilde{f}^{(r)}\) we get terms involving only derivatives of \(f\) of order at most \(r-1\), which are bounded by \(O_r(1)\), plus a term \(O_r(x^r f^{(r)}(x^2))\), which is bounded by \(O_r(x^{r-2})\) thanks to (12). Since \(r\geqslant 2\) we get \(\Vert \tilde{f}\Vert _r=O_r(1)\) and a linear subdivision of I finishes the proof. \(\square \)
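
The derivative count in this proof can be written out explicitly (a worked version added here, under the normalization made at the start of the proof). By the Faà di Bruno formula applied to the inner function \(x^2\), whose derivatives of order three and higher vanish,

$$\begin{aligned} \tilde{f}^{(r)}(x) = \sum _{k=\lceil r/2\rceil }^{r} \frac{r!}{(2k-r)!\,(r-k)!}\,(2x)^{2k-r} f^{(k)}(x^2). \end{aligned}$$

For \(k\leqslant r-1\) the terms are bounded by constants depending only on r because \(\Vert f\Vert _{r-1}\leqslant 1\). For \(k=r\), the estimate (12) at the point \(x^2\) gives

$$\begin{aligned} \left| (2x)^r f^{(r)}(x^2)\right| \leqslant 2^r x^r\cdot \frac{2(r-1)!}{x^2} = 2^{r+1}(r-1)!\,x^{r-2}, \end{aligned}$$

which is bounded on I precisely because \(r\geqslant 2\). As a check, for \(r=2\) the formula reads \(\tilde{f}''(x)=2f'(x^2)+4x^2f''(x^2)\).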

We use this to obtain the following.

Lemma 14

Let \(X\subset I^2\) be a definable set of dimension 1. For every \(r\in \mathbb {N}\) there exists a collection of maps \(\{\phi _\alpha :I\rightarrow X\}\) such that: i) \(\cup _\alpha \phi _\alpha (I)=X{\setminus }\Sigma \) for some finite set \(\Sigma \); ii) \(\Vert \phi _\alpha \Vert _r\leqslant 1\) for every \(\phi _\alpha \); iii) every coordinate of every \(\phi _\alpha \) is monotone.

Proof

By cell decomposition we decompose X into finitely many points, intervals \(\{x_0\}\times (a,b)\) and graphs of definable functions \(f:(a,b)\rightarrow I\). We denote by \(\Sigma \) the set of points, and easily parametrize the vertical intervals as required. It remains to parametrize the graphs, and we treat each of them separately.

By o-minimality we may assume that f is either constant (the parametrization is then trivial) or monotone, continuously differentiable, and one of

$$\begin{aligned} f' \leqslant -1, \qquad -1\leqslant f'<0, \qquad 0<f'\leqslant 1, \qquad 1\leqslant f' \end{aligned}$$
(13)

holds uniformly. Changing the orientation of \((a,b)\) and exchanging the roles of x and y if needed we may assume \(0<f'\leqslant 1\) in \((a,b)\). We are now in a position to apply Lemma 13 repeatedly \(r-1\) times to obtain an r-parametrization \(\{\tilde{\phi }_\alpha :I\rightarrow I\}\) of f. Setting \(\phi _\alpha =(\tilde{\phi }_\alpha ,f\circ \tilde{\phi }_\alpha )\) then gives the required parametrization of the graph of f (but note the x and y coordinates may have been exchanged). Condition (iii) follows from the monotonicity of \(\tilde{\phi }_\alpha \) and of f. \(\square \)

We are now ready to deduce \({\mathrm {F}}_1\). For the case \(q=1\), apply Lemma 14 to the graph of F and let \(\{\phi _\alpha =(\phi ^x_\alpha ,\phi ^y_\alpha )\}\) denote the resulting collection. Then \(\Phi =\{\phi _\alpha ^x\}\) (plus the finitely many x-coordinates of the points of \(\Sigma \), covered by zero-dimensional basic cells) is a cellular r-parametrization of F. Indeed, it is cellular by condition (iii), it covers the domain of F since \(\phi _\alpha \) covers the graph by condition (i), and

$$\begin{aligned} \Vert F\circ \phi _\alpha ^x\Vert _r = \Vert \phi _\alpha ^y\Vert _r \leqslant 1 \end{aligned}$$
(14)

by condition (ii).

The case of general q now follows by Lemma 12. Note that we could not have obtained this directly from Lemma 13 because the assumption \(r\geqslant 2\) is crucial there, and the reduction in Lemma 14 involves changing the order of the variables and is not cellular.

3.2 The Step \({\mathrm {S}}_{\leqslant \ell }+{\mathrm {F}}_{\leqslant \ell }\implies {\mathrm {S}}_{\ell +1}\)

By cell decomposition it is enough to prove the claim for every cell \(C\subset I^{\ell +1}\). We assume that \(C=C_{1..\ell }\odot (a,b)\), with the fiber \((a,b)\subset I\) (the case \(C=C_{1..\ell }\odot \{a\}\) is similar but easier). By \({\mathrm {F}}_{\leqslant \ell }\) we may assume that the map \((a,b):C_{1\dots \ell }\rightarrow I^2\) already admits a cellular r-parametrization by maps \(f_\alpha :\mathcal {C}_\alpha \rightarrow C_{1..\ell }\). Let \(f:\mathcal {C}\rightarrow C_{1..\ell }\) be one of these maps. Then \(\Vert f^*a\Vert _r,\Vert f^*b\Vert _r\leqslant 1\) and setting

$$\begin{aligned} \mathcal {C}':=\mathcal {C}\times I, \qquad f'(\mathbf {x}_{1..\ell +1})=(f,\mathbf {x}_{\ell +1} f^*b+(1-\mathbf {x}_{\ell +1}) f^*a) \end{aligned}$$
(15)

we have \(\Vert f'\Vert _r\leqslant O_\ell (1)\). Taking a linear subdivision of \(\mathcal {C}'\) finishes the proof.
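
Two checks behind this step can be written out (a sketch under the conventions of (1) and (15); the abbreviations \(A=f^*a\), \(B=f^*b\), \(t=\mathbf {x}_{\ell +1}\) and \(g\) are ours). Let \(g=tB+(1-t)A\) be the last coordinate of \(f'\). Since \(A\) and \(B\) do not depend on t, for every \({\varvec{\beta }}\in \mathbb {N}^\ell \) acting on the first \(\ell \) variables we have

$$\begin{aligned} \frac{\partial g}{\partial t} = B-A>0, \qquad D^{{\varvec{\beta }}}g = t\,D^{{\varvec{\beta }}}B+(1-t)\,D^{{\varvec{\beta }}}A, \qquad \frac{\partial }{\partial t}D^{{\varvec{\beta }}}g = D^{{\varvec{\beta }}}B-D^{{\varvec{\beta }}}A, \end{aligned}$$

and all derivatives of order two or higher in t vanish. The first relation, which uses \(a<b\) on \(C_{1..\ell }\), shows that \(f'\) is cellular and maps the fiber I onto \((A,B)\). The second is a convex combination and is bounded by \({\varvec{\beta }}!\), and the third is bounded by \(2\,{\varvec{\beta }}!\), so in this computation \(\Vert f'\Vert _r\leqslant 2\) and halving the t-interval already suffices.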

3.3 The Step \({\mathrm {F}}_{<\ell }+{\mathrm {S}}_{\leqslant \ell }\implies {\mathrm {F}}_\ell \)

3.3.1 A Family Version of \({\mathrm {F}}_\ell \)

We will need a “family version” of \({\mathrm {F}}_\ell \) as follows.

\({\mathrm {F}}_\ell \) for families. Let \(\{F_\lambda :X\rightarrow Y\}_{\lambda \in I}\) be a definable family. Then there exist (i) a disjoint partition \(I=\cup I_j\) into finitely many points and intervals; (ii) for every \(I_j\) a collection of basic cells \(\mathcal {C}_\alpha \) and cellular maps \(\{f_{\alpha ,\lambda }:\mathcal {C}_\alpha \rightarrow X\}_{\lambda \in I_j}\) such that (1) \(\Vert f_{\alpha ,\lambda }\Vert _r\leqslant 1\) and \(\Vert f_{\alpha ,\lambda }^*F_\lambda \Vert _r\leqslant 1\) for every fixed \(\lambda \in I_j\); and (2) for every \(\lambda \in I_j\) we have \(X=\cup _\alpha f_{\alpha ,\lambda }(\mathcal {C}_\alpha )\).

It is not difficult to obtain such a family version by adding a parameter to all of the statements in Sect. 3. However, to simplify the presentation we take a shortcut introduced in the Pila–Wilkie paper: we show in Sect. 3.5 that the family version of \({\mathrm {F}}_{\ell }\) follows from the regular version by general o-minimality considerations.

3.3.2 Reduction to \(\Vert F(\mathbf {x}_1,\cdot )\Vert _r\leqslant 1\) for Every \(\mathbf {x}_1\in I\)

By the family version of \({\mathrm {F}}_{\ell -1}\) we may, thinking of \(\mathbf {x}_1\) as a parameter, find a cellular r-parametrization \(\Phi =\{\phi ^{\mathbf {x}_1}_\beta \}\) of F with respect to the \(\mathbf {x}_{2..\ell }\) variables (we consider each interval \(I_j\) separately and rescale back to I). Fix one \(\phi ^{\mathbf {x}_1}=\phi ^{\mathbf {x}_1}_\beta \) and set \(\hat{F}=\left( {\text {id}},\phi ^{\mathbf {x}_1}, F\circ ({\text {id}},\phi ^{\mathbf {x}_1})\right) \). Then \(\Vert \hat{F}(\mathbf {x}_1,\cdot )\Vert _r\leqslant O_{\ell ,r}(1)\) for every fixed \(\mathbf {x}_1\in I\). By o-minimality \(\hat{F}\) is \(C^r\)-smooth outside a positive-codimension set \(V\subset I^\ell \).

We first use \({\mathrm {S}}_\ell \) to find a cellular r-parametrization \(\{f_{V,\alpha }:\mathcal {C}_{V,\alpha }\rightarrow V\}\). Each \(\mathcal {C}_{V,\alpha }\) must have dimension strictly smaller than \(\ell \), i.e. it has a \(\{0\}\)-coordinate, so we can find a cellular r-parametrization for each \(f_{V,\alpha }^*\hat{F}\) using \({\mathrm {F}}_{<\ell }\) as above. By projection, this r-parametrization will produce a cellular r-parametrization of the image \({\text {Im}}\left( f_{V,\alpha }^*\hat{F}\right) ={\text {gr}}\hat{F}\), which is what is required, see Remark 8.

We now use \({\mathrm {S}}_\ell \) to find a cellular r-parametrization \(\{f_\alpha :\mathcal {C}_\alpha \rightarrow I^\ell {\setminus } V\}\). Fixing one such \(\mathcal {C},f\) we note that \(f^*\hat{F}\) is \(C^r\)-smooth on \(\mathcal {C}\), and crucially we still have \(\Vert f^*\hat{F}(\mathbf {x}_1,\cdot )\Vert _r=O_{\ell ,r}(1)\) for every fixed \(\mathbf {x}_1\in I\) because \(\Vert f\Vert _r\leqslant 1\) and \(f_1\) does not depend on \(\mathbf {x}_{2..\ell }\). As before we may assume that \(\mathcal {C}=I^\ell \) and use linear subdivision to get \(\Vert f^*\hat{F}(\mathbf {x}_1,\cdot )\Vert _r\leqslant 1\). As above, it will suffice to find a cellular r-parametrization of \(f^*\hat{F}\).

3.3.3 Induction over the First Unbounded Derivative \({\varvec{\alpha }}\)

We return to our original notation replacing F by \(f^*\hat{F}\). We may now assume that \(F:I^\ell \rightarrow I^\ell \times Y\subset I^{\ell +q}\) is \(C^r\)-smooth and \(\Vert F(\mathbf {x}_1,\cdot )\Vert _r\leqslant 1\) for every \(\mathbf {x}_1\in I\). Let \({\varvec{\alpha }}\in \mathbb {N}^\ell \) be the first index, in lexicographic order, such that \(|{\varvec{\alpha }}|\leqslant r\) and \(\Vert F^{({\varvec{\alpha }})}\Vert >1\). If no such \({\varvec{\alpha }}\) exists we are done. We will reparametrize F by cellular r-maps such that the pullback has strictly larger \({\varvec{\alpha }}\) and then finish the argument by induction on \({\varvec{\alpha }}\).

3.3.4 Reparametrization of the \(\mathbf {x}_1\) Variable

By assumption \({\varvec{\alpha }}_1>0\). Using Lemma 16 and treating the finitely many exceptional \(\mathbf {x}_1\) values by induction on \(\ell \), we may assume without loss of generality that \(F^{({\varvec{\alpha }})}(\mathbf {x}_1,\cdot )\) is bounded for every \(\mathbf {x}_1\in I\). Define

$$\begin{aligned} S := \left\{ \mathbf {x}_{1..\ell } \in I^\ell : \Vert F^{({\varvec{\alpha }})}(\mathbf {x}_{1..\ell })\Vert \geqslant \tfrac{1}{2} \sup _{I^{\ell -1}} \Vert F^{({\varvec{\alpha }})}(\mathbf {x}_1,\cdot )\Vert \right\} \end{aligned}$$
(16)

Choose a definable curve \(\gamma :I\rightarrow S\) such that \(\gamma _1(\mathbf {x}_1)=\mathbf {x}_1\). Using \({\mathrm {F}}_1\) we find a cellular r-parametrization \(\Phi \) of \((\gamma ,F^{({\varvec{\alpha }}-1_1)}\circ \gamma )\). Fix \(\phi \in \Phi \) and set \(\tilde{F}:=F\circ (\phi ,{\text {id}})\).

3.3.5 Finishing-Up: A Bound on All Derivatives up to \({\varvec{\alpha }}\)

Recall that all derivatives of \((\phi ,{\text {id}})\) up to order r and all derivatives \(F^{({\varvec{\beta }})}\) with \({\varvec{\beta }}<{\varvec{\alpha }}\) are bounded by 1. It follows easily using the chain rule that \(\tilde{F}^{({\varvec{\beta }})}=O_{\ell ,r}(1)\) for \({\varvec{\beta }}<{\varvec{\alpha }}\). Computing \(\tilde{F}^{({\varvec{\alpha }})}\) we get terms that add up to \(O_{\ell ,r}(1)\), and the term \((\phi ')^{{\varvec{\alpha }}_1} \cdot F^{({\varvec{\alpha }})}\circ (\phi ,{\text {id}})\). Now

$$\begin{aligned} \Vert (\phi ')^{{\varvec{\alpha }}_1} \cdot F^{({\varvec{\alpha }})}\circ (\phi ,{\text {id}})\Vert \leqslant \Vert (\phi ')^{{\varvec{\alpha }}_1} \cdot 2 F^{({\varvec{\alpha }})}\circ \gamma \circ \phi \Vert \leqslant 2\Vert \phi '\cdot F^{({\varvec{\alpha }})}\circ \gamma \circ \phi \Vert \end{aligned}$$
(17)

since \(|\phi '|\leqslant 1\) and \({\varvec{\alpha }}_1\geqslant 1\). To bound the right-hand side, we compute

$$\begin{aligned} (F^{({\varvec{\alpha }}-1_1)}\circ \gamma \circ \phi )' = \phi '\cdot \left( F^{({\varvec{\alpha }})}\circ \gamma +\sum _{j=2}^\ell \gamma _j' \cdot F^{({\varvec{\alpha }}-1_1+1_j)}\circ \gamma \right) \circ \phi \end{aligned}$$
(18)

and note that the left-hand side and \(|\phi '\cdot \gamma _j'\circ \phi |\) are \(O_{\ell ,r}(1)\) by the choice of \(\phi \), while \(\Vert F^{({\varvec{\alpha }}-1_1+1_j)}\Vert \) are \(O_{\ell ,r}(1)\) by the inductive assumption on the lexicographic order, since \({\varvec{\alpha }}-1_1+1_j\prec {\varvec{\alpha }}\). Therefore \(\Vert \phi '\cdot F^{({\varvec{\alpha }})}\circ \gamma \circ \phi \Vert \) is also \(O_{\ell ,r}(1)\), and a further subdivision and linear reparametrization finishes our induction on \({\varvec{\alpha }}\).
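
Spelled out (our rearrangement of (18), with constants depending on the factorial normalization in (1), and with the sum running over the coordinates \(j=2,\ldots ,\ell \)), solving (18) for the leading term and taking norms gives

$$\begin{aligned} \Vert \phi '\cdot F^{({\varvec{\alpha }})}\circ \gamma \circ \phi \Vert \leqslant \Vert (F^{({\varvec{\alpha }}-1_1)}\circ \gamma \circ \phi )'\Vert + \sum _{j=2}^{\ell } \Vert \phi '\cdot \gamma _j'\circ \phi \Vert \cdot \Vert F^{({\varvec{\alpha }}-1_1+1_j)}\Vert = O_{\ell ,r}(1). \end{aligned}$$

The first term and the factors \((\gamma _j\circ \phi )'=\phi '\cdot \gamma _j'\circ \phi \) are controlled because \(\phi \) belongs to a cellular r-parametrization of \((\gamma ,F^{({\varvec{\alpha }}-1_1)}\circ \gamma )\). The remaining factors involve multi-indices of the same total order as \({\varvec{\alpha }}\) that precede it lexicographically. Inserted into (17), this bounds the leading term of \(\tilde{F}^{({\varvec{\alpha }})}\).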

3.4 Boundedness of Derivatives

In this section, we prove a simple lemma on boundedness of derivatives used in Sect. 3.3.4. We let \(\mu \) denote the Lebesgue measure (or just sum of lengths of intervals).

Lemma 15

Let \(\{f_\varepsilon (t):I\rightarrow I\}\) be a definable family of functions depending on a parameter \(\varepsilon \). Then for every \(\varepsilon \) and every \(M>0\),

$$\begin{aligned} \mu \big (\{ t\in I : |f'_\varepsilon (t)|>M \}\big ) < \frac{C}{M}, \end{aligned}$$
(19)

where C is a constant independent of \(\varepsilon \).

Proof

The set where \(f_\varepsilon '(t)>M\) (resp. \(f_\varepsilon '(t)<-M\)) is a union of intervals, with their number uniformly bounded by o-minimality, and each of length at most 1/M: otherwise \(f_\varepsilon \) would leave I along such an interval. \(\square \)
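
In formulas (a sketch; \(J\), \(t_1,t_2\), \(c\) and \(N\) are our notation): if \(J\subset I\) is an interval on which \(f_\varepsilon '>M\), then for all \(t_1<t_2\) in \(J\) the mean-value theorem gives some \(c\in (t_1,t_2)\) with

$$\begin{aligned} 1> f_\varepsilon (t_2)-f_\varepsilon (t_1) = f_\varepsilon '(c)\,(t_2-t_1) > M\,(t_2-t_1), \end{aligned}$$

where the first inequality holds because \(f_\varepsilon \) takes values in \(I=(0,1)\). Hence \(J\) has length at most \(1/M\), and the same argument applies where \(f_\varepsilon '<-M\). If each of the two sets consists of at most \(N\) intervals, with \(N\) independent of \(\varepsilon \), then (19) holds with any \(C>2N\).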

Lemma 16

Let \(f:I^\ell \rightarrow I^q\) be definable, and suppose that \(\Vert \frac{\partial f}{\partial x_j}\Vert \leqslant 1\) for \(j=2,\ldots ,\ell \). Then the function \(\frac{\partial f}{\partial x_1}(\mathbf {x}_1,\cdot )\) is bounded for almost every fixed \(\mathbf {x}_1\in I\).

Proof

Without loss of generality we can assume that \(q=1\). Assume the contrary. Then, by o-minimality, the set

$$\begin{aligned} \left\{ \mathbf {x}_1\in I : \left| \frac{\partial f}{\partial x_1}(\mathbf {x}_1,\cdot )\right| \text { is unbounded}\right\} \end{aligned}$$
(20)

contains an interval, and we might as well assume after restriction and rescaling that it is I. For each M we can choose a curve \(\gamma _M:I\rightarrow I^{\ell -1}\) such that

$$\begin{aligned} \left| \frac{\partial f}{\partial x_1}(t,\gamma _M(t))\right| >M \qquad \forall t\in I, \end{aligned}$$
(21)

and we may further assume the dependence on M is definable. Applying Lemma 15 to the coordinates of \(\gamma _M(t)\) and to \(f(t,\gamma _M(t))\), we see that outside a set of measure \(\tilde{C}/M\) we have \(\Vert \gamma _M'(t)\Vert \leqslant M/(3\ell )\) as well as

$$\begin{aligned} \left| \sum _{j=2}^\ell \frac{\partial f}{\partial x_j}(t,\gamma _M(t))\gamma _{M,j}'(t) + \frac{\partial f}{\partial x_1}(t,\gamma _M(t))\right| = \left| f\left( t,\gamma _M(t)\right) '\right| \leqslant M/3. \end{aligned}$$
(22)

This is impossible as soon as \(M>\tilde{C}\): the summation term in the left hand side is bounded by M/3, and the second term is at least M. \(\square \)
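
The final contradiction can be written as a single chain (our summary of the estimates above). At any t outside the exceptional set, (21), (22), the hypothesis \(\Vert \partial f/\partial x_j\Vert \leqslant 1\) and the bound on \(\gamma _M'\) give

$$\begin{aligned} M< \left| \frac{\partial f}{\partial x_1}(t,\gamma _M(t))\right| \leqslant \left| f\left( t,\gamma _M(t)\right) '\right| + \sum _{j=2}^\ell \left| \frac{\partial f}{\partial x_j}(t,\gamma _M(t))\right| \cdot |\gamma _{M,j}'(t)| \leqslant \frac{M}{3}+(\ell -1)\frac{M}{3\ell } < \frac{2M}{3}, \end{aligned}$$

which is absurd. Such a t exists as soon as the exceptional set, of measure at most \(\tilde{C}/M\), does not fill the unit interval, that is, for \(M>\tilde{C}\).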

3.5 Automatic Uniformity over Families

In this section, we give a model-theoretic proof that the statement \({\mathrm {F}}_\ell \) for an arbitrary o-minimal structure (and fixed \(\ell ,r\in \mathbb {N}\)) implies the family version for an arbitrary o-minimal structure (with the same \(\ell ,r\)). This is the approach employed by Pila and Wilkie [8], and we repeat it here with some more explicit details for non-experts in model theory who are nevertheless interested in understanding the mechanics of this general reduction. However, a reader unfamiliar with the relevant notions from model theory can alternatively check that the family version of \({\mathrm {F}}_\ell \) can be proven in the same manner as the usual statement of \({\mathrm {F}}_\ell \), essentially verbatim.

Let \(\mathcal {M}\) be an o-minimal structure, now not necessarily over \(\mathbb {R}\), and consider a family \(\{F_\lambda :X\rightarrow Y\}_{\lambda \in I}\). Let \(\mathcal {L}\) be the language of \(\mathcal {M}\). Let \(\Phi :=\{\phi _\alpha (\mathbf {p},\mathbf {a})\}\) denote the set of all \(\mathcal {L}\)-formulas in two sets of variables, and let \(N\in \mathbb {N}\). For every \({\varvec{\phi }}\in \Phi ^N\), we can write the first-order formula \(\psi _{\varvec{\phi }}(\lambda )\) stating that “there exists \(\mathbf {p}\) such that the formulas \({\varvec{\phi }}_1(\mathbf {p},\cdot ),\ldots ,{\varvec{\phi }}_N(\mathbf {p},\cdot )\) define N cellular maps \(f_1,\ldots ,f_N\) which form a cellular r-parametrization of \(F_\lambda \)”. We claim that there are \({\varvec{\phi }}^1,\ldots ,{\varvec{\phi }}^q\) such that \(\forall \lambda \in I:\vee _{j=1}^q \psi _{{\varvec{\phi }}^j}(\lambda )\) holds in \(\mathcal {M}\).

Suppose not. Let c denote a new constant and consider the theory

$$\begin{aligned} T:={{\,\mathrm{Th}\,}}_\mathcal {L}(\mathcal {M})\cup \{c\in I\}\cup \{\lnot \psi _{\varvec{\phi }}(c) : N\in \mathbb {N}, {\varvec{\phi }}\in \Phi ^N \}. \end{aligned}$$
(23)

This theory is finitely consistent by our assumption (in fact an interpretation for c exists in \(\mathcal {M}\)). It is, therefore, consistent by compactness, and we have an elementary extension \(\mathcal {M}\subset \tilde{\mathcal {M}}\) which is again an o-minimal structure. But the axioms of T state that \(F_c^{\tilde{\mathcal {M}}}\) has no cellular r-parametrization, and this contradicts \({\mathrm {F}}_\ell \) for \(\tilde{\mathcal {M}}\).

Now choose \({\varvec{\phi }}^1,\ldots ,{\varvec{\phi }}^q\) as above and set \(I_j:=\{\lambda \in I:\psi _{{\varvec{\phi }}^j}(\lambda )\}\). By definable choice there is a definable map \(\lambda \rightarrow \mathbf {p}(\lambda )\) such that for every \(\lambda \in I_j\), the formulas

$$\begin{aligned} {\varvec{\phi }}^j_1(\mathbf {p}(\lambda ),\mathbf {x}), \ldots , {\varvec{\phi }}^j_{N_j}(\mathbf {p}(\lambda ),\mathbf {x}) \end{aligned}$$
(24)

define \(N_j\) cellular maps which form a cellular r-parametrization of \(F_\lambda \). Finally, \(\cup _j I_j=I\), and refining this into a partition by points/intervals using o-minimality proves the claim.