1 Introduction

In this work, we are interested in studying the ramification behaviour of the p-torsion points of the formal group associated to an abelian variety over an unramified local field.

Let p be a rational prime, let F denote a finite, unramified extension of \({\mathbb Q}_p\), let K be the completion of the maximal unramified extension of \({\mathbb Q}_p\), let \({\overline{K}}\) be some fixed algebraic closure of K, and let \(\mathcal {O}:=\mathcal {O}_{{\overline{K}}}\). Let A be an abelian variety defined over F, with good reduction, let \(\mathcal {A}\) denote the Néron model of A over \(\textrm{Spec}(\mathcal {O}_F)\), and let \(\widehat{\mathcal {A}}\) be the formal completion of \(\mathcal {A}\) along the identity of its special fiber, i.e. the formal group of A.

To state our results, we need the following definitions. In [7], Fontaine studied the \(\mathcal {O}\)-module \(\Omega :=\Omega ^1_{\mathcal {O}/\mathcal {O}_K}\cong \Omega ^1_{\mathcal {O}/\mathcal {O}_F}\) of Kähler differentials of \(\mathcal {O}\) over \(\mathcal {O}_K\), or over \(\mathcal {O}_F\). The \(\mathcal {O}\)-module \(\Omega \) is a torsion and p-divisible \(\mathcal {O}\)-module, with a semi-linear action of \(G_F\). Let \(d:\mathcal {O}\rightarrow \Omega \) denote the canonical derivation, which is surjective. We denote by \(\mathcal {O}^{(1)}:=\ker (d)\) the kernel of d, which is an \(\mathcal {O}_K\)-sub-algebra of \(\mathcal {O}\). Membership of an element x of \(\mathcal {O}\) in \(\mathcal {O}^{(1)}\) reflects ramification properties of x (see e.g., Definition 2.2 and Lemma 2.3).

By using previous work of the authors [9], we are able to prove that the \(\mathcal {O}^{(1)}\)-points of the Tate module of \(\mathcal {A}\) are trivial, which implies the following theorem.

Theorem A

Let A be an abelian variety over F with good reduction. Then there is \(n_0\geqslant 1\) such that for every \(m\geqslant n_0\) and \(0\ne P\in \widehat{\mathcal {A}}[p^m](\mathcal {O}){\setminus } \widehat{\mathcal {A}}[p^{n_0 -1 }](\mathcal {O})\), we have \(P\notin \widehat{\mathcal {A}}(\mathcal {O}^{(1)})\).

For a more concrete description of what this means in terms of the coordinates of the torsion point P, we refer the reader to Remark 3.6.

Our second result gives conditions on \(\widehat{\mathcal {A}}\) under which one may take \(n_0 = 1\) in Theorem A. More precisely, we describe a condition on a formal group \({\mathscr {F}}\) of dimension g over \({{\,\textrm{Spf}\,}}(\mathcal {O}_K)\) which implies that, for every \(0\ne P = (x_1,\dots ,x_g) \in {\mathscr {F}}[p](\mathcal {O})\), the field of definition K(P)/K is tamely ramified. The condition is discussed in Sect. 4 and is related to the notion of a symmetric formal group law from [5].

Theorem B

Let \({\mathscr {F}}\) be a strict (Definition 4.1) formal group of dimension g over \({{\,\textrm{Spf}\,}}(\mathcal {O}_K)\). For \(0\ne P = (x_1,\dots ,x_g) \in {\mathscr {F}}[p](\mathcal {O})\), the field of definition K(P)/K is tamely ramified and \(\mathcal {O}_{K(P)} \cong \mathcal {O}_K[x_1,\dots ,x_g]\). Moreover, \(K({\mathscr {F}}[p])/K\) is tamely ramified.

1.1 Related results

In [12, Section 1], Serre showed that for \(E/\mathbb {Q}_p\) an elliptic curve with good supersingular reduction, the field extension \(\mathbb {Q}_p(E[p])/\mathbb {Q}_p\) is tamely ramified, and his proof relies on a detailed study of the formal group attached to E. In particular, Serre explicitly determined the p-adic valuation of the points on E[p] (note that when \(E/\mathbb {Q}_p\) has good supersingular reduction we have that \(E[p]\cong \widehat{{\mathcal E}}[p]\) where \(\widehat{{\mathcal E}}\) is the formal group of the Néron model of E), which allowed him to embed E[p] into a certain vector space on which the wild inertia group acts trivially.

In her thesis, Arias-de-Reyna generalized this approach, and in [5, Theorem 3.3], she showed that if there exists a positive rational number \(\alpha \) such that for all \(0\ne P = (x_1,\dots ,x_g) \in \widehat{\mathcal {A}}[p](\mathcal {O})\), the minimum of the p-adic valuations of the coordinates of P equals \(\alpha \), then the action of wild inertia on \(\widehat{\mathcal {A}}[p]\) is trivial, and so \(K(\widehat{\mathcal {A}}[p])/K\) is tamely ramified. She goes on to define the notion of a symmetric formal group law on a formal group of dimension 2, and then proves [5, Theorem 4.15] that if a formal group of dimension 2 has a symmetric formal group law and height 4, then the above statement about the p-adic valuation of the p-torsion points holds. Later in [5, Theorem 5.9], she identifies a family of genus 2 curves whose Jacobians have associated formal groups with a symmetric formal group law and height 4, and hence their p-torsion defines a tamely ramified extension. We also mention work of Rosen and Zimmerman [11], in which the authors study the Galois group of \(K({\mathscr {F}}[p^n])\) where \({\mathscr {F}}\) is a generic commutative formal group of dimension 1 and height h.

In the global setting, i.e., when working over a number field \(F/\mathbb {Q}\), Coleman [1] studied the ramification properties of torsion points on abelian varieties in relation to the Manin–Mumford conjecture. More precisely, he conjectured (loc. cit. Conjecture B) that for a smooth, projective, geometrically integral curve C/F and any Galois-stable torsion packet T in \(C(\overline{F})\), the field F(T)/F is unramified at a certain prime \(\mathfrak {p}\) of F. Coleman proved this conjecture when \(\mathfrak {p}\) is large enough, and using work of Bogomolov, he provided a new proof of the Manin–Mumford conjecture.

We conclude by noting that our Theorem B generalizes [5, Theorem 4.15].

1.2 Outline of paper

In Sect. 2, we recall the definition of the Fontaine integral, our previous work on the kernel of the Fontaine integral [9], and a different perspective on the Fontaine integral via work of Wintenberger. In Sect. 3, we prove Theorem A. We conclude in Sect. 4 with the definition of a strict formal group and our proof of Theorem B.

1.3 Conventions

We establish the following notations and conventions throughout the paper.

1.3.1 Fields

Fix a rational prime \(p>2\). Let K denote the completion of the maximal unramified extension of \({\mathbb Q}_p\), let \(\overline{K}\) be a fixed algebraic closure of K, and let \({\mathbb C}_p\) denote the completion of \(\overline{K}\) with respect to the unique extension v of the p-adic valuation on \({\mathbb Q}_p\) (normalized such that \(v(p) = 1\)). For a tower of field extensions \({\mathbb Q}_p\subset F\subset K\), we denote by \(G_K\) and \(G_F\) the absolute Galois groups of K and F, respectively. We denote \(\mathcal {O}:=\mathcal {O}_{{\overline{K}}}\).

1.3.2 Abelian varieties

We will consider an abelian variety A defined over some subfield \(F\subset K\) such that \([F:{\mathbb Q}_p]<\infty \), with good reduction over F. Let \(\mathcal {A}\) denote the Néron model of A over \(\textrm{Spec}(\mathcal {O}_F)\) and also denote by \(\widehat{\mathcal {A}}\) the formal completion of \(\mathcal {A}\) along the identity of its special fiber, i.e. the formal group of A. We note that the formation of Néron models commutes with unramified base change. We will denote the Tate module of A (resp. the Néron model \(\mathcal {A}\) of A) by \(T_p(A)\) (resp. \(T_p(\mathcal {A})\)). We note that \(T_p(A)\cong T_p(\mathcal {A})\) as \(G_F\)-modules.

1.3.3 Formal groups

We will let \({\mathscr {F}}\) denote a formal group over \({{\,\textrm{Spf}\,}}(\mathcal {O}_K)\). Recall that \(\widehat{\mathcal {A}}\) is a formal group of dimension \(\dim (A)\) and of height h which satisfies \(\dim (A) \leqslant h \leqslant 2\dim (A)\). We refer the reader to [8] for an extensive treatment of formal groups and to [4, Chapter 4.2 and Chapter 5] for a more concise treatment.

2 Fontaine integration for abelian varieties with good reduction

In this section, we recall the construction of the Fontaine integration as well as our previous work concerning the kernel of the Fontaine integral.

2.1 The differentials of the algebraic integers

First, we recall for the reader’s convenience the notation established above. Let K denote the completion of the maximal unramified extension of \({\mathbb Q}_p\), let \(\overline{K}\) be a fixed algebraic closure of K, and let \({\mathbb C}_p\) denote the completion of \(\overline{K}\). Let \(G_K\) denote the absolute Galois group of K. We denote \(\mathcal {O}:=\mathcal {O}_{{\overline{K}}}\). Fix a finite extension F of \({\mathbb Q}_p\) in K. For a \(G_K\)-representation V, the n-th Tate twist of V is denoted by V(n), which is just the tensor product of V with the n-fold tensor power of the p-adic cyclotomic character \(\mathbb {Q}_p(1)\).

In [7], Fontaine studied a fundamental object related to these choices, namely the \(\mathcal {O}\)-module \(\Omega :=\Omega ^1_{\mathcal {O}/\mathcal {O}_K}\cong \Omega ^1_{\mathcal {O}/\mathcal {O}_F}\) of Kähler differentials of \(\mathcal {O}\) over \(\mathcal {O}_K\), or over \(\mathcal {O}_F\). The \(\mathcal {O}\)-module \(\Omega \) is a torsion and p-divisible \(\mathcal {O}\)-module, with a semi-linear action of \(G_F\). Let \(d:\mathcal {O}\rightarrow \Omega \) denote the canonical derivation, which is surjective.

Important examples of algebraic differentials arise as follows: Let \((\varepsilon _n)\) denote a compatible sequence of primitive \(p^n\)-th roots of unity in \(\overline{K}\). Then

$$\begin{aligned} \frac{d\varepsilon _n}{\varepsilon _n} = d (\log \varepsilon _n) \in \Omega \quad \text{ and } \quad p\left( \frac{d\varepsilon _{n+1}}{\varepsilon _{n+1}}\right) =\frac{d\varepsilon _n}{\varepsilon _n}. \end{aligned}$$
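
For the reader's convenience, here is a short sketch of why the second relation holds. It assumes only that compatibility of the sequence means \(\varepsilon _{n+1}^p=\varepsilon _n\), and it uses the Leibniz rule for d together with the fact that each \(\varepsilon _n\) is a unit of \(\mathcal {O}\):

$$\begin{aligned} d\varepsilon _n = d\left( \varepsilon _{n+1}^{p}\right) = p\,\varepsilon _{n+1}^{p-1}\,d\varepsilon _{n+1}, \qquad \text{ hence } \qquad \frac{d\varepsilon _n}{\varepsilon _n} = \frac{p\,\varepsilon _{n+1}^{p-1}\,d\varepsilon _{n+1}}{\varepsilon _{n+1}^{p}} = p\,\frac{d\varepsilon _{n+1}}{\varepsilon _{n+1}}. \end{aligned}$$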

Next, we recall a theorem of Fontaine.

Theorem 2.1

([7, Théorème 1’]) Let \((\varepsilon _n)\) denote a compatible sequence of primitive \(p^n\)-th roots of unity in \(\overline{K}\). For \(\alpha \in \overline{K}\), write \(\alpha = a/p^r\) for some \(a\in \mathcal {O}\) and \(r\geqslant 0\). The morphism \(\xi :\overline{K}(1) \rightarrow \Omega \) defined by

$$\begin{aligned} \xi (\alpha \otimes (\varepsilon _n)_n) = a \frac{d\varepsilon _r}{\varepsilon _r} \end{aligned}$$

is surjective and \(G_K\)-equivariant with kernel

$$\begin{aligned} \ker (\xi ) = \underline{a}_{K} := \left\{ x\in \overline{K}: v(x) \geqslant - \frac{1}{p-1} \right\} . \end{aligned}$$

Moreover, \(\Omega \cong {\overline{K}(1)}/{\underline{a}_K(1)} \cong (\overline{K}/\underline{a}_K)(1)\) and \(V_p(\Omega ) = {{\,\textrm{Hom}\,}}_{\mathbb {Z}_p}(\mathbb {Q}_p,\Omega ) \cong \mathbb {C}_p(1)\).

Theorem 2.1 implies the following:

$$\begin{aligned} T_p(\Omega )\otimes _{{\mathbb Z}_p}{\mathbb Q}_p:=\left( \varprojlim _n \Omega [p^n]\right) \otimes _{{\mathbb Z}_p}{\mathbb Q}_p\cong \left( \varprojlim \left( \Omega {\mathop {\leftarrow }\limits ^{p}}\Omega {\mathop {\leftarrow }\limits ^{p}}\cdots {\mathop {\leftarrow }\limits ^{p}}\Omega \cdots \right) \right) \otimes _{{\mathbb Z}_p}{\mathbb Q}_p\cong {\mathbb C}_p(1) \end{aligned}$$

as \(G_F\)-modules.

We denote by \(\mathcal {O}^{(1)}:=\ker (d)\), the kernel of d, which is an \(\mathcal {O}_K\)-sub-algebra of \(\mathcal {O}\). Indeed, if \(a,b\in \mathcal {O}^{(1)}\), then \(d(ab)=ad(b)+bd(a)=0\), and so \(ab\in \mathcal {O}^{(1)}\). In order to better understand \(\mathcal {O}^{(1)}\), we recall a construction from the first and last author [10].

Definition 2.2

Let \(a\in \mathcal {O}\). Let L/K be a finite extension which contains a, let \(\pi \) be a uniformizer of L, and let \(f\in \mathcal {O}_K[x]\) be such that \(a = f(\pi )\). Then, define

$$\begin{aligned} \delta (a) :=\min \left( v\left( \frac{f'(\pi )}{\mathcal {D}_{L/K}}\right) ,0 \right) \end{aligned}$$

where \(\mathcal {D}_{L/K}\) denotes the different ideal of L/K. Note that \(\delta \) does not depend on \(\pi \), f, or L, and so it defines a function \(\delta :\mathcal {O}\rightarrow (-\infty ,0]\).
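
As a quick illustration of Definition 2.2 (a sketch, under the standing assumption \(p>2\) of Subsection 1.3.1), take \(a=\varepsilon _1\) to be a primitive p-th root of unity. One may choose \(L=K(\varepsilon _1)\), which is totally and tamely ramified of degree \(p-1\) over K, the uniformizer \(\pi =\varepsilon _1-1\), and \(f(x)=1+x\). Then \(f'(\pi )=1\) and \(\mathcal {D}_{L/K}=(\pi ^{p-2})\), so that

$$\begin{aligned} v(\mathcal {D}_{L/K}) = (p-2)\,v(\pi ) = \frac{p-2}{p-1} \qquad \text{ and } \qquad \delta (\varepsilon _1) = \min \left( -\frac{p-2}{p-1},\,0\right) = -\frac{p-2}{p-1} = -1+\frac{1}{p-1}. \end{aligned}$$

In particular \(\delta (\varepsilon _1)<0\). This agrees with Theorem 2.1, according to which \(a\,d\varepsilon _1/\varepsilon _1=0\) exactly when \(v(a/p)\geqslant -1/(p-1)\), that is, when \(v(a)\geqslant (p-2)/(p-1)\).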

Lemma 2.3

(Properties of \(\delta \)) The function \(\delta \) from Definition 2.2 satisfies the following properties.

  (1) If \(a,b\in \mathcal {O}\), then \(\delta (a + b) \geqslant \min (\delta (a),\delta (b))\), and if \(\delta (a) \ne \delta (b)\), then we have equality.

  (2) If \(a,b\in \mathcal {O}\), then \(\delta (ab) \geqslant \min (\delta (a) + v(b),\delta (b) + v(a))\).

  (3) If \(f\in \mathcal {O}_K[x]\) and \(\alpha \in \mathcal {O}\), then \(\delta (f(\alpha )) = \min (v(f'(\alpha )) + \delta (\alpha ),0)\).

  (4) If \(x,y \in \mathcal {O}\), then \(xdy = 0\) if and only if \(v(x) + \delta (y) \geqslant 0\).

  (5) For \(a\in \mathcal {O}\), \(\delta (a) = 0\) if and only if \(a\in \mathcal {O}^{(1)}\).

  (6) The formula \(\delta (a db) :=\min (v(a) + \delta (b),0)\) is well-defined and gives a map \(\delta :\Omega \rightarrow (-\infty ,0]\), which makes the obvious diagram commutative.

We will use the following properties of \(\delta \) in our study of the Fontaine integral.

Lemma 2.4

([10, Lemma 2.2]) Let \(a,b \in \mathcal {O}\) be such that \(\delta (a)\leqslant \delta (b)\). Then there exists \(c\in \mathcal {O}_{K[a,b]}\) such that \(c d a = d b\).

Proposition 2.5

([10, Theorem 2.2]) Let L/K be an algebraic extension. Then L is deeply ramified (loc. cit. Definition 1.1) if and only if \(\delta (\mathcal {O}_L)\) is unbounded.

2.2 The definition of Fontaine’s integration

We are now ready to define Fontaine’s integration. Let \( H^0(\mathcal {A}, \Omega ^1_{\mathcal {A}/\mathcal {O}_F})\) and respectively \({{\,\textrm{Lie}\,}}(\mathcal {A})(\mathcal {O}_F)\) denote the \(\mathcal {O}_F\)-modules of invariant differentials on \(\mathcal {A}\) and respectively its Lie algebra. Note that \(\omega \in H^0(\mathcal {A}, \Omega ^1_{\mathcal {A}/\mathcal {O}_F})\) being invariant implies that \((x\oplus _{ \mathcal {A}} y)^*(\omega ) = x^*(\omega ) + y^*(\omega )\) and \([p]^*(\omega ) = p\omega \) where \(\oplus _{ \mathcal {A}}\) is the group law in \(A({\overline{K}})\).

Definition 2.6

Let \(\underline{u}=(u_n)_{n\in {\mathbb N}}\in T_p(A)\) and \(\omega \in H^0(\mathcal {A}, \Omega ^1_{\mathcal {A}/\mathcal {O}_F})\). Each \(u_n\in \mathcal {A}(\mathcal {O})\) corresponds to a morphism \(u_n:\textrm{Spec}(\mathcal {O})\rightarrow \mathcal {A}\), and hence we can pullback \(\omega \) along this map giving us a Kähler differential \(u_n^*(\omega )\in \Omega \). The sequence \(\left( u_n^*(\omega )\right) _{n\geqslant 0}\) is a sequence of differentials in \(\Omega \) satisfying \(pu_{n+1}^*(\omega )=u_n^*(\omega )\), and hence defines an element in \(V_p(\Omega )\cong {\mathbb C}_p(1)\).

The Fontaine integration map

$$\begin{aligned} \varphi _\mathcal {A}:T_p(A)\rightarrow {{\,\textrm{Lie}\,}}(\mathcal {A})(\mathcal {O}_F)\otimes _{\mathcal {O}_F}{\mathbb C}_p(1) \end{aligned}$$

is a non-zero \(G_F\)-equivariant map defined by

$$\begin{aligned} \varphi _\mathcal {A}(\underline{u})(\omega ):=\left( u_n^*(\omega )\right) _{n\geqslant 0}\in V_p(\Omega )\cong {\mathbb C}_p(1). \end{aligned}$$

Remark 2.7

Using Theorem 2.1 and the function \(\delta \) from Definition 2.2, we can give an alternative description of the Fontaine integration map. Let \(\underline{u}=(u_n)_{n\geqslant 0}\in T_p(A)\) and \(\omega \in H^0(\mathcal {A}, \Omega ^1_{\mathcal {A}/\mathcal {O}_F})\). Each \(u_n\in \mathcal {A}(\mathcal {O})\) corresponds to a morphism \(u_n:\textrm{Spec}(\mathcal {O})\rightarrow \mathcal {A}\), and hence we can pullback \(\omega \) along this map giving us a Kähler differential \(u_n^*(\omega )\in \Omega \).

For every \(n \geqslant 0\), there is a maximal \(m(n) \geqslant 0\) such that \(u_n^*(\omega ) = \alpha _n (d\varepsilon _{m(n)}/\varepsilon _{m(n)})\) with \(\alpha _n \in \mathcal {O}\) where \(\varepsilon _{m(n)}\) is some primitive \(p^{m(n)}\)-th root of unity. To see this, we first note that

$$\begin{aligned} \delta \left( \frac{d\varepsilon _r}{\varepsilon _r}\right) = -r + \frac{1}{p-1} \end{aligned}$$

for any primitive \(p^r\)-th root of unity. This result follows from the definition of \(\delta \) and a result of Tate [14, Proposition 5] on the valuation of the different ideal of \(K(\varepsilon _r)/K\). By taking \(m(n) = -[\delta (u_n^*(\omega ))]\), where [x] denotes the greatest integer not exceeding the real number x, we can use Lemma 2.3.(6) and Lemma 2.4 to deduce the above equality.

Now using Theorem 2.1, we have that

$$\begin{aligned} \varphi _{\mathcal {A}}(\underline{u})(\omega ) = \lim _{n\rightarrow \infty }p^{n-m(n)}\alpha _n \in {\mathbb C}_p. \end{aligned}$$

Moreover, using the definition of \(\delta \) and this above interpretation, we can see that if \(\underline{u}\in T_p(A)^{G_K}\) (i.e., if \(\underline{u}\) is an unramified path), then \(\varphi _{\mathcal {A}}(\underline{u})(\omega ) = 0\). Indeed, it is clear from the definition of \(\delta \) that \(m(n) = 0\).

2.3 The kernel of the Fontaine integral

In [9], we studied the kernel of \(\varphi _A\). As noted in Remark 2.7, we have that \(T_p(A)^{G_K}\) lies in \(\ker (\varphi _A)\), and in [9, Theorem 4.5, Theorem A.4], we showed that \(T_p(A)^{G_K} = \ker (\varphi _A)\). In proving these results, we determined the kernel of the Fontaine integral when restricted to the Tate module of the formal group of A. This result will play a role later on, and so we present it below.

Theorem 2.8

([9, Theorem 5.5]) Let A be an abelian variety over F with good reduction, let \(\mathcal {A}\) denote its Néron model, and let \(\widehat{\mathcal {A}}\) be the formal group of A. The Fontaine integral restricted to the Tate module of \(\widehat{\mathcal {A}}\) is injective i.e., \(\ker ((\varphi _A)_{|T_p(\widehat{\mathcal {A}})}) = 0\).

2.4 Another point of view on the Fontaine integration map

In this subsection, we give another perspective on the Fontaine integration map, which will naturally lead us towards an application of Theorem 2.8.

We keep all the notations from the previous sections and Subsection 1.3. Recall that we let \(\mathcal {O}:=\mathcal {O}_{{\overline{K}}}\) and we have the \(\mathcal {O}\)-module \(\Omega :=\Omega ^1_{\mathcal {O}/\mathcal {O}_K}\) with its canonical derivation \(d:\mathcal {O}\rightarrow \Omega \). Note that d is surjective and \(\Omega \) is p-divisible, and let us denote \(\mathcal {O}^{(1)}:=\ker (d)\).

Lemma 2.9

Let \(A_\textrm{inf}^{(1)}\) denote the p-adic completion of \(\mathcal {O}^{(1)}\). Then, the exact sequence of \(G_F\)-modules

$$\begin{aligned} 0\rightarrow \mathcal {O}^{(1)}\rightarrow \mathcal {O}{\mathop {\rightarrow }\limits ^{d}}\Omega \rightarrow 0 \end{aligned}$$

induces another exact sequence:

$$\begin{aligned} 0\rightarrow T_p(\Omega )\rightarrow A_\textrm{inf}^{(1)}{\mathop {\rightarrow }\limits ^{\gamma }}\mathcal {O}_{{\mathbb C}_p}\rightarrow 0, \end{aligned}$$

where \(\gamma \) is an \(\mathcal {O}_F\)-algebra homomorphism and \(T_p(\Omega )\) is seen as an ideal of \(A_\textrm{inf}^{(1)}\) of square 0.

Proof

The statement follows from [3, Lemme 3.8] and also from [10, Corollary 1.1], but we present another proof below.

We consider the diagram

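[Figure a: commutative diagram with exact rows, both equal to \(0\rightarrow \mathcal {O}^{(1)}\rightarrow \mathcal {O}{\mathop {\rightarrow }\limits ^{d}}\Omega \rightarrow 0\), whose vertical maps are multiplication by \(p^n\).]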

The snake lemma gives the exact sequence of \(G_F\)-modules:

$$\begin{aligned} 0\rightarrow \Omega [p^n]\rightarrow \mathcal {O}^{(1)}/p^n\mathcal {O}^{(1)}\rightarrow \mathcal {O}/p^n\mathcal {O}\rightarrow 0. \end{aligned}$$

By taking the projective limit with respect to n of this exact sequence, we obtain the claim. \(\square \)
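
For completeness, we sketch how the snake lemma produces the short exact sequence used in the proof, assuming the vertical maps in the diagram are multiplication by \(p^n\). The long exact sequence of kernels and cokernels reads

$$\begin{aligned} 0\rightarrow \mathcal {O}^{(1)}[p^n]\rightarrow \mathcal {O}[p^n]\rightarrow \Omega [p^n]\rightarrow \mathcal {O}^{(1)}/p^n\mathcal {O}^{(1)}\rightarrow \mathcal {O}/p^n\mathcal {O}\rightarrow \Omega /p^n\Omega \rightarrow 0. \end{aligned}$$

Since \(\mathcal {O}\) is torsion-free, we have \(\mathcal {O}^{(1)}[p^n]=\mathcal {O}[p^n]=0\), and since \(\Omega \) is p-divisible, we have \(\Omega /p^n\Omega =0\). What remains is the displayed short exact sequence. The transition maps \(\Omega [p^{n+1}]{\mathop {\rightarrow }\limits ^{p}}\Omega [p^n]\) are surjective, again because \(\Omega \) is p-divisible, so passing to the projective limit preserves exactness and yields \(\varprojlim _n \Omega [p^n]=T_p(\Omega )\), \(\varprojlim _n \mathcal {O}^{(1)}/p^n\mathcal {O}^{(1)}=A_\textrm{inf}^{(1)}\), and \(\varprojlim _n \mathcal {O}/p^n\mathcal {O}=\mathcal {O}_{{\mathbb C}_p}\).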

Recall that we have the isomorphism \({{\,\textrm{Lie}\,}}(\mathcal {A})(\mathcal {O}_F)\cong H^0(\mathcal {A}, \Omega ^1_{\mathcal {A}/\mathcal {O}_F})^\vee \). By Lemma 2.9, we have the short exact sequence

$$\begin{aligned} 0\rightarrow T_p(\Omega ) \rightarrow A_\textrm{inf}^{(1)}\rightarrow \mathcal {O}_{{\mathbb C}_p}\rightarrow 0, \end{aligned}$$

where \(T_p(\Omega )\) is an ideal of \(A_\textrm{inf}^{(1)}\) such that \((T_p(\Omega ))^2=0\).

By definition, we have

$$\begin{aligned} {{\,\textrm{Lie}\,}}(\mathcal {A})(\mathcal {O}_F)\otimes _{\mathcal {O}_F}T_p(\Omega ) \cong \ker \left( \mathcal {A}(A_\textrm{inf}^{(1)})\rightarrow \mathcal {A}(\mathcal {O}_{{\mathbb C}_p})\right) , \end{aligned}$$

and hence we have the following short exact sequence of abelian groups with \(G_F\)-action

$$\begin{aligned} 0\rightarrow {{\,\textrm{Lie}\,}}(\mathcal {A})(\mathcal {O}_F)\otimes _{\mathcal {O}_F}T_p(\Omega )\rightarrow \mathcal {A}(A_\textrm{inf}^{(1)})\rightarrow \mathcal {A}(\mathcal {O}_{{\mathbb C}_p})\rightarrow 0. \end{aligned}$$

Consider the following commutative diagram with exact rows

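[Figure b: commutative diagram with exact rows, both equal to the short exact sequence above, whose vertical maps are multiplication by \(p^n\).]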

The snake lemma gives a \(G_K\)-equivariant map

$$\begin{aligned} \nu _n:\mathcal {A}(\mathcal {O}_{{\mathbb C}_p})[p^n]\cong A({\overline{K}})[p^n]\rightarrow {{\,\textrm{Lie}\,}}(\mathcal {A})(\mathcal {O}_F)\otimes _{\mathcal {O}_F} \Omega [p^n] \end{aligned}$$

and by taking the projective limit over n, we obtain a map

$$\begin{aligned} \nu :T_p(A) \rightarrow {{\,\textrm{Lie}\,}}(\mathcal {A})(\mathcal {O}_F)\otimes _{\mathcal {O}_F}T_p(\Omega ). \end{aligned}$$

Proposition 2.10

The map obtained above

$$\begin{aligned} \nu : T_p(A) \rightarrow {{\,\textrm{Lie}\,}}(\mathcal {A})(\mathcal {O}_F)\otimes _{\mathcal {O}_F}T_p(\Omega ) \subset \textrm{Lie}(\mathcal {A})(\mathcal {O}_F)\otimes _{\mathcal {O}_F}{\mathbb C}_p(1) \end{aligned}$$

coincides with Fontaine’s integral, i.e. we have \(\nu =\varphi _\mathcal {A}\).

Proof

In [15, Section 4, page 394], Wintenberger used a generalization of the above construction to obtain an integration pairing which coincides with the Colmez integration pairing \(\langle \cdot ,\cdot \ \rangle _{{{\,\textrm{Cz}\,}}}\). The result now follows from [2, Proposition 6.1].\(\square \)

3 Consequences of Theorem 2.8: ramification of p-power torsion points on \(\widehat{\mathcal {A}}\)

In this section, we use the interpretation of the Fontaine integral from Proposition 2.10 and Theorem 2.8 to deduce properties concerning the ramification of p-power torsion points on the formal group of A.

To begin, we recall the diagram

[Figure c: the commutative diagram with exact rows from Subsection 2.4.]

Above, we only wrote a piece of the snake lemma, and by writing more of it, we have an exact sequence of \(G_K\)-modules

$$\begin{aligned} 0\rightarrow \mathcal {A}(A_\textrm{inf}^{(1)})[p^n]\rightarrow \mathcal {A}(\mathcal {O})[p^n]\rightarrow {{\,\textrm{Lie}\,}}(\mathcal {A})(\mathcal {O}_F)\otimes _{\mathcal {O}_F} \Omega [p^n]. \end{aligned}$$

By taking projective limits, we have the exact sequence

$$\begin{aligned} 0\rightarrow T_p(\mathcal {A}(A_\textrm{inf}^{(1)}))\rightarrow T_p(A){\mathop {\rightarrow }\limits ^{\varphi _\mathcal {A}}}{{\,\textrm{Lie}\,}}(\mathcal {A})(\mathcal {O}_F)\otimes _{\mathcal {O}_F} T_p(\Omega )\subset {{\,\textrm{Lie}\,}}(\mathcal {A})(\mathcal {O}_F)\otimes _{\mathcal {O}_F} {\mathbb C}_p(1). \end{aligned}\tag{3.1}$$

Therefore, Theorem 2.8 implies that \(T_p(\widehat{\mathcal {A}}(A_\textrm{inf}^{(1)}))=0\).

To study consequences of this property, we will use another ring instead of \(A_\textrm{inf}^{(1)}\).

Definition 3.1

([6]) Let \(\theta :A_\textrm{inf}^{(1)}\rightarrow \mathcal {O}_{{\mathbb C}_p}\) denote the projection map. Then, we define \(D_f:=\theta ^{-1}(\mathcal {O})\). In [6, Remark 1.4.7], Fontaine gives the following construction of \(D_f\). Let us recall that

$$\begin{aligned} V_p(\Omega )=T_p(\Omega )\otimes _{{\mathbb Z}_p} {\mathbb Q}_p=\varprojlim \left( \Omega {\mathop {\leftarrow }\limits ^{p}} \Omega {\mathop {\leftarrow }\limits ^{p}}\cdots {\mathop {\leftarrow }\limits ^{p}}\Omega \cdots \right) \end{aligned}$$

and that \(\Omega \) and \(V_p(\Omega )\) are \(\mathcal {O}\)-modules. We make \(R:=V_p(\Omega )\oplus \mathcal {O}\) into a commutative ring by defining multiplication as follows: \((u, \alpha )(v, \beta )=(\beta u+\alpha v, \alpha \beta )\) for \((u, \alpha ), (v, \beta )\in R\), i.e. we require that \(V_p(\Omega )\) is an ideal of R of square 0. Then we have

$$\begin{aligned} D_f=\{\left( u=(u_n)_{n\geqslant 0},\alpha \right) \in R\ |\ d(\alpha )=u_0\}. \end{aligned}$$
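
As a sanity check (a sketch using only the multiplication on R defined above and the Leibniz rule for d), one can verify that \(D_f\) is closed under multiplication. For \((u,\alpha ),(v,\beta )\in D_f\), the component of index 0 of the first coordinate of the product is

$$\begin{aligned} (\beta u+\alpha v)_0 = \beta u_0+\alpha v_0 = \beta \,d(\alpha )+\alpha \,d(\beta ) = d(\alpha \beta ), \end{aligned}$$

so \((u,\alpha )(v,\beta )\in D_f\). Similarly \((0,1)\in D_f\) because \(d(1)=0\). Finally, the elements of \(D_f\) of the form \((u,0)\) are exactly those with \(u_0=0\), i.e. with \(u_n\in \Omega [p^n]\) for all n, which identifies them with \(T_p(\Omega )=\varprojlim _n\Omega [p^n]\).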

By Definition 3.1, we have an exact sequence of \(G_K\)-modules

$$\begin{aligned} 0\rightarrow T_p(\Omega )\rightarrow D_f{\mathop {\rightarrow }\limits ^{\theta }} \mathcal {O}\rightarrow 0, \end{aligned}$$

where \(\theta (u,\alpha )=\alpha \), and the p-adic completion of \(D_f\) is \(A_\textrm{inf}^{(1)}\). We note that we may construct the diagram above in the same way using \(D_f\) instead of \(A_\textrm{inf}^{(1)}\). In place of the exact sequence (3.1), this produces the exact sequence

$$\begin{aligned} 0\rightarrow T_p(\mathcal {A}(D_f))\rightarrow T_p(A){\mathop {\rightarrow }\limits ^{\varphi _\mathcal {A}}}{{\,\textrm{Lie}\,}}(\mathcal {A})(\mathcal {O}_F)\otimes _{\mathcal {O}_F} T_p(\Omega )\subset {{\,\textrm{Lie}\,}}(\mathcal {A})(\mathcal {O}_F)\otimes _{\mathcal {O}_F} {\mathbb C}_p(1). \end{aligned}$$

Again, Theorem 2.8 implies that \(T_p\bigl (\widehat{\mathcal {A}}(D_f)\bigr )=0\).

We will use this observation to deduce that \(T_p\bigl (\widehat{\mathcal {A}}(\mathcal {O}^{(1)})\bigr )=0\). In order to do so, we need to show that \(\widehat{\mathcal {A}}[p^n](D_f)\cong \widehat{\mathcal {A}}[p^n](\mathcal {O}^{(1)})\), for all \(n\geqslant 1\), which is accomplished through the following two lemmas.

Lemma 3.2

Let \(x\in \mathcal {A}[p^n](\mathcal {O}^{(1)})\). Then there is \(x'\in \mathcal {A}(D_f)\) with \(\theta (x')=x\) and such that \([p^n](x')=0\).

Proof

Recall that \(D_f:=\left\{ \bigl ((x_n)_n, y)\in V_p(\Omega )\times \mathcal {O}\ |\ x_0=dy\right\} \), i.e. we have an exact sequence

$$\begin{aligned} 0\rightarrow T_p(\Omega )\rightarrow D_f{\mathop {\rightarrow }\limits ^{\theta }}\mathcal {O}\rightarrow 0 \end{aligned}$$

and this exact sequence splits over \(\mathcal {O}^{(1)}\subset \mathcal {O}\), i.e. the following diagram is cartesian and has exact rows

$$\begin{aligned} \begin{array}{cccccccccc} 0&{}\longrightarrow &{}T_p(\Omega )&{}\longrightarrow &{} D_f&{}{\mathop {\longrightarrow }\limits ^{\theta }}&{}\mathcal {O}&{}\longrightarrow &{} 0\\ &{}&{}||&{}&{}\cup &{}&{}\cup \\ 0&{}\longrightarrow &{}T_p(\Omega )&{}\longrightarrow &{}T_p(\Omega )\oplus \mathcal {O}^{(1)}&{}{\mathop {\longrightarrow }\limits ^{\theta }}&{}\mathcal {O}^{(1)}&{}\longrightarrow &{} 0. \end{array} \end{aligned}$$

In particular, the section \(s:\mathcal {O}^{(1)} \longrightarrow D_f\) is defined by \(s(x):=(0,x)\). Then s defines a morphism \(s:\mathcal {A}(\mathcal {O}^{(1)})\rightarrow \mathcal {A}(D_f)\), and if \(x\in \mathcal {A}[p^n](\mathcal {O}^{(1)})\), then \(s(x)\in \mathcal {A}[p^n](D_f)\).\(\square \)

In the next lemma, we show a converse of Lemma 3.2 at least for the formal group \(\widehat{\mathcal {A}}\) of A.

Lemma 3.3

Let \(\widehat{\mathcal {A}}\) denote the formal group of the abelian scheme \(\mathcal {A}\) and fix an integer \(n\geqslant 1\). Let

$$\begin{aligned} 0\ne P\in \widehat{\mathcal {A}}(\mathcal {O})[p^n]\backslash \widehat{\mathcal {A}}(\mathcal {O}^{(1)}) \end{aligned}$$

and let \(Q\in \widehat{\mathcal {A}}(D_f)\) be a point such that \(\theta (Q)=P\). Then \([p^m]Q\ne 0\) for all \(m\geqslant n\).

Proof

Let \(m\geqslant n\) and denote by

$$\begin{aligned}{}[p^m](X_1,\dots ,X_g):=\left( f_1(X_1,\dots ,X_g), f_2(X_1,\dots ,X_g),\dots , f_g(X_1,\dots ,X_g)\right) \end{aligned}$$

the multiplication by \(p^m\) on \(\widehat{\mathcal {A}}\). Let \(S:=\mathcal {O}_F[[X_1,\dots ,X_g]]/I\), where I is the ideal generated by \(f_1, f_2,\dots ,f_g\). Then S is a finite flat \(\mathcal {O}_F\)-algebra, in particular S is a free \(\mathcal {O}_F\)-module, and \(\widehat{\mathcal {A}}[p^m]:=\textrm{Spec}(S)\), with the co-multiplication of \(\widehat{\mathcal {A}}\), is a finite flat group-scheme. Hence \(\widehat{\mathcal {A}}[p^m]\times _{\mathcal {O}_F}\textrm{Spec}(F)\) is an étale, and therefore smooth, group-scheme over F. This implies that the image in \(S\otimes _{\mathcal {O}_F}F\) of the determinant of the matrix:

$$\begin{aligned} \left( \begin{array}{cccccc} \frac{\partial (f_1)}{\partial (X_1)} &{} \frac{\partial (f_1)}{\partial (X_2)}&{}\dots &{}\frac{\partial (f_1)}{\partial (X_g)}\\ \vdots &{}\vdots &{}\ddots &{}\vdots \\ \frac{\partial (f_g)}{\partial (X_1)} &{} \frac{\partial (f_g)}{\partial (X_2)}&{}\dots &{}\frac{\partial (f_g)}{\partial (X_g)}\end{array}\right) \end{aligned}$$

is a unit.

Let now \(P=(x_1,x_2,\dots ,x_g)\in \widehat{\mathcal {A}}[p^n](\mathcal {O})\backslash \widehat{\mathcal {A}}[p^n](\mathcal {O}^{(1)})\), with coordinates \(x_j\in \mathfrak {m}_{\mathcal {O}}\), i.e. there is \(1\leqslant i\leqslant g\) such that \(x_i\) is not in \(\mathcal {O}^{(1)}\). Let \(P'=(y_1,y_2,\dots ,y_g)\in \widehat{\mathcal {A}}(D_f)\) be such that \(\theta (P')=P\), i.e. \(y_j=\alpha _j+x_j\), with \(\alpha _j\in V_p(\Omega )\) and \(d(x_j)=\alpha _{j,0}\), for all \(1\leqslant j\leqslant g\). By the above assumption \(\alpha _i\ne 0\). As in \(D_f\) we have \(\alpha _j\alpha _k=0\) for all \(1\leqslant j,k\leqslant g\), the Taylor formula implies that if \([p^m](P')=0\) we must have:

$$\begin{aligned} f_s(x_1,\dots ,x_g)+ \sum _{j=1}^g\frac{\partial (f_s)}{\partial (X_j)}(x_1,\dots ,x_g)\alpha _j=\sum _{j=1}^g\frac{\partial (f_s)}{\partial (X_j)}(x_1,\dots ,x_g)\alpha _j =0 \end{aligned}$$

for every \(1\leqslant s\leqslant g\). But the determinant of the matrix

$$\begin{aligned} \left( \begin{array}{cccccc} \frac{\partial (f_1)}{\partial (X_1)}(x_1,\dots ,x_g) &{} \frac{\partial (f_1)}{\partial (X_2)}(x_1,\dots ,x_g)&{}\dots &{}\frac{\partial (f_1)}{\partial (X_g)}(x_1,\dots ,x_g)\\ \vdots &{}\vdots &{}\ddots &{}\vdots \\ \frac{\partial (f_g)}{\partial (X_1)}(x_1,\dots ,x_g) &{} \frac{\partial (f_g)}{\partial (X_2)}(x_1,\dots ,x_g)&{}\dots &{}\frac{\partial (f_g)}{\partial (X_g)}(x_1,\dots ,x_g)\end{array}\right) \end{aligned}$$

is a unit in \(\overline{K}\), i.e. it is non-zero and \(\alpha _i\ne 0\). This is a contradiction.\(\square \)

Remark 3.4

We note that the group-scheme \(\widehat{\mathcal {A}}[p^m]\) is not smooth over \(\mathcal {O}_F\) (for example S/pS could have nilpotents). We also remark that as \(\widehat{\mathcal {A}}[p^m]\times _{\mathcal {O}_F}\textrm{Spec}(F)\) is smooth, the map \(\theta \otimes 1:\widehat{\mathcal {A}}[p^m](D_f\otimes _{\mathcal {O}_F}F)\rightarrow \widehat{\mathcal {A}}[p^m](\overline{F})\) is surjective, but this is clear as \(D_f\otimes _{\mathcal {O}_F}F=V_p(\Omega )\oplus \overline{F}\).

Lemma 3.2 and Lemma 3.3 imply that the map \(\theta \) gives an isomorphism \(\widehat{\mathcal {A}}[p^n](D_f)\cong \widehat{\mathcal {A}}[p^n](\mathcal {O}^{(1)})\), for all \(n\geqslant 1\). Combining this with the fact that \(T_p\bigl (\widehat{\mathcal {A}}(D_f)\bigr )=0\), we have the following result.

Theorem 3.5

(\(=\)Theorem A) Let A be an abelian variety over F with good reduction. Then there is \(n_0\geqslant 1\) such that for every \(m\geqslant n_0\) and \(0\ne P\in \widehat{\mathcal {A}}[p^m](\mathcal {O}){\setminus } \widehat{\mathcal {A}}[p^{n_0 -1 }](\mathcal {O})\), we have \(P\notin \widehat{\mathcal {A}}(\mathcal {O}^{(1)})\).

Remark 3.6

We observe that the above is a result regarding ramification properties of the p-power torsion points of the formal group of our abelian variety with good reduction over F (cf. Proposition 2.5). More precisely, let \(m\geqslant n_0\) and \(0\ne P\in \widehat{\mathcal {A}}[p^m](\mathcal {O}){\setminus } \widehat{\mathcal {A}}[p^{n_0 -1 }](\mathcal {O})\) be as in the theorem above. Let \(P=(x_1,x_2,\dots ,x_g)\) with \(x_i\in \mathcal {O}\) for \(1\leqslant i\leqslant g\). Theorem 3.5 says the following: let \(L=K[P]:=K[x_1,x_2,\dots ,x_g]\), let \(\pi \) denote a uniformizer of L and let \(\mathcal {D}_{L/K}\) denote the different ideal of L/K. For every \(1\leqslant i\leqslant g\) let \(f_i(X)\in \mathcal {O}_K[X]\) be a polynomial such that \(f_i(\pi )=x_i\). Then there is \(1\leqslant j\leqslant g\) such that \(v(f_j'(\pi ))< v(\mathcal {D}_{L/K})\) (i.e. \(x_j\notin \mathcal {O}^{(1)}\)).

4 A theorem on the ramification type of the field obtained by adjoining a p-torsion point of a formal group

In this section we study the ramification properties of the extension K[P]/K, where P is a non-zero p-power torsion point of the formal group of A.

We continue to denote by K the completion of the maximal unramified extension of \({\mathbb Q}_p\) in an algebraic closure of \({\mathbb Q}_p\), which we denote \({\overline{K}}\). Let \({\mathscr {F}}\) denote a formal group of dimension g over \(\textrm{Spf}(\mathcal {O}_K)\). For example, \({\mathscr {F}}\) can be the formal group of the Néron model of A over \(\textrm{Spf}(\mathcal {O}_K)\).

To begin, we define the notion of a strict formal group.

Definition 4.1

Consider the multiplication-by-p map

$$\begin{aligned} {[p]}(X_1,\dots ,X_g) = (f_1(X_1,\dots ,X_g),f_2(X_1,\dots ,X_g),\dots ,f_g(X_1,\dots ,X_g))\end{aligned}$$

on \({\mathscr {F}}\), where each \(f_i(X_1,\dots ,X_g)\) is a power series with coefficients in \(\mathcal {O}_K\). For each \(1\leqslant i\leqslant g\), define \(F_i(X_1,\dots ,X_g)\) to be the form comprised of the monomials of \(f_i\) which have unit coefficient and minimal degree, where we consider each variable \(X_1,\dots ,X_g\) to be of degree 1.

Let \(d_1,\dots ,d_g\) denote the degrees of these forms \(F_1(X_1,\dots ,X_g),\dots ,F_g(X_1,\dots ,X_g)\), respectively, which we note are (possibly distinct) powers of p. Let \(G_1(X_1,\dots ,X_g),\dots ,G_g(X_1,\dots ,X_g)\) denote the reductions modulo p of the forms \(F_1(X_1,\dots ,X_g),\dots ,F_g(X_1,\dots ,X_g)\).

Consider the system of equations

$$\begin{aligned} G_1(X_1,\dots ,X_g) = G_2(X_1,\dots ,X_g) = \cdots = G_g(X_1,\dots ,X_g) = 0. \end{aligned}$$
(4.1)

We say that the formal group \({\mathscr {F}}\) is strict if \(d_1 = d_2 = \cdots = d_g\) and the only solution to (4.1) is \((0,0,\dots ,0) \in (\overline{\mathbb {F}_p})^g\).

Remark 4.2

If \({\mathscr {F}}\) is a formal group of dimension 1, then it is clear that \({\mathscr {F}}\) is strict since we will have that \(F_1(X_1) = uX_1^{p^h}\), where h is the height of \({\mathscr {F}}\). Moreover, if \({\mathscr {F}}\) is the product of 1-dimensional formal groups, then again \({\mathscr {F}}\) is strict.
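
As an illustration of Definition 4.1 (our own example, not taken from the references), consider the multiplicative formal group \(\widehat{\mathbb {G}}_m\) over \(\textrm{Spf}(\mathcal {O}_K)\), assuming the standard coordinate in which the group law is \(X+Y+XY\). Under this assumption, the multiplication-by-p map expands as

$$\begin{aligned} {[p]}(X_1) = (1+X_1)^p - 1 = pX_1 + \sum _{k=2}^{p-1}\binom{p}{k}X_1^k + X_1^p. \end{aligned}$$

Every binomial coefficient \(\binom{p}{k}\) with \(1\leqslant k\leqslant p-1\) is divisible by p, so the only monomial with unit coefficient is \(X_1^p\). Hence \(F_1(X_1)=X_1^p\), \(d_1=p\), and \(G_1(X_1)=X_1^p\) vanishes only at 0 in \(\overline{\mathbb {F}_p}\). This matches the shape \(uX_1^{p^h}\) with \(u=1\) and height \(h=1\).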

Remark 4.3

We can give an equivalent characterization of strict as follows. Consider the \(g\times g\) matrix \(M = (a_{ij})\) where the entry \(a_{ij}\) consists of the coefficient of \(X_i\) in the linear form \(G_j\) for each \(1\leqslant i,j \leqslant g\). The condition that \({\mathscr {F}}\) be strict is equivalent to the determinant of M being non-zero.

We refer the reader to [5, Remark 4.14] for an example of a 2-dimensional formal group of height 4 where this condition holds, and here we note that the above degrees all equal \(p^2\). Moreover, we remark that the proof from loc. cit. holds for any g-dimensional formal group of height 2g where the degrees \(d_1 = d_2 = \cdots = d_g\) all equal \(p^2\).

With this definition, we can state our main result.

Theorem 4.4

Let \({\mathscr {F}}\) be a strict formal group of dimension g. For \(0\ne P = (x_1,\dots ,x_g) \in {\mathscr {F}}[p](\mathcal {O})\), the field of definition K(P)/K is tamely ramified and \(\mathcal {O}_{K(P)} \cong \mathcal {O}_K[x_1,\dots ,x_g]\). Moreover, \(K({\mathscr {F}}[p])/K\) is tamely ramified.

Proposition 4.5

Let \({\mathscr {F}}\) be a strict formal group of dimension g. For every \(0\ne P\in {\mathscr {F}}[p](\mathcal {O})\), the coordinates of P are not all in \(\mathcal {O}^{(1)}\).

Proof

Let \(0\ne P = (x_1,\dots ,x_g) \in {\mathscr {F}}[p](\mathcal {O})\) be a non-zero p-torsion point. By Theorem 4.4, we know that the extension K(P)/K is tamely ramified and that there exists some coordinate \(x_i\) which is a uniformizer for K(P)/K. By [13, Proposition III.6.13], we have that \(v(\mathcal {D}_{K(P)/K}) > 0 \) where \(\mathcal {D}_{K(P)/K}\) is the different ideal of K(P)/K. Now since \(x_i\) is a uniformizer for K(P)/K, we have that

$$\begin{aligned} \delta (x_i) = -v(\mathcal {D}_{K(P)/K}) < 0, \end{aligned}$$

where \(\delta \) is the function defined in Definition 2.2. By Lemma 2.3.(5), \(x_i \notin \mathcal {O}^{(1)}\) as desired. \(\square \)

For the remainder of this section, we focus on proving Theorem 4.4. The proof can be broken down into three steps.

  1. (1)

    Given a non-zero p-torsion point \(P = (x_1,\dots ,x_g) \in {\mathscr {F}}[p](\mathcal {O})\), we will carefully construct linear combinations \(z_i^*\) of the \(x_1,\dots ,x_g\) which satisfy nice properties in terms of their valuations and distances between their \(\overline{K}\)-conjugates. See Lemma 4.6.

  2. (2)

    Next, we consider the change of variables (i.e., the isomorphism of formal groups) which sends the coordinate \(X_i\) to the linear combination \(Z_i^*\) described above. We use the properties of the \(z_i^*\) and the strictness of \({\mathscr {F}}\) to precisely determine the valuation of \(z_i^*\) and to estimate the valuation of the difference between them and their \(\overline{K}\)-conjugates.

  3. (3)

    Finally, we use Krasner’s lemma to deduce that one of the original coordinates \(x_1,\dots ,x_g\) of P must be a uniformizer for the maximal order of K(P), from which Theorem 4.4 follows.

Lemma 4.6

Let \({\mathscr {F}}\) be a formal group of dimension g over \({{\,\textrm{Spf}\,}}(\mathcal {O}_K)\). Let \(0\ne P= (x_1,\dots ,x_g) \in {\mathscr {F}}[p](\mathcal {O})\). There exist linear combinations \(z_1^*,\dots ,z_g^*\) of \(x_1,\dots ,x_g\) with coefficients in \((\mathcal {O}_K)^{\times } \cup \{0\}\) which satisfy:

  1. (1)

    \(K(z_{i}^*) \cong K(P)\),

  2. (2)

    \(v(z_{i}^*) = \min \{ v(x_{1}),\dots , v(x_{g})\}\),

  3. (3)

    \(v(z_{i}^* - \sigma (z_{i}^*)) = \min \{ v(x_{1} - \sigma (x_{1})),\dots , v(x_{g} - \sigma (x_{g}))\}\) for all \(\sigma \in {{\,\textrm{Gal}\,}}(\widetilde{K(P)}/K)\) where \(\widetilde{K(P)}\) is the Galois closure of K(P),

and such that the matrix M representing the change of coordinates \((z_1^*,\dots ,z_g^*)^t= M (x_1,\dots ,x_g)^t\) is invertible. Here the exponent t indicates the transpose of a matrix.

Proof

Let \(e:= [K(P): K]\). Our proof will involve making a series of linear combinations. To begin, we will construct the element \(z_1^*\). First, consider all the linear combinations of the form

$$\begin{aligned} \mathcal {B}_1:=\{z = u_1x_1 + \cdots + u_gx_g \text { where } u_i\in (\mathcal {O}_K)^{\times } \cup \{0\} \text { and }u_1 \ne 0 \}. \end{aligned}$$
(4.2)

By our assumptions on K, the set of \(u_i (\text{ mod } p)\) is infinite, with \(u_i\) as in the above formula, and hence we may find one linear combination, call it \(z_1\) in \(\mathcal {B}_1\), satisfying the following two conditions:

  1. (a)

    \(v(z) \geqslant v(z_1)\) for all other linear combinations z from \(\mathcal {B}_1\),

  2. (b)

    \(K(z_1) \cong K(P)\).

To show that \(K(z_1) \cong K(P)\) holds, consider the following. There are exactly e embeddings of \(K(x_1, \dots ,x_g)\) into the fixed algebraic closure \({\overline{K}}\), call them \(\sigma _1, \dots , \sigma _e\). Note that the vectors \({{\textbf {w}}}_j:= (\sigma _j(x_1), \dots , \sigma _j(x_g))\), \( 1\leqslant j \leqslant e\), are distinct. Indeed, if for some \( i \ne j\) the vectors \({{\textbf {w}}}_i\) and \({{\textbf {w}}}_j\) coincide, then \(\sigma _i\) and \(\sigma _j\) will coincide at \(x_1,\dots ,x_g\) and so they will coincide on \(K(x_1,\dots ,x_g)\), which is not the case.

Consider now, for each pair (i, j) with \(1\leqslant i, j \leqslant e\) and \(i\ne j\), the hyperplane \({\mathcal H}_{i,j}\) given by

$$\begin{aligned} {\mathcal H}_{i,j} = \left\{ (c_1,c_2,\dots ,c_g): c_1, \dots , c_g \in K, \sum _{l = 1}^g c_l ( \sigma _i(x_l) - \sigma _j(x_l)) = 0 \right\} . \end{aligned}$$

Since the vectors \({{\textbf {w}}}_j\), \(1 \leqslant j \leqslant e\) are distinct, none of \({\mathcal H}_{i,j}\) covers the full space \(K^g\). Denote by \({\mathcal H}\) the union of these finitely many hyperplanes. Choose now any \(c_1,\dots ,c_g \in K\) such that the point \((c_1,c_2,\dots ,c_g)\) lies outside \({\mathcal H}\). Then we claim that the element

$$\begin{aligned} z:= c_1x_1+ \cdots + c_g x_g \end{aligned}$$

satisfies \(K(z) = K(x_1,\dots ,x_g)\). Indeed, \(\sigma _1(z),\dots ,\sigma _e(z)\) are distinct. For, if two of them are equal, say \(\sigma _i(z) = \sigma _j(z)\) with \(i\ne j\), then \((c_1,c_2,\dots ,c_g) \) is forced to lie in \({\mathcal H}_{i,j}\). Thus \(\sigma _1(z),\dots ,\sigma _e(z)\) are distinct, so z has at least e distinct conjugates over K. Hence \([K(z):K] \geqslant e\) and in conclusion \(K(z) = K(x_1,\dots ,x_g)\). Moreover, we can find an element \(z_1 \in \mathcal {B}_1\) satisfying conditions (a) and (b) above.
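
To make the hyperplane argument concrete, here is a small illustration of our own (not from the text), assuming \(p=3\), \({\mathscr {F}}=\widehat{\mathbb {G}}_m\times \widehat{\mathbb {G}}_m\) with the standard coordinates, and \(\zeta \) a primitive cube root of unity in \({\overline{K}}\). Take \(P=(x_1,x_2)=(\zeta -1,\zeta ^2-1)\), so that \(K(P)=K(\zeta )\), \(e=2\), and the two embeddings are \(\sigma _1=\text {id}\) and \(\sigma _2:\zeta \mapsto \zeta ^2\). The single hyperplane is then cut out by

$$\begin{aligned} c_1(\sigma _1(x_1)-\sigma _2(x_1)) + c_2(\sigma _1(x_2)-\sigma _2(x_2)) = (c_1-c_2)(\zeta -\zeta ^2) = 0, \quad \text {i.e. } {\mathcal H}_{1,2}=\{(c_1,c_2)\in K^2: c_1=c_2\}. \end{aligned}$$

Consistently, the point \((1,1)\in {\mathcal H}_{1,2}\) gives \(x_1+x_2=\zeta +\zeta ^2-2=-3\in K\), which does not generate \(K(P)\) and has valuation 1, whereas \((1,0)\notin {\mathcal H}_{1,2}\) gives \(z_1=x_1=\zeta -1\), which generates \(K(P)\) and has the minimal valuation \(1/2\), so it satisfies both (a) and (b) in this example.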

We pause to note that the matrix M representing the change of coordinates \((z_1,x_2,\dots ,x_g)^t= M (x_1,x_2,\dots ,x_g)^t\) is invertible. Indeed, the matrix M has units along the diagonal, the coefficients of the linear combination \(z_1\) in the first row, and zeros elsewhere, hence the determinant is a unit.

We now look at the distances between these linear combinations and various of their conjugates over K. Fix \(\sigma \in \text {Gal}(\overline{{\mathbb Q}_p}/K)\) and consider the infimum of the values \(v(z - \sigma (z))\) for the linear combinations z as in (4.2), i.e. we look at \(\inf \{ v\left( z-\sigma (z)\right) \ |\ z\in \mathcal {B}_1\}\). We first show that this infimum exists and is attained by some linear combination. Note that if \(\sigma _{|K(z_1)} = \text {id}\), then since all linear combinations belong to \(K(x_1,\dots ,x_g)\), we have that \(z - \sigma (z) = 0\) for all linear combinations z, i.e. \(\inf \{ v\left( z-\sigma (z)\right) \ |\ z\in \mathcal {B}_1\}=\infty =v(z_1-\sigma (z_1))\). If we consider \(\sigma \) such that \(\sigma _{|K(z_1)} \ne \text {id}\), then we at least have that \(v(z - \sigma (z)) < \infty \) for some linear combinations z, for example for \(z = z_1\). We have that the set of possible valuations is discrete, and the set of values \(v(z - \sigma (z))\) has a lower bound, namely \(\min \{v(x_1 - \sigma (x_1)),\dots ,v(x_g - \sigma (x_g))\}\). Thus, the infimum of the set of values \(v(z - \sigma (z))\) is attained by some linear combination z from (4.2), but note it need not be attained by \(z_1\).

Let G denote \({{\,\textrm{Gal}\,}}(\widetilde{K(P)}/K)\) where \(\widetilde{K(P)}\) is the Galois closure of K(P)/K. For a fixed \(\sigma \in G\), let \(z_{\sigma }\in \mathcal {B}_1\) denote one such linear combination attaining the minimum \(v(z_\sigma - \sigma (z_\sigma ))=\min \{ v\left( z-\sigma (z)\right) \ |\ z\in \mathcal {B}_1\}\). We claim that we can find a linear combination of \(z_1\) and of all of these \(z_{\sigma }\) where \(\sigma \in G\) which simultaneously achieves these minima. To do this, consider all linear combinations

$$\begin{aligned} z^* = z_1 + \sum _{\sigma \in G}u_{\sigma }z_{\sigma } \quad \text { where } u_{\sigma } \in (\mathcal {O}_K)^{\times }\cup \{0\}. \end{aligned}$$
(4.3)

We will choose the \(u_\sigma \)’s such that each such \(z^*\) will live in \(\mathcal {B}_1\).

To achieve our desired simultaneous minima, we start with one \(\sigma \in G\), call it \(\sigma _1\). First, we let \(z^*= z_1\). This linear combination might work in that it already attains the minimum at \(\sigma _1\); by this we mean that \(v(z - \sigma _1(z)) \geqslant v(z_1 - \sigma _1(z_1))\) holds for all linear combinations \(z\in \mathcal {B}_1\) from (4.2). If this is the case, then we set \(y_1:=z_1\). Now suppose that \(z_1\) does not attain the minimum at \(\sigma _1\). In this case, we may use any unit \(u\in (\mathcal {O}_K)^{\times }\) and set \(z^*:= z_1 + uz_{\sigma _1}\). Indeed, for any unit \(u \in (\mathcal {O}_K)^{\times }\) and \(z^*\) above, we have that

$$\begin{aligned} v(z^* - \sigma _1(z^*)) =v\left( z_1-\sigma _1(z_1)+u(z_{\sigma _1}-\sigma _1(z_{\sigma _1}))\right) = v(z_{\sigma _1} - \sigma _1(z_{\sigma _1})), \end{aligned}$$

because \(v(z_1-\sigma _1(z_1))>v(z_{\sigma _1}-\sigma _1(z_{\sigma _1}))\). Let \(y_1:=z_1+uz_{\sigma _1}\) with a unit \(u \in (\mathcal {O}_K)^{\times }\) such that \(y_1\in \mathcal {B}_1\). For such \(y_1\in \mathcal {B}_1\) we have \(v(y_1-\sigma _1(y_1))\leqslant v(z-\sigma _1(z))\) for all \(z\in \mathcal {B}_1\), and the units u with \(y_1\in \mathcal {B}_1\) are exactly those for which \(u \pmod p\) avoids a finite number of elements in \(\overline{{\mathbb F}_p}\).
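
The role of the condition on \(u \pmod p\) can be seen in a toy case of our own choosing (an assumption for illustration, not part of the argument): let \(p=3\), let \(\zeta \) be a primitive cube root of unity, and let \(P=(x_1,x_2)=(\zeta -1,\zeta ^2-1)\) be a 3-torsion point of \(\widehat{\mathbb {G}}_m\times \widehat{\mathbb {G}}_m\), so that G has a single non-trivial element \(\sigma :\zeta \mapsto \zeta ^2\). For \(z=u_1x_1+u_2x_2\in \mathcal {B}_1\) one computes

$$\begin{aligned} z-\sigma (z) = (u_1-u_2)(\zeta -\zeta ^2), \qquad v(z-\sigma (z)) = \tfrac{1}{2} + v(u_1-u_2). \end{aligned}$$

Since \(v(x_1-\sigma (x_1))=v(x_2-\sigma (x_2))=1/2\), the minimum over \(\mathcal {B}_1\) is \(1/2\), and z attains it exactly when \(u_1-u_2\) is a unit, that is, when \(u_2 \pmod 3\) avoids the single residue class of \(u_1\). In particular \(z=x_1\) (with \(u_2=0\)) already attains the minimum here.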

For another automorphism \(\sigma _2\in G\), we proceed along the same lines, that is: if \(y_1\) has the property that \(v(y_1-\sigma _2(y_1))\leqslant v(z-\sigma _2(z))\) for all \(z\in \mathcal {B}_1\) we set \(y_2:=y_1\). If the above is not true, let \(z^*:=y_1+uz_{\sigma _2}\), for some \(u\in \mathcal {O}_K^\times \). Then as above we have \(v(z^*-\sigma _2(z^*))=v\left( z_{\sigma _2}-\sigma _2(z_{\sigma _2})\right) \), therefore \(z^*\) realizes the minimum for \(\sigma _2\), for all units u for which \(z^*\in \mathcal {B}_1\). For \(\sigma _1\), the worst that can happen is that \(v(y_1-\sigma _1(y_1))=v\left( z_{\sigma _2}-\sigma _1(z_{\sigma _2})\right) \), i.e. if we denote by \(\pi \) a uniformizer of \(\widetilde{K(P)}\), we have \(y_1-\sigma _1(y_1)=a\pi ^\alpha \) and \(z_{\sigma _2}-\sigma _1(z_{\sigma _2})=b\pi ^\alpha \), with \(a,b\in \mathcal {O}_{\widetilde{K(P)}}^\times \). Therefore \(z^*-\sigma _1(z^*)=(a+ub)\pi ^\alpha \). Now the residue field of \(\mathcal {O}_{\widetilde{K(P)}}\) is k, therefore by choosing \(u\in \mathcal {O}_K^{\times }\) such that \(a+ub \pmod \pi \ne 0\) we have \(v(z^*-\sigma _1(z^*))=v(y_1-\sigma _1(y_1))\) and therefore \(y_2:=z^*\) realizes the minima for both \(\sigma _1\) and \(\sigma _2\).

Continuing in this fashion, we arrive at the conclusion that there exist linear combinations of the form

$$\begin{aligned} z^*_1 = z_1 + \sum _{\sigma \in G}u_{\sigma }z_{\sigma } \end{aligned}$$
(4.4)

where \(u_{\sigma } \in (\mathcal {O}_K)^{\times }\cup \{0\}\) which satisfy the following four conditions:

  1. (1)

    \(z_{1}^* \in \mathcal {B}_1\),

  2. (2)

    \(K(z^*_1) \cong K(P)\),

  3. (3)

    \(v(z^*_1) = \min \{ v(x_1),\dots , v(x_g)\}\),

  4. (4)

    \(v(z^{*}_1 - \sigma (z^*_1)) = \min \{ v(x_1 - \sigma (x_1)),\dots , v(x_g - \sigma (x_g))\}\) for all \(\sigma \in G\),

as desired. Again, we pause to note that the matrix M representing the change of coordinates \((z_1^*,x_2,\dots ,x_g)^t= M (x_1,x_2,\dots ,x_g)^t\) is invertible. Indeed, the matrix M has units along the diagonal, the coefficients of the linear combination \(z_1^*\) in the first row, and zeros elsewhere, hence the determinant is clearly a unit.

We now wish to iterate this construction as follows. First, consider the set of all linear combinations

$$\begin{aligned} \mathcal {B}_2:=\{z' = u_1z_1^* + u_2x_2 + \cdots + u_gx_g \text { where } u_i\in (\mathcal {O}_K)^{\times } \cup \{0\} \text { and }u_2 \ne 0 \}. \end{aligned}$$
(4.5)

Then, we can repeat the above construction to arrive at a linear combination \(z_2 \in \mathcal {B}_2\) satisfying:

  1. (a)

    \(v(z') \geqslant v(z_2)\) for all other linear combinations \(z'\) from \(\mathcal {B}_2\),

  2. (b)

    \(K(z_2) \cong K(P)\).

Furthermore, we can follow the above construction to say that there exist linear combinations of the form

$$\begin{aligned} z^*_2 = z_2 + \sum _{\sigma \in G}u_{\sigma }z_{\sigma } \end{aligned}$$
(4.6)

where \(u_{\sigma } \in (\mathcal {O}_K)^{\times }\cup \{0\}\) which satisfy the following four conditions:

  1. (1)

    \(z_{2}^* \in \mathcal {B}_2\),

  2. (2)

    \(K(z^*_2) \cong K(P)\),

  3. (3)

    \(v(z^*_2) = \min \{ v(x_1),\dots , v(x_g)\}\),

  4. (4)

    \(v(z^{*}_2 - \sigma (z^*_2)) = \min \{ v(x_1 - \sigma (x_1)),\dots , v(x_g - \sigma (x_g))\}\) for all \(\sigma \in G\).

Note that the matrix \(M'\) representing the change of coordinates \((z_1^*,z_2^*,\dots ,x_g)^t = M'(z_1^*,x_2,\dots ,x_g)^t\) is invertible. Indeed, \(M'\) has units on the diagonal, the coefficients of the linear combination \(z_2^*\) in the second row, and zeros everywhere else. Although this matrix is not triangular, we can make it so by switching the second row with the first and interchanging the first and second columns; these operations will not change the determinant. After these operations, the matrix becomes triangular with units on the diagonal, and hence it is invertible. Moreover, we see that the matrix \(M''\) representing the change of coordinates \((z_1^*,z_2^*,\dots ,x_g)^t = M''(x_1,x_2,\dots ,x_g)^t\) is invertible since \(M'' = M' \cdot M\).
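
For orientation, the shape of these matrices can be written out in the case \(g=3\). The coefficient names \(a_i, b_i\) below are hypothetical labels introduced only for this sketch: we assume \(z_1^*=a_1x_1+a_2x_2+a_3x_3\) with \(a_1\) a unit and \(z_2^*=b_1z_1^*+b_2x_2+b_3x_3\) with \(b_2\) a unit. Then

$$\begin{aligned} M = \begin{pmatrix} a_1 & a_2 & a_3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad M' = \begin{pmatrix} 1 & 0 & 0 \\ b_1 & b_2 & b_3 \\ 0 & 0 & 1 \end{pmatrix}, \quad M'' = M'M = \begin{pmatrix} a_1 & a_2 & a_3 \\ b_1a_1 & b_1a_2+b_2 & b_1a_3+b_3 \\ 0 & 0 & 1 \end{pmatrix}. \end{aligned}$$

Expanding along the rows that agree with the identity gives \(\det M=a_1\), \(\det M'=b_2\) and \(\det M''=a_1(b_1a_2+b_2)-a_2b_1a_1=a_1b_2\), all units in \(\mathcal {O}_K\) under the stated assumptions.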

We continue in this fashion for all of the remaining coordinates \(x_3,\dots ,x_g\) and arrive at our desired claim, namely that there exist linear combinations \(z_1^*,\dots ,z_g^*\) of \(x_1,\dots ,x_g\) with coefficients in \((\mathcal {O}_K)^{\times } \cup \{0\}\) which satisfy:

  1. (1)

    \(K(z_{i}^*) \cong K(P)\),

  2. (2)

    \(v(z_{i}^*) = \min \{ v(x_{1}),\dots , v(x_{g})\}\),

  3. (3)

    \(v(z_{i}^* - \sigma (z_{i}^*)) = \min \{ v(x_{1} - \sigma (x_{1})),\dots , v(x_{g} - \sigma (x_{g}))\}\) for all \(\sigma \in G\)

and such that the matrix M representing the change of coordinates \((z_1^*,\dots ,z_g^*)^t= M (x_1,\dots ,x_g)^t\) is invertible. \(\square \)

We now complete the proof of Theorem 4.4.

Proof of Theorem 4.4

Fix \(0\ne P = (x_1,\dots ,x_g) \in {\mathscr {F}}[p](\mathcal {O})\) and let \(e = [K(P): K]\). We first remark that since K is the completion of the maximal unramified extension, we have that \(e > 1\), and hence the extension K(P)/K is totally ramified. Indeed, this follows from the fact that the group-scheme \({\mathscr {F}}[p]\) is connected. In Lemma 4.6, we constructed linear combinations \(z_i^*\) of \(x_1,\dots ,x_g\) satisfying certain properties. Let \(Z_i^*\) denote the same linear combinations of the coordinates \(X_1,\dots ,X_g\) (so if we were to evaluate \(Z_i^*\) at \((x_1,\dots ,x_g)\) we would recover \(z_i^*\)). The last condition from Lemma 4.6 implies that the change of variables \((X_1,\dots ,X_g) \mapsto (Z_1^*,\dots ,Z_g^*)\) is an isomorphism of formal groups.

We claim that the isomorphism of formal groups \((X_1,\dots ,X_g) \mapsto (Z_1^*,\dots ,Z_g^*)\) preserves strictness. As each of the \(Z_i^*\) is a linear combination of \(X_1,\dots ,X_g\) with coefficients in \((\mathcal {O}_K)^{\times } \cup \{0\}\), this isomorphism of formal groups acts linearly on terms of minimal degree, and hence it changes \(F_1,\dots ,F_g\) by a linear transformation, which is invertible. It also does the same to \(G_1,\dots ,G_g\), and therefore it transforms the set of solutions of the system (4.1) by an invertible transformation. In particular, whether the system has \((0,\dots ,0)\) as its only solution is unchanged by the isomorphism. We pause to note that it is crucial that the degrees \(d_1, d_2, \dots , d_g\) from Definition 4.1 are all equal.

For the remainder of the proof, we work with this isomorphic formal group with coordinates \((Z_1^*,\dots ,Z_g^*)\). We note that the vector \((z_1^*,\dots ,z_g^*)\) will reduce mod p to the point \((0,\dots ,0) \in k^g\) because all \(z_i^*\) have strictly positive valuation. But the \(z_i^*\) have the same valuation so we can divide all of them by one of them, and consider the vector \((z_1^*/z_g^*,\dots ,z_{g-1}^*/z_g^*,1)\), which will not reduce to the zero vector over the residue field. Since \({\mathscr {F}}\) was assumed to be strict, the reduction of \((z_1^*/z_g^*,\dots ,z_{g-1}^*/z_g^*,1)\) cannot be a common root of all of \(G_1,\dots ,G_g\). Therefore, there exists an index j for which \(G_j(z_1^*/z_g^*,\dots ,1)\) is not zero in the residue field k, and hence the valuation of \(F_j(z_1^*,\dots ,z_g^*)\) equals the valuation of each of its individual monomials.

We now want to determine the valuation of \(z_j^*\), and hence the valuation of every other \(z_i^*\) as they have the same valuation. By considering the equation \(f_j(z_1^*,\dots ,z_g^*) = 0\), we have

$$\begin{aligned} 0= & {} pz_j^* + p(\text {terms of degree between}\,2\hbox { and }d_j - 1) + F_j(z_1^*,\dots ,z_g^*) \\{} & {} + (\text {higher degree terms}). \end{aligned}$$

We claim that \(v(z_j^*) = 1/(d_j-1)\) where \(d_j\) is the degree of \(F_j\). To see this choose a unit \(u_1\) in \(\mathcal {O}_K\) that is a representative for the element in the residue field corresponding to \(z_1^*/z_g^*\), and similarly choose \(u_2,\dots ,u_{g-1}\). Then \(F_j(u_1,\dots ,u_{g-1},1)\) is a unit in \(\mathcal {O}_K\), because its image in the residue field is nonzero, by the above choice of j. But this is a form (of degree \(d_j\)) so we can divide by \(u_j\) inside \(F_j\), and re-denoting the \(u_j\)’s in consideration, we have that \(F_j( u_1,\dots ,u_{j-1}, 1, u_{j+1},\dots ,u_g)\) is a unit, and moreover,

$$\begin{aligned} F_j(z_1^*,\dots ,z_g^*) = F_j(u_1,\dots ,1,\dots ,u_g) z_j^{*^{d_j}} + ( \text {terms of strictly larger valuation}). \end{aligned}$$

Plugging this into the above equation, we arrive at the equation

$$\begin{aligned} 0 = pz_j^* + F_j(u_1,\dots ,1,\dots ,u_g) z_j^{*^{d_j}} + (\text {terms of strictly larger valuation}). \end{aligned}$$
(4.7)

The minimum valuation in the equality of (4.7) must be attained in at least two terms, and these terms are forced to be \(pz_j^*\) and \(F_j(u_1,\dots ,1,\dots ,u_g) z_j^{*^{d_j}}\). Our claim now follows since \(F_j(u_1,\dots ,1,\dots ,u_g) \) is a unit in \(\mathcal {O}_K\).
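
The valuation bookkeeping behind this step can be spelled out as follows. This is a sketch under the reading that every term of (4.7) other than the two displayed ones is either divisible by p with degree at least 2 in the \(z_i^*\), or has total degree larger than \(d_j\); we write \(\nu :=v(z_j^*)>0\) (a shorthand introduced here) and normalize \(v(p)=1\). Since all \(z_i^*\) have valuation \(\nu \),

$$\begin{aligned} v(pz_j^*)&= 1+\nu , \qquad v\bigl (F_j(u_1,\dots ,1,\dots ,u_g)\, z_j^{*^{d_j}}\bigr ) = d_j\nu , \\ v(p\cdot \text {monomial of degree } k\geqslant 2)&\geqslant 1+k\nu> 1+\nu , \qquad v(\text {monomial of degree } k> d_j) \geqslant k\nu > d_j\nu . \end{aligned}$$

If \(1+\nu \ne d_j\nu \), the minimum valuation would be attained by a single term and the sum could not vanish. Hence \(1+\nu =d_j\nu \), that is, \(\nu =1/(d_j-1)\).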

We now want to study the relationship between the valuations \(v(z_j^* - \sigma (z_j^*))\) where \(\sigma \in G\). Let \(u:= F_j(u_1,\dots ,1,\dots ,u_g)\) which is a unit in \(\mathcal {O}_K\). For each \(\sigma \in G\), we will consider (4.7) and

$$\begin{aligned} 0 = p\sigma (z_j^*) + u \sigma (z_j^{*})^{d_j} + (***) \end{aligned}$$
(4.8)

where \((***)\) corresponds to terms of strictly larger valuation. If we subtract equality (4.8) from (4.7), we arrive at the following:

$$\begin{aligned} 0 = p(z_j^* - \sigma (z_j^*)) + u(z_j^{*^{d_j}} - \sigma (z_j^{*})^{d_j}) + (\text {interesting terms}). \end{aligned}$$
(4.9)

By condition (3) of Lemma 4.6, the valuation of the “interesting terms” from (4.9) will be larger than \(v(z_j^* - \sigma (z_j^*))\). Indeed, these interesting terms are in fact monomials of the form \(z_1^{*^{m_1}} z_2^{*^{m_2}} \dots z_g^{*^{m_g}}\) where some of the \(m_i\) could be zero and the total degree is strictly greater than \(d_j\). In any case, we may deal with the difference of such monomials and their conjugates as follows. Suppose for example, that we have a term of the form \(z_2^{*^{m_2}}z_3^{*^{m_3}} - \sigma ( z_2^{*^{m_2}}z_3^{*^{m_3}})\). Then by adding and subtracting the term \(z_2^{*^{m_2}} \sigma ( z_3^{*^{m_3}})\), the difference we need to deal with can then be written as \((z_2^* - \sigma (z_2^*))\) times something of positive valuation, plus \((z_3^* - \sigma (z_3^*))\) times something of positive valuation, and so we can use property (3) of Lemma 4.6 for this particular \(\sigma \) to get our desired claim.

We now arrive at the crucial claim of the proof. Recall that G denotes the Galois group of the Galois closure of K(P)/K. We claim that \(v(z_j^* - \sigma (z_j^*)) = v(z_j^*)\) for all \(\sigma \in G\). Assume that \(v(z_j^* - \sigma (z_j^*)) = v(z_j^*) + t\) where \(t > 0\). By the above discussion and condition (3) of Lemma 4.6, we can save the value t from each term, and since there must be at least two terms of equal valuation in (4.9), we have that

$$\begin{aligned} v(p(z_j^* - \sigma (z_j^*))) \geqslant v(u(z_j^{*^{d_j}} - \sigma (z_j^{*})^{d_j})). \end{aligned}$$

Note that we cannot guarantee the equality of these valuations because there could be other terms of total minimal degree other than \(u(z_j^{*^{d_j}} - \sigma (z_j^{*})^{d_j})\), but the above inequality will suffice. Using the above inequality and previous equality \(v(pz_j^*) = v(uz_j^{*^{d_j}})\), and letting \(w = \sigma (z_j^*)/z_j^*\), we have that

$$\begin{aligned} t = v(z_j^* - \sigma (z_j^*)) - v(z_j^*) = v((z_j^* - \sigma (z_j^*))/z_j^*) = v(1 - w) \geqslant v(1 - w^{d_j}). \end{aligned}$$

Let \(y = 1 - w\). We have that \(v(y) = t\), which by assumption is strictly greater than 0. We now arrive at a contradiction by considering the above inequality and the equation

$$\begin{aligned} 1 - w^{d_j} = 1 - (1 - y)^{d_j} = d_jy + (\text {terms times }y^2) + y^{d_j}, \end{aligned}$$

and noting that all terms in the above have valuation strictly greater than \(v(y) = t\). Therefore, we have that \(t= 0 \), and hence \(v(z_j^* - \sigma (z_j^*)) = v(z_j^*)\) for all \(\sigma \in G\).
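
For completeness, the valuation estimate used in this last step can be made explicit via the binomial theorem. The only inputs are that \(d_j\) is a power of p with \(d_j\geqslant p\), that \(v(p)=1\), and that \(v(y)=t>0\):

$$\begin{aligned} 1-(1-y)^{d_j}&= \sum _{k=1}^{d_j}(-1)^{k+1}\binom{d_j}{k}y^k, \\ v\bigl (d_jy\bigr )&= v(d_j)+t \geqslant 1+t> t, \qquad v\Bigl (\binom{d_j}{k}y^k\Bigr ) \geqslant kt \geqslant 2t > t \quad (2\leqslant k\leqslant d_j). \end{aligned}$$

Thus \(v(1-w^{d_j})\geqslant \min \{1+t,2t\}>t\), which is incompatible with \(t\geqslant v(1-w^{d_j})\) unless \(t=0\).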

To conclude our proof, we use Krasner’s lemma [13, Exercise II.2.1] to explicitly describe the extension \(K(z_j^*)/K\) and show that \([K(z_j^*): K] = d_j - 1\). Recall that our \(z_j^*\) satisfies the equation (4.7). Consider the polynomial \(P(Z) = p + uZ^{d_j - 1}\) where \(u = F_j(u_1,\dots ,1,\dots ,u_g)\) as above. Note that \(z_j^*\) is not a root of P(Z), but by (4.7) it satisfies the inequality \(v(z_j^*P(z_j^*)) > v(pz_j^*) \), where the right side is also equal to \(v( u z_j^{*^{d_j}})\); equivalently, \(v(P(z_j^*)) > v(p) = 1\).

On the other hand, the roots \(\theta _1,\dots ,\theta _{d_j - 1}\) of P(Z) each have valuation exactly \(1/(d_j - 1)\) since P(Z) is Eisenstein at p. We now compute the valuation of the derivative of P(Z) evaluated at a root in two different ways. First, \(P'(\theta _l) = u \prod _{i\ne l} (\theta _l - \theta _i)\) and hence

$$\begin{aligned} v(P'(\theta _l)) = \sum _{i\ne l} v(\theta _l - \theta _i) \geqslant \frac{d_j - 2}{d_j - 1}. \end{aligned}$$

Second, we directly compute that \(P'(\theta _l) = (d_j - 1)u\theta _l^{d_j - 2},\) which yields

$$\begin{aligned} v(P'(\theta _l)) = \frac{d_j - 2}{d_j - 1}. \end{aligned}$$

The first inequality and the second equality imply that \(v(\theta _l - \theta _i) = 1/(d_j - 1)\) for all \(1 \leqslant i \ne l \leqslant d_j - 1\). We also have that \(P(z_j^*) = u(z_j^* - \theta _1)\cdots (z_j^* - \theta _{d_j - 1})\) and hence

$$\begin{aligned} v(P(z_j^*)) = \sum _{i = 1}^{d_j -1} v(z_j^* - \theta _i). \end{aligned}$$

There are exactly \(d_j - 1\) terms here and since \(v(P(z_j^*)) > 1\), it follows that at least one term must be strictly larger than \(1/(d_j - 1)\). Without loss of generality, we may assume that \(v(z_j^* - \theta _1) > 1/(d_j - 1).\) Now we have the strict inequality

$$\begin{aligned} v(z_j^* - \theta _1) > 1/(d_j - 1) = v(\theta _1 - \theta _i) \end{aligned}$$

for all \(1 < i \leqslant d_j - 1\), and hence Krasner’s lemma implies that \(K(\theta _1) \subseteq K(z_j^{*})\). Recall that we have just shown that \(v(z_j^* - \sigma (z_j^*)) = v(z_j^*) = 1/(d_j - 1)\) for all \(\sigma \in G\). Moreover, we can just switch \(z_j^*\) and \(\theta _1\) to arrive at the inequality

$$\begin{aligned} v(\theta _1 - z_j^*) > v(z_j^* - \sigma (z_j^*)) = 1/(d_j - 1), \end{aligned}$$

and so we can apply Krasner’s lemma again to deduce that \(K(z_j^*) \subseteq K(\theta _1)\). Therefore, we have shown that \(K(P)\cong K(z_j^*)\cong K(\theta _1)\) and this extension is totally ramified of degree \(d_j- 1\), hence tamely ramified. We also have that \(z_j^*\), which is a linear combination of \(x_1,\dots , x_g\) with unit coefficients, is a uniformizer for K(P) and hence \(\mathcal {O}_{K(P)} \cong \mathcal {O}_K[x_1,\dots ,x_g]\). Finally, we note that there exists some coordinate \(x_i\) of P which has valuation \(v(x_i) = v(z_j^*)\) by condition (2) of Lemma 4.6 and hence \(x_i\) is a uniformizer for K(P) as well.

The second statement of Theorem 4.4 follows because \(K({\mathscr {F}}[p])\) is the compositum of the fields K(P)/K as P varies over points in \({\mathscr {F}}[p]\) and the compositum of tamely ramified extensions is again tamely ramified.\(\square \)
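
As a sanity check, the steps of the proof can be traced in the simplest case, which we choose here purely for illustration: p an odd prime, \({\mathscr {F}}=\widehat{\mathbb {G}}_m\) with its standard coordinate (so \(g=1\), \(d_1=p\), \(u=1\)), and \(P=z:=\zeta _p-1\) for a primitive p-th root of unity \(\zeta _p\). In this case the objects of the proof read

$$\begin{aligned} 0&= (1+z)^p-1 = pz + z^p + \sum _{k=2}^{p-1}\binom{p}{k}z^k, \qquad v(z)=\frac{1}{p-1}, \\ P(Z)&= p+Z^{p-1}, \qquad \theta ^{p-1}=-p, \qquad K(\zeta _p)=K(\theta ). \end{aligned}$$

The last equality is the classical description of \(K(\zeta _p)/K\) as a totally and tamely ramified extension of degree \(p-1\) with uniformizer \(\zeta _p-1\), so that \(\mathcal {O}_{K(\zeta _p)}=\mathcal {O}_K[\zeta _p-1]\), in line with the statement of Theorem 4.4.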