1 Introduction

In [6], Bugeaud, Corvaja, and Zannier proved the following theorem.

Theorem 1.1

Let a, b be multiplicatively independent integers \(\ge 2\), and let \(\varepsilon > 0\). Then, provided n is sufficiently large, we have

$$\begin{aligned} \gcd (a^ n - 1,b ^n - 1) < \exp (\varepsilon n). \end{aligned}$$

The authors of that paper obtained the result by contradiction. They began by constructing a family of vectors in terms of n, a, and b. They then showed, using the Schmidt Subspace Theorem, that if the bound is not satisfied, the vectors must lie in a lower-dimensional linear subspace. From this they derived algebraic relations among powers of a and b, which force a and b to be multiplicatively dependent. In [28], Silverman interpreted this result as a special case of Vojta’s Conjecture. Theorem 1.1 was further generalized in [19], where Levin interpreted it as another special case of Vojta’s Conjecture.
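As a purely numerical illustration of Theorem 1.1 (it plays no role in the argument of [6]), the following minimal Python sketch computes \(\gcd (a^n-1,b^n-1)\) for the sample choice a = 2, b = 3 and reports \(\log \gcd / n\), the quantity that the theorem asserts is eventually smaller than any fixed \(\varepsilon \). The function names and the sample values of n are our own illustrative choices.

```python
from math import gcd, log

def gcd_of_shifted_powers(a, b, n):
    """Return gcd(a^n - 1, b^n - 1) using exact integer arithmetic."""
    return gcd(a**n - 1, b**n - 1)

def log_gcd_ratio(a, b, n):
    """Return log(gcd(a^n - 1, b^n - 1)) / n, to be compared with epsilon."""
    return log(gcd_of_shifted_powers(a, b, n)) / n

if __name__ == "__main__":
    # Illustrative sample: a = 2, b = 3 are multiplicatively independent.
    for n in (1, 2, 4, 12, 60, 120):
        print(n, gcd_of_shifted_powers(2, 3, n), round(log_gcd_ratio(2, 3, n), 4))
```

Such a table only samples finitely many n and therefore proves nothing; it merely shows the scale of the quantities involved.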

One may ask whether a similar inequality holds for iterations of polynomials, as iterations are dynamical analogues of power maps. It seems that current tools are not powerful enough to tackle this problem unconditionally. In [25] Silverman observed that one can interpret the greatest common divisor as a height function on some blowup of the projective plane. Furthermore, assuming Vojta’s Conjecture (cf. [31]), Silverman gave in [28] an upper bound for the greatest common divisor of the values of some polynomial functions, in terms of the absolute values of the initial points. See also [24] for an application of Silverman’s method to \(\gcd \) bounds of analytic functions. Many other authors have worked out various generalizations and variations of this problem, both over number fields and function fields (see [1, 8, 9, 10, 14, 17, 21, 27] for example). See also [13, 23] for related unlikely intersection results, interpreted in the context of Ailon–Rudnick type results [1].

In this paper, we apply Silverman’s method in the situation of iterations. In fact, we will prove a Silverman-type estimate for a fixed smaller iteration, and derive some results on gcd’s. However, there are some technical difficulties. First, in order to have the required operands of the greatest common divisor, one needs to blow up a Zariski closed subset in general (as opposed to subvarieties in [28]), depending on the prescribed constant \(\varepsilon \). Second, in the case of rational functions the numerators of iterates might not be iterates of any polynomial, so we need a more detailed analysis. We also need to control the degree of ramification; for this we need the very mild assumption that \(\alpha ,\beta \) are not exceptional.

Definition 1.2

Let X be an algebraic variety defined over \(\overline{{\mathbb {Q}}}\). We say that a sequence \((x_n)_n\subseteq X(\overline{{\mathbb {Q}}})\) is generic in X if for any proper Zariski closed subset \(Y\subsetneq X\), there exists an \(N\in {\mathbb {N}}\) such that for all \(n\ge N\), \(x_n\notin Y\). A point \(x_0\in \overline{{\mathbb {Q}}}\) is said to be exceptional for a rational function \(\phi \in \overline{{\mathbb {Q}}}(x)\) if the backward orbit \(\cup _{n = 0}^{\infty } (\phi ^{\circ n})^{-1}(\{x_0\})\) is finite.

A main result of this paper is the following theorem. It is based on Vojta’s Conjecture, which is described in Conjecture 2.7.

Theorem A

Assume Vojta’s Conjecture for blowups of \({\mathbb {P}}^1 \times {\mathbb {P}}^1\). Suppose \(a, b, \alpha ,\beta \in {\mathbb {Z}}\), and that \(f(x),g(x)\in {\mathbb {Z}}[x]\) are polynomials of degree \(d\ge 2\). Assume that \(\alpha ,\beta \) are not exceptional for f, g respectively. Assume that the sequence \((f^{\circ n}(a), g^{\circ n}(b))_n\) is generic in \(\overline{{\mathbb {Q}}}^2\). Then for each given \(\varepsilon >0\), there exists a constant \(C = C(\varepsilon ,a,b,\alpha ,\beta ,f,g)>0 \), such that for all \(n\ge 1\), we have

$$\begin{aligned} \gcd (f^{\circ n}(a)-\alpha , g^{\circ n}(b) -\beta ) \le C\cdot \exp ({\varepsilon \cdot d^n}). \end{aligned}$$

Remark

Let \(d_1 = \deg (f), d_2 = \deg (g)\). The result is trivial when \(d_1 \ne d_2\) and \(d = \max (d_1,d_2)\), and is proved in [8] for the case \(d_1 = d_2 =1\). We use the convention that \(\gcd (0,0) = 0\). This convention is needed for only finitely many n, since the sequence \((f^{\circ n}(a), g^{\circ n}(b))_n\) is generic, and hence so is \((f^{\circ n}(a)-\alpha , g^{\circ n}(b) -\beta )_n\).
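To make the quantity in Theorem A concrete, here is a small Python sketch that evaluates \(\gcd (f^{\circ n}(a)-\alpha , g^{\circ n}(b)-\beta )\) by exact integer arithmetic. The polynomials \(x^2+1\) and \(x^2+2\), the starting points \(a=b=1\) and the targets \(\alpha =2\), \(\beta =3\) are hypothetical sample data chosen by us; in particular, we have not verified that the resulting orbit is generic, so the sketch illustrates the objects only and is not an instance of the theorem.

```python
from math import gcd, log

def iterate(poly, x, n):
    """Apply the map `poly` to x exactly n times and return the result."""
    for _ in range(n):
        x = poly(x)
    return x

def orbit_gcd(f, g, a, b, alpha, beta, n):
    """Return gcd(f^n(a) - alpha, g^n(b) - beta) for integer-valued maps f, g."""
    return gcd(iterate(f, a, n) - alpha, iterate(g, b, n) - beta)

# Hypothetical sample data (assumptions made for illustration only).
def f_example(x):
    return x * x + 1

def g_example(x):
    return x * x + 2

if __name__ == "__main__":
    d = 2  # common degree of the two sample polynomials
    for n in range(1, 8):
        value = orbit_gcd(f_example, g_example, 1, 1, 2, 3, n)
        # Compare log gcd with d^n, the growth rate appearing in Theorem A.
        print(n, value, log(value) / d**n if value > 0 else None)
```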

In [32] Xie proved the Dynamical Mordell–Lang Conjecture for polynomial endomorphisms of the affine plane. Therefore the genericity of the sequence \((f^{\circ n}(a), g^{\circ n}(b))_n\) is equivalent to the Zariski density of \((f^{\circ n}(a), g^{\circ n}(b))_n\). On the other hand, Medvedev and Scanlon gave in [20] characterizations of periodic curves under split polynomial endomorphisms of \( {\mathbb {P}}^1 \times {\mathbb {P}}^1\). The equation of the curve should meet certain commutativity conditions, which are unlikely to hold in general. Therefore the genericity condition of the sequence \((f^{\circ n}(a), g^{\circ n}(b))_n\) is a mild condition.

Actually we will prove Theorem 2.11 and obtain Theorem A as a consequence. In [28] Silverman defined a more general gcd height which is the log of gcd in the case of rational integers. In the same paper he proved most results in this more general framework. See Sect. 2 for the precise definitions and statements. The idea of the proof of Theorem 2.11 is as follows. Following the idea of Silverman, we prove in Theorem 3.5 an upper bound for the greatest common divisor \(\mathrm {gcd}(F_1(a'), G_1(b'))\) for general square-free polynomials \(F_1, G_1\) and we will apply it (essentially) to \(\mathrm {gcd}(f^{\circ D}(a'), g^{\circ D}(b'))\) for some large D depending on \(\varepsilon \) and d with \(a' = f^{\circ (n-D)}(a)\) and \(b' = g^{\circ (n-D)}(b)\).

The plan of this paper is as follows. Section 2 contains a table of notation, basics of height functions and algebraic geometry, a statement of Vojta’s Conjecture, some results concerning the gcd height, and statements of other main theorems of this paper. We prove our main theorem concerning the gcd height in Sect. 3. In Sect. 4, we first cite a genericity criterion for the case when \(f = g\) are non-special polynomials, replacing the genericity condition. We also cite a theorem of Corvaja and Zannier for the case of power maps. At the end of Sect. 4 we give several examples to explain why the genericity condition in Theorem A is necessary; our policy is to include only results which are easy to state and hopefully clarify things greatly. In Sect. 5, we give a conditional result for characterizing large gcd’s.

2 Preliminaries

We use the following notation throughout this paper.

K:

a number field.

M(K):

the set of places of K.

\(n_v\):

the local degree \([K_v: {\mathbb {Q}}_w]\) where w is the contraction of v on \({\mathbb {Q}}\); the product formula has power \(n_v\) for the place v.

f, g:

rational functions defined over K.

d:

the degree of f and g.

\(h_{{\mathbb {P}}^n}\):

the Weil height on \({\mathbb {P}}^n(K)\).

\({\hat{h}}_f\):

the canonical height with respect to f.

\(f^{\circ n}\):

the n-th iterate of f.

\(|\cdot |_v\):

the v-adic absolute value.

\(v^+(\cdot )\):

\(\max (0, -\log |\cdot |_v)\).

For \(P =[x_0, \dots , x_n] \in {\mathbb {P}}^n({\overline{K}})\), choose a number field L over which P is defined and define the Weil height

$$\begin{aligned} h_{{\mathbb {P}}^n}(P) = \frac{1}{[L:{\mathbb {Q}}]} \sum _{v\in M(L)} n_v\max \left( \log |x_0|_v, \dots , \log |x_n|_v\right) . \end{aligned}$$
(2.1)

This definition is independent of the choice of L.
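For \(K = {\mathbb {Q}}\), formula (2.1) reduces to the familiar recipe: scale the coordinates to coprime integers and take the logarithm of the largest absolute value, since the non-archimedean places then contribute nothing. The following Python sketch implements this special case only; the function name `weil_height` is our own, and number fields other than \({\mathbb {Q}}\) are not handled.

```python
from fractions import Fraction
from functools import reduce
from math import gcd, lcm, log

def weil_height(coords):
    """Weil height of the point [x_0 : ... : x_n] of P^n(Q).

    The coordinates may be ints or Fractions, not all zero.
    """
    fracs = [Fraction(c) for c in coords]
    if all(f == 0 for f in fracs):
        raise ValueError("not a point of projective space")
    # Clear denominators, then divide out the common factor.
    denom = reduce(lcm, (f.denominator for f in fracs), 1)
    ints = [int(f * denom) for f in fracs]
    common = reduce(gcd, ints)
    # For coprime integer coordinates only the archimedean place contributes.
    return log(max(abs(i) for i in ints) // common)
```

For example, `weil_height([2, 4])` and `weil_height([1, 2])` agree, reflecting the fact that the height is a function on projective space.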

Suppose \(f: {\mathbb {P}}^1 \rightarrow {\mathbb {P}}^1\) is an endomorphism of degree \(d\ge 2\). Then following a construction of Tate, Call and Silverman defined in [7] the canonical height \({\hat{h}}_f\) associated with f as

$$\begin{aligned} {\hat{h}}_f(P) = \lim _{n\rightarrow \infty } \frac{h_{{\mathbb {P}}^1}\left( f^{\circ n}(P)\right) }{d^n} \end{aligned}$$

for all \(P\in {\mathbb {P}}^1({\overline{K}}) \).

Theorem 2.1

([7]) The canonical height satisfies

  1. \(\displaystyle {\hat{h}}_f(P) = h_{{\mathbb {P}}^1}(P) + O(1), \)

  2. \( {\hat{h}}_f(f(P)) = d \cdot {\hat{h}}_f(P).\)

See also Section 3.3 of [29] for more details. Here the implied constant in O(1) is effective and depends only on n and the morphism f, but not on the point \(P\in {\mathbb {P}}^n({\overline{K}}) \).
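The limit defining \({\hat{h}}_f\) can be approximated numerically by truncating at a finite n. The sketch below does this for a polynomial map and an integer starting point, where \(h_{{\mathbb {P}}^1}([x:1]) = \log \max (|x|,1)\). The helper names are hypothetical, and the truncation level n is a free parameter; the output is an approximation, not the exact canonical height.

```python
from math import log

def naive_height(x):
    """Weil height of the point [x : 1] for an integer x."""
    return log(max(abs(x), 1))

def canonical_height_approx(f, x, d, n):
    """Approximate the canonical height of x by h(f^n(x)) / d^n.

    f is an integer-valued polynomial map of degree d; n is the truncation level.
    """
    for _ in range(n):
        x = f(x)
    return naive_height(x) / d**n

if __name__ == "__main__":
    square_plus_one = lambda x: x * x + 1  # sample map, chosen for illustration
    for n in (1, 2, 4, 8):
        print(n, canonical_height_approx(square_plus_one, 2, 2, n))
```

For the power map \(x\mapsto x^2\) the truncated values already equal \(\log |x|\), in line with Item 1 of Theorem 2.1 holding with vanishing error term in that special case.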

Now we introduce some notions in algebraic geometry. For more information one may refer to [15].

Definition 2.2

Let \(R = {\overline{K}}[X_0, \dots , X_n]\) and let \(T \subseteq R\) be a set of homogeneous polynomials in \(X_0, \dots , X_n\). Every set

$$\begin{aligned} \mathrm {zero}(T) := \{ P \in {\mathbb {P}}^n({\overline{K}})~|~f(P) = 0 ~\text {for all}~f\in T\} \end{aligned}$$

is called a Zariski closed subset of \({\mathbb {P}}^n({\overline{K}})\). A Zariski closed subset \(V\subseteq {\mathbb {P}}^n({\overline{K}})\) is called a projective variety if it cannot be written as a union of two Zariski closed proper subsets.

To give a more general definition of height functions, we need the notion of divisors on nonsingular varieties. See Sections 1.5 and 2.6 of [15] for more details.

Definition 2.3

Let X be a nonsingular projective variety. The group of Weil divisors on X is the free abelian group generated by the closed subvarieties of codimension one on X. It is denoted by \(\mathrm {Div}(X)\). Denote by \(K(X)^*\) the multiplicative group of nonzero rational functions on X. Each rational function \(f\in K(X)^*\) gives a principal divisor

$$\begin{aligned} \mathrm {div}(f) = \sum _{Y\subsetneq X~\mathrm {codimension~1}} \mathrm {ord}_Y(f) \cdot Y. \end{aligned}$$

The group \(\mathrm {Div}(X)\) divided by the subgroup of principal divisors is called the divisor class group of X.

Remark

In the case when X is nonsingular, the class group is canonically isomorphic to the group \(\mathrm {Pic}(X)\). For the definition of the latter, see Section 2.6 of [15].

Definition 2.4

Suppose \(D \in \mathrm {Div}(X)\). The complete linear system of D is the set

$$\begin{aligned} L(D) = \{f\in K(X)^*~|~D + \mathrm {div}(f) \ge 0\} \cup \{0\}. \end{aligned}$$

If \(L(D) \ne 0\), then L(D) induces a rational map \(\phi _D: X \dashrightarrow {\mathbb {P}}^n\). For more details refer to Section A.3 of [16].

Definition 2.5

A divisor \(D\in \mathrm {Div}(X)\) is said to be very ample if the above map \(\phi _D\) is an embedding. A divisor D is said to be ample if some positive integral multiple nD of D is very ample.

Fix a nonsingular variety X defined over K. For each divisor \(D\in \mathrm {Div}(X)\) defined over K we can define height functions \(h_{X, D}: X({\overline{K}}) \rightarrow {\mathbb {R}}\) as below. For more details, including that these height functions are well-defined, refer to [16], Theorem B.3.2.

  • If D is very ample, choose an embedding \(\phi _D: X\rightarrow {\mathbb {P}}^n\). Then define \(h_{X,D}(x) = h_{{\mathbb {P}}^n}(\phi _D(x))\).

  • If D is ample, choose a positive integer n such that nD is very ample, and define \(h_{X, D} = 1/n\cdot h_{X, nD}\).

  • In general, we can write \(D = D_1 - D_2\) with \(D_1, D_2\) ample, and define \(h_{X, D} = h_{X, D_1} - h_{X, D_2}\).

The following theorem is one of the most important results in Diophantine geometry. See also Sections 2.3 and 2.4 of [5] and Chapter 4 of [18].

Theorem 2.6

(The Weil Height Machine, part of [16], Theorem B.3.2) In the context of the above paragraphs, the height functions constructed in this way are determined up to O(1). They satisfy the following properties.

  • Let \(D, E\in \mathrm {Div}(X)\). Then \(h_{X, D+E} = h_{X, D} + h_{X, E} + O(1)\).

  • (Northcott’s Theorem) Let \(D\in \mathrm {Div}(X)\) be ample. Then for every finite extension \(K'/K\) and every constant B, the set

    $$\begin{aligned} \{P\in X(K')~|~h_{X, D}(P) \le B \} \end{aligned}$$

    is finite.

  • Let \(D, E\in \mathrm {Div}(X)\) with \(D = E + \mathrm {div}(f)\). Then

    $$\begin{aligned} h_{X,D}(P) = h_{X,E}(P) + O(1) \end{aligned}$$

    for all \(P\in X({\overline{K}})\).

Remark

Formula (2.1) can be thought of as \(h_{{\mathbb {P}}^n, H}\) in the context of Theorem 2.6 where H is a hyperplane in \({\mathbb {P}}^n\).

Remark

The \(``O(1)''\) constants that appear in Theorem 2.6 depend on the varieties, divisors, and morphisms, but they are independent of the points on the varieties. In the sense of Theorem 2.6, height functions that differ globally by O(1) can be thought of as the same height function. Therefore in terms of Item 1 of Theorem 2.1, the canonical height \({\hat{h}}_{f}\) can be thought of as \(h_{{\mathbb {P}}^1, H}\) where H is a point in \({\mathbb {P}}^1\).

Intuitively, ampleness is a positivity notion on algebraic varieties and it is closely related to height functions. The more “ample” a divisor D is, the more “positive” the height function \(h_{X,D}\) is. We will use the following version of Vojta’s Conjecture. It is Conjecture 3.4.3 of the monograph [31]. For the definition of normal crossing divisor, see Chapter 5, Remark 3.8.1 of [15].

Conjecture 2.7

(Vojta) Let K be a number field, and let X be a nonsingular projective variety defined over K. Suppose A is an ample normal crossing divisor on X and \(K_X\) is the canonical divisor of X, both defined over K. Let \(h_A\) and \(h_{K_X}\) be the corresponding height functions respectively. For each fixed \(\varepsilon >0\), there is a Zariski closed proper subset V of X and a constant C such that

$$\begin{aligned} h_{K_X}(x) \le \varepsilon \cdot h_A(x)+ C \end{aligned}$$

for all \(x\in X(K){\setminus } V(K)\).

In fact, for an algebraic variety X, Silverman defined in [25] height functions \(h_{X,Y}\) with respect to any closed subscheme Y. For our purpose it is enough to recall the following part.

Theorem 2.8

([25]) Let X be a projective variety and let Z(X) denote the set of closed subschemes of X. For each \(Y\in Z(X)\) there is a map \(h_{X,Y}: X({\overline{K}}) \rightarrow \mathbb {{R}}_{\ge 0}\) such that these \(h_{X,Y}\) satisfy the following conditions:

  1. If \(D \in Z(X)\) is a positive divisor, then \(h_{X,D}\) is the usual height function associated to D given by Theorem 2.6;

  2. Let \(\phi : X\rightarrow X'\) be a morphism of varieties, and let \(Y'\in Z(X')\). Then

    $$\begin{aligned} h_{X,\phi ^*{Y'}} = h_{X',Y'}\circ \phi + O(1). \end{aligned}$$
    (2.2)

Concerning the relationship between the greatest common divisor and heights, we briefly recall Silverman’s idea in [28]. For all \(v\in M({{\mathbb {Q}}})\) and \(a\in {\mathbb {Z}}\), recall that \(v^+(a) = \max (-\log |a|_v, 0)\in [0, +\infty ]\). Silverman began his discussion in [25] by writing the greatest common divisor as

$$\begin{aligned} \log {\mathrm {gcd}}(a,b) = \sum _{v\in M({{\mathbb {Q}}})} \min (v^+(a),v^+(b)) \end{aligned}$$
(2.3)

for \(a,b\in {\mathbb {Z}}\). He then extended this function to \(a,b\in {\mathbb {Q}}\) by the same formula. By the last paragraph on page 337 of [28], Eq. (2.3) can be interpreted as the height function on \({\mathbb {P}}^1 \times {\mathbb {P}}^1\) with respect to the subscheme (0, 0), and furthermore as a height function associated with a divisor on the blowup of \({\mathbb {P}}^1 \times {\mathbb {P}}^1\) along (0, 0). See page 163 of [15] for the definition of blowup and strict transform. See pages 28–29 of [15] for a concrete example of blowing up a point.
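Formula (2.3) can be checked directly for nonzero integers: at a prime p one has \(v^+(a) = v_p(a)\log p\), and the archimedean place contributes \(\min (v^+(a),v^+(b)) = 0\). The following Python sketch evaluates the right-hand side of (2.3) place by place; the function names are our own, and the trial-division factorization is only meant for small inputs.

```python
from math import gcd, log

def valuation(n, p):
    """Exponent of the prime p in the nonzero integer n."""
    n, v = abs(n), 0
    while n % p == 0:
        n //= p
        v += 1
    return v

def prime_divisors(n):
    """Set of primes dividing the nonzero integer n (trial division)."""
    n, p, primes = abs(n), 2, set()
    while p * p <= n:
        if n % p == 0:
            primes.add(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes

def log_gcd_via_places(a, b):
    """Right-hand side of (2.3) for nonzero integers a and b."""
    if a == 0 or b == 0:
        raise ValueError("a and b must be nonzero")
    # Only primes dividing both a and b give a nonzero minimum.
    common = prime_divisors(a) & prime_divisors(b)
    return sum(min(valuation(a, p), valuation(b, p)) * log(p) for p in common)

if __name__ == "__main__":
    a, b = 360, 84
    print(log_gcd_via_places(a, b), log(gcd(a, b)))
```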

Proposition 2.9

([15], Chapter 5, Proposition 3.1) Let \(\pi : {\tilde{W}}\rightarrow W\) be the blowup of a nonsingular surface W at a point P. Then

  1. \(\pi \) induces an isomorphism of \({\tilde{W}}-\pi ^{-1}(P)\) and \(W-P\),

  2. The set \(E: = \pi ^{-1}(P)\) is isomorphic to \({\mathbb {P}}^{1}\). It is called the exceptional divisor of the blowup \(\pi \),

  3. \({{\tilde{W}}}\) is nonsingular.

The following definition is a slight generalization of that given by Silverman in [28].

Definition 2.10

Let K be a number field and let X / K be a smooth variety. Let \(Y/K \subsetneq X/K\) be a subscheme of codimension \(r \ge 2\). Let \( \pi :{\tilde{X}}\rightarrow X\) be the blowup of X along Y, and let \({\tilde{Y}} = \pi ^{-1} (Y ) \) be the exceptional divisor of the blowup. For \(x\in (X - Y)(K)\), we let \({\tilde{x}} = \pi ^{-1} (x) \in {\tilde{X}}\). The generalized (logarithmic) greatest common divisor of the point \(x\in (X - Y)(K)\) with respect to Y is the quantity

$$\begin{aligned} h_{\gcd } (x;Y) := h_{X,Y}(x) = h_{{\tilde{X}}, {\tilde{Y}}} ({\tilde{x}}) \end{aligned}$$

where the last equality follows from (2.2).

For a number field K and for \(a,b\in K\) we also define the generalized gcd as

$$\begin{aligned} h_{\gcd }(a,b) = \frac{1}{[K:{\mathbb {Q}}]}\sum _{v\in M(K)} n_v \min ( v^+(a), v^+(b)). \end{aligned}$$
(2.4)

As a consequence of the Weil height machine, the relationship between \(h_{\gcd }\) as in Definition 2.10 and in Eq. (2.4) is shown at the end of this paragraph. See [25, 28] for some interesting cases over \({\mathbb {Z}}\) where the contribution from the places at infinity is zero or bounded. Suppose K is a number field. Let \(X = {\mathbb {P}}^1 \times {\mathbb {P}}^1\) and let \(f(X_1)\in K[X_1], g(X_2)\in K[X_2]\) be square-free polynomials. Then over \({\overline{{\mathbb {Q}}}}\) the vanishing sets Z(f) and Z(g) define two divisors \(D_1\) and \(D_2\) on X. Set \(Y = D_1 \cap D_2\). Then for all points \(x = (x_1, x_2)\in ({\mathbb {P}}^1 \times {\mathbb {P}}^1)(K)\) such that \(f(x_1) \ne 0\) and \(g(x_2) \ne 0\), we have

$$\begin{aligned} \begin{aligned} h_{\gcd }\left( f(x_1),g(x_2) \right)&= h_{{\mathbb {P}}^1 \times {\mathbb {P}}^1, (0,0)}(f(x_1), g(x_2)) \\&= h_{{\mathbb {P}}^1 \times {\mathbb {P}}^1, (f,g)^{*}(0,0)}(x_1, x_2) +O(1)\\&= h_{X,Y}(x) +O(1) \\&= h_{\mathrm {gcd}}(x;Y) +O(1), \end{aligned} \end{aligned}$$

where the second equality follows from (2.2).

Our goal is to prove the following theorem.

Theorem 2.11

Let K be a number field. Assume Vojta’s Conjecture for blowups of \({\mathbb {P}}^1 \times {\mathbb {P}}^1\) and K. Suppose \(a,b,\alpha ,\beta \in K\). Let \(f,g\in K(X)\) be rational functions of degree \(d\ge 2\). Assume that the sequence \((f^{\circ n}(a) - \alpha , g^{\circ n}(b) - \beta )_n \subseteq {\mathbb {P}}^1(\overline{{\mathbb {Q}}})\times {\mathbb {P}}^1(\overline{{\mathbb {Q}}})\) is generic, and \(\alpha \) and \(\beta \) are not exceptional for f and g respectively. Then for each given \(\varepsilon >0\), there exists a constant \(C = C(\varepsilon ,a,b,\alpha ,\beta ,f,g)\) such that for all \(n\ge 1\), we have

$$\begin{aligned} h_{\mathrm {gcd}}(f^{\circ n}(a)-\alpha , g^{\circ n}(b) -\beta ) \le {\varepsilon \cdot d^n} + C. \end{aligned}$$

We can also conclude the periodicity of an irreducible component of the Zariski closure \(\overline{(f^{\circ n}(a), g^{\circ n}(b))_n}\) under (f, g) in the cases when the Dynamical Mordell–Lang Conjecture is proved. See Sect. 4 for more details. Thanks to the powerful theorems proved in [2, 20, 22], we can give some concrete conditions for \((f^{\circ n}(a),g^{\circ n}(b))_n\) being generic in the case when \(f=g\) are so-called non-special polynomials (see Sect. 4).

Theorem 2.12

Let K be a number field and \(f\in K[x]\) be a polynomial of degree \(d\ge 2\). Assume Vojta’s Conjecture for blowups of \({\mathbb {P}}^1 \times {\mathbb {P}}^1\) and for K. Assume that f is not conjugate (by a rational automorphism defined over \({\overline{K}}\)) to a power map or a Chebyshev map. Suppose \(a,b,\alpha , \beta \in K\) and \(\alpha ,\beta \) are not exceptional for f. Assume that there is no polynomial \(h\in {\overline{K}}[x]\) such that \(h\circ f^{\circ k} = f^{\circ k}\circ h\) for some \(k\in {\mathbb {N}}_{>0} \) and either \(h(a) = b,~h(\alpha ) = \beta \) or \(h(b) =a,~h(\beta ) = \alpha \). Then for any \(\varepsilon >0\), there exists a constant \(C = C(\varepsilon ,a,b,\alpha ,\beta ,f) >0\) such that for all \(n\ge 1\), we have

$$\begin{aligned} h_{\gcd }(f^{\circ n}(a) - \alpha , f^{\circ n}(b) - \beta ) \le \varepsilon \cdot d^n + C. \end{aligned}$$

3 The Proof of Theorem 2.11

Throughout this section we denote by X the surface \({\mathbb {P}}^1 \times {\mathbb {P}}^1\).

3.1 Algebraic geometry of \({\mathbb {P}}^1 \times {\mathbb {P}}^1\) and its blowups

By Chapter 2, Example 6.6.1 of [15] we have

$$\begin{aligned} \begin{aligned} \mathrm {Pic}(X) \cong {\mathbb {Z}} \oplus {\mathbb {Z}}. \end{aligned} \end{aligned}$$

where the image of the divisor class of an irreducible curve C is the pair of degrees of its projections to the two coordinates \((\deg (\mathrm {pr}_1: C \rightarrow {\mathbb {P}}^1), \deg (\mathrm {pr}_2: C\rightarrow {\mathbb {P}}^1))\). More generally, if the image of a divisor \(D\in \mathrm {Pic}(X)\) is (a, b), then we say that D is of type (a, b). Fix \(D_1\in \mathrm {Pic}(X)\) to be a divisor of type (1, 0) and fix \(D_2\in \mathrm {Pic}(X)\) to be a divisor of type (0, 1). The intersection product on X is given by the rule

$$\begin{aligned} ((a,b).(a',b')) = ab' + a'b \end{aligned}$$
(3.1)

and is extended by \({\mathbb {Q}}\)-linearity to \(\mathrm {Pic}(X) \otimes {\mathbb {Q}}\). In other words, with respect to the basis \(D_1, D_2\), the intersection product on X is determined by the matrix

$$\begin{aligned} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \end{aligned}$$
(3.2)

Let K be a number field. Recall that a one-variable polynomial over a field K is called square-free if it does not have repeated roots in \({\overline{K}}\). Suppose \(f \in K[X_1]\) and \(g\in K[X_2]\) are square-free polynomials in one variable. Let Y be the scheme-theoretic intersection

$$\begin{aligned} Y = Z(f) \cap Z(g)\subseteq {\mathbb {P}}^1 \times {\mathbb {P}}^1, \end{aligned}$$

that is, the subscheme defined by the ideal \((f) + (g)\). Then Y is a reduced cycle of codimension 2, as we now verify.

Suppose \(Z(f) = \{ \alpha _1, \dots , \alpha _m\}\), \(Z(g) = \{\beta _1, \dots , \beta _n\}\). Then

$$\begin{aligned} Y = \cup _{1\le i\le m, 1\le j \le n}\{ (\alpha _i, \beta _j)\}, \end{aligned}$$

each with multiplicity one. Moreover, the divisors \(\{X_1 = \alpha _i\}\) and \(\{X_2 = \beta _j\}\) meet transversally, hence Y is a reduced cycle of codimension 2. To simplify notation write \(Y= \{Q_1, \dots , Q_s \}\). Let \(\pi : {\tilde{X}}\rightarrow X\) be the blowup of \(X ={\mathbb {P}}^1 \times {\mathbb {P}}^1\) along Y, let \({\tilde{Y}}\) be the preimage of Y, and for a point \(P\in X{\setminus } Y\) let \({\tilde{P}}\) be the preimage of P. Then \({\tilde{X}}\) is a nonsingular variety by Proposition 2.9.

The following properties are useful to find the canonical divisor and an ample divisor on \({\tilde{X}}\).

Proposition 3.1

([15], Chapter 5, Propositions 3.2 and 3.3) Suppose \(\pi : {\bar{X}} \rightarrow X\) is the blowup of a surface X at a point P and let E be the exceptional divisor. The natural maps \(\pi ^*: \mathrm {Pic} (X) \rightarrow \mathrm {Pic}({\bar{X}})\) and \({\mathbb {Z}} \rightarrow \mathrm {Pic} ({\bar{X}})\) defined by \(1\mapsto E\) give rise to an isomorphism \(\mathrm {Pic} ({\bar{X}}) \rightarrow \mathrm {Pic} (X) \oplus {\mathbb {Z}}\). Let \(\pi _{*}: \mathrm {Pic} ({\bar{X}}) \rightarrow \mathrm {Pic} (X)\) denote the projection on the first factor. The intersection theory on \({\bar{X}}\) is determined by the rules:

  1. if \(C,D\in \mathrm {Pic}(X)\), then \((\pi ^*C.\pi ^*D) = (C. D)\),

  2. if \(C\in \mathrm {Pic}(X)\), then \((\pi ^*C.E) = 0\),

  3. it holds that \(E^2 = -1\),

  4. (a special case of the projection formula) if \(C\in \mathrm {Pic}(X)\) and \(D\in \mathrm {Pic}({\bar{X}})\), then \((\pi ^*C. D) = (C. \pi _* D)\).

Moreover, the canonical divisor of \({\bar{X}}\) is given by \(K_{{\bar{X}}} = \pi ^*K_X + E\) where E is the exceptional divisor.

Since the blowup along Y does not involve the blowup at a point on an exceptional curve, we have that \({\tilde{X}}\) can be obtained by blowing up s distinct points on X one by one. Therefore applying Proposition 3.1 s times yields

$$\begin{aligned} \mathrm {Pic}({\tilde{X}}) \cong \mathrm {Pic}({X})\bigoplus \bigoplus _{i=1}^s {\mathbb {Z}}\cdot {\tilde{Y}}_i. \end{aligned}$$

Define \(\pi ^*\) and \(\pi _*\) similarly as in Proposition 3.1.

Since the \({\tilde{Y}}_i\) are the preimages of the distinct points \(Q_i\), we have \(({\tilde{Y}}_i. {\tilde{Y}}_i) = -1\) by Item 3 of Proposition 3.1, while for \(i\ne j\) the curves \({\tilde{Y}}_i\) and \({\tilde{Y}}_j\) do not intersect, so \(({\tilde{Y}}_i. {\tilde{Y}}_j) = 0\). Therefore \(({\tilde{Y}}_i. {\tilde{Y}}_j) = -\delta _{ij}\). Combining this with Eq. (3.1) and Items 1 and 2 of Proposition 3.1, we see that the intersection product matrix on \(\mathrm {Pic}({\tilde{X}}) \otimes {\mathbb {Q}}\), with respect to the basis \(\pi ^*D_1, \pi ^*D_2, {\tilde{Y}}_1, \dots , {\tilde{Y}}_s\), is

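$$\begin{aligned} \begin{pmatrix} 0 & 1 & & & \\ 1 & 0 & & & \\ & & -1 & & \\ & & & \ddots & \\ & & & & -1 \end{pmatrix} \end{aligned}$$
(3.3)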

with empty entries zero.

By Proposition 3.1

$$\begin{aligned} \begin{aligned} K_{{\tilde{X}}} = \pi ^*K_X + {\tilde{Y}}_{1} + \dots + {\tilde{Y}}_{s} \end{aligned} \end{aligned}$$

where each \({\tilde{Y}}_{i}\) is the preimage of \(Q_i\).

We can choose \(-K_X\) to be the normal crossing divisor \(\{X_1 = a \} + \{X_1 = b\} + \{X_2 = a'\} + \{ X_2 = b'\}\) where \(a,b,a',b'\) are distinct nonzero algebraic numbers in K. By Definition 2 of [28], we still have \(h_{\mathrm {gcd}} (P;Y ) = h_{{\tilde{X}}, {\tilde{Y}}}({\tilde{P}}).\)

To apply Vojta’s Conjecture, let \(A\in \mathrm {Pic}(X)\) be an ample divisor of type (1, 1), let N be a positive integer, and consider the \({\mathbb {Q}}\)-divisor

$$\begin{aligned} \begin{aligned} {{\tilde{A}}}: = \pi ^*A - \frac{1}{N}\left( {\tilde{Y}}_1 + \dots + {\tilde{Y}}_s\right) \in \mathrm {Pic}({\tilde{X}})\otimes {\mathbb {Q}}. \end{aligned} \end{aligned}$$

Lemma 3.2

\({\tilde{A}}\) is ample when \(N>s\).

Proof

We need the following definition from Chapter 1, Exercise 5.3 of [15].

Definition 3.3

Let \(Y \subseteq {\mathbb {A}}^2\) be a curve defined by the equation \(f(X_1,X_2) = 0\). Let \(P = (x_1,x_2)\) be a point of \({\mathbb {A}}^2\). Make a linear change of coordinates so that P becomes the point (0, 0). Then write f as a sum \(f = f_0 + f_1 + \ldots + f_d\), where \(f_i\) is a homogeneous polynomial of degree i in \(X_1\) and \(X_2\). Then we define the multiplicity of P on Y, denoted \(\mu _P(Y)\), to be the least r such that \(f_r\ne 0\).

We also need the following lemma, which we state without proof.

Lemma 3.4

([15], Chapter 1, Exercise 7.5(a)) An irreducible curve Y of degree \(d > 1\) in \({\mathbb {P}}^2\) cannot have a point of multiplicity \(\ge d\).

Now let \(C\subseteq {\mathbb {P}}^1 \times {\mathbb {P}}^1\) be an irreducible curve of type (a, b). Let \({\tilde{C}}\) be its strict transform. By Lemma 3.4 we know that C cannot have a point of multiplicity \(\ge \deg (C)\). By Chapter 5, Proposition 3.6 of [15],

$$\begin{aligned} ({\tilde{Y}}_i.{\tilde{C}}) = ({\tilde{Y}}_i.\pi ^*C-\mu _{Q_i}(C)\cdot {\tilde{Y}}_i) = \mu _{Q_i}(C). \end{aligned}$$
(3.4)

Now let \(\mathrm {pr}_i: C\rightarrow {\mathbb {P}}^1\) be the projection to the i-th coordinate. Then \(\deg \mathrm {pr}_1 = a,~\deg \mathrm {pr}_2 = b\). That is to say, if we restrict C to \({\mathbb {A}}^2\), then the defining equation has degree b in \(X_1\) and degree a in \(X_2\). It follows that \(\deg (C) \le a +b\). By Item 4 of Proposition 3.1, we have

$$\begin{aligned} \left( \pi ^*A. {\tilde{C}}\right) = \left( A.\pi _*{\tilde{C}}\right) = \left( A. C\right) = a+b. \end{aligned}$$

Then by Eq. (3.4) and linearity

$$\begin{aligned} \begin{aligned} ({\tilde{A}}. {\tilde{C}})&= \left( \pi ^*A. {\tilde{C}}\right) - \frac{1}{N}\left( ({\tilde{Y}}_1.{\tilde{C}}) + \dots + ({\tilde{Y}}_s.{\tilde{C}})\right) \\&= a + b - \frac{1}{N}\left( \mu _{Q_1}(C) + \dots + \mu _{Q_s}(C)\right) \\&\ge a +b - \frac{1}{N} \cdot s \cdot (a+b) \\&> 0 \end{aligned} \end{aligned}$$

as \(N> s \).

We also have

$$\begin{aligned} \begin{aligned} ({\tilde{A}}. {\tilde{Y}}_i)&= \left( \pi ^*A. {\tilde{Y}}_i\right) - \frac{1}{N}\left( ({\tilde{Y}}_1.{\tilde{Y}}_i) + \dots + ({\tilde{Y}}_s.{\tilde{Y}}_i)\right) \\&= 0 - \frac{1}{N} \left( -\delta _{1i} - \dots - \delta _{si}\right) \\&= \frac{1}{N}. \end{aligned} \end{aligned}$$

Finally by the previous equality

$$\begin{aligned} \begin{aligned} ({\tilde{A}}. {\tilde{A}})&= \left( {\tilde{A}}.~ \pi ^*A\right) - \frac{1}{N}\left( ({\tilde{A}}.~ {\tilde{Y}}_1) + \dots + ({\tilde{A}}.~ {\tilde{Y}}_s)\right) \\&= \left( \pi _{*}{\tilde{A}}.~A\right) -\frac{1}{N}\cdot \frac{s}{N} \\&= (A.A) - \frac{s}{N^2} \\&= 2 - \frac{s}{N^2} \\&>0 \end{aligned} \end{aligned}$$

as \(N >s\).

But

$$\begin{aligned} \mathrm {Pic}({\tilde{X}}) = \pi ^*\mathrm {Pic}({X})\bigoplus \bigoplus _{i=1}^s {\mathbb {Z}}\cdot {\tilde{Y}}_i, \end{aligned}$$

and every effective curve C in \({\tilde{X}}\) is linearly equivalent to a non-negative combination of the \({\tilde{Y}}_i\) and the strict transforms of effective curves in X, so \({\tilde{A}}\) is ample by the Nakai–Moishezon criterion (see Chapter 5, Theorem 1.10 of [15]). \(\square \)
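The intersection numbers used in the proof of Lemma 3.2 are easy to check by machine. The following Python sketch encodes a class in \(\mathrm {Pic}({\tilde{X}})\otimes {\mathbb {Q}}\) as a vector of coordinates with respect to \(\pi ^*D_1, \pi ^*D_2, {\tilde{Y}}_1,\dots ,{\tilde{Y}}_s\) and uses the rules (3.1) and \(({\tilde{Y}}_i.{\tilde{Y}}_j) = -\delta _{ij}\). The helper names, and the sample values of s, N and the multiplicities, are our own assumptions for illustration.

```python
from fractions import Fraction

def intersection_matrix(s):
    """Gram matrix of the intersection form on Pic of the blowup at s points."""
    size = s + 2
    m = [[Fraction(0)] * size for _ in range(size)]
    m[0][1] = m[1][0] = Fraction(1)   # rule (3.1) on P^1 x P^1
    for i in range(2, size):
        m[i][i] = Fraction(-1)        # exceptional curves have self-intersection -1
    return m

def intersect(m, u, v):
    """Bilinear pairing u^T m v, computed with exact rationals."""
    return sum(u[i] * m[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

def a_tilde(s, n):
    """Class pi^*A - (1/n)(Y_1 + ... + Y_s) with A of type (1, 1)."""
    return [Fraction(1), Fraction(1)] + [Fraction(-1, n)] * s

def strict_transform(a, b, multiplicities):
    """Class pi^*C - sum_i mu_i Y_i for a curve C of type (a, b)."""
    return [Fraction(a), Fraction(b)] + [Fraction(-mu) for mu in multiplicities]

if __name__ == "__main__":
    s, n = 4, 5  # sample values with n > s
    m = intersection_matrix(s)
    print(intersect(m, a_tilde(s, n), a_tilde(s, n)))                       # 2 - s/n^2
    print(intersect(m, a_tilde(s, n), strict_transform(2, 3, [1, 0, 2, 0])))  # a + b - (sum mu_i)/n
```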

3.2 The proof, continued

We first prove the following modification of Theorem 2 of [28].

Theorem 3.5

With notation as in Sect. 3.1, let K be a number field. Suppose \(f \in K[t_1]\) and \(g\in K[t_2]\) are square-free polynomials in one variable. Let

$$\begin{aligned} Y = Z(f) \cap Z(g)\subseteq X = {\mathbb {P}}^1 \times {\mathbb {P}}^1\end{aligned}$$

as in Sect. 3.1. Also recall from there that \({\tilde{X}}\) is the blowup of X along Y.

Assume that Vojta’s conjecture is true for \({\tilde{X}}\) over K. Fix \(\varepsilon > 0\). Then there is an algebraic subset \(V\subsetneq {\mathbb {P}}^1 \times {\mathbb {P}}^1\), depending on f, g and \(\varepsilon \), so that for each \(P = (x_1,x_2) \in {\mathbb {P}}^1(K)\times {\mathbb {P}}^1(K)\), either

  1. \(P\in V\), or

  2. \(h_{\gcd } ( f(x_1),g (x_2) )\le (3 + \varepsilon ) \left( h(x_1) +h( x_2) \right) + O(1).\)

Proof of Theorem 3.5

We follow the proof in [28]. By Lemma 3.2 and assuming Vojta’s Conjecture we have

$$\begin{aligned} \begin{aligned} h_{{\tilde{X}}, K_{{\tilde{X}}}}({\tilde{P}}) \le \varepsilon \cdot h_{{\tilde{X}}, {{\tilde{A}}}}({\tilde{P}}) + C_\varepsilon \end{aligned} \end{aligned}$$

for all \(P\in X(K){\setminus } V(K)\). Also \( K_{{\tilde{X}}} = \pi ^* K_X + {\tilde{Y}}\) and \({\tilde{A}} = \pi ^* A - 1/N\cdot {\tilde{Y}}\), so

$$\begin{aligned} \begin{aligned} h_{{\tilde{X}}, \pi ^* K_{{X}}}({{\tilde{P}}}) +h_{{\tilde{X}}, {\tilde{Y}}}({\tilde{P}})&\le \varepsilon \cdot h_{{\tilde{X}}, {\pi ^*{A}}}({\tilde{P}}) - \frac{\varepsilon }{N} \cdot h_{{\tilde{X}}, {\tilde{Y}}}({\tilde{P}}) + C_\varepsilon , \\ h_{{X}, K_{{X}}}({{P}}) + \left( 1 + \frac{\varepsilon }{N}\right) h_{{\tilde{X}}, {\tilde{Y}}}({\tilde{P}})&\le \varepsilon \cdot h_{{X}, {A}}({P}) + C_\varepsilon ', \\ \left( 1 + \frac{\varepsilon }{N}\right) h_{\mathrm {gcd}}(P;Y)&\le \varepsilon \cdot h_{X, A}(P) + h_{X,-K_X} (P) + C_\varepsilon ', \\ h_{\mathrm {gcd}}(P;Y)&\le \varepsilon \cdot h_{X, A}(P) + h_{X,-K_X} (P) + C_\varepsilon ''. \\ \end{aligned} \end{aligned}$$

Now \(K_X\) is linearly equivalent to \(-2A\). Writing \(P= (x_1,x_2)\), we have

$$\begin{aligned} \begin{aligned} h_{X, -K_X}(P)&= 2\cdot \left( h(x_1) + h(x_2)\right) + O(1), \\ h_{X,A}(P)&= h(x_1) + h(x_2) , \\ h_{\gcd }(P;Y)&= h_{\gcd }(f(x_1),g(x_2)). \end{aligned} \end{aligned}$$

Now Theorem 3.5 is verified. \(\square \)
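In more detail, and only as a routine substitution spelled out for the reader's convenience: inserting the three identities just displayed into the inequality \(h_{\mathrm {gcd}}(P;Y) \le \varepsilon \cdot h_{X, A}(P) + h_{X,-K_X} (P) + C_\varepsilon ''\) gives

$$\begin{aligned} \begin{aligned} h_{\gcd }(f(x_1),g(x_2))&\le \varepsilon \left( h(x_1) + h(x_2)\right) + 2\left( h(x_1) + h(x_2)\right) + O(1) \\&= (2+\varepsilon )\left( h(x_1) + h(x_2)\right) + O(1). \end{aligned} \end{aligned}$$

Since Weil heights are non-negative, this in particular implies the bound stated in Item 2 of Theorem 3.5.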

For the proof of Theorem 2.11 we need the following lemma.

Lemma 3.6

Let \(\sigma , \tau \in K(x)\) be Möbius transforms defined over K. Set \(f_\sigma = \sigma f\sigma ^{-1},~g_\tau = \tau g\tau ^{-1}\). Then there exists a constant \(C>0\), depending on \(\alpha , \beta , f, g, \sigma , \tau \), such that for all \(a,b\in K\), for all finite sets \(S\subset M(K)\) containing all the archimedean places and for all \(n\in {\mathbb {N}}\), we have

$$\begin{aligned} \left| h_{\gcd , S}\left( f_\sigma ^{\circ n}\left( \sigma a\right) - \sigma \alpha , g_\tau ^{\circ n}\left( \tau b\right) - \tau \beta \right) - h_{\gcd ,S}(f^{\circ n}(a) - \alpha , g^{\circ n}(b) - \beta )\right| \le C. \end{aligned}$$

Proof

It suffices to show that for any fixed \(\alpha \in K\), and for any fixed Möbius transform \(\sigma \), there exists a finite set \(S'\subset M(K)_{\text {fin}}\) and a constant \(C'>0\), such that for all \(x\in K\) and \(v\in S'\), we have \(\left| v^{+}\left( \sigma x - \sigma \alpha \right) - v^+(x - \alpha )\right| \le C'\), and for all \(x\in K\) and \( v\in M(K)_{\text {fin}}{\setminus } S'\), we have \( v^{+}\left( \sigma x - \sigma \alpha \right) = v^+(x - \alpha )\).

Since each Möbius transform defined over K is a composition of translations, dilations and inverses defined over K, it suffices to prove the result for the case when \(\sigma \) is one of the above three types of maps. The result is trivial for translations and dilations.

If \(\sigma (x) = 1/x\), write \(x = x_1 /x_2\) and \(\alpha = \alpha _1/\alpha _2\) with \(x_1, x_2,\alpha _1,\alpha _2 \in {\mathcal {O}}_K\). Since the class number of K is finite, there exists \(\gamma \in {\mathcal {O}}_K\) such that, for the fixed \(\alpha \in K\) and for all \(x\in K\), we can choose \(x_1,x_2,\alpha _1,\alpha _2\) so that the ideals \(\gcd (x_1, x_2)\) and \(\gcd (\alpha _1, \alpha _2)\) both divide \(\gamma \). Now

$$\begin{aligned} \left| x-\alpha \right| _v = \left| \frac{\alpha _2 x_1 - \alpha _1 x_2}{\alpha _2 x_2}\right| _v, ~\left| \sigma x - \sigma \alpha \right| _v = \left| \frac{\alpha _2 x_1 - \alpha _1 x_2}{\alpha _1 x_1}\right| _v. \end{aligned}$$

But the ideal

$$\begin{aligned} \begin{aligned} \gcd ({\alpha _2 x_1 - \alpha _1 x_2}, {\alpha _2 x_2})~&|~\gcd ({\alpha _2^2 x_1 - \alpha _1\alpha _2 x_2}, {\alpha _1\alpha _2 x_2}) = \gcd (\alpha _2^2 x_1 , \alpha _1\alpha _2 x_2)\\&|~\gcd (\alpha _1\alpha _2^2 x_1 , \alpha _1\alpha _2^2 x_2) ~|~ \alpha _1\alpha _2^2\gamma , \end{aligned} \end{aligned}$$

so

$$\begin{aligned} \begin{aligned} v^+(\alpha _2 x_1 - \alpha _1 x_2) -v(\alpha _1\alpha _2^2\gamma ) \le v^+(x - \alpha ) \le v^+(\alpha _2 x_1 - \alpha _1 x_2). \end{aligned} \end{aligned}$$

Similarly

$$\begin{aligned} \begin{aligned} v^+(\alpha _2 x_1 - \alpha _1 x_2) -v(\alpha _1^2\alpha _2\gamma ) \le v^+(\sigma x - \sigma \alpha ) \le v^+(\alpha _2 x_1 - \alpha _1 x_2). \end{aligned} \end{aligned}$$

Therefore

$$\begin{aligned} \begin{aligned} \left| v^+(\sigma x- \sigma \alpha ) - v^+(x - \alpha ) \right| \le \max \left( v(\alpha _1\alpha _2^2\gamma ), v(\alpha _1^2\alpha _2\gamma )\right) \le v(\alpha _1^2 \alpha _2^2\gamma ). \end{aligned} \end{aligned}$$

Hence we may choose \(S' = \{v\in M(K)_{\text {fin}}~|~v(\alpha _1)\ne 0,~v(\alpha _2) \ne 0~\text {or}~v(\gamma ) \ne 0\}\). \(\square \)
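The inversion step can be sanity-checked numerically. The following minimal sketch assumes \(K = {\mathbb {Q}}\), where the class number is one and one may take \(\gamma = 1\) with fractions in lowest terms. The helper names `vp`, `vp_plus` and `inversion_gap` are illustrative and not part of the argument.

```python
from fractions import Fraction


def vp(q, p):
    """p-adic valuation of a nonzero rational number q."""
    q = Fraction(q)
    if q == 0:
        raise ValueError("valuation of 0 is infinite")
    num, den, v = q.numerator, q.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def vp_plus(q, p):
    """v^+ = max(v, 0), the truncated valuation used in the text."""
    return max(vp(q, p), 0)


def inversion_gap(x, alpha, p):
    """|v^+(1/x - 1/alpha) - v^+(x - alpha)| for sigma(x) = 1/x."""
    return abs(vp_plus(1 / x - 1 / alpha, p) - vp_plus(x - alpha, p))
```

For \(\alpha = \alpha _1/\alpha _2\) in lowest terms, the proof predicts that `inversion_gap(x, alpha, p)` is at most `vp(alpha_1**2 * alpha_2**2, p)`, and in particular that it vanishes at every prime not dividing \(\alpha _1\alpha _2\).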

Lemma 3.7

(Lemma 3.52 of [29]) Let \(\phi : {\mathbb {P}}^1 \rightarrow {\mathbb {P}}^1\) be a rational map of degree at least 2 and let \(Q \in {\mathbb {P}}^1\) be a point that is not a totally ramified fixed point of \(\phi ^2\). Let \(e_Q(\phi )\) be the multiplicity of \(\phi \) at Q. Then

$$\begin{aligned} \lim _{m \rightarrow \infty } \frac{e_Q(\phi ^m)}{(\deg \phi )^m} = 0. \end{aligned}$$
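To illustrate the lemma, the sketch below computes \(e_Q(\phi ^m)\) in the special case of a polynomial map \(\phi \) with rational coefficients and a rational point Q, using the multiplicativity \(e_Q(\phi ^m) = \prod _{i<m} e_{\phi ^i(Q)}(\phi )\). The restriction to polynomials and the function names are assumptions made only for this example; polynomials are stored as coefficient lists from the constant term upward.

```python
from fractions import Fraction


def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients low to high) at x by Horner's rule."""
    acc = Fraction(0)
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


def derivative(coeffs):
    """Coefficient list of the derivative."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def local_multiplicity(coeffs, q):
    """e_q(phi): least k >= 1 with phi^(k)(q) != 0 (phi of degree >= 1)."""
    k, d = 1, derivative(coeffs)
    while evaluate(d, q) == 0:
        d = derivative(d)
        k += 1
    return k


def iterate_multiplicity(coeffs, q, m):
    """e_q(phi^m) as the product of local multiplicities along the orbit."""
    e = 1
    for _ in range(m):
        e *= local_multiplicity(coeffs, q)
        q = evaluate(coeffs, q)
    return e
```

For \(\phi (x) = x^2 - 1\), given by `[-1, 0, 1]`, the point 0 is fixed by \(\phi ^2\) but not totally ramified, and `iterate_multiplicity` grows like \(2^{\lceil m/2\rceil }\), so the ratio to \(2^m\) tends to 0. For \(\phi (x) = x^2\), given by `[0, 0, 1]`, the point 0 is a totally ramified fixed point and the ratio stays equal to 1.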

Proof of Theorem 2.11

By Lemma 3.6 for \(h_{\gcd ,S}\) we may assume that \(\alpha = \beta = 0\). For any fixed integer D, write in the lowest terms \(f^{\circ D} = F_1/F_2\) and \(g^{\circ D} = G_1/G_2\) where \(F_1, F_2,G_1,G_2\) are polynomials with coefficients in \({\mathcal {O}}_K\).

Write

$$\begin{aligned} F_1(x)= & {} a_0 + \dots + a_Nx^N,\\ F_2(x)= & {} b_0 + \dots + b_Mx^M,\\ G_1(x)= & {} a_0' + \dots + a'_{N'}x^{N'},\\ G_2(x)= & {} b_0' + \dots + b'_{M'}x^{M'} \end{aligned}$$

with all coefficients in \({\mathcal {O}}_K\). By Lemma 3.6 we may assume that all preimages of 0 under f and g are not \(\infty \). This implies that \(N\ge M\). Let

$$\begin{aligned} S:=\{v\in M(K)~|~v(a_N) \ne 0,~ v(b_M) \ne 0,~ v(a'_{N'}) \ne 0,~ \text {or}~ v(b'_{M'}) \ne 0\} \cup M(K)_{\infty }. \end{aligned}$$

Then S is finite. For each place \(v\notin S\) and for any \(x_0\in K\), if \(v(x_0) \ge 0\), then \(v(F_2(x_0)) \ge 0\) and hence \(v^{+}(f^{\circ D}(x_0)) \le v^{+}\left( F_1 (x_0)\right) . \) If \(v(x_0) < 0 \), then

$$\begin{aligned} v^{+}(f^{\circ D}(x_0)) = v^{+}\left( \frac{a_N x_0^N}{b_M x_0^M}\right) =v^{+}\left( { x_0^{N-M}}\right) =0 \le v^{+}\left( F_1 (x_0)\right) . \end{aligned}$$

In either case we have

$$\begin{aligned} \begin{aligned} v^{+}(f^{\circ D}(x_0))&\le v^{+}\left( F_1 (x_0)\right) . \end{aligned} \end{aligned}$$

Similarly for any \(v\notin S\) and for any \(y_0\in K\),

$$\begin{aligned} \begin{aligned} v^{+}(g^{\circ D}(y_0))&\le v^{+}\left( G_1 (y_0)\right) . \end{aligned} \end{aligned}$$
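The inequality between \(v^+\) of a quotient and \(v^+\) of its numerator can be tested on a toy case. The sketch below assumes \(K = {\mathbb {Q}}\) and uses the hypothetical stand-in \(F_1(x) = x^3 + 2\), \(F_2(x) = x + 1\) (both monic, so no finite prime lies in S); these polynomials are chosen for illustration and are not iterates of any map considered in the paper.

```python
from fractions import Fraction


def vp(q, p):
    """p-adic valuation of a nonzero rational number q."""
    q = Fraction(q)
    if q == 0:
        raise ValueError("valuation of 0 is infinite")
    num, den, v = q.numerator, q.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def vp_plus(q, p):
    """Truncated valuation v^+ = max(v, 0)."""
    return max(vp(q, p), 0)


def F1(x):
    """Hypothetical numerator, monic of degree N = 3."""
    return x**3 + 2


def F2(x):
    """Hypothetical denominator, monic of degree M = 1 <= N."""
    return x + 1


def numerator_dominates(x0, p):
    """Check v^+(F1(x0)/F2(x0)) <= v^+(F1(x0)) at the prime p."""
    return vp_plus(F1(x0) / F2(x0), p) <= vp_plus(F1(x0), p)
```

Calling `numerator_dominates(Fraction(i, j), p)` for rational inputs other than the pole \(x_0 = -1\) exercises both cases \(v(x_0)\ge 0\) and \(v(x_0)<0\) of the argument.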

Therefore the sum of the part of \(h_{\gcd }\) outside S satisfies

$$\begin{aligned} h_{\gcd , S}\big (f^{\circ D}(a'), g^{\circ D}(b')\big ):= & {} \frac{1}{[K:{\mathbb {Q}}]}\sum _{v\in M(K)_{\text {fin}}{\setminus } S} n_v \min \left( v^+(f^{\circ D}(a')), v^+( g^{\circ D}(b'))\right) \nonumber \\\le & {} \frac{1}{[K:{\mathbb {Q}}]}\sum _{v\in M(K)_{\text {fin}}{\setminus } S} n_v \min \left( v^+(F_1(a')), v^+( G_1 (b')) \right) \nonumber \\\le & {} h_{\gcd , S}\big (F_1(a'), G_1 (b')\big ). \end{aligned}$$
(3.5)

Let \(F_1^{\mathrm {rad}}(x) = \mathrm {rad}(F_1)(x)\), and let \(G_1^{\mathrm {rad}}(y) = \mathrm {rad}(G_1)(y)\), where for a one-variable polynomial P, \(\mathrm {rad}(P)\) is the product of all monic irreducible polynomials dividing P. As the sequence \(\left( f^{\circ (n-D)}(a), g^{\circ (n-D)}(b)\right) _n\) is generic in \({\mathbb {P}}^1(\overline{{\mathbb {Q}}}) \times {\mathbb {P}}^1(\overline{{\mathbb {Q}}})\), there exists \(N'' = N''(\varepsilon , f,g, a,b)\) such that for all \(n \ge N''\) we have

$$\begin{aligned} \left( f^{\circ (n-D)}(a), g^{\circ (n-D)}(b)\right) \notin V(K) \end{aligned}$$
(3.6)

where V is as in Theorem 3.5. Apply Theorem 3.5 to the point \(\left( f^{\circ (n-D)}(a), g^{\circ (n-D)}(b)\right) \) and the functions \(F_1^{\mathrm {rad}}\) and \(G_1^{\mathrm {rad}}\), with \(\varepsilon = 1\). Let \(u = f^{\circ (n-D)}(a),v = g^{\circ (n-D)}(b)\). Then

$$\begin{aligned} h_{\gcd , S}\left( F_1^{\mathrm {rad}}(u), G_1^{\mathrm {rad}}(v)\right) \le h_{\gcd }\left( F_1^{\mathrm {rad}}(u), G_1^{\mathrm {rad}}(v)\right)\le & {} 4 \left( h\left( u\right) + h(v)\right) +O(1).\nonumber \\ \end{aligned}$$
(3.7)

Let \(M' = \sup E\) where

$$\begin{aligned} \begin{aligned} E = \cup _{(x,y)\in (f^{\circ {D}},~g^{\circ {D}})^{-1}(0,0)} \left\{ e_{x}(f^{\circ D} -0), e_{y}(g^{\circ D} -0)\right\} \end{aligned} \end{aligned}$$

where \(e_Q(\phi )\) is the multiplicity of \(\phi \) at Q. In the following inequalities, the implied constants depend only on \(f,g,a,b, \alpha ,\beta , D\). Compared with Theorem 2.11, we have an extra dependence on D. However, this dependence will be removed when \(\varepsilon \) is involved later. We have

$$\begin{aligned}&h_{\gcd , S}\left( f^{\circ n}(a), g^{\circ n}(b)\right) \\&\quad = h_{\gcd , S}\left( f^{\circ D}(f^{\circ (n-D)}(a)), g^{\circ D}(g^{\circ (n-D)}(b))\right) \\&\quad \le h_{\gcd ,S}\left( F_1 (f^{\circ (n-D)}(a)), G_1(g^{\circ (n-D)}(b))\right) ~(\text {by }(3.5))\\&\quad \le h_{\gcd , S}\left( \left( F_1^{\mathrm {rad}}\circ f^{\circ (n-D)}(a)\right) ^{M'}, \left( G_1^{\mathrm {rad}}\circ g^{\circ (n-D)}(b)\right) ^{M'} \right) +O(1) \\&\quad \le M'\cdot \left( {4} \cdot {h}\left( f^{\circ (n-D)}(a)\right) + 4\cdot h\left( g^{\circ (n-D)}(b)\right) +O(1) \right) +O(1) ~(\text {by } (3.7))\\&\quad \le M '\cdot \left( {4}d^{n-D} \cdot {\hat{h}}_f \left( a\right) + {4} d^{n-D} \cdot {\hat{h}}_g \left( b\right) + O(1) \right) +O(1) ~(\text {by~2 of Theorem}~2.1)\\&\quad \le d^n\cdot \frac{M'}{d^{D}} \cdot \left( {4} {\hat{h}}_f(a) + 4{\hat{h}}_g(b) + C\right) + O(1)~(\text {by~1 of Theorem}~2.1). \end{aligned}$$

Since 0 is not exceptional for f or g, it is not a totally ramified fixed point of \(f^{\circ 2}\) or of \(g^{\circ 2}\). Indeed, if 0 were a totally ramified fixed point of \(f^{\circ 2}\), then

$$\begin{aligned} \cup _{i = 1}^\infty (f^{\circ i})^{-1} (0) = \{0\} \cup f^{-1}(0) \end{aligned}$$

would be a finite set, and hence 0 would be exceptional for f. A similar argument holds for g. Therefore by Lemma 3.7, we can choose \(D = D(\varepsilon , f, g,a,b)\in {\mathbb {N}}\) sufficiently large so that

$$\begin{aligned} \frac{M'}{d^D}\cdot \left( {4} {\hat{h}}_f(a) + 4{\hat{h}}_g(b) + C \right) < \frac{\varepsilon }{2}. \end{aligned}$$

Thus, we have

$$\begin{aligned} h_{\gcd , S}\left( f^{\circ n}(a), g^{\circ n}(b)\right) \le \frac{\varepsilon }{2}\cdot d^n . \end{aligned}$$
(3.8)

Now look at all places \(v\in S\) and all infinite v. By assumption we know that 0 is not exceptional with respect to f and g and a and b are not preperiodic with respect to f and g. By Lemma 4.1 of [30] (cited below as Lemma 3.8, see also Theorem E of [26] for an archimedean version), we know for all sufficiently large \(n\in {\mathbb {N}}\),

$$\begin{aligned} \begin{aligned} n_v v^+\left( f^{\circ n}(a)\right) \le \frac{\varepsilon }{2\cdot ([K:{\mathbb {Q}}]+|S|)}\cdot d^n ,\\ n_v v^+\left( g^{\circ n}(b)\right) \le \frac{\varepsilon }{2\cdot ([K:{\mathbb {Q}}] +|S|)} \cdot d^n . \end{aligned} \end{aligned}$$
(3.9)

Combining equations (2.4), (3.8) and (3.9), we obtain the requested estimate, which completes the proof of Theorem 2.11. \(\square \)

Lemma 3.8

([30]) Let K be a number field and let \(\phi \) be a rational function of degree \(d\ge 2\) defined over K. Suppose 0 is not exceptional with respect to \(\phi \) and let a be a point in \({\mathbb {P}}^1({\overline{K}})\) for which there is a strictly increasing sequence of integers \((e_i)_{i=1}^\infty \) such that \(\phi ^{\circ e_i}(a) \ne 0\). Then

$$\begin{aligned} \lim _{i\rightarrow \infty } \frac{v^+\left( \phi ^{\circ e_i}(a)\right) }{d^{e_i}} = 0. \end{aligned}$$

4 On the genericity condition

The Dynamical Mordell–Lang Conjecture predicts that given an endomorphism \(\phi : X\rightarrow X\) of a complex quasi-projective variety X, for any point \(P\in X\) and any subvariety \(Y\subsetneq X\), the set \(\{n\in {\mathbb {N}}~|~\phi ^{\circ n}(P) \in Y \}\) is a finite union of arithmetic progressions (sets of the form \(\{a, a+d, a+ 2d,\dots \} \) with \(a,d\in {\mathbb {N}}_{\ge 0})\). The Dynamical Mordell–Lang Conjecture was proposed in [12]. See also [3, 11] for earlier works. In the case of étale maps we know that the Dynamical Mordell–Lang Conjecture is true. See the recent monograph [4]. Xie proved in [32] the Dynamical Mordell–Lang Conjecture for polynomial endomorphisms of the affine plane.

Proof of Theorem 2.12

The result is clearly true when \((a,b)\) is preperiodic under \((f,f)\). When \((a,b)\) is not preperiodic under \((f,f)\), by Theorem A it suffices to show that the sequence \((f^{\circ n}(a), f^{\circ n}(b))_n\) is generic. If there were infinitely many iterates \((f^{\circ n}(a), f^{\circ n}(b))\) lying on a curve C, then by Theorem 0.1 of [32], the Dynamical Mordell–Lang Conjecture for polynomial endomorphisms of the affine plane, C itself would be periodic under \((f,f)\). Replacing f by an iterate \(f^{\circ m}\), we may assume that C is fixed under \((f,f)\). Now we can apply the classification of invariant curves from [20, 22]. In fact, using these results, Baker and DeMarco demonstrated on page 32 of [2] that the irreducible invariant curve in the above theorem must be a graph of the form \( y = h(x)\) or \(x = h(y)\), for a polynomial h which commutes with some \(f^{\circ k}\) with initial conditions as in Theorem 2.12. This contradicts the assumption of Theorem 2.12. \(\square \)

We give two examples to show that if the assumption of Theorem 2.12 is not satisfied, then the upper bound may fail.

Example 4.1

We keep the hypotheses and notation of the proof above. Assume that the curve is given by \(y = h(x)\), that \((a,b)\) lies on it (so \(b = h(a)\)), and that \(h\circ f^{\circ k} = f^{\circ k}\circ h\) for some \(k\in {\mathbb {N}}_{>0} \). Suppose \(n = mk\) with \(m\in {\mathbb {N}}\). If \(h(\alpha ) = \alpha \), then

$$\begin{aligned} \begin{aligned} \gcd (f^{\circ n}(a) - \alpha , f^{\circ n}(b) - \alpha )&= \gcd (f^{\circ mk}(a) - \alpha , f^{\circ mk}\left( h(a)\right) - \alpha )\\&= \gcd (f^{\circ mk}(a) - \alpha , h(f^{\circ mk}(a)) - h(\alpha ))\\&= |f^{\circ mk}(a) - \alpha | = |f^{\circ n}(a) - \alpha |. \end{aligned} \end{aligned}$$

Example 4.2

Let \(f(x) = g(x) = x^3 + x\). Assume \(a = -b\) and \(\alpha = -\beta \). Then for \(h(x) = -x\), we have \(h\circ f = f\circ h\), \(h(a) = b\) and \(h(\alpha ) = \beta \). Now

$$\begin{aligned} f^{\circ n}(a) -\alpha = f^{\circ n}(-b) + \beta = -f^{\circ n}(b) + \beta = -(g^{\circ n}(b) - \beta ), \end{aligned}$$

so

$$\begin{aligned} \gcd (f^{\circ n}(a) -\alpha , g^{\circ n}(b) -\beta ) = |f^{\circ n}(a) - \alpha | \gg |a|^{\delta ^n} \end{aligned}$$

for any \(\delta <3\).
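Example 4.2 is easy to check by machine. The following sketch takes the sample values \(a = 2\), \(b = -2\), \(\alpha = 1\), \(\beta = -1\), which are chosen here only for illustration, and compares the \(\gcd \) with \(|f^{\circ n}(a) - \alpha |\); the function names are likewise hypothetical.

```python
from math import gcd


def f(x):
    """The odd polynomial f(x) = x^3 + x from Example 4.2."""
    return x**3 + x


def iterate(x, n):
    """Return the n-th iterate of f applied to x."""
    for _ in range(n):
        x = f(x)
    return x


def gcd_of_shifted_iterates(a, b, alpha, beta, n):
    """gcd(f^n(a) - alpha, f^n(b) - beta) with g = f."""
    return gcd(iterate(a, n) - alpha, iterate(b, n) - beta)
```

Since f is odd, `iterate(-2, n)` equals `-iterate(2, n)`, so `gcd_of_shifted_iterates(2, -2, 1, -1, n)` coincides with `abs(iterate(2, n) - 1)` for every n, in line with the displayed identity.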

In the case of power maps, if \((f^{\circ n}(a), g^{\circ n}(b))_n\) is generic, the following unconditional result is proved by Corvaja and Zannier ([8]).

Example 4.3

Suppose K is a number field and suppose \(a,b,\alpha , \beta \in K\). Also suppose that f and g are power maps, and that \(a\) and \(b\) are multiplicatively independent. Let \(d = \max (\deg f, \deg g)\). Then for each fixed \(\varepsilon >0\), there exists some \(C = C(f,g,a,b)\) such that

$$\begin{aligned} \begin{aligned} \gcd (f^{\circ n}(a) - \alpha , g^{\circ n}(b) - \beta )&\le C\cdot \max \left( {h}(a), {h}(b) \right) ^{\varepsilon d^n}. \end{aligned} \end{aligned}$$
(4.1)

In fact, the genericity of the sequence \((f^{\circ n}(a), g^{\circ n}(b))_n\) is equivalent to the multiplicative independence of a and b. The assumption that \(\alpha \) and \(\beta \) are not exceptional implies that \(\alpha \ne 0\) and \(\beta \ne 0\). Then Inequality (4.1) is a consequence of Inequality (1.2) of Corvaja and Zannier [8].

Now we provide an example to show that the genericity of \( (f^{\circ n}(a), g^{\circ n}(b))_n\) is necessary for power maps.

Example 4.4

Let \( a = 125, b =25, \alpha = \beta = 1, f(x) = x^2, g(y) = y^2\). Then \(\gcd (f^{\circ n}(a) - \alpha , g^{\circ n}(b) - \beta ) \) is divisible by \(5^{2^n} - 1 = O\left( (f^{\circ n}(a))^{1/3}\right) \).
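The divisibility in Example 4.4 can be confirmed directly for small n. The sketch below uses exact integer arithmetic; the function name `shifted_power_gcd` is an illustrative choice. It relies only on \(f^{\circ n}(a) = a^{2^n}\) and \(g^{\circ n}(b) = b^{2^n}\) for the squaring map.

```python
from math import gcd


def shifted_power_gcd(a, b, n):
    """gcd(a^(2^n) - 1, b^(2^n) - 1) for f(x) = g(x) = x^2 and alpha = beta = 1."""
    e = 2**n
    return gcd(a**e - 1, b**e - 1)
```

For \(a = 125 = 5^3\) and \(b = 25 = 5^2\), the classical identity \(\gcd (x^m - 1, x^k - 1) = x^{\gcd (m,k)} - 1\) with \(x = 5\) shows that `shifted_power_gcd(125, 25, n)` is exactly \(5^{2^n} - 1\), so the \(\gcd \) grows like the cube root of \(f^{\circ n}(a)\).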

5 When is the \(\gcd \) large?

As we have seen, when the sequence \((f^{\circ n}(a), g^{\circ n}(b))_n\) is not generic, \(\gcd (f^{\circ n}(a) - \alpha , g^{\circ n}(b) - \beta )\) might be big in general. Our goal in this section is to show the following result.

Theorem 5.1

Assume Vojta’s Conjecture. Suppose \(f,g\in {\mathbb {Z}}[X]\) and \(a,b,\alpha ,\beta \in {\mathbb {Z}}\). Then for all \(\eta >0\),

  • either the set

    $$\begin{aligned} \{n\in {\mathbb {N}}~|~\log \gcd (f^{\circ n}(a) - \alpha , g^{\circ n}(b) - \beta )\ge \eta \cdot d^n\} \end{aligned}$$

    is a finite union of arithmetic progressions, or

  • there is a finite union of arithmetic progressions J such that

    $$\begin{aligned} \lim _{n\rightarrow \infty , n\in J} \frac{1}{\eta d^n}\cdot {\log \gcd (f^{\circ n}(a) - \alpha , g^{\circ n}(b) - \beta )} = 1. \end{aligned}$$

Proof

We choose D as in the proof of Theorem 2.11, with \(\eta \) in place of \(\varepsilon \). That is, we choose \(D = D(\eta , f, g,a,b)\in {\mathbb {N}}\) sufficiently large so that

$$\begin{aligned} \frac{M'}{d^D}\cdot \left( {4} {\hat{h}}_f(a) + 4{\hat{h}}_g(b) + C \right) < \frac{\eta }{2} \end{aligned}$$

where

$$\begin{aligned} \begin{aligned} M'&= \max _{f^{\circ D} (x) = \alpha ,~g^{\circ D}(y) = \beta } \left( e_{x}(f^{\circ D} -\alpha ), e_{y}(g^{\circ D} -\beta )\right) . \end{aligned} \end{aligned}$$

Then the proof of Theorem 2.11 shows that assuming Vojta’s Conjecture, there is a proper algebraic subset \(V\subseteq {\mathbb {P}}^1 \times {\mathbb {P}}^1\) such that as long as \((f^{\circ (n-D)}(a), g^{\circ (n-D)}(b)) \notin V\) and n is sufficiently large, we have

$$\begin{aligned} \log \gcd (f^{\circ n}(a) - \alpha , g^{\circ n}(b) - \beta ) < \frac{\eta }{2} \cdot d^n . \end{aligned}$$

Let \(I = \{n\in {\mathbb {N}}~|~(f^{\circ (n-D)}(a), g^{\circ (n-D)}(b)) \in V\}\). Then the set

$$\begin{aligned} \{n\in {\mathbb {N}}{\setminus } I~|~ \log \gcd (f^{\circ n}(a) - \alpha , g^{\circ n}(b) - \beta )\ge \eta \cdot d^n\} \end{aligned}$$

is finite. By the Dynamical Mordell–Lang Theorem for polynomial maps on the affine plane (cf. [32]), I is a finite union of arithmetic progressions. Hence it suffices to show that the set

$$\begin{aligned} \{n\in I~| ~\log \gcd (f^{\circ n}(a) - \alpha , g^{\circ n}(b) - \beta )\ge \eta \cdot d^n\} \end{aligned}$$

is a finite union of arithmetic progressions. Looking at each irreducible component of V, it is enough to consider the case when V is a curve. In that case the set

$$\begin{aligned} \{(f^{\circ n}(a) - \alpha , g^{\circ n}(b) - \beta ) ~|~n\in I\} \end{aligned}$$

is contained in the curve \(V':=(f^{\circ D}, g^{\circ D})(V)+(-\alpha ,-\beta )\) where \(+\) means translation on \({\mathbb {A}}^2\). By abuse of notation, we also denote by \(V'\) its Zariski closure in \({\mathbb {P}}^1 \times {\mathbb {P}}^1\). Let \(\iota : V'\hookrightarrow {\mathbb {P}}^1 \times {\mathbb {P}}^1\) be the inclusion map.

Suppose \((x_1, x_2)\in V'\) and fix \(D'\in \mathrm {Div}(V')\) of degree 1, then

$$\begin{aligned} \begin{aligned} h_{\gcd }(x_1, x_2)&= h_{{\mathbb {P}}^1 \times {\mathbb {P}}^1, (0,0)} (x_1, x_2) \\&= h_{V',~ \iota ^*(0,0)}(x_1, x_2) + O(1) \\&= \deg (\iota ^*(0,0))\cdot h_{V', D'} (x_1,x_2) + O(1) \end{aligned} \end{aligned}$$

where the last equality follows from Proposition B.3.5 of [16], due originally to Siegel.

Clearly it is enough to consider the case when a is not preperiodic under f and b is not preperiodic under g. In this case the projection \(\pi _1: V'\rightarrow {\mathbb {P}}^1,~(x_1,x_2)\mapsto x_1\) is dominant. Fix \(D\in \mathrm {Div}({\mathbb {P}}^1)\) of degree 1. Then

$$\begin{aligned} h_{V', D'} (x_1,x_2) =\frac{1}{\deg ({\pi _1})}\cdot h_{{\mathbb {P}}^1, D}(x_1) + O(1) \end{aligned}$$

by Theorem 2.6. Now

$$\begin{aligned} \begin{aligned} h_{\gcd }(f^{\circ n}(a) -\alpha , g^{\circ n}(b) - \beta )&= \frac{\deg (\iota ^*(0,0))}{\deg (\pi _1)}\cdot h_{{\mathbb {P}}^1, D}(f^{\circ n}(a) -\alpha ) + O(1) \\&= \frac{\deg (\iota ^*(0,0))}{\deg (\pi _1)} \cdot \left( {\hat{h}}_f(a)\cdot d^n + O(1)\right) + O(1). \end{aligned} \end{aligned}$$

Therefore, in the case when \(V'\) is a curve, if \(\displaystyle \eta = {\hat{h}}_f(a) \cdot \frac{\deg (\iota ^*(0,0))}{{\deg (\pi _1)}}\), then

$$\begin{aligned} \lim _{n\rightarrow \infty , n\in J}\frac{1}{\eta \cdot d^n} \cdot {\log \gcd (f^{\circ n}(a) - \alpha , g^{\circ n}(b) - \beta )} = 1 \end{aligned}$$

for a finite union of arithmetic progressions J; otherwise the set

$$\begin{aligned} \{n\in I~| ~\log \gcd (f^{\circ n}(a) - \alpha , g^{\circ n}(b) - \beta )\ge \eta \cdot d^n\} \end{aligned}$$

is always either a finite set or the complement of a finite set. Hence for general \(V'\), for all but finitely many \(\eta \), the set in the statement is a finite union of arithmetic progressions. \(\square \)