1 Introduction

1.1 Notation and general background

We consider permutations on a set \(\Omega \) of size n. Given two such permutations \(\pi \) and \(\sigma \), we let \(hd(\pi , \sigma ) = |\{ x\in \Omega : \pi (x) \ne \sigma (x) \}|\), so \(hd(\pi , \sigma )\) is the number of elements of \(\Omega \) at which \(\pi \) and \(\sigma \) disagree. When \(hd(\pi , \sigma ) = d\), we say that \(\pi \) and \(\sigma \) and are at Hamming distance d, or that the Hamming distance between\(\pi \) and \(\sigma \) is d. A permutation array A is a set of permutations on \(\Omega \). We say that \(hd(A) = d\) if \(d = min\{ hd(\pi , \sigma ): \pi , \sigma \in A \}\). For positive integers n and d with \(d\le n\) we let M(nd) be the maximum number of permutations in any permutation array A satisfying \(hd(A) \ge d\).

Consider a fixed ordering \(x_{1}, x_{2}, \cdots , x_{n}\) of the elements of \(\Omega \). The image string of the permutation \(\sigma \in A\) is the string \(\sigma (x_{1})\sigma (x_{2})\cdots \sigma (x_{n})\). Thus the permutation array A can also be regarded as an \(|A|\times n\) matrix whose rows are the image strings of the permutations in A. When \(hd(A) = d\), any two rows of A disagree in at least d positions and some pair of rows disagrees in exactly d positions. In particular, if G is a permutation group acting on \(\Omega \), then we obtain a \(|G|\times n\) permutation array whose rows consist of the image strings of all the elements of G. We refer to this array by G, and we use hd(G) to refer to the hamming distance of this array.

The study of permutation arrays began (to our knowledge) with the papers [10, 15], where good bounds on M(nd) (together with other results) were developed based on combinatorial methods, motivated by the Gilbert–Varshamov bounds for binary codes. In recent years there has been renewed interest in permutation arrays, motivated by suggested applications in power line transmission [14, 19, 26, 34], block ciphers [11], and in multilevel flash memories [22, 23].

We review here some of the known results and methods for estimating M(nd).

Some elementary exact values and bounds on M(nd) are the following (summarized with short proofs in [9]); \(M(n,2) = n!\), \(M(n,3) = \frac{n!}{2}\), \(M(n,n) = n\), \(M(n,d)\ge M(n-1,d)\), \(M(n,d)\ge M(n,d+1)\), \(M(n,d)\le nM(n-1,d)\), and \(M(n,d)\le \frac{n!}{(d-1)!}\). More sophisticated bounds were developed in the above cited papers [10, 15], with a recent improvement in [35]. The smallest interesting case for d is \(d=4\). Here some interesting and non-elementary bounds for M(n, 4) were developed in [13], using linear programming, characters on the symmetric group \(S_{n}\), and Young diagrams. In [24] it is shown that if \(K>0\) is a constant with \(n > e^{30/K^{2}}\) and \(s < n^{1-K}\), then \(M(n,n-s) \ge \theta (s! \sqrt{log(n)})\). The lower bound is achieved by a polynomial time randomized construction, using the Lovasz Local Lemma in the analysis.

There are various construction methods for permutation arrays. First there is a connection with mutually orthogonal latin squares (MOLS). Letting N(n) denote the maximum number of MOLS of order n, it was shown in [8] that \(M(n,n-1) \ge nN(n)\). From this it follows that if q is a prime power, then \(M(q,q-1) = q(q-1)\). Computational approaches for bounding M(nd) for small n and d, including clique search, and the use of automorphisms are described in [9, 21, 29]. There are also constructions of permutation arrays that arise from the use of permutation polynomials, also surveyed in [9], which we mention briefly below.

Additional construction methods are coset search [2] and partition and extension [3]. In the first of these, one starts with a permutation group G on n letters with \(hd(G) = d\), and which is a subgroup of some group H (for example \(H = S_{n}\)). Now for disjoint permutation arrays AB on the same set of letters, let \(hd(A,B) =\) min\(\{hd(\sigma , \tau ): \sigma \in A,\tau \in B \}\). For \(x\notin G\) we observe that the coset xG of G in H is a permutation array with \(hd(xG) = hd(G)\). For cosets \(x_{1}G, x_{2 }G, \cdots , x_{k}G\) of G, the Hamming distance of the permutation array \(\cup _{1\le i\le k}x_{i}G\) is the minimum of d and m, where \(m =\) min\(\{hd(x_{i}G, x_{j}G): 1\le i < j\le k\}\). The method of coset search is to iteratively find coset representatives \(x_{i}\) so that m, while in general less than d, is still reasonably large. The partition and extension method is a way of obtaining constructive lower bounds \(M(n+1, d+1)\) from such bounds for M(nd).

Moving closer to the subject of this paper, we consider a class of optimal constructions which arise through sharply transitive groups. We say that a permutation array A on a set \(\Omega \) of size n is sharply k-transitive on \(\Omega \) if given any two k-tuples \(x_{1}, x_{2}, \cdots , x_{k}\) and \(y_{1}, y_{2}, \cdots , y_{k}\) of distinct elements of \(\Omega \) there exists a unique \(g\in A\) such that \(g(x_{i}) = y_{i}\) for all \(1\le i\le k\). In our applications A will be the set of image strings of a permutation group acting on \(\Omega \). From the bound \(M(n,d)\le \frac{n!}{(d-1)!}\) we have for any positive integer k that \(M(n, n-k+1)\le \frac{n!}{(n-k)!}\). Now if G is a sharply k-transitive group acting on \(\Omega \), then \(|G| = \frac{n!}{(n-k)!}\). Also, in such a G any two distinct elements gh of G can agree in at most \(k-1\) positions, since otherwise \(gh^{-1}\) is a nonidentity element of G fixing at least k elements of \(\Omega \), contrary to sharp k-transitivity. Thus \(hd(G) \ge n-k+1\). So a sharply k-transitive group G implies the existence of an optimal array (the set of image strings of elements of G) realizing \(M(n,n-k+1) = \frac{n!}{(n-k)!}\). The following theorem gives a strong converse to the above, including the generalization to arbitrary arrays that may not be groups.

Theorem 1

[4] Let A be a permutation array on a set of n letters satisfying \(hd(A)\ge n-k+1\). Then \(|A| = \frac{n!}{(n-k)!} = M(n, n-k+1) \Longleftrightarrow A\) is sharply k-transitive on this set.

The sharply k-transitive groups (for \(k\ge 2\)) are known, and these are as follows [6, 12, 28];

  • \(k = 2\): the Affine General Linear Group AGL(1, q) acting on the finite field GF(q), consisting of the transformations \(\{ x\rightarrow ax+b: x, a\ne 0, b\in GF(q) \}\),

  • \(k = 3\): the Projective Linear Group PGL(2, q) acting on \(GF(q)\cup \{\infty \}\), consisting of the transformations \(\{ x\rightarrow \frac{ax+b}{cx+d}: x, a, b, c, d\in GF(q), ad-bc\ne 0\}\),

  • \(k = 4\): the Mathieu group \(M_{11}\) acting on a set of size 11,

  • \(k = 5\): the Mathieu group \(M_{12}\) acting on a set of size 12,

  • arbitrary k: the symmetric group \(S_{k}\) acting on a set of size k is sharply k and \((k-1)\)-transitive, as well as the alternating group \(A_{k}\) acting on a set of size k is sharply \((k-2)\)-transitive ([28], Theorem 7.1.4).

In this paper we obtain new lower bounds on M(nd) for n and d near a prime power. Previous results of this kind are given in [9] where it is shown that for \(n = 2^{k}\) with \(n\not \equiv 1\)(mod 3) we have \(M(n,n-3)\ge (n+2)n(n-1)\) and \(M(n,n-4)\ge \frac{1}{3}n(n-1)(n^{2}+3n+8)\). It is also shown that for any prime power n with \(n\not \equiv 2\)(mod 3) we have \(M(n,n-2)\ge n^{2}\). These results are based on permutation polynomials. Similar such results appearing in [2], are based on a contraction operation applied to permutation arrays defined in the next section. The latter results yield \(M(n-1, n-3)\ge n^{2} - n\) when \(n\not \equiv 1\)(mod 3), \(M(n-2, n-5)\ge n(n-1)\) when \(n\not \equiv 2\)(mod 3) with \(n\not \equiv 0,1\)(mod 5), and \(M(n, n-3)\ge n^{3} - n\) when \(n\not \equiv 1\)(mod 3).

Our goal is to obtain new lower bounds when q is prime power satisfying \(q\equiv 1\) (mod 3); that is, to resolve the case left open in [2], namely, where the hamming distance under contraction decreases by 3. We accomplish this by applying the contraction operation to the permutation arrays AGL(1, q) and PGL(2, q) (acting on GF(q) and on \(GF(q)\cup \{\infty \}\) respectively), obtaining the following constructive lower bounds, where q a prime power satisfying \(q\equiv 1\) (mod 3).

  • (1) for \(q\ge 7\), \(M(q-1,q-3)\ge (q^{2} - 1)/2\) for q odd and \(M(q-1,q-3)\ge (q-1)(q+2)/3\) for q even, and

  • (2) \(M(q,q-3)\ge Kq^{2}log(q)\) for some constant \(K>0\) if q is odd, and

  • (3) bounds for M(nd) for a finite number of exceptional pairs nd, obtained from the Mathieu groups.

We will use standard graph theoretic notation. In particular for a graph G and \(S\subseteq V(G)\) we let \([S]_{G}\) be the graph with vertex set S and edge set \(E([S]_{G}) = \{ xy: x, y \in S, xy\in E(G) \}\), and we call it the graph induced byS. When G is understood by context, we abbreviate \([S]_{G}\) by [S].

1.2 Contraction and the contraction graph

Consider a permutation array A acting on a set \(\Omega = \{ x_{1}, x_{2}, \cdots , x_{n}\}\) of size n, where the elements of \(\Omega \) are ordered by their subscripts. We distinguish some element, say \(x_{n}\), by renaming it F. Thus the image string of any element \(\sigma \in A\) will be \(\sigma (x_{1})\sigma (x_{2})\cdots \sigma (F)\), and we say that \(\sigma (x_{i})\) occurs in position or coordinate\(x_{i}\) of the string. Now for any \(\pi \in A\), define the permutation \(\pi ^{\triangle }\) on \(\Omega \) by

$$\begin{aligned} \pi ^{\triangle }(x) = {\left\{ \begin{array}{ll} \pi (F) &{}\quad \text {if } x = \pi ^{-1}(F),\\ F &{}\quad \text {if } x = F,\\ \pi (x) &{}\quad \text {otherwise.}\\ \end{array}\right. } \end{aligned}$$

Thus the image string of \(\pi ^{\triangle }\) is obtained from the image string of \(\pi \) by interchanging the symbols F and \(\pi (F)\) if \(\pi (F)\ne F\), while \(\pi ^{\triangle }=\pi \) if and only if \(\pi (F) = F\). In either case, \(\pi ^{\triangle }\) has F as its final symbol. We let \(\pi ^{\triangle }_{-}\) be the permutation on \(n-1\) symbols obtained from \(\pi ^{\triangle }\) by dropping the last symbol F from \(\pi ^{\triangle }\). As an example, if \(\pi = aFbcd\), then \(\pi ^{\triangle } = adbcF\), and \(\pi ^{\triangle }_{-} = adbc\). We call the operation \(\pi \rightarrow \pi ^{\triangle }_{-} \)contraction, and we call \(\pi ^{\triangle }_{-}\) the contraction of the permutation\(\pi \). Further, for any subset \(R\subset A\), let \(R^{\triangle } = \{ \pi ^{\triangle }: \pi \in R \}\), and \(R^{\triangle }_{-} = \{ \pi ^{\triangle }_{-}: \pi \in R \}\). So \(R^{\triangle }_{-}\) is a permutation array on the symbol set \(\Omega - \{F\}\) of size \(n-1\), and is called the contraction of R.

We note some basic properties related to the contraction operation.

Lemma 2

Let G be a permutation group acting on the set \(\Omega \) of size n, and let \(\pi ,\sigma \in G\).

  • (a) The only coordinates in either \(\pi \) or \(\sigma \) whose values are affected by the \(\triangle \) operation are \(\pi ^{-1}(F), \sigma ^{-1}(F),\) and F. Thus \(hd(\pi ^{\triangle }, \sigma ^{\triangle })\ge hd(\pi , \sigma ) - 3.\)

  • (b) Assume \(hd(\pi ^{\triangle }, \sigma ^{\triangle }) = hd(\pi , \sigma ) - 3\). Then \(\pi \sigma ^{-1}\) contains a 3-cycle in its disjoint cycle factorization, and |G| is divisible by 3.

  • (c) Let \(S\subseteq G\). Then \(|S^{\triangle }| = |S^{\triangle }_{-}|\) and \(hd(S^{\triangle }) = hd(S^{\triangle }_{-})\). If also \(hd(S) > 3\), then \(|S| = |S^{\triangle }|\).

Proof

Part (a) follows immediately from the definition of the \(\triangle \) operation.

For (b), the assumption implies that there are positions \(x_{i}, x_{j}, F\) at which the image strings of \(\pi \) and \(\sigma \) disagree and \(\pi ^{\triangle }\) and \(\sigma ^{\triangle }\) agree. So for some indices st we must have \(\pi (x_{i}) = x_{s}, \pi (x_{j}) = F, \pi (F) = x_{t}\), while \(\sigma (x_{i}) = F, \sigma (x_{j}) = x_{t}, \sigma (F) = x_{s}\). Then \(\pi \sigma ^{-1}\) (composing left to right) contains the 3-cycle \((x_{i}, F, x_{j})\) in its disjoint cycle factorization. Thus the subgroup of G generated by \(\pi \sigma ^{-1}\) has order divisible by 3, and hence |G| is divisible by 3 by Lagrange’s theorem.

Consider (c). The first two equalities follow from the fact that all image strings in \(S^{\triangle }\) have F as their last coordinate. To see \(|S| = |S^{\triangle }|\) when \(hd(S) > 3\), suppose to the contrary that \(\pi ^{\triangle } = \sigma ^{\triangle }\) for distinct \(\pi , \sigma \in S\). As noted in the proof of part (a), \(\pi ^{\triangle }\) and \(\sigma ^{\triangle }\) can agree in at most three positions where \(\pi \) and \(\sigma \) disagreed. Thus \(\pi \) and \(\sigma \) already agreed in at least \(n-3\) positions. So \(hd(\pi , \sigma )\le 3\), a contradiction. \(\square \)

Consider a permutation array H on n symbols with \(hd(H) \, {=} \, d\). The array \(H^{\triangle }_{-}\) is on \(n{-}1\) symbols and satisfes \(hd(H^{\triangle }_{-})\,{\ge }\, d\,{-}\,3\) by Lemma 2a,c. For the arrays \(H = AGL(1,q), PGL(2,q)\) and certain Mathieu groups, our goal is to find a subset \(I\subset H\) with \(hd(I^{\triangle }_{-})\ge d-2\); that is, a subset I whose contraction \(I^{\triangle }_{-}\) has Hamming distance larger by 1 than the lower bound \(d-3\) for \(hd(H^{\triangle }_{-})\) given by Lemma 2a. The lower bound \(M(n-1, d-2)\ge |I^{\triangle }_{-}|\) follows. Now our underlying arrays H will satisfy \(hd(H) > 3\), so \(hd(I)\ge hd(H) > 3\). Thus by Lemma 2c we have \(hd(I^{\triangle }_{-}) = hd(I^{\triangle })\) and \(|I^{\triangle }_{-}| = |I^{\triangle }| = |I|\). So we get \(M(n-1, d-2)\ge |I|\), yielding the main results of this paper.

To find such a subset I of H, we employ a graph \(C_{H}\) defined as follows.

Definition 3

Let H be a permutation array with \(hd(H) = d\). Define the contraction graph forH, denoted \(C_{H}\), by \(V(C_{H}) = H\) and \(E(C_{H}) = \{ \pi \sigma : \pi , \sigma \in H, hd(\pi ^{\triangle }, \sigma ^{\triangle }) = d-3\}\).

For \(\pi \in C_{H}\), notice that if \(\pi (F) = F\), then \(\pi \) is an isolated point in \(C_{H}\). This is because then \(\pi ^{\triangle } = \pi \), so that for any other \(\sigma \in C_{H}\) we have \(hd(\pi ^{\triangle }, \sigma ^{\triangle }) = hd(\pi , \sigma ^{\triangle }) \ge hd(\pi , \sigma ) - 2\), implying no edge joining \(\pi \) and \(\sigma \) in \(C_{H}\). We thus have the following characterization of edges in \(C_{H}\):

$$\begin{aligned} \pi \sigma \in E(C_{H})\Longleftrightarrow \{ \pi (F)\ne F, \sigma (F)\ne F, \sigma (\pi ^{-1}(F)) = \pi (F), \pi (\sigma ^{-1}(F)) = \sigma (F) \}. \end{aligned}$$
(1)

This condition on edges is illustrated in Fig. 1.

Fig. 1
figure 1

Neighbors \(\pi , \sigma \) in \(C_{H}\); \(\sigma (\pi ^{-1}(F)) = \pi (F)\), and \(\pi (\sigma ^{-1}(F)) = \sigma (F)\)

Since \(hd(H^{\triangle })\ge d-3\) by Lemma 2a, it follows that any independent set I of vertices in \(C_{H}\) must satisfy \(hd(I^{\triangle })\ge d-2\). Now by using Lemma 2a, c (together with \(hd(H) > 3\) for our arrays H) we get \(M(n-1, d-2)\ge |I|\) as explained in the preceding paragraph. We are thus reduced to finding a large independent set in \(C_{H}\) for each of the arrays \(H = AGL(1,q), PGL(2,q)\), and Mathieu groups considered in this paper.

2 The contraction graph for AGL(1, q)

Let q be a prime power. Recall the Affine General Linear Group AGL(1, q) acting as a permutation group on the finite field GF(q) of size q, as the set of transformations \(\{ x\rightarrow ax+b: a\ne 0, x, b\in GF(q) \}\) under the binary operation of composition. For any \(\pi \in AGL(1,q)\) the permutation \(\pi ^{\triangle }\) on GF(q) is defined as in the previous section, based on some ordering \(x_{1}, x_{2}, \cdots , x_{q}\) of the elements of GF(q), where \(F = x_{q}\) is a distinguished element. Clearly \(|AGL(1,q)| = q(q-1)\). By standard facts AGL(1, q) is sharply 2-transitive in this action, and it is straightforward to see that \(hd(AGL(1,q)) = q-1\) (see Lemma 4a for most of that short proof).

Our goal in this section is to obtain a lower bound on \(M(q-1,q-3)\) for prime powers \(q\ge 7\) satisfying \(q\equiv 1(\)mod 3). Our method will involve the contraction graph \(C_{AGL(1,q)}\) for AGL(1, q), which we henceforth abbreviate by \(C_{A}(q)\).

By definition we then have \(V(C_{A}(q)) = AGL(1,q)\), and \(E(C_{A}(q)) = \{ \pi \sigma : hd(\pi ^{\triangle }, \sigma ^{\triangle }) = q - 4 \}\). Following the plan described in the previous section, we find an independent set I in \(C_{A}(q)\). Once we have such an I, then \(I^{\triangle }_{-}\) is a permutation array on \(q-1\) symbols, and by Lemma 2c satisfies \(hd(I^{\triangle }_{-}) = hd(I^{\triangle })\ge q-3\). This implies the lower bound \(M(q-1,q-3) \ge |I^{\triangle }_{-}| = |I^{\triangle }| = |I|\), the last equality by Lemma 2c, since \(q\ge 7\) implies \(hd(I) \ge q-1 > 3\). The actual size of I will then yield our precise lower bound.

We are thus reduced to finding a large independent set I in \(C_{A}(q)\), and from this we get the bound \(M(q-1,q-3)\ge |I|\). We begin on that in the following Lemma, which establishes relations in the finite field GF(q) that correspond to edges in the graph \(C_{A}(q)\).

Lemma 4

Let \(\pi \) and \(\sigma \) be vertices of the graph \(C_{A}(q)\), \(q\equiv 1(\)mod 3), say with \(\sigma (x) = ax + r\) and \(\pi (x) = bx + s\).

  • (a) If \(a\ne b\), then \(hd(\pi , \sigma ) = q - 1\).

  • (b) If \(\pi (F) = F\), then \(\pi \) is an isolated point in \(C_{A}(q)\). There are \(q-1\) points \(\pi \) satisfying \(\pi (F) = F\).

  • (c) Suppose \(\pi \) and \(\sigma \) are neighbors in \(C_{A}(q)\). Then

    • (c1) \(hd(\pi , \sigma ) = q-1\), and \(hd(\pi ^{\triangle }, \sigma ^{\triangle }) = hd(\pi , \sigma ) - 3\), and

    • (c2) \(\frac{a}{b}\) and \(\frac{b}{a}\) are the distinct roots of the quadratic \(t^{2} + t + 1 = 0\) over GF(q).

Proof

For (a), just observe that \(\pi (x) = \sigma (x)\) has the unique solution \(x = \frac{s - r}{a - b}\).

For the first claim in (b) suppose not, and let \(\sigma \) be a neighbor of \(\pi \) in \(C_{A}(q)\). Then we have \(hd(\pi ^{\triangle }, \sigma ^{\triangle }) = q - 4\), implying also that \(hd(\pi , \sigma ) = q - 1\) by Lemma 2a. Let i be the coordinate of agreement between \(\pi \) and \(\sigma \). Since \(\pi (F) = F\), we have \(\pi ^{\triangle } = \pi \). Thus \(hd(\pi , \sigma ^{\triangle }) = q - 4\). Now \(\sigma ^{\triangle }\) can have at most two coordinates, apart from i, in which it agrees with \(\pi \), these being F and \(j = \sigma ^{-1}(F)\). So altogether \(\pi \) and \(\sigma ^{\triangle }\) agree in at most the 3 coordinates ijF. So \( q - 4 = hd(\pi , \sigma ^{\triangle }) \ge q-3\), a contradiction.

Now consider the second claim in (b). Since \(\pi (F) = F\), we have \(q-1\) choices for the value \(\pi (i)\) for any fixed \(i\in GF(q)\), \(i\ne F\). Hence there are \(q-1\) choices for the ordered pair \(( \pi (F),\pi (i) )\), each such choice determining \(\pi \) uniquely by the sharp 2-transitivity of AGL(1, q) acting on GF(q). The claim follows.

For c1), by the definition of edges in \(C_{A}(q)\) we have \(q-4 = hd(\pi ^{\triangle }, \sigma ^{\triangle }) \ge hd(\pi , \sigma ) - 3\) using Lemma 2a. Since \(hd(\pi , \sigma ) = q\) or \(q-1\), it follows that \(hd(\pi , \sigma ) = q-1\) and we have equality throughout, as required.

Consider c2). By part c1) we have \(hd(\pi , \sigma )= q-1\) and \(hd(\pi ^{\triangle }, \sigma ^{\triangle }) = hd(\pi , \sigma ) - 3\). So there are distinct \(\alpha , \beta \in GF(q)\), with neither \(\alpha \) nor \(\beta \) being F, such that \(\sigma (F) = i\), \(\sigma (\alpha ) = F\), and \(\sigma (\beta ) = j\), and \(\pi (F) = j\), \(\pi (\alpha ) = i\), and \(\pi (\beta ) = F\) for distinct \(i,j\in GF(q)\). This gives the following set of equations in GF(q).

$$\begin{aligned} \left\{ \begin{array}{rcl} \sigma (\alpha )-\sigma (\beta )=&{} F - j &{}= a(\alpha -\beta ) \\ \sigma (\alpha )-\sigma (F) =&{}F - i &{}= a(\alpha - F) \\ \pi (\alpha )-\pi (\beta ) =&{}i - F &{}= b(\alpha -\beta ) \\ \pi (\alpha )-\pi (F) =&{}i - j &{}= b(\alpha - F).\\ \end{array}\right. \end{aligned}$$
(2)

The second and third equations of (2) imply

$$\begin{aligned} a(\alpha -F)=-b(\alpha -\beta ). \end{aligned}$$
(3)

Now starting with the first equation of (2) we obtain

$$\begin{aligned} a(\alpha -\beta )&= F - j \\&= (F - i)+(i-j) \\&= (a+b)(\alpha -F) \quad \text {(by the second and fourth equations of (2))}. \end{aligned}$$

Multiplying Eq. (3) by a and the last equation by b, we obtain the equations

$$\begin{aligned} \left\{ \begin{array}{rl} a^2(\alpha -F) &{}= -ab(\alpha -\beta ) \\ ab(\alpha -\beta ) &{}= b(a+b)(\alpha -F) .\\ \end{array} \right. \end{aligned}$$
(4)

Thus \(a^2(\alpha -F) = -b(a+b)(\alpha -F)\), and on dividing by \(\alpha -F\) (since \(\alpha \ne F\)) we obtain

$$\begin{aligned} a^2+b(a+b)=0. \end{aligned}$$
(5)

Dividing Eq. (5) by \(a^2\) or by \(b^2\), we obtain that a / b and b / a are both roots of the equation \(t^2+t+1=0\).

We show that a / b and b / a are distinct. Assuming otherwise, then \(a/b = 1\) or \(-1\). If q is even then from \(1+ t + t^{2} = 0\) we get the contradiction \(1 = 0\) since the characteristic is 2. Now assume q is odd. If \(a/b = 1\), then we get \(1+1+1=0\), forcing \(q\equiv 0(\)mod 3), a contradiction. If \(a/b = -1\), then we get \(1=0\), again a contradiction. \(\square \)

Let \(t_{1}\) and \(\frac{1}{t_{1}}\) be the distinct roots of \(t^{2} + t + 1 = 0\) in GF(q) for \(q\equiv 1 \pmod 3\) (by Corollary 23a). Let \(\pi \in C_{A}(q)\) with \(\pi (x) = ax + r\), and let \(\sigma \) be a neighbor of \(\pi \) in \(C_{A}(q)\). Then by Lemma 4c2 we have \(\sigma (x) = at_{1}x + s_{1}\) or \(\sigma (x) = a\frac{1}{t_{1}}x + s_{2}\), so far with \(s_{1}\) and \(s_{2}\) undetermined. The next lemma shows that \(s_{1}\) and \(s_{2}\) are uniquely determined by \(\pi \) and \(t_{1}\).

Lemma 5

Let q be a prime power with \(q\equiv 1 \pmod 3\). Suppose \(\pi \) is not an isolated point of \(C_{A}(q)\), say with \(\pi (x) = ax + r\). Let \(t_{1}\) be a root of \(t^{2} + t + 1 = 0\) in GF(q). Then the neighbors of \(\pi \) in \(C_{A}(q)\) are \(\sigma _{1}\) and \(\sigma _{2}\), given by \(\sigma _{1}(x) = at_{1}x + (a - t_{1})F + r(1 + t_{1})\) and \(\sigma _{2}(x) = a\frac{1}{t_{1}}x + (a - \frac{1}{t_{1}})F + r(1 + \frac{1}{t_{1}})\). In particular, each non-isolated point of \(C_{A}(q)\) has degree 2 in \(C_{A}(q)\).

Proof

Let \(N(\pi )\) be the set of neighbors of \(\pi \) in \(C_{A}(q)\). First we verify that \(\sigma _{1}, \sigma _{2}\in N(\pi )\), giving details only for \(\sigma _{1}\in N(\pi )\) as the containment \(\sigma _{2}\in N(\pi )\) is proved similarly. To do this, we show that all conditions of (1) are satisfied with \(\sigma _{1}\) playing the role of \(\sigma \). Clearly \(\pi (F)\ne F\) since \(\pi \) is not isolated. To show \(\sigma _{1}(F)\ne F\), assume not. Suppose first that \(a\ne 1\). Then \(\sigma _{1}(F) = F\) yields \(F = \frac{r}{1-a}\). But now we get \(\pi (F) = \frac{ar}{1-a} + r = \frac{r}{1-a} = F\), a contradiction. Next suppose \(a = 1\) so \(\pi (x) = x + r\). Then \(\sigma _{1}(F) = F\) together with \(a = 1\) yields \(r(1+t_{1}) = 0\). Combining this with \(t_{1}\ne -1\) yields \(r = 0\). But then \(\pi (F) = F\), a contradiction.

So it remains to show that \(\sigma _{1}(\pi ^{-1}(F)) = \pi (F)\) and \(\pi (\sigma _{1}^{-1}(F)) = \sigma _{1}(F)\). For the first equality, solving \(ax + r = F\) we obtain \(\pi ^{-1}(F) = \frac{F-r}{a}\). Thus \(\sigma _{1}(\pi ^{-1}(F)) = at_{1}\big (\frac{F-r}{a} \big ) + (a-t_{1})F + r(1 + t_{1}) = aF + r = \pi (F)\), as required. For the second equality, from the formula for \(\sigma _{1}\) we obtain \(\sigma _{1}^{-1}(F) = \frac{1}{at_{1}}\big ( F(1-a+t_{1}) - r(1+t_{1})\big )\). Plugging this into \(\pi \) and simplifying, we obtain \(\pi (\sigma _{1}^{-1}(F)) = \frac{1}{t_{1}}\big ( F(1-a+t_{1}) - r(1+t_{1})\big ) + r\). Working backwards from the equality \(\pi (\sigma _{1}^{-1}(F)) = \sigma _{1}(F)\) we must show that \(\frac{1}{t_{1}}\big ( F(1-a+t_{1}) - r(1+t_{1})\big ) = F\big (a-t_{1}+at_{1}\big ) + rt_{1}.\) This is equivalent to \(F\big (1-a+t_{1} \big ) = r(t_{1}^{2}+t_{1}+1) + F(at_{1}-t_{1}^{2}+at_{1}^{2}) = F(at_{1}-t_{1}^{2}+at_{1}^{2})\). We are now reduced to showing \(1-a+t_{1} = at_{1}-t_{1}^{2}+at_{1}^{2}.\) This follows from \(t_{1}^{2} + t_{1} + 1 = 0\).

Now let \(\sigma \in N(\pi )\), and we show that \(\sigma = \sigma _{1}\) or \( \sigma _{2}\). By Lemma 4, we know that \(\sigma (x) = at_{1}x + s_{1}\) or \(\sigma (x) = a\frac{1}{t_{1}}x + s_{2}\) for suitable \(s_{1}, s_{2}\in GF(q)\). Suppose first that \(\sigma (x) = at_{1}x + s_{1}\). Applying the equality \(\sigma (\pi ^{-1}(F)) = \pi (F)\) together with \(\pi ^{-1}(F) = \frac{F-r}{a}\), we get \(at_{1}\big (\frac{F-r}{a}\big ) + s_{1} = aF + r\). so \(s_{1} = (a-t_{1})F + r(1+t_{1})\). Thus \(\sigma = \sigma _{1}\). A very similar argument shows that if \(\sigma (x) = a\frac{1}{t_{1}}x + s_{2}\), then \(s_{2} = (a-\frac{1}{t_{1}})F + r(1+ \frac{1}{t_{1}})\), and thus \(\sigma = \sigma _{2}\). So we have \(N(\pi ) = \{\sigma _{1}, \sigma _{2}\}\), completing the proof.

Consider the subgroup \(Q = \{ x + b: b\in GF(q)\}\) of AGL(1, q). Clearly \(|Q| = q\), and for each \(h\in GF(q), h\ne 0\), Q has the coset \(hxQ = \{ hx + b: b\in GF(q) \}\), which we abbreviate by \(Q_{h}\).

Theorem 6

Let q be a prime power with \(q\equiv 1 \pmod 3\). Then the connected components of \(C_{A}(q)\) are as follows.

  • (a) There are \(q-1\) isolated points, these being the points \(\pi \) satisfying \(\pi (F) = F\).

  • (b) If q is odd, then each non-isolated point component is a cycle of length 6.

  • (c) If q is even, then each non-isolated point component is a cycle of length 3.

Proof

For part (a), we show that \(\pi \in C_{A}(q)\) is an isolated point if and only if \(\pi (F) = F\). Then (a) follows by Lemma 4b.

If \(\pi (F) = F\), then immediately \(\pi \) is isolated in \(C_{A}(q)\) by Lemma 4b. For the converse, suppose to the contrary that \(\pi \) is isolated and that \(\pi (F)\ne F\). Let \(\sigma _{1}\) be given by \(\sigma _{1}(x) = at_{1}x + (a - t_{1})F + r(1 + t_{1})\) as in Lemma 5. Then the proof of Lemma  5, starting from the established claim \(\pi (F)\ne F\) (this claim being an assumption here), shows that \(\sigma _{1}\) is a neighbor of \(\pi \) in \(C_{A}(q)\). This contradicts \(\pi \) being isolated.

Consider part (b). By Lemma 5 each nontrivial component of \(C_{A}(q)\) is a cycle. Let \(\pi _{0}\) be a point on such a cycle C, say with \(\pi _{0}\in Q_{a}\). Let \(t_{1}\) be a fixed root of \(t^{2} + t + 1 = 0\). Consider a sequence of 4 vertices \(\pi _{0}\pi _{1}\pi _{2}\pi _{3}\) on C with \(\pi _{j}\pi _{j+1}\in E(C_{A}(q))\) for \(0\le j\le 2\). We may suppose that \(\pi _{j}\in Q_{at_{1}^{j}}\) by Lemma 5 and straightforward induction (otherwise replace \(t_{1}\) by \(\frac{1}{t_{1}}\)). Thus \(\pi _{0}, \pi _{1}, \pi _{2}\) are distinct since they belong to distinct cosets of Q. Since \(t_{1}^{3} = 1\), we see also that \(\pi _{0}\) and \(\pi _{3}\) belong to the same coset \(Q_{a}\) of Q. We now show that \(\pi _{0}\ne \pi _{3}\). Writing \(\pi _{1}(x) = bx + c\) (so \(b = at_{1}\)), we apply the first and third equations of (2) with \(\pi _{0}\) and \(\pi _{1}\) playing the roles of \(\sigma \) and \(\pi \) respectively, to get \(t_{1} = \frac{b}{a} = -\big (\frac{i-F}{j-F}\big ) = -\big (\frac{\pi _{0}(F)-F}{\pi _{1}(F)-F}\big )\). Applying this equation two more times we get \(1 = t_{1}^{3} = -\big (\frac{\pi _{0}(F)-F}{\pi _{3}(F)-F}\big )\), so that \(\big (\frac{\pi _{0}(F)-F}{\pi _{3}(F)-F}\big ) = -1\). Thus \(\pi _{0}(F)\ne \pi _{3}(F)\) , so \(\pi _{0}\ne \pi _{3}\). Thus each cycle component has length at least 4.

Consider now a sequence of 7 vertices \(\pi _{0}\pi _{1}\cdots \pi _{6}\) on C with \(\pi _{j}\pi _{j+1}\in E(C_{A}(q))\) for \(0\le j\le 5\). We claim the first 6 of these \(\pi _{0}, \pi _{1}, \cdots , \pi _{5}\) must be distinct as follows. Clearly any two vertices \(\pi _{j}, \pi _{j+3}\) are distinct, \(0\le j\le 2\), by the same argument that showed \(\pi _{0}\ne \pi _{3}\). But any two vertices \(\pi _{i}, \pi _{j}\) with \(i\not \equiv j (\)mod 3) are distinct, since \(t_{1}\ne 1\) and \(t_{1}^{2}\ne 1\) imply that they belong to different cosets of Q, proving the claim. Finally note that \(1 = t_{1}^{6} = \big (\frac{\pi _{0}(F)-F}{\pi _{6}(F)-F}\big )\), so that \(\pi _{0}(F) = \pi _{6}(F)\). Since also \(\pi _{0}\) and \(\pi _{6}\) also belong to the same coset \(Q_{a}\) of Q, it follows that \(\pi _{0} = \pi _{6}\). Thus the component C containing \(\pi _{0}\) is a cycle of length 6, as required.

Now consider part (c). Consider as above the sequence of 4 vertices \(\pi _{0}\pi _{1}\pi _{2}\pi _{3}\) in a nontrivial component, with \(\pi _{j}\pi _{j+1}\in E(C_{A}(q))\) for \(0\le j\le 2\). We get \(\big (\frac{\pi _{0}(F)-F}{\pi _{3}(F)-F}\big ) = -1 = 1\) since q is even. So since also \(\pi _{0}\) and \(\pi _{3}\) belong to the same coset \(Q_{a}\) of Q, it follows that \(\pi _{0} = \pi _{3}\). Thus the cycle containing \(\pi _{0}\) has length 3. \(\square \)

Corollary 7

Let q be a prime power with \(q\equiv 1 \pmod 3\) and \(q\ge 7\). Then

  • (a) if q is odd, then \(M(q-1, q-3)\ge (q^2-1)/2\), and

  • (b) if q is even, then \(M(q-1, q-3)\ge (q-1)(q+2)/3\).

Proof

For part (a) we form an independent set I in \(C_{A}(q)\) by taking 3 independent points in each cycle component of length 6, together with the set Y of isolated points. Then \(M(q-1, q-3)\ge |Y| + \frac{1}{2}( |C_{A}(q) - Y| ) = q-1 + \frac{1}{2}\big (q(q-1) - (q-1)\big ) = (q^2-1)/2\) , as required.

For part (b), we form an independent set I in \(C_{A}(q)\) by taking one point from each length 3 cycle component, together with the set Y of isolated points. We then have \(M(q-1, q-3)\ge |Y| + \frac{1}{3}( |C_{A}(q) - Y| ) = (q-1)(q+2)/3\). \(\square \)

The best previously known lower bound for \(M(q-1, q-3)\), for q a prime power with \(q\equiv 1 \pmod 3\), of which we are aware is derived from a general lower bound for \(M(n,n-2)\) for arbitrary n based on MOLS. Recall the lower bound [8] from the Introduction \(M(n,n-1)\ge nN(n)\) for any integer n, from which we obtain \(M(n,n-2)\ge M(n,n-1)\ge nN(n).\) It is known [7] that for n sufficiently large we have \(N(n)\ge n^{\frac{1}{14.8}}\), implying that for arbitrary n we have \(M(n,n-2)\ge n^{1 + \frac{1}{14.8} }\). Therefore our result \(M(q-1,q-3)\ge Kq^{2}\) (for some constant K) from Corollary 7, for q as above, significantly improves on the previously known \(M(q-1,q-3)\ge (q-1)^{1 + \frac{1}{14.8} }\).

We also mention other lower bounds for \(M(n,n-2)\), though not for the \(n = q-1\) (with q a prime power) of our result, and so not directly comparable to ours.

  1. (1)

    \(M(n,n-2)\ge n^{2}\) for prime powers \(n \not \equiv 2\)(mod 3) using permutation polynomials [9] as mentioned in the Introduction.

  2. (2)

    \(M(n,n-2) = |PGL(2,q)| = q^{3} - q\) for \(n = 1+q\), where q is a prime power, as an immediate consequence of the sharp 3-transitivity of PGL(2, q) and Theorem 1.

  3. (3)

    \(M(n,n-2)\ge n^{3} - n\) when n is a power of a prime and \(n \not \equiv 1\)(mod 3), using contraction applied to PGL(2, n) [2].

3 The contraction graph for PGL(2, q)

Let q be a power of a prime. The permutation group PGL(2, q) is defined as the set of one to one functions \(\sigma : GF(q)\cup \{\infty \} \rightarrow GF(q)\cup \{\infty \}\), under the binary operation of composition, given by

$$\begin{aligned} \left\{ ~ \sigma (x) = \frac{ax+b}{cx+d} ~:~a,b,c,d \in GF(q), ad \ne bc, x \in GF(q) \cup \{\infty \}~ \right\} . \end{aligned}$$
(6)

Here \(\sigma (x)\) is computed by the rules:

  1. 1.

    if \(x \in GF(q)\) and \(x\ne -(d/c)\), then \(\sigma (x) = \frac{ax+b}{cx+d} \),

  2. 2.

    if \(x \in GF(q)\) and \(x = -(d/c)\), then \(\sigma (x) = \infty \),

  3. 3.

    if \(x = \infty \), and \(c \ne 0\), then \(\sigma (x) = a/c\), and

  4. 4.

    if \(x = \infty \), and \(c = 0\), then \(\sigma (x) = \infty \).

We regard PGL(2, q) as a permutation group acting on the set \(GF(q)\cup \{\infty \}\) of size \(q+1\) via the one to one map \(x\mapsto \sigma (x)\). One can show that \(|PGL(2,q)| = (q+1)q(q-1)\), and it is well known that PGL(2, q) is sharply 3-transitive in its action on \(GF(q)\cup \{\infty \}\) (see [30] for a proof). It is straightforward to verify that \(hd(PGL(2,q)) = q-1\), and by Theorem 1 we have \(M(q+1, q-1) = |PGL(2,q)| = (q+1)q(q-1)\).

Take a fixed ordering of \(GF(q)\cup \{\infty \}\) with \(\infty \) as final symbol, say \(x_{1}, x_{2}, \cdots , x_{q}, \infty \) where the \(x_{i}\) are the distinct elements of GF(q). Then any element \(\pi \in PGL(2,q)\) is identified with the length \(q+1\) string \(\pi (x_{1})\pi (x_{2})\pi (x_{3})\cdots \pi (x_{q})\pi (\infty )\), which again we call the image string of \(\pi \). For any such \(\pi \in PGL(2,q)\) the permutation \(\pi ^{\triangle }\) on \(GF(q)\cup \{\infty \}\) is defined as in the section introducing contraction, where \(F = \infty \) is the distinguished element of \(GF(q)\cup \{\infty \}\) in that definition. As an example, if \(\pi = a\infty bcde\), then \(\pi ^{\triangle } = aebcd\infty \), and \(\pi ^{\triangle }_{-} = aebcd\). In the same way, for any subset \(R\subset PGL(2,q)\), the sets \(R^{\triangle }\), and \(R^{\triangle }_{-}\) are defined as in that section, with \(F = \infty \). Since \(hd(PGL(2,q)) = q-1 = q+1-2\), the image strings of any two elements of PGL(2, q) agree in at most two positions. It follows from Lemma 2a that for any \(\pi , \sigma \in PGL(2,q)\) we have \(hd(\pi ^{\triangle }, \sigma ^{\triangle })\ge hd(\pi , \sigma ) - 3 \ge q - 4\). That is, \(\pi ^{\triangle }\) and \( \sigma ^{\triangle }\) can agree in at most 5 positions; up to 2 occurring from the original \(\pi \) and \(\sigma \), and up to 3 more occurring from the \(\pi ^{\triangle }\) and \(\sigma ^{\triangle }\) operation.

As noted earlier, lower bounds for \(M(q,q-3)\) and \(M(q,q-4)\) when \(q\not \equiv 1(\)mod 3) based on permutation polynomials are known [9]. Thus in this section, we restrict ourselves to the case \(q\equiv 1(\)mod 3), q an odd prime power, where such bounds are not known. For technical reasons we take \(q\ge 13\).

The plan will be similar in some respects to the one we used in the previous section. That is, for a certain set \(I\subset PGL(2,q)\) we will find a permutation array \(I^{\triangle }_{-} \subset PGL(2,q)^{\triangle }_{-} \) on q symbols with \(hd(I^{\triangle }_{-})\ge q-3 \), thus obtaining the lower bound on \(M(q,q-3)\ge |I^{\triangle }_{-} |\). This set I will be an independent set in the contraction graph \(C_{PGL(2,q)}\) for PGL(2, q), which we abbreviate by \(C_{P}(q)\).

Since \(hd(PGL(2,q)) = q-1\), \(C_{P}(q)\) is given by \(V(C_{P}(q)) = PGL(2,q)\), and \(E(C_{P}(q)) = \{ \pi \sigma : hd(\pi ^{\triangle }, \sigma ^{\triangle }) = q - 4 \}\). So edges \(\pi \sigma \) of \(C_{P}(q)\) correspond to pairs \(\pi , \sigma \in PGL(2,q)\) for which \(hd(\pi ^{\triangle }, \sigma ^{\triangle })\) achieves its least possible value of \(q-4\), occurring when \(\pi ^{\triangle }\) and \(\sigma ^{\triangle }\) agree in 5 positions, so consequently \(hd(\pi ^{\triangle }, \sigma ^{\triangle }) = hd(\pi , \sigma ) - 3\). Thus a set \(I\subseteq V(C_{P}(q))\) is independent in \(C_{P}(q)\) if and only if it satisfies \(hd(I^{\triangle })\ge q-3\). By Lemma 2c, we get \(hd(I^{\triangle }_{-}) = hd(I^{\triangle })\ge q-3\), while \(|I^{\triangle }_{-}| = |I^{\triangle }| = |I|\), with the last equality following from \(hd(I^{\triangle }) = q-3 > 3\) since \(q\ge 13\).

We are thus reduced to finding an independent set I in \(C_{P}(q)\), from which \(M(q, q-3)\ge |I|\) follows.

To do this, it will be useful to represent functions in PGL(2, q) in a form different than the standard \(\frac{ax+b}{cx+d}\) form.

Definition 8

Fix a prime power q.

  • 1. Let \(K, r, i\in GF(q)\) with \(r\ne 0\). Define the function \(f:GF(q)\cup \{\infty \}\rightarrow GF(q)\cup \{\infty \}\) by \(f(x) = K + \frac{r}{x-i}\) for \(x\notin \{ i, \infty \}\), while \(f(\infty ) = K\) and \(f(i) = \infty \).

  • 2. Let \(P = \{ K + \frac{r}{x-i}: K,r,i\in GF(q), r\ne 0 \}\) be the set of all functions defined in 1.

  • 3. Let \(N\subset PGL(2,q)\) be given by \(N = \{ \pi \in PGL(2,q): \pi (x) = \frac{ax+b}{cx+d}, c\ne 0 \}\).

We will now see that P is the same set of functions as N.

Lemma 9

Let the map \(\alpha : N\rightarrow P\) be defined as follows. For any \(\pi \in N\) with \(\pi (x) = \frac{ax+b}{cx+d}\), let \(\alpha (\pi )\in P\) be given by \(\alpha (\pi )(x) = \frac{a}{c} + \frac{\frac{bc-ad}{c^{2}}}{x + \frac{d}{c}}.\) Then

  • (a) \(\pi \) and \(\alpha (\pi )\) are the same function on \(GF(q)\cup \{\infty \}\).

  • (b) \(|P| = |N| = q^{2}(q-1)\).

  • (c) The map \(\alpha \) is one to one and onto.

Proof

For (a), straightforward manipulation shows that for \(x\ne -\frac{d}{c}\) we have \(\pi (x) = \frac{a}{c} + \frac{\frac{bc-ad}{c^{2}}}{x + \frac{d}{c}} = \alpha (\pi )(x)\). Also by definition \(\alpha (\pi )(\infty ) = \frac{a}{c} = \pi (\infty )\) and \(\alpha (\pi )(-\frac{d}{c}) = \infty = \pi (-\frac{d}{c})\). so \(\pi \) and \(\alpha (\pi )\) are the same function.

Consider (b). Clearly \(|P| = q^{2}(q-1)\) since there are \(q-1\) choices for r, and q choices for each of K and i, independent of each other. To show \(|N| = q^{2}(q-1)\), observe first that for any \(\pi \in PGL(2,q)\) with \(\pi (x) = \frac{ax+b}{cx+d}\) we have \(c = 0 \Leftrightarrow \pi (\infty ) = \infty \). The \(\Rightarrow \) direction is immediate by definition. To see \(\Leftarrow \), assume \(c\ne 0\). Then \(\pi (\infty ) = \frac{a}{c} \ne \infty \), completing the proof of the observation. Next we have \(\pi (\infty ) = \infty \Leftrightarrow \pi (x) = Ax + B\in AGL(1,q)\) for all x for suitable \(A\ne 0,B\in GF(q)\), by definition of computing in PGL(2, q). Thus we have \(|N| = |PGL(2,q)| - |AGL(2,q)| = (q+1)q(q-1) - q(q-1) = q^{2}(q-1)\).

Part (c) is immediate from parts (a) and (b), since any two distinct elements of N are distinct as functions. As an alternative (constructive) proof, let \(f(x) = K + \frac{r}{x-i}\in P\) be given. Then for \(\pi (x) = \frac{Kx + r-iK}{x - i}\in N\) we have \(\alpha (\pi ) = f\). Thus \(\alpha \) is onto, and since \(|P| = |N|\), \(\alpha \) is also one to one. \(\square \)

We now see how the above observations, together with results which come later, reduce the study of \(C_{P}(q)\) to the set P of permutations. It was shown above that for \(\pi \in PGL(2,q)\), we have \(\pi (\infty ) = \infty \Leftrightarrow c = 0\). By condition (1) (with \(\infty = F\)) we see that \(\pi (\infty ) = \infty \) implies that \(\pi \) is an isolated point in \(C_{P}(q)\), and we will see later that for \(C_{P}(q)\) the converse is also true. So to study the structure of \(C_{P}(q)\) apart from its isolated points, we are reduced to studying its subgraph induced by the permutations in N. By the bijection \(\alpha : N\leftrightarrow P\), under which \(\pi \in N\) and \(\alpha (\pi )\in P\) are the same permutation on \(GF(q)\cup \{\infty \}\), we are then reduced to studying P.

Lemma 10

Let \(\pi , \sigma \in P\) with \(\pi (x) = a + \frac{r}{x-i}\), \(\sigma (x) = b + \frac{s}{x-j}\) with \(r,s\ne 0\). Then \(hd(\pi ^{\triangle }, \sigma ^{\triangle }) = hd(\pi ,\sigma ) - 3 \Longleftrightarrow (b-a)(j-i) = r\) and \(r = s\) .

Proof

\(\Longrightarrow \) : By assumption we have \(\pi (\infty ) = a\) and \(\pi (i) = \infty \), together with \(\sigma (j) = \infty \) and \(\sigma (\infty ) = b\). By Lemma 2a the only coordinates of either \(\pi \) or \(\sigma \) whose values are affected by the \(\triangle \) operation are the 3 coordinates \(\pi ^{-1}(\infty ) = i, \sigma ^{-1}(\infty ) = j\), and \(\infty \). So the assumption \(hd(\pi ^{\triangle }, \sigma ^{\triangle }) = hd(\pi ,\sigma ) - 3\) implies that \(\sigma (i) = a\) and \(\pi (j) = b\). Thus we get \(\pi (j) = a + \frac{r}{j-i} = b\), yielding \((b-a)(j-i) = r\) as required. Now interchanging the roles of \(\pi \) and \(\sigma \) in this argument, specifically, using \(\sigma (i) = b + \frac{s}{i-j} = a\), we get \((a-b)(i-j) = s\), so also \(r = s\).

\(\Longleftarrow \) : Again by assumption we have \(\pi (\infty ) = a\), \(\sigma (\infty ) = b\), \(\pi (i) = \infty \), \(\sigma (j) = \infty \), and \((b-a)(j-i) = r\) . To prove \(hd(\pi ^{\triangle }, \sigma ^{\triangle }) = hd(\pi ,\sigma ) - 3\), it remains only to show that \(\pi (j) = b\) and \(\sigma (i) = a\). For simplicity we let \(r = s = 1\), since the argument does not depend on \(r = s\). Solving for b in \((b-a)(j-i) = 1\) we get \(b = \frac{1}{j-i} + a = \pi (j)\). Solving for a we get \(a = b - \frac{1}{j-i} = b + \frac{1}{i - j} = \sigma (i)\), as required. \(\square \)

Lemma 11

Let \(q = p^{m}\), where p is an odd prime, with \(q\equiv 1(\)mod 3), \(q\ge 13\). Let \(\pi , \sigma \in P\), say with \(\pi (x) = a + \frac{r}{x-i}\), \(\sigma (x) = b + \frac{s}{x-j}\), with \(r,s\ne 0\). Then \(hd(\pi ^{\triangle }, \sigma ^{\triangle }) = hd(\pi ,\sigma ) - 3 \Longleftrightarrow \pi \sigma \in E(C_{P}(q))\).

Proof

\(\Longleftarrow \): By definition of edges in \(C_{P}(q)\) and Lemma 2a we have \(q - 4 = hd(\pi ^{\triangle }, \sigma ^{\triangle })\ge hd(\pi , \sigma ) -3\). Now since \(q-1\le hd(\pi , \sigma )\le q+1\), equality is forced together with \(hd(\pi , \sigma ) = q-1\). This yields \(hd(\pi ^{\triangle }, \sigma ^{\triangle }) = hd(\pi ,\sigma ) - 3\).

\(\Longrightarrow \) : By the assumption \(hd(\pi ^{\triangle }, \sigma ^{\triangle }) = hd(\pi ,\sigma ) - 3\) and \(hd(\pi , \sigma )\ge q-1\) we are reduced to showing that \(hd(\pi , \sigma ) = q-1\); that is, that \(\pi \) and \(\sigma \) already agree in two coordinates.

By assumption and Lemma 10 we have \(r=s\), so write \(\pi (x) = a + \frac{r}{x-i}\) and \(\sigma (x) = b + \frac{r}{x-j}\), for \(a, b, i, j, k \in GF(q)\) with \(r\ne 0\). Note also \(i\ne j\), since otherwise by Lemma 10 we get \(r=0\), a contradiction.

We now derive a quadratic equation over GF(q) whose distinct roots are the coordinates of agreement between \(\pi \) and \(\sigma \). Since \(hd(\pi ^{\triangle }, \sigma ^{\triangle }) = hd(\pi ,\sigma ) - 3\), by Lemma 10 we have \((b - a)(j - i) = r\). Thus \(b = \frac{r}{j - i} + a\). Now we set \(\pi (x) = \sigma (x)\) to find the possible coordinates x at which \(\pi \) and \(\sigma \) agree, understanding that x can be neither i nor j since \(\pi \) and \(\sigma \) can have no agreements in any of the coordinates \(i = \pi ^{-1}(\infty ),j = \sigma ^{-1}(\infty )\), or \(\infty \) by Lemma 2a. Substituting \(\frac{r}{j - i} + a\) for b and simplifying we obtain \(\frac{1}{x-i} - \frac{1}{x-j} = \frac{1}{j-i}.\) Hence \(\frac{i - j}{(x - i)(x - j)} = \frac{1}{j-i}\), and we get the quadratic \(x^{2} - (i + j)x + ij + (i - j)^{2} = 0.\) By Corollary 23b there are two distinct roots to this equation, giving the two coordinates of agreement for \(\pi \) and \(\sigma \) as follows; \(x_{1} = \frac{1}{2}[i(1+\sqrt{-3}) + j(1-\sqrt{-3})]\), and \(x_{2} = \frac{1}{2}[i(1-\sqrt{-3}) + j(1+\sqrt{-3})]\).

Hence by our reduction at the beginning of the proof it follows that \(\pi \sigma \in E(C_{P}(q))\), as required. \(\square \)

The preceding two Lemmas yield the following.

Corollary 12

Let \(q = p^{m}\), where p is an odd prime, with \(q\equiv 1(\)mod 3), \(q\ge 13\).

  • (a) Let \(\pi , \sigma \in P\), say with \(\pi (x) = a + \frac{r}{x-i}\), \(\sigma (x) = b + \frac{s}{x-j}\), \(r,s\ne 0\). Then \(\pi \sigma \in E(C_{P}(q)) \Longleftrightarrow r = s\) and \((b-a)(j-i) = r\).

  • (b) \(\pi \in PGL(2,q)\) is an isolated point in \(C_{P}(q) \Longleftrightarrow \pi (\infty ) = \infty \).

Proof

Part (a) follows immediately from Lemmas 10 and 11.

For part (b), suppose first that \(\pi (\infty ) = \infty \). Then immediately \(\pi \) is isolated in \(C_{P}(q)\) by the equivalence (1) (with \(\infty = F\)) applicable to any contraction graph.

Conversely, suppose to the contrary that \(\pi \) is isolated in \(C_{P}(q)\) and \(\pi (\infty ) = x\ne \infty \). Let \(i = \pi ^{-1}(\infty )\), and let j be any coordinate with \(j\notin \{ i, \infty \}\), and let \(\pi (j) = y\). Then by sharp 3-transitivity of PGL(2, q) we can find an element \(\sigma \in PGL(2,q)\) satisfying \(\sigma (j) = \infty \), \(\sigma (i) = x\), and \(\sigma (\infty ) = y\). Then we get \(hd(\sigma ^{\triangle }, \pi ^{\triangle }) = hd(\sigma , \pi ) - 3\). So by Lemma 11 we have \(\pi \sigma \in E(C_{P}(q))\), contradicting \(\pi \) being isolated. \(\square \)

The next two theorems, which use the preceding Corollary, tell us more about \(C_{P}(q)\). For \(S\subset C_{P}(q)\), recall that [S] is the subgraph of \(C_{P}(q)\) induced by S. When r is fixed by context, we denote a vertex \(\pi \in C_{P}(q)\), \(\pi \in P\), with \(\pi (x) = a + \frac{r}{x-i}\), by the abbreviation (ia).

Consider the partition of P given by \(P = \cup _{r\ne 0}P_{r}\), where for \(r\in GF(q)\) with \(r\ne 0\), \(P_{r} = \{ a + \frac{r}{x-i} : a, i \in GF(q) \}\), so \(|P_{r}| = q^{2}\). Further consider the partition of \(P_{r}\) given by \(P_{r} = \cup _{i\in GF(q)}B_{i}(r)\), where \(B_{i}(r) = \{ a + \frac{r}{x-i} : a \in GF(q) \}\).

Theorem 13

Let \(q = p^{m}\), where p is an odd prime, with \(q\equiv 1(\)mod 3), \(q\ge 13\). Then the following hold in the graph \(C_{P}(q)\).

  • (a) For any \(r\ne s\), \(r,s\ne 0\), we have \([P_{r}] \cong [P_{s}]\).

  • (b) For any \(r\ne 0\) and \(i\ne j\), \([B_{i}(r)\cup B_{j}(r)]\) is a perfect matching, which matches \(B_{i}(r)\) to \(B_{j}(r)\).

  • (c) For any \(r\ne 0\), the subgraph \([P_{r}]\) is regular of degree \(q-1\).

  • (d) Let \(v\in C_{P}(q)\) be a non isolated point, and N(v) the set of neighbors of v in \(C_{P}(q)\). Then [N(v)] is a disjoint union of cycles.

Proof

For (a), consider for any \(r\in GF(q)\), \(r\ne 0\), the map \(\varphi : P_{1}\rightarrow P_{r}\) given by \(\varphi (a + \frac{1}{x-i}) = a + \frac{r}{x-ri}\). Let \(v, w\in P_{1}\), say with \(v(x) = a + \frac{1}{x-i}\) and \(w(x) = b + \frac{1}{x-j}\). Then \(vw\in E([P_{1}])\Leftrightarrow (b-a)(j-i) = 1\Leftrightarrow (b-a)(rj-ri) = r \Leftrightarrow \varphi (v)\varphi (w)\in E([P_{r}])\) . Thus \(\varphi \) is a graph isomorphism, and since r was arbitrary, it follows that for any \(s\ne 0\) we have \([P_{r}]\cong [P_{1}]\cong [P_{s}]\).

Consider (b). Fix r, and consider any two points (ia) and (jb) of \(P_{r}\). By Corollary 12 we have \((i,a)(j,b)\in E(C_{P}(q))\) if and only if \(i\ne j\) and \((b-a)(j-i) = r\) in GF(q). Let \(H_{ij} = [B_{i}(r)\cup B_{j}(r)]\) for \(i\ne j\). Note there can be no edge in \(H_{ij}\) of the form (ia)(ib) since \((b-a)(i-i) = 0\ne r\), and similarly no edge of the form (ja)(jb). Now given \((i,a)\in B_{i}(r)\), a point \((j,b)\in B_{j}(r)\) is a neighbor of (ia) if and only if \((b-a)(j-i) = r\) by Corollary 12.

Thus for this fixed i and j we can uniquely determine b by the equation \(b = r(j-i)^{-1} + a\), showing that (jb) is the only neighbor of (ia) in \(B_{j}(r)\). A symmetric argument shows that each point in \(B_{j}(r)\) has a unique neighbor in \(B_{i}(r)\). Thus \(E(H_{ij})\) is a perfect matching, which matches \(B_{i}(r)\) to \(B_{j}(r)\).

For (c), let \(v\in C_{P}(q)\), say with \(v\in B_{i}(r)\subset P_{r}\) for some \(r\ne 0\). By Corollary 12, any neighbor of v in \(C_{P}(q)\) must also lie in \(P_{r}\). By part (b), the neighbors of v are in one to one correspondence with the sets \(B_{j}(r)\), \(j\ne i\), \(j\in GF(q)\). Thus v has exactly \(|GF(q)|-1 = q-1\) neighbors in \(C_{P}(q)\).

For (d), take \(v\in C_{P}(q)\), and by the isomorphism of subgraphs \([P_{r}]\) from part (a), we can take \(v = (i,a)\in P_{1}\). By Corollary 12 we have \(N(v)\subset P_{1}\). It suffices to show that [N(v)] is regular of degree 2. Let \((j,b)\in N(v)\), so \(j\ne i\) by part (b). Now any neighbor (kc) of (jb) in N(v) must lie in \(N((i,a))\cap N((j,b))\). So to show that (jb) has degree 2 in [N(v)], it suffices to show that \((k,c)\in P_{1}\) satisfies \((k,c)\in N((i,a))\cap N((j,b))\) if and only if k is a root in GF(q) of a quadratic equation over GF(q) having two distinct roots in GF(q).

Suppose first that \((k,c)\in N((i,a))\cap N((j,b))\). By Corollary 12 we must have the equations

$$\begin{aligned} (c-a)(k-i) = 1, (b-c)(j-k) = 1, (b-a)(j-i) = 1. \end{aligned}$$

Using the second and third equations we get \(c = (j-i)^{-1} - (j-k)^{-1} + a\), and from the first equation \(c = (k-i)^{-1} + a\). Setting these two expressions for c equal we obtain \((k-i)^{-1} + (j-k)^{-1} = (j-i)^{-1}.\) Some simplification leads to the quadratic \(k^{2} - k(i+j) + ij +(j-i)^{2} = 0\) with coefficients over GF(q) and unknown k. By Corollary 23b from the Appendix, we see that that there are two distinct solutions for k; namely \(k_{1} = \frac{1}{2}[i(1+\sqrt{-3}) + j(1-\sqrt{-3})]\), and \(k_{2} = \frac{1}{2}[i(1-\sqrt{-3}) + j(1+\sqrt{-3})].\)

Conversely suppose that k is one of the two distinct solutions of \(k^{2} - k(i+j) + ij +(j-i)^{2} = 0\). Then \((k-i)(j-k) = -k^{2} + k(i+j) - ij = (j-i)^{2}\), and using \(\frac{1}{(k-i)(j-k)} = \big ( \frac{1}{j-i} \big )\big ( \frac{1}{k-i} + \frac{1}{j-k} \big )\), one can derive \(\frac{1}{k-i} + \frac{1}{j-k} = \frac{1}{j-i}\). Now set \(c = \frac{1}{k-i} + a\), so immediately we get \((c-a)(k-i) = 1\). Since (ia) and (jb) are neighbors we have \((b-a)(j-i) = 1\), so \(b = \frac{1}{j-i} + a\). It follows that \(c = \frac{1}{k-i} + a = \frac{1}{j-i} - \frac{1}{j-k} + a = b - \frac{1}{j-k}\). Hence we get \((b-c)(j-k) = 1\). Thus the three equations \((c-a)(k-i) = 1\), \((b-c)(j-k) = 1\), and \((b-a)(j-i) = 1\) hold, showing that \((k,c)\in N((i,a))\cap N((j,b))\) by Corollary 12.

Note that once k is determined (as one of the two distinct roots), then the point (kc) is uniquely determined by the perfect matching between \(B_{k}(1)\) and \(B_{i}(1)\) (or \(B_{j}(1)\)). Thus we obtain that an arbitrary point \((j,b)\in N(v)\) has exactly two neighbors in N(v), completing (d). \(\square \)

To round out the structure of \(C_{P}(q)\) we consider the connected components of \(C_{P}(q)\).

Theorem 14

Let \(q = p^{m}\), where p is an odd prime, with \(q\equiv 1(\)mod 3), \(q\ge 13\). Then the connected components of \(C_{P}(q)\) are as follows.

  1. (1)

    the isolated points - these are of the form \(\pi (x) = ax + b\), \(a\ne 0 \), and there are \(q(q-1)\) of them,

  2. (2)

    the \(q-1\) many connected components \([P_{r}]\) induced by the sets \(P_{r}\).

Proof

By Corollary 12b we have that \(\pi \in PGL(2,q)\) is an isolated point in \(C_{P}(q)\) if and only if \(\pi (\infty ) = \infty \). This is equivalent to \(\pi (x) = ax + b\), \(a\ne 0\) and there are \(q(q-1)\) such points, completing part 1).

The remaining permutations are all of the form \(\pi (x) = a + \frac{r}{x-i}\) for suitable \(a,r,i\in GF(q)\) with \(r\ne 0\) as shown earlier. Hence it suffices to analyze the connected component structure of \([\cup _{r\ne 0} P_{r}]\). By Corollary 12 and Theorem 13a, to prove part 2) it suffices to prove that any one of the \([P_{r}]\), say \([P_{1}]\), is connected.

Recall the partition \(P_{1} = \cup _{i\in GF(q)}B_{i}(1)\) defined above, and from now on we abbreviate \(B_{i}(1)\) by \(B_{i}\). Let g by a generator of the multiplicative cyclic subgroup of nonzero elements in GF(q). Then we can write this partition as \(P_{1} = B_{0}\cup (\cup _{1\le k\le q-1}B_{g^{k}})\). We regard the sets in this partition as “levels” of \(C_{P}(q)\); where \(B_{0}\) is level 0 and \(B_{g^{k}}\) is level k, \(1\le k\le q-1\). See Fig. 2 for an illustration of \(P_{1}\) from this viewpoint, where in that figure we continue with the notation (ia) for \(a + \frac{1}{x-i}\). In particular, \((g^{t}, a)\) refers to \(a + \frac{1}{x-g^{t}}\). By Theorem 13b the subgraph of \([P_{1}]\) induced by any two levels has edge set which is a perfect matching, as illustrated in Fig. 3.

First we observe that to show that \([P_{1}]\) is connected it suffices to show that any two vertices in \(B_{0}\) are joined by a path in \([P_{1}]\). For if that was true, then we can find a path in \([P_{1}]\) from (0, 0) to any vertex \(w\in P_{1}\) (thus showing connectedness of \([P_{1}]\)) as follows. If \(w\in B_{0}\) we are done by assumption. So suppose \(w\notin B_{0}\), say with \(w\in B(g^{k})\). Let v be the unique neighbor in \(B_{0}\) of w under the perfect matching \(E([B_{0}\cup B(g^{k})])\). Let P be the path from (0, 0) to v in \([P_{1}]\) which exists by assumption. Then P followed by the edge vw is a walk joining (0, 0) to w, so P contains a path from (0, 0) to w.

By Theorem 13b there is a (unique) path in \([P_{1}]\) starting at (0, 0) and passing through levels \(1,2,\cdots , q-1\) in succession. Let \((0,0) - (g, \alpha _{1}) - (g^{2}, \alpha _{2}) - ... - (g^{q-1}, \alpha _{q-1})\) be this path, illustrated in bold lines in Fig. 4, for suitable \(\alpha _{k}\in GF(q)\). For \(k\ge 1\) let \((0, \beta _{k})\in B_{0}\) be the unique neighbor in level 0 of the vertex \((g^{k}, \alpha _{k})\) in level k. The edges \((g^{k}, \alpha _{k})(0, \beta _{k})\) are illustrated by the dotted lines in Fig. 4.

This path and the points \((0, \beta _{k})\) are illustrated in Fig. 4. Our first step is to obtain the values of \(\alpha _{k}\) and \(\beta _{k}\).

Claim 1

We have

  1. (a)

    \(\alpha _{1} = \frac{1}{g}, \alpha _{2} = \frac{1}{g-1}\), and \(\alpha _{k} = \frac{g^{k-1}+g^{k-3}+g^{k-4}+\cdots + g+1}{(g-1)g^{k-1}}\) for \(k\ge 3.\)

  2. (b)

    \(\beta _{1} = 0\), and \(\beta _{k} = \frac{(g^{2}-g+1)(1+g+g^{2}+g^{3}+\cdots + g^{k-2})}{g^{k}(g-1)}\) for \(k\ge 2\).

Proof of Claim 1

We repeatedly use the fact, proved earlier, that if (ra) and (sb) are adjacent vertices in the contraction graph \(C_{P}(q)\), then \((s-r)(b-a) = 1\).

For part (a), since \((0,0) - (g, \alpha _{1})\) is an edge in \(C_{P}(q)\) we have \((\alpha _{1}-0)(g-0) = 1\), so \(\alpha _{1} = \frac{1}{g}.\) Since \((g, \alpha _{1}) - (g^{2}, \alpha _{2})\) is an edge we have \((\alpha _{2}-\frac{1}{g})(g^{2}-g) = 1\), yielding \(\alpha _{2} = \frac{1}{g-1}\), and similarly \((\alpha _{3}-\frac{1}{g-1})(g^{3}-g^{2}) = 1\), yielding \(\alpha _{3} = \frac{g^{2}+1}{(g-1)g^{2}}\). Now for \(k\ge 3\) we proceed by induction, having proved the base case \(k = 3\). Since \((g^{k}, \alpha _{k}) - (g^{k-1}, \alpha _{k-1})\) is an edge, we have \((\alpha _{k} - \alpha _{k-1})(g^{k} - g^{k-1}) = 1\). Solving for \(\alpha _{k}\) and applying the inductive hypothesis to \(\alpha _{k-1}\), we obtain \(\alpha _{k} = \frac{1}{g^{k}-g^{k-1}} + \frac{g^{k-2}+g^{k-4}+g^{k-5}+\cdots + g+1}{(g-1)g^{k-2}}\), which after simplification yields the claim.

For part (b), we have \(\beta _{1} = 0\) since \((0,0) - (g, \alpha _{1})\) is an edge by definition. Since \((g^{2}, \alpha _{2}) - (0, \beta _{2})\) is an edge, we have \((\frac{1}{g-1} - \beta _{2})(g^{2}-0) = 1\), and solving for \(\beta _{2}\) and simplifying we get the claim for \(k=2\). Consider now \(k\ge 2.\) The existence of edge \((g^{k}, \alpha _{k}) - (0, \beta _{k})\) gives \(({\alpha _{k} - \beta _{k}})g^{k} = 1\), so \(\beta _{k} = \alpha _{k} - \frac{1}{g^{k}}\). Using the formula for \(\alpha _{k}\) from part (a), we have \(\beta _{k} = \frac{g^{k-1}+g^{k-3}+g^{k-4}+\cdots + g+1}{(g-1)g^{k-1}} - \frac{1}{g^{k}} = \frac{g^{k}+g^{k-2}+g^{k-3}+ \cdots +g^{2}+1}{(g-1)g^{k}} = \frac{(g^{2}-g+1)(1+g+g^{2}+g^{3}+\cdots + g^{k-2})}{g^{k}(g-1)}.\)   QED

Fig. 2
figure 2

The graph \(P_1\), partitioned into levels \(B_0\) and \(B_{g^i},1\le i\le q-1\)

Fig. 3
figure 3

Perfect matching between any two levels of \(P_1\)

Fig. 4
figure 4

The path \((0,0)-(g,\alpha _1)-(g^2,\alpha _2)-\dots -(g^{q-1},\alpha _{q-1})\) in \(P_{1}\), where \((0,\beta _i)\) is the level 0 neighbor of \((g^i,\alpha _i)\)

Claim 2

We have \(|\{ \beta _{k}: 1\le k\le q-1\}| = q-1\); that is, the \(\beta _{k}\), \(1\le k\le q-1\), are pairwise distinct.

Proof of Claim 2

In applying Claim 1, we note first that g could have been chosen so as not to be a root of \(x^{2} - x + 1 = 0\) as follows. The number of roots in GF(q) to this quadratic is at most 2. Now the number of generators in the multiplicative cyclic group \(GF(q) - \{0\}\) of order \(q-1\) is the euler totient function \(\phi (q-1)\), defined as the number of integers \(1\le s\le q-1\) which are relatively prime to \(q-1\). Since q is an odd prime power with \(q\ge 13\), we know that \(\phi (q-1) > 2\), so such a g exists.

We show that for any pair jk with \(1\le j < k \le q-1\) we have \(\beta _{k}\ne \beta _{j}\).

Consider first the case \(j = 1\). Since \(\beta _{1} = 0\), we need to show that \(\beta _{k}\ne 0\) for \(2\le k\le q-1\). Supposing the contrary and applying Claim 1b we get \(\frac{(g^{2}-g+1)(1+g+g^{2}+g^{3}+\cdots + g^{k-2})}{g^{k}(g-1)} = 0.\) Canceling the nonzero factor \(\frac{g^{2}-g+1}{g^{k}(g-1)}\) (by the preceding paragraph) on the left side, we get \(0 = (1+g+g^{2}+g^{3}+\cdots + g^{k-2}) = \frac{g^{k-1} - 1}{g - 1}.\) This implies that \(g^{k-1} - 1 = 0\), so g has order \(k-1\). This is impossible since \(k-1\le q-2\) while g, being a generator of the cyclic group \(GF(q) - \{0\}\), must have order \(q-1\).

So now suppose that \(j\ge 2\). Assuming the contrary that \(\beta _{k} = \beta _{j}\) and applying Claim 1b, we get after simplification that \(1+g+g^{2}+g^{3}+\cdots + g^{k-2} = g^{k-j}(1+g+g^{2}+g^{3}+\cdots + g^{j-2}) = g^{k-j} + g^{k-j+1} + \cdots + g^{k-2}\). Thus we have \(0 = 1 + g + g^{2} + \cdots + g^{k-j-1} = \frac{g^{k-j} - 1}{g - 1}\). So \(g^{k-j} = 1\), which is impossible since \(k - j\le q - 3\), while g has order \(q - 1.\)   QED

We introduce some notation in preparation for the rest of the argument. Let \(Z = \{ (0,\beta _{k}): 1\le k\le q-1\}\subset B_{0}\). Since \(|B_{0}| = q\), by Claim 2 we have \(|B_{0} - Z| = 1\), and we let u be the unique vertex of \(B_{0} - Z\). Further for any subset T of vertices in \(C_{P}(q)\), we let \(N(T) = \{ v\in C_{P}(q): v\notin T, vt\in E(C_{P}(q))\) for some \(t\in T\} \) be the neighbor set of T in \(C_{P}(q)\). Recall also that [T] denotes the subgraph of \(C_{P}(q)\) induced by T.

Claim 3

Let \(H = [Z\cup N(Z)]_{C_{P}(q)}\), and \(H' = [\{u\}\cup N(u)]_{C_{P}(q)}\).

  1. (a)

    \(H'\) is connected.

  2. (b)

    H is connected.

  3. (c)

    \(V(H)\cap V(H') = \emptyset \)

  4. (d)

    We have the partition \(V(P_{1}) = V(H)\cup V(H')\).

Proof of Claim 3

For part (a), we apply Theorem 13b to deduce that \(H'\) has the spanning star subgraph \(K_{1, q-1}\), where the center is u and the leaves, one in each level \(B_{i}\), \(i\ne 0\), form N(u). Thus \(H'\) is connected.

Consider part (b). Since \(\beta _{1} = 0\) we have \((0,0)\in Z\subset V(H)\). Thus it suffices to show that for any \(w\in V(H)\) there is a path in H joining (0, 0) to w.

Suppose first that \(w\in Z\), so \(w = (0, \beta _{k})\) for some k. Observe that \((g^{i},\alpha _{i}) \in N(Z)\) for all i by definition. So the path \((0,0) - (g, \alpha _{1}) - (g^{2}, \alpha _{2}) - ... - (g^{k}, \alpha _{k})\) followed by the edge \((g^{k}, \alpha _{k}) - (0,\beta _{k})\) is path in H joining (0, 0) to w.

Next suppose \(w\in N(Z)\), say with w adjacent to \((0, \beta _{k})\in Z\). Then the path \((0,0) - (g, \alpha _{1}) - (g^{2}, \alpha _{2}) - ... - (g^{k}, \alpha _{k})\) followed by the length 2 path \((g^{k}, \alpha _{k}) - (0, \beta _{k}) - w\) is a walk in H joining (0, 0) to w, and this walk contains the required path.

Next consider (c). Suppose not, and let \(z\in V(H)\cap V(H')\), say with \(z\in B(g^{k})\), noting that \(k\ge 1\) since each level, in particular \(B_{0}\), is an independent set in \([P_{1}]\). Then z has two distinct neighbors in \(B_{0}\); namely u and \((0,\beta _{j})\), for some \(1\le j\le q-1\). This contradicts the fact that the edge set of \([B(g^{k})\cup B_{0}]\) is a perfect matching between the levels \(B(g^{k})\) and \(B_{0}\) by Theorem 13b. Thus \(V(H)\cap V(H') = \emptyset \).

Consider now (d). By part (c), it suffices to show that \(|V(P_{1})| = |V(H)| + |V(H)'|\). By Theorem 13b, it follows that \(|V(H)| = |Z|q = (q-1)q.\) For the same reason \(|V(H')| = q.\) Therefore \(|V(P_{1})| = q^{2} = |V(H)| + |V(H)'|\) as required.   QED

We can now complete the proof of the theorem by showing that \(P_{1}\) is connected. In view of Claim 3, to do this we are reduced to showing that there is an edge \(vw\in E([P_{1}])\) with \(v\in H'\) and \(w\in H\). Suppose no such edge exists. Since \([P_{1}]\) is \((q-1)\)-regular by Theorem 13c, it follows that \(H'\) is a simple \(q-1\) regular graph on q vertices. Thus \(H' = K_{q}\). Hence \([N(u)] = K_{q-1}\). But this is a contradiction for \(q\ge 5\) since by Theorem 13d the neighborhood of any nonisolated point in \(C_{P}(q)\) is regular of degree 2, while [N(u)] is regular of degree \(q-2 > 2\) since we have assumed \(q\ge 13\). \(\square \)

We can now obtain our independent set in \(C_{P}(q)\) as a consequence of our previous results and the following theorem of Alon [1].

Theorem 15

[1] Let G = (V,E) be a graph on N vertices with average degree \(t\ge 1\) in which for every vertex \(v\in V\) the induced subgraph on the set of all neighbors of v is r-colorable. Then the maximum size \(\alpha (G)\) of an independent set in G satisfies \(\alpha (G)\ge \frac{c}{log(r+1)}\frac{N}{t}\,log(t)\), for some absolute constant c.

Corollary 16

Let q be a power of an odd prime p, with \(q\equiv 1(\)mod 3).

  • (a) \(\alpha (C_{P}(q))\ge Kq^{2}log(q)\) for some constant \(K>0\).

  • (b) \(M(q, q-3) \ge Kq^{2}log(q)\) for some constant \(K>0\).

Proof

We will be applying Alon’s result (Theorem  15) together with our results proved earlier in this section. The latter results require \(q\ge 13\). We can drop this requirement in these applications since the case \(q = 7\) can be included in the conclusion of this corollary by decreasing, if necessary, the absolute constant K appearing in its statement.

Consider (a). By Corollary 12a there is no edge between any two subgraphs \([P_{r}]\) and \([P_{s}]\) for \(r\ne s\). Since there are q such subgraphs, and by Theorem 13a) they are pairwise isomorphic, it suffices to show that \(\alpha (P_{1}) \ge Kqlog(q)\) for some constant K.

We now apply Alon’s theorem to the subgraph \([P_{1}]\) of \(C_{P}(q)\). Now \([P_{1}]\) is \((q-1)\)-regular by Theorem 13c, and has \(q^{2}\) points. Since the neighborhood of every point is a disjoint union of cycles by Theorem 13d, this neighborhood must be 3-colorable. It follows by Alon’s theorem that \([P_{1}]\) contains an independent set of size \(\frac{c}{log(4)}\frac{q^{2}}{q-1}\,log(q-1) \sim Kqlog(q)\), for some constant K.

For (b), let I be an independent set in \(C_{P}(q)\) of size \(Kq^{2}log(q)\) for suitable constant K, guaranteed to exist by by part (a). Then by the reduction made in the discussion preceding Lemma 10 we have \(M(q,q-3) \ge |I| \ge Kq^{2}log(q)\). \(\square \)

We mentioned at the end of Sect. 2 that \(M(n,n-2)\ge n^{2}\), where n is a prime power satisfying \(n \not \equiv 2\)(mod 3) [9]. Thus \(M(n,n-3)\ge M(n,n-2)\ge n^2\) is also true for such n, including in particular prime powers n satisfying \(n \equiv 1\)(mod 3) that are of interest in this paper. This is the best previously known lower bound for \(M(n,n-3)\) applicable to the latter such n. So our new lower bound \(M(n,n-3)\ge K n^2 \log (n)\) for such n is an improvement on previous results.

4 Special case lower bounds for M(nd) via the Mathieu groups

In this section we consider the Mathieu groups \(M_{11}, M_{12}, M_{22}, M_{23}, M_{24}\), discovered by E. Mathieu in 1861 and 1873. These permutation groups are the earliest known examples of sporadic simple groups. See [6, 12, 20], or [32] for a discussion of their construction. These groups act on 11, 12, 22, 23, 24 letters respectively, with \(M_{11}\) being a 1 point stabilizer of \(M_{12}\), while \(M_{23}\) and \(M_{22}\) are 1 and 2 point stabilizers of \(M_{24}\) respectively. We mention the recent book [20] as a good reference on these groups.

In this section we apply the contraction operation to these permutation groups to obtain new permutation arrays, with resulting lower bounds for M(nd) for suitable n and d.

Since \(M_{12}\) is sharply 5-transitive we have by Theorem 1 that \(hd(M_{12}) = 8\) and \(M(12,8) = |M_{12}| = 95040\). Similarly since \(M_{11}\) is sharply 4-transitive we have \(M(11,8) = |M_{11}| = 7920\). For \(M_{24}\) we do not have sharp transitivity. But observe that for any permutation group G acting on some set, and three elements \(\pi , \sigma , \tau \in G\), we have \(hd(\pi , \sigma ) = hd(\pi \tau , \sigma \tau ) = hd(\tau \pi , \tau \sigma )\). Thus \(hd(G) =\)min\(\{hd(1,\sigma ): \sigma \in G \}\). From the set of disjoint cycle structures of elements of \(M_{24}\) (available at [18, 31]) we find that the largest number of 1-cycles in the disjoint cycle structure of any nonidentity element of \(M_{24}\) is 8. Thus \(hd(M_{24}) = 24-8 = 16\), and from the stabilizer relation also \(hd(M_{23}) = hd(M_{22}) = 16\). We thus obtain \(M(24,16)\ge |M_{24}| = 24,423,040\), \(M(23,16)\ge |M_{23}| = 10,200,960\), and \(M(22,16)\ge |M_{22}| = 443,520\).

We now apply the contraction operation to these groups. Considering the action of \(M_{12}\) on the 12-letter set \(\Omega = \{x_{1}, x_{2}, \cdots , x_{12}\}\), we designate some element, say \(x_{12}\), of \(\Omega \) as the distinguished element F in the definition of \(\pi ^{\triangle }\). Then define for each \(\pi \in M_{12}\) the permutation \(\pi ^{\triangle }\) on the set \(\Omega \) exactly as in the introduction. Thus, using the natural ordering of elements of \(\Omega \) by subscript, the image string of any \(\sigma \in M_{12}\) can be written \(\sigma (x_{1})\sigma (x_{2})\cdots \sigma (x_{11})\sigma (F)\).

As before, we let \(\pi ^{\triangle }_{-}\) be the permutation on 11 symbols obtained from \(\pi ^{\triangle }\) by dropping the final symbol F, and for any subset \(S\subset M_{12}\), we let \(S^{\triangle }_{-} = \{ \pi ^{\triangle }_{-}: \pi \in S\}\), sometimes writing this as \((S)^{\triangle }_{-}\).\(\square \)

Proposition 17

  • (a) \(hd((M_{12})^{\triangle }_{-}) \ge 6\).

  • (b) \(M(11,6)\ge |M_{12}| = 95040\).

  • (c) \(M(10,6)\ge 8640\).

Proof

We start with (a). Suppose not. Since \(hd(M_{12}) = 8\), and for any \(\alpha , \beta \in M_{12}\) we have \(hd(\alpha ^{\triangle }, \beta ^{\triangle })\ge hd(\alpha , \beta ) - 3\) by Lemma 2a, the contrary assumption implies \(hd((M_{12}^{\triangle })_{-}) = 5\). Thus there is a pair \(\sigma , \tau \in M_{12}\) such that \(hd(\sigma ,\tau ) = 8\) and \(hd(\sigma ^{\triangle },\tau ^{\triangle }) = 5\); so \(hd(\sigma ^{\triangle },\tau ^{\triangle }) = hd(\sigma ,\tau ) - 3\). Thus by Lemma 2b we know that \(\pi \sigma ^{-1}\) has a 3-cycle in its disjoint cycle factorization so the order of \(\pi \sigma ^{-1}\) is divisible by 3.

Since \(hd(\sigma ,\tau ) = 8\) and \(\pi \) and \(\sigma \) are permutations on 12 letters, it follows that there are four positions, call them \(x_{i}\), \(1\le i\le 4\), at which \(\pi \) and \(\sigma \) agree. Then \(\pi \sigma ^{-1}\) belongs to the subgroup H of \(M_{12}\) fixing these four positions; that is \(H = \{ \alpha \in M_{12}: \alpha (x_{i}) = x_{i}, 1\le i\le 4\}\). This H, denoted \(M_{8}\), is known to be isomorphic to \(Q_{8}\), the quaternion group of order 8 ([5], Sect. 3.2). We can also verify this directly by making use of GAP (Groups, Algorithms, Programming), a system for computational discrete algebra. The following output employing GAP shows that \(H\cong Q_{8}\), the quaternion group of order 8 [17]

figure a

Now the order of \(\pi \sigma ^{-1}\) is divisible by 3 as noted above. But 3 does not divide \(|Q_{8}|\), a contradiction to Lagrange’s theorem.

Consider next (b). Using Lemma 2c and \(hd(M_{12}) = 8 > 3\), we have \(|M_{12}| = |(M_{12})^{\triangle }_{-}|\). Thus \((M_{12})^{\triangle }_{-}\) is a permutation array on 11 letters of size \(|M_{12}| \) with \(hd((M_{12})^{\triangle }_{-}) \ge 6\). Part (b) follows.

For part (c), we recall from the introduction the elementary bound \(M(n-1,d)\ge \frac{M(n,d)}{n}\). Using part (a), we then obtain \(M(10,6)\ge \frac{M(11,6)}{11}\ge 8640.\)

We remark that using the same method as in part (b) of the above proposition one can show \(M(10,6)\ge |M_{11}| = 7920\). But this is obviously weaker than the bound we give in part (c).

We now consider the contraction of \(M_{24}\) and resulting special case bounds for M(nd). Using similar notation as for \(M_{12}\) above, we let \(M_{24}\) act on the set of 24 letters \(\Theta = \{x_{1}, x_{2}, \cdots , x_{24}\}\), and we designate \(x_{24}\) as the distinguished symbol F in the definition of \(\pi ^{\triangle }\) from the introduction. Now define \(\pi ^{\triangle }\) for any \(\pi \in M_{24}\) as in the introduction, along with accompanying definitions \(S^{\triangle }\) and \(S^{\triangle }_{-}\) for \(S\subseteq M_{24}\).\(\square \)

Proposition 18

  • (a) \(hd((M_{24})^{\triangle }_{-}) \ge 14\).

  • (b) \(M(23,14)\ge |M_{24}| = 244,823,040\).

  • (c) \(M(22,14)\ge \frac{|M_{24}|}{23} = 10,644,480\).

  • d)\(M(21,14)\ge \frac{|M_{24}|}{23^{.} 22} = 483,840\).

Proof

For (a), suppose not. Since \(hd(M_{24}) = 16\), and for any \(\alpha , \beta \in M_{24}\) we have \(hd(\alpha ^{\triangle }, \beta ^{\triangle })\ge hd(\alpha , \beta ) - 3\), it follows that \(hd((M_{24})^{\triangle }_{-}) = 13\). Thus there is pair \(\sigma , \tau \in M_{24}\) such that \(hd(\sigma ,\tau ) = 16\) and \(hd(\sigma ^{\triangle },\tau ^{\triangle }) = 13\); so \(hd(\sigma ^{\triangle },\tau ^{\triangle }) = hd(\sigma ,\tau ) - 3\). Hence by Lemma 2b, \(\tau \sigma ^{-1}\) has a 3-cycle in its disjoint cycle structure factorization.

Since \(hd(\sigma ,\tau ) = 16\), and \(\sigma \) and \(\tau \) are permutations on 24 letters, it follows that \(\sigma \) and \(\tau \) must agree on 8 positions. Thus \(\tau \sigma ^{-1}\) belongs to the subgroup H of \(M_{24}\) fixing these 8 positions. From the structure theory of \(M_{24}\), we know that if these 8 positions form an “octad” (among the 24 positions), then \(H = M_{16} \cong Z_{2}\times Z_{2}\times Z_{2}\times Z_{2}\), the elementary Abelian group of order 16 ([32] Theorem 3.21, and [33], pp. 197–208). Again, this can also be verified directly using GAP from the following output [17].

figure b

If these 8 positions do not form an octad, then H is the identity ([32], Lemma 3.1). Now the order of \(\tau \sigma ^{-1}\) is divisible by 3, so 3 must divide |H|. By Lagrange’s theorem, this contradicts that |H| has order either 16 or 1.

Consider next (b). Using Lemma 2c and \(hd(M_{24}) = 16 > 3\), we have \(|M_{24}| = |(M_{24})^{\triangle }_{-}|\). Thus \((M_{24})^{\triangle }_{-}\) is a permutation array on 23 letters of size \(|M_{24}| \), and by part (a) we have \(hd((M_{24})^{\triangle }_{-}) \ge 14\). Part (b) follows.

For part (c), we again use the bound \(M(n-1,d)\ge \frac{M(n,d)}{n}\). Using part (b), we then obtain \(M(22,14)\ge \frac{M(23,14)}{23}\ge \frac{|M_{24}|}{23} = 10,644,480\).

For (d), using \(M(n-1,d)\ge \frac{M(n,d)}{n}\) again we get \(M(21,14)\ge \frac{M(22,14)}{22}\ge \frac{|M_{24}|}{23^{.} 22} = 483,840\). \(\square \)

5 Concluding remarks

We mention some problems left open from our work.

1. Recall that if I is an independent set in \(C_{P}(q)\), then \(M(q, q-3)\ge |I|\). To find a large such I, one can focus on any nontrivial connected component, say \(P_{1}\), of \(C_{P}(q)\). If \(P_{1}\) contains an independent set of size k, then by the isomorphism of the connected components \(P_{i}\), \(1\le i\le q-1\), we get an independent set of size \(k(q-1) + q(q-1) = (q-1)(k+q)\) in \(C_{P}(q)\), where \(q(q-1)\) counts the number of isolated points in \(C_{P}(q)\). Our lower bound \(M(q, q-3)\ge Kq^{2}log(q)\) implies, again by the isomorphism of components, that \(\alpha (P_{1})\ge Cqlog(q)\) (where \(\alpha (G)\) is the maximum size of an independent set in a graph G), for some constant C. We therefore ask whether an improvement on this lower bound for \(\alpha (P_{1})\) can be found.

Now \(V(P_{1})\) can be viewed as a rectangular array \(\{(i,a): i, a\in GF(q)\}\) as in Fig. 2, where we let i be the row index, and a the column index. By Corollary 12a an independent set in \(P_{1}\) is just a subset S of this array with the property that for any two points \((i,a), (j,b)\in S\) we have \((b-a)(j-i)\ne 1\) in GF(q). Using the integer programming package GUROBI [16] and a Greedy program that we constructed, we computed independent sets in \(P_{1}\) of size k for various q. This k, together with the resulting lower bound \((q-1)(k+q)\) for \(M(q,q-3)\) are presented in Table 1. The primes \(q = 41, 47, 53, 59, 71, 83, 89\), for example, are not included in this table since \(q\not \equiv 1(\)mod 3), and hence \(M(q,q-3) \ge (q+1)q(q-1)\), an improvement over the lower bound obtained using GUROBI. Other primes q, or powers of a prime, are not included, as previously known lower bounds, for example, when \(q-2\) is also a prime, or prime power, are not included. That is, in those cases, \(PGL(2,q-1)\) has \((q-1)^3 - q + 1\) elements, so \(M(q-1,q-3)\ge (q-1)^3 - q + 1\). That means that \(M(q,q-3) \ge (q-1)^3 - q + 1\) [10].

2. We also ask for good upper bounds on \(\alpha (P_{1})\).

Table 1 Independent set of size k in \(P_{1}\) obtained either by integer linear programming or by a greedy program