Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Introduction

Let {u t } and {u t } be sequences of period \(\varepsilon\) with symbols from the finite field GF(p) with p elements. Let ω be a primitive complex pth root of unity. The cross-correlation between the two sequences at shift τ is defined to be

$$\displaystyle\begin{array}{rcl} C_{u,v}(\tau ) =\sum _{ t=0}^{\varepsilon -1}\omega ^{u_{t+\tau }-v_{t} }.& & {}\\ \end{array}$$

If the two sequences are cyclically equivalent (i.e., only differ by a cyclic shift), the correlation is denoted autocorrelation instead of cross-correlation.

In a code-division multiple-access (CDMA) system, each user is assigned a sequence from a family of sequences. The quality of the communication depends on the selection of a family of sequences with good parameters.

Let \(\mathcal{F}\) be a family of M cyclically distinct sequences of the same period \(\varepsilon\):

$$\displaystyle{\mathcal{F} =\{\{ s_{t}^{(i)}\}\;\vert \;1 \leq i \leq M\}.}$$

The most important parameters for evaluating the quality of the family are \((M,\varepsilon,\theta _{\mathrm{max}})\) where θ max is the maximum value of the absolute magnitude of the (nontrivial) auto- and cross-correlation between any two sequences in the family, i.e.,

$$\displaystyle{\theta _{\mathrm{max}} = \mathrm{max}\{\vert C_{s^{(i)},s^{(j)}}(\tau )\vert \;\vert \;i\neq j\text{ or }\tau \neq 0\}.}$$

Many of the best sequence families can be constructed from linear recursions. To generate a sequence {s t } with symbols from GF(p), one can use a linear recursion of degree n and generate each symbol from the previous n symbols such that

$$\displaystyle{s_{t+n} + c_{n-1}s_{t+n-1} + \cdots + c_{0}s_{t} = 0,\;\;c_{i} \in \mathrm{ GF(p)},c_{0}\neq 0.}$$

The initial state \((s_{0},s_{1},\ldots,s_{n-1})\) and the linear recursion uniquely determine the sequence {s t }. Thus, the linear recursion generates p n distinct sequences corresponding to the p n initial states \((s_{0},s_{1},\ldots,s_{n-1})\). Clearly, some of these generated sequences may be cyclically equivalent.

The characteristic polynomial of the linear recursion is defined to be

$$\displaystyle{f(x) =\sum _{ i=0}^{n}c_{ i}x^{i}.}$$

The period of the sequences generated by the recursion with characteristic polynomial f(x) is completely determined by the polynomial. It is a well-known fact that all these sequences will have period e where e is the smallest positive integer such that f(x)  |  x e − 1. Furthermore, at least one of these sequences will have e as its smallest period.

Let f(x) be a primitive polynomial, i.e., an irreducible polynomial with a zero α being a generator for the multiplicative group of GF(pn). Then the factorization of the primitive polynomial f(x) is given by

$$\displaystyle{f(x) =\prod _{ i=0}^{n-1}(x -\alpha ^{p^{i} }).}$$

Since the generator α has order p n − 1, then \(f(x)\;\vert \;x^{p^{n}-1 } - 1\) and any nonzero sequence generated by the recursion with characteristic polynomial f(x) has period p n − 1. This is maximum possible for a linear recursion of degree n, and any such sequence is therefore called a maximum-length sequence (or m-sequence).

During a period of the m-sequence, each nonzero consecutive n-tuple occurs exactly once during its period. In particular this implies that the m-sequence is as balanced as it can be for a sequence of period p n − 1 since all nonzero symbols occur p n−1 times, while the 0 element occurs \(p^{n-1} - 1\) time.

The trace function Tr n from GF(pn) to GF(p) is defined by

$$\displaystyle\begin{array}{rcl} \mathit{Tr}_{n}(x) =\sum _{ i=0}^{n-1}x^{p^{i} }.& & {}\\ \end{array}$$

The m-sequence can be written as

$$\displaystyle\begin{array}{rcl} s_{t} = \mathit{Tr}_{n}(c\alpha ^{t}),& & {}\\ \end{array}$$

where the p n − 1 different nonzero values of \(c \in \mathrm{ GF(p^{n})^{{\ast}}} =\mathrm{ GF(p^{n})}\setminus \{0\}\) correspond to all possible shifts of the m-sequence.

Starting with one m-sequence of period p n − 1, all other m-sequences of the same period can be obtained by decimating the sequence. The decimated sequence of {s t } is the sequence {s dt } which is an m-sequence if and only if \(\mathrm{gcd}(d,p^{n} - 1) = 1\). The sequence and its decimated sequence are cyclically distinct if and only if \(d\not\equiv p^{i}\pmod p^{n} - 1\) for \(i = 0,1,\ldots,n - 1\). The number of cyclically distinct m-sequences is \(\phi (p^{n} - 1)/n\), where ϕ is Euler’s ϕ function, and equals the number of primitive polynomials of degree n. For further results on linear recursions, the reader is referred to the classical book by Golomb [10].

Example 1

Let p = 3 and consider the linear recursion

$$\displaystyle{s_{t+3} + 2s_{t+2} + s_{t} = 0.}$$

The characteristic polynomial of the recursion is \(f(x) = x^{3} + 2x^{2} + 1\). This is a primitive polynomial, and using the initial state (011), the recursion generates the m-sequence (01110211210100222012212020) of period \(\varepsilon = 26\). The recursion clearly generates all cyclic shifts of this sequence since all nonzero initial states are present in the m-sequence. In addition the recursion generates the all-zero sequence using the initial state (000). It is easily verified that decimating the sequence above by \(d \equiv 3^{i}\pmod 26\) gives the same sequence, while decimation by any \(d\not\equiv 3^{i}\pmod 26\) with gcd(d, 26) = 1 gives a cyclically distinct m-sequence of period 26.

2 Correlation of m-Sequences

The cross-correlation at shift τ between two m-sequences that differ by a decimation d will be denoted by C d (τ). The problem to determine the values and the number of occurrences of each value of the cross-correlation C d (τ) between two m-sequences when τ runs through all p n − 1 shifts has been studied for almost 50 years.

The simplest case to consider is the autocorrelation function of an m-sequence. One reason for the popularity of m-sequences is due to their two-valued autocorrelation and their importance in synchronization applications.

Theorem 1

The autocorrelation function C 1 (τ) of an m-sequence having period \(\varepsilon = p^{n} - 1\) takes the value − 1 for any shift \(\tau \not\equiv 0\pmod p^{n} - 1\) and the value p n − 1 for any shift \(\tau \equiv 0\pmod p^{n} - 1\) .

Proof

Let \(\tau \not\equiv 0\pmod p^{n} - 1\), then since the characteristic polynomial of the m-sequence {s t } also generates the sequence \(\{s_{t+\tau } - s_{t}\}\), it follows that this is some shift of the m-sequence. Hence,

$$\displaystyle\begin{array}{rcl} C_{1}(\tau )& =& \sum _{t=0}^{p^{n}-2 }\omega ^{s_{t+\tau }-s_{t} } =\sum _{ t=0}^{p^{n}-2 }\omega ^{s_{t+\delta } } = -1 {}\\ \end{array}$$

since \(s_{t} - s_{t+\tau } = s_{t+\delta }\) for some δ depending on τ and the m-sequence is balanced (except for a “missing” 0) having p n−1 of each nonzero element and \(p^{n-1} - 1\) zeros during a period of the sequence.

Some basic results useful for the analysis of C d (τ) can be found in Helleseth [12].

Lemma 1

The following properties hold for the cross-correlation C d (τ):

  1. 1.

    If \(\mathit{dd}^{{\prime}}\equiv 1\pmod p^{n} - 1\) or \(d^{{\prime}}\equiv \mathit{dp}^{i}\) for some integer i, then C d (τ) and \(C_{d^{{\prime}}}(\tau )\) have the same correlation values with the same number of occurrences.

  2. 2.

    The value of C d (τ) is a real number.

  3. 3.

    The sum of the cross-correlation values is determined by

    $$\displaystyle{\sum _{\tau =0}^{p^{n}-2 }(C_{d}(\tau ) + 1) = p^{n}.}$$
  4. 4.

    The square sum of the cross-correlation values is determined by

    $$\displaystyle{\sum _{\tau =0}^{p^{n}-2 }(C_{d}(\tau ) + 1)^{2} = p^{2n}.}$$
  5. 5.

    The higher-order power sums of the cross-correlation are given by

    $$\displaystyle{\sum _{\tau =0}^{p^{n}-2 }(C_{d}(\tau ))^{r} = -(p^{n} - 1)^{r-1} + 2(-1)^{r-1} + a_{ r}p^{2n}}$$

    where a r is the number of solutions of the equations

    $$\displaystyle\begin{array}{rcl} x_{1} + x_{2} + \cdots + x_{r-1} + 1& = 0& {}\\ x_{1}^{d} + x_{ 2}^{d} + \cdots + x_{ r-1}^{d} + 1& = 0& {}\\ \end{array}$$

    and \(x_{i} \in \mathrm{GF}(p^{n})^{{\ast}}\) for \(i = 1,2,\ldots,r - 1\) .

The lemma above is useful to determine the number of occurrences of each value in C d (τ) when there are rather few, say r, values that have already been determined. Then one can determine the complete distribution of the cross-correlation if one can find a i for 2 < i < r.

In Helleseth [12], the previous lemma was applied to prove a result first mentioned without proof in Golomb [11].

Theorem 2

If \(d\not\in \{1,p,\ldots,p^{n-1}\}\) then C d (τ) takes on at least three different values when \(\tau = 0,1,\ldots,p^{n} - 2\) .

Proof

Suppose that C d (τ) takes on only the two values x and y that occur r and \(p^{n} - 1 - r\) times, respectively, in C d (τ) when τ runs through all shifts \(\tau = 0,1,\ldots,p^{n} - 2\). Then using (3) and (4) in Lemma 1 leads to two equations in three unknowns x, y, and r. Eliminating r leads to the equation

$$\displaystyle{(p^{n}x - (x + 1))(p^{n}y - (y + 1)) = p^{2n}(2 - p^{n}).}$$

For p = 2 this is a Diophantine equation that can be shown to have no valid integer solutions (i.e., except \(x = -1\) and \(y = p^{n} - 1\) or \(x = p^{n} - 1\) and \(y = -1\) corresponding to the autocorrelation). For the nonbinary case when p > 2, similar divisibility properties in the ring Q[ω], where Q denotes the rational number field, imply that two-valued cross-correlation is impossible except in the autocorrelation case, i.e., when \(d \equiv p^{i}\pmod p^{n} - 1\).

The cross-correlation between any two m-sequences {s t } and {s dt } with symbols from GF(p) of the same period \(\varepsilon = p^{n} - 1\) can be written as an exponential sum. After a suitable shift, we can assume without loss of generality that \(s_{t} = \mathit{Tr}_{n}(\alpha ^{t})\), and we therefore obtain

$$\displaystyle\begin{array}{rcl} C_{d}(\tau )& =& \sum _{t=0}^{\varepsilon -1}\omega ^{s_{t+\tau }-s_{\mathit{dt}} } =\sum _{ t=0}^{\varepsilon -1}\omega ^{\mathit{Tr}_{n} (\alpha ^{t+\tau }-\alpha ^{\mathit{dt}}) } =\sum _{x\in \mathrm{GF(p^{n})}^{{\ast}}}\omega ^{\mathit{Tr}_{n} (\mathit{cx}-x^{d}) } {}\\ \end{array}$$

where c = α τ. Finding the values and the number of occurrences of each value in the cross-correlation function C d (τ) for τ in \(\{0,1,\ldots,p^{n} - 2\}\) is equivalent to determine the distribution of this exponential sum for any c ≠ 0.

Since a two-valued cross-correlation is only possible when \(d \equiv p^{i}\pmod p^{n} - 1\), it was natural that the early research on the cross-correlation had a strong focus on finding decimations leading to three-valued cross-correlation. The following sections will survey known cases where the cross-correlation takes on three or four values.

The mathematical techniques used to prove these results are rather different for different decimations and give interesting connections between the cross-correlation, exponential sums, and the solutions of special equations over finite fields.

Note that when we in the following find a decimation d with a correlation distribution, then, due to (1) in Lemma 1, the correlation distribution is the same for the decimations \(\mathit{dp}^{i}\pmod p^{n} - 1\) for any i and for the inverse decimation by \(d^{-1}\pmod p^{n} - 1\).

3 Three-Valued Cross-Correlation

3.1 Binary Sequences

There are more decimations leading to three-valued cross-correlation when p = 2 than in the case p > 2. First we consider three-valued cross-correlation in the case of binary sequences.

The pioneering result on three-valued cross-correlation was due to Gold [9] in 1968. Gold considered binary sequences and showed that \(d = 2^{k} + 1\) for n odd and gcd(n, k) = 1 gave a three-valued cross-correlation. Note that the condition n odd was later relaxed to n∕gcd(n, k) odd which still implies that \(\gcd (d,2^{n} - 1) = 1\).

In 1968 Golomb [11] was the first to conjecture that \(d = 2^{2k} - 2^{k} + 1\) leads to a three-valued cross-correlation when n∕gcd(n, k) is odd, and he mentioned in this paper that this result was first proved by Welch, who never published his proof. Later in 1971 Kasami [19] published a proof in his famous paper on the weight distribution of several subcodes of the second-order Reed–Muller code.

Theorem 3

Let e = gcd (n,k) and let n∕e be odd. Let \(d = 2^{k} + 1\) or \(d = 2^{2k} - 2^{k} + 1\) . Then C d (τ) has the following distribution:

$$\displaystyle{\begin{array}{llll} - 1 + 2^{\frac{n+e} {2} } & \text{occurs} &2^{n-e-1} + 2^{\frac{n-e-2} {2} } & \text{times. } \\ - 1 &\text{occurs }&2^{n} - 2^{n-e} - 1 &\text{times. } \\ - 1 - 2^{\frac{n+e} {2} } & \text{occurs} &2^{n-e-1} - 2^{\frac{n-e-2} {2} } & \text{times.} \end{array} }$$

The proof of these decimations use, a simple squaring technique combined with arguments to determine the number of solutions of some linearized polynomial. In the case \(d = 2^{k} + 1\), one can compute rather directly, using simple properties of the trace function, that

$$\displaystyle\begin{array}{rcl} (C_{d}(\tau ) + 1)^{2} = 2^{n}(1 + (-1)^{\mathit{Tr}_{n} (c+1)}).& & {}\\ \end{array}$$

It follows that C d (τ) can only take the values \(-1,-1 \pm 2^{\frac{n+1} {2} }\), and the distribution can be determined from (3) and (4) in Lemma 1.

In the case \(d = 2^{2k} - 2^{k} + 1\) when n is odd and gcd(n, k) = 1, a similar squaring argument gives

$$\displaystyle{(C_{d}(\tau ) + 1)^{2} = 2^{n}N}$$

where N is either 0 or equal to the number of zeros in GF(2n) of the linearized polynomial

$$\displaystyle{L(z) = z^{2^{6k} } + c^{2^{3k} }z^{2^{4k} } + c^{2^{2k} }z^{2^{2k} } + z.}$$

In this case a more detailed argument shows that there is only 1 or 2 solutions in GF(2n) of L(z) = 0 and that C d (τ) can only take on the values \(-1,-1 \pm 2^{\frac{n+1} {2} }\).

The generalization to the general case when gcd(n, k) = e > 1 and ne is odd is rather straightforward but more cumbersome. An elegant method for counting the solutions of the equation above is given by Bracken [1].

The following theorem provides a list of all decimations known to give three-valued cross-correlation in the binary case.

Theorem 4

The cross-correlation C d (τ) is three-valued and the correlation distribution is known for the following values of d:

  1. 1.

    \(d = 2^{k} + 1\) , where n∕gcd (n,k) is odd.

  2. 2

    \(d = 2^{2k} - 2^{k} + 1\) , where n∕gcd (n,k) is odd.

  3. 3.

    \(d = 2^{\frac{n} {2} } + 2^{\frac{n+2} {4} } + 1\) , where \(n \equiv 2\pmod 4\) .

  4. 4.

    \(d = 2^{\frac{n+2} {2} } + 3\) , where \(n \equiv 2\pmod 4\) .

  5. 5.

    \(d = 2^{\frac{n-1} {2} } + 3\) , where n is odd.

  6. 6.
    $$\displaystyle{d = \left \{\begin{array}{ll} 2^{\frac{n-1} {2} } + 2^{\frac{n-1} {4} } - 1,\;\text{ when }n \equiv 1\pmod 4 \\ 2^{\frac{n-1} {2} } + 2^{\frac{3n-1} {4} } - 1,\text{ when }n \equiv 3\pmod 4.\end{array} \right.}$$

Comments Case (1) is the celebrated result proved by Gold [9]. Case (2) is the result first proved by Welch (see Golomb [11] and Kasami [19]). Cases (3) and (4) were proved by Cusick and Dobbertin [4] in 1996. Case (5) was a long-standing conjecture by Welch (see Golomb [11]) that was proved 30 years later by Canteaut et al. [2]. Case (6) is a consequence of the results by Dobbertin [6] and Hollmann and Xiang [16]. Cases (3), (4), and (6) were all conjectured in 1972 by Niho [23].

Since the conjectures (3), (4), and (6) by Niho [23] in 1972, which all have been proved, no new decimations of m-sequences have been found to give a three-valued cross-correlation, and it is widely believed that the list of decimations of binary m-sequences in the theorem above is complete.

Open Problem 1

Show that Theorem 4 contains all decimations with three-valued correlation between binary m-sequences.

This appears to be a very hard open problem. All decimations known to have only three values have their three values of the form \(-1,-1 \pm 2^{r}\) for some r. Even to show that any three-valued decimation must have three such values is not known.

3.2 Nonbinary Sequences

For nonbinary sequences there are three-valued decimations that are analogous to the Gold as well as to the Kasami and Welch decimations. These are given in the following result due to Trachtenberg [25], for n odd, in his Ph.D. thesis from 1970. The result is generalized by Helleseth [12] (or actually in his master thesis from 1971) to the case when n∕gcd(n, k) is odd. The generalization is rather straightforward using the properties of the subfield GF(pk) of GF(pn).

Theorem 5

Let p be an odd prime. Then the following decimations have three-valued cross-correlation.

  1. 1.

    \(d = \frac{p^{2k}+1} {2}\) where n∕gcd (n,k) is odd.

  2. 2.

    \(d = p^{2k} - p^{k} + 1\) where n∕gcd (n,k) is odd.

These are the only decimations for p > 3 that are known to give three-valued cross-correlation.

Open Problem 2

Show that Theorem 5 contains all decimations with three-valued cross-correlation between p-ary m-sequences when p > 3 is an odd prime.

For the ternary case there is an additional decimation with three-valued cross-correlation given in the following result by Dobbertin et al. [8].

Theorem 6

Let p = 3 and \(d = 2 \cdot 3^{\frac{n-1} {2} } + 1\) where n is an odd positive integer. Then the cross-correlation C d (τ) is three-valued and has the following distribution:

$$\displaystyle{\begin{array}{llll} - 1 + 3^{\frac{n+1} {2} } & \text{occurs}&\frac{1}{2}(3^{n-1} + 3^{\frac{n-1} {2} })&\text{times. } \\ - 1 &\text{occurs}&3^{n} - 3^{n-1} - 1 &\text{times. } \\ - 1 - 3^{\frac{n+1} {2} } & \text{occurs}&\frac{1}{2}(3^{n-1} - 3^{\frac{n-1} {2} })&\text{times.}\end{array} }$$

There are numerical observations of the cross-correlation between ternary sequences that give decimations with three-valued cross-correlation and that have not yet been proved. If the following open problem, conjectured by Dobbertin et al. [8], is settled, this would explain all the known decimations of ternary m-sequences with three-valued cross-correlation. Actually, a solution of the problem would complete the explanation of all currently known three-valued cross-correlation decimations for any p.

Open Problem 3

Let p = 3 and \(d = 2 \cdot 3^{r} + 1\) where n is odd and

$$\displaystyle{r = \left \{\begin{array}{ll} \frac{n-1} {4} & \quad \text{ if}\ \text{n}\ \equiv 1\pmod 4, \\ \frac{n-1} {4} & \quad \text{ if}\ \text{n} \equiv 3\pmod 4. \end{array} \right.}$$

Show that C d (τ) has three-valued cross-correlation.

If n∕gcd(n, k) is odd, Theorems 4 and 5 imply the existence of decimations with three-valued cross-correlation.

In the remaining cases when n = 2i for some positive integer i ≥ 2, there are no known decimations having three-valued cross-correlation. It was conjectured by Helleseth [12] that in these cases, any decimation gives at least four cross-correlation values. This conjecture has recently been proved in the binary case by Katz [20]. The general case to settle the conjecture for all other values of p is still open.

Open Problem 4

Show that C d (τ) is at least four-valued when n = 2i for all values of the prime p.

4 Four-Valued Cross-Correlation

One of the main contributions leading to new decimations with four-valued cross-correlation is due to Niho [23]. For his method to be applicable, then n = 2k has to be even and d must be of the special form \(d = s(2^{k} - 1) + 1\).

The main idea is to reduce the problem to compute the number of solutions of some special equations that depend on s.

The next theorem provides a list, in historical order, of all the decimations that have been proved to give four-valued cross-correlation. An important observation is that all the results in (1)–(4) are covered by the last case (5).

Theorem 7

Let v 2 (i) be the highest power of 2 dividing the integer i. The cross-correlation C d (τ) is four-valued and the correlation distribution is known for the following values of d:

  1. 1.

    \(d = 2^{\frac{n} {2} +1} - 1\) , where \(n \equiv 0\pmod 4\) .

  2. 2.

    \(d = (2^{\frac{n} {2} } + 1)(2^{\frac{n} {4} } - 1) + 2\) , where \(n \equiv 0\pmod 4\) .

  3. 3.

    \(d = \frac{2^{(n/2+1)r}-1} {2^{r}-1}\) \((0 <r <n/2,\gcd (n,r) = 1)\) for \(n \equiv 0\pmod 4\) .

  4. 4.

    \(d = \frac{2^{2k}+2^{s+1}-2^{k+1}-1} {2^{s}-1}\) , where n = 2k and 2s|k.

  5. 5.

    \(d = (2^{k} - 1)s + 1\), \(s \equiv 2^{r}(2^{r} \pm 1)^{-1}\pmod 2^{k} + 1\) , where \(v_{2}(r) <v_{2}(k)\) .

Comments The first two cases in Theorem 7 are due to Niho [23] in his Ph.D. thesis. Case (3) is due to Dobbertin [5]. Case (4) was proved by Helleseth and Rosendahl [15]. The final case (5) is proved by Dobbertin et al. [7] and contains all the four previous cases.

Sketch of proof

Since the Niho decimations have played a significant role in the cross-correlation of m-sequences, we will provide a short outline of the proof.

The main idea behind the proof of Theorem 7 is very simple and uses that any nonzero element x ∈ GF(2n) can be written uniquely as x = yz where y ∈ GF(2k) and z ∈ U where

$$\displaystyle{U =\{ z \in \mathrm{ GF(2^{n})}\;\vert \;z^{2^{k}+1 } = 1\}.}$$

In particular, since \(d = s(2^{k} - 1) + 1\), it follows that \(d \equiv 1\pmod 2^{k} - 1\) and therefore \(d \equiv -2s + 1\pmod 2^{k} + 1\). Hence, y d = y and \(z^{d} = z^{-2s+1}\) and the cross-correlation can be written as

$$\displaystyle\begin{array}{rcl} C_{d}(\tau )& =& \sum _{x\in \mathrm{GF(2^{n})^{{\ast}}}}(-1)^{\mathit{Tr}_{m} (\mathit{cx}+x^{d}) } {}\\ & =& \sum _{y\in \mathrm{GF(2^{n})^{{\ast}}},z\in U}(-1)^{\mathit{Tr}_{n} (\mathit{cyz}+\mathit{yz}^{-2s+1}) } {}\\ & =& \sum _{y\in \mathrm{GF(2^{n})^{{\ast}}},z\in U}(-1)^{\mathit{Tr}_{k} (\mathit{yh}(z))} {}\\ & =& (2^{k} - 1)N + (2^{k} + 1 - N)(-1) {}\\ & =& -1 + (N - 1)2^{k}. {}\\ \end{array}$$

Here N is the number of solutions z ∈ U of the equation h(z) = 0 where

$$\displaystyle{h(z) = \mathit{cz} + z^{-2s+1} + c^{2^{k} }z^{-1} + z^{2s-1}}$$

which is equivalent to N being the number of solutions z ∈ U to

$$\displaystyle{p(z) = z^{2s-1} + c^{1/2}z^{s} + c^{2^{k-1} }z^{s-1} + 1 = 0.}$$

Case (1) is one of the two decimations in Niho’s thesis shown to be four-valued. In this case s = 2, i.e., \(d = 2(2^{k} - 1) + 1\) and

$$\displaystyle{p(z) = z^{3} + c^{1/2}z^{2} + c^{2^{k-1} }z + 1 = 0}$$

which has at most three solutions for z. Hence, N = 0, 1, 2, 3 are the only possibilities leading to at most a four-valued cross-correlation with values in the set

$$\displaystyle{\{-1 - 2^{k},-1,-1 + 2^{k},-1 + 2^{k+1}\}.}$$

The correlation distribution follows from Lemma 1 using (3)–(5) and finding a 3.

In the other cases (2)–(5) in Theorem 7, we have

$$\displaystyle{p(z) = z^{2^{r}+1 } + \mathit{az}^{2^{r} } + \mathit{bz} + 1 = 0}$$

which is known to have 0, 1, 2 or 2gcd(r, n) + 1 solutions in GF(2n). A more detailed analysis shows that the number of solutions in U has these four possibilities and the four-valued cross-correlation distribution can be found as above.

There are numerical results that give decimations with four values that are not explained by this list. However, one believes that case (5) in Theorem 7 contains all four-valued cases when d is of the Niho form \(d = s(2^{k} - 1) + 1\). The following conjecture was stated in Dobbertin et al. [7].

Open Problem 5

Any binary decimation of Niho type \(d = s(2^{k} - 1) + 1\), n = 2k with four-valued cross-correlation is of the form \(d = (2^{k} - 1)s + 1\), \(s \equiv 2^{r}(2^{r} \pm 1)^{-1}\pmod 2^{k} + 1\), where \(v_{2}(r) <v_{2}(k)\).

In the nonbinary case there are a few families known with four-valued cross-correlation. The following decimation in Helleseth [12] is the only known four-valued decimation that works for any prime p.

Theorem 8

Let p be an odd prime and \(d = 2 \cdot p^{\frac{n} {2} +1} - 1\) , where \(n \equiv 0\pmod 4\) . Then the cross-correlation C d (τ) is four-valued and the distribution is known.

Recently new ternary decimations with four-valued cross-correlation have been found by Zhang et al. [26].

Theorem 9

Let p = 3, n = 3k, and gcd (k,3) = 1. If \(d = 3^{k} + 1\) or \(d = 3^{2k} + 2\) . Then if r is odd, the cross-correlation C d (τ) is four-valued (and six-valued for r even) and the distribution is known. (The distribution is conjectured to be the same if gcd (k,3) = 3).

5 The − 1 Conjecture

For binary sequences the cross-correlation values are obviously always integers. For p > 2, this may not always be the case even though the values of C d (τ) are always real numbers. This follows from the definition of the cross-correlation function and the fact that the second half of an m-sequence is the negative of the first half. It was shown in Helleseth [12] that C d (τ) is an integer for all τ if and only if \(d \equiv 1\pmod p - 1\).

Numerical results reveal that for p = 2, the cross-correlation always has − 1 as one of its values. For p > 2 this happens for all decimations d where \(d \equiv 1\pmod p - 1\). This was conjectured by Helleseth [12] and this is still an open problem.

Open Problem 6

Show that if \(d \equiv 1\pmod p - 1\), then − 1 always occurs as a value in C d (τ).

It is trivial to reformulate the conjecture as a result of the number of common solutions of a special equation system.

Lemma 2

Let q = p n and α be a primitive element in GF(q) . Let N be the number of solutions x i GF(q) of the equation system:

$$\displaystyle{\begin{array}{llrrrrrrrrr} x_{0} & +& \alpha x_{1} & +& \alpha ^{2}x_{2} & +&\cdots &+&\alpha ^{q-2}x_{q-2} & =& 0 \\ x_{0}^{d}&+&x_{1}^{d}&+&x_{2}^{d}&+&\cdots &+& x_{q-2}^{d}& =&0. \end{array} }$$

The − 1 conjecture holds if and only if \(N = q^{q-3}\) for all d where \(\gcd (d,p^{n} - 1) = 1\) and \(d \equiv 1\pmod p - 1\) .

Proof

Let N denote the number of common solutions of the two equations above. Then N can be expressed by the following exponential sum.

$$\displaystyle\begin{array}{rcl} q^{2}N& =& \sum _{ x_{0},x_{1},\ldots,x_{q-2}\in \mathrm{GF(q)}}\sum _{z_{1},z_{2}\in \mathrm{GF(q)}}\omega ^{\mathit{Tr}_{n} (z_{1}(x_{0}+\alpha x_{1}+\cdots +\alpha ^{q-2}x_{ q-2})+z_{2}(x_{0}^{d}+x_{ 1}^{d}+\cdots +x_{ q-2}^{d})) } {}\\ & =& \sum _{z_{1},z_{2}\in \mathrm{GF(q)}}\prod _{i=0}^{q-2}\sum _{ x\in \mathrm{GF(q)}}\omega ^{\mathit{Tr}_{n} (z_{1}\alpha ^{i}x+z_{ 2}x^{d}) } {}\\ & =& q^{q-1} + (q - 1)\prod _{ c\in \mathrm{GF(q)}}\sum _{x\in \mathrm{GF(q)}}\omega ^{\mathit{Tr}_{n} (\mathit{cx}+x^{d}) } {}\\ & =& q^{q-1} +\prod _{ \tau =0}^{q-2}(C_{ d}(\tau ) + 1) {}\\ & & {}\\ \end{array}$$

since the contribution from \(z_{1} = z_{2} = 0\) is q q−2, and z 1 = 0 or z 2 = 0 contributes 0 if not both are zero. Furthermore, \(z_{1}/z_{2}^{d^{-1} }\) runs through all nonzero elements in GF(q) q − 1 times when z 1 and z 2 run through all nonzero elements in the field. Hence, \(C_{d}(\tau ) = -1\) for some τ if and only of \(N = q^{q-3}\).

Another old problem on the cross-correlation between m-sequences that is more than 30-year-old is the following conjecture due to Sarwate and Pursley [24].

Open Problem 7

Let n be even. Show that \(\vert C_{d}(\tau ) + 1\vert \geq 2^{\frac{n} {2} +1}\).

For related surveys on m-sequences the reader is referred to [13, 14]. Other interesting results and open problems on this topic can be found in [18, 22].

6 Relations to APN and AB Functions

The cross-correlation of m-sequences has some interesting relations to almost perfect nonlinear mappings (APN) and almost bent functions (AB).

An almost perfect nonlinear function f is a mapping \(f:\mathrm{ GF(2^{n})}\mapsto \mathrm{GF(2^{n})}\) such that

$$\displaystyle{f(x + a) + f(x) = b}$$

has at most two solutions for any a ≠ 0, b ∈ GF(2n). The function is said to be Δ-uniform if the maximum number of solutions is Δ, such that an APN function is the same as being 2-uniform.

The Walsh transform of f is defined by

$$\displaystyle{\lambda _{f}(a,b) =\sum _{x\in \mathrm{GF(2^{n})}}(-1)^{\mathit{Tr}(\mathit{af }(x)+\mathit{bx})},}$$

where a, b ∈ GF(2n).

A function f is almost bent (AB) if

$$\displaystyle{\{\lambda _{f}(a,b): a,b \in \mathrm{ GF(2^{n})}\} =\{ 0,\pm 2^{(n+1)/2}\}.}$$

It has been shown by Chaubaud and Vaudenay [3] that AB implies APN. APN functions and AB functions are of significant importance in the design of S-boxes in block ciphers.

Monomial AB functions where f(x) = x d can be obtained from Gold sequences and several of the decimations with three-valued cross-correlation.

Theorem 10

The known monomial AB functions f(x) = x d are

  1. 1.

    Gold: \(d = 2^{k} + 1\) , where gcd (n,k) = 1.

  2. 2.

    Kasami: \(d = 2^{2k} - 2^{k} + 1\) , where gcd (n,k) = 1.

  3. 3.

    Welch: \(d = 2^{\frac{n-1} {2} } + 3\) , where n is odd.

  4. 4.

    Niho:

    $$\displaystyle{d = \left \{\begin{array}{ll} 2^{\frac{n-1} {2} } + 2^{\frac{n-1} {4} } - 1,\;if\;n \equiv 1\pmod 4 \\ 2^{\frac{n-1} {2} } + 2^{\frac{3n-1} {4} } - 1,if\;n \equiv 3\pmod 4.\end{array} \right.}$$

Note that each of these cases corresponds to decimations with three-valued cross-correlation where the values are restricted to the set \(\{0,\pm 2^{(n+1)/2}\}\). Thus, each corresponding monomial function f(x) = x d is AB. Dobbertin [6] conjectured that these are the only monomial AB functions.

Open Problem 8 Show that Theorem 10 contains all monomial f(x) = x d AB functions.

Since a monomial AB function is an APN function, the monomial functions f(x) = x d with d in Theorem 10 are also APN functions. In addition there are two more decimations leading to APN functions and which are not AB. The known monomial APN functions are given in the following theorem.

Theorem 11

The known monomial APN functions f(x) = x d are

  1. 1.

    Gold: \(d = 2^{k} + 1\) , where gcd (n,k) = 1.

  2. 2.

    Kasami: \(d = 2^{2k} - 2^{k} + 1\) , where gcd (n,k) = 1.

  3. 3.

    Welch: \(d = 2^{\frac{n-1} {2} } + 3\) , where n is odd.

  4. 4.

    Niho:

    $$\displaystyle{d = \left \{\begin{array}{ll} 2^{\frac{n-1} {2} } + 2^{\frac{n-1} {4} } - 1,\quad if\;n \equiv 1\pmod 4 \\ 2^{\frac{n-1} {2} } + 2^{\frac{3n-1} {4} } - 1,\quad if\;n \equiv 3\pmod 4.\end{array} \right.}$$
  5. 5.

    Inverse: \(d = 2^{n} - 2 \equiv -1\pmod 2^{n} - 1\) , where n is odd.

  6. 6.

    Dobbertin: \(d = 2^{4k} + 2^{3k} + 2^{2k} + 2^{k} - 1\) , where n = 5k.

Dobbertin [6] conjectured that these are the only monomial APN functions.

Open Problem 9 Show that Theorem 11 contains all monomial f(x) = x d APN functions.

It is easy to show that the cross-correlation values between two m-sequence obey \(C_{d}(\tau ) \equiv -1\pmod 4\). The cross-correlation between {s(t)} and its reverse sequence {s(−t)} corresponds to the famous Kloosterman sum defined by

$$\displaystyle{C_{-1}(\tau ) =\sum _{x\in \mathrm{GF(p^{n})^{{\ast}}}}\omega ^{\mathit{Tr}(\mathit{ax}+x^{-1}) }.}$$

A well-known bound for the Kloosterman sum is

$$\displaystyle{\vert C_{-1}(\tau )) + 1\vert \leq 2p^{n/2}\,.}$$

For p = 2 it was shown by Lachuad and Wolfmann [21] that C −1(τ) takes on all possible values \(\equiv -1\pmod 4\) that obey this bound.

The S-box used in the Advanced Encryption Standard (AES) is a permutation based on \(f(x) = x^{-1}\) for n = 8. The correlation between x −1 and all affine functions take on the same values as | C −1(τ) | when \(\tau = 0,1,\ldots,p^{n} - 2\). The S-box is 4-uniform (not APN) which is the best known uniformity for n = 8. The S-box is not AB but the correlation (and nonlinearity) is the best known for n = 8.

Conclusion

The cross-correlation of m-sequences is a challenging mathematical problem that has many important applications in communication systems. This chapter presents an updated overview of this problem and presented some of the remaining open problems that still exist in this area. Finally, a few connections have been given to AB and APN functions that are important in the design and analysis of S-boxes in block ciphers.