In Chapter 6, we studied Reed–Solomon codes, codes whose codewords are the evaluation of polynomials in one variable of degree at most k − 1 at the elements of \({\mathbb F}_q \cup \{\infty \}\). Reed–Solomon codes are short length codes, where the length n is bounded by q + 1, and only useful when we take the field to be large. The alternant codes which we constructed from generalised Reed–Solomon codes in Chapter 7 allowed us to construct codes over small fields and we put this to good use. In this chapter we will consider another generalisation of Reed–Solomon codes, codes whose codewords are the evaluation of polynomials in many variables. This again allows us to construct linear codes over small fields and we will restrict our attention, for the most part, to binary linear codes. It will turn out that these codes are not asymptotically good. Nevertheless, they are an important class of codes which are widely implemented due to the availability of fast decoding algorithms. One example of such a decoding algorithm is the majority-logic decoding algorithm that we will study here. We will then go on and construct Kerdock codes which are certain subcodes of the second-order Reed–Muller codes. These codes can give examples of non-linear codes with parameters for which no linear code exists.

9.1 Binary Reed–Muller Codes

A Boolean function from \({\mathbb F}_2^m\) to \({\mathbb F}_2\) is the evaluation map of a polynomial with coefficients from \({\mathbb F}_2\) in m variables generated by monomials in which the degree of any particular indeterminate is at most 1.

Note that for both elements x of \({\mathbb F}_2\), x 2 = x, so the function defined by the evaluation of the polynomial \(x_1^2x_2^3x_3\) at the elements of \({\mathbb F}_2^3\) and the polynomial x 1x 2x 3 will be the same. Therefore, it makes sense that when considering evaluations of polynomials in many variables over \({\mathbb F}_2\), we restrict our attention to Boolean functions.

The r-th order Reed–Muller code is a binary code R(r, m) of length 2m defined by

$$\displaystyle \begin{aligned} R(r,m)=\{ (f(a_1),\ldots,f(a_{2^m})) \ | \ \deg f \leqslant r \}, \end{aligned}$$

where \(\{a_1,\ldots ,a_{2^m}\}\) is the set of vectors of \({\mathbb F}_2^m\) and f runs through all Boolean functions that are defined by polynomials in m indeterminates of degree at most r.

The code R(r, m) is a linear code over \({\mathbb F}_2\), since

$$\displaystyle \begin{aligned} (f(a_1),\ldots,f(a_{2^m}))+ (g(a_1),\ldots,g(a_{2^m}))= ((f+g)(a_1),\ldots,(f+g)(a_{2^m})). \end{aligned}$$

The vector space of Boolean functions of degree at most r in m variables has a canonical basis, which is the set of monomials of degree at most r in m variables and degree at most one in any particular variable. Therefore, the code R(r, m) has a generator matrix whose rows are indexed by these monomials. For example, the set of monomials

$$\displaystyle \begin{aligned} \{1,x_1,\ldots,x_m,x_1x_2,\ldots,x_{m-1}x_m\} \end{aligned}$$

is a basis for the vector space of Boolean functions in m variables of degree at most 2.

Example 9.1

The 11 × 16 matrix

$$\displaystyle \begin{aligned} \mathrm{G}=\left( \begin{array}{cccccccccccccccc} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1\\ 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1\\ 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1& 1 & 1\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 &0 & 0 & 1 & 1\\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 &0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 &0 & 0 & 0 & 1 &0 & 0 & 0 & 1 &0 & 0 & 0 & 1 \\ \end{array} \right)\begin{array}{l}1 \\ x_1 \\ x_2 \\ x_3 \\ x_4 \\x_1x_2 \\ x_1x_3 \\ x_1x_4 \\ x_2x_3 \\ x_2x_4 \\ x_3x_4 \end{array} \end{aligned}$$

is a generator matrix of the code R(2, 4), with the rows being indexed by the monomials in four variables of degree at most two. \(\blacksquare \)

We have already proved the following lemma.

Lemma 9.2

R(r, m) is a linear code of length 2m and of dimension

$$\displaystyle \begin{aligned} 1+{m \choose 1}+{m \choose 2} + \ldots + {m \choose r}. \end{aligned}$$

Since R(r, m) is linear, Lemma 4.1 implies that its minimum distance is equal to the minimum weight of a non-zero codeword. In the above example, evidently R(2, 4) has codewords of weight 4 and this is indeed its minimum distance. In Theorem 9.3 we will calculate the minimum distance for binary Reed–Muller codes. Later, in Theorem 9.15, we will calculate the minimum distance for non-binary Reed–Muller codes.

Theorem 9.3

The minimum distance of R(r, m) is 2mr.

Proof

By induction on m. If m = r, then the evaluation of the polynomial X 1X r is a codeword of weight one.

Suppose that the minimum distance of R(r, m) is 2mr.

Order the vectors of \({\mathbb F}_2^{m+1}\) so that the first 2m vectors have x m+1 = 0.

A codeword (u, u + v) of R(r, m + 1) is the evaluation of a polynomial

$$\displaystyle \begin{aligned} f(X)+X_{m+1}g(X),\end{aligned}$$

where f(X) is a polynomial of degree at most r in m variables and g(X) is a polynomial of degree at most r − 1 in m variables. Then u ∈ R(r, m), since it is the evaluation of f(X) and v ∈ R(r − 1, m), since it is the evaluation of g(X).

If u = 0, then the codeword is (0, v) and by induction has non-zero weight at least 2m−(r−1).

If u + v = 0, then u = v ∈ R(r − 1, m) and so the codeword (u, 0) has non-zero weight at least 2m−(r−1).

If neither u nor u + v is zero, then (u, u + v) has weight at least 2 ⋅ 2mr = 2mr+1, since both u and u + v are in R(r, m).

Thus, the minimum weight of a non-zero codeword of R(r, m + 1) is 2mr+1. By Lemma 4.1, the minimum weight of a non-zero codeword of a linear code is equal to its minimum distance. □

9.2 Decoding Reed–Muller Codes

The popularity of Reed–Muller codes in real-world applications is due in part to the fact that there are fast decoding algorithms, the most common of which is the focus of this section. Before we consider this decoding algorithm, we first prove a couple of lemmas which prove some properties of Boolean functions.

For each non-empty subset J of {1, …, m}, let

$$\displaystyle \begin{aligned} f_J(X)=\prod_{j \in J} X_j \end{aligned}$$

and define

$$\displaystyle \begin{aligned} f_{\emptyset}(X)=1. \end{aligned}$$

Then

$$\displaystyle \begin{aligned} \{ f_J(X) \ | \ J \subseteq \{1,\ldots,m \}, \ |J | \leqslant r \} \end{aligned}$$

is a basis for the space of polynomials in m variables of degree at most r whose evaluations define Boolean functions.

We will exploit the following lemma repeatedly.

Lemma 9.4

Let J be a subset of {1, …, m}. Suppose

$$\displaystyle \begin{aligned} g(X)=\sum_{L \subseteq \{1,\ldots,m\}} a_L f_L(X), \end{aligned}$$

for some \(a_L \in {\mathbb F}_2\), where the sum is over all subsets L of size at most m −|J|.

Then

$$\displaystyle \begin{aligned} \sum_{ x \in {\mathbb F}_2^m } f_J(x)g(x)=a_{\{1,\ldots,m\} \setminus J}. \end{aligned}$$

Proof

Let K ⊆{1, …, m}.

If there is an i ∈{1, …, m}∖ K, then

$$\displaystyle \begin{aligned} \sum_{ \{x \in {\mathbb F}_2^m | x_i=0 \}} f_K(x)=\sum_{ \{x \in {\mathbb F}_2^m | x_i \neq 0 \}} f_K(x). \end{aligned}$$

This implies

$$\displaystyle \begin{aligned} \sum_{ x \in {\mathbb F}_2^m } f_K(x)=0, \end{aligned} $$
(9.1)

unless K = {1, …, m}.

Then

$$\displaystyle \begin{aligned} \sum_{ x \in {\mathbb F}_2^m } f_J(x)g(x)=\sum_{L \subseteq \{1,\ldots,m\}} \sum_{ x \in {\mathbb F}_2^m } a_L f_J(x)f_L(x), \end{aligned}$$

where the first sum on the right-hand side is over all subsets L of size at most m −|J|.

This expression is equal to

$$\displaystyle \begin{aligned} \sum_{L \subseteq \{1,\ldots,m\}} a_L \sum_{ x \in {\mathbb F}_2^m } f_{J\cup L}(x)=a_{\{1,\ldots,m\} \setminus J}, \end{aligned}$$

by (9.1). □

Theorem 9.5

The dual of the code R(r, m) is the Reed–Muller code R(m  r − 1, m).

Proof

A codeword u of R(r, m) is the evaluation of a polynomial

$$\displaystyle \begin{aligned} g(X)=\sum_{K \subseteq \{1,\ldots,m\}} a_K f_K(X), \end{aligned}$$

for some \(a_K \in {\mathbb F}_2\), where the sum is over all subsets of size at most r.

A codeword v of R(m − r − 1, m) is the evaluation of

$$\displaystyle \begin{aligned} h(X)=\sum_{L \subseteq \{1,\ldots,m\}} b_L f_L(X), \end{aligned}$$

for some \(b_L \in {\mathbb F}_2\), where the sum is over all subsets of size at most m − r − 1.

The inner product of u and v is

$$\displaystyle \begin{aligned} \sum_{ x \in {\mathbb F}_2^m } h(x)g(x)=\sum_{ x \in {\mathbb F}_2^m } \sum_K \sum_L a_K b_L f_K(x) f_L(x) \end{aligned}$$
$$\displaystyle \begin{aligned} \quad \quad = \sum_K \sum_L a_K b_L\sum_{ x \in {\mathbb F}_2^m } f_{K\cup L}(x)=0, \end{aligned}$$

by Lemma 9.4.

Therefore,

$$\displaystyle \begin{aligned} R(m-r-1,m) \subseteq R(r,m)^{\perp}. \end{aligned}$$

By Theorem 9.3, the sum of the dimensions of R(r, m) and R(m − r − 1, m) is 2m, which is the length of the codes.

Hence,

$$\displaystyle \begin{aligned} \dim R(m-r-1,m) = \dim R(r,m)^{\perp}. \end{aligned}$$

The following lemma is fundamental to the decoding algorithm.

Lemma 9.6

Let

$$\displaystyle \begin{aligned} g(X)=\sum_{K \subseteq \{1,\ldots,m\}, \ |K| \leqslant r} b_K f_K(X), \end{aligned}$$

where \(b_K \in {\mathbb F}_2\) and let J be a subset of {1, …, m} of size r.

For all 2mr choices of \(a_i \in {\mathbb F}_2\), i ∈{1, …, m}∖ J,

$$\displaystyle \begin{aligned} \sum_{x \in {\mathbb F}_2^n} g(x)\prod_{i \in \{1,\ldots,m\} \setminus J} (x_i+a_i)=b_J. \end{aligned}$$

Proof

When we expand the product in the sum, all terms have degree less than m except those coming from

$$\displaystyle \begin{aligned} g(x)\prod_{i \in \{1,\ldots,m\} \setminus J} x_i=g(x)f_{\{1,\ldots,m\} \setminus J}(x). \end{aligned}$$

The lemma follows from Lemma 9.4. □

We are now in a position to describe a decoding algorithm for Reed–Muller codes, which is an example of a majority-logic decoding algorithm. Let v be the received vector, whose coordinates v x are indexed by the vectors \(x \in {\mathbb F}_2^m\). For each subset J of {1, …, m} of size r, we perform a test. We wish to determine whether u J is zero or one, where the sent codeword u is the evaluation of

$$\displaystyle \begin{aligned} \sum_{J \subseteq \{1,\ldots,m\}, \ |J| \leqslant r} u_J f_J(X). \end{aligned}$$

For all 2mr choices of \(a_i \in {\mathbb F}_2\), i ∈{1, …, m}∖ J, we calculate

$$\displaystyle \begin{aligned} \sum_{x \in {\mathbb F}_2^m} v_x \prod_{i \in \{1,\ldots,m\} \setminus J} (x_i+a_i). \end{aligned}$$

If the result of this test is 1 in the majority of cases, then we conclude that u J = 1 and vice versa, if it is 0 in the majority of cases, then we conclude that u J = 0. Once we have completed this for all subsets J of {1, …, m} of size r, we subtract the evaluation of

$$\displaystyle \begin{aligned} \sum_{K \subseteq \{1,\ldots,m\}, \ |K|= r} u_K f_K(X), \end{aligned}$$

from the received vector and continue with the subsets of size r − 1 supposing that, if we are correctly decoding, we now have a corrupted codeword of R(r − 1, m).

All that remains to be shown, to prove that this decoding algorithm will correct up to 2mr−1 − 1 error bits, is to show that an error bit will only affect one of the tests. Before we prove this in Lemma 9.8, we consider an example.

Example 9.7

Suppose that we have encoded using R(2, 4) and have received

$$\displaystyle \begin{aligned} v=(1,1,1,0,0,1,1,0,1,0,0,0,0,1,0), \end{aligned}$$

where the vectors of \({\mathbb F}_2^4\) are ordered as in the matrix G in Example 9.1.

We calculate

$$\displaystyle \begin{aligned} w=v\mathrm{G}^t=(1,0,1,1,0,1,1,0,0,1,0).\end{aligned} $$

The coordinates are indexed by subsets of {1, 2, 3, 4} of size at most 2, as in Example 9.1. Indexing the coordinates explicitly

where

$$\displaystyle \begin{aligned} w_J=\sum_{ x \in {\mathbb F}_2^m} v_x f_J(x) \end{aligned}$$

are the coordinates of w.

We start by determining u J for the subsets J of size r = 2.

To determine u {12}, we make 2mr = 4 tests by calculating

$$\displaystyle \begin{aligned} \sum_{ x \in {\mathbb F}_2^m} v_x x_3x_4, \sum_{ x \in {\mathbb F}_2^m} v_x (x_3x_4+x_3), \sum_{ x \in {\mathbb F}_2^m} v_x (x_3x_4+x_4) \end{aligned}$$

and

$$\displaystyle \begin{aligned} \sum_{ x \in {\mathbb F}_2^m} v_x (x_3x_4+x_3+x_4+1), \end{aligned}$$

which is

$$\displaystyle \begin{aligned} w_{\{34\}},w_{\{34\}}+w_{\{3\}},w_{\{34\}}+w_{\{4\}} \ \mathrm{and} \ w_{\{34\}}+w_{\{3\}}+w_{\{4\}}+w_{\emptyset}, \end{aligned}$$

respectively.

The results of these tests are 0, 1, 0, 0, respectively, so we decode u {12} as 0, since there are a majority of zeros.

The following table lists the results of these tests for all subsets of size 2 and indicates the majority decision.

Based on the results of those tests, we subtract

$$\displaystyle \begin{aligned} (0,0,0,0,0,0,1,1,0,1,1)\mathrm{G} \end{aligned}$$

from v and get

$$\displaystyle \begin{aligned} v^1=v+(0,0,0,0,0,0,1,1,0,1,1)\mathrm{G}=(1,1,1,1,0,0,1,0,1,1,1,1,0,0,0,0). \end{aligned}$$

If we are decoding correctly, v 1 should be a (possibly) corrupted codeword of R(1, 4). To determine u J, where J is a subset of size 1, we repeat the above.

We calculate

$$\displaystyle \begin{aligned} w^1=v^1\mathrm{G}_1^t, \end{aligned}$$

where G1 is the generator matrix of R(3, m). This vector will have coordinates

$$\displaystyle \begin{aligned} w^1_K=\sum_{x \in {\mathbb F}_2^m} f_K(x)v^1_x, \end{aligned}$$

where K is a subset of {1, 2, 3, 4} of size at most 3.

Indexing the coordinates explicitly as before

allows us to perform 2m−(r−1) = 8 tests for each u J.

To determine u {1}, we make 8 tests by calculating

$$\displaystyle \begin{aligned} w^1_{\{234\}},w^1_{\{234\}}+w^1_{\{23\}},w^1_{\{234\}}+w^1_{\{24\}}, w^1_{\{234\}}+w^1_{\{34\}}, \ w^1_{\{234\}}+w^1_{\{23\}}+w^1_{\{24\}}+w^1_{\{2\}}, \end{aligned}$$
$$\displaystyle \begin{aligned} \ w^1_{\{234\}}+w^1_{\{23\}}+w^1_{\{34\}}+w^1_{\{3\}}, \ w^1_{\{234\}}+w^1_{\{24\}}+w^1_{\{34\}}+w^1_{\{4\}} \end{aligned}$$

and

$$\displaystyle \begin{aligned} w^1_{\{234\}}+w^1_{\{23\}}+w^1_{\{24\}}+w^1_{\{34\}}+w^1_{\{2\}}+w^1_{\{3\}}+w^1_{\{4\}}+w^1_{\emptyset}. \end{aligned}$$

The results of these tests are

Based on the results of the tests, we subtract

$$\displaystyle \begin{aligned} (0,0,1,0,0,0,0,0,0,0,0)\mathrm{G} \end{aligned}$$

from v 1 and get

$$\displaystyle \begin{aligned} v_2=v+(0,0,1,0,0,0,1,1,0,1,1)\mathrm{G}=(1,1,1,1,1,1,0,1,1,1,1,1,1,1,1,1). \end{aligned}$$

Summing

$$\displaystyle \begin{aligned} (1,0,0,0,0,0,0,0,0,0,0)\mathrm{G} \end{aligned}$$

to v 2 we have that

$$\displaystyle \begin{aligned} v+(1,0,1,0,0,0,1,1,0,1,1)\mathrm{G}=(0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0). \end{aligned}$$

Therefore, we have determined that the error is in the 7-th bit, that the uncoded string

$$\displaystyle \begin{aligned} u=(1,0,1,0,0,0,1,1,0,1,1) \end{aligned}$$

and that the sent codeword was uG. \(\blacksquare \)

To finish this section, we prove that an error bit affects exactly one of the tests when testing a corrupted codeword of R(r, m). Since we perform 2mr tests, this implies that we can correct up to 2mr−1 − 1 error bits, or in other terms \(\frac {1}{2}d-1\) error bits, since the minimum distance d of R(r, m) is 2mr.

Lemma 9.8

Suppose that e is a vector of \({\mathbb F}_2^{2^m}\) of weight one, whose coordinates are indexed by the vectors of \({\mathbb F}_2^m\). Let J be a subset of {1, …, m}. For all but one of the choices of a, whose coordinates \(a_i \in {\mathbb F}_2\) for i ∈{1, …, m}∖ J,

$$\displaystyle \begin{aligned} \sum_{x \in {\mathbb F}_2^m} e_x \prod_{i \in \{1,\ldots,m\} \setminus J} (x_i+a_i)=0, \end{aligned}$$

where e x is the coordinate of e indexed by x.

Proof

Let y be the vector of \({\mathbb F}_2^m\) indexing the coordinate where the vector e has a 1.

The vector e is the evaluation of

$$\displaystyle \begin{aligned} \prod_{i=1}^m (X_i+y_i+1), \end{aligned}$$

since it is zero unless X i = y i for all i = 1, …, m.

Hence, for all \(x \in {\mathbb F}_2^m\),

$$\displaystyle \begin{aligned} e_x \prod_{i \in \{1,\ldots,m\} \setminus J} (x_i+a_i)= \prod_{i=1}^m (x_i+y_i+1) \prod_{i \in \{1,\ldots,m\} \setminus J} (x_i+a_i), \end{aligned}$$

which will contain a factor \(x_i^2+x_i\) (and is therefore zero) unless a i = y i + 1 for all i ∈{1, …, m}∖ J. □

To decode using this majority-logic decoding algorithm we perform at most 2m tests k times, where k is the dimension of the code. This is less than n 2 tests, where n is the length of the code. Each test involves summing less than n terms, so the decoding algorithm is completed in a number of steps which is polynomial in the length of the code. This should be compared to syndrome decoding from Chapter 4, which involved searching through a look-up table with a number of entries which is exponential in n. For this reason Reed–Muller codes and the majority-logic decoding algorithm are widely implemented. However, they do not give a sequence of asymptotically good codes. Although the relative minimum distance is 2r, which we can bound away from zero by fixing r, the transmission rate of R(r, m) is less than

$$\displaystyle \begin{aligned} \frac{r}{n}{\log n \choose r} \end{aligned}$$

which tends to zero as n tends to infinity.

9.3 Kerdock Codes

A codeword of R(2, m) ∖ R(1, m) is the evaluation of polynomials of the form

$$\displaystyle \begin{aligned} q(X)+\ell(X) \ \ \mathrm{or} \ \ q(X)+\ell(X)+1, \end{aligned}$$

where (X) is a linear form in m variables and

$$\displaystyle \begin{aligned} q(X)=\sum_{1 \leqslant i<j\leqslant m} a_{ij} X_iX_j \end{aligned}$$

is a non-zero quadratic form.

If the quadratic form q(X) has maximum rank, then we will prove that, for all the linear forms (X), these codewords will have large weight. Therefore, if we can find a set of quadratic forms whose differences are quadratic forms of maximum rank, then the distance between any two codewords will be large. In this section we will develop and formalise this idea.

Let A = (a ij) be the symmetric matrix defined by the symmetric bilinear form

$$\displaystyle \begin{aligned} b(X,Y)=q(X+Y)-q(X)-q(Y)=\sum_{1 \leqslant i<j\leqslant m} a_{ij} (X_iY_j+X_jY_i)=X^t \mathrm{A} Y. \end{aligned}$$

The rank of the bilinear form b(X, Y ) is defined to be the rank of A.

Lemma 9.9

Suppose m is even. The evaluation of

$$\displaystyle \begin{aligned} \sum_{i=1}^{m/2} X_{2i-1}X_{2i} \end{aligned}$$

at the vectors of \({\mathbb F}_2^m\) has 2m−1 + 2m∕2−1 zeros.

Proof

There are 2m∕2 zeros of the form (0, x 2, 0, x 4, …, 0, x m).

If x 2i−1 ≠ 0 for some i = 1, …, m∕2, then one of the x 2i is determined by

$$\displaystyle \begin{aligned} \sum_{i=1}^{m/2} x_{2i-1}X_{2i}=0, \end{aligned}$$

which gives 2m∕2−1(2m∕2 − 1) zeros of this form, 2m∕2−1 zeros for each non-zero vector (x 1, x 3, …, x m−1).

Hence, there are precisely 2m−1 + 2m∕2−1 zeros when evaluated at the vectors of \({\mathbb F}_2^m\). □

We are going to construct codes whose codewords are the evaluation of the sum of a quadratic form and a linear form. For this reason, we want to know the weights of the vectors which are the evaluations of these Boolean functions.

Lemma 9.10

Suppose m is even, q(X) is a quadratic form and ℓ(X) is a linear form. If the bilinear form associated to q(X) has rank m, then the evaluation of q(X) + ℓ(X) at the vectors of \({\mathbb F}_2^m\) has either 2m−1 + 2m∕2−1 or 2m−1 − 2m∕2−1 zeros.

Proof

Dickson’s theorem, Exercise 9.2, implies that there is a basis of \({\mathbb F}_2^m\) with respect to which q(x) + (X) is

$$\displaystyle \begin{aligned} \sum_{i=1}^{m/2} (X_{2i-1}X_{2i}+a_{2i-1}X_{2i-1}+a_{2i}X_{2i}). \end{aligned}$$

This is equal to

$$\displaystyle \begin{aligned} \sum_{i=1}^{m/2} (X_{2i-1}+a_{2i})(X_{2i}+a_{2i-1})+b \end{aligned}$$

for some \(b \in {\mathbb F}_2\). By Lemma 9.9, the evaluation of q(X) + (X) has either 2m−1 + 2m∕2−1 zeros or 2m − (2m−1 + 2m∕2−1) zeros, depending on whether b = 0 or 1. □

Let K be a set of symmetric m × m matrices over \({\mathbb F}_2\), which have zeros on the diagonal, and which have the property that the matrix A −A has rank m for all distinct A, A∈ K.

No two matrices in K can have the same first row, since their difference is of rank m. The entries on the diagonal of the matrices in K are zero, so the top-left entry of a matrix in K is zero. Hence, we have that

$$\displaystyle \begin{aligned} |K| \leqslant 2^{m-1}. \end{aligned}$$

For each A = (a ij) ∈ K, let

$$\displaystyle \begin{aligned} q_{\mathrm{A}}(X)=\sum_{1 \leqslant i<j\leqslant m} a_{ij} X_iX_j. \end{aligned}$$

Let C(K) be the code whose codewords are the evaluation at the vectors of \({\mathbb F}_2^m\) of

$$\displaystyle \begin{aligned} q_{\mathrm{A}}(X)+\ell(X) \ \ \mathrm{or} \ \ q_{\mathrm{A}}(X)+\ell(X)+1, \end{aligned}$$

for all A ∈ K and for all linear forms (X).

Theorem 9.11

Suppose that m is even. The code C(K) is a binary block code of length 2m, size |K||R(1, m)| and minimum distance 2m−1 − 2m∕2−1.

Proof

The distance between the evaluation of

$$\displaystyle \begin{aligned} q_{\mathrm{A}}(X)+\ell(X)+b \end{aligned}$$

and

$$\displaystyle \begin{aligned} q_{\mathrm{A}^{\prime}}(X)+\ell^{\prime}(X)+b^{\prime} \end{aligned}$$

is the weight of the evaluation of

$$\displaystyle \begin{aligned} q_{\mathrm{A}-\mathrm{A}^{\prime}}(X)+\ell(X)-\ell^{\prime}(X)+b-b^{\prime}. \end{aligned}$$

Since, A −A has rank m, Lemma 9.10 implies that this distance is at least 2m−1 − 2m∕2−1. □

A Kerdock code is a code C(K) where |K| = 2m−1. Thus, for a Kerdock code, K is of maximum size and the set K is called a Kerdock set. A Kerdock code is a binary block code of length 2m, it has minimum distance 2m−1 − 2m∕2−1 and size 22m, i.e. it is a (2m, 22m, 2m−1 − 2m∕2−1)2 code.

There are many non-equivalent Kerdock codes. Indeed, if m − 1 is not prime, then there are at least \(2^{\sqrt {m}/2}\) inequivalent Kerdock codes of length 2m. However, a sequence of Kerdock codes, whose lengths tend to infinity, is asymptotically bad. Although the relative minimum distance tends to \(\frac {1}{2}\), the transmission rate is 2m∕2m, which tends to zero.

Kerdock codes are of interest because they can be non-linear. The algebraic and geometric nature of their construction allows for non-trivial decoding algorithms to be implemented. The fact that Kerdock codes can be non-linear opens up the possibility of constructing codes with parameter sets for which linear codes do not exist.

Example 9.12

Consider the set of 4 × 4 matrices over \({\mathbb F}_2\)

$$\displaystyle \begin{aligned} \{ \left(\begin{array}{cccc} 0 & 0 & 0 & 0 \\0 & 0 & 0 & 0 \\0 & 0 & 0 & 0 \\0 & 0 & 0 & 0 \\ \end{array}\right), \left(\begin{array}{cccc} 0 & 1 & 0 & 0 \\1 & 0 & 0 & 0 \\0 & 0 & 0 & 1 \\0 & 0 & 1 & 0 \\ \end{array}\right), \left(\begin{array}{cccc} 0 & 0 & 1 & 0 \\0 & 0 & 0 & 1 \\1 & 0 & 0 & 1 \\0 & 1 & 1 & 0 \\ \end{array}\right), \left(\begin{array}{cccc} 0 & 0 & 0 & 1 \\0 & 0 & 1 & 1 \\0 & 1 & 0 & 1 \\1 & 1 & 1 & 0 \\ \end{array}\right) \}. \end{aligned}$$

This set of matrices can be extended to a set K of 8 matrices with the property that the difference of any two matrices has rank 4, see Exercise 9.5. Thus, K is a Kerdock set and, by Theorem 9.11, the binary Kerdock code C(K) is a (16, 256, 6)2 code. We proved in Example 4.20 that there is no binary linear code with these parameters. This code is the Nordstrom–Robinson code. \(\blacksquare \)

9.4 Non-binary Reed–Muller Codes

Until now we have only considered Reed–Muller codes over \({\mathbb F}_2\), but one can naturally generalise the definition of a Reed–Muller code over a general finite field \({\mathbb F}_q\). The codewords of R q(r, m) are the evaluations at the vectors of \({\mathbb F}_q^m\) of polynomials of degree at most r in m variables, where the degree in any particular variable is at most q − 1. The number of vectors in an m-dimensional vector space over \({\mathbb F}_q\) is q m, so the length of the linear code R q(r, m) is q m. Its dimension is more difficult to calculate, see Exercise 9.7. In the following examples, we calculate the dimension for some specific cases and some low weight codewords, which we will then go on and prove are of minimum non-zero weight.

Example 9.13

Suppose \(r \leqslant q-1\). The evaluation of any polynomial in m variables of degree at most r will be a codeword of R q(r, m). The set

$$\displaystyle \begin{aligned} \{X_1^{c_1}\cdots X_m^{c_m} \ | \ c_1+\cdots+c_m \leqslant r \} \end{aligned}$$

is a basis for the space of polynomials in m variables of degree at most r. Hence, the dimension of R q(r, m) is

$$\displaystyle \begin{aligned} {m+r \choose r} \end{aligned}$$

since this is the number of non-negative integer solutions to

$$\displaystyle \begin{aligned} c_1+\cdots+c_m \leqslant r. \end{aligned}$$

Let g(X 1) be a polynomial of degree r with r distinct roots in \({\mathbb F}_q\). The evaluation of g is a codeword with precisely rq m−1 zero coordinates; a zero coordinate being indexed by a vector of \({\mathbb F}_q^m\) whose first coordinate is a root of g. Therefore, R q(r, m) has codewords of weight (q − r)q m−1. \(\blacksquare \)

Example 9.14

The space of polynomials of degree at most 3 in three variables in which no variable has an exponent larger than 2 has a basis

$$\displaystyle \begin{aligned} \{1,X_1,X_2,X_3,X_1^2,X_2^2,X_3^2,X_1X_2,X_1X_3,X_2X_3,\end{aligned}$$
$$\displaystyle \begin{aligned} X_1^2X_2, X_1^2X_3,X_2^2X_1,X_2^2X_3,X_3^2X_1,X_3^2X_2,X_1X_2X_3 \}. \end{aligned}$$

Therefore, the code R 3(3, 3) is a 17-dimensional ternary linear code of length 27. We could also have arrived at this conclusion by considering a monomial basis for all polynomials of degree at most three in three variables and deleting \(X_1^3\), \(X_2^3\) and \(X_3^3\), see Exercise 9.7. \(\blacksquare \)

Suppose that r = a(q − 1) + b, where \(0 \leqslant b \leqslant q-2\). If g(X 1) is a polynomial of degree b with b distinct roots in \({\mathbb F}_q\), then the evaluation of

$$\displaystyle \begin{aligned} g(X_1)(X_2^{q-1}-1) \cdots (X_{a+1}^{q-1}-1), \end{aligned}$$

a polynomial of degree r, is non-zero only when evaluated at

$$\displaystyle \begin{aligned} x=(x_1,0,\ldots,0,x_{a+2},\ldots,x_m) \end{aligned}$$

for some x 1 which is not a root of g. Therefore, R q(r, m) has a codeword of weight

$$\displaystyle \begin{aligned} (q-b)q^{m-a-1}. \end{aligned}$$

We shall prove that this is the minimum weight of a non-zero codeword in the following theorem, the proof of which is an example of a proof using the polynomial method. This type of proof, which one sees often in combinatorics, attempts to obtain bounds from the fact that the number of zeros of a non-zero polynomial is bounded. The application of the method is often something like the following. Given a combinatorial object, a polynomial is constructed in such a way that the properties of the combinatorial object are translated into algebraic properties of the polynomial. Usually we are interested in the zeros of the polynomial, often restricted to subsets of a vector space. Here, the polynomial is directly given as the polynomial whose evaluation is the codeword. By bounding from above the number of zeros of the polynomial, we will bound from below the weight of the codeword.

Theorem 9.15

The minimum distance of R q(r, m) is (q  b)q ma−1, where r = a(q − 1) + b and \(0 \leqslant b \leqslant q-2\).

Proof

By induction on m.

If m = 1, then the codewords are the evaluation of a polynomial of degree \(r \leqslant q-1\) in one variable. The polynomial has at most r zeros, so the codeword has weight at least q − r. Observe that if r = q − 1, then a = 1 and b = 0 and

$$\displaystyle \begin{aligned} (q-b)q^{m-a-1}=q/q=1=q-r. \end{aligned}$$

Suppose that the codeword u ∈ R q(r, m) is the evaluation of the polynomial

$$\displaystyle \begin{aligned} f(X)=f(X_1,\ldots,X_m). \end{aligned}$$

We write f(X) as a polynomial in X m, whose coefficients are polynomials in X 1, …, X m−1. Thus,

$$\displaystyle \begin{aligned} f(X)=\sum_{i=0}^{c} f_i(X_1,\ldots,X_{m-1})X_m^i, \end{aligned}$$

where c is the degree of f(X) in the indeterminate X m. Note that f c(X 1, …, X m−1)≢0 and

$$\displaystyle \begin{aligned} \deg f_c \leqslant \deg f -c \leqslant r-c. \end{aligned}$$

The codeword of R q(r − c, m − 1) which is the evaluation of f c has, by induction, at least

$$\displaystyle \begin{aligned} (q-b^{\prime})q^{m-a^{\prime}-2} \end{aligned}$$

non-zero coordinates, where r − c = a (q − 1) + b and \(0 \leqslant b^{\prime } \leqslant q-2\).

For any (x 1, …, x m−1) such that f c(x 1, …, x m−1) ≠ 0, there are at least q − c elements of \({\mathbb F}_q\) for which f(x 1, …, x m−1, X m) is not zero. Hence, the codeword u has weight at least

$$\displaystyle \begin{aligned} (q-c)(q-b^{\prime})q^{m-a^{\prime}-2}. \end{aligned}$$

It remains to prove that

$$\displaystyle \begin{aligned} (q-c)(q-b^{\prime})q^{m-a^{\prime}-2} \geqslant (q-b)q^{m-a-1}. \end{aligned}$$

The theorem then follows since, by Lemma 4.1, the minimum distance of a linear code is equal to the minimum weight of a non-zero codeword.

If \(a^{\prime } \leqslant a-2\), then this is clear, so we can assume a  = a − 1 or a.

Suppose a  = a − 1 and (q − c)(q − b ) < q − b. This inequality implies b  > b. We have r = a(q − 1) + b and r − c = a (q − 1) + b , so

$$\displaystyle \begin{aligned} c=(a-a^{\prime})(q-1)+b-b^{\prime}=q-1+b-b^{\prime}.\end{aligned}$$

Then (q − c)(q − b ) < (q − b) implies (b − b + 1)(q − b ) < q − b, a contradiction.

Suppose a  = a and (q − c)(q − b ) < q(q − b). We have r = a(q − 1) + b and r − c = a (q − 1) + b , so c = b − b . Then (q − c)(q − b ) < q(q − b) implies (q − b + b )(q − b ) < q(q − b) which implies b < b and c < 0, a contradiction. □

9.5 Comments

Reed–Muller codes were introduced by Reed [59] and Muller [53] in the 1950s.

We have taken an algebraic rather than a geometric approach to the majority-logic decoding algorithm. For a geometric description of the algorithm, see Van Lint [74] or MacWilliams and Sloane [50].

Dickson’s classification of quadratic form over fields of even characteristic is from [22].

If m is odd, then there are examples of sets K for which Exercise 9.6 is a \((\frac {1}{2}(m^2+m)+1-r m)\)-dimensional binary linear code. The 11-dimensional codes (m = 5 and r = 2) are the codes which caused a dispute between Apple and Samsung, referred to in James Davis’ lecture [20]. They can be found in Corollary 17 (m = 5, d = t = 2) on page 455 of MacWilliams and Sloane [50].

Kerdock codes were first considered by Kerdock in [44] in 1972. That there are an exponential number of inequivalent Kerdock codes is proven by Kantor in [41]. Kantor takes a geometric approach to Kerdock codes in the articles [42], a treatment of which can be found in Chapter 12 of Cameron and van Lint’s book [17]. The Nordstrom–Robinson code is from [54]. Kerdock codes have applications to quantum mechanics, see [15] and [18].

The non-binary Reed–Muller codes were defined by various authors. Theorem 9.15 is attributed to Kasami, Lin and Peterson [43] in Bishnoi [11], where the proof given here is adapted from.

9.6 Exercises

9.1

Suppose that we have sent a codeword of the code R(2, 4), the coordinates ordered as in Example 9.1, and have received the vector

$$\displaystyle \begin{aligned} (0,1,1,0,0,1,1,0,0,1,1,0,1,0,1,1). \end{aligned}$$
  1. i.

    Decode the received vector using syndrome decoding.

  2. ii.

    Decode the received vector using majority-logic decoding.

9.2

Suppose that q(X) is a quadratic form of rank m of the type

$$\displaystyle \begin{aligned} q(X)=\sum_{1 \leqslant i<j\leqslant m} q_{ij} X_iX_j. \end{aligned}$$

Prove that there is a basis of \({\mathbb F}_2^m\), with respect to which, q(X) is

$$\displaystyle \begin{aligned} \sum_{i=1}^{m/2} X_{2i-1}X_{2i}. \end{aligned}$$

9.3

  1. i.

    Prove that we can select half the codewords of R(1, m) so that the 2m × 2m matrix H, whose rows are the selected codewords with zeros changed to minus one, has the property that HHt = 2mI, where I is the 2m × 2m identity matrix.

  2. ii.

    Prove that for each vector \(v \in {\mathbb F}_2^{2^m}\) there is a codeword u of R(1, m) such that \(d(u,v) \leqslant 2^{m-1}-2^{m/2-1}\).

9.4

Prove that the Kerdock code C(K) of length 2m is linear if and only if the Kerdock set K is a subspace of the vector space of m × m matrices.

9.5

Complete the set of matrices to a Kerdock set K of eight matrices and prove that C(K) is a non-linear (16, 256, 16) code.

$$\displaystyle \begin{aligned} \{ \left(\begin{array}{cccc} 0 & 0 & 0 & 0 \\0 & 0 & 0 & 0 \\0 & 0 & 0 & 0 \\0 & 0 & 0 & 0 \\ \end{array}\right), \left(\begin{array}{cccc} 0 & 1 & 0 & 0 \\1 & 0 & 0 & 0 \\0 & 0 & 0 & 1 \\0 & 0 & 1 & 0 \\ \end{array}\right), \left(\begin{array}{cccc} 0 & 0 & 1 & 0 \\0 & 0 & 0 & 1 \\1 & 0 & 0 & 1 \\0 & 1 & 1 & 0 \\ \end{array}\right), \left(\begin{array}{cccc} 0 & 0 & 0 & 1 \\0 & 0 & 1 & 1 \\0 & 1 & 0 & 1 \\1 & 1 & 1 & 0 \\ \end{array}\right) \} \end{aligned}$$

9.6

  1. i.

    Prove that the evaluation of

    $$\displaystyle \begin{aligned} q(X)=\sum_{i=1}^r X_{2i-1}X_{2i} \end{aligned}$$

    at the vectors of \({\mathbb F}_2^m\), has 2m−1 + 2mr−1 zeros.

  2. ii.

    Prove that the evaluation of q(X) + ℓ(X), where ℓ(X) is a linear form and q(X) is a quadratic form whose associated bilinear form is of rank 2r, at the vectors of \({\mathbb F}_2^m\), has either 2m−1 + 2mr−1, 2m−1 or 2m−1 − 2mr−1 zeros.

  3. iii.

    Suppose that K is a set of m × m symmetric matrices over \({\mathbb F}_2\) with the property that A + A has rank at least 2r for all A, A K. Construct a (2m, |K|2m+1, 2m−1 − 2mr−1)2 code.

  4. iv.

    Use iii. to construct a [32, 11, 12]2 code.

  5. v.

    Construct a linear code with the same parameters from the code of length 31 constructed in Exercise 5.6 .

9.7

  1. i.

    By finding a monomial basis for the space of polynomials in 3 variables of degree at most 4, in which the degree of each variable is at most 2, calculate the dimension of R 3(4, 3).

  2. ii.

    Prove that if \(r \leqslant q-1\), then the dimension of R q(r, m) is

    $$\displaystyle \begin{aligned} \sum_{i=0}^r {m+i-1 \choose m-1}. \end{aligned}$$
  3. iii.

    Prove that the dimension of R q(r, m) is

    $$\displaystyle \begin{aligned} \sum_{i=0}^r \sum_{j=0}^{m} (-1)^j {m+i-1-jq \choose m-1}{m \choose j}. \end{aligned}$$