1 Introduction

A q-ary (n,M,d)-code is a code of length n over an alphabet with q symbols, with M codewords and minimum Hamming distance d. If we increase M while fixing n and d, the code can be used to encode more messages with the same number of symbols transmitted and the same number of correctable errors. For given q, n and d, the maximum possible cardinality of a q-ary (n,M,d)-code is denoted by Aq(n,d), and for binary codes by A(n,d). This is one version of what is called ‘the main problem of coding theory’.

For binary codes of length n up to 15 and any minimum distance d, the exact values of A(n,d) are known. For 6 ≤ n ≤ 28 Brouwer [1] maintains an updated table of lower and upper bounds.

We see in [1] that 256 ≤ A(16,5) ≤ 340. The lower bound is obtained by adding a zero in front of every codeword of the Nordstrom–Robinson code [7]. Some experiments attempting to find large (16,M,5)-codes are discussed by Haas and Houghten [5]. In this note we improve this lower bound to 258 by describing a (16,258,5)-binary code.

In Section 2 we explain the construction and Tabu Search algorithm we use to find the code. In Section 3 we present the code.

2 Code construction and Tabu Search

In [2, p. 173], Dueck and Scheuer construct a binary constant weight code with weight w and length 2w from a binary constant weight code with half the number of codewords. They find the smaller code using their Threshold Accepting method, which is a stochastic local search [6] method.

We use the above construction here, but apply it to unrestricted binary codes. We find the smaller code with Tabu Search [4], which is also a stochastic local search method.

First we give the construction.

Proposition 1 (from [2], applied to general codes)

Let C be a binary code of length n with M codewords. Assume that for every two distinct codewords x, y in C we have d ≤ d(x,y) ≤ n − d. Then the code obtained by adding the complements of the codewords in C to C has 2M codewords and minimum distance at least d.

Proof

We denote the complement of a vector v by \(\bar {\mathbf {v}}\). The code C and the code consisting of the complements of the codewords in C both have only distances at least d between distinct codewords. Now let x be a codeword in C and \(\bar {\mathbf {y}}\) a codeword in the complements code (y ∈ C). Clearly \(d(\mathbf {x}, \mathbf {y}) + d(\mathbf {x}, \bar {\mathbf {y}}) = n\). Therefore, \(d(\mathbf {x}, \bar {\mathbf {y}}) = n - d(\mathbf {x}, \mathbf {y}) \geq d\). □
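The doubling step of Proposition 1 is simple enough to state in code. The following is a minimal sketch (function names are ours; codewords are represented as integers whose low n bits form the vector):

```python
def hamming(x, y):
    """Hamming distance between two equal-length bit vectors given as integers."""
    return bin(x ^ y).count("1")

def double_code(code, n):
    """Proposition 1: given a code whose pairwise distances all lie in
    [d, n - d], adjoin the complement of every codeword.  The result has
    2M codewords and minimum distance at least d."""
    mask = (1 << n) - 1
    return code + [c ^ mask for c in code]

# Toy example with n = 4, d = 2: the single distance in C is 2 = n - d.
C = [0b0000, 0b0011]
D = double_code(C, 4)  # 4 codewords, minimum distance 2
```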

Next we give the description of the Tabu Search algorithm we use to find the smaller code. We do this by specifying the components of a stochastic local search algorithm as suggested by Hoos and Stützle [6, p. 38].

The problem of finding a binary code of length n, with M codewords, and distances at least d and at most n − d is a combinatorial decision problem. A stochastic local search algorithm for a decision problem starts with an initial candidate solution. At each iteration the next candidate solution is chosen according to some probability distribution over the neighborhood of the current candidate solution. When the termination criterion is met the algorithm stops and returns the current candidate solution if it is indeed a solution, or reports failure otherwise [6, p. 41].

We define the search space S of candidate solutions as the set of binary codes of length n with M codewords. A solution is a code in S such that the distance between any two distinct codewords lies in the range [d, n − d]. The algorithm initially chooses a candidate solution from S uniformly at random. It terminates when a solution is found or when we stop it manually, whichever comes first.

The neighborhood of a candidate solution C is the set of codes in S that can be obtained from C by replacing one codeword with a vector that is not already in C. Formally, N(C) = {C′ ∈ S : |C ∖ C′| = 1}.

Next we associate a cost with every two distinct vectors.

Definition 1

The function c is defined for every two vectors x ≠ y by the rule

$$ c(\mathbf{x}, \mathbf{y}) = \left\{\begin{array}{lll} (d - d(\mathbf{x} , \mathbf{y}))^{2} & \text{if } d(\mathbf{x} , \mathbf{y}) < d,\\ (n - d - d(\mathbf{x} , \mathbf{y}))^{2} & \text{if } d(\mathbf{x} , \mathbf{y}) > n - d,\\ 0 & \text{otherwise}. \end{array}\right. $$

For pairs of distinct vectors that are at a distance smaller than d, the smaller the distance the higher the cost. Similarly, for pairs of distinct vectors that are at a distance larger than n − d, the larger the distance the higher the cost.
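The cost of Definition 1 translates directly into code. A sketch (the function name is ours; vectors are integers whose low n bits form the vector):

```python
def cost(x, y, n, d):
    """Pair cost from Definition 1 for two distinct length-n vectors:
    zero when d <= d(x, y) <= n - d, otherwise a squared penalty that
    grows with the size of the violation."""
    dist = bin(x ^ y).count("1")  # Hamming distance
    if dist < d:
        return (d - dist) ** 2
    if dist > n - d:
        return (n - d - dist) ** 2
    return 0

# For n = 16, d = 5: distance 3 costs 4, distance 12 costs 1, distance 8 costs 0.
```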

Using the cost c we define an assisting evaluation function that ranks the candidate solutions and guides the search.

Definition 2 (Evaluation function)

\(g:S \to \mathbb {R}\), where for all C ∈ S

$$ g(C) = \sum\limits_{\underset{\mathbf{x} < \mathbf{y}}{\mathbf{x}, \mathbf{y} \in C}} c(\mathbf{x} , \mathbf{y}) $$

and x < y refers to lexicographical order.

The function g has the property that its global minima correspond to the solutions: g(C) = 0 if and only if C is a solution. Notice that the evaluation function defined here differs from the one used in [2], because the cost we give here to pairs of vectors at a distance larger than n − d differs from the one given in [2].
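A direct, quadratic-time implementation of g can be sketched as follows (names are ours; the pair cost of Definition 1 is restated so the sketch is self-contained):

```python
def pair_cost(x, y, n, d):
    """Cost of a pair of distinct length-n vectors (Definition 1)."""
    dist = bin(x ^ y).count("1")
    if dist < d:
        return (d - dist) ** 2
    if dist > n - d:
        return (n - d - dist) ** 2
    return 0

def evaluate(code, n, d):
    """g(C) from Definition 2: total cost over unordered pairs of
    distinct codewords.  g(C) == 0 exactly when C is a solution."""
    return sum(pair_cost(x, y, n, d)
               for i, x in enumerate(code) for y in code[i + 1:])
```

Iterating over `code[i + 1:]` visits each unordered pair once, matching the condition x < y in the definition.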

A neighbor of a candidate solution meets the tabu criterion if it is obtained by reintroducing a codeword that was removed during the previous tt iterations. The value tt is called the tabu tenure; based on some experiments we chose the tabu tenure value to be 2M. A neighbor meets the aspiration criterion if its value of g is lower than the value of every candidate solution encountered in previous iterations. The set A(C) of admissible neighbors of C consists of those neighbors that either do not meet the tabu criterion or do meet the aspiration criterion.

For every candidate solution C and a neighbor C′ in N(C) the step function gives the probability that C′ would be chosen to be the next candidate solution. Now we can define this function: if C is a candidate solution then the next candidate solution is chosen randomly with equal probability from the ‘best’ admissible neighbors, that is from

$$ \arg\min_{C^{\prime} \in A(C)} g(C^{\prime}). $$
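One iteration of this step function can be sketched as follows. This is a simplified, non-incremental version written for clarity: it re-evaluates g for every neighbor from scratch and enumerates all 2^n vectors, so it is feasible only for small n; the actual implementation uses the gain data structures described later. All names are ours.

```python
import random

def hamming(x, y):
    return bin(x ^ y).count("1")

def cost(x, y, n, d):
    dist = hamming(x, y)
    if dist < d:
        return (d - dist) ** 2
    if dist > n - d:
        return (n - d - dist) ** 2
    return 0

def g(code, n, d):
    return sum(cost(x, y, n, d)
               for i, x in enumerate(code) for y in code[i + 1:])

def tabu_step(code, n, d, tabu_until, best_g, it, rng):
    """One simplified iteration: evaluate every single-codeword swap,
    keep the admissible ones (non-tabu, or meeting the aspiration
    criterion val < best_g), and return a uniformly chosen best one."""
    candidates = []
    in_code = set(code)
    for i, x in enumerate(code):
        for y in range(1 << n):          # feasible only for small n
            if y in in_code:
                continue
            val = g(code[:i] + [y] + code[i + 1:], n, d)
            if tabu_until.get(y, -1) < it or val < best_g:
                candidates.append((val, i, y))
    best = min(v for v, _, _ in candidates)
    val, i, y = rng.choice([c for c in candidates if c[0] == best])
    tabu_until[code[i]] = it + 2 * len(code)   # tabu tenure 2M
    return code[:i] + [y] + code[i + 1:], val
```

For the toy instance n = 4, d = 2, a single step starting from the code {0000, 0001} (which has g = 1) already reaches a solution with g = 0, e.g. by swapping 0001 for a weight-2 vector.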

Finally, we give a few details about some of the data structures we use. They are almost identical to the ones described by Fiduccia and Mattheyses in [3, Fig. 2].

By Definition 2, if C is a candidate solution then the total cost over pairs of distinct codewords in C is g(C). If we remove a codeword v from C, this total cost is reduced by

$$ \sum\limits_{\underset{\mathbf{u} \neq \mathbf{v}}{\mathbf{u} \in C}} c(\mathbf{v} , \mathbf{u}). $$
(1)

Likewise, if we add a vector v ∉ C to C then this total cost is increased by expression (1). Therefore we define the gain of v with respect to a candidate solution C as the reduction in the total cost over pairs of distinct vectors when v is moved into or out of C.

Definition 3

Given a candidate solution C and a vector v

$$ gain_{C}(\mathbf{v}) = \left\{ \begin{array}{ll} {\sum}_{\underset{\mathbf{u} \neq \mathbf{v}}{\mathbf{u} \in C}} c(\mathbf{v} , \mathbf{u}) & \text{if } \mathbf{v} \in C,\\ -{\sum}_{\underset{\mathbf{u} \neq \mathbf{v}}{\mathbf{u} \in C}} c(\mathbf{v} , \mathbf{u}) & \text{if } \mathbf{v} \notin C. \end{array} \right. $$

We maintain an array, GAIN, of the gains of all the vectors with respect to the current candidate solution. Given the evaluation function’s value of the current candidate solution C, the gain of a codeword x ∈ C and the gain of a vector y ∉ C, we can calculate the evaluation function’s value of the neighbor C ∖{x}∪{y} in constant time. Its evaluation value is g(C) − gain_C(x) − gain_C(y) − c(x,y).
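The constant-time update formula can be checked against a brute-force recomputation on a small example. A self-contained sketch (names ours; the identity holds because adding y ∉ C to C ∖ {x} increases the cost by −gain_C(y) − c(x,y)):

```python
def cost(x, y, n, d):
    """Pair cost of Definition 1."""
    dist = bin(x ^ y).count("1")
    if dist < d:
        return (d - dist) ** 2
    if dist > n - d:
        return (n - d - dist) ** 2
    return 0

def g(code, n, d):
    """Evaluation function of Definition 2."""
    return sum(cost(x, y, n, d)
               for i, x in enumerate(code) for y in code[i + 1:])

def gain(v, code, n, d):
    """Definition 3: positive for v in C, negative for v outside C."""
    s = sum(cost(v, u, n, d) for u in code if u != v)
    return s if v in code else -s

# Check the constant-time update formula on an arbitrary small example.
n, d = 6, 2
C = [0b000000, 0b000111, 0b111000]
x, y = 0b000111, 0b101010
incremental = g(C, n, d) - gain(x, C, n, d) - gain(y, C, n, d) - cost(x, y, n, d)
direct = g([c for c in C if c != x] + [y], n, d)
assert incremental == direct
```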

We maintain an array, CODEWORDS, indexed by gain values. The element CODEWORDS[gain] is a container holding all the codewords of the current candidate solution that have the gain value gain. This way we can iterate efficiently through all the codewords with a fixed gain. A second array of containers, OUTSIDE_VECTORS, holds all the vectors that do not belong to the current candidate solution. Finally, for every vector we keep a pointer to its location in its respective container (in CODEWORDS or in OUTSIDE_VECTORS). This way we can move a vector between containers in constant time when its gain value changes or when it moves into or out of the code.
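A minimal sketch of one such container structure, in the spirit of the Fiduccia–Mattheyses bucket lists (class and method names are ours; the paper's version keeps two of these, one for codewords and one for outside vectors):

```python
class Buckets:
    """Vectors grouped by gain value, with a stored position per vector
    so that moves between buckets take constant time."""

    def __init__(self):
        self.by_gain = {}   # gain value -> list of vectors with that gain
        self.pos = {}       # vector -> (gain, index within its bucket)

    def add(self, v, gain):
        bucket = self.by_gain.setdefault(gain, [])
        self.pos[v] = (gain, len(bucket))
        bucket.append(v)

    def remove(self, v):
        gain, i = self.pos.pop(v)
        bucket = self.by_gain[gain]
        last = bucket.pop()
        if last != v:                 # swap-with-last keeps removal O(1)
            bucket[i] = last
            self.pos[last] = (gain, i)

    def update(self, v, new_gain):
        """Move v to the bucket for its new gain value in O(1)."""
        self.remove(v)
        self.add(v, new_gain)
```

Removal swaps the removed vector with the last element of its bucket, so no shifting is needed; this is what the per-vector pointer makes possible.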

3 Search result

We wrote a C++ program and ran it on a PC with a 3.4 GHz processor and 40 GB of RAM. We report here the results for the problem instance n = 16, d = 5, M = 258/2 = 129. We ran the program 20 times. The mean time for finding a solution (not including some lookup table precalculations) was 9.9 seconds, and the mean number of iterations executed was about 40731. In the worst case it took 32.2 seconds and 170973 iterations to find a solution.

The first solution that was found is shown in hexadecimal representation in Table 1. Adding the complements of the codewords of this code we get a (16,258,5)-code.

Table 1 A code of length 16, with 129 codewords, and distances in the range [5,11]