Partial Covering Arrays: Algorithms and Asymptotics

Sarkar, Kaushik; Colbourn, Charles J.; de Bonis, Annalisa; Vaccaro, Ugo

doi:10.1007/978-3-319-44543-4_34

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9843)

Included in the following conference series:

International Workshop on Combinatorial Algorithms

Abstract

A covering array $\mathsf {CA}(N;t,k,v)$ is an $N\times k$ array with entries in $\{1, 2, \ldots , v\}$, for which every $N\times t$ subarray contains each t-tuple of $\{1, 2, \ldots , v\}^t$ among its rows. Covering arrays find application in interaction testing, including software and hardware testing, advanced materials development, and biological systems. A central question is to determine or bound $\mathsf {CAN}(t,k,v)$, the minimum number N of rows of a $\mathsf {CA}(N;t,k,v)$. The well known bound $\mathsf {CAN}(t,k,v)=O((t-1)v^t\log k)$ is not too far from being asymptotically optimal. Sensible relaxations of the covering requirement arise when (1) the set $\{1, 2, \ldots , v\}^t$ need only be contained among the rows of at least $(1-\epsilon )\left( {\begin{array}{c}k\\ t\end{array}}\right) $ of the $N\times t$ subarrays and (2) the rows of every $N\times t$ subarray need only contain a (large) subset of $\{1, 2, \ldots , v\}^t$. In this paper, using probabilistic methods, significant improvements on the covering array upper bound are established for both relaxations, and for the conjunction of the two. In each case, a randomized algorithm constructs such arrays in expected polynomial time.

1 Introduction

Let [n] denote the set $\{1,2,\ldots ,n\}$. Let N,t,k, and v be integers such that $k \ge t \ge 2$ and $v \ge 2$. Let A be an $N \times k$ array where each entry is from the set [v]. For $I = \{j_1, \ldots , j_\rho \} \subseteq [k]$ where $j_1<\ldots <j_\rho $, let $A_I$ denote the $N \times \rho $ array in which $A_I(i,\ell ) = A(i,j_\ell )$ for $1 \le i \le N$ and $1 \le \ell \le \rho $; $A_I$ is the projection of A onto the columns in I.

A covering array $\mathsf {CA}(N;t,k,v)$ is an $N \times k$ array A with each entry from [v] so that for each t-set of columns $C \in {[k] \atopwithdelims ()t}$, each t-tuple $x \in [v]^t$ appears as a row in $A_C$. The smallest N for which a $\mathsf {CA}(N;t,k,v)$ exists is denoted by $\mathsf {CAN}(t,k,v)$.
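The definition can be checked mechanically on small arrays. The following is a minimal sketch, not taken from the paper; the function name `is_covering_array` and the list-of-rows representation are assumptions made for illustration.

```python
from itertools import combinations, product

def is_covering_array(A, t, v):
    """Return True if every t-set of columns of A shows all v**t tuples over {1..v}."""
    k = len(A[0])
    all_tuples = set(product(range(1, v + 1), repeat=t))
    for C in combinations(range(k), t):
        # Rows of the projection A_C, collected as a set of t-tuples.
        seen = {tuple(row[j] for j in C) for row in A}
        if not all_tuples <= seen:
            return False
    return True

# A 4 x 3 array over {1, 2}: every pair of columns shows all four pairs.
A = [
    [1, 1, 1],
    [1, 2, 2],
    [2, 1, 2],
    [2, 2, 1],
]
print(is_covering_array(A, t=2, v=2))      # the full array
print(is_covering_array(A[:3], t=2, v=2))  # first three rows only
```

The check enumerates all $\binom{k}{t}$ column sets, so it is only practical for small k and t.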

Covering arrays find important application in software and hardware testing (see [22] and references therein). Applications of covering arrays also arise in experimental testing for advanced materials [4], inference of interactions that regulate gene expression [29], fault-tolerance of parallel architectures [15], synchronization of robot behavior [17], drug screening [30], and learning of boolean functions [11]. Covering arrays have been studied using different nomenclature, as qualitatively independent partitions [13], t-surjective arrays [5], and (k, t)-universal sets [19], among others. Covering arrays are closely related to hash families [10] and orthogonal arrays [8].

2 Background and Motivation

The exact or approximate determination of $\mathsf {CAN}(t,k,v)$ is central in applications of covering arrays, but remains an open problem. For fixed t and v, only when $t=v=2$ is $\mathsf {CAN}(t,k,v)$ known precisely for infinitely many values of k. Kleitman and Spencer [21] and Katona [20] independently proved that the largest k for which a $\mathsf {CA}(N;2,k,2)$ exists satisfies $k=\left( {\begin{array}{c}N-1\\ \lceil N/2\rceil \end{array}}\right) .$ When $t=2$, Gargano, Körner, and Vaccaro [13] establish that

$$\begin{aligned} \mathsf {CAN}(2,k,v) =\frac{v}{2}\log k(1+\text{ o }(1)). \end{aligned}$$

(1)

(We write $\log $ for logarithms base 2, and $\ln $ for natural logarithms.) Several researchers [2, 5, 14, 16] establish a general asymptotic upper bound on $\mathsf {CAN}(t,k,v)$:

$$\begin{aligned} \mathsf {CAN}(t,k,v) \le \frac{t-1}{\log \frac{v^t}{v^t-1}}\log k(1+\text{ o }(1)). \end{aligned}$$

(2)

A slight improvement on (2) has recently been proved [12, 28]. An (essentially) equivalent but more convenient form of (2) is:

$$\begin{aligned} \mathsf {CAN}(t,k,v) \le (t-1)v^t \log k(1+o(1)). \end{aligned}$$

(3)

A lower bound on $\mathsf {CAN}(t,k,v)$ results from the inequality $\mathsf {CAN}(t,k,v) \ge v \cdot \mathsf {CAN}(t-1,k-1,v)$ obtained by derivation, together with (1), to establish that $\mathsf {CAN}(t,k,v) \ge v^{t-2} \cdot \mathsf {CAN}(2,k-t+2,v) = v^{t-2}\cdot \frac{v}{2}\log (k-t+2)(1+\text{ o }(1))$. When $\frac{t}{k} < 1$, we obtain:

$$\begin{aligned} \mathsf {CAN}(t,k,v) = \varOmega (v^{t-1}\log k). \end{aligned}$$

(4)

Because (4) ensures that the number of rows in covering arrays can be considerable, researchers have suggested the need for relaxations in which not all interactions must be covered [7, 18, 23, 24] in order to reduce the number of rows. The practical relevance is that each row corresponds to a test to be performed, adding to the cost of testing.

For example, an array covers a t-set of columns when it covers each of the $v^t$ interactions on this t-set. Hartman and Raskin [18] consider arrays with a fixed number of rows that cover the maximum number of t-sets of columns. A similar question was also considered in [24]. In [23, 24] a more refined measure of the (partial) coverage of an $N\times k$ array A is introduced. For a given $q\in [0,1]$, let $\alpha (A,q)$ be the number of $N\times t$ submatrices of A with the property that at least $qv^t$ elements of $[v]^t$ appear in their set of rows; the (q, t)-completeness of A is $\alpha (A,q)/\left( {\begin{array}{c}k\\ t\end{array}}\right) $. Then for practical purposes one wants “high” (q, t)-completeness with few rows.
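As a small illustration of the (q, t)-completeness measure, the sketch below counts the column t-sets whose projection shows at least $qv^t$ distinct tuples. The function name `completeness` and the example array are assumptions for illustration only.

```python
from itertools import combinations
from math import comb

def completeness(A, t, v, q):
    """(q, t)-completeness: fraction of column t-sets showing >= q * v**t distinct tuples."""
    k = len(A[0])
    threshold = q * v**t
    good = 0
    for C in combinations(range(k), t):
        distinct = {tuple(row[j] for j in C) for row in A}
        if len(distinct) >= threshold:
            good += 1
    return good / comb(k, t)

# Columns (0, 1) show all four pairs; the other two column pairs show three each.
A = [
    [1, 1, 1],
    [1, 2, 2],
    [2, 1, 2],
    [2, 2, 2],
]
print(completeness(A, t=2, v=2, q=1.0))
print(completeness(A, t=2, v=2, q=0.75))
```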

In these works, no theoretical results on partial coverage appear to have been stated; earlier contributions focus on experimental investigations of heuristic construction methods. Our purpose is to initiate a mathematical investigation of arrays offering “partial” coverage. More precisely, we address:

Can one obtain a significant improvement on the upper bound (3) if the set $[v]^t$ is only required to be contained among the rows of at least $(1-\epsilon )\left( {\begin{array}{c}k\\ t\end{array}}\right) $ subarrays of A of dimension $N\times t$?
Can one obtain a significant improvement if, among the rows of every $N\times t$ subarray of A, only a (large) subset of $[v]^t$ is required to be contained?
Can one obtain a significant improvement if the set $[v]^t$ is only required to be contained among the rows of at least $(1-\epsilon )\left( {\begin{array}{c}k\\ t\end{array}}\right) $ subarrays of A of dimension $N\times t$, and among the rows of each of the $\epsilon \left( {\begin{array}{c}k\\ t\end{array}}\right) $ subarrays that remain, a (large) subset of $[v]^t$ is required to be contained?

We answer these questions both theoretically and algorithmically in the following sections.

3 Partial Covering Arrays

When $1 \le m \le v^t$, a partial m-covering array, $\mathsf {PCA}(N;t,k,v,m)$, is an $N \times k$ array A with each entry from [v] so that for each t-set of columns $C \in {[k] \atopwithdelims ()t}$, at least m distinct tuples $x \in [v]^t$ appear as rows in $A_C$. Hence a covering array $\mathsf {CA}(N;t,k,v)$ is precisely a partial $v^t$-covering array $\mathsf {PCA}(N;t,k,v,v^t)$.

Theorem 1

For integers t, k, v, and m where $k \ge t \ge 2$, $v \ge 2$ and $1 \le m \le v^t$ there exists a $\mathsf {PCA}(N;t,k,v,m)$ with

$$\begin{aligned} N \le \frac{\ln \left\{ {k \atopwithdelims ()t}{v^t \atopwithdelims ()m - 1}\right\} }{\ln \left( \frac{v^t}{m-1}\right) } . \end{aligned}$$

(5)

Proof

Let $r = v^t - m + 1$, and A be a random $N \times k$ array where each entry is chosen independently from [v] with uniform probability. For $C \in {[k] \atopwithdelims ()t}$, let $B_C$ denote the event that at least r tuples from $[v]^t$ are missing in $A_C$. The probability that a particular r-set of tuples from $[v]^t$ is missing in $A_C$ is $\left( 1 - \frac{r}{v^t}\right) ^N$. Applying the union bound to all r-sets of tuples from $[v]^t$, we obtain $\Pr [B_C] \le {v^t \atopwithdelims ()r}\left( 1 - \frac{r}{v^t}\right) ^N$. By linearity of expectation, the expected number of t-sets C for which $A_C$ misses at least r tuples from $[v]^t$ is at most ${k \atopwithdelims ()t} {v^t \atopwithdelims ()r}\left( 1 - \frac{r}{v^t}\right) ^N$. When A has at least $\frac{\ln \left\{ {k \atopwithdelims ()t}{v^t \atopwithdelims ()m - 1}\right\} }{\ln \left( \frac{v^t}{m-1}\right) }$ rows this expected number is less than 1. Therefore, an array A exists with the required number of rows such that for all $C \in {[k] \atopwithdelims ()t}$, $A_C$ misses at most $r-1$ tuples from $[v]^t$, i.e. $A_C$ covers at least m tuples from $[v]^t$. $\square $
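To make the counting argument concrete, the following sketch evaluates the union-bound estimate from the proof and the right-hand side of (5) for a few small parameter sets. The function names are hypothetical, and the parameters are chosen only for illustration.

```python
from math import comb, log, ceil

def expected_bad_tsets(N, t, k, v, m):
    """Upper estimate from the proof: C(k,t) * C(v^t, r) * (1 - r/v^t)^N with r = v^t - m + 1."""
    r = v**t - m + 1
    return comb(k, t) * comb(v**t, r) * (1 - r / v**t) ** N

def bound_5(t, k, v, m):
    """Right-hand side of (5); needs m >= 2 so that the denominator is finite."""
    return log(comb(k, t) * comb(v**t, m - 1)) / log(v**t / (m - 1))

t, k, v = 2, 10, 3
for m in (9, 8, 7):
    N = ceil(bound_5(t, k, v, m))
    # With N rows the estimate drops to at most 1.
    print(m, N, expected_bad_tsets(N, t, k, v, m))
```

Lowering m even slightly shrinks the number of rows noticeably, because both the binomial factor and the base of the exponential change.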

Theorem 1 can be improved upon using the Lovász local lemma.

Lemma 1

(Lovász local lemma; symmetric case) (see [1]) Let $A_{1},A_{2},\ldots ,A_{n}$ be events in an arbitrary probability space. Suppose that each event $A_{i}$ is mutually independent of a set of all other events $A_{j}$ except for at most d, and that $\Pr [A_{i}]\le p$ for all $1\le i\le n$. If $ep(d+1)\le 1$, then $\Pr [\cap _{i=1}^{n}\bar{A_{i}}]>0$.

Lemma 1 provides an upper bound on the probability of a “bad” event in terms of the dependence structure among such bad events, so that there is a guaranteed outcome in which all “bad” events are avoided. This lemma is most useful when there is limited dependence among the “bad” events, as in the following:

Theorem 2

For integers t, k, v and m where $v,t \ge 2$, $k \ge 2t$ and $1 \le m \le v^t$ there exists a $\mathsf {PCA}(N;t,k,v,m)$ with

$$\begin{aligned} N \le \frac{1 + \ln \left\{ t{k \atopwithdelims ()t - 1}{v^t \atopwithdelims ()m - 1}\right\} }{\ln \left( \frac{v^t}{m-1}\right) } . \end{aligned}$$

(6)

Proof

When $k \ge 2t$, each event $B_C$ with $C \in {[k] \atopwithdelims ()t}$ (that is, at least $v^t - m + 1$ tuples are missing in $A_C$) is independent of all but at most ${t \atopwithdelims ()1}{k-1 \atopwithdelims ()t-1}<t{k \atopwithdelims ()t-1}$ events in $\{ B_{C'} : C' \in {[k] \atopwithdelims ()t}\setminus \{C\}\}$. Applying Lemma 1, $\Pr [\wedge _{C \in {[k] \atopwithdelims ()t}} \overline{B_C}]>0$ when

$$\begin{aligned} \mathrm {e}{v^t \atopwithdelims ()r}\left( 1 - \frac{r}{v^t}\right) ^N t{k \atopwithdelims ()t-1} \le 1. \end{aligned}$$

(7)

Solve (7) to obtain the required upper bound on N. $\square $
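A numerical comparison of (5) and (6) may help here. The sketch below (function names are hypothetical) evaluates both right-hand sides; comparing the two numerators shows that (6) is the smaller one exactly when $\mathrm{e}\,t\binom{k}{t-1} < \binom{k}{t}$, so the gain appears once k is large relative to t.

```python
from math import comb, log

def bound_5(t, k, v, m):
    """Right-hand side of (5), from the union bound over all t-sets."""
    return log(comb(k, t) * comb(v**t, m - 1)) / log(v**t / (m - 1))

def bound_6(t, k, v, m):
    """Right-hand side of (6), from the local lemma; the theorem assumes k >= 2t."""
    return (1 + log(t * comb(k, t - 1) * comb(v**t, m - 1))) / log(v**t / (m - 1))

t, v = 3, 2
m = v**t  # the covering array case
for k in (6, 10, 20, 50, 100):
    print(k, round(bound_5(t, k, v, m), 1), round(bound_6(t, k, v, m), 1))
```

For small k the constant factor $\mathrm{e}\,t$ makes (6) slightly larger than (5); asymptotically the leading term changes from $t\ln k$ to $(t-1)\ln k$.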

When $m=v^t$, apply the Taylor series expansion to obtain $\ln \left( \frac{v^t}{m-1}\right) \ge \frac{1}{v^t}$, and thereby recover the upper bound (3). Theorem 2 implies:

Corollary 1

Given $q\in [0,1]$ and integers $2 \le t \le k$, $v \ge 2$, there exists an $N\times k$ array on [v] with (q, t)-completeness equal to 1 (i.e., maximal), whose number of rows N satisfies

$$N\le \frac{1 + \ln \left\{ t{k \atopwithdelims ()t - 1}{v^t \atopwithdelims ()qv^t - 1}\right\} }{\ln \left( \frac{v^t}{qv^t-1}\right) }.$$

Rewriting (6), setting $r = v^t - m + 1$, and using the Taylor series expansion of $\ln \left( 1 - \frac{r}{v^t}\right) $, we get

$$\begin{aligned} N \le \frac{1 + \ln \left\{ t{k \atopwithdelims ()t - 1}{v^t \atopwithdelims ()r}\right\} }{\ln \left( \frac{v^t}{v^t - r}\right) } \le \frac{v^t(t-1)\ln k}{r}\left\{ 1 - \frac{\ln r}{\ln k} + o(1)\right\} . \end{aligned}$$

(8)

Hence when $r = v(t-1)$ (or equivalently, $m = v^t - v(t-1) + 1$), there is a partial m-covering array with $\varTheta (v^{t-1} \ln k)$ rows. This matches the lower bound (4) asymptotically for covering arrays by missing, in each t-set of columns, no more than $v(t-1)-1$ of the $v^t$ possible rows.

The dependence of the bound (6) on the number of v-ary t-vectors that must appear in the t-tuples of columns is particularly of interest when test suites are run sequentially until a fault is revealed, as in [3]. Indeed the arguments here may have useful consequences for the rate of fault detection.

Lemma 1 and hence Theorem 2 have proofs that are non-constructive in nature. Nevertheless, Moser and Tardos [26] provide a randomized algorithm with the same guarantee. Patterned on their method, Algorithm 1 constructs a partial m-covering array with exactly the same number of rows as (6) in expected polynomial time. Indeed, for fixed t, the expected number of times the resampling step (line 13) is repeated is linear in k (see [26] for more details).
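The resampling idea can be sketched as follows. This is not a reproduction of Algorithm 1; it is a minimal illustration in the style of Moser and Tardos, assuming that the column t-sets are scanned in lexicographic order and that all entries in the columns of the first violated t-set are redrawn. All function names are hypothetical.

```python
import random
from itertools import combinations
from math import comb, log, ceil

def rows_from_bound_6(t, k, v, m):
    """Number of rows suggested by (6), rounded up; needs m >= 2."""
    return ceil((1 + log(t * comb(k, t - 1) * comb(v**t, m - 1))) / log(v**t / (m - 1)))

def first_violated_tset(A, t, m):
    """Return the first column t-set whose projection shows fewer than m distinct tuples."""
    k = len(A[0])
    for C in combinations(range(k), t):
        if len({tuple(row[j] for j in C) for row in A}) < m:
            return C
    return None

def resample_pca(t, k, v, m, seed=0):
    rng = random.Random(seed)
    N = rows_from_bound_6(t, k, v, m)
    A = [[rng.randint(1, v) for _ in range(k)] for _ in range(N)]
    resamplings = 0
    while (C := first_violated_tset(A, t, m)) is not None:
        # Redraw every entry in the columns of the violated t-set.
        for row in A:
            for j in C:
                row[j] = rng.randint(1, v)
        resamplings += 1
    return A, resamplings

A, resamplings = resample_pca(t=2, k=8, v=3, m=9)
print(len(A), resamplings, first_violated_tset(A, 2, 9))
```

Each pass over the t-sets costs $O\left(N t\binom{k}{t}\right)$ time in this naive form, which is polynomial in k for fixed t.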

4 Almost Partial Covering Arrays

For $0< \epsilon < 1$, an $\epsilon $-almost partial m-covering array, $\mathsf {APCA}(N;t,k,v,m,\epsilon )$, is an $N \times k$ array A with each entry from [v] so that for at least $(1-\epsilon ){k \atopwithdelims ()t}$ column t-sets $C \in {[k] \atopwithdelims ()t}$, $A_C$ covers at least m distinct tuples $x \in [v]^t$. Again, a covering array $\mathsf {CA}(N;t,k,v)$ is precisely an $\mathsf {APCA}(N;t,k,v,v^t, \epsilon )$ when $\epsilon < 1/ \left( {\begin{array}{c}k\\ t\end{array}}\right) $. Our first result on $\epsilon $-almost partial m-covering arrays is the following.

Theorem 3

For integers t, k, v, m and real $\epsilon $ where $k \ge t \ge 2$, $v \ge 2$, $1 \le m \le v^t$ and $0 \le \epsilon \le 1$, there exists an $\mathsf {APCA}(N;t,k,v,m,\epsilon )$ with

$$\begin{aligned} N \le \frac{\ln \left\{ {v^t \atopwithdelims ()m - 1}/\epsilon \right\} }{\ln \left( \frac{v^t}{m-1}\right) }. \end{aligned}$$

(9)

Proof

Parallelling the proof of Theorem 1 we compute an upper bound on the expected number of t-sets $C\in {[k] \atopwithdelims ()t}$ for which $A_C$ misses at least r tuples $x \in [v]^t$. When this expected number is at most $\epsilon {k \atopwithdelims ()t}$, an array A is guaranteed to exist with at least $(1-\epsilon ){k \atopwithdelims ()t}$ t-sets of columns $C \in {[k] \atopwithdelims ()t}$ such that $A_C$ misses at most $r-1$ distinct tuples $x \in [v]^t$. Thus A is an $\mathsf {APCA}(N;t,k,v,m,\epsilon )$. To establish the theorem, solve the following for N:

$$\begin{aligned} {k \atopwithdelims ()t} {v^t \atopwithdelims ()r}\left( 1 - \frac{r}{v^t}\right) ^N \le \epsilon {k \atopwithdelims ()t}. \end{aligned}$$

$\square $
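Two features of (9) are easy to check numerically: it does not involve k at all, and it coincides with (5) when $\epsilon = 1/\binom{k}{t}$. The sketch below uses hypothetical function names and arbitrary small parameters.

```python
from math import comb, log, isclose

def bound_9(t, v, m, eps):
    """Right-hand side of (9); note that k does not appear. Needs m >= 2 and eps > 0."""
    return (log(comb(v**t, m - 1)) - log(eps)) / log(v**t / (m - 1))

def bound_5(t, k, v, m):
    """Right-hand side of (5), for comparison."""
    return log(comb(k, t) * comb(v**t, m - 1)) / log(v**t / (m - 1))

t, k, v = 3, 20, 2
m = v**t
print(bound_9(t, v, m, 0.01))
print(bound_9(t, v, m, 1 / comb(k, t)), bound_5(t, k, v, m))
```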

When $\epsilon < 1 / {k \atopwithdelims ()t}$ we recover the bound from Theorem 1 for partial m-covering arrays. In terms of (q, t)-completeness, Theorem 3 yields the following.

Corollary 2

For $q\in [0,1]$ and integers $2 \le t \le k$, $v \ge 2$, there exists an $N\times k$ array on [v] with (q, t)-completeness equal to $1-\epsilon $, with

$$N \le \frac{\ln \left\{ {v^t \atopwithdelims ()qv^t - 1}/\epsilon \right\} }{\ln \left( \frac{v^t}{qv^t-1}\right) }.$$

When $m = v^t$, an $\epsilon $-almost covering array exists with $N \le v^t \ln \left( \frac{v^t}{\epsilon }\right) $ rows. Improvements result by focussing on covering arrays in which the symbols are acted on by a finite group. In this setting, one chooses orbit representatives of rows that collectively cover orbit representatives of t-way interactions under the group action; see [9], for example. Such group actions have been used in direct and computational methods for covering arrays [6, 25], and in randomized and derandomized methods [9, 27, 28].

We employ the sharply transitive action of the cyclic group of order v, adapting the earlier arguments using methods from [28]:

Theorem 4

For integers t, k, v and real $\epsilon $ where $k \ge t \ge 2$, $v \ge 2$ and $0 \le \epsilon \le 1$ there exists an $\mathsf {APCA}(N;t,k,v,v^t,\epsilon )$ with

$$\begin{aligned} N \le v^t \ln \left( \frac{v^{t-1}}{\epsilon }\right) . \end{aligned}$$

(10)

Proof

The action of the cyclic group of order v partitions $[v]^t$ into $v^{t-1}$ orbits, each of length v. Let $n = \lfloor \frac{N}{v} \rfloor $ and let A be an $n \times k$ random array where each entry is chosen independently from the set [v] with uniform probability. For $C \in {[k] \atopwithdelims ()t}$, $A_C$ covers the orbit X if at least one tuple $x\in X$ is present in $A_C$. The probability that the orbit X is not covered in A is $\left( 1 - \frac{v}{v^t}\right) ^n = \left( 1 - \frac{1}{v^{t-1}}\right) ^n$. Let $D_C$ denote the event that $A_C$ does not cover at least one orbit. Applying the union bound, $\Pr [D_C] \le v^{t-1}\left( 1 - \frac{1}{v^{t-1}}\right) ^n$. By linearity of expectation, the expected number of column t-sets C for which $D_C$ occurs is at most ${k \atopwithdelims ()t}v^{t-1}\left( 1 - \frac{1}{v^{t-1}}\right) ^n$. As earlier, set this expected value to be at most $\epsilon {k \atopwithdelims ()t}$ and solve for n. An array exists that covers all orbits in at least $(1-\epsilon ){k \atopwithdelims ()t}$ column t-sets. Develop this array over the cyclic group to obtain the desired array. $\square $
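The last step of the proof, developing an array over the cyclic group, can be illustrated on a tiny example. The sketch below assumes symbols 1..v with shifts taken modulo v, and represents each orbit by the tuple shifted so that its first entry becomes 0; the function names are hypothetical.

```python
from itertools import combinations

def develop_cyclic(A, v):
    """Stack the v cyclic shifts of A (symbols 1..v, shift taken mod v)."""
    return [[(x - 1 + g) % v + 1 for x in row] for g in range(v) for row in A]

def orbit_reps(A, C, v):
    """Orbit representatives seen in A_C: each tuple shifted so its first entry is 0."""
    return {tuple((row[j] - row[C[0]]) % v for j in C) for row in A}

def covers_all_orbits(A, C, v):
    return len(orbit_reps(A, C, v)) == v ** (len(C) - 1)

def covers_all_tuples(A, C, v):
    return len({tuple(row[j] for j in C) for row in A}) == v ** len(C)

v, t = 3, 2
A = [[1, 1, 1], [1, 2, 3], [1, 3, 2]]  # covers all 3 orbits on every pair of columns
B = develop_cyclic(A, v)               # 9 rows
pairs = list(combinations(range(3), t))
print(all(covers_all_orbits(A, C, v) for C in pairs))
print(len(B), all(covers_all_tuples(B, C, v) for C in pairs))
```

Covering an orbit in $A_C$ needs only one of its v tuples, which is why the random array can have about N/v rows.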

As in [28], further improvements result by considering a group, like the Frobenius group, that acts sharply 2-transitively on [v]. When v is a prime power, the Frobenius group is the group of permutations of $\mathbb {F}_v$ of the form $\{x \mapsto ax+b\,:\,a,b\in \mathbb {F}_v,\,a\ne 0\}$.

Theorem 5

For integers t, k, v and real $\epsilon $ where $k \ge t \ge 2$, $v \ge 2$, v is a prime power and $0 \le \epsilon \le 1$ there exists an $\mathsf {APCA}(N;t,k,v,v^t,\epsilon )$ with

$$\begin{aligned} N \le v^t \ln \left( \frac{2v^{t-2}}{\epsilon }\right) + v. \end{aligned}$$

(11)

Proof

The action of the Frobenius group partitions $[v]^t$ into $\frac{v^{t-1}-1}{v-1}$ orbits of length $v(v-1)$ (full orbits) each and 1 orbit of length v (a short orbit). The short orbit consists of tuples of the form $(x_1,\ldots ,x_t)\in [v]^t$ where $x_1=\ldots =x_t$. Let $n = \lfloor \frac{N-v}{v(v-1)}\rfloor $ and let A be an $n \times k$ random array where each entry is chosen independently from the set [v] with uniform probability. Our strategy is to construct A so that it covers all full orbits for the required number of arrays $\{A_C :C \in {[k] \atopwithdelims ()t}\}$. Develop A over the Frobenius group and add v rows of the form $(x_1, \ldots , x_k)\in [v]^k$ with $x_1= \ldots =x_k$ to obtain an $\mathsf {APCA}(N;t,k,v,v^t,\epsilon )$ with the desired value of N. Following the lines of the proof of Theorem 4, A covers all full orbits in at least $(1-\epsilon ){k \atopwithdelims ()t}$ column t-sets C when

$$ {k \atopwithdelims ()t}\frac{v^{t-1}-1}{v-1}\left( 1 - \frac{v-1}{v^{t-1}}\right) ^n \le \epsilon {k \atopwithdelims ()t}. $$

Because $\frac{v^{t-1}-1}{v-1} \le 2v^{t-2}$ for $v \ge 2$, we obtain the desired bound. $\square $
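The three row counts available for $m = v^t$ can be compared directly: $v^t\ln (v^t/\epsilon )$ from Theorem 3, (10) from the cyclic group, and (11) from the Frobenius group. The sketch below uses hypothetical function names and arbitrary sample parameters.

```python
from math import log

def rows_plain(t, v, eps):
    """v^t ln(v^t / eps), the bound stated after Corollary 2 for m = v^t."""
    return v**t * log(v**t / eps)

def rows_cyclic(t, v, eps):
    """Right-hand side of (10)."""
    return v**t * log(v ** (t - 1) / eps)

def rows_frobenius(t, v, eps):
    """Right-hand side of (11); the theorem requires v to be a prime power."""
    return v**t * log(2 * v ** (t - 2) / eps) + v

for t, v, eps in [(3, 4, 0.01), (4, 3, 0.05)]:
    print(t, v, eps,
          round(rows_plain(t, v, eps), 1),
          round(rows_cyclic(t, v, eps), 1),
          round(rows_frobenius(t, v, eps), 1))
```

For $v = 2$ the right-hand side of (11) exceeds that of (10) by v, since $\ln (2/v) = 0$; the gain from the Frobenius group appears for $v \ge 3$.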

Using group action when $m=v^t$ affords useful improvements. Does this improvement extend to cases when $m < v^t$? Unfortunately, the answer appears to be no. Consider the case for $\mathsf {PCA}(N;t,k,v,m)$ when $m \le v^t$ using the action of the cyclic group of order v on $[v]^t$. Let A be a random $n \times k$ array over [v]. When $v^t-vs+1 \le m \le v^t-v(s-1)$ for $1 \le s \le v^{t-1}$, this implies that for all $C \in \left( {\begin{array}{c}[k]\\ t\end{array}}\right) $, $A_C$ misses at most $s-1$ orbits of $[v]^t$. Then we obtain that $n \le \left( 1+\ln \left( t\left( {\begin{array}{c}k\\ t-1\end{array}}\right) \left( {\begin{array}{c}v^{t-1}\\ s\end{array}}\right) \right) \right) /\ln \left( \frac{v^{t-1}}{v^{t-1}-s}\right) $. Developing A over the cyclic group we obtain a $\mathsf {PCA}(N;t,k,v,m)$ with

$$\begin{aligned} N \le v \frac{1+\ln \left\{ t\left( {\begin{array}{c}k\\ t-1\end{array}}\right) \left( {\begin{array}{c}v^{t-1}\\ s\end{array}}\right) \right\} }{\ln \left( \frac{v^{t-1}}{v^{t-1}-s}\right) } \end{aligned}$$

(12)

Figure 1 compares (12) and (6). In Fig. 1a we plot the size of the partial m-covering array as obtained by (12) and (6) for $v^t-6v+1 \le m \le v^t$ and $t=6,\,k=20,\,v=4$. Except when $m=v^t=4096$, the covering array case, (6) outperforms (12). Similarly, Fig. 1b shows that for $m=v^t-v=4092$, (6) consistently outperforms (12) for all values of k when $t=6,\,v=4$. We observe similar behavior for different values of t and v.
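The comparison can be reproduced numerically at a few points. The sketch below evaluates (6) and the cyclic-group bound for $t=6,\,k=20,\,v=4$; it follows the inline bound on n given in the text (with the factor t inside the logarithm) and takes s to be the smallest integer with $m \le v^t - v(s-1)$. The function names are hypothetical.

```python
from math import comb, log

def bound_6(t, k, v, m):
    """Right-hand side of (6), written in terms of r = v^t - m + 1."""
    r = v**t - m + 1
    return (1 + log(t * comb(k, t - 1) * comb(v**t, r))) / log(v**t / (v**t - r))

def bound_12(t, k, v, m):
    """Cyclic-group bound: v times the bound on n for missing at most s - 1 orbits."""
    r = v**t - m + 1
    s = -(-r // v)  # ceil(r / v)
    orbits = v ** (t - 1)
    n = (1 + log(t * comb(k, t - 1) * comb(orbits, s))) / log(orbits / (orbits - s))
    return v * n

t, k, v = 6, 20, 4
for m in (4096, 4092, 4088, 4080):
    print(m, round(bound_6(t, k, v, m)), round(bound_12(t, k, v, m)))
```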

Next we consider even stricter coverage restrictions, combining Theorems 2 and 4.

Theorem 6

For integers t, k, v, m and real $\epsilon $ where $k \ge t \ge 2$, $v \ge 2$, $0 \le \epsilon \le 1$ and $m \le v^t + 1 - \frac{\ln k}{\ln (v/\epsilon ^{1/(t-1)})}$ there exists an $N\times k$ array A with entries from [v] such that

1. for each $C \in {[k] \atopwithdelims ()t}$, $A_C$ covers at least m tuples $x\in [v]^t$,
2. for at least $(1 - \epsilon ){k \atopwithdelims ()t}$ column t-sets C, $A_C$ covers all tuples $x \in [v]^t$,
3. $N = O(v^t \ln \left( \frac{v^{t-1}}{\epsilon }\right) )$.

Proof

We vertically juxtapose a partial m-covering array and an $\epsilon $-almost $v^t$-covering array. For $r = \frac{\ln k}{\ln (v/\epsilon ^{1/(t-1)})}$ and $m = v^t - r + 1$, (8) guarantees the existence of a partial m-covering array with $v^t \ln \left( \frac{v^{t-1}}{\epsilon }\right) \{1+\text{ o }(1)\}$ rows. Theorem 4 guarantees the existence of an $\epsilon $-almost $v^t$-covering array with at most $v^t \ln \left( \frac{v^{t-1}}{\epsilon }\right) $ rows. $\square $

Corollary 3

There exists an $N \times k$ array A such that:

1. for any t-set of columns $C \in {[k] \atopwithdelims ()t}$, $A_C$ covers at least $m \le v^t + 1 - v(t-1)$ distinct t-tuples $x\in [v]^t$,
2. for at least $\left( 1-\frac{v^{t-1}}{k^{1/v}}\right) {k \atopwithdelims ()t}$ column t-sets C, $A_C$ covers all the distinct t-tuples $x\in [v]^t$,
3. $N = O(v^{t-1}\ln k)$.

Proof

Apply Theorem 6 with $m = v^t + 1 - \frac{\ln k}{\ln (v/\epsilon ^{1/(t-1)})}$. There are at most $\frac{\ln k}{\ln (v/\epsilon ^{1/(t-1)})} -1$ missing t-tuples $x \in [v]^t$ in the $A_C$ for each of the at most $\epsilon {k\atopwithdelims ()t}$ column t-sets C that do not satisfy the second condition of Theorem 6. To bound from above the number of missing tuples to a certain small function f(t) of t, it is sufficient that $\epsilon \le v^{t-1}\left( \frac{1}{k}\right) ^\frac{t-1}{f(t)+1}$. Then the number of missing t-tuples $x \in [v]^t$ in $A_C$ is bounded from above by f(t) whenever $\epsilon $ is not larger than

$$\begin{aligned} v^{t-1}\left( \frac{1}{k}\right) ^\frac{t-1}{f(t)+1} \end{aligned}$$

(13)

On the other hand, in order for the number $N=O\left( v^{t}\ln \left( \frac{v^{t-1}}{\epsilon }\right) \right) $ of rows of A to be asymptotically equal to the lower bound (4), it suffices that $\epsilon $ is not smaller than

$$\begin{aligned} {v^{t-1}\over k^{\frac{1}{v}}}. \end{aligned}$$

(14)

When $f(t)=v(t-1)-1$, (13) and (14) agree asymptotically, completing the proof. $\square $
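As a check of the final step, substituting $f(t)=v(t-1)-1$ into (13), and then the resulting $\epsilon $ into the row count of Theorem 6, gives:

$$\begin{aligned} v^{t-1}\left( \frac{1}{k}\right) ^{\frac{t-1}{f(t)+1}} = v^{t-1}\left( \frac{1}{k}\right) ^{\frac{t-1}{v(t-1)}} = \frac{v^{t-1}}{k^{1/v}}, \qquad v^t \ln \left( \frac{v^{t-1}}{\epsilon }\right) \Big |_{\epsilon = v^{t-1}/k^{1/v}} = v^t \cdot \frac{1}{v}\ln k = v^{t-1}\ln k. \end{aligned}$$

For this choice of f(t) the two thresholds coincide exactly, and the number of rows is $O(v^{t-1}\ln k)$ as claimed.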

Once again we obtain a size that is $O(v^{t-1}\!\log k)$, a goal that has not been reached for covering arrays. This is evidence that even a small relaxation of covering arrays provides arrays of the best sizes one can hope for.

Next we consider the efficient construction of the arrays whose existence is ensured by Theorem 6. Algorithm 2 is a randomized method to construct an $\mathsf {APCA}(N;t,k,v,m,\epsilon )$ of a size N that is very close to the bound of Theorem 3. By Markov’s inequality the condition in line 9 of Algorithm 2 is met with probability at most 1/2. Therefore, the expected number of times the loop in line 2 repeats is at most 2.
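The sample-and-retry idea behind such a method can be sketched as follows. This is not a reproduction of Algorithm 2, whose listing is not included in this text. It assumes that N is computed from (9) with $\epsilon /2$ in place of $\epsilon $, so that the expected number of bad t-sets is at most $\frac{\epsilon }{2}\binom{k}{t}$ and Markov's inequality bounds the failure probability of each attempt by 1/2. All names are hypothetical.

```python
import random
from itertools import combinations
from math import comb, log, ceil

def count_bad_tsets(A, t, m):
    """Number of column t-sets whose projection shows fewer than m distinct tuples."""
    k = len(A[0])
    return sum(
        1
        for C in combinations(range(k), t)
        if len({tuple(row[j] for j in C) for row in A}) < m
    )

def random_apca(t, k, v, m, eps, seed=0):
    """Draw uniform random arrays until at most eps * C(k,t) t-sets are bad (needs m >= 2)."""
    rng = random.Random(seed)
    r = v**t - m + 1
    # Assumption: rows taken from (9) with eps / 2, leaving slack for Markov's inequality.
    N = ceil((log(comb(v**t, r)) - log(eps / 2)) / log(v**t / (v**t - r)))
    budget = eps * comb(k, t)
    attempts = 0
    while True:
        attempts += 1
        A = [[rng.randint(1, v) for _ in range(k)] for _ in range(N)]
        if count_bad_tsets(A, t, m) <= budget:
            return A, attempts

A, attempts = random_apca(t=2, k=12, v=3, m=9, eps=0.1)
print(len(A), attempts, count_bad_tsets(A, 2, 9))
```

Halving $\epsilon $ adds only $\ln 2/\ln \left( \frac{v^t}{m-1}\right) $ rows to (9), which is one way to read "very close to the bound of Theorem 3".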

To prove Theorem 3, t-wise independence among the variables is sufficient. Hence, Algorithm 2 can be derandomized using t-wise independent random variables. We can also derandomize the algorithm using the method of conditional expectation. In this method we construct A by considering the k columns one by one and fixing all N entries of a column. Given a set of already fixed columns, to fix the entries of the next column we consider all possible $v^N$ choices, and choose one that provides the maximum conditional expectation of the number of column t-sets $C \in \left( {\begin{array}{c}[k]\\ t\end{array}}\right) $ such that $A_C$ covers at least m tuples $x\in [v]^t$. Because $v^N=O(\mathsf {poly}(1/\epsilon ))$, this derandomized algorithm constructs the desired array in polynomial time. Similar randomized and derandomized strategies can be applied to construct the array guaranteed by Theorem 4. Together with Algorithm 1 this implies that the array in Theorem 6 is also efficiently constructible.

5 Final Remarks

We have shown that by relaxing the coverage requirement of a covering array somewhat, powerful upper bounds on the sizes of the arrays can be established. Indeed the upper bounds are substantially smaller than the best known bounds for a covering array; they are of the same order as the lower bound for $\mathsf {CAN}(t,k,v)$. As importantly, the techniques not only provide asymptotic bounds but also randomized polynomial time construction algorithms for such arrays.

Our approach seems flexible enough to handle variations of these problems. For instance, some applications require arrays that satisfy, for different subsets of columns, different coverage or separation requirements [8]. In [16] several interesting examples of combinatorial problems are presented that can be unified and expressed in the framework of S-constrained matrices. Given a set of vectors S each of length t, an $N\times k$ matrix M is S-constrained if for every t-set $C\in \left( {\begin{array}{c}[k]\\ t\end{array}}\right) $, $M_C$ contains as a row each of the vectors in S. The parameter to optimize is, as usual, the number of rows of M. One potential direction is to ask for arrays that, in every t-tuple of columns, cover at least m of the vectors in S, or that all vectors in S are covered by all but a small number of t-tuples of columns. Exploiting the structure of the members of S appears to require an extension of the results developed here.

References

1. Alon, N., Spencer, J.H.: The Probabilistic Method. Wiley-Interscience Series in Discrete Mathematics and Optimization, 3rd edn. John Wiley & Sons Inc, Hoboken, NJ (2008)
2. Becker, B., Simon, H.-U.: How robust is the $n$-cube? Inform. and Comput. 77, 162–178 (1988)
3. Bryce, R.C., Chen, Y., Colbourn, C.J.: Biased covering arrays for progressive ranking and composition of web services. Int. J. Simul. Process Model. 3(1/2), 80–87 (2007)
4. Cawse, J.N.: Experimental design for combinatorial and high throughput materials development. GE Global Res. Techn. Report 29, 769–781 (2002)
5. Chandra, A.K., Kou, L.T., Markowsky, G., Zaks, S.: On sets of boolean n-vectors with all k-projections surjective. Acta Inf. 20(1), 103–111 (1983)
6. Chateauneuf, M.A., Colbourn, C.J., Kreher, D.L.: Covering arrays of strength 3. Des. Codes Crypt. 16, 235–242 (1999)
7. Chen, B., Zhang, J.: Tuple density: a new metric for combinatorial test suites. In: Proceedings of the 33rd International Conference on Software Engineering, ICSE, Waikiki, Honolulu, HI, USA, May 21–28, pp. 876–879 (2011)
8. Colbourn, C.J.: Combinatorial aspects of covering arrays. Le Mat. (Catania) 58, 121–167 (2004)
9. Colbourn, C.J.: Conditional expectation algorithms for covering arrays. J. Comb. Math. Comb. Comput. 90, 97–115 (2014)
10. Colbourn, C.J.: Covering arrays and hash families. In: Crnkovič, D., Tonchev, V. (eds.), Information Security, Coding Theory, and Related Combinatorics, NATO Science for Peace and Security Series, pp. 99–135. IOS Press (2011)
11. Damaschke, P.: Adaptive versus nonadaptive attribute-efficient learning. Mach. Learn. 41(2), 197–215 (2000)
12. Francetić, N., Stevens, B.: Asymptotic size of covering arrays: an application of entropy compression. ArXiv e-prints (March 2015)
13. Gargano, L., Körner, J., Vaccaro, U.: Sperner capacities. Graphs Comb. 9, 31–46 (1993)
14. Godbole, A.P., Skipper, D.E., Sunley, R.A.: $t$-covering arrays: upper bounds and Poisson approximations. Comb. Probab. Comput. 5, 105–118 (1996)
15. Graham, N., Harary, F., Livingston, M., Stout, Q.F.: Subcube fault-tolerance in hypercubes. Inf. Comput. 102(2), 280–314 (1993)
16. Gravier, S., Ycart, B.: S-constrained random matrices. In: DMTCS Proceedings (1) (2006)
17. Hartman, A.: Software and hardware testing using combinatorial covering suites. In: Golumbic, M.C., Hartman, I.B.-A. (eds.) Graph Theory, Combinatorics and Algorithms. OR/CSIS, pp. 237–266. Springer, Heidelberg (2005)
18. Hartman, A., Raskin, L.: Problems and algorithms for covering arrays. Discrete Math. 284(1–3), 149–156 (2004)
19. Jukna, S.: Extremal Combinatorics: With Applications in Computer Science, 1st edn. Springer Publishing Company, Incorporated (2010)
20. Katona, G.O.H.: Two applications (for search theory and truth functions) of Sperner type theorems. Periodica Math. 3, 19–26 (1973)
21. Kleitman, D., Spencer, J.: Families of k-independent sets. Discrete Math. 6, 255–262 (1973)
22. Kuhn, D.R., Kacker, R., Lei, Y.: Introduction to Combinatorial Testing. CRC Press, Boca Raton (2013)
23. Kuhn, D.R., Mendoza, I.D., Kacker, R.N., Lei, Y.: Combinatorial coverage measurement concepts and applications. In: 2013 IEEE Sixth International Conference on Software Testing, Verification and Validation Workshops (ICSTW), pp. 352–361, March 2013
24. Maximoff, J.R., Trela, M.D., Kuhn, D.R., Kacker, R.: A method for analyzing system state-space coverage within a $t$-wise testing framework. In: 4th Annual IEEE Systems Conference, pp. 598–603 (2010)
25. Meagher, K., Stevens, B.: Group construction of covering arrays. J. Combin. Des. 13, 70–77 (2005)
26. Moser, R.A., Tardos, G.: A constructive proof of the general Lovász local lemma. J. ACM 57(2), 524–529 (2010). Art. 11, 15
27. Sarkar, K., Colbourn, C.J.: Two-stage algorithms for covering array construction. Submitted for publication
28. Sarkar, K., Colbourn, C.J.: Upper bounds on the size of covering arrays. ArXiv e-prints (March 2016)
29. Shasha, D.E., Kouranov, A.Y., Lejay, L.V., Chou, M.F., Coruzzi, G.M.: Using combinatorial design to study regulation by multiple input signals: A tool for parsimony in the post-genomics era. Plant Physiol. 127, 1590–1594 (2001)
30. Tong, A.J., Wu, Y.G., Li, L.D.: Room-temperature phosphorimetry studies of some addictive drugs following dansyl chloride labelling. Talanta 43(9), 1429–1436 (1996)

Acknowledgements

Research of KS and CJC was supported in part by the National Science Foundation under Grant No. 1421058.

Author information

Authors and Affiliations

CIDSE, Arizona State University, Tempe, USA
Kaushik Sarkar & Charles J. Colbourn
Dipartimento di Informatica, University of Salerno, Fisciano, Italy
Annalisa de Bonis & Ugo Vaccaro

Corresponding author

Correspondence to Kaushik Sarkar.

Editor information

Editors and Affiliations

University of Helsinki, Helsinki, Finland
Veli Mäkinen
University of Helsinki, Helsinki, Finland
Simon J. Puglisi
University of Helsinki, Helsinki, Finland
Leena Salmela

About this paper

Cite this paper

Sarkar, K., Colbourn, C.J., de Bonis, A., Vaccaro, U. (2016). Partial Covering Arrays: Algorithms and Asymptotics. In: Mäkinen, V., Puglisi, S., Salmela, L. (eds) Combinatorial Algorithms. IWOCA 2016. Lecture Notes in Computer Science, vol 9843. Springer, Cham. https://doi.org/10.1007/978-3-319-44543-4_34

DOI: https://doi.org/10.1007/978-3-319-44543-4_34
Published: 09 August 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44542-7
Online ISBN: 978-3-319-44543-4
eBook Packages: Computer Science, Computer Science (R0)
