1 Introduction

Numerous recent publications have unified the core of various decoding algorithms for Reed–Solomon (RS) and Hermitian codes using row reduction of certain \(\mathbb {F}[x]\)-module bases: first for the Guruswami–Sudan list decoder [2, 9, 21], then for Power decoding [32, 33], and subsequently for both types of decoders for Hermitian codes [34]. By factoring out the coding theory from the core problem, we enable the immediate use of sophisticated algorithms developed by the computer algebra community such as [16, 50].

The goal of this paper is to explore the row reduction description over skew polynomial rings, with main applications to decoding rank-metric and subspace codes. Concretely, we prove that Interleaved Gabidulin and Mahdavifar–Vardy (MV) codes can be decoded by transforming a module basis into weak Popov form, which can be obtained by a skew-polynomial analogue of the elegantly simple Mulders–Storjohann algorithm [31]. By exploiting the structure of the module bases arising from the decoding problems, we refine the algorithm to obtain improved complexities. These match the best known algorithms for these applications while solving more general problems, demonstrating that the row reduction methodology is both flexible and fast for skew polynomial rings. Building on this paper, Ref. [40] proposes an algorithm which improves upon the best known complexity for decoding Interleaved Gabidulin codes.

Section 1.1 summarizes related work. We set basic notation in Sect. 2. Section 3 shows how to solve the mentioned decoding problems using row reduction and states the final complexity results which are proven in the subsequent sections. We describe row reduction of skew polynomial matrices in Sect. 4. Section 5 presents faster row reduction algorithms for certain input matrices, with applications to the decoding problems.

This work was partly presented at the International Workshop on Coding and Cryptography 2015 [24]. Compared to that previous work, we have added the decoding of MV codes using the row reduction approach.Footnote 1 This spurred a new refinement of the Mulders–Storjohann algorithm, given in Sect. 5.2, which could be of wider interest.

1.1 Related work

In this paper we consider skew polynomial rings over finite fields without derivations [38] (see Sect. 2.1 for this restricted definition of skew polynomials). This is the most relevant case for coding theory, partly because such rings are easier to compute with, though non-zero derivations have been used in some constructions [8]. All of the row reduction algorithms in this paper work for skew polynomial rings with non-zero derivation, but the complexity estimates would be worse. The algorithms also apply to skew polynomial rings over any base ring, e.g. \(\mathbb {F}(z)\) or a number field, but due to coefficient growth in such settings, their bit-complexity would have to be analysed.

A skew polynomial ring over a finite field without derivation is isomorphic to a ring of linearised polynomials under a trivial isomorphism, and the rings’ evaluation maps agree. Our algorithms could be phrased straightforwardly to work on modules over linearised polynomials. Much literature on Gabidulin codes uses the language of linearised polynomials.

Skew polynomial rings are instances of Ore rings, and some previous work on computing matrix normal forms over Ore rings can be found in [1, 6]. The focus there is when the base ring is \(\mathbb {Q}\) or \(\mathbb {F}(z)\), where coefficient growth is a main concern, e.g. for modelling systems of partial differential equations. These algorithms are slower than ours when the base ring is a finite field. Reference [29] considers a setting more similar to ours, but obtains a different algorithm and slightly worse complexity.

Gabidulin codes [11, 15, 41] are maximum rank distance codes over finite fields; they are the rank-metric analogue of RS codes. An Interleaved Gabidulin code is a direct sum of several Gabidulin codes, similar to Interleaved RS codes. In a synchronised error model, they allow an average-case error-correction capability far beyond half the minimum rank distance [25]. Decoding of Interleaved Gabidulin codes is often formulated as solving a simultaneous “Key Equation” [43]. Over \(\mathbb {F}[x]\) this computational problem is also known as multi-sequence shift-register synthesis [14, 45], simultaneous Padé approximation [4], or vector rational function reconstruction [36]. This problem has further generalisations, some of which have found applications in decoding of algebraic codes, e.g. [34, 42, 49]. In the computer algebra community, Padé approximations have been studied in wide generality, e.g. [5]; to the best of our knowledge, analogous generalisations over skew polynomial rings have yet to see any applications.

Lately, there has been an interest in Gabidulin codes over number fields, with applications to space-time codes and low-rank matrix recovery [3]. Their decoding can also be reduced to a shift-register type problem [30], which could be solved using the algorithms in this paper (though again, one should analyse the bit-complexity).

MV codes [26, 28] are subspace codes whose main interest lies in their property of being list-decodable beyond half the minimum distance. Their rate unfortunately tends to zero with increasing code length. In [27], Mahdavifar and Vardy presented a refined construction which can be decoded “with multiplicities”, allowing a better decoding radius and rate; it is future work to adapt our algorithm to this case. The decoding of MV codes is heavily inspired by the Guruswami–Sudan algorithm for RS codes [17], and our row reduction approach in Sect. 3.2 is similarly inspired by fast module-based algorithms for realising the Guruswami–Sudan algorithm [7, 21].

Another family of rank-metric codes which can be decoded beyond half the minimum distance are Guruswami–Xing codes [19]. These can be seen simply as heavily punctured Gabidulin codes, and their decoding as a virtual interleaving of multiple Gabidulin codes. This leads to a decoder based on solving a simultaneous shift-register equation, where our algorithms apply. Guruswami–Wang codes [18] are Guruswami–Xing codes with a restricted message set, so the same decoding algorithm applies.

Over \(\mathbb {F}[x]\), row reduction, and the related concept of order bases, have been widely studied and sophisticated algorithms have emerged, e.g. [2, 16, 50]. As a follow-up to this work, a skew-analogue of the algorithm in [2] was proposed in [40].

2 Preliminaries

2.1 Skew polynomials

Let \(\mathbb {F}\) be a finite field and \(\theta \) an automorphism of \(\mathbb {F}\). Denote by \(\mathcal{{R}}=\mathbb {F}[x;\theta ]\) the non-commutative ring of skew polynomials over \(\mathbb {F}\) (with zero derivation): elements of \(\mathcal{{R}}\) are of the form \(\sum _i a_i x^i\) with \(a_i \in \mathbb {F}\), addition is as usual, while multiplication is defined by \(xa = \theta (a) x\) for all \(a \in \mathbb {F}\). When we say “polynomial”, we will mean an element of \(\mathcal{{R}}\). The definition of the degree of a polynomial is the same as for ordinary polynomials. See [38] for more details.

The evaluation map of \(a \in \mathcal{{R}}\) is given as:

$$\begin{aligned} a(\cdot ) := ev _a(\cdot ) \,:\, \mathbb {F}&\rightarrow \mathbb {F}\\ \alpha &\mapsto {\textstyle \sum \nolimits _{i}} a_i \theta ^i(\alpha ). \end{aligned}$$

This is a group homomorphism on \((\mathbb {F},+)\), and it is a linear map over the fixed field of \(\theta \). Furthermore, for two \(a,b \in \mathcal{{R}}\) we have \( ev _{ab} = ev _a \circ ev _b\). This is sometimes known as operator evaluation, e.g. [8].

If \(\mathbb {F}_{q}\) is the field fixed by \(\theta \) for some prime power q, then \(\mathbb {F}= \mathbb {F}_{q^s}, s \in \mathbb {Z}_{> 0}\), and \(\theta (a) = a^{q^i}\) for some \(0 \le i < s\), i.e. a power of the Frobenius automorphism of \(\mathbb {F}_{q^s}/\;\mathbb {F}_q\).

Definition 1

For \(a,b,c \in \mathcal{{R}}\), we write \(a \equiv b \mod c\) (right modulo operation) if there exists \(d \in \mathcal{{R}}\) such that \(a = b + d c\).

In complexity estimates we count the total number of the following operations: \(+, -, \cdot , \slash \) and \(\theta ^i\) for any \(i \in \mathbb {Z}_{> 0}\). For computing \(\theta ^i\) the assumption is that the Frobenius automorphism can be applied efficiently in \(\mathbb {F}_{q^s}\); this is reasonable since we can represent \(\mathbb {F}_{q^s}\)-elements using a normal basis over \(\mathbb {F}_q\) (cf. [47, Sect. 2.1.2]): in this case, \(a^q\) for \(a \in \mathbb {F}_{q^s}\) is simply the cyclic shift of a represented as an \(\mathbb {F}_q\)-vector over the normal basis.
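To make these definitions concrete, the following minimal Python sketch (our own illustration; all names are ours, not from the paper) implements \(\mathcal{{R}}= \mathbb {F}_{16}[x;\theta ]\) with \(\theta \) the Frobenius \(a \mapsto a^2\), so that the fixed field is \(\mathbb {F}_2\); a skew polynomial is represented as a list of coefficients with index equal to degree. The final loop checks the composition rule \( ev _{ab} = ev _a \circ ev _b\) stated above.

```python
# A minimal sketch (not the paper's implementation): F = GF(2^4) as 4-bit
# ints with modulus x^4 + x + 1 (0b10011); theta = Frobenius a -> a^2, whose
# fixed field is GF(2). Addition in F is XOR. A skew polynomial in
# R = F[x; theta] is a list of coefficients, index = degree.
S, MOD = 4, 0b10011

def gf_mul(a, b):                      # multiplication in GF(2^4)
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a >> S:
            a ^= MOD
        b >>= 1
    return r

def gf_inv(a):                         # a^(2^S - 2) = a^(-1), for a != 0
    r = 1
    for _ in range(2**S - 2):
        r = gf_mul(r, a)
    return r

def theta_pow(a, i):                   # theta^i(a) = a^(2^i); note theta^S = id
    for _ in range(i % S):
        a = gf_mul(a, a)
    return a

def skew_mul(f, g):                    # product in R: f_i x^i * g_j x^j = f_i theta^i(g_j) x^(i+j)
    res = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            res[i + j] ^= gf_mul(fi, theta_pow(gj, i))
    return res

def skew_eval(f, alpha):               # operator evaluation: sum_i f_i theta^i(alpha)
    acc = 0
    for i, fi in enumerate(f):
        acc ^= gf_mul(fi, theta_pow(alpha, i))
    return acc

a, b = [3, 0, 1], [5, 7]               # a = x^2 + 3, b = 7x + 5
for alpha in range(16):                # check ev_{ab} = ev_a o ev_b
    assert skew_eval(skew_mul(a, b), alpha) == skew_eval(a, skew_eval(b, alpha))
```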

2.2 Skew polynomial matrices

Free modules and matrices over \(\mathcal{{R}}\) behave quite similarly to the \(\mathbb {F}[x]\) case, keeping non-commutativity in mind:

  • Any left sub-module \(\mathcal{{V}}\) of \(\mathcal{{R}}^m\) is free and admits a basis of at most m elements. Any two bases of \(\mathcal{{V}}\) have the same number of elements.

  • The rank of a matrix M over \(\mathcal{{R}}\) is defined as the number of elements in any basis of the left \(\mathcal{{R}}\)-row space of M. The rows of two such matrices \(M, M' \in \mathcal{{R}}^{n \times m}\) generate the same left module if and only if there exists a \(U \in \mathrm {GL}_n(\mathcal{{R}})\) such that \(M = UM'\), where \(\mathrm {GL}_n(\mathcal{{R}})\) denotes the set of invertible \(n \times n\) matrices over \(\mathcal{{R}}\).

These properties follow principally from \(\mathcal{{R}}\) being an Ore ring and therefore left Euclidean, hence left Principal Ideal Domain (PID), hence left Noetherian.Footnote 2 Moreover, \(\mathcal{{R}}\) has a unique left skew fieldFootnote 3 of fractions \(\mathcal {Q}\) from which it inherits its linear algebra properties. See e.g. [10, 13] for more details. In this paper we exclusively use the left module structure of \(\mathcal{{R}}\), and we will often omit the “left” denotation.

We introduce the following notation for vectors and matrices over \(\mathcal{{R}}\): matrices are denoted by capital letters (e.g. V). The ith row of V is denoted by \(\varvec{v}_i\), the jth element of a vector \(\varvec{v}\) is \(v_j\), and \(v_{i,j}\) is the (i, j)th entry of a matrix V. Indices start at 0.

  • The degree of a vector \(\varvec{v}\) is \(\deg \varvec{v}:= \max _{i} \{ \deg v_i \}\) (and \(\deg \varvec{0} = -\infty \)) and the degree of a matrix V is \(\deg V := \sum _{i} \deg \varvec{v}_i \).

  • The max-degree of V is \({{\mathrm{maxdeg}}}V := \max _{i} \{ \deg \varvec{v}_i \} = \max _{i,j} \{ \deg v_{i,j} \}\).

  • The leading position of a non-zero vector \(\varvec{v}\) is \(\mathrm {LP}_{}(\varvec{v}) := \max \{i \, : \, \deg v_i = \deg \varvec{v}\}\), i.e. the rightmost position having maximal degree in the vector. Furthermore, we define the leading term \(\mathrm {LT}(\varvec{v}) := v_{\mathrm {LP}_{}(\varvec{v})}\) and \(\mathrm {LC}(\varvec{v})\) is the leading coefficient of \(\mathrm {LT}(\varvec{v})\).
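As a small illustration (ours, continuing the coefficient-list representation of the sketch in Sect. 2.1), the operators above can be realised as follows; we encode \(\deg \varvec{0}\) as \(-1\) rather than \(-\infty \):

```python
# Sketch of the operators above; a vector over R is a list of coefficient
# lists (index = degree), and deg 0 is encoded as -1 instead of -infinity.
def pdeg(f):                           # degree of a skew polynomial
    return max((i for i, c in enumerate(f) if c), default=-1)

def vdeg(v):                           # deg v = max_i deg v_i
    return max(pdeg(f) for f in v)

def lead_pos(v):                       # LP(v): rightmost position of maximal degree
    d = vdeg(v)
    return max(i for i, f in enumerate(v) if pdeg(f) == d)

def lead_term(v):                      # LT(v) = v_{LP(v)}; its top coefficient is LC(v)
    return v[lead_pos(v)]
```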

2.3 The weak Popov form

Definition 2

A matrix V over \(\mathcal{{R}}\) is in weak Popov form if the leading positions of all its non-zero rows are different.

The following lemma states that the rows of a matrix in weak Popov form are minimal in a certain sense. Its proof is exactly the same as for \(\mathbb {F}[x]\)-modules and is therefore omitted, see e.g. [32].

Lemma 1

Let V be a matrix in weak Popov form, and let \(\mathcal{{V}}\) be the \(\mathcal{{R}}\)-module generated by its rows. Then the non-zero rows of V are a basis of \(\mathcal{{V}}\) and every \(\varvec{u}\in \mathcal{{V}}\) satisfies \(\deg \varvec{u}\ge \deg \varvec{v}\), where \(\varvec{v}\) is the row of V with \(\mathrm {LP}_{}(\varvec{v})=\mathrm {LP}_{}(\varvec{u})\).

We will need to “shift” the relative importance of some columns compared to others. Given a “shift vector” \(\varvec{w}= (w_0,\dots ,w_\ell ) \in \mathbb {Z}_{\ge 0}^{\ell +1}\), define the mapping

$$\begin{aligned} \varPhi _{\varvec{w}} : \mathcal{{R}}^{\ell +1} \rightarrow \mathcal{{R}}^{\ell +1}, \; \varvec{u}= (u_0,\dots ,u_\ell ) \mapsto \Big (u_0 x^{w_0}, \dots , u_\ell x^{w_\ell }\Big ). \end{aligned}$$

It is easy to compute the inverse of \(\varPhi _{\varvec{w}}\) for any vector in \(\varPhi _{\varvec{w}}(\mathcal{{R}}^{\ell +1})\). Note that since the monomials \(x^{w_i}\) are multiplied from the right, applying \(\varPhi _{\varvec{w}}\) will only shift the entry polynomials, and not modify the coefficients. We can extend \(\varPhi _{\varvec{w}}\) to \(\mathcal{{R}}\)-matrices by applying it row-wise.
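In the coefficient-list representation this remark becomes very concrete: \(\varPhi _{\varvec{w}}\) is a pure index shift. A minimal sketch (ours):

```python
def phi_w(v, w):                       # multiply entry i by x^{w_i} from the right:
    return [[0] * wi + list(f) for f, wi in zip(v, w)]   # coefficients just shift up

def phi_w_inv(u, w):                   # inverse, defined on the image of phi_w
    return [f[wi:] for f, wi in zip(u, w)]
```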

Definition 3

For any \(\varvec{w}= (w_0,\dots ,w_\ell ) \in \mathbb {Z}_{\ge 0}^{\ell +1}\), a matrix \(V \in \mathcal{{R}}^{\cdot \times (\ell +1)}\) is in \(\varvec{w}\)-shifted weak Popov form if \(\varPhi _{\varvec{w}}(V)\) is in weak Popov form.

Given some matrix V over \(\mathcal{{R}}\), “transforming V into (\(\varvec{w}\)-shifted) weak Popov form” means to find some W generating the same row space as V and such that W is in (\(\varvec{w}\)-shifted) weak Popov form. We will see in Sect. 4.1 that such W always exist.

Throughout this paper, by “row reduced” we mean “in weak Popov form”.Footnote 4 Similarly, “row reduction” means “transforming into weak Popov form”.

3 Decoding problems in rank-metric and subspace codes

3.1 Interleaved Gabidulin codes: multi-sequence shift registers

It is classical to decode errors in a Gabidulin code by solving a syndrome-based “Key Equation”: that is, a shift-register synthesis problem over \(\mathcal{{R}}\), see e.g. [15]. An Interleaved Gabidulin code is a direct sum of several Gabidulin codes [25], and error-decoding can be formulated as a shift-register synthesis of several sequences simultaneously. A slightly more general notion of shift-register synthesis allows formulating the decoder using the “Gao Key Equation” [47]. Another generalisation accommodates error-and-erasure decoding of some Gabidulin resp. Interleaved Gabidulin codes [23, 47].

All these approaches are instances of the following “Multi-Sequence generalised Linear Skew-Feedback Shift Register” (MgLSSR) synthesis problem:

Problem 1

(MgLSSR) Given skew polynomials \(s_i, g_i \in \mathcal{{R}}\) for \(i=1,\ldots ,\ell \) and non-negative integers \(\gamma _0,\gamma _1,\dots ,\gamma _\ell \in \mathbb {Z}_{\ge 0}\), find skew polynomials \(\lambda , \omega _1,\dots ,\omega _\ell \in \mathcal{{R}}\), with \(\lambda \) of minimal degree, such that the following holds for all \(i=1,\ldots ,\ell \):

$$\begin{aligned} \lambda s_i&\equiv \omega _i \mod g_i \\ \deg \lambda + \gamma _0&> \deg \omega _i + \gamma _i \end{aligned}$$

We show how to solve this problem by row reduction of a particular module basis. The approach is analogous to how the \(\mathbb {F}[x]\)-version of the problem is handled by Rosenkilde in [32], with only a few technical differences due to the non-commutativity of \(\mathcal{{R}}\).

In the sequel we consider a particular instance of Problem 1, so \(\mathcal{{R}}\), \(\ell \in \mathbb {Z}_{> 0}\), and \(s_i, g_i \in \mathcal{{R}}\), \(\gamma _i \in \mathbb {Z}_{\ge 0}\) for \(i=1,\ldots ,\ell \) are arbitrary but fixed. We assume \(\deg s_i \le \deg g_i\) for all i since replacing \(s_i\) by \(s_i \bmod g_i\) yields the same solutions to Problem 1.

Denote by \(\mathcal{{M}}\) the set of all vectors \(\varvec{v}\in \mathcal{{R}}^{\ell +1}\) satisfying the congruence relation, i.e.,

$$\begin{aligned} \mathcal{{M}}:= \big \{(\lambda ,\omega _1,\dots ,\omega _\ell )\in \mathcal{{R}}^{\ell +1} \; \big | \; \lambda s_i \equiv \omega _i \mod g_i \; \forall i=1,\dots ,\ell \big \}. \end{aligned}$$
(1)

Lemma 2

Consider an instance of Problem 1 and \(\mathcal{{M}}\) as in (1). \(\mathcal{{M}}\) with component-wise addition and left multiplication by elements of \(\mathcal{{R}}\) forms a free left module over \(\mathcal{{R}}\). The rows of M form a basis of \(\mathcal{{M}}\), where

$$\begin{aligned} M=\left( \begin{array}{ccccc} 1 &{} s_1 &{} s_2 &{} \dots &{} s_\ell \\ 0 &{} g_1 &{} 0 &{} \dots &{} 0 \\ 0 &{} 0 &{} g_2 &{} \dots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \ddots &{}\vdots \\ 0 &{} 0 &{} 0 &{} \dots &{} g_\ell \\ \end{array}\right) \ . \end{aligned}$$
(2)

Proof

Since \(\mathcal{{M}}\subseteq \mathcal{{R}}^{\ell +1}\), the first half of the statement follows easily since \(\mathcal{{M}}\) is clearly closed under addition and under left multiplication by elements of \(\mathcal{{R}}\). The rows of M form a basis of \(\mathcal{{M}}\) by arguments analogous to the \(\mathbb {F}[x]\) case, cf. [32, Lemma 1]. \(\square \)

Lemma 2 gives a simple description of all solutions of the congruence requirement of Problem 1 in the form of the row space of an explicit matrix M. The following theorem implies that computing the weak Popov form of M is enough to solve Problem 1. The proof is similar to the \(\mathbb {F}[x]\)-case but since there is no convenient reference for it, we give it here for the \(\mathcal{{R}}\) case. The entire strategy is formalised in Algorithm 1.

Theorem 1

Consider an instance of Problem 1, and \(\mathcal{{M}}\) as in (1). Let \(\varvec{w}= (\gamma _0,\dots ,\gamma _\ell ) \in \mathbb {Z}_{\ge 0}^{\ell +1}\). If V is a basis of \(\mathcal{{M}}\) in \(\varvec{w}\)-shifted weak Popov form, the row \(\varvec{v}\) of V with \(\mathrm {LP}_{}(\varPhi _{\varvec{w}}(\varvec{v}))=0\) is a solution to Problem 1.

Proof

By Lemma 2 the row \(\varvec{v}\) satisfies the congruence requirement of Problem 1. For the degree restriction of Problem 1, note that any \(\varvec{u}\in \mathcal{{M}}\) satisfies this restriction if and only if \(\mathrm {LP}_{}(\varPhi _{\varvec{w}}(\varvec{u})) = 0\), since \(\deg u_i + \gamma _i = \deg (\varPhi _{\varvec{w}}(\varvec{u})_i)\). Furthermore, if this is the case, then \(\deg (\varPhi _{\varvec{w}}(\varvec{u})) = \deg u_0 + \gamma _0\). Thus, not only must \(\varvec{v}\) satisfy the degree restriction, but by Lemma 1, \(v_0\) also has minimal possible degree. \(\square \)

[Algorithm 1: solving Problem 1 (MgLSSR) by row reduction]
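The following is a minimal sketch of this strategy (our own illustration; the names are ours). It assumes the coefficient-list helpers from the sketches in Sect. 2 and a routine mulders_storjohann realising the row reduction of Line 2, as sketched in Sect. 4.

```python
def solve_mglssr(s, g, gamma):
    # s = [s_1, ..., s_l] and g = [g_1, ..., g_l] as coefficient lists;
    # gamma = (gamma_0, ..., gamma_l). Returns (lambda, omega_1, ..., omega_l).
    l = len(s)
    M = [[[1]] + s]                                       # row 0: (1, s_1, ..., s_l)
    for i in range(l):                                    # row i+1: g_{i+1} on the diagonal
        M.append([[0]] * (i + 1) + [g[i]] + [[0]] * (l - i - 1))
    V = mulders_storjohann([phi_w(row, gamma) for row in M])
    v = next(row for row in V if vdeg(row) >= 0 and lead_pos(row) == 0)
    return phi_w_inv(v, gamma)                            # the sought minimal solution
```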

The complexity of Algorithm 1 is determined by Line 2. Therefore, in Sects. 4 and 5.1 we analyse how, and at what cost, we can row reduce \(\mathcal{{R}}\)-matrices. In particular, we prove the following statement, where \(\mu := \max _i \{\gamma _i +\deg g_i\}\).

Theorem 2

Algorithm 1 has complexity \({\left\{ \begin{array}{ll} O(\ell \mu ^2), &{}\text {if } g_i = x^{t_i}+c_i, \, t_i \in \mathbb {Z}_{> 0}, \, c_i \in \mathbb {F}\; \forall \, i, \\ O(\ell ^2 \mu ^2), &{}\text {otherwise.} \end{array}\right. }\)

Proof

The first case follows from Theorem 8 in Sect. 5.1, using Algorithm 4 for the row reduction step. For general \(g_i\)'s, the result follows from Example 2 in Sect. 4, which estimates the complexity of Algorithm 3 for a shift-register input. \(\square \)

The above theorem applies well to decoding Gabidulin and Interleaved Gabidulin codes since the \(g_i\) are often of the restricted form: specifically, \(g_i\) is a power of x in syndrome Key Equations, while \(g_i = x^n - 1\) in the Gao Key Equation whenever \(n \mid s\). We therefore achieve the same complexity as [44] but in a wider setting.

3.2 Decoding Mahdavifar–Vardy codes

MV codes [26, 28] are subspace codes constructed by evaluating powers of skew polynomials at certain points. We will describe how one can use row reduction to carry out the most computationally intensive step of the MV decoding algorithm given in [28], the Interpolation step. In this section, \(\mathcal{{R}}= \mathbb {F}_{q^s}[x;\theta ]\) where \(\theta \) is some power of the Frobenius automorphism of \(\mathbb {F}_{q^s}/\,\mathbb {F}_q\).

Problem 2

(Interpolation Step of MV decoding) Let \(\ell , k, s, n \in \mathbb {Z}_{> 0}\) be such that \(\binom{\ell +1}{2}(k-1) < n \le s\). Given \((x_i, y_{i,1},\ldots , y_{i,\ell }) \in \mathbb {F}_{q^s}^{\ell +1}\) for \(i=1,\ldots ,n\), where the \(x_i\) are linearly independent over \(\mathbb {F}_q\), find a non-zero \(\varvec{Q} \in \mathcal{{R}}^{\ell +1}\) satisfying:

$$\begin{aligned}&Q_0(x_i) + \sum _{t=1}^\ell Q_t(y_{i,t}) = 0, \qquad i=1,\ldots ,n, \end{aligned}$$
(3)
$$\begin{aligned}&\deg Q_t < \chi - t(k-1), \qquad t=0,\dots ,\ell , \end{aligned}$$
(4)

where \(\chi \) is given by

$$\begin{aligned} \chi = \left\lceil \frac{n+1}{\ell +1} + \frac{1}{2}\ell (k-1) \right\rceil . \end{aligned}$$

The problem can be solved via a large linear system of equations, whose dimensions reveal that a solution always exists [28, Lemma 8]. Note that the requirement \(n > \binom{\ell +1}{2}(k-1)\) ensures that all the degree bounds (4) are non-negative.

Let \(\mathcal{{M}}\) be the set of all \(\varvec{Q}\) that satisfy (3) though not necessarily (4):

$$\begin{aligned} \textstyle \mathcal{{M}}= \big \{ \varvec{Q} \in \mathcal{{R}}^{\ell +1} \ \big |\ Q_0(x_i) + \sum _{t=1}^\ell Q_t(y_{i,t}) = 0 \quad i=1,\ldots ,n \big \} \end{aligned}$$
(5)

Lemma 3

Consider an instance of Problem 2. Then \(\mathcal{{M}}\) of (5) is a left \(\mathcal{{R}}\)-module.

Proof

\(\mathcal{{M}}\) is closed under addition since \(a(\alpha ) + b(\alpha ) = (a+b)(\alpha )\) for all \(a,b \in \mathcal{{R}}\) and \(\alpha \in \mathbb {F}_{q^s}\). Let \(f \in \mathcal{{R}}\), \(\varvec{Q} = (Q_0,Q_1,\dots ,Q_\ell ) \in \mathcal{{M}}\). Then \(f \cdot \varvec{Q}\) satisfies (3) since

$$\begin{aligned} (f \cdot Q_0)(x_i) + \sum _{t=1}^{\ell } (f \cdot Q_t)(y_{i,t}) = f\left( Q_0(x_i) + \sum _{t=1}^{\ell } Q_t(y_{i,t})\right) = f(0) = 0 \ . \end{aligned}$$

\(\square \)

For explicitly describing a basis of \(\mathcal{{M}}\), we need a few well-known technical elements:

Definition 4

Given \(a_1,\ldots ,a_m \in \mathbb {F}_{q^s}\) which are linearly independent over \(\mathbb {F}_q\), the annihilator polynomial of the \(a_i\) is the monic non-zero \(\mathcal {A}\in \mathcal{{R}}\) of minimal degree such that \(\mathcal {A}(a_i) = 0\) for all i.

It is easy to show that the annihilator polynomial is well-defined and that \(\deg \mathcal {A}= m\), see e.g. [37]. The existence of annihilator polynomials easily leads to the following analogue of Lagrange interpolation:
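The standard iterative construction of \(\mathcal {A}\) can be sketched as follows (our own illustration, reusing skew_mul, skew_eval, theta_pow and gf_inv from the Sect. 2.1 sketch, and assuming the \(a_i\) are indeed \(\mathbb {F}_q\)-linearly independent). It rests on \( ev _{fg} = ev _f \circ ev _g\): if \(\mathcal {A}\) vanishes on \(a_1,\ldots ,a_i\) and \(b = \mathcal {A}(a_{i+1}) \ne 0\), then \((x - \theta (b)b^{-1}) \cdot \mathcal {A}\) vanishes on \(a_1,\ldots ,a_{i+1}\).

```python
def annihilator(points):               # monic A with A(a) = 0 for all given points
    A = [1]                            # the constant polynomial 1
    for a in points:
        b = skew_eval(A, a)            # nonzero since the points are independent
        c = gf_mul(theta_pow(b, 1), gf_inv(b))   # c = theta(b)/b, so theta(b) - c*b = 0
        A = skew_mul([c, 1], A)        # multiply by (x - c); in char. 2, -c = c
    return A                           # deg A = number of points
```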

Lemma 4

(Interpolation polynomial) Given any \(a_1,\dots ,a_m \in \mathbb {F}_{q^s}\) which are linearly independent over \(\mathbb {F}_q\), and arbitrary \(b_1,\dots ,b_m \in \mathbb {F}_{q^s}\), there exists a unique \(R \in \mathcal{{R}}\) of degree at most \(m-1\) such that \(R(a_i) = b_i\) for all \(i=1,\ldots ,m\).

Lemma 5

Consider an instance of Problem 2 and let \(\mathcal{{M}}\) be as in (5). Denote by G the annihilator polynomial of the \(x_i, i=1,\ldots ,n\), and let \(R_t \in \mathcal{{R}}, t=1,\ldots ,\ell \) be the interpolation polynomial with \(R_t(x_i) = y_{i,t}\) for \(i=1,\dots ,n\). The rows of M form a basis of \(\mathcal{{M}}\):

$$\begin{aligned} M = \begin{pmatrix} \varvec{m}_0 \\ \varvec{m}_1 \\ \varvec{m}_2 \\ \vdots \\ \varvec{m}_\ell \end{pmatrix} = \begin{pmatrix} G &{} 0 &{} 0 &{} \dots &{} 0 \\ -R_1 &{} 1 &{} 0 &{} \dots &{} 0 \\ -R_2 &{} 0 &{} 1 &{} \dots &{} 0 \\ \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ -R_\ell &{} 0 &{} 0 &{} \dots &{} 1 \end{pmatrix} \end{aligned}$$
(6)

Proof

“\(\subseteq \)”: We should show that each \(\varvec{m}_t\) “vanishes” at the points \((x_i,y_{i,1},\ldots ,y_{i,\ell })\). Consider such a point; we have two cases:

$$\begin{aligned} \varvec{m}_0&: \quad G(x_i) = 0 \\ \varvec{m}_t&: \quad 1(y_{i,t})-R_t(x_i) = y_{i,t}-R_t(x_i) = 0 \ , \quad t=1,\dots ,\ell \end{aligned}$$

“\(\supseteq \)”: Consider some \(\varvec{Q} = (Q_0,\dots ,Q_\ell ) \in \mathcal{{M}}\). Then we can write

$$\begin{aligned} \varvec{v}_\ell&:= \varvec{Q} \\ \varvec{v}_{\ell -1}&:= \varvec{v}_\ell - v_{\ell ,\ell } \cdot \varvec{m}_\ell = (v_{\ell -1, 0},\ldots ,v_{\ell -1,\ell -1}, 0) \\ \varvec{v}_{\ell -2}&:= \varvec{v}_{\ell -1} - v_{\ell -1,\ell -1} \cdot \varvec{m}_{\ell -1} = (v_{\ell -2,0}, \ldots , v_{\ell -2,\ell -2},0, 0) \\&\; \; \, \vdots \\ \varvec{v}_{0}&:= \varvec{v}_1 - v_{1,1} \cdot \varvec{m}_1 = (v_{0,0}, 0, \dots , 0). \end{aligned}$$

Since \(\varvec{v}_\ell \in \mathcal{{M}}\), and each \(\varvec{m}_t \in \mathcal{{M}}\), we conclude that all the \(\varvec{v}_t \in \mathcal{{M}}\) and in particular \(\varvec{v}_0 \in \mathcal{{M}}\). Thus for any i we must have \(v_{0,0}(x_i) = 0\). This means G must right-divide \(v_{0,0}\): for otherwise, the division would yield a non-zero remainder \(B \in \mathcal{{R}}\) with \(\deg B < \deg G\) but still having \(B(x_i) = 0\), contradicting the minimality of G.

In summary, \(\varvec{v}_0 = f\cdot \varvec{m}_0\) for some \(f \in \mathcal{{R}}\), and hence \(\varvec{Q} = \varvec{v}_\ell \) is an \(\mathcal{{R}}\)-linear combination of the rows of M. \(\square \)

To complete the interpolation step, we need to find an element of \(\mathcal{{M}}\) whose components satisfy the degree constraints (4).

Theorem 3

Consider an instance of Problem 2, and let \(\mathcal{{M}}\) be as in (5). Let \(\varvec{w}= (0,(k-1), \dots , \ell (k-1))\), and V be a basis of \(\mathcal{{M}}\) in \(\varvec{w}\)-shifted weak Popov form. If \(\varvec{v}\) is a row of V with minimal \(\varvec{w}\)-shifted degree, \(\deg \varPhi _{\varvec{w}}(\varvec{v})\), then \(\varvec{v}\) is a solution to Problem 2.

Proof

Any row of V satisfies (3) because it is in \(\mathcal{{M}}\). As previously remarked, there exists some solution \(\varvec{Q} = (Q_0, Q_1,\dots ,Q_\ell ) \in \mathcal{{M}}\) satisfying the degree conditions (4). By the choice of \(\varvec{v}\) and by Lemma 1, then \(\deg \varPhi _{\varvec{w}}(\varvec{v}) \le \deg \varPhi _{\varvec{w}}(\varvec{Q})\). But then if \(t = \mathrm {LP}_{}(\varPhi _{\varvec{w}}(\varvec{Q}))\) we have that for any i:

$$\begin{aligned}&\deg \left( v_i x^{i(k-1)}\right) \le \deg \varPhi _{\varvec{w}}(\varvec{Q}) = \deg \left( Q_t x^{t(k-1)}\right) < \chi \end{aligned}$$

Hence, \(\varvec{v}\) satisfies (4). \(\square \)

This results immediately in the decoding procedure outlined as Algorithm 2.

[Algorithm 2: the interpolation step of MV decoding by row reduction]

Theorem 4

Algorithm 2 has complexity \(O(\ell n^2)\) over \(\mathbb {F}_{q^s}\).

Proof

Computing G can be done straightforwardly in \(O(n^2)\) operations over \(\mathbb {F}_{q^s}\). Each \(R_t\) can be computed at the same cost using a decomposition into smaller interpolations and two annihilator polynomials, see e.g. [39]. For Line 2, we use Algorithm 7, whose complexity is \(O(\ell n^2)\) by Theorem 10. \(\square \)

In [48], Xie, Lin, Yan and Suter present an algorithm for solving the Interpolation Step using a skew-variant of the Kötter–Nielsen–Høholdt algorithm [35] with complexity \(O(\ell ^2 s n)\) over \(\mathbb {F}_{q^s}\). Since \(n \le s\), our algorithm is at least as fast as theirs. Note that these costs probably dominate the complexity of MV decoding: the other step, Root-finding, likelyFootnote 5 has complexity \(O(\ell ^2 k n)\).

4 Row reduction of \(\mathcal{{R}}\)-matrices

4.1 The Mulders–Storjohann algorithm

In this section, we introduce our algorithmic work horse: obtaining row reduced bases of left \(\mathcal{{R}}\)-modules \(\mathcal{{V}}\subseteq \mathcal{{R}}^{m}\). The core is an \(\mathcal{{R}}\)-variant of the Mulders–Storjohann algorithm [31], which was originally described for \(\mathbb {F}[x]\) matrices. The algorithm and its proof of correctness carry over almost unchanged, while a fine-grained complexity analysis is considerably more involved; we return to this in Sect. 4.3.

Definition 5

Applying a simple transformation i on j at position h on a matrix V with \(\deg v_{i,h} \le \deg v_{j,h}\) means to replace \(\varvec{v}_j\) by \(\varvec{v}_j - \alpha x^\beta \varvec{v}_i\), where \(\beta = \deg v_{j,h} - \deg v_{i,h}\) and \(\alpha = \mathrm {LC}(v_{j,h})/\theta ^{\beta }(\mathrm {LC}(v_{i,h}))\).

By a simple LP-transformation i on j, where \(\mathrm {LP}_{}(\varvec{v}_i) = \mathrm {LP}_{}(\varvec{v}_j)\), we will mean a simple transformation i on j at position \(\mathrm {LP}_{}(\varvec{v}_i)\).

Remark 1

Note that a simple transformation i on j at position h cancels the leading term of the polynomial \(v_{j,h}\). Elementary row operations keep the row space and rank of the matrix unchanged, and in particular so does any sequence of simple transformations.

We use the following value function as a “size” of vectors in \(\mathcal{{R}}^{m}\):

$$\begin{aligned} \psi : \mathcal{{R}}^{m}&\rightarrow \mathbb {Z}_{\ge 0}\\ \varvec{v}&\mapsto \left\{ \begin{array}{ll} 0 &{} \text { if } \varvec{v}= \varvec{0} \\ m\deg \varvec{v}+ \mathrm {LP}_{}(\varvec{v}) + 1 &{} \text {otherwise}\\ \end{array} \right. \end{aligned}$$

Lemma 6

For some \(V \in \mathcal{{R}}^{\cdot \times m}\), consider a simple LP-transformation i on j, where \(\varvec{v}_j\) is replaced by \(\varvec{v}'_j\). Then \(\psi (\varvec{v}_j')<\psi (\varvec{v}_j)\).

Proof

The proof works exactly as in the \(\mathbb {F}[x]\) case, cf. [32, Lemma 8]. \(\square \)

[Algorithm 3: the Mulders–Storjohann algorithm over \(\mathcal{{R}}\)]
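A compact sketch of this algorithm (ours, reusing pdeg, vdeg, lead_pos and the field helpers from the sketches in Sect. 2; the base field there has characteristic 2, so subtraction is XOR):

```python
def simple_transform(vi, vj, h):
    # replace v_j by v_j - alpha x^beta v_i, cancelling the leading term of v_{j,h}
    beta = pdeg(vj[h]) - pdeg(vi[h])
    alpha = gf_mul(vj[h][pdeg(vj[h])],
                   gf_inv(theta_pow(vi[h][pdeg(vi[h])], beta)))
    out = []
    for f, g in zip(vi, vj):
        prod = [0] * beta + [gf_mul(alpha, theta_pow(c, beta)) for c in f]
        n = max(len(prod), len(g))
        prod, g = prod + [0] * (n - len(prod)), list(g) + [0] * (n - len(g))
        out.append([x ^ y for x, y in zip(prod, g)])   # char. 2: minus is XOR
    return out

def mulders_storjohann(V):
    while True:
        seen = {}
        for j, row in enumerate(V):
            if vdeg(row) < 0:                  # skip zero rows
                continue
            h = lead_pos(row)
            if h in seen:                      # two rows share a leading position:
                i = seen[h]                    # apply a simple LP-transformation
                if pdeg(V[i][h]) > pdeg(V[j][h]):
                    i, j = j, i                # ensure deg v_{i,h} <= deg v_{j,h}
                V[j] = simple_transform(V[i], V[j], h)
                break
            seen[h] = j
        else:                                  # no clash left: weak Popov form
            return V
```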

Theorem 5

Algorithm 3 is correct.

Proof

By Lemma 6, the \(\psi \)-value of one row of V decreases for each simple LP-transformation. The sum of the values of the rows must at all times be non-negative so the algorithm must terminate. When the algorithm terminates there are no \(i \ne j\) such that \(\mathrm {LP}_{}(\varvec{v}_i)= \mathrm {LP}_{}(\varvec{v}_j)\). That is to say, V is in weak Popov form. \(\square \)

The above proof easily leads to a rough complexity estimate for Algorithm 3 of \(O(m^2 \deg V {{\mathrm{maxdeg}}}V)\), where m is the number of columns of V.

Note that in Algorithm 3 each iteration might present several possibilities for the simple LP-transformation; the above theorem shows that any choice of LP-transformations leads to the correct result.

To transform V into \(\varvec{w}\)-shifted weak Popov form, for some shift \(\varvec{w}\in \mathbb {Z}_{\ge 0}^m\), we let \(V' = \varPhi _{\varvec{w}}(V)\) and apply Algorithm 3 on \(V'\) to obtain \(W'\) in weak Popov form. Since Algorithm 3 only performs row operations, it is clear that \(\varPhi _{\varvec{w}}\) can be inverted on \(W'\) to obtain \(W = \varPhi _{\varvec{w}}^{-1}(W')\). Then W is in \(\varvec{w}\)-shifted weak Popov form by definition.

4.2 The determinant degree and orthogonality defect

The purpose of this section is to introduce the orthogonality defect as a tool for measuring “how far” a square, full-rank matrix over \(\mathcal{{R}}\) is from being in weak Popov form. It relies on the nice properties of the degree of the Dieudonné determinant for matrices over \(\mathcal{{R}}\). The orthogonality defect for \(\mathbb {F}[x]\) matrices was introduced by Lenstra [22] and used in [32] to similar effect as we do here.

Dieudonné introduced a function for matrices over skew fields which shares some of the essential properties of the usual commutative determinant, in particular that it is multiplicative, see [12] or [13, Sect. 20]. This Dieudonné determinant can be applied to matrices over \(\mathcal{{R}}\) by considering \(\mathcal{{R}}\) inside its left field of fractions. The definition of this determinant is quite technical, and we will not actually need to invoke it. Rather, we will use an observation by Taelman [46] that the Dieudonné determinant gives rise to a simply behaved determinant degree function for matrices over \(\mathcal{{R}}\) with very nice properties:

Proposition 1

There is a unique function \(\deg \det :\,\mathcal{{R}}^{m\times m}\rightarrow \mathbb {Z}_{\ge 0} \cup \{-\infty \}\) s.t.:

  • \(\deg \det (A A') = \deg \det (A) + \deg \det (A')\) for all \(A, A' \in \mathcal{{R}}^{m\times m}\).

  • \(\deg \det U = 0\) for all \(U \in GL_m(\mathcal{{R}})\).

  • If A is diagonal with diagonal elements \(d_0,\ldots ,d_{m-1}\), then \(\deg \det A = \sum _{i}\deg d_i\).

Corollary 1

For any \(A,A' \in \mathcal{{R}}^{m\times m}\) the following hold:

  • If \(A'\) is obtained from A by elementary row operations, then \(\deg \det A' = \deg \det A\).

  • If B equals \(A \in \mathcal{{R}}^{m\times m}\) with one row or column scaled by some \(f \in \mathcal{{R}}^*\), then \(\deg \det B = \deg f + \deg \det A\).

  • If A is triangular with diagonal elements \(d_0,\ldots ,d_{m-1}\), then \(\deg \det A = \sum _i\deg d_i\).

  • \(\deg \det (\varPhi _{\varvec{w}}(A)) = \deg \det (A) + \sum _i w_i\) for any shift \(\varvec{w}\).

Example 1

Consider input matrix \(\varPhi _{\varvec{w}}(M)\) to Algorithm 1 for the caseFootnote 6 \(\ell =2\), \(\varvec{w}= (\gamma _0, \gamma _1, \gamma _2) = (100, 42,69)\), \(\deg s_1 = 99\), \(\deg s_2 = 95\) and \(\deg g_1 = \deg g_2 = 100\). Then

$$\begin{aligned} \varPhi _{\varvec{w}}(M) = \left( \begin{array}{ccc} x^{100} &{} s_1 x^{42} &{} s_2 x^{69} \\ &{} g_1 x^{42} &{} \\ &{} &{} g_2 x^{69} \end{array}\right) , \end{aligned}$$

and so by Proposition 1, \(\deg \det \varPhi _{\varvec{w}}(M) = \deg g_1 + \deg g_2 + \sum _i \gamma _i = 411\).

This description of \(\deg \det (\cdot )\) is not operational in the sense that it is not clear how to compute \(\deg \det V\) for general \(V \in \mathcal{{R}}^{m\times m}\). The following definition and Proposition 2 imply that Algorithm 3 can be used to compute \(\deg \det V\); conversely, we show in Sect. 4.3 how to bound the complexity of Algorithm 3 based on \(\deg \det V\).

Definition 6

The orthogonality defect of \(V \in \mathcal{{R}}^{m\times m}\) is \({\Delta (V)} := \deg V - \deg \det V\).

The following observations are easy for \(\mathbb {F}[x]\) matrices, but require more work over \(\mathcal{{R}}\):

Proposition 2

Let \(V \in \mathcal{{R}}^{m \times m}\) of full rank and in weak Popov form. Then \({\Delta (V)} = 0\).

Proof

Due to Corollary 1, we can assume the columns and rows of V are ordered such that \(\mathrm {LP}_{}(\varvec{v}_i) = i\) and \(\deg v_{i,i} \le \deg v_{j,j}\) for \(i < j\). We will call this property “ordered weak Popov form” in this proof. Note that it implies \(\psi (\varvec{v}_i) < \psi (\varvec{v}_j)\) for \(i < j\). We will inductively obtain a series of matrices \(V^{(0)} = V, V^{(1)}, V^{(2)}, \ldots , V^{(m)}\) each in ordered weak Popov form, and such that the first i columns of \(V^{(i)}\) are zero below the diagonal. Then \(V^{(m)}\) is upper triangular and we can obtain two expressions for its \(\deg \det \).

So assume that \(V^{(i)}\) is in ordered weak Popov form and its first i columns are zero below the diagonal. Recall that the (left) union, i.e. the least common left multiple, of two skew polynomials \(f, g \in \mathcal{{R}}\) is the unique lowest-degree \(p \in \mathcal{{R}}\) such that \(p = af = bg\) for some \(a,b \in \mathcal{{R}}\); it is a consequence of the Euclidean algorithm that the union always exists, see e.g. [38]. For each \(j > i\) consider now the coefficients in the union of \(v^{(i)}_{i,i}\) and \(v^{(i)}_{j,i}\), i.e. \(a^{(i)}_j,b^{(i)}_j \in \mathcal{{R}}\) such that \(a^{(i)}_j v^{(i)}_{i,i} = b^{(i)}_j v^{(i)}_{j,i}\). Let

$$\begin{aligned} V^{(i+1)} = \left( \begin{array}{@{}c|cccc@{}} \ I_{i-1}\ &{} \\ \hline &{} 1 \\ &{} -a^{(i)}_{i+1} &{} b^{(i)}_{i+1} \\ &{} \vdots &{} &{} \ddots \\ &{} -a^{(i)}_{m-1} &{} &{} &{} b^{(i)}_{m-1} \end{array}\right) V^{(i)} \ , \end{aligned}$$

where \(I_{i-1}\) is the \((i-1) \times (i-1)\) identity matrix. The first \(i+1\) columns of \(V^{(i+1)}\) are then zero below the diagonal. Also \(\mathrm {LP}_{}(a^{(i)}_j\varvec{v}^{(i)}_i) < \mathrm {LP}_{}(b^{(i)}_j\varvec{v}^{(i)}_{j}) = \mathrm {LP}_{}(\varvec{v}^{(i)}_j)\) and \(\deg (a^{(i)}_j\varvec{v}^{(i)}_i) \le \deg (b^{(i)}_j\varvec{v}^{(i)}_{j})\) for \(j > i\), which means \(\psi (a^{(i)}_j\varvec{v}^{(i)}_i) < \psi (b^{(i)}_j\varvec{v}^{(i)}_{j})\) and therefore \(\psi (\varvec{v}^{(i+1)}_j) = \psi (b^{(i)}_j\varvec{v}^{(i)}_{j})\). This implies that \(V^{(i+1)}\) is in ordered weak Popov form and that \(\deg v^{(i+1)}_{j,j} = \deg b^{(i)}_j + \deg v^{(i)}_{j,j}\) for \(j > i\), which inductively expands to

$$\begin{aligned} \deg v^{(i+1)}_{j,j} = \deg v_{j,j} + \sum _{h=0}^i \deg b^{(h)}_j \ . \end{aligned}$$

Inductively, we therefore arrive at an upper triangular matrix \(V^{(m)}\) in ordered weak Popov form, and whose diagonal elements satisfy \(\deg v^{(m)}_{j,j} = \deg v_{j,j} + \sum _{i=0}^{j-1} \deg b^{(i)}_j\). Thus \(\deg \det V^{(m)}\) is the sum of all these degrees by Corollary 1. On the other hand \(V^{(m)}\) is obtained by multiplying triangular matrices on \(V^{(0)} = V\), so by Proposition 1 we get another expression for \(\deg \det V^{(m)}\) as:

$$\begin{aligned} \deg \det V^{(m)} = \deg \det V + \sum _{i=0}^{m-1} \sum _{j=i+1}^{m-1} \deg b^{(i)}_j \end{aligned}$$

Combining the expressions, we get \(\deg \det V = \sum _{i=0}^{m-1} \deg v_{i,i} = \deg V\). \(\square \)

Corollary 2

Let \(V \in \mathcal{{R}}^{m\times m}\) be of full rank; then \(0 \le \deg \det V \le \deg V\).

Proof

Applying Algorithm 3 on V would use row operations to obtain a matrix \(V' \in \mathcal{{R}}^{m\times m}\) in weak Popov form. Then \(\deg \det V = \deg \det V'\) by Proposition 1. By Proposition 2 then \(\deg \det V' = \deg V' \ge 0\), and by the nature of Algorithm 3, then \(\deg V' \le \deg V\). \(\square \)

4.3 Complexity of Mulders–Storjohann

We can now bound the complexity of Algorithm 3 using arguments completely analogous to the \(\mathbb {F}[x]\) case in [32]. These are, in turn, the original arguments of [31], made finer grained by using the orthogonality defect. We give the full proof here since its main steps are referred to in Sect. 5.1.

Theorem 6

Algorithm 3 with a full-rank input matrix \(V \in \mathcal{{R}}^{m \times m}\) performs at most \(m \big ({\Delta (V)} + m \big )\) simple LP-transformations, and it has complexity \(O(m^2 {\Delta (V)} {{\mathrm{maxdeg}}}(V))\) over \(\mathbb {F}\).

Proof

By Lemma 6, every simple LP-transformation reduces the \(\psi \)-value of one row by at least 1. So the number of possible simple LP-transformations is upper bounded by the difference between the values of the input matrix V and the output matrix U, the value of a matrix being the sum of the values of its rows. More precisely, the number of iterations is upper bounded by:

$$\begin{aligned}&{\textstyle \sum _{i=0}^{m-1}} \left[ m\deg \varvec{v}_i+\mathrm {LP}_{}(\varvec{v}_i)-\big ( m\deg \varvec{u}_i+\mathrm {LP}_{}(\varvec{u}_i) \big )\right] \\&\quad \le m^2 + m {\textstyle \sum _{i=0}^{m-1}} \left[ \deg \varvec{v}_i-\deg \varvec{u}_i\right] \\&\quad = m [\deg V-\deg U+m] = m ({\Delta (V)}+m) , \end{aligned}$$

where the last equality follows from \(\deg U = \deg \det U\) due to Proposition 2 and \(\deg \det U = \deg \det V\).

One simple transformation consists of calculating \(\varvec{v}_j - \alpha x^\beta \varvec{v}_i\), so for every coefficient in \(\varvec{v}_i\), we must apply \(\theta ^\beta \), multiply by \(\alpha \) and then add it to a coefficient in \(\varvec{v}_j\), each being in O(1). Since \(\deg \varvec{v}_j \le {{\mathrm{maxdeg}}}(V)\) this costs \(O(m{{\mathrm{maxdeg}}}(V))\) operations in \(\mathbb {F}\). \(\square \)

Since \({\Delta (V)} \le \deg V\), the above complexity bound is always at least as good as the straightforward bound we mentioned at the end of Sect. 4.1.

Example 2

Consider an instance of Problem 1. The complexity of Algorithm 1 is determined by a row reduction of

$$\begin{aligned} \varPhi _{\varvec{w}}(M) = \left( \begin{array}{ccccc} x^{\gamma _0} &{} s_1 x^{\gamma _1} &{} s_2 x^{\gamma _2} &{} \dots &{} s_\ell x^{\gamma _\ell } \\ &{} g_1 x^{\gamma _1} &{} \\ &{} &{} g_2 x^{\gamma _2} &{} \\ &{} &{} &{} \ddots &{} \\ &{} &{} &{} &{} g_\ell x^{\gamma _\ell } \\ \end{array}\right) . \end{aligned}$$
(7)

Let \(\mu := \max _i \{\gamma _i +\deg g_i\}\). We can assume that \(\gamma _0 < \max _{i \ge 1} \{ \gamma _i + \deg s_i \} \le \mu \) since otherwise M is already in \(\varvec{w}\)-shifted weak Popov form. To apply Theorem 6, we calculate the orthogonality defect of \(\varPhi _{\varvec{w}}(M)\). Since it is upper triangular, the degree of its determinant is

$$\begin{aligned} \deg \det \varPhi _{\varvec{w}}(M) = \sum _{i=1}^\ell \deg g_i + \sum _{i=0}^\ell \gamma _i \ . \end{aligned}$$

The degrees of the rows of \(\varPhi _{\varvec{w}}(M)\) satisfy

$$\begin{aligned} \deg \varPhi _{\varvec{w}}(\varvec{m}_0)&= \max _i \{\gamma _i + \deg s_i\} \le \mu \ , \\ \deg \varPhi _{\varvec{w}}(\varvec{m}_i)&= \gamma _i + \deg g_i&\text {for } i \ge 1. \end{aligned}$$

Thus, \({\Delta (\varPhi _{\varvec{w}}(M))} \le \mu - \gamma _0\). With \({{\mathrm{maxdeg}}}(\varPhi _{\varvec{w}}(M)) \le \mu \), Theorem 6 implies a complexity of \(O(\ell ^2 \mu ^2)\), assuming \(\ell \in O(\mu )\). Note that the straightforward bound on Algorithm 3 yields \(O(\ell ^3 \mu ^2)\). \(\square \)

Example 3

(Mulders–Storjohann for the Interpolation Step in decoding MV codes) Line 2 of Algorithm 2 is a row reduction of \(\varPhi _{\varvec{w}}(M)\), as defined in (6), where the degrees of the non-zero entries are bounded component-wise by:

$$\begin{aligned} \begin{pmatrix} n &{} &{} &{} &{} &{}\\ n &{} (k-1) &{} &{} &{}\\ n &{} &{} 2(k-1) &{} &{}\\ \vdots &{} &{} &{} \ddots &{} \\ n &{} &{} &{} &{} \ell (k-1) \end{pmatrix} \end{aligned}$$

Thus \({\Delta (\varPhi _{\varvec{w}}(M))} \le \deg \varPhi _{\varvec{w}}(M) - n - \binom{\ell +1}{2}(k-1) \le \ell n\). Using Theorem 6, the complexity in operations over \(\mathbb {F}_{q^s}\) becomes \(O(\ell ^3 n^2)\).

5 Faster row reduction on matrices having special forms

In this section, we will investigate improved row reduction algorithms for matrices of special forms. The main goals are to improve the running time of row reducing the matrices appearing in the decoding settings of Sects. 3.1 and 3.2, but the results here apply more broadly.

5.1 Shift register problems: the demand-driven algorithm

Our first focus is to improve the MgLSSR case of Algorithm 1, where we are to row reduce \(\varPhi _{\varvec{w}}(M)\) as given by (7): Algorithm 4 is a refinement of Algorithm 3 which is asymptotically faster when all \(g_i\) are of the form \(x^{d_i} + a_i\) for \(a_i \in \mathbb {F}\). Though the refinement is completely analogous to that of [32] for the \(\mathbb {F}[x]\) case, no complete proof has appeared in unabridged, peer-reviewed form before, so we give full proofs for the \(\mathcal{{R}}\) case here. We begin with a technical lemma:

Lemma 7

Consider an instance of Problem 1 and Algorithm 3 with input \(\varPhi _{\varvec{w}}(M)\) of (7). Let \(\tilde{g}_j = g_j x^{\gamma _j}\). Consider a variant of Algorithm 3 where, after a simple LP-transformation i on j, which replaces \(\varvec{v}_j\) with \(\varvec{v}'_j\), we instead replace it with \(\varvec{v}''_j = (v'_{j,0}, v'_{j,1} \bmod \tilde{g}_1, \ldots , v'_{j,\ell } \bmod \tilde{g}_\ell )\). This does not change the correctness of the algorithm or the upper bound on the number of simple LP-transformations performed.

Proof

Correctness follows if we can show that each of the \(\ell \) modulo reductions could have been achieved by a series of row operations on the current matrix V after the simple LP-transformation producing \(\varvec{v}_j'\). For each \(h \ge 1\), let \(\varvec{g}_h = (0,\ldots ,0,\tilde{g}_h,0,\ldots ,0)\), with position h non-zero.

During the algorithm, we will let \(J_h\) be a subset of the current rows in V having two properties: that \(\varvec{g}_h\) can be constructed as an \(\mathcal{{R}}\)-linear combination of the rows in \(J_h\); and that each \(\varvec{v} \in J_h\) has \( \psi (\varvec{v}) \le \psi (\varvec{g}_h)\). Initially, \(J_h = \{ \varvec{g}_h \}\).

After simple LP-transformations on rows not in \(J_h\), the h’th modulo reduction is therefore allowed, since \(\varvec{g}_h\) can be constructed by the rows in \(J_h\). On the other hand, consider a simple LP-transformation i on j where \(\varvec{v}_j \in J_h\), resulting in the row \(\varvec{v}_j'\). Then the h’th modulo reduction has no effect since \(\psi (\varvec{v}_j') < \psi (\varvec{v}_j) \le \psi (\varvec{g}_h)\). Afterwards, \(J_h\) is updated as \(J_h = J_h \setminus \{ \varvec{v}_j \} \cup \{ \varvec{v}_j', \varvec{v}_i \}\). We see that \(J_h\) then still satisfies the two properties, since \(\psi (\varvec{v}_i) \le \psi (\varvec{v}_j) \le \psi (\varvec{g}_h)\).

Since \(\psi (\varvec{v}''_j) \le \psi (\varvec{v}'_j)\) the proof of Theorem 6 shows that the number of simple LP-transformations performed is still bounded by \((\ell +1)({\Delta (V)}+\ell +1)\). \(\square \)

[Algorithm 4: demand-driven row reduction for MgLSSR instances]

Theorem 7

Algorithm 4 is correct.

Proof

We first prove that an intermediary algorithm, Algorithm 5, is correct using the correctness of Algorithm 3, and then prove the correctness of Algorithm 4 using Algorithm 5 and Lemma 7. Starting from Algorithm 3 with input \(\varPhi _{\varvec{w}}(M)\), Algorithm 5 is obtained by two simple modifications: firstly, note that initially, when \(V := \varPhi _{\varvec{w}}(M)\), then \(\mathrm {LP}_{}(\varvec{v}_h) = h\) for \(h \ge 1\), and therefore the only possible simple LP-transformation must involve \(\varvec{v}_0\). We can maintain this property as a loop invariant throughout the algorithm by swapping \(\varvec{v}_0\) and \(\varvec{v}_{\mathrm {LP}_{}(\varvec{v}_0)}\) when applying a simple LP-transformation \(\mathrm {LP}_{}(\varvec{v}_0)\) on 0.

The second modification is to maintain \((\eta , h)\) as an upper bound on the \((\deg , \mathrm {LP})\) of \(\varvec{v}_0\) throughout the algorithm: we initially simply compute these values. Whenever we have applied a simple LP-transformation on \(\varvec{v}_0\) resulting in \(\varvec{v}_0'\), we know by Lemma 6 that \(\psi (\varvec{v}_0') < \psi (\varvec{v}_0)\). Therefore, either \(\deg \varvec{v}_0' < \eta \) or \(\deg \varvec{v}_0' = \eta \wedge \mathrm {LP}_{}(\varvec{v}_0') < h\). This is reflected in a corresponding decrement of \((\eta , h)\).

As a loop invariant we therefore have \(\psi (\varvec{v}_0) \le \eta (\ell +1) + h\). After an iteration, if this inequality is sharp, it simply implies that the \(\alpha \) computed in the following iteration will be 0, and \((\eta , h)\) will be correspondingly decremented once more. Note that we never set \(h = 0\): when \(\mathrm {LP}_{}(\varvec{v}_0) = 0\) then V must be in weak Popov form (since we already maintain \(\mathrm {LP}_{}(\varvec{v}_h) = h\) for \(h>0\)). At this point, the while-loop will be exited since \(\deg v_0 > \eta \).

[Algorithm 5: intermediary algorithm between Algorithms 3 and 4]

Algorithm 5 is then simply the implementation of these modifications, writing out in full what the simple LP-transformation does to \(\varvec{v}_0\). This proves that Algorithm 5 is operationally equivalent to Algorithm 3 with input \(\varPhi _{\varvec{w}}(M)\).

For obtaining Algorithm 4 from Algorithm 5, the idea is to store only the necessary part of V and compute the rest on demand. Firstly, by Lemma 7 correctness would be maintained if the simple LP-transformation on Line 9 of Algorithm 5 was followed by the \(\ell \) modulo reductions. In that case, we would have \(v_{0,h} = (v_{0,0}\tilde{s}_h \bmod \tilde{g}_h)\), so storing only \(v_{0,0}\) suffices for reconstructing \(\varvec{v}_0\). Consequently we store the first column of V in Algorithm 4 as \((\lambda _0,\ldots ,\lambda _\ell )\). Line 6 of Algorithm 4 is now the computation of the needed coefficient of \(v_{0,h}\) at the latest possible time.

As \(\deg \varvec{v}_h\) is used in Line 6 of Algorithm 5, we need to store and maintain it between iterations; these are the variables \(\eta _1,\ldots ,\eta _\ell \). To save some redundant computation of coefficients, the \(x^{\eta _h}\)-coefficient of \(v_{h,h}\) is also stored as \(\alpha _h\).

This proves that Algorithm 4 is operationally equivalent to Algorithm 5, which finishes the proof of correctness. \(\square \)

Proposition 3

Algorithm 4 has computational complexity \(O(\ell \mu ^2 + \sum _{h=1}^\ell \sum _{\eta =0}^{\mu -1} T_{h,\eta })\), where \(\mu = \max _i \{\gamma _i +\deg g_i\}\) and \(T_{h,\eta }\) bounds the complexity of running Line 6 for those values of h and \(\eta \).

Proof

Clearly, all steps of the algorithm are essentially free except Lines 6 and 9. Observe that every iteration of the while-loop decreases an upper bound on the value of row 0, whether we enter the if-branch in Line 7 or not. So by the arguments of the proof of Theorem 6, the loop will iterate at most \(O(\ell \mu )\) times, during which each possible value of \((h, \eta ) \in \{1,\ldots ,\ell \} \times \{0,\ldots ,\mu -1\}\) is taken at most once. Each execution of Line 9 costs \(O(\mu )\) since the \(\lambda _j\) all have degree at most \(\mu \). \(\square \)

It is possible to use Proposition 3 to show that Algorithm 4 is efficient if e.g. all the \(g_i\) have few non-zero monomials.Footnote 7 We will restrict ourselves to a simpler case which nonetheless has high relevance for coding theory:

Theorem 8

Algorithm 4 can be realised with complexity \(O(\ell \mu ^2)\) if \(g_i = x^{d_i} + a_i\) for \(a_i \in \mathbb {F}_q\) for all i, where \(\mu = \max _i \{\gamma _i +\deg g_i\}\).

Proof

We will bound \(\sum _{\eta =0}^{\mu -1} T_{h,\eta }\) of Proposition 3. Note first that for any \(\eta \), the coefficient \(\alpha \) to \(x^\eta \) in \((\lambda \tilde{s}_h \bmod \tilde{g}_h)\) equals the coefficient to \(x^{\eta -\gamma _h}\) of \((\lambda s_h \bmod g_h)\), so considering \(\gamma _h = 0\) suffices. Now if \(\eta \ge d_h\) then \(\alpha = 0\) and can be returned immediately. If \(\eta < d_h\), then due to the assumed shape of the \(g_i\), \(\alpha \) is a linear combination of the coefficients to \(x^{\eta }, x^{\eta +d_h}, \ldots , x^{\eta + t d_h}\) in \(\lambda _0 s_h\), where \(t = \left\lfloor \frac{\mu - \eta }{d_h} \right\rfloor \). Each such coefficient can be computed by convolution of \(\lambda _0\) and \(s_h\) in \(O(\mu )\), so it costs \(O(\mu ^2/d_h)\) to compute \(\alpha \). Summing over all choices of \(\eta \), we have \(\sum _{\eta =0}^{\mu -1} T_{h,\eta } \in O(\mu ^2)\) and the theorem follows from Proposition 3. \(\square \)
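The coefficient computation in this proof can be sketched as follows (our own illustration, reusing gf_mul and theta_pow from the Sect. 2.1 sketch; characteristic 2, so \(-a = a\), and we take \(\gamma _h = 0\) and \(\eta < d\) as in the proof):

```python
def prod_coeff(lam, s, m):             # x^m-coefficient of the skew product lam * s
    c = 0
    for i, li in enumerate(lam):       # (lam * s)_m = sum_i lam_i theta^i(s_{m-i})
        if 0 <= m - i < len(s):
            c ^= gf_mul(li, theta_pow(s[m - i], i))
    return c

def coeff_mod(lam, s, eta, d, a):
    # x^eta-coefficient of (lam*s mod x^d + a), with a in the fixed field and
    # eta < d: x^(eta + t*d) = (-a)^t x^eta mod (x^d + a), so sum the convolutions
    c, at, m = 0, 1, eta
    while m <= len(lam) + len(s) - 2:  # each prod_coeff call is one O(mu) convolution
        c ^= gf_mul(at, prod_coeff(lam, s, m))
        at, m = gf_mul(at, a), m + d
    return c
```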

5.2 Weak Popov walking

The goal of this section is to arrive at a faster row reduction algorithm for the matrices used for decoding MV codes in Sect. 3.2. However, the algorithm we describe could be of much broader interest: it is essentially an improved way of computing a \(\varvec{w}\)-shifted weak Popov form of a matrix which is already in \(\varvec{w}'\)-shifted weak Popov form, for a shift \(\varvec{w}'\) which is not too far from \(\varvec{w}\). Inspired by “Gröbner walks”, we have dubbed this strategy “weak Popov walking”. Each “step” of the walk can be seen as Algorithm 3 where we carefully choose which LP-transformations to apply in each iteration, whenever there is a choice.

This strategy would work completely equivalently for the \(\mathbb {F}[x]\) case. However, to the best of our knowledge, that has not been done before.

In this section we will extensively discuss vectors under different shifts. To ease the notation we therefore introduce shifted versions of the following operators: \(\mathrm {LP}_{\varvec{w}}(\varvec{v}) := \mathrm {LP}_{}(\varPhi _{\varvec{w}}(\varvec{v}))\) as well as \(\deg _{\varvec{w}}(\varvec{v}) := \deg \varPhi _{\varvec{w}}(\varvec{v})\).

We begin with Algorithm 6, which efficiently “walks” from a weak Popov form according to the shift \(\varvec{w}\) into one with the shift \(\varvec{w}+ (1,0,\ldots ,0)\). The approach can readily be generalised to support an increment on any index, but we do not need this for the decoding problem, so we omit the generalisation to simplify notation.

[Algorithm 6: weak Popov walking]

Theorem 9

Algorithm 6 is correct.

Proof

In this proof, denote by V the input and by \(\hat{V}\) the output of the algorithm. The algorithm performs a single sweep of simple transformations, modifying only rows indexed by I: in particular, if \(\varvec{v}_i,{\varvec{\hat{v}}}_i\) are the rows of V respectively \(\hat{V}\), then either \({\varvec{\hat{v}}}_i = \varvec{v}_i\), or \({\varvec{\hat{v}}}_i\) is the result of a simple transformation on \(\varvec{v}_i\) by another row \(\varvec{v}_j\) of V with \(i,j \in I\). All the \(h_i\) are different since V is in \(\varvec{w}\)-shifted weak Popov form. We will show that the \({\varvec{\hat{w}}}\)-shifted leading positions of \(\hat{V}\) are a permutation of the \(h_i\), implying that \(\hat{V}\) is in \({\varvec{\hat{w}}}\)-shifted weak Popov form.

Note first that for any vector \(\varvec{v}\in \mathcal{{R}}^m\), if \(\mathrm {LP}_{\varvec{w}}(\varvec{v}) \ne \mathrm {LP}_{{\varvec{\hat{w}}}}(\varvec{v})\) then \(\mathrm {LP}_{{\varvec{\hat{w}}}}(\varvec{v}) = 0\), since only the degree of the 0th position of \(\varPhi _{\varvec{\hat{w}}}(\varvec{v})\) differs from the corresponding position of \(\varPhi _{\varvec{w}}(\varvec{v})\). For each \(i \in \{0,\ldots ,m-1\} \setminus I\) we have \({\varvec{\hat{v}}}_i = \varvec{v}_i\) and so \(\mathrm {LP}_{{\varvec{\hat{w}}}}({\varvec{\hat{v}}}_i) = h_i\). And of course for each \(i \in I\) we have \(\mathrm {LP}_{{\varvec{\hat{w}}}}(\varvec{v}_i) = 0\). This implies for each \(j \in I\) that:

$$\begin{aligned} \deg v_{j,0} + w_0 = \deg v_{j,h_j} + w_{h_j} \ . \end{aligned}$$
(8)

Consider first an index \(i \in I\) for which Line 9 was run, and let t be as at that point. This means \({\varvec{\hat{v}}}_i = \varvec{v}_i + \alpha x^\delta \varvec{v}_t\) for some \(\alpha \in \mathbb {F}\) and \(\delta = \deg v_{i,0} - \deg v_{t,0}\). Note that the if-condition ensures \(\delta \ge 0\), so the simple transformation makes sense. We will establish that \(\mathrm {LP}_{{\varvec{\hat{w}}}}({\varvec{\hat{v}}}_i) = h_i\). Since we are performing an LP-transformation, we know that \(\deg _{\varvec{\hat{w}}}{\varvec{\hat{v}}}_i \le \deg _{\varvec{\hat{w}}}\varvec{v}_i\), so we are done if we can show that \(\deg \hat{v}_{i,h_i} = \deg v_{i,h_i}\) and \(\deg \hat{v}_{i, k} + w_k < \deg v_{i,h_i} + w_{h_i}\) for \(k > h_i\). This in turn will follow if \(\alpha x^\delta \varvec{v}_t\) has \({\varvec{\hat{w}}}\)-weighted degree less than \(\deg v_{i,h_i} + w_{h_i}\) at all positions \(k \ge h_i\).

Due to \(\mathrm {LP}_{\varvec{w}}(\varvec{v}_t) = h_t\) and (8) for index t then for any \(k > h_t\):

$$\begin{aligned} \deg v_{t, k} + w_k < \deg v_{t, h_t} + w_{h_t} = \deg v_{t,0} + w_0 \ . \end{aligned}$$
(9)

Using \(\deg v_{t,0} + \delta = \deg v_{i,0}\) and (8) for index i, we conclude that

$$\begin{aligned} \deg v_{t, k} + w_k + \delta < \deg v_{i,0} + w_{0} = \deg v_{i,h_i} + w_{h_i} \ . \end{aligned}$$

Since \(h_t < h_i\) by the ordering of the \(i_\star \), this shows that \(\deg v_{i, k} + w_k + \delta < \deg v_{i,h_i} + w_{h_i}\) for \(k \ge h_i\). These are the degree bounds we sought and so \(\mathrm {LP}_{{\varvec{\hat{w}}}}({\varvec{\hat{v}}}_i) = h_i\).

Consider now an \(i \in I\) for which Line 9 was run, and let again t be as at that point, before the reassignment. The situation is exactly the reverse of the previous case, so by analogous arguments \(\mathrm {LP}_{{\varvec{\hat{w}}}}({\varvec{\hat{v}}}_t) = h_i\).

For the value of t at the end of the algorithm, then clearly \(\mathrm {LP}_{{\varvec{\hat{w}}}}({\varvec{\hat{v}}}_t) = 0\) since the row was not modified. Since we necessarily have \(h_{i_1} = 0\), then \(\mathrm {LP}_{{\varvec{\hat{w}}}}({\varvec{\hat{v}}}_t) = h_{i_1}\). Thus every \(h_i\) becomes the \({\varvec{\hat{w}}}\)-leading position of one of the \(\varvec{v}_j\) exactly once. But the \(h_i\) were all different, and so \(\hat{V}\) is in \({\varvec{\hat{w}}}\)-shifted weak Popov form. \(\square \)

Proposition 4

Algorithm 6 performs at most

$$\begin{aligned} \textstyle O\big (m\deg \det (V) + \sum _{i < j}|w_i - w_j| + m^2\big ) \end{aligned}$$

operations over \(\mathbb {F}\).

Proof

We will bound the number of non-zero monomials which are involved in simple transformations. As remarked in the proof of Theorem 9, all simple transformations are done using distinct rows of the input matrix, so it suffices to bound the total number of monomials in the input matrix V.

Since we are then simply counting monomials in V, we can assume w.l.o.g. that \(w_0 \le w_1 \le \ldots \le w_{m-1}\), and since the input matrix V was in \(\varvec{w}\)-shifted weak Popov form, assume also w.l.o.g. that we have sorted the rows such that \(\mathrm {LP}_{\varvec{w}}(\varvec{v}_i) = i\). Since \({\Delta (\varPhi _{\varvec{w}}(V))} = 0\) we have

$$\begin{aligned} \deg \det \varPhi _{\varvec{w}}(V) = \deg _{\varvec{w}} V \qquad \text { that is }\qquad \textstyle \deg _{\varvec{w}} V = \deg \det V + \sum _i w_i \ . \end{aligned}$$

We can therefore consider the assignment of \(\deg _{\varvec{w}}\) to the individual rows of V which, under these constraints, maximises the possible number of monomials in V. We cannot have \(\deg _{\varvec{w}} \varvec{v}_i < w_i\) since \(\mathrm {LP}_{\varvec{w}}(\varvec{v}_i) = i\). It is easy to see that the worst-case assignment is then to have exactly \(\deg _{\varvec{w}} \varvec{v}_i = w_i\) for \(i=0,\ldots ,m-2\) and \(\deg _{\varvec{w}} \varvec{v}_{m-1} = \deg \det V + w_{m-1}\). In this case, for \(i < m-1\) then \(\deg v_{i,j} \le w_i - w_j\) if \(j \le i\) and \(v_{i,j} = 0\) if \(j > i\), so the number of monomials can then be bounded as

$$\begin{aligned}&\left( \sum _{i=0}^{m-2} \sum _{j=0}^{i}(w_i - w_j + 1) \right) + \left( \sum _{j=0}^{m-1}(\deg \det V + w_{m-1} - w_j + 1) \right) \\&\qquad \le m^2 + \sum _{i < j}(w_j - w_i) + m\deg \det V. \end{aligned}$$

\(\square \)

The idea is now to iterate Algorithm 6 to “walk” from a matrix in weak Popov form for one shift to one for another shift \({\varvec{\hat{w}}}\). Row reduction of the matrix for the MV codes can then be done as in Algorithm 7.

[Algorithm 7: row reduction for MV interpolation by iterated weak Popov walking]

Theorem 10

Algorithm 7 is correct. It has complexity \(O(\ell n^2)\) over \(\mathbb {F}_{q^s}\).

Proof

Note that M is in \(\varvec{w}'\)-shifted weak Popov form, where \(\varvec{w}'\) is as on Line 2. Thus by the correctness of Algorithm 6, then V at the end of the algorithm must be in \(\big (\varvec{w}+ (n, \ldots , n)\big )\)-shifted weak Popov form. Then it is clearly also in \(\varvec{w}\)-shifted weak Popov form. For the complexity, the algorithm simply performs n calls to Algorithm 6. We should estimate the quantity \(\sum _{i < j}|w_i - w_j|\), which is greatest in the first iteration. Since Problem 2 posits \(n > \binom{\ell +1}{2}(k-1)\), we can bound the sum as:

$$\begin{aligned} \sum _{j=1}^{\ell }(n + j(k-1)) + \sum _{1 \le i< j}(j-i)(k-1) < \ell n + (\ell +1)\binom{\ell +1}{2} (k-1) \in O(\ell n) \ . \end{aligned}$$

Since \(\deg \det (V) = \deg \det (M) = n\) then by Proposition 4 each of the calls to Algorithm 6 therefore costs at most \(O(\ell n)\). \(\square \)

6 Conclusion

We have explored row reduction of skew polynomial matrices. For ordinary polynomial rings, row reduction has proven a useful strategy for obtaining flexible, efficient, yet conceptually simple decoding algorithms for RS and other code families. Our results introduce the methodology and tools aimed at bringing similar benefits to Gabidulin, Interleaved Gabidulin, MV, and other skew polynomial-based codes. We used those tools in two settings. We solved a general form of multiple skew-shift register synthesis (cf. Problem 1), and applied this to the decoding of Interleaved Gabidulin codes in complexity \(O(\ell \mu ^2)\), see Theorem 2. For MV codes (cf. Problem 2), we gave an interpolation algorithm with complexity \(O(\ell n^2)\), see Theorem 4.

We extended and analysed the simple and generally applicable Mulders–Storjohann algorithm in the skew polynomial setting. In both of the studied settings, the complexity of that algorithm was initially not satisfactory, but it served as a crucial step in developing more efficient algorithms. For multiple skew-shift register synthesis, we obtained a good complexity for a more general problem than previously solved. For MV codes, the improved algorithm took the shape of a versatile “weak Popov walk”, which could potentially apply to many other problems. In all previously studied cases, we match the best known complexities [44, 48] that do not make use of fast multiplication of skew polynomials.

Based on a preprint of this paper, it is shown in [40] how to further reduce the complexity of decoding Interleaved Gabidulin codes using a divide-and-conquer version of Algorithm 3, matching the complexity of [43].

The weak Popov form has many properties that can be beneficial in a coding setting, and which we have not yet explored. For instance, it allows one to easily enumerate all “small” elements of the row space: that could e.g. be used to enumerate all solutions to a shift register problem, allowing a chase-like decoding of Interleaved Gabidulin codes beyond half the minimum distance.