1 Introduction

Numerous recent publications have unified the core of various decoding algorithms for Reed–Solomon (RS) and Hermitian codes using row reduction of certain \(\mathbb {F}[x]\)-module bases: first for the Guruswami–Sudan list decoder [2, 9, 21], then for Power decoding [32, 33], and subsequently for both types of decoders for Hermitian codes [34]. By factoring out the coding theory from the core problem, we enable the immediate use of sophisticated algorithms developed by the computer algebra community such as [16, 50].

The goal of this paper is to explore the row reduction description over skew polynomial rings, with main applications to decoding rank-metric and subspace codes. Concretely, we prove that Interleaved Gabidulin and Mahdavifar–Vardy (MV) codes can be decoded by transforming a module basis into weak Popov form, which can be obtained by a skew-polynomial analogue of the elegantly simple Mulders–Storjohann algorithm [31]. By exploiting the structure of the module bases arising from the decoding problems, we refine the algorithm to obtain improved complexities. These match the best known algorithms for these applications while solving more general problems, demonstrating that the row reduction methodology is both flexible and fast for skew polynomial rings. Building on this paper, Ref. [40] proposes an algorithm which improves upon the best known complexity for decoding Interleaved Gabidulin codes.

Section 1.1 summarizes related work. We set basic notation in Sect. 2. Section 3 shows how to solve the mentioned decoding problems using row reduction and states the final complexity results which are proven in the subsequent sections. We describe row reduction of skew polynomial matrices in Sect. 4. Section 5 presents faster row reduction algorithms for certain input matrices, with applications to the decoding problems.

This work was partly presented at the International Workshop on Coding and Cryptography 2015 [24]. Compared to that previous work, we have added the decoding of MV codes using the row reduction approach.Footnote 1 This spurred a new refinement of the Mulders–Storjohann algorithm, given in Sect. 5.2, which could be of wider interest.

1.1 Related work

In this paper we consider skew polynomial rings over finite fields without derivations [38] (see Sect. 2.1 for this restricted definition of skew polynomials). This is the most relevant case for coding theory, partly because such rings are easier to compute with, though non-zero derivations have been used in some constructions [8]. All of the row reduction algorithms in this paper work for skew polynomial rings with non-zero derivation, but the complexity estimates would be worse. The algorithms also apply to skew polynomial rings over any base ring, e.g. \(\mathbb {F}(z)\) or a number field, but due to coefficient growth in such settings, their bit-complexity would have to be analysed.

A skew polynomial ring over a finite field without derivation is isomorphic to a ring of linearised polynomials under a trivial isomorphism, and the rings’ evaluation maps agree. Our algorithms could be phrased straightforwardly to work on modules over linearised polynomials. Much literature on Gabidulin codes uses the language of linearised polynomials.

Skew polynomial rings are instances of Ore rings, and some previous work on computing matrix normal forms over Ore rings can be found in [1, 6]. The focus there is when the base ring is \(\mathbb {Q}\) or \(\mathbb {F}(z)\), where coefficient growth is a main concern, e.g. for modelling systems of partial differential equations. These algorithms are slower than ours when the base ring is a finite field. Reference [29] considers a setting more similar to ours, but obtains a different algorithm and slightly worse complexity.

Gabidulin codes [11, 15, 41] are maximum rank distance codes over finite fields; they are the rank-metric analogue of RS codes. An Interleaved Gabidulin code is a direct sum of several Gabidulin codes, similar to Interleaved RS codes. In a synchronised error model, they allow an average-case error-correction capability far beyond half the minimum rank distance [25]. Decoding of Interleaved Gabidulin codes is often formulated as solving a simultaneous “Key Equation” [43]. Over \(\mathbb {F}[x]\) this computational problem is also known as multi-sequence shift-register synthesis [14, 45], simultaneous Padé approximation [4], or vector rational function reconstruction [36]. This problem has further generalisations, some of which have found applications in decoding of algebraic codes, e.g. [34, 42, 49]. In the computer algebra community, Padé approximations have been studied in wide generality, e.g. [5]; to the best of our knowledge, analogous generalisations over skew polynomial rings have yet to see any applications.

Lately, there has been an interest in Gabidulin codes over number fields, with applications to space-time codes and low-rank matrix recovery [3]. Their decoding can also be reduced to a shift-register type problem [30], which could be solved using the algorithms in this paper (though again, one should analyse the bit-complexity).

MV codes [26, 28] are subspace codes whose main interest lies in their property of being list-decodable beyond half the minimum distance. Their rate unfortunately tends to zero with increasing code length. In [27], Mahdavifar and Vardy presented a refined construction which can be decoded “with multiplicities”, allowing a better decoding radius and rate; it is future work to adapt our algorithm to this case. The decoding of MV codes is heavily inspired by the Guruswami–Sudan algorithm for RS codes [17], and our row reduction approach in Sect. 3.2 is similarly inspired by fast module-based algorithms for realising the Guruswami–Sudan algorithm [7, 21].

Another family of rank-metric codes which can be decoded beyond half the minimum distance are Guruswami–Xing codes [19]. These can be seen simply as heavily punctured Gabidulin codes, and their decoding as a virtual interleaving of multiple Gabidulin codes. This leads to a decoder based on solving a simultaneous shift-register equation, where our algorithms apply. Guruswami–Wang codes [18] are Guruswami–Xing codes with a restricted message set, so the same decoding algorithm applies.

Over \(\mathbb {F}[x]\), row reduction, and the related concept of order bases, have been widely studied and sophisticated algorithms have emerged, e.g. [2, 16, 50]. As a follow-up to this work, a skew-analogue of the algorithm in [2] was proposed in [40].

2 Preliminaries

2.1 Skew polynomials

Let \(\mathbb {F}\) be a finite field and \(\theta \) an automorphism of \(\mathbb {F}\). Denote by \(\mathcal{{R}}=\mathbb {F}[x;\theta ]\) the non-commutative ring of skew polynomials over \(\mathbb {F}\) (with zero derivation): elements of \(\mathcal{{R}}\) are of the form \(\sum _i a_i x^i\) with \(a_i \in \mathbb {F}\), addition is as usual, while multiplication is defined by \(xa = \theta (a) x\) for all \(a \in \mathbb {F}\). When we say “polynomial”, we will mean an element of \(\mathcal{{R}}\). The definition of the degree of a polynomial is the same as for ordinary polynomials. See [38] for more details.

The evaluation map of \(a \in \mathcal{{R}}\) is given as:

$$\begin{aligned} a(\cdot ) := ev _a(\cdot ) \,:\, \mathbb {F}&\rightarrow \mathbb {F}\\ \alpha &\mapsto {\textstyle \sum \nolimits _{i}} a_i \theta ^i(\alpha ). \end{aligned}$$

This is a group homomorphism on \((\mathbb {F},+)\), and it is a linear map over the fixed field of \(\theta \). Furthermore, for two \(a,b \in \mathcal{{R}}\) we have \( ev _{ab} = ev _a \circ ev _b\). This is sometimes known as operator evaluation, e.g. [8].

If \(\mathbb {F}_{q}\) is the field fixed by \(\theta \) for some prime power q, then \(\mathbb {F}= \mathbb {F}_{q^s}, s \in \mathbb {Z}_{> 0}\), and \(\theta (a) = a^{q^i}\) for some \(0 \le i < s\), i.e. a power of the Frobenius automorphism of \(\mathbb {F}_{q^s}/\;\mathbb {F}_q\).

Definition 1

For \(a,b,c \in \mathcal{{R}}\), we write \(a \equiv b \mod c\) (right modulo operation) if there exists \(d \in \mathcal{{R}}\) such that \(a = b + d c\).

In complexity estimates we count the total number of the following operations: \(+, -, \cdot , \slash \) and \(\theta ^i\) for any \(i \in \mathbb {Z}_{> 0}\). For computing \(\theta ^i\) the assumption is that the Frobenius automorphism can be applied efficiently in \(\mathbb {F}_{q^s}\); this is reasonable since we can represent \(\mathbb {F}_{q^s}\)-elements using a normal basis over \(\mathbb {F}_q\) (cf. [47, Sect. 2.1.2]): in this case, \(a^q\) for \(a \in \mathbb {F}_{q^s}\) is simply the cyclic shift of a represented as an \(\mathbb {F}_q\)-vector over the normal basis.
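To make these definitions concrete, the following minimal Python sketch (our own illustration; all names are ours, not from the paper) implements \(\mathcal{{R}}= \mathbb {F}_{16}[x;\theta ]\) with \(\theta \) the Frobenius \(a \mapsto a^2\), so that the fixed field is \(\mathbb {F}_2\); a skew polynomial is represented as a list of coefficients with index equal to degree. The final loop checks the composition rule \( ev _{ab} = ev _a \circ ev _b\) stated above.

```python
# A minimal sketch (not the paper's implementation): F = GF(2^4) as 4-bit
# ints with modulus x^4 + x + 1 (0b10011); theta = Frobenius a -> a^2, whose
# fixed field is GF(2). Addition in F is XOR. A skew polynomial in
# R = F[x; theta] is a list of coefficients, index = degree.
S, MOD = 4, 0b10011

def gf_mul(a, b):                      # multiplication in GF(2^4)
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a >> S:
            a ^= MOD
        b >>= 1
    return r

def gf_inv(a):                         # a^(2^S - 2) = a^(-1), for a != 0
    r = 1
    for _ in range(2**S - 2):
        r = gf_mul(r, a)
    return r

def theta_pow(a, i):                   # theta^i(a) = a^(2^i); note theta^S = id
    for _ in range(i % S):
        a = gf_mul(a, a)
    return a

def skew_mul(f, g):                    # product in R: f_i x^i * g_j x^j = f_i theta^i(g_j) x^(i+j)
    res = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            res[i + j] ^= gf_mul(fi, theta_pow(gj, i))
    return res

def skew_eval(f, alpha):               # operator evaluation: sum_i f_i theta^i(alpha)
    acc = 0
    for i, fi in enumerate(f):
        acc ^= gf_mul(fi, theta_pow(alpha, i))
    return acc

a, b = [3, 0, 1], [5, 7]               # a = x^2 + 3, b = 7x + 5
for alpha in range(16):                # check ev_{ab} = ev_a o ev_b
    assert skew_eval(skew_mul(a, b), alpha) == skew_eval(a, skew_eval(b, alpha))
```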

2.2 Skew polynomial matrices

Free modules and matrices over \(\mathcal{{R}}\) behave quite similarly to the \(\mathbb {F}[x]\) case, keeping non-commutativity in mind:

  • Any left sub-module \(\mathcal{{V}}\) of \(\mathcal{{R}}^m\) is free and admits a basis of at most m elements. Any two bases of \(\mathcal{{V}}\) have the same number of elements.

  • The rank of a matrix M over \(\mathcal{{R}}\) is defined as the number of elements in any basis of the left \(\mathcal{{R}}\)-row space of M. The rows of two such matrices \(M, M' \in \mathcal{{R}}^{n \times m}\) generate the same left module if and only if there exists a \(U \in \mathrm {GL}_n(\mathcal{{R}})\) such that \(M = UM'\), where \(\mathrm {GL}_n(\mathcal{{R}})\) denotes the set of invertible \(n \times n\) matrices over \(\mathcal{{R}}\).

These properties follow principally from \(\mathcal{{R}}\) being an Ore ring and therefore left Euclidean, hence left Principal Ideal Domain (PID), hence left Noetherian.Footnote 2 Moreover, \(\mathcal{{R}}\) has a unique left skew fieldFootnote 3 of fractions \(\mathcal {Q}\) from which it inherits its linear algebra properties. See e.g. [10, 13] for more details. In this paper we exclusively use the left module structure of \(\mathcal{{R}}\), and we will often omit the “left” denotation.

We introduce the following notation for vectors and matrices over \(\mathcal{{R}}\): matrices are denoted by capital letters (e.g. V). The ith row of V is denoted by \(\varvec{v}_i\), the jth element of a vector \(\varvec{v}\) is \(v_j\), and \(v_{i,j}\) is the (i, j)th entry of a matrix V. Indices start at 0.

  • The degree of a vector \(\varvec{v}\) is \(\deg \varvec{v}:= \max _{i} \{ \deg v_i \}\) (and \(\deg \varvec{0} = -\infty \)) and the degree of a matrix V is \(\deg V := \sum _{i} \deg \varvec{v}_i \).

  • The max-degree of V is \({{\mathrm{maxdeg}}}V := \max _{i} \{ \deg \varvec{v}_i \} = \max _{i,j} \{ \deg v_{i,j} \}\).

  • The leading position of a non-zero vector \(\varvec{v}\) is \(\mathrm {LP}_{}(\varvec{v}) := \max \{i \, : \, \deg v_i = \deg \varvec{v}\}\), i.e. the rightmost position having maximal degree in the vector. Furthermore, we define the leading term \(\mathrm {LT}(\varvec{v}) := v_{\mathrm {LP}_{}(\varvec{v})}\) and \(\mathrm {LC}(\varvec{v})\) is the leading coefficient of \(\mathrm {LT}(\varvec{v})\).
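As a small illustration (ours, continuing the coefficient-list representation of the sketch in Sect. 2.1), the operators above can be realised as follows; we encode \(\deg \varvec{0}\) as \(-1\) rather than \(-\infty \):

```python
# Sketch of the operators above; a vector over R is a list of coefficient
# lists (index = degree), and deg 0 is encoded as -1 instead of -infinity.
def pdeg(f):                           # degree of a skew polynomial
    return max((i for i, c in enumerate(f) if c), default=-1)

def vdeg(v):                           # deg v = max_i deg v_i
    return max(pdeg(f) for f in v)

def lead_pos(v):                       # LP(v): rightmost position of maximal degree
    d = vdeg(v)
    return max(i for i, f in enumerate(v) if pdeg(f) == d)

def lead_term(v):                      # LT(v) = v_{LP(v)}; its top coefficient is LC(v)
    return v[lead_pos(v)]
```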

2.3 The weak Popov form

Definition 2

A matrix V over \(\mathcal{{R}}\) is in weak Popov form if the leading positions of all its non-zero rows are different.

The following lemma states that the rows of a matrix in weak Popov form are minimal in a certain sense. Its proof is exactly the same as for \(\mathbb {F}[x]\)-modules and is therefore omitted, see e.g. [32].

Lemma 1

Let V be a matrix in weak Popov form, and let \(\mathcal{{V}}\) be the \(\mathcal{{R}}\)-module generated by its rows. Then the non-zero rows of V are a basis of \(\mathcal{{V}}\) and every \(\varvec{u}\in \mathcal{{V}}\) satisfies \(\deg \varvec{u}\ge \deg \varvec{v}\), where \(\varvec{v}\) is the row of V with \(\mathrm {LP}_{}(\varvec{v})=\mathrm {LP}_{}(\varvec{u})\).

We will need to “shift” the relative importance of some columns compared to others. Given a “shift vector” \(\varvec{w}= (w_0,\dots ,w_\ell ) \in \mathbb {Z}_{\ge 0}^{\ell +1}\), define the mapping

$$\begin{aligned} \varPhi _{\varvec{w}} : \mathcal{{R}}^{\ell +1} \rightarrow \mathcal{{R}}^{\ell +1}, \; \varvec{u}= (u_0,\dots ,u_\ell ) \mapsto \Big (u_0 x^{w_0}, \dots , u_\ell x^{w_\ell }\Big ). \end{aligned}$$

It is easy to compute the inverse of \(\varPhi _{\varvec{w}}\) for any vector in \(\varPhi _{\varvec{w}}(\mathcal{{R}}^{\ell +1})\). Note that since the monomials \(x^{w_i}\) are multiplied from the right, applying \(\varPhi _{\varvec{w}}\) will only shift the entry polynomials, and not modify the coefficients. We can extend \(\varPhi _{\varvec{w}}\) to \(\mathcal{{R}}\)-matrices by applying it row-wise.
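In the coefficient-list representation this remark becomes very concrete: \(\varPhi _{\varvec{w}}\) is a pure index shift. A minimal sketch (ours):

```python
def phi_w(v, w):                       # multiply entry i by x^{w_i} from the right:
    return [[0] * wi + list(f) for f, wi in zip(v, w)]   # coefficients just shift up

def phi_w_inv(u, w):                   # inverse, defined on the image of phi_w
    return [f[wi:] for f, wi in zip(u, w)]
```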

Definition 3

For any \(\varvec{w}= (w_0,\dots ,w_\ell ) \in \mathbb {Z}_{\ge 0}^{\ell +1}\), a matrix \(V \in \mathcal{{R}}^{\cdot \times (\ell +1)}\) is in \(\varvec{w}\)-shifted weak Popov form if \(\varPhi _{\varvec{w}}(V)\) is in weak Popov form.

Given some matrix V over \(\mathcal{{R}}\), “transforming V into (\(\varvec{w}\)-shifted) weak Popov form” means to find some W generating the same row space as V and such that W is in (\(\varvec{w}\)-shifted) weak Popov form. We will see in Sect. 4.1 that such W always exist.

Throughout this paper, by “row reduced” we mean “in weak Popov form”.Footnote 4 Similarly, “row reduction” means “transforming into weak Popov form”.

3 Decoding problems in rank-metric and subspace codes

3.1 Interleaved Gabidulin codes: multi-sequence shift registers

It is classical to decode errors in a Gabidulin code by solving a syndrome-based “Key Equation”: that is, a shift-register synthesis problem over \(\mathcal{{R}}\), see e.g. [15]. An Interleaved Gabidulin code is a direct sum of several Gabidulin codes [25], and error-decoding can be formulated as a shift-register synthesis of several sequences simultaneously. A slightly more general notion of shift-register synthesis allows formulating the decoder using the “Gao Key Equation” [47]. Another generalisation accommodates error-and-erasure decoding of some Gabidulin resp. Interleaved Gabidulin codes [23, 47].

All these approaches are instances of the following “Multi-Sequence generalised Linear Skew-Feedback Shift Register” (MgLSSR) synthesis problem:

Problem 1

(MgLSSR) Given skew polynomials \(s_i, g_i \in \mathcal{{R}}\) for \(i=1,\ldots ,\ell \) and non-negative integers \(\gamma _0,\gamma _1,\dots ,\gamma _\ell \in \mathbb {Z}_{\ge 0}\), find skew polynomials \(\lambda , \omega _1,\dots ,\omega _\ell \in \mathcal{{R}}\), with \(\lambda \) of minimal degree, such that the following holds for all \(i=1,\ldots ,\ell \):

$$\begin{aligned} \lambda s_i&\equiv \omega _i \mod g_i \\ \deg \lambda + \gamma _0&> \deg \omega _i + \gamma _i \end{aligned}$$

We show how to solve this problem by row reduction of a particular module basis. The approach is analogous to how the \(\mathbb {F}[x]\)-version of the problem is handled by Rosenkilde in [32], with only a few technical differences due to the non-commutativity of \(\mathcal{{R}}\).

In the sequel we consider a particular instance of Problem 1, so \(\mathcal{{R}}\), \(\ell \in \mathbb {Z}_{> 0}\), and \(s_i, g_i \in \mathcal{{R}}\), \(\gamma _i \in \mathbb {Z}_{\ge 0}\) for \(i=1,\ldots ,\ell \) are arbitrary but fixed. We assume \(\deg s_i \le \deg g_i\) for all i since replacing \(s_i\) by \(s_i \bmod g_i\) yields the same solutions to Problem 1.

Denote by \(\mathcal{{M}}\) the set of all vectors \(\varvec{v}\in \mathcal{{R}}^{\ell +1}\) satisfying the congruence relation, i.e.,

$$\begin{aligned} \mathcal{{M}}:= \big \{(\lambda ,\omega _1,\dots ,\omega _\ell )\in \mathcal{{R}}^{\ell +1} \; \big | \; \lambda s_i \equiv \omega _i \mod g_i \; \forall i=1,\dots ,\ell \big \}. \end{aligned}$$
(1)

Lemma 2

Consider an instance of Problem 1 and \(\mathcal{{M}}\) as in (1). \(\mathcal{{M}}\) with component-wise addition and left multiplication by elements of \(\mathcal{{R}}\) forms a free left module over \(\mathcal{{R}}\). The rows of M form a basis of \(\mathcal{{M}}\), where

$$\begin{aligned} M=\left( \begin{array}{ccccc} 1 &{} s_1 &{} s_2 &{} \dots &{} s_\ell \\ 0 &{} g_1 &{} 0 &{} \dots &{} 0 \\ 0 &{} 0 &{} g_2 &{} \dots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \ddots &{}\vdots \\ 0 &{} 0 &{} 0 &{} \dots &{} g_\ell \\ \end{array}\right) \ . \end{aligned}$$
(2)

Proof

Since \(\mathcal{{M}}\subseteq \mathcal{{R}}^{\ell +1}\), the first half of the statement follows easily since \(\mathcal{{M}}\) is clearly closed under addition and under left multiplication by elements of \(\mathcal{{R}}\). The rows of M form a basis of \(\mathcal{{M}}\) by arguments analogous to the \(\mathbb {F}[x]\) case, cf. [32, Lemma 1]. \(\square \)

Lemma 2 gives a simple description of all solutions of the congruence requirement of Problem 1 in the form of the row space of an explicit matrix M. The following theorem implies that computing the weak Popov form of M is enough to solve Problem 1. The proof is similar to the \(\mathbb {F}[x]\)-case but since there is no convenient reference for it, we give it here for the \(\mathcal{{R}}\) case. The entire strategy is formalised in Algorithm 1.

Theorem 1

Consider an instance of Problem 1, and \(\mathcal{{M}}\) as in (1). Let \(\varvec{w}= (\gamma _0,\dots ,\gamma _\ell ) \in \mathbb {Z}_{\ge 0}^{\ell +1}\). If V is a basis of \(\mathcal{{M}}\) in \(\varvec{w}\)-shifted weak Popov form, the row \(\varvec{v}\) of V with \(\mathrm {LP}_{}(\varPhi _{\varvec{w}}(\varvec{v}))=0\) is a solution to Problem 1.

Proof

By Lemma 2 the row \(\varvec{v}\) satisfies the congruence requirement of Problem 1. For the degree restriction of Problem 1, note that any \(\varvec{u}\in \mathcal{{M}}\) satisfies this restriction if and only if \(\mathrm {LP}_{}(\varPhi _{\varvec{w}}(\varvec{u})) = 0\), since \(\deg u_i + \gamma _i = \deg (\varPhi _{\varvec{w}}(\varvec{u})_i)\). Furthermore, if this is the case, then \(\deg (\varPhi _{\varvec{w}}(\varvec{u})) = \deg u_0 + \gamma _0\). Thus, not only must \(\varvec{v}\) satisfy the degree restriction, but by Lemma 1, \(v_0\) also has minimal possible degree. \(\square \)

[Algorithm 1: solving Problem 1 (MgLSSR) by row reduction]
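The following is a minimal sketch of this strategy (our own illustration; the names are ours). It assumes the coefficient-list helpers from the sketches in Sect. 2 and a routine mulders_storjohann realising the row reduction of Line 2, as sketched in Sect. 4.

```python
def solve_mglssr(s, g, gamma):
    # s = [s_1, ..., s_l] and g = [g_1, ..., g_l] as coefficient lists;
    # gamma = (gamma_0, ..., gamma_l). Returns (lambda, omega_1, ..., omega_l).
    l = len(s)
    M = [[[1]] + s]                                       # row 0: (1, s_1, ..., s_l)
    for i in range(l):                                    # row i+1: g_{i+1} on the diagonal
        M.append([[0]] * (i + 1) + [g[i]] + [[0]] * (l - i - 1))
    V = mulders_storjohann([phi_w(row, gamma) for row in M])
    v = next(row for row in V if vdeg(row) >= 0 and lead_pos(row) == 0)
    return phi_w_inv(v, gamma)                            # the sought minimal solution
```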

The complexity of Algorithm 1 is determined by Line 2. Therefore, in Sects. 4 and 5.1 we analyse how, and at what cost, we can row reduce \(\mathcal{{R}}\)-matrices. In particular, we prove the following statement, where \(\mu := \max _i \{\gamma _i +\deg g_i\}\).

Theorem 2

Algorithm 1 has complexity \({\left\{ \begin{array}{ll} O(\ell \mu ^2), &{}\text {if } g_i = x^{t_i}+c_i, \, t_i \in \mathbb {Z}_{> 0}, \, c_i \in \mathbb {F}\; \forall \, i, \\ O(\ell ^2 \mu ^2), &{}\text {otherwise.} \end{array}\right. }\)

Proof

The first case follows from Theorem 8 in Sect. 5.1, using Algorithm 4 for the row reduction step. For general \(g_i\)'s, the result follows from Example 2 in Sect. 4, which estimates the complexity of Algorithm 3 for a shift-register input. \(\square \)

The above theorem applies well to decoding Gabidulin and Interleaved Gabidulin codes since the \(g_i\) are often of the restricted form: specifically, \(g_i\) is a power of x in syndrome Key Equations, while \(g_i = x^n - 1\) in the Gao Key Equation whenever \(n \mid s\). We therefore achieve the same complexity as [44] but in a wider setting.

3.2 Decoding Mahdavifar–Vardy codes

MV codes [26, 28] are subspace codes constructed by evaluating powers of skew polynomials at certain points. We will describe how one can use row reduction to carry out the most computationally intensive step of the MV decoding algorithm given in [28], the Interpolation step. In this section, \(\mathcal{{R}}= \mathbb {F}_{q^s}[x;\theta ]\) where \(\theta \) is some power of the Frobenius automorphism of \(\mathbb {F}_{q^s}/\,\mathbb {F}_q\).

Problem 2

(Interpolation Step of MV decoding) Let \(\ell , k, s, n \in \mathbb {Z}_{> 0}\) be such that \(\binom{\ell +1}{2}(k-1) < n \le s\). Given \((x_i, y_{i,1},\ldots , y_{i,\ell }) \in \mathbb {F}_{q^s}^{\ell +1}\) for \(i=1,\ldots ,n\), where the \(x_i\) are linearly independent over \(\mathbb {F}_q\), find a non-zero \(\varvec{Q} \in \mathcal{{R}}^{\ell +1}\) satisfying:

$$\begin{aligned}&Q_0(x_i) + \sum _{t=1}^\ell Q_t(y_{i,t}) = 0, \qquad i=1,\ldots ,n, \end{aligned}$$
(3)
$$\begin{aligned}&\deg Q_t < \chi - t(k-1), \qquad t=0,\dots ,\ell , \end{aligned}$$
(4)

where \(\chi \) is given by

$$\begin{aligned} \chi = \left\lceil \frac{n+1}{\ell +1} + \frac{1}{2}\ell (k-1) \right\rceil . \end{aligned}$$

The problem can be solved via a large linear system of equations, whose dimensions reveal that a solution always exists [28, Lemma 8]. Note that the requirement \(n > \binom{\ell +1}{2}(k-1)\) ensures that all the degree bounds (4) are non-negative.

Let \(\mathcal{{M}}\) be the set of all \(\varvec{Q}\) that satisfy (3) though not necessarily (4):

$$\begin{aligned} \textstyle \mathcal{{M}}= \big \{ \varvec{Q} \in \mathcal{{R}}^{\ell +1} \ \big |\ Q_0(x_i) + \sum _{t=1}^\ell Q_t(y_{i,t}) = 0 \quad i=1,\ldots ,n \big \} \end{aligned}$$
(5)

Lemma 3

Consider an instance of Problem 2. Then \(\mathcal{{M}}\) of (5) is a left \(\mathcal{{R}}\)-module.

Proof

\(\mathcal{{M}}\) is closed under addition since \(a(\alpha ) + b(\alpha ) = (a+b)(\alpha )\) for all \(a,b \in \mathcal{{R}}\) and \(\alpha \in \mathbb {F}_{q^s}\). Let \(f \in \mathcal{{R}}\), \(\varvec{Q} = (Q_0,Q_1,\dots ,Q_\ell ) \in \mathcal{{M}}\). Then \(f \cdot \varvec{Q}\) satisfies (3) since

$$\begin{aligned} (f \cdot Q_0)(x_i) + \sum _{t=1}^{\ell } (f \cdot Q_t)(y_{i,t}) = f\left( Q_0(x_i) + \sum _{t=1}^{\ell } Q_t(y_{i,t})\right) = f(0) = 0 \ . \end{aligned}$$

\(\square \)

For explicitly describing a basis of \(\mathcal{{M}}\), we need a few well-known technical elements:

Definition 4

Given \(a_1,\ldots ,a_m \in \mathbb {F}_{q^s}\) which are linearly independent over \(\mathbb {F}_q\), the annihilator polynomial of the \(a_i\) is the monic non-zero \(\mathcal {A}\in \mathcal{{R}}\) of minimal degree such that \(\mathcal {A}(a_i) = 0\) for all i.

It is easy to show that the annihilator polynomial is well-defined and that \(\deg \mathcal {A}= m\), see e.g. [37]. The existence of annihilator polynomials easily leads to the following analogue of Lagrange interpolation:
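The standard iterative construction of \(\mathcal {A}\) can be sketched as follows (our own illustration, reusing skew_mul, skew_eval, theta_pow and gf_inv from the Sect. 2.1 sketch, and assuming the \(a_i\) are indeed \(\mathbb {F}_q\)-linearly independent). It rests on \( ev _{fg} = ev _f \circ ev _g\): if \(\mathcal {A}\) vanishes on \(a_1,\ldots ,a_i\) and \(b = \mathcal {A}(a_{i+1}) \ne 0\), then \((x - \theta (b)b^{-1}) \cdot \mathcal {A}\) vanishes on \(a_1,\ldots ,a_{i+1}\).

```python
def annihilator(points):               # monic A with A(a) = 0 for all given points
    A = [1]                            # the constant polynomial 1
    for a in points:
        b = skew_eval(A, a)            # nonzero since the points are independent
        c = gf_mul(theta_pow(b, 1), gf_inv(b))   # c = theta(b)/b, so theta(b) - c*b = 0
        A = skew_mul([c, 1], A)        # multiply by (x - c); in char. 2, -c = c
    return A                           # deg A = number of points
```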

Lemma 4

(Interpolation polynomial) Given any \(a_1,\dots ,a_m \in \mathbb {F}_{q^s}\) which are linearly independent over \(\mathbb {F}_q\), and arbitrary \(b_1,\dots ,b_m \in \mathbb {F}_{q^s}\), there exists a unique \(R \in \mathcal{{R}}\) of degree at most \(m-1\) such that \(R(a_i) = b_i\) for all \(i=1,\ldots ,m\).

Lemma 5

Consider an instance of Problem 2 and let \(\mathcal{{M}}\) be as in (5). Denote by G the annihilator polynomial of the \(x_i, i=1,\ldots ,n\), and let \(R_t \in \mathcal{{R}}, t=1,\ldots ,\ell \) be the interpolation polynomial with \(R_t(x_i) = y_{i,t}\) for \(i=1,\dots ,n\). The rows of M form a basis of \(\mathcal{{M}}\):

$$\begin{aligned} M = \begin{pmatrix} \varvec{m}_0 \\ \varvec{m}_1 \\ \varvec{m}_2 \\ \vdots \\ \varvec{m}_\ell \end{pmatrix} = \begin{pmatrix} G &{} 0 &{} 0 &{} \dots &{} 0 \\ -R_1 &{} 1 &{} 0 &{} \dots &{} 0 \\ -R_2 &{} 0 &{} 1 &{} \dots &{} 0 \\ \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots \\ -R_\ell &{} 0 &{} 0 &{} \dots &{} 1 \end{pmatrix} \end{aligned}$$
(6)

Proof

“\(\subseteq \)”: We should show that each \(\varvec{m}_t\) “vanishes” at the points \((x_i,y_{i,1},\ldots ,y_{i,\ell })\). Consider such a point; we have two cases:

$$\begin{aligned} \varvec{m}_0&: \quad G(x_i) = 0 \\ \varvec{m}_t&: \quad 1(y_{i,t})-R_t(x_i) = y_{i,t}-R_t(x_i) = 0 \ , \quad t=1,\dots ,\ell \end{aligned}$$

“\(\supseteq \)”: Consider some \(\varvec{Q} = (Q_0,\dots ,Q_\ell ) \in \mathcal{{M}}\). Then we can write

$$\begin{aligned} \varvec{v}_\ell&:= \varvec{Q} \\ \varvec{v}_{\ell -1}&:= \varvec{v}_\ell - v_{\ell ,\ell } \cdot \varvec{m}_\ell = (v_{\ell -1, 0},\ldots ,v_{\ell -1,\ell -1}, 0) \\ \varvec{v}_{\ell -2}&:= \varvec{v}_{\ell -1} - v_{\ell -1,\ell -1} \cdot \varvec{m}_{\ell -1} = (v_{\ell -2,0}, \ldots , v_{\ell -2,\ell -2},0, 0) \\&\; \; \, \vdots \\ \varvec{v}_{0}&:= \varvec{v}_1 - v_{1,1} \cdot \varvec{m}_1 = (v_{0,0}, 0, \dots , 0). \end{aligned}$$

Since \(\varvec{v}_\ell \in \mathcal{{M}}\), and each \(\varvec{m}_t \in \mathcal{{M}}\), we conclude that all the \(\varvec{v}_t \in \mathcal{{M}}\) and in particular \(\varvec{v}_0 \in \mathcal{{M}}\). Thus for any i we must have \(v_{0,0}(x_i) = 0\). This means G must right-divide \(v_{0,0}\): for otherwise, the division would yield a non-zero remainder \(B \in \mathcal{{R}}\) with \(\deg B < \deg G\) but still having \(B(x_i) = 0\), contradicting the minimality of G.

In summary, \(\varvec{v}_0 = f\cdot \varvec{m}_0\) for some \(f \in \mathcal{{R}}\), and hence \(\varvec{Q} = \varvec{v}_\ell \) is an \(\mathcal{{R}}\)-linear combination of the rows of M. \(\square \)

To complete the interpolation step, we need to find an element of \(\mathcal{{M}}\) whose components satisfy the degree constraints (4).

Theorem 3

Consider an instance of Problem 2, and let \(\mathcal{{M}}\) be as in (5). Let \(\varvec{w}= (0,(k-1), \dots , \ell (k-1))\), and V be a basis of \(\mathcal{{M}}\) in \(\varvec{w}\)-shifted weak Popov form. If \(\varvec{v}\) is a row of V with minimal \(\varvec{w}\)-shifted degree, \(\deg \varPhi _{\varvec{w}}(\varvec{v})\), then \(\varvec{v}\) is a solution to Problem 2.

Proof

Any row of V satisfies (3) because it is in \(\mathcal{{M}}\). As previously remarked, there exists some solution \(\varvec{Q} = (Q_0, Q_1,\dots ,Q_\ell ) \in \mathcal{{M}}\) satisfying the degree conditions (4). By the choice of \(\varvec{v}\) and by Lemma 1, then \(\deg \varPhi _{\varvec{w}}(\varvec{v}) \le \deg \varPhi _{\varvec{w}}(\varvec{Q})\). But then if \(t = \mathrm {LP}_{}(\varPhi _{\varvec{w}}(\varvec{Q}))\) we have that for any i:

$$\begin{aligned}&\deg \left( v_i x^{i(k-1)}\right) \le \deg \varPhi _{\varvec{w}}(\varvec{Q}) = \deg \left( Q_t x^{t(k-1)}\right) < \chi \end{aligned}$$

Hence, \(\varvec{v}\) satisfies (4). \(\square \)

This results immediately in the decoding procedure outlined as Algorithm 2.

[Algorithm 2: the interpolation step of MV decoding by row reduction]

Theorem 4

Algorithm 2 has complexity \(O(\ell n^2)\) over \(\mathbb {F}_{q^s}\).

Proof

Computing G can be done straightforwardly in \(O(n^2)\) operations over \(\mathbb {F}_{q^s}\). Each \(R_t\) can be computed at the same cost using a decomposition into smaller interpolations and two annihilator polynomials, see e.g. [39]. For Line 2, we use Algorithm 7, whose complexity is \(O(\ell n^2)\) by Theorem 10. \(\square \)

In [48], Xie, Lin, Yan and Suter present an algorithm for solving the Interpolation Step using a skew-variant of the Kötter–Nielsen–Høholdt algorithm [35] with complexity \(O(\ell ^2 s n)\) over \(\mathbb {F}_{q^s}\). Since \(n \le s\), our algorithm is at least as fast as theirs. Note that these costs probably dominate the complexity of MV decoding: the other step, Root-finding, likelyFootnote 5 has complexity \(O(\ell ^2 k n)\).

4 Row reduction of \(\mathcal{{R}}\)-matrices

4.1 The Mulders–Storjohann algorithm

In this section, we introduce our algorithmic work horse: obtaining row reduced bases of left \(\mathcal{{R}}\)-modules \(\mathcal{{V}}\subseteq \mathcal{{R}}^{m}\). The core is an \(\mathcal{{R}}\)-variant of the Mulders–Storjohann algorithm [31], which was originally described for \(\mathbb {F}[x]\) matrices. The algorithm and its proof of correctness carry over almost unchanged, while a fine-grained complexity analysis is considerably more involved; we return to this in Sect. 4.3.

Definition 5

Applying a simple transformation i on j at position h on a matrix V with \(\deg v_{i,h} \le \deg v_{j,h}\) means to replace \(\varvec{v}_j\) by \(\varvec{v}_j - \alpha x^\beta \varvec{v}_i\), where \(\beta = \deg v_{j,h} - \deg v_{i,h}\) and \(\alpha = \mathrm {LC}(v_{j,h})/\theta ^{\beta }(\mathrm {LC}(v_{i,h}))\).

By a simple LP-transformation i on j, where \(\mathrm {LP}_{}(\varvec{v}_i) = \mathrm {LP}_{}(\varvec{v}_j)\), we will mean a simple transformation i on j at position \(\mathrm {LP}_{}(\varvec{v}_i)\).

Remark 1

Note that a simple transformation i on j at position h cancels the leading term of the polynomial \(v_{j,h}\). Elementary row operations keep the row space and rank of the matrix unchanged, and in particular so does any sequence of simple transformations.

We use the following value function as a “size” of vectors in \(\mathcal{{R}}^{m}\):

$$\begin{aligned} \psi : \mathcal{{R}}^{m}&\rightarrow \mathbb {Z}_{\ge 0}\\ \varvec{v}&\mapsto \left\{ \begin{array}{ll} 0 &{} \text { if } \varvec{v}= \varvec{0} \\ m\deg \varvec{v}+ \mathrm {LP}_{}(\varvec{v}) + 1 &{} \text {otherwise}\\ \end{array} \right. \end{aligned}$$

Lemma 6

For some \(V \in \mathcal{{R}}^{\cdot \times m}\), consider a simple LP-transformation i on j, where \(\varvec{v}_j\) is replaced by \(\varvec{v}'_j\). Then \(\psi (\varvec{v}_j')<\psi (\varvec{v}_j)\).

Proof

The proof works exactly as in the \(\mathbb {F}[x]\) case, cf. [32, Lemma 8]. \(\square \)

[Algorithm 3: the Mulders–Storjohann algorithm over \(\mathcal{{R}}\)]
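A compact sketch of this algorithm (ours, reusing pdeg, vdeg, lead_pos and the field helpers from the sketches in Sect. 2; the base field there has characteristic 2, so subtraction is XOR):

```python
def simple_transform(vi, vj, h):
    # replace v_j by v_j - alpha x^beta v_i, cancelling the leading term of v_{j,h}
    beta = pdeg(vj[h]) - pdeg(vi[h])
    alpha = gf_mul(vj[h][pdeg(vj[h])],
                   gf_inv(theta_pow(vi[h][pdeg(vi[h])], beta)))
    out = []
    for f, g in zip(vi, vj):
        prod = [0] * beta + [gf_mul(alpha, theta_pow(c, beta)) for c in f]
        n = max(len(prod), len(g))
        prod, g = prod + [0] * (n - len(prod)), list(g) + [0] * (n - len(g))
        out.append([x ^ y for x, y in zip(prod, g)])   # char. 2: minus is XOR
    return out

def mulders_storjohann(V):
    while True:
        seen = {}
        for j, row in enumerate(V):
            if vdeg(row) < 0:                  # skip zero rows
                continue
            h = lead_pos(row)
            if h in seen:                      # two rows share a leading position:
                i = seen[h]                    # apply a simple LP-transformation
                if pdeg(V[i][h]) > pdeg(V[j][h]):
                    i, j = j, i                # ensure deg v_{i,h} <= deg v_{j,h}
                V[j] = simple_transform(V[i], V[j], h)
                break
            seen[h] = j
        else:                                  # no clash left: weak Popov form
            return V
```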

Theorem 5

Algorithm 3 is correct.

Proof

By Lemma 6, the \(\psi \)-value of one row of V decreases for each simple LP-transformation. The sum of the values of the rows must at all times be non-negative so the algorithm must terminate. When the algorithm terminates there are no \(i \ne j\) such that \(\mathrm {LP}_{}(\varvec{v}_i)= \mathrm {LP}_{}(\varvec{v}_j)\). That is to say, V is in weak Popov form. \(\square \)

The above proof easily leads to a rough complexity estimate for Algorithm 3 of \(O(m^2 \deg V {{\mathrm{maxdeg}}}V)\), where m is the number of columns of V.

Note that in Algorithm 3 each iteration might present several possibilities for the simple LP-transformation; the above theorem shows that any choice of LP-transformations leads to the correct result.

To transform V into \(\varvec{w}\)-shifted weak Popov form, for some shift \(\varvec{w}\in \mathbb {Z}_{\ge 0}^m\), we let \(V' = \varPhi _{\varvec{w}}(V)\) and apply Algorithm 3 on \(V'\) to obtain \(W'\) in weak Popov form. Since Algorithm 3 only performs row operations, it is clear that \(\varPhi _{\varvec{w}}\) can be inverted on \(W'\) to obtain \(W = \varPhi _{\varvec{w}}^{-1}(W')\). Then W is in \(\varvec{w}\)-shifted weak Popov form by definition.

4.2 The determinant degree and orthogonality defect

The purpose of this section is to introduce the orthogonality defect as a tool for measuring “how far” a square, full-rank matrix over \(\mathcal{{R}}\) is from being in weak Popov form. It relies on the nice properties of the degree of the Dieudonné determinant for matrices over \(\mathcal{{R}}\). The orthogonality defect for \(\mathbb {F}[x]\) matrices was introduced by Lenstra [22] and used in [32] to similar effect as we do here.

Dieudonné introduced a function for matrices over skew fields which shares some of the essential properties of the usual commutative determinant, in particular that it is multiplicative, see [12] or [13, Sect. 20]. This Dieudonné determinant can be applied to matrices over \(\mathcal{{R}}\) by considering \(\mathcal{{R}}\) inside its left field of fractions. The definition of this determinant is quite technical, and we will not actually need to invoke it. Rather, we will use an observation by Taelman [46] that the Dieudonné determinant gives rise to a simply behaved determinant degree function for matrices over \(\mathcal{{R}}\) with very nice properties:

Proposition 1

There is a unique function \(\deg \det :\,\mathcal{{R}}^{m\times m}\rightarrow \mathbb {Z}_{\ge 0} \cup \{-\infty \}\) s.t.:

  • \(\deg \det (A A') = \deg \det (A) + \deg \det (A')\) for all \(A, A' \in \mathcal{{R}}^{m\times m}\).

  • \(\deg \det U = 0\) for all \(U \in GL_m(\mathcal{{R}})\).

  • If A is diagonal with diagonal elements \(d_0,\ldots ,d_{m-1}\), then \(\deg \det A = \sum _{i}\deg d_i\).

Corollary 1

For any \(A,A' \in \mathcal{{R}}^{m\times m}\) the following hold:

  • If \(A'\) is obtained from A by elementary row operations, then \(\deg \det A' = \deg \det A\).

  • If B equals \(A \in \mathcal{{R}}^{m\times m}\) with one row or column scaled by some \(f \in \mathcal{{R}}^*\), then \(\deg \det B = \deg f + \deg \det A\).

  • If A is triangular with diagonal elements \(d_0,\ldots ,d_{m-1}\), then \(\deg \det A = \sum _i\deg d_i\).

  • \(\deg \det (\varPhi _{\varvec{w}}(A)) = \deg \det (A) + \sum _i w_i\) for any shift \(\varvec{w}\).

Example 1

Consider input matrix \(\varPhi _{\varvec{w}}(M)\) to Algorithm 1 for the caseFootnote 6 \(\ell =2\), \(\varvec{w}= (\gamma _0, \gamma _1, \gamma _2) = (100, 42,69)\), \(\deg s_1 = 99\), \(\deg s_2 = 95\) and \(\deg g_1 = \deg g_2 = 100\). Then

$$\begin{aligned} \varPhi _{\varvec{w}}(M) = \left( \begin{array}{ccc} x^{100} &{} s_1 x^{42} &{} s_2 x^{69} \\ &{} g_1 x^{42} &{} \\ &{} &{} g_2 x^{69} \end{array}\right) , \end{aligned}$$

and so by Proposition 1, \(\deg \det \varPhi _{\varvec{w}}(M) = \deg g_1 + \deg g_2 + \sum _i \gamma _i = 411\).

This description of \(\deg \det (\cdot )\) is not operational in the sense that it is not clear how to compute \(\deg \det V\) for general \(V \in \mathcal{{R}}^{m\times m}\). The following definition and Proposition 2 imply that Algorithm 3 can be used to compute \(\deg \det V\); conversely, we show in Sect. 4.3 how to bound the complexity of Algorithm 3 based on \(\deg \det V\).

Definition 6

The orthogonality defect of \(V \in \mathcal{{R}}^{m\times m}\) is \({\Delta (V)} := \deg V - \deg \det V\).

The following observations are easy for \(\mathbb {F}[x]\) matrices, but require more work over \(\mathcal{{R}}\):

Proposition 2

Let \(V \in \mathcal{{R}}^{m \times m}\) of full rank and in weak Popov form. Then \({\Delta (V)} = 0\).

Proof

Due to Corollary 1, we can assume the columns and rows of V are ordered such that \(\mathrm {LP}_{}(\varvec{v}_i) = i\) and \(\deg v_{i,i} \le \deg v_{j,j}\) for \(i < j\). We will call this property “ordered weak Popov form” in this proof. Note that it implies \(\psi (\varvec{v}_i) < \psi (\varvec{v}_j)\) for \(i < j\). We will inductively obtain a series of matrices \(V^{(0)} = V, V^{(1)}, V^{(2)}, \ldots , V^{(m)}\) each in ordered weak Popov form, and such that the first i columns of \(V^{(i)}\) are zero below the diagonal. Then \(V^{(m)}\) is upper triangular and we can obtain two expressions for its \(\deg \det \).

So assume that \(V^{(i)}\) is in ordered weak Popov form and its first i columns are zero below the diagonal. Recall that the (left) union, i.e. the least common left multiple, of two skew polynomials \(f, g \in \mathcal{{R}}\) is the unique lowest-degree \(p \in \mathcal{{R}}\) such that \(p = af = bg\) for some \(a,b \in \mathcal{{R}}\); it is a consequence of the Euclidean algorithm that the union always exists, see e.g. [38]. For each \(j > i\) consider now the coefficients in the union of \(v^{(i)}_{i,i}\) and \(v^{(i)}_{j,i}\), i.e. \(a^{(i)}_j,b^{(i)}_j \in \mathcal{{R}}\) such that \(a^{(i)}_j v^{(i)}_{i,i} = b^{(i)}_j v^{(i)}_{j,i}\). Let

$$\begin{aligned} V^{(i+1)} = \left( \begin{array}{@{}c|cccc@{}} \ I_{i-1}\ &{} \\ \hline &{} 1 \\ &{} -a^{(i)}_{i+1} &{} b^{(i)}_{i+1} \\ &{} \vdots &{} &{} \ddots \\ &{} -a^{(i)}_{m-1} &{} &{} &{} b^{(i)}_{m-1} \end{array}\right) V^{(i)} \ , \end{aligned}$$

where \(I_{i-1}\) is the \((i-1) \times (i-1)\) identity matrix. The first \(i+1\) columns of \(V^{(i+1)}\) are then zero below the diagonal. Also \(\mathrm {LP}_{}(a^{(i)}_j\varvec{v}^{(i)}_i) < \mathrm {LP}_{}(b^{(i)}_j\varvec{v}^{(i)}_{j}) = \mathrm {LP}_{}(\varvec{v}^{(i)}_j)\) and \(\deg (a^{(i)}_j\varvec{v}^{(i)}_i) \le \deg (b^{(i)}_j\varvec{v}^{(i)}_{j})\) for \(j > i\), which means \(\psi (a^{(i)}_j\varvec{v}^{(i)}_i) < \psi (b^{(i)}_j\varvec{v}^{(i)}_{j})\) and therefore \(\psi (\varvec{v}^{(i+1)}_j) = \psi (b^{(i)}_j\varvec{v}^{(i)}_{j})\). This implies that \(V^{(i+1)}\) is in ordered weak Popov form and that \(\deg v^{(i+1)}_{j,j} = \deg b^{(i)}_j + \deg v^{(i)}_{j,j}\) for \(j > i\), which inductively expands to

$$\begin{aligned} \deg v^{(i+1)}_{j,j} = \deg v_{j,j} + \sum _{h=0}^i \deg b^{(h)}_j \ . \end{aligned}$$

Inductively, we therefore arrive at an upper triangular matrix \(V^{(m)}\) in ordered weak Popov form, and whose diagonal elements satisfy \(\deg v^{(m)}_{j,j} = \deg v_{j,j} + \sum _{i=0}^{j-1} \deg b^{(i)}_j\). Thus \(\deg \det V^{(m)}\) is the sum of all these degrees by Corollary 1. On the other hand \(V^{(m)}\) is obtained by multiplying triangular matrices on \(V^{(0)} = V\), so by Proposition 1 we get another expression for \(\deg \det V^{(m)}\) as:

$$\begin{aligned} \deg \det V^{(m)} = \deg \det V + \sum _{i=0}^{m-1} \sum _{j=i+1}^{m-1} \deg b^{(i)}_j \end{aligned}$$

Combining the expressions, we get \(\deg \det V = \sum _{i=0}^{m-1} \deg v_{i,i} = \deg V\). \(\square \)

Corollary 2

Let \(V \in \mathcal{{R}}^{m\times m}\) be of full rank; then \(0 \le \deg \det V \le \deg V\).

Proof

Applying Algorithm 3 on V would use row operations to obtain a matrix \(V' \in \mathcal{{R}}^{m\times m}\) in weak Popov form. Then \(\deg \det V = \deg \det V'\) by Proposition 1. By Proposition 2 then \(\deg \det V' = \deg V' \ge 0\), and by the nature of Algorithm 3, then \(\deg V' \le \deg V\). \(\square \)

4.3 Complexity of Mulders–Storjohann

We can now bound the complexity of Algorithm 3 using arguments completely analogous to the \(\mathbb {F}[x]\) case in [32]. These are, in turn, the original arguments of [31], made finer grained by using the orthogonality defect. We give the full proof here since its main steps are referred to in Sect. 5.1.

Theorem 6

Algorithm 3 with a full-rank input matrix \(V \in \mathcal{{R}}^{m \times m}\) performs at most \(m \big ({\Delta (V)} + m \big )\) simple LP-transformations, and it has complexity \(O(m^2 {\Delta (V)} {{\mathrm{maxdeg}}}(V))\) over \(\mathbb {F}\).

Proof

By Lemma 6, every simple LP-transformation reduces the \(\psi \)-value of one row by at least 1. So the number of possible simple LP-transformations is upper bounded by the difference between the values of the input matrix V and the output matrix U, the value of a matrix being the sum of the values of its rows. More precisely, the number of iterations is upper bounded by:

$$\begin{aligned}&{\textstyle \sum _{i=0}^{m-1}} \left[ m\deg \varvec{v}_i+\mathrm {LP}_{}(\varvec{v}_i)-\big ( m\deg \varvec{u}_i+\mathrm {LP}_{}(\varvec{u}_i) \big )\right] \\&\quad \le m^2 + m {\textstyle \sum _{i=0}^{m-1}} \left[ \deg \varvec{v}_i-\deg \varvec{u}_i\right] \\&\quad = m [\deg V-\deg U+m] = m ({\Delta (V)}+m) , \end{aligned}$$

where the last equality follows from \(\deg U = \deg \det U\) due to Proposition 2 and \(\deg \det U = \deg \det V\).

One simple transformation consists of calculating \(\varvec{v}_j - \alpha x^\beta \varvec{v}_i\), so for every coefficient in \(\varvec{v}_i\), we must apply \(\theta ^\beta \), multiply by \(\alpha \) and then add it to a coefficient in \(\varvec{v}_j\), each being in O(1). Since \(\deg \varvec{v}_j \le {{\mathrm{maxdeg}}}(V)\) this costs \(O(m{{\mathrm{maxdeg}}}(V))\) operations in \(\mathbb {F}\). \(\square \)

Since \({\Delta (V)} \le \deg V\), the above complexity bound is always at least as good as the straightforward bound we mentioned at the end of Sect. 4.1.

Example 2

Consider an instance of Problem 1. The complexity of Algorithm 1 is determined by a row reduction of

$$\begin{aligned} \varPhi _{\varvec{w}}(M) = \left( \begin{array}{ccccc} x^{\gamma _0} &{} s_1 x^{\gamma _1} &{} s_2 x^{\gamma _2} &{} \dots &{} s_\ell x^{\gamma _\ell } \\ &{} g_1 x^{\gamma _1} &{} \\ &{} &{} g_2 x^{\gamma _2} &{} \\ &{} &{} &{} \ddots &{} \\ &{} &{} &{} &{} g_\ell x^{\gamma _\ell } \\ \end{array}\right) . \end{aligned}$$
(7)

Let \(\mu := \max _i \{\gamma _i +\deg g_i\}\). We can assume that \(\gamma _0 < \max _{i \ge 1} \{ \gamma _i + \deg s_i \} \le \mu \) since otherwise M is already in \(\varvec{w}\)-shifted weak Popov form. To apply Theorem 6, we calculate the orthogonality defect of \(\varPhi _{\varvec{w}}(M)\). Since it is upper triangular, the degree of its determinant is

$$\begin{aligned} \deg \det \varPhi _{\varvec{w}}(M) = \sum _{i=1}^\ell \deg g_i + \sum _{i=0}^\ell \gamma _i \ . \end{aligned}$$

The degrees of the rows of \(\varPhi _{\varvec{w}}(M)\) satisfy

$$\begin{aligned} \deg \varPhi _{\varvec{w}}(\varvec{m}_0)&= \max _i \{\gamma _i + \deg s_i\} \le \mu \ , \\ \deg \varPhi _{\varvec{w}}(\varvec{m}_i)&= \gamma _i + \deg g_i&\text {for } i \ge 1. \end{aligned}$$

Thus, \({\Delta (\varPhi _{\varvec{w}}(M))} \le \mu - \gamma _0\). With \({{\mathrm{maxdeg}}}(\varPhi _{\varvec{w}}(M)) \le \mu \), Theorem 6 implies a complexity of \(O(\ell ^2 \mu ^2)\), assuming \(\ell \in O(\mu )\). Note that the straightforward bound on Algorithm 3 yields \(O(\ell ^3 \mu ^2)\). \(\square \)

Example 3

(Mulders–Storjohann for the Interpolation Step in decoding MV codes) Line 2 of Algorithm 2 is a row reduction of \(\varPhi _{\varvec{w}}(M)\), as defined in (6), where the degrees of the non-zero entries are bounded component-wise by:

$$\begin{aligned} \begin{pmatrix} n &{} &{} &{} &{} &{}\\ n &{} (k-1) &{} &{} &{}\\ n &{} &{} 2(k-1) &{} &{}\\ \vdots &{} &{} &{} \ddots &{} \\ n &{} &{} &{} &{} \ell (k-1) \end{pmatrix} \end{aligned}$$

Thus \({\Delta (\varPhi _{\varvec{w}}(M))} \le \deg \varPhi _{\varvec{w}}(M) - n - \binom{\ell +1}{2}(k-1) \le \ell n\). Using Theorem 6, the complexity in operations over \(\mathbb {F}_{q^s}\) becomes \(O(\ell ^3 n^2)\).

5 Faster row reduction on matrices having special forms

In this section, we will investigate improved row reduction algorithms for matrices of special forms. The main goals are to improve the running time of row reducing the matrices appearing in the decoding settings of Sects. 3.1 and 3.2, but the results here apply more broadly.

5.1 Shift register problems: the demand-driven algorithm

Our first focus is to improve the MgLSSR case of Algorithm 1, where we are to row reduce \(\varPhi _{\varvec{w}}(M)\) as given by (7): Algorithm 4 is a refinement of Algorithm 3 which is asymptotically faster when all \(g_i\) are of the form \(x^{d_i} + a_i\) for \(a_i \in \mathbb {F}\). Though the refinement is completely analogous to that of [32] for the \(\mathbb {F}[x]\) case, no complete proof has appeared in unabridged, peer-reviewed form before, so we give full proofs for the \(\mathcal{{R}}\) case here. We begin with a technical lemma:

Lemma 7

Consider an instance of Problem 1 and Algorithm 3 with input \(\varPhi _{\varvec{w}}(M)\) of (7). Let \(\tilde{g}_j = g_j x^{\gamma _j}\). Consider a variant of Algorithm 3 where, after a simple LP-transformation i on j, which replaces \(\varvec{v}_j\) with \(\varvec{v}'_j\), we instead replace it with \(\varvec{v}''_j = (v'_{j,0}, v'_{j,1} \bmod \tilde{g}_1, \ldots , v'_{j,\ell } \bmod \tilde{g}_\ell )\). This does not change the correctness of the algorithm or the upper bound on the number of simple LP-transformations performed.

Proof

Correctness follows if we can show that each of the \(\ell \) modulo reductions could have been achieved by a series of row operations on the current matrix V after the simple LP-transformation producing \(\varvec{v}_j'\). For each \(h \ge 1\), let \(\varvec{g}_h = (0,\ldots ,0,\tilde{g}_h,0,\ldots ,0)\), with position h non-zero.

During the algorithm, we will let \(J_h\) be a subset of the current rows in V having two properties: that \(\varvec{g}_h\) can be constructed as an \(\mathcal{{R}}\)-linear combination of the rows in \(J_h\); and that each \(\varvec{v} \in J_h\) has \( \psi (\varvec{v}) \le \psi (\varvec{g}_h)\). Initially, \(J_h = \{ \varvec{g}_h \}\).

After simple LP-transformations on rows not in \(J_h\), the h’th modulo reduction is therefore allowed, since \(\varvec{g}_h\) can be constructed by the rows in \(J_h\). On the other hand, consider a simple LP-transformation i on j where \(\varvec{v}_j \in J_h\), resulting in the row \(\varvec{v}_j'\). Then the h’th modulo reduction has no effect since \(\psi (\varvec{v}_j') < \psi (\varvec{v}_j) \le \psi (\varvec{g}_h)\). Afterwards, \(J_h\) is updated as \(J_h = J_h \setminus \{ \varvec{v}_j \} \cup \{ \varvec{v}_j', \varvec{v}_i \}\). We see that \(J_h\) then still satisfies the two properties, since \(\psi (\varvec{v}_i) \le \psi (\varvec{v}_j) \le \psi (\varvec{g}_h)\).

Since \(\psi (\varvec{v}''_j) \le \psi (\varvec{v}'_j)\) the proof of Theorem 6 shows that the number of simple LP-transformations performed is still bounded by \((\ell +1)({\Delta (V)}+\ell +1)\). \(\square \)

[Algorithm 4: demand-driven row reduction for MgLSSR instances]

Theorem 7

Algorithm 4 is correct.

Proof

We first prove that an intermediary algorithm, Algorithm 5, is correct using the correctness of Algorithm 3, and then prove the correctness of Algorithm 4 using Algorithm 5 and Lemma 7. Starting from Algorithm 3 with input \(\varPhi _{\varvec{w}}(M)\), Algorithm 5 is obtained by two simple modifications: firstly, note that initially, when \(V := \varPhi _{\varvec{w}}(M)\), then \(\mathrm {LP}_{}(\varvec{v}_h) = h\) for \(h \ge 1\), and therefore the only possible simple LP-transformation must involve \(\varvec{v}_0\). We can maintain this property as a loop invariant throughout the algorithm by swapping \(\varvec{v}_0\) and \(\varvec{v}_{\mathrm {LP}_{}(\varvec{v}_0)}\) when applying a simple LP-transformation \(\mathrm {LP}_{}(\varvec{v}_0)\) on 0.

The second modification is to maintain \((\eta , h)\) as an upper bound on the \((\deg , \mathrm {LP})\) of \(\varvec{v}_0\) throughout the algorithm: we initially simply compute these values. Whenever we have applied a simple LP-transformation on \(\varvec{v}_0\) resulting in \(\varvec{v}_0'\), we know by Lemma 6 that \(\psi (\varvec{v}_0') < \psi (\varvec{v}_0)\). Therefore, either \(\deg \varvec{v}_0' < \eta \) or \(\deg \varvec{v}_0' = \eta \wedge \mathrm {LP}_{}(\varvec{v}_0') < h\). This is reflected in a corresponding decrement of \((\eta , h)\).

As a loop invariant we therefore have \(\psi (\varvec{v}_0) \le \eta (\ell +1) + h\). After an iteration, if this inequality is sharp, it simply implies that the \(\alpha \) computed in the following iteration will be 0, and \((\eta , h)\) will be correspondingly decremented once more. Note that we never set \(h = 0\): when \(\mathrm {LP}_{}(\varvec{v}_0) = 0\) then V must be in weak Popov form (since we already maintain \(\mathrm {LP}_{}(\varvec{v}_h) = h\) for \(h>0\)). At this point, the while-loop will be exited since \(\deg v_0 > \eta \).

[Algorithm 5: intermediary algorithm between Algorithms 3 and 4]

Algorithm 5 is then simply the implementation of these modifications, writing out in full what the simple LP-transformation does to \(\varvec{v}_0\). This proves that Algorithm 5 is operationally equivalent to Algorithm 3 with input \(\varPhi _{\varvec{w}}(M)\).

For obtaining Algorithm 4 from Algorithm 5, the idea is to store only the necessary part of V and compute the rest on demand. Firstly, by Lemma 7 correctness would be maintained if the simple LP-transformation on Line 9 of Algorithm 5 was followed by the \(\ell \) modulo reductions. In that case, we would have \(v_{0,h} = (v_{0,0}\tilde{s}_h \bmod \tilde{g}_h)\), so storing only \(v_{0,0}\) suffices for reconstructing \(\varvec{v}_0\). Consequently we store the first column of V in Algorithm 4 as \((\lambda _0,\ldots ,\lambda _\ell )\). Line 6 of Algorithm 4 is now the computation of the needed coefficient of \(v_{0,h}\) at the latest possible time.

As \(\deg \varvec{v}_h\) is used in Line 6 of Algorithm 5, we need to store and maintain it between iterations; these are the variables \(\eta _1,\ldots ,\eta _\ell \). To save some redundant computation of coefficients, the \(x^{\eta _h}\)-coefficient of \(v_{h,h}\) is also stored as \(\alpha _h\).

This proves that Algorithm 4 is operationally equivalent to Algorithm 5, which finishes the proof of correctness. \(\square \)

Proposition 3

Algorithm 4 has computational complexity \(O(\ell \mu ^2 + \sum _{h=1}^\ell \sum _{\eta =0}^{\mu -1} T_{h,\eta })\), where \(\mu = \max _i \{\gamma _i +\deg g_i\}\) and \(T_{h,\eta }\) bounds the complexity of running Line 6 for those values of h and \(\eta \).

Proof

Clearly, all steps of the algorithm are essentially free except Lines 6 and 9. Observe that every iteration of the while-loop decreases an upper bound on the value of row 0, whether we enter the if-branch in Line 7 or not. So by the arguments of the proof of Theorem 6, the loop will iterate at most \(O(\ell \mu )\) times, during which each possible value of \((h, \eta ) \in \{1,\ldots ,\ell \} \times \{0,\ldots ,\mu -1\}\) is taken at most once. Each execution of Line 9 costs \(O(\mu )\) since the \(\lambda _j\) all have degree at most \(\mu \). \(\square \)

It is possible to use Proposition 3 to show that Algorithm 4 is efficient if e.g. all the \(g_i\) have few non-zero monomials.Footnote 7 We will restrict ourselves to a simpler case which nonetheless has high relevance for coding theory:

Theorem 8

Algorithm 4 can be realised with complexity \(O(\ell \mu ^2)\) if \(g_i = x^{d_i} + a_i\) for \(a_i \in \mathbb {F}_q\) for all i, where \(\mu = \max _i \{\gamma _i +\deg g_i\}\).

Proof

We will bound \(\sum _{\eta =0}^{\mu -1} T_{h,\eta }\) of Proposition 3. Note first that for any \(\eta \), the coefficient \(\alpha \) to \(x^\eta \) in \((\lambda \tilde{s}_h \bmod \tilde{g}_h)\) equals the coefficient to \(x^{\eta -\gamma _h}\) of \((\lambda s_h \bmod g_h)\), so considering \(\gamma _h = 0\) suffices. Now if \(\eta \ge d_h\) then \(\alpha = 0\) and can be returned immediately. If \(\eta < d_h\), then due to the assumed shape of the \(g_i\), \(\alpha \) is a linear combination of the coefficients to \(x^{\eta }, x^{\eta +d_h}, \ldots , x^{\eta + t d_h}\) in \(\lambda _0 s_h\), where \(t = \left\lfloor \frac{\mu - \eta }{d_h} \right\rfloor \). Each such coefficient can be computed by convolution of \(\lambda _0\) and \(s_h\) in \(O(\mu )\), so it costs \(O(\mu ^2/d_h)\) to compute \(\alpha \). Summing over all choices of \(\eta \), we have \(\sum _{\eta =0}^{\mu -1} T_{h,\eta } \in O(\mu ^2)\) and the theorem follows from Proposition 3. \(\square \)
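The coefficient computation in this proof can be sketched as follows (our own illustration, reusing gf_mul and theta_pow from the Sect. 2.1 sketch; characteristic 2, so \(-a = a\), and we take \(\gamma _h = 0\) and \(\eta < d\) as in the proof):

```python
def prod_coeff(lam, s, m):             # x^m-coefficient of the skew product lam * s
    c = 0
    for i, li in enumerate(lam):       # (lam * s)_m = sum_i lam_i theta^i(s_{m-i})
        if 0 <= m - i < len(s):
            c ^= gf_mul(li, theta_pow(s[m - i], i))
    return c

def coeff_mod(lam, s, eta, d, a):
    # x^eta-coefficient of (lam*s mod x^d + a), with a in the fixed field and
    # eta < d: x^(eta + t*d) = (-a)^t x^eta mod (x^d + a), so sum the convolutions
    c, at, m = 0, 1, eta
    while m <= len(lam) + len(s) - 2:  # each prod_coeff call is one O(mu) convolution
        c ^= gf_mul(at, prod_coeff(lam, s, m))
        at, m = gf_mul(at, a), m + d
    return c
```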

5.2 Weak Popov walking

The goal of this section is to arrive at a faster row reduction algorithm for the matrices used for decoding MV codes in Sect. 3.2. However, the algorithm we describe could be of much broader interest: it is essentially an improved way of computing a \(\varvec{w}\)-shifted weak Popov form of a matrix which is already in \(\varvec{w}'\)-shifted weak Popov form, for a shift \(\varvec{w}'\) which is not too far from \(\varvec{w}\). Inspired by “Gröbner walks”, we have dubbed this strategy “weak Popov walking”. Each “step” of the walk can be seen as Algorithm 3 where we carefully choose which LP-transformations to apply in each iteration, whenever there is a choice.

This strategy would work completely equivalently for the \(\mathbb {F}[x]\) case. However, to the best of our knowledge, that has not been done before.

In this section we will extensively discuss vectors under different shifts. To ease the notation we therefore introduce shifted versions of the following operators: \(\mathrm {LP}_{\varvec{w}}(\varvec{v}) := \mathrm {LP}_{}(\varPhi _{\varvec{w}}(\varvec{v}))\) as well as \(\deg _{\varvec{w}}(\varvec{v}) := \deg \varPhi _{\varvec{w}}(\varvec{v})\).

We begin with Algorithm 6, which efficiently “walks” from a weak Popov form according to the shift \(\varvec{w}\) into one with the shift \(\varvec{w}+ (1,0,\ldots ,0)\). The approach can readily be generalised to support an increment on any index, but we do not need this for the decoding problem, so we omit the generalisation to simplify notation.

[Algorithm 6: weak Popov walking]

Theorem 9

Algorithm 6 is correct.

Proof

In this proof, denote by V the input and by \(\hat{V}\) the output of the algorithm. The algorithm performs a single sweep of simple transformations, modifying only rows indexed by I: in particular, if \(\varvec{v}_i,{\varvec{\hat{v}}}_i\) are the rows of V respectively \(\hat{V}\), then either \({\varvec{\hat{v}}}_i = \varvec{v}_i\), or \({\varvec{\hat{v}}}_i\) is the result of a simple transformation on \(\varvec{v}_i\) by another row \(\varvec{v}_j\) of V with \(i,j \in I\). All the \(h_i\) are different since V is in \(\varvec{w}\)-shifted weak Popov form. We will show that the \({\varvec{\hat{w}}}\)-shifted leading positions of \(\hat{V}\) are a permutation of the \(h_i\), implying that \(\hat{V}\) is in \({\varvec{\hat{w}}}\)-shifted weak Popov form.

Note first that for any vector \(\varvec{v}\in \mathcal{{R}}^m\), if \(\mathrm {LP}_{\varvec{w}}(\varvec{v}) \ne \mathrm {LP}_{{\varvec{\hat{w}}}}(\varvec{v})\) then \(\mathrm {LP}_{{\varvec{\hat{w}}}}(\varvec{v}) = 0\), since only the degree of the 0th position of \(\varPhi _{\varvec{\hat{w}}}(\varvec{v})\) differs from the corresponding position of \(\varPhi _{\varvec{w}}(\varvec{v})\). For each \(i \in \{0,\ldots ,m-1\} \setminus I\) we have \({\varvec{\hat{v}}}_i = \varvec{v}_i\) and so \(\mathrm {LP}_{{\varvec{\hat{w}}}}({\varvec{\hat{v}}}_i) = h_i\). And of course for each \(i \in I\) we have \(\mathrm {LP}_{{\varvec{\hat{w}}}}(\varvec{v}_i) = 0\). This implies for each \(j \in I\) that:

$$\begin{aligned} \deg v_{j,0} + w_0 = \deg v_{j,h_j} + w_{h_j} \ . \end{aligned}$$
(8)

Consider first an index \(i \in I\) for which Line 9 was run, and let t be as at that point. This means \({\varvec{\hat{v}}}_i = \varvec{v}_i + \alpha x^\delta \varvec{v}_t\) for some \(\alpha \in \mathbb {F}\) and \(\delta = \deg v_{i,0} - \deg v_{t,0}\). Note that the if-condition ensures \(\delta \ge 0\), so the simple transformation makes sense. We will establish that \(\mathrm {LP}_{{\varvec{\hat{w}}}}({\varvec{\hat{v}}}_i) = h_i\). Since we are performing an LP-transformation, we know that \(\deg _{\varvec{\hat{w}}}{\varvec{\hat{v}}}_i \le \deg _{\varvec{\hat{w}}}\varvec{v}_i\), so we are done if we can show that \(\deg \hat{v}_{i,h_i} = \deg v_{i,h_i}\) and \(\deg \hat{v}_{i, k} + w_k < \deg v_{i,h_i} + w_{h_i}\) for \(k > h_i\). This in turn will follow if \(\alpha x^\delta \varvec{v}_t\) has \({\varvec{\hat{w}}}\)-weighted degree less than \(\deg v_{i,h_i} + w_{h_i}\) at all positions \(k \ge h_i\).

Due to \(\mathrm {LP}_{\varvec{w}}(\varvec{v}_t) = h_t\) and (8) for index t then for any \(k > h_t\):

$$\begin{aligned} \deg v_{t, k} + w_k < \deg v_{t, h_t} + w_{h_t} = \deg v_{t,0} + w_0 \ . \end{aligned}$$
(9)

Using \(\deg v_{t,0} + \delta = \deg v_{i,0}\) and (8) for index i, we conclude that

$$\begin{aligned} \deg v_{t, k} + w_k + \delta < \deg v_{i,0} + w_{0} = \deg v_{i,h_i} + w_{h_i} \ . \end{aligned}$$

Since \(h_t < h_i\) by the ordering of the \(i_\star \), this shows that \(\deg v_{i, k} + w_k + \delta < \deg v_{i,h_i} + w_{h_i}\) for \(k \ge h_i\). These are the degree bounds we sought and so \(\mathrm {LP}_{{\varvec{\hat{w}}}}({\varvec{\hat{v}}}_i) = h_i\).

Consider now an \(i \in I\) for which Line 9 was run, and let again t be as at that point, before the reassignment. The situation is exactly the reverse of the previous case, so by analogous arguments \(\mathrm {LP}_{{\varvec{\hat{w}}}}({\varvec{\hat{v}}}_t) = h_i\).

For the value of t at the end of the algorithm, then clearly \(\mathrm {LP}_{{\varvec{\hat{w}}}}({\varvec{\hat{v}}}_t) = 0\) since the row was not modified. Since we necessarily have \(h_{i_1} = 0\), then \(\mathrm {LP}_{{\varvec{\hat{w}}}}({\varvec{\hat{v}}}_t) = h_{i_1}\). Thus every \(h_i\) becomes the \({\varvec{\hat{w}}}\)-leading position of one of the \(\varvec{v}_j\) exactly once. But the \(h_i\) were all different, and so \(\hat{V}\) is in \({\varvec{\hat{w}}}\)-shifted weak Popov form. \(\square \)

Proposition 4

Algorithm 6 performs at most

$$\begin{aligned} \textstyle O\big (m\deg \det (V) + \sum _{i < j}|w_i - w_j| + m^2\big ) \end{aligned}$$

operations over \(\mathbb {F}\).

Proof

We will bound the number of non-zero monomials which are involved in simple transformations. As remarked in the proof of Theorem 9, all simple transformations are done using distinct rows of the input matrix, so it suffices to bound the total number of monomials in the input matrix V.

Since we are then simply counting monomials in V, we can assume w.l.o.g. that \(w_0 \le w_1 \le \ldots \le w_{m-1}\), and since the input matrix V was in \(\varvec{w}\)-shifted weak Popov form, assume also w.l.o.g. that we have sorted the rows such that \(\mathrm {LP}_{\varvec{w}}(\varvec{v}_i) = i\). Since \({\Delta (\varPhi _{\varvec{w}}(V))} = 0\) we have

$$\begin{aligned} \deg \det \varPhi _{\varvec{w}}(V) = \deg _{\varvec{w}} V \qquad \text { that is }\qquad \textstyle \deg _{\varvec{w}} V = \deg \det V + \sum _i w_i \ . \end{aligned}$$

We can therefore consider the assignment of \(\deg _{\varvec{w}}\) to the individual rows of V which, under these constraints, maximises the possible number of monomials in V. We cannot have \(\deg _{\varvec{w}} \varvec{v}_i < w_i\) since \(\mathrm {LP}_{\varvec{w}}(\varvec{v}_i) = i\). It is easy to see that the worst-case assignment is then to have exactly \(\deg _{\varvec{w}} \varvec{v}_i = w_i\) for \(i=0,\ldots ,m-2\) and \(\deg _{\varvec{w}} \varvec{v}_{m-1} = \deg \det V + w_{m-1}\). In this case, for \(i < m-1\) then \(\deg v_{i,j} \le w_i - w_j\) if \(j \le i\) and \(v_{i,j} = 0\) if \(j > i\), so the number of monomials can then be bounded as

$$\begin{aligned}&\left( \sum _{i=0}^{m-2} \sum _{j=0}^{i}(w_i - w_j + 1) \right) + \left( \sum _{j=0}^{m-1}(\deg \det V + w_{m-1} - w_j + 1) \right) \\&\qquad \le m^2 + \sum _{i < j}(w_j - w_i) + m\deg \det V. \end{aligned}$$

\(\square \)

The idea is now to iterate Algorithm 6 to “walk” from a matrix in weak Popov form for one shift to one for another shift \({\varvec{\hat{w}}}\). Row reduction of the matrix for the MV codes can then be done as in Algorithm 7.

[Algorithm 7: row reduction for MV interpolation by iterated weak Popov walking]

Theorem 10

Algorithm 7 is correct. It has complexity \(O(\ell n^2)\) over \(\mathbb {F}_{q^s}\).

Proof

Note that M is in \(\varvec{w}'\)-shifted weak Popov form, where \(\varvec{w}'\) is as on Line 2. Thus by the correctness of Algorithm 6, then V at the end of the algorithm must be in \(\big (\varvec{w}+ (n, \ldots , n)\big )\)-shifted weak Popov form. Then it is clearly also in \(\varvec{w}\)-shifted weak Popov form. For the complexity, the algorithm simply performs n calls to Algorithm 6. We should estimate the quantity \(\sum _{i < j}|w_i - w_j|\), which is greatest in the first iteration. Since Problem 2 posits \(n > \binom{\ell +1}{2}(k-1)\), we can bound the sum as:

$$\begin{aligned} \sum _{j=1}^{\ell }(n + j(k-1)) + \sum _{1 \le i< j}(j-i)(k-1) < \ell n + (\ell +1)\binom{\ell +1}{2} (k-1) \in O(\ell n) \ . \end{aligned}$$

Since \(\deg \det (V) = \deg \det (M) = n\) then by Proposition 4 each of the calls to Algorithm 6 therefore costs at most \(O(\ell n)\). \(\square \)

6 Conclusion

We have explored row reduction of skew polynomial matrices. For ordinary polynomial rings, row reduction has proven a useful strategy for obtaining flexible, efficient, yet conceptually simple decoding algorithms for RS and other code families. Our results introduce the methodology and tools aimed at bringing similar benefits to Gabidulin, Interleaved Gabidulin, MV, and other skew polynomial-based codes. We used those tools in two settings. We solved a general form of multiple skew-shift register synthesis (cf. Problem 1), and applied this to the decoding of Interleaved Gabidulin codes in complexity \(O(\ell \mu ^2)\), see Theorem 2. For MV codes (cf. Problem 2), we gave an interpolation algorithm with complexity \(O(\ell n^2)\), see Theorem 4.

We extended and analysed the simple and generally applicable Mulders–Storjohann algorithm in the skew polynomial setting. In both of the studied settings, the complexity of that algorithm was initially not satisfactory, but it served as a crucial step in developing more efficient algorithms. For multiple skew-shift register synthesis, we obtained a good complexity for a more general problem than previously solved. For MV codes, the improved algorithm took the shape of a versatile “weak Popov walk”, which could potentially apply to many other problems. In all previously studied cases, we match the best known complexities [44, 48] that do not make use of fast multiplication of skew polynomials.

Based on a preprint of this paper, it is shown in [40] how to further reduce the complexity of decoding Interleaved Gabidulin codes using a divide-and-conquer version of Algorithm 3, matching the complexity of [43].

The weak Popov form has many properties that can be beneficial in a coding setting, and which we have not yet explored. For instance, it allows one to easily enumerate all “small” elements of the row space: that could e.g. be used to enumerate all solutions to a shift register problem, allowing a chase-like decoding of Interleaved Gabidulin codes beyond half the minimum distance.