1 Introduction

In symbolic mathematical computation it is important to have efficient algorithms for the fundamental arithmetic operations of addition, multiplication and division. While linear time algorithms for additive operations are usually straightforward, considerable attention has been devoted to finding efficient methods to compute products and quotients of integers, polynomials with integer or finite field coefficients and matrices with elements from a ring. For these, both practically efficient algorithms and theoretically important bounds are well known.

For integer and polynomial division, efficient algorithms based on Newton iteration allow the computation of quotients in time proportional to multiplication. Until recently, these algorithms left the original domain to perform arithmetic in related domains. For integers, this involved computing an approximation to the inverse of the divisor in extended precision approximate arithmetic or in a residue ring, and for polynomials it involved computing the inverse of the reverse of the divisor polynomial in ideal-adic arithmetic.

We have recently shown how these quotients may be computed without leaving the original domain, and we have extended this to a generic domain-preserving algorithm for rings with a suitable whole shift operation [10]. For integers the whole shift multiplies by a power of the representation base and for polynomials it multiplies by a power of the variable, in both cases discarding terms with negative powers. The previous paper developed the concept of the whole shifted inverse and used it to compute quotients efficiently. Non-commutative domains were mentioned only briefly.

The present article expands on how these methods may be used to compute quotients of non-commutative polynomials. In particular, it is shown that

  • the whole shifted inverse is well-defined on non-commutative polynomial rings R[x],

  • its computation is efficient,

  • it may be used to compute left or right quotients in R[x], each with one multiplication,

  • left and right whole shifted inverses may be defined on skew polynomials \(R[x; \sigma , \delta ]\), and

  • they may be used to compute the right and left quotients in \(R[x; \sigma , \delta ]\), each with one multiplication.

The remainder of this article is organized as follows. Section 2 presents some basic background, including notation, the definition of division in a non-commutative context, and the Newton-Schulz iteration. Section 3 considers division of non-commutative polynomials in R[x], showing \(O(n^2)\) algorithms for classical division and for pseudodivision. It recalls the notion of the whole shifted inverse, proves it is well-defined on non-commutative R[x] and shows that it can be used to compute left and right quotients in this setting. Section 4 recapitulates the generic algorithms from [10] that use a modified Newton iteration to compute the whole shifted inverse. It also explains why it applies when polynomial coefficients are non-commutative. Section 5 gives an example of these algorithms applied to polynomial matrices. Section 6 extends the discussion to skew polynomials \(R[x;\sigma , \delta ]\), defining left and right whole shifted inverse, and showing how they may be used. Section 7 gives linear ordinary differential and difference operators as examples, before concluding remarks in Sect. 8.

2 Background

2.1 Notation

We adopt the following notation:

figure a

The “\(\textrm{prec}\)” notation, standing for “precision”, means the number of base-B digits or polynomial coefficients. It is similar to that of [4], where it is used to present certain algorithms generically for integers and polynomials. In particular, if we take integers to be represented in base-B, i.e. for any integer \(u \ne 0\) there is \(h = \textrm{prec}_B(u)-1\), such that

$$\begin{aligned} u = \sum _{i=0}^h u_i B^i, \quad u_i \in \mathbb Z,\, 0 \le u_i < B, \; u_h \ne 0, \end{aligned}$$
(1)

then integers base-B behave similarly to univariate polynomials with coefficients \(u_i\), but with carries complicating matters.
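The analogy between base-B integers and polynomials in (1) can be sketched in a few lines of Python. The helper names `digits`, `prec` and `add_digitwise` are illustrative choices, not taken from the article, and only positive integers are handled.

```python
def digits(u, B):
    """Base-B digits u_0, ..., u_h of a positive integer u, least significant first."""
    if u <= 0 or B < 2:
        raise ValueError("expects u > 0 and B >= 2")
    ds = []
    while u:
        u, d = divmod(u, B)
        ds.append(d)
    return ds

def prec(u, B):
    """Number of base-B digits of u, i.e. h + 1 in equation (1)."""
    return len(digits(u, B))

def add_digitwise(a, b):
    """Add digit lists coefficient by coefficient, as for polynomials (no carries)."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

u, w, B = 1234, 987, 10
ds = digits(u, B)
assert sum(d * B**i for i, d in enumerate(ds)) == u   # equation (1)
raw = add_digitwise(ds, digits(w, B))                 # entries may reach B or more
print(ds, prec(u, B), raw, digits(u + w, B))
```

The coefficient-wise sum `raw` is a valid polynomial sum but not a valid digit string, since some entries are at least B; normalizing it is exactly the carry propagation that separates the integer case from the polynomial case.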

2.2 Division

The notion of integer quotients and remainders can be extended to more general rings. A Euclidean domain D has a valuation \(N: D \rightarrow \mathbb Z_{\ge 0}\) such that, for any \(u, v \in D, v \ne 0\), there exist \(q, r \in D\) with

$$\begin{aligned} u&= q v + r,&r&= 0 \text { or } N(r) < N(v). \end{aligned}$$

The value q is a quotient of u by v and r is a remainder of dividing u by v. We write

$$\begin{aligned} q&= u \mathbin {\textrm{quo}}v&r&= u \mathbin {\textrm{rem}}v \end{aligned}$$

when these are unique. When both the quotient and remainder are required, we write \(u~\textrm{div}~v = (u~ \mathbin {\textrm{quo}}~ v, u~ \mathbin {\textrm{rem}}~ v)\). When D is a non-commutative ring with a valuation N, there may exist left and right quotients such that

$$\begin{aligned} \begin{aligned} u&= v \,q_\text {l}+ r_\text {l},& r_\text {l}&= 0 \text { or } N(r_\text {l})< N(v) \\ u&= q_\text {r}\,v + r_\text {r},& r_\text {r}&= 0 \text { or } N(r_\text {r}) < N(v). \end{aligned} \end{aligned}$$
(2)

When these exist and are unique, we write

$$\begin{aligned} q_\text {l}&= u \,\textrm{lquo}\, v&r_\text {l}&= u \,\textrm{lrem}\, v&q_\text {r}&= u \,\textrm{rquo}\, v&r_\text {r}&= u \,\textrm{rrem}\, v . \end{aligned}$$

For certain non-commutative rings with a distance measure \(\Vert \cdot \Vert \), a sequence of approximations to the inverse of A may be computed via the Newton-Schulz iteration [7]

$$\begin{aligned} X_{(i+1)} = X_{(i)} + X_{(i)} (1 - A X_{(i)}) \end{aligned}$$
(3)

where 1 denotes the multiplicative identity of the ring. There are several ways to arrange this expression, but the form above emphasizes that as \(X_{(i)}\) approaches \(A^{-1}\), the product \(X_{(i)}(1 - AX_{(i)})\) approaches 0. For \(\mathbb C^{n\times n}\) matrices, a suitable initial value is \(X_{(0)} = A^\dagger /(n\, \textrm{Tr}(A A^\dagger ))\), where \(A^\dagger \) is the Hermitian transpose.
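As a rough numerical illustration of iteration (3), the following sketch inverts a small real matrix using the starting value quoted above. It uses plain Python lists and floats; the function names, the fixed number of steps and the sample matrix are assumptions made for the example.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def newton_schulz(a, steps=30):
    """Approximate the inverse of a real square matrix by iteration (3)."""
    n = len(a)
    at = [[a[j][i] for j in range(n)] for i in range(n)]    # transpose (A is real)
    aat = matmul(a, at)
    tr = sum(aat[i][i] for i in range(n))                    # Tr(A A^T)
    x = [[at[i][j] / (n * tr) for j in range(n)] for i in range(n)]   # X_(0)
    for _ in range(steps):
        ax = matmul(a, x)
        resid = [[(1.0 if i == j else 0.0) - ax[i][j] for j in range(n)]
                 for i in range(n)]                          # 1 - A X
        corr = matmul(x, resid)                              # X (1 - A X)
        x = [[x[i][j] + corr[i][j] for j in range(n)] for i in range(n)]
    return x

A = [[2.0, 1.0], [1.0, 3.0]]
X = newton_schulz(A)
print(X)   # close to the exact inverse [[0.6, -0.2], [-0.2, 0.4]]
```

The residual satisfies \(1 - AX_{(i+1)} = (1 - AX_{(i)})^2\), so the number of correct digits roughly doubles per step once the residual is small. The 30 steps used here are a generous fixed budget, not a tuned value.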

2.3 Whole Shift and Whole Shifted Inverse

In previous work [10] we studied the problem of efficient domain-preserving computation of quotients and remainders for integers and polynomials, then generalized these results to a generic setting. To this end, we defined the notions of the whole shift and whole shifted inverse with attention to commutative domains. We recapitulate these definitions and two results relevant to the present article.

Definition 1

(Whole n-shift in \(R[x]\) ). Given a polynomial \(u = \sum _{i=0}^h u_ix^i \in R[x]\), with R a ring and \(n \in \mathbb Z\), the whole n-shift of u with respect to x is

$$\begin{aligned} \textrm{shift}_{n,x} u = \sum _{i+n\ge 0} u_i x^{i+n}. \end{aligned}$$
(4)

When x is clear by context, we write \(\textrm{shift}_n u\).

Definition 2

(Whole n-shifted inverse in \(F[x]\)). Given \(n \in \mathbb Z_{\ge 0}\) and \(v\in F[x]\), F a field, the whole n-shifted inverse of v with respect to x is

$$\begin{aligned} \textrm{shinv}_{n,x} v = x^n \mathbin {\textrm{quo}}v. \end{aligned}$$
(5)

When x is clear by context, we write \(\textrm{shinv}_n v\).

Theorem 1

Given two polynomials \(u, v \in F[x]\), F a field, and \(0 \le \textrm{degree}\,u \le h\),

$$\begin{aligned} u \mathbin {\textrm{quo}}v = \textrm{shift}_{-h}(u \cdot \textrm{shinv}_h\, v). \end{aligned}$$
(6)
figure b

For classical and Karatsuba multiplication it is more efficient to compute just the top part of the product in (6), omitting the lower h terms, instead of shifting:

$$ \textrm{shift}_{-h}(u \cdot \textrm{shinv}_h\, v) = \text {MultQuo}(u, \textrm{shinv}_h\,v, h), $$

with \(\text {MultQuo}(a,b,n) = ab \mathbin {\textrm{quo}}x^n\) computing only \(\textrm{degree}\,a + \textrm{degree}\,b - n + 1\) terms. For multiplication methods where computing only the top part of the product gives no saving, some improvement is obtained using

$$\begin{aligned} \textrm{shift}_{-h}(u \cdot \textrm{shinv}_h\,v) = \textrm{shift}_{-(h-k)}(\textrm{shift}_{-k}\,u \cdot \textrm{shinv}_h\,v). \end{aligned}$$
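Definitions 1 and 2, Theorem 1 and the two product variants above can be checked on a small case. The sketch below works in \(F_7[x]\) with polynomials stored as coefficient lists, lowest degree first. The representation, the function names and the sample polynomials are assumptions for illustration, and `shinv` is computed here by classical division rather than by a fast method.

```python
P = 7  # work in F_7[x]; a polynomial is a list of coefficients, lowest degree first

def trim(a):
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return trim(out)

def shift(a, n):
    """Whole n-shift: multiply by x^n and discard terms with negative powers."""
    return trim([0] * n + a) if n >= 0 else a[-n:]

def quo(u, v):
    """Classical quotient u quo v; v must have a nonzero leading coefficient."""
    k = len(v) - 1
    inv = pow(v[k], -1, P)
    r = list(u)
    q = [0] * max(len(u) - k, 0)
    for i in range(len(u) - 1 - k, -1, -1):
        c = r[i + k] * inv % P
        q[i] = c
        for j, y in enumerate(v):
            r[i + j] = (r[i + j] - c * y) % P
    return trim(q)

def shinv(v, h):
    """Whole h-shifted inverse, x^h quo v (Definition 2)."""
    return quo([0] * h + [1], v)

def mult_quo(a, b, n):
    """Top part of a*b: only coefficients of x^n and above are accumulated."""
    out = [0] * max(len(a) + len(b) - 1 - n, 0)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j >= n:
                out[i + j - n] = (out[i + j - n] + x * y) % P
    return trim(out)

u = [3, 0, 5, 1, 2, 6]      # degree 5
v = [1, 4, 2]               # degree k = 2
h, k = 5, 2
w = shinv(v, h)
q = quo(u, v)
print(q == shift(mul(u, w), -h))                    # Theorem 1
print(q == mult_quo(u, w, h))                       # top part of the product only
print(q == shift(mul(shift(u, -k), w), -(h - k)))   # pre-shifted variant
```

Note that `mult_quo` in this sketch still loops over all index pairs; a tuned version would restrict the loop bounds so that only the needed terms are visited.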

Theorem 2

Given \(v \in F[x]\), with F a field and \(h > \textrm{degree}\,v = k\) and suitable starting value \(w_{(0)}\), the sequence of iterates

$$ w_{(i+1)} = w_{(i)} + \textrm{shift}_{-h} \big ( w_{(i)} (\textrm{shift}_h 1 - vw_{(i)} ) \big ) $$

converges to \(\textrm{shinv}_h\,v\) in \(\lceil \log _2(h-k) \rceil \) steps.

A suitable starting value for \(w_{(0)}\) is given by \(\text {Shinv0}\) in Sect. 4.
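The iteration of Theorem 2 can be tried directly in \(F_7[x]\). The sketch below deliberately uses a cruder starting value than \(\text {Shinv0}\), namely only the leading term \(v_k^{-1} x^{h-k}\), which is an assumption of this example. With one correct leading term, the number of correct leading terms doubles at each step. All names and the sample data are illustrative.

```python
P = 7  # F_7[x]; polynomials are coefficient lists, lowest degree first

def trim(a):
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def add(a, b, s=1):
    """a + s*b, with s = 1 or -1."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a)); b = b + [0] * (n - len(b))
    return trim([(x + s * y) % P for x, y in zip(a, b)])

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return trim(out)

def shift(a, n):
    return trim([0] * n + a) if n >= 0 else a[-n:]

def quo(u, v):
    """Classical quotient, used only to check the result."""
    k = len(v) - 1
    inv = pow(v[k], -1, P)
    r = list(u)
    q = [0] * max(len(u) - k, 0)
    for i in range(len(u) - 1 - k, -1, -1):
        c = r[i + k] * inv % P
        q[i] = c
        for j, y in enumerate(v):
            r[i + j] = (r[i + j] - c * y) % P
    return trim(q)

def newton_shinv(v, h):
    """Iterate w <- w + shift_{-h}(w (x^h - v w)) until the correction vanishes."""
    k = len(v) - 1
    w = shift([pow(v[k], -1, P)], h - k)   # assumed start: leading term only
    xh = shift([1], h)                      # shift_h 1 = x^h
    steps = 0
    while True:
        corr = shift(mul(w, add(xh, mul(v, w), -1)), -h)
        if not corr:
            return w, steps
        w = add(w, corr)
        steps += 1

v = [1, 4, 2, 3]            # degree k = 3
h = 20
w, steps = newton_shinv(v, h)
print(w == quo(shift([1], h), v), steps)
```

No guard digits are needed in this polynomial setting because the whole shift is linear on polynomials; for integers, carries make the analysis more delicate.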

3 Division in Non-Commutative R[x]

We now lay out how to use \(\textrm{shift}\) and \(\textrm{shinv}\) to compute quotients for polynomials with non-commutative coefficients. First we show classical algorithms to compute left and right quotients in R[x]. We then prove two theorems, one showing that \(x^n \textrm{lquo}v = x^n \textrm{rquo}v\) in this setting, making the whole shifted inverse well defined, and another showing that it may be used to compute left and right quotients.

figure c

3.1 Definitions and Classical Algorithms

Let u and v be two polynomials in R[x], of degrees h and k respectively, with the Euclidean norm being the polynomial degree. The left and right quotients and remainders are defined as in (2). Left and right quotients will exist provided that the leading coefficient \(v_k\) of v is invertible in R and they may be computed by Algorithm 1. In the presentation of the algorithm, \(\pi \) denotes a permutation on two elements so is either the identity or a transposition. The notation \(\times _\pi \) is a shorthand for \(\times \circ \pi \) so \(a \times _\pi b = a \times b\) when \(\pi \) is the identity and \(a \times _\pi b = b \times a\) when \(\pi \) is a transposition.
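The role of \(\times _\pi \) can be illustrated with a sketch of classical left and right division for polynomials whose coefficients are \(2\times 2\) matrices over \(F_7\). This is not a transcription of Algorithm 1; the names `times_pi` and `lrdiv`, the list representation and the sample polynomials are assumptions for the example.

```python
P = 7
ZERO = [[0, 0], [0, 0]]

def madd(a, b, s=1):
    return [[(a[i][j] + s * b[i][j]) % P for j in range(2)] for i in range(2)]

def mmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(2)) % P for j in range(2)]
            for i in range(2)]

def minv(a):
    det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]) % P
    d = pow(det, -1, P)                      # ValueError if a is singular mod P
    return [[a[1][1] * d % P, -a[0][1] * d % P],
            [-a[1][0] * d % P, a[0][0] * d % P]]

def trim(p):
    while p and p[-1] == ZERO:
        p = p[:-1]
    return p

def padd(a, b, s=1):
    n = max(len(a), len(b))
    a = a + [ZERO] * (n - len(a)); b = b + [ZERO] * (n - len(b))
    return trim([madd(x, y, s) for x, y in zip(a, b)])

def pmul(a, b):
    if not a or not b:
        return []
    out = [ZERO] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = madd(out[i + j], mmul(x, y))
    return trim(out)

def shift(p, n):
    return trim([ZERO] * n + p) if n >= 0 else p[-n:]

def times_pi(a, b, transpose):
    """a x_pi b: a*b for the identity permutation, b*a for the transposition."""
    return pmul(b, a) if transpose else pmul(a, b)

def lrdiv(u, v, transpose):
    """transpose=False: u = q*v + r (right); transpose=True: u = v*q + r (left)."""
    k = len(v) - 1
    vinv = [minv(v[k])]
    q, r = [], u
    for i in range(len(u) - 1 - k, -1, -1):
        lead = [r[i + k]] if len(r) > i + k else []
        t = shift(times_pi(lead, vinv, transpose), i)    # one quotient term
        q = padd(q, t)
        r = padd(r, times_pi(t, v, transpose), -1)
    return q, r

u = [[[1, 0], [2, 3]], [[4, 5], [6, 0]], [[0, 1], [1, 0]], [[3, 3], [2, 5]]]
v = [[[1, 2], [3, 4]], [[2, 1], [1, 1]]]      # leading coefficient has det 1
qr, rr = lrdiv(u, v, False)
ql, rl = lrdiv(u, v, True)
print(padd(pmul(qr, v), rr) == u, padd(pmul(v, ql), rl) == u)
```

Passing a flag instead of writing two routines mirrors the way one algorithm serves both sides: only the order of the operands in the two products changes.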

There are some circumstances where quotients or related quantities may be computed even if \(v_k\) is not invertible. When R is an integral domain, quotients may be computed as usual in K[x] with K being the quotient field of R. Alternatively, when R is non-commutative but \(v_k\) commutes with v, it is possible to compute pseudoquotients and pseudoremainders satisfying

$$\begin{aligned} m \, u&= v \, q_\text {l}+ r_\text {l},&\textrm{degree}\,r_\text {l}&< \textrm{degree}\,v \\ u \, m&= q_\text {r}\, v + r_\text {r},&\textrm{degree}\,r_\text {r}&< \textrm{degree}\,v \\ m&= v_k^{h-k+1}, \end{aligned}$$

as shown in Algorithm 2. In this case, we write

$$\begin{aligned} q_\text {l}&= u\,\textrm{lpquo}\,v&r_\text {l}&= u\,\textrm{lprem}\,v \\ q_\text {r}&= u\,\textrm{rpquo}\,v&r_\text {r}&= u\,\textrm{rprem}\,v. \end{aligned}$$

Requiring \(v_k\) to commute with v is quite restrictive, however, so we focus our attention on situations where the inverse of \(v_k\) exists.

3.2 Whole Shift and Whole Shifted Inverse in \(R[x]\)

We now examine the notions of the whole shift and whole shifted inverse for R[x] with non-commutative R. First consider the whole shift. Since x commutes with all values in R[x], we may without ambiguity take, for \(u = \sum _{i = 0}^h u_i x^i\) and \(n \in \mathbb Z\),

$$\begin{aligned} \textrm{shift}_n\,u \;= \sum _{i +n \ge 0} x^n (u_i x^i) \;= \sum _{i +n \ge 0} (u_i x^i) x^n. \end{aligned}$$
(7)

That is, the fact that R[x] is non-commutative does not lead to left and right variants of the whole shift.

We state two simple theorems with obvious proofs:

Theorem 3

Let \(w \in R[x]\). Then, for all \(n \in \mathbb Z_{\ge 0}\), \( \textrm{shift}_{-n} \textrm{shift}_n w = w. \)

Theorem 4

Let \(u, v \in R[x]\) with \(\textrm{degree}\,u = h\) and \(\textrm{degree}\,v = k\). Then, for \(m \in \mathbb Z\),

$$\begin{aligned} \textrm{shift}_{-k-m} (u \times v)&= \textrm{shift}_{-k} (\textrm{shift}_{-m}(u) \times v) \\ \textrm{shift}_{-h-m} (u \times v)&= \textrm{shift}_{-h} (u \times \textrm{shift}_{-m}(v)). \end{aligned}$$

We now come to the main point of this section and show \(\textrm{shinv}\) is well-defined when R is non-commutative.

Theorem 5

(Whole shifted inverse for non-commutative \(R[x]\) ).

Let \(v = \sum _{i=0}^k v_i x^i \in R[x]\), with R a non-commutative ring and \(v_k\) invertible in R. Then, for \(h \in \mathbb Z_{\ge 0}\),

$$ x^h\,\textrm{lquo}\,v = x^h\,\textrm{rquo}\,v. $$

Proof

Let \(q_\text {l}= x^h \textrm{lquo}v\) and \(q_\text {r}= x^h \textrm{rquo}v\). If \(h < k\), then \(q_\text {l}= q_\text {r}= 0\). Otherwise, both \(q_\text {l}\) and \(q_\text {r}\) have degree \(h-k \ge 0\) so

$$\begin{aligned} v_k \, {q_\text {l}}_{h-k}&= 1&{q_\text {r}}_{h-k} \, v_k&= 1 \end{aligned}$$
(8)
$$\begin{aligned} \sum _{j=M}^k v_j \,{q_\text {l}}_{i+k-j}&= 0&\sum _{j=M}^k {q_\text {r}}_{i+k-j} \,v_j&= 0, \quad 0 \le i < h-k, \end{aligned}$$
(9)

where \(M=\max (0, i -h +2k)\). We show by induction on i that \({q_\text {l}}_i = {q_\text {r}}_i\) for \(0 \le i \le h-k\). Since \(v_k\) is invertible, (8) and (9) give

$$\begin{aligned} {q_\text {l}}_{h-k} = {q_\text {r}}_{h-k} = v_k^{-1} \end{aligned}$$
(10)

and

$$\begin{aligned} {q_\text {l}}_i&= -\sum _{j=M}^{k-1} v_k^{-1}\, v_j \, {q_\text {l}}_{i+k-j}&{q_\text {r}}_i&= -\sum _{j=M}^{k-1} {q_\text {r}}_{i+k-j} \,v_j \, v_k^{-1} , \quad 0 \le i < h-k. \end{aligned}$$
(11)

Equation (10) gives the base of the induction. Now suppose \({q_\text {l}}_i = {q_\text {r}}_i\) for \(N < i \le h-k\). Then for \(i = N \ge 0\) equation (11) gives

$$\begin{aligned} {q_\text {l}}_N&= -\sum _{j=M}^{k-1} v_k^{-1}\,v_j \, {q_\text {l}}_{N+k-j} = -\sum _{j=M}^{k-1} v_k^{-1}\,v_j \, {q_\text {r}}_{N+k-j} \\&= -\sum _{j=M}^{k-1} v_k^{-1}\,v_j \left( -\sum _{\ell =M}^{k-1} \, {q_\text {r}}_{N+k-j+k-\ell } v_\ell \, v_k^{-1} \right) \\&= -\sum _{\ell =M}^{k-1} \left( - \sum _{j=M}^{k-1} v_k^{-1}\, v_j \, {q_\text {r}}_{N+k-j+k-\ell } \right) \,v_\ell \, v_k^{-1} \\ {}&= -\sum _{\ell =M}^{k-1} {q_\text {r}}_{N+k-\ell } \, v_\ell \, v_k^{-1} = {q_\text {r}}_N. \end{aligned}$$

   \(\Box \)

Thus we may write \(\textrm{shinv}_h\,v\) without ambiguity in the non-commutative case, i.e.

$$\begin{aligned} \textrm{shinv}_h\,v = x^h\,\textrm{lquo}\,v = x^h\,\textrm{rquo}\,v. \end{aligned}$$
(12)
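Theorem 5 can be checked numerically by evaluating the two recurrences in (10) and (11) independently. The sketch below does this for a sample \(v\) with \(2\times 2\) matrix coefficients over \(F_7\); the function name `shinv_coeffs` and the data layout (coefficient lists, lowest degree first) are assumptions for the example.

```python
P = 7
ZERO = [[0, 0], [0, 0]]

def madd(a, b, s=1):
    return [[(a[i][j] + s * b[i][j]) % P for j in range(2)] for i in range(2)]

def mmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(2)) % P for j in range(2)]
            for i in range(2)]

def minv(a):
    det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]) % P
    d = pow(det, -1, P)                      # ValueError if a is singular mod P
    return [[a[1][1] * d % P, -a[0][1] * d % P],
            [-a[1][0] * d % P, a[0][0] * d % P]]

def shinv_coeffs(v, h, left):
    """Coefficients of x^h lquo v (left=True) or x^h rquo v (left=False), h >= k."""
    k = len(v) - 1
    vinv = minv(v[k])
    q = [None] * (h - k + 1)
    q[h - k] = vinv                                   # equation (10)
    for i in range(h - k - 1, -1, -1):                # equation (11)
        acc = ZERO
        for j in range(max(0, i - h + 2 * k), k):
            term = mmul(v[j], q[i + k - j]) if left else mmul(q[i + k - j], v[j])
            acc = madd(acc, term)
        acc = mmul(vinv, acc) if left else mmul(acc, vinv)
        q[i] = madd(ZERO, acc, -1)
    return q

# v = v_0 + v_1 x + v_2 x^2 with v_2 invertible mod 7
v = [[[1, 2], [6, 1]], [[5, 3], [0, 4]], [[4, 3], [4, 5]]]
left = shinv_coeffs(v, 5, True)
right = shinv_coeffs(v, 5, False)
print(left == right)
```

The two calls use differently ordered products throughout, so their agreement is a genuine check of the theorem on this instance rather than a tautology.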

3.3 Quotients from the Whole Shifted Inverse in \(R[x]\)

We consider computing the left and right quotients in R[x] from the whole shifted inverse. We have the following theorem.

Theorem 6

(Left and right quotients from the whole shifted inverse in \(R[x]\)). Let \(u, v \in R[x]\), R a ring, with \(\textrm{degree}\,v = k\) and \(v_k\) invertible in R. Then for \(h \ge \textrm{degree}\,u\),

$$\begin{aligned} u\,\textrm{lquo}\,v&= \textrm{shift}_{-h} (\textrm{shinv}_h(v) \times u)\quad \text {and} \\ u\,\textrm{rquo}\,v&= \textrm{shift}_{-h} (u \times \textrm{shinv}_h(v)) . \end{aligned}$$

Proof

Consider first the right quotient. It is sufficient to show

$$ u = \textrm{shift}_{-h} (u \times \textrm{shinv}_h\,v) \times v + r_\text {r}$$

for some \(r_\text {r}\) with \(\textrm{degree}\,r_\text {r}< k\). Equivalently, the two sides must agree in all terms of degree k and above, so it suffices to show

$$\begin{aligned} \textrm{shift}_{-k}\,u = \textrm{shift}_{-k} \big (\textrm{shift}_{-h} (u \times \textrm{shinv}_h\,v) \times v\big ). \end{aligned}$$
(13)

We have

$$\begin{aligned} (u \times \textrm{shinv}_h\,v) \times v&= u \times ((x^h\,\textrm{rquo}\,v) \times v) \end{aligned}$$
(14)
$$\begin{aligned}&= u \times (x^h - \rho ), \quad \rho = 0 \text { or } \textrm{degree}\,\rho < k \nonumber \\&= \textrm{shift}_h\,u - u \times \rho . \nonumber \\ \textrm{shift}_h u&= (u\times \textrm{shinv}_h\,v) \times v + u \times \rho . \end{aligned}$$
(15)

Since \(h \ge 0\), Theorem 3 applies and equation (15) gives

$$\begin{aligned} u&= \textrm{shift}_{-h} \big ((u \times \textrm{shinv}_h\,v) \times v \big ) + \textrm{shift}_{-h}(u \times \rho ) \end{aligned}$$

with the degree of \(\textrm{shift}_{-h}(u \times \rho )\) less than k. Therefore

$$\begin{aligned} \textrm{shift}_{-k}\,u&= \textrm{shift}_{-k -h} \big ((u \times \textrm{shinv}_h\,v) \times v \big ) \\&= \textrm{shift}_{-k} \big ( \textrm{shift}_{-h}(u \times \textrm{shinv}_h\,v) \times v \big ), \end{aligned}$$

by Theorem 4, and we have shown equation (13) as required. The proof for \(\textrm{lquo}\) replaces equation (14) with

$$ v \times (\textrm{shinv}_h\,v \times u) = (v \times (x^h\,\textrm{lquo}\,v)) \times u $$

and follows the same lines, mutatis mutandis.    \(\Box \)

As in the commutative case, it may be more efficient to compute only the top part of the product than to compute the full product and then shift away the lower part. Now that we have shown that \(\textrm{shift}\) and \(\textrm{shinv}\) are well-defined for non-commutative R[x], we next see that \(\textrm{shinv}\) may be computed by our generic algorithm.

4 Generic Algorithm for the Whole Shifted Inverse

Earlier work has shown how to compute \(\textrm{shinv}\) efficiently both for the Euclidean domains \(\mathbb Z\) and F[x], and generically [10]. The generic version is shown here in Algorithm 3. We justify below that it applies equally well to polynomials with non-commutative coefficients. The algorithm operates on a ring D that is required to have a suitable \(\textrm{shift}\), and certain other operations and properties must be defined. For example, on F[x], F a field, these are

$$\begin{aligned} \textrm{shift}_n\,u&= {\left\{ \begin{array}{ll} u \cdot x^n &{} \text {if~} n \ge 0 \\ u \mathbin {\textrm{quo}}x^{-n} &{}\text {if~} n < 0 \end{array}\right. } \\ \text {coeff}(u, i)&= u_i \\ \text {Shinv0}(v)&= (1/v_k \,x - 1/v_k \cdot v_{k-1}\cdot 1/v_k,\; 2) \\ \text {hasCarries}&= \text {false} \\ \text {Mult}(a, b)&= ab \\ \text {MultMod}(a,b,n)&= ab \mathbin {\textrm{rem}}x^n. \end{aligned}$$

The iterative step of Algorithm 3 is given on line 32. Since \(\text {D.PowDiff}\) computes \(\textrm{shift}_h 1 - v \cdot w\), this line computes

$$\begin{aligned} \textrm{shift}_m w + \textrm{shift}_{2m-h}\big (w \cdot (\textrm{shift}_h 1 - v\cdot w)\big ). \end{aligned}$$
(16)
figure d

The shift operations are multiplications by powers of x, with \(\textrm{shift}_h p = p x^h\). The expressions involving \(k, h, \ell \) and m for shift amounts arise from multiplication by various powers of x at different points in order to compute shorter polynomials when possible. Since x commutes with all values, it is possible to accumulate these into single pre- and post- shifts. With this in mind, the R[x] operations \(+\) and \(\cdot \) ultimately compute the polynomial coefficients using the operations of R and the order of the multiplicands in (16) is exactly that of the Newton-Schulz iteration (3). The form of \(\text {Shinv0}\) above is chosen so that it gives a suitable initial value for non-commutative polynomials.
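The ordering of the multiplicands in (16) can be exercised on matrix polynomials over \(F_7\). The sketch below is a simplified full-length iteration in the spirit of \(\text {D.Refine1}\), without guard places or the shorter intermediate values of Algorithm 3, and it assumes \(k \ge 1\) and \(h > k\). The starting value is the two-term expression \(\text {Shinv0}\) shifted to degree \(h-k\). The function names and the sample \(v\) are illustrative.

```python
P = 7
ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]

def madd(a, b, s=1):
    return [[(a[i][j] + s * b[i][j]) % P for j in range(2)] for i in range(2)]

def mmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(2)) % P for j in range(2)]
            for i in range(2)]

def minv(a):
    det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]) % P
    d = pow(det, -1, P)                      # ValueError if a is singular mod P
    return [[a[1][1] * d % P, -a[0][1] * d % P],
            [-a[1][0] * d % P, a[0][0] * d % P]]

def trim(p):
    while p and p[-1] == ZERO:
        p = p[:-1]
    return p

def padd(a, b, s=1):
    n = max(len(a), len(b))
    a = a + [ZERO] * (n - len(a)); b = b + [ZERO] * (n - len(b))
    return trim([madd(x, y, s) for x, y in zip(a, b)])

def pmul(a, b):
    if not a or not b:
        return []
    out = [ZERO] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = madd(out[i + j], mmul(x, y))
    return trim(out)

def shift(p, n):
    return trim([ZERO] * n + p) if n >= 0 else p[-n:]

def divide(u, v, left):
    """Classical division, for checking: u = v*q + r if left, else u = q*v + r."""
    k = len(v) - 1
    vinv = minv(v[k])
    q, r = [], u
    for i in range(len(u) - 1 - k, -1, -1):
        if len(r) <= i + k:
            continue
        c = mmul(vinv, r[i + k]) if left else mmul(r[i + k], vinv)
        t = shift([c], i)
        q = padd(q, t)
        r = padd(r, pmul(v, t) if left else pmul(t, v), -1)
    return q, r

def newton_shinv(v, h):
    k = len(v) - 1
    vinv = minv(v[k])
    # Shinv0: vinv*x - vinv*v_{k-1}*vinv, two correct leading terms
    w0 = [madd(ZERO, mmul(mmul(vinv, v[k - 1]), vinv), -1), vinv]
    w = shift(w0, h - k - 1)
    xh = shift([ONE], h)
    for steps in range(h):
        corr = shift(pmul(w, padd(xh, pmul(v, w), -1)), -h)   # as in (16)
        if not corr:
            return w, steps
        w = padd(w, corr)
    raise RuntimeError("iteration did not settle")

v = [[[1, 2], [6, 1]], [[5, 3], [0, 4]], [[4, 3], [4, 5]]]
h = 12
w, steps = newton_shinv(v, h)
xh = shift([ONE], h)
print(w == divide(xh, v, True)[0], w == divide(xh, v, False)[0], steps)
```

Starting from two correct leading terms, the number of correct terms doubles per step, so for \(h - k = 10\) at most three corrections are expected in this sketch.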

The computational complexity of the \(\text {Refine}\) methods of Algorithm 3 may be summarized as follows: The function \(\text {D.Refine1}\) computes full-length values at each iteration so has time complexity \(O(\log (h-k) M(h))\) where M(N) is the time complexity of multiplication. The function \(\text {D.Refine2}\) reduces the size of the values, computing only the necessary prefixes. The function \(\text {D.Refine3}\) reduces the size of some values further and achieves time complexity \(O\big (\sum _{i=1}^{\log (h-k)} M(2^i)\big )\), which gives time complexity \(O(M(N)), N = h-k\) for the purely theoretical \(M(N) \in O(N \log N)\), for Schönhage-Strassen \(M(N) \in O(N \log N \log \log N)\) and for \(M(N) \in O(N^p), p > 0.\)
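For the last case the bound follows from a geometric sum. As a sketch, assume the idealized cost \(M(N) = c\,N^p\) with a constant \(c > 0\) (an assumption made here for concreteness) and let \(L = \lceil \log _2 N \rceil \), so that \(2^L < 2N\). Then

$$\begin{aligned} \sum _{i=1}^{L} M(2^i) = c \sum _{i=1}^{L} 2^{pi} = c\, \frac{2^p\,(2^{pL} - 1)}{2^p - 1} < \frac{2^p}{2^p - 1}\, c\, (2N)^p = \frac{4^p}{2^p - 1}\, M(N), \end{aligned}$$

which is \(O(M(N))\) because \(2^p > 1\) for \(p > 0\). The final doubling step dominates, and all earlier steps together cost at most a constant multiple of it.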

5 Non-commutative Polynomial Example

We give an example of computing left and right quotients via the whole shifted inverse with \(R[x] = {F_{7}}^{2\times 2}[x]\) using the algorithms of Sects. 3 and 4. Note that R[x] is not a domain—there may be zero divisors, but it is easy enough to check for them. This example, and the one in Sect. 7, were produced using the Domains package in Maple [5]. The setup to use the Domains package for this example is

figure e

We start with

$$\begin{aligned} u&= \left[ \begin{array}{cc} 4 &{} 6 \\ 6 &{} 1 \end{array} \right] x^5 + \left[ \begin{array}{cc} 2 &{} 2 \\ 0 &{} 1 \end{array} \right] x^4 + \left[ \begin{array}{cc} 2 &{} 1 \\ 1 &{} 3 \end{array} \right] x^3 + \left[ \begin{array}{cc} 2 &{} 0 \\ 4 &{} 1 \end{array} \right] x^2 + \left[ \begin{array}{cc} 3 &{} 3 \\ 5 &{} 4 \end{array} \right] x + \left[ \begin{array}{cc} 4 &{} 5 \\ 1 &{} 2 \end{array} \right] , \\ v&= \left[ \begin{array}{cc} 4 &{} 3 \\ 4 &{} 5 \end{array} \right] x^2 + \left[ \begin{array}{cc} 5 &{} 3 \\ 0 &{} 4 \end{array} \right] x + \left[ \begin{array}{cc} 1 &{} 2 \\ 6 &{} 1 \end{array} \right] . \end{aligned}$$

The whole 5-shifted inverse of v is then

$$\begin{aligned} \textrm{shinv}_5\,v = \left[ \begin{array}{cc} 5 &{} 4 \\ 3 &{} 4 \end{array} \right] x^3 + \left[ \begin{array}{cc} 6 &{} 0 \\ 4 &{} 1 \end{array} \right] x^2 + \left[ \begin{array}{cc} 1 &{} 0 \\ 2 &{} 2 \end{array} \right] x + \left[ \begin{array}{cc} 5 &{} 1 \\ 6 &{} 3 \end{array} \right] . \end{aligned}$$

From this, the left and right quotients and remainders are computed to be

$$\begin{aligned} q_\text {l}&= \left[ \begin{array}{cc} 2 &{} 6 \\ 1 &{} 1 \end{array} \right] x^3 + \left[ \begin{array}{cc} 6 &{} 1 \\ 0 &{} 0 \end{array} \right] x^2 + \left[ \begin{array}{cc} 2 &{} 0 \\ 3 &{} 3 \end{array} \right] x + \left[ \begin{array}{cc} 3 &{} 1 \\ 0 &{} 0 \end{array} \right] ,&r_\text {l}&= \left[ \begin{array}{cc} 1 &{} 6 \\ 4 &{} 1 \end{array} \right] x + \left[ \begin{array}{cc} 1 &{} 4 \\ 4 &{} 3 \end{array} \right] , \\ q_\text {r}&= \left[ \begin{array}{cc} 3 &{} 5 \\ 5 &{} 0 \end{array} \right] x^3 + \left[ \begin{array}{cc} 1 &{} 1 \\ 1 &{} 5 \end{array} \right] x^2 + \left[ \begin{array}{cc} 0 &{} 5 \\ 5 &{} 5 \end{array} \right] x + \left[ \begin{array}{cc} 4 &{} 0 \\ 2 &{} 6 \end{array} \right] ,&r_\text {r}&= \left[ \begin{array}{cc} 2 &{} 0 \\ 2 &{} 1 \end{array} \right] x + \left[ \begin{array}{cc} 0 &{} 4 \\ 5 &{} 6 \end{array} \right] . \end{aligned}$$
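For readers without Maple, the computation can be approximated by the following Python sketch, which uses the u and v given above. Here \(\textrm{shinv}_5\,v\) is obtained by classical division rather than by Algorithm 3, and the quotients are then formed with one multiplication each as in Theorem 6. The representation (coefficient lists, lowest degree first) and all function names are assumptions of the sketch, not part of the Domains code.

```python
P = 7
ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]

def madd(a, b, s=1):
    return [[(a[i][j] + s * b[i][j]) % P for j in range(2)] for i in range(2)]

def mmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(2)) % P for j in range(2)]
            for i in range(2)]

def minv(a):
    det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]) % P
    d = pow(det, -1, P)                      # ValueError if a is singular mod P
    return [[a[1][1] * d % P, -a[0][1] * d % P],
            [-a[1][0] * d % P, a[0][0] * d % P]]

def trim(p):
    while p and p[-1] == ZERO:
        p = p[:-1]
    return p

def padd(a, b, s=1):
    n = max(len(a), len(b))
    a = a + [ZERO] * (n - len(a)); b = b + [ZERO] * (n - len(b))
    return trim([madd(x, y, s) for x, y in zip(a, b)])

def pmul(a, b):
    if not a or not b:
        return []
    out = [ZERO] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = madd(out[i + j], mmul(x, y))
    return trim(out)

def shift(p, n):
    return trim([ZERO] * n + p) if n >= 0 else p[-n:]

def divide(u, v, left):
    """Classical division: u = v*q + r if left, else u = q*v + r."""
    k = len(v) - 1
    vinv = minv(v[k])
    q, r = [], u
    for i in range(len(u) - 1 - k, -1, -1):
        if len(r) <= i + k:
            continue
        c = mmul(vinv, r[i + k]) if left else mmul(r[i + k], vinv)
        t = shift([c], i)
        q = padd(q, t)
        r = padd(r, pmul(v, t) if left else pmul(t, v), -1)
    return q, r

u = [[[4, 5], [1, 2]], [[3, 3], [5, 4]], [[2, 0], [4, 1]],
     [[2, 1], [1, 3]], [[2, 2], [0, 1]], [[4, 6], [6, 1]]]
v = [[[1, 2], [6, 1]], [[5, 3], [0, 4]], [[4, 3], [4, 5]]]
h = 5
w = divide(shift([ONE], h), v, False)[0]     # shinv_5 v = x^5 rquo v
ql = shift(pmul(w, u), -h)                   # u lquo v, one multiplication
qr = shift(pmul(u, w), -h)                   # u rquo v, one multiplication
rl = padd(u, pmul(v, ql), -1)
rr = padd(u, pmul(qr, v), -1)
for name, val in [("shinv", w), ("ql", ql), ("rl", rl), ("qr", qr), ("rr", rr)]:
    print(name, val[::-1])                   # highest-degree coefficient first
```

The printed coefficient lists can be compared with the values displayed above, and the remainders obtained by subtraction should have degree less than 2.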

Taking a larger example where u has degree 100 and v degree 10, \(\text {D.Refine1}\) computes \(\textrm{shinv}_{100} v\) with one guard digit in 6 steps with intermediate values of w all of \(\textrm{prec}\) 92. Methods \(\text {D.Refine2}\) and \(\text {D.Refine3}\) compute the same result also in 6 steps but with values of w having \(\textrm{prec}\) 4, 8, 16, 32, 64, 92 successively. Method \(\text {D.Refine3}\) uses a shorter prefix of v on the first iteration (\(s = 3\)). The Maple code used for this example is given in Fig. 1.

6 Division in \(R[x; \sigma , \delta ]\)

We now examine the more general case where the polynomial variable does not commute with coefficients. For quotients and remainders to be defined, a notion of degree is required and we note that this leads immediately to Ore extensions, or skew polynomials. After touching upon classical algorithms, we introduce the notions of left and right whole shifted inverse. We note that the modified Newton-Schulz iteration may be used to compute whole shifted inverses, though in this case there is no benefit over classical division. Finally, we show how left and right whole shifted inverses may be used to compute right and left quotients, each with only one multiplication.

6.1 Definitions and Classical Algorithms

Consider a ring of objects with elements from a ring R extended by x, with x not necessarily commuting with elements of R. By distributivity, any finite expression in this extended ring is equal to a sum of monomials, the monomials composed of products of elements of R and x. To have a well-defined degree compatible with that of usual polynomials, it is required that

$$\begin{aligned} \forall \,r \in R \; \exists \,a, b, c, d \in R \;\text { s.t. }\; x r - r x = a x + b = x c + d. \end{aligned}$$
(17)

We call the elements of such a ring skew polynomials. Condition (17) implies that for all \(r \in R\) there exist \(\sigma (r), \delta (r) \in R\) such that

$$\begin{aligned} x \, r = \sigma (r)\, x + \delta (r). \end{aligned}$$
(18)

Therefore, to have a well-defined notion of degree, the ring must be an Ore extension, \(R[x; \sigma , \delta ]\). Ore studied these non-commutative polynomials almost a century ago [6] and overviews of Ore extensions in computer algebra are given in [1, 2]. The subject is viewed from a linear algebra perspective in [3] and the complexity of skew arithmetic is studied in [9]. The ring axioms of \(R[x; \sigma , \delta ]\) imply that \(\sigma \) be an endomorphism on R and \(\delta \) be a \(\sigma \)-derivation, i.e. for all \(r, s \in R\)

$$\begin{aligned} \delta (r+s)&=\delta (r)+\delta (s)&\delta (r\cdot s)&= \sigma (r) \cdot \delta (s) + \delta (r)\cdot s. \end{aligned}$$

Different choices of \(\sigma \) and \(\delta \) allow skew polynomials to represent linear differential operators, linear difference operators, q-generalizations of these and other algebraic systems.
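The commutation rule (18) and the \(\sigma \)-derivation property can be made concrete with a small sketch over \(R = F_7[y]\). Two choices are shown: \(\sigma \) the identity with \(\delta = d/dy\) (differential operators), and \(\sigma (r)(y) = r(y+1)\) with \(\delta (r) = r(y+1) - r(y)\) (a difference operator). Skew polynomials are stored with powers of x on the right as lists of coefficients in R. All names and sample values are assumptions for illustration.

```python
P = 7  # R = F_7[y]; elements are coefficient lists in y, lowest degree first

def rtrim(a):
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def radd(a, b, s=1):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a)); b = b + [0] * (n - len(b))
    return rtrim([(x + s * y) % P for x, y in zip(a, b)])

def rmul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return rtrim(out)

def ident(a):                      # sigma for differential operators
    return a

def deriv(a):                      # delta = d/dy
    return rtrim([i * c % P for i, c in enumerate(a)][1:])

def subs_y_plus_1(a):              # sigma: r(y) -> r(y + 1), by Horner's rule
    out = []
    for c in reversed(a):
        out = radd(rmul(out, [1, 1]), [c])
    return out

def fwd_diff(a):                   # delta(r) = r(y + 1) - r(y)
    return radd(subs_y_plus_1(a), a, -1)

def strim(p):
    while p and p[-1] == []:
        p = p[:-1]
    return p

def sadd(a, b):
    n = max(len(a), len(b))
    a = a + [[]] * (n - len(a)); b = b + [[]] * (n - len(b))
    return strim([radd(x, y) for x, y in zip(a, b)])

def x_times(b, sigma, delta):
    """Left-multiply b = sum b_j x^j by x, using x r = sigma(r) x + delta(r)."""
    out = [[] for _ in range(len(b) + 1)]
    for j, c in enumerate(b):
        out[j + 1] = radd(out[j + 1], sigma(c))
        out[j] = radd(out[j], delta(c))
    return strim(out)

def smul(a, b, sigma, delta):
    """Product a*b in R[x; sigma, delta], as the sum of a_i * (x^i b)."""
    out, xb = [], b
    for c in a:
        out = sadd(out, [rmul(c, t) for t in xb])
        xb = x_times(xb, sigma, delta)
    return out

r, s = [2, 3, 1], [5, 0, 4]        # 2 + 3y + y^2 and 5 + 4y^2
X = [[], [1]]                      # the variable x
for sigma, delta in [(ident, deriv), (subs_y_plus_1, fwd_diff)]:
    lhs = delta(rmul(r, s))
    rhs = radd(rmul(sigma(r), delta(s)), rmul(delta(r), s))
    print(lhs == rhs, smul(X, [r], sigma, delta))
```

Each printed product is \(x\,r\) rewritten as \(\delta (r) + \sigma (r)\,x\), listed from the constant coefficient upward.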

Condition (18) implies that it is possible to write any skew polynomial as a sum of monomials with all the powers of x on the right or all on the left. We will use the notation \(u_{i}\) for coefficients of skew polynomials with all powers of the variable on the right and \(\,{}_{i}u\) for coefficients with all powers of the variable on the left, e.g.

$$\begin{aligned} u = \sum _{i = 0}^h u_{i} x^i = \sum _{i=0}^h x^i \,{}_{i}u. \end{aligned}$$

Algorithm 4 gives left and right classical division in \(R[x; \sigma , \delta ]\). As in Sect. 3, \(\times _\pi \) is multiplication with arguments permuted by \(\pi \). When \(\sigma (r) = r\), \(R[x;\sigma , \delta ]\) is a differential ring, usually denoted \(R[x, \delta ]\), and Algorithm 4 specializes to Algorithm 1. The left division algorithm applies only when \(\sigma \) is bijective. If left division is of primary interest, start from \(r x = x \sigma ^*(r) + \delta ^*(r) \) instead of (18) and work in the adjoint ring \(R[x; \sigma ^*, \delta ^*]\).

Some care is needed in Algorithm 4 to avoid duplicating computation. Notice that for \(\text {rskewdiv}\) the application of \(\text {qcoeff}\) on line 6 requires n-fold application of \(\sigma \) to \(\textrm{inv}v_k\) and that the computation of \(t\times _\pi v\) on line 7 is \(\text {coeff}(t)\, x^{i+k} \times v\). The latter requires commuting \(h-k\) powers of x across v over the course of the division. Depending on the cost to compute \(\sigma \), it may be useful to create an array of the values \(\sigma ^i(\textrm{inv}v_k)\) for i from 0 to \(h-k\). It is also possible to pre-compute and store the products \(x^i \times v\), with \(x^{i+1}\times v\) obtained from \(x^i\times v\) by one application of (18). Then the \(x^i\times v\) may be used in descending order in the for loop without re-computation. Both of these pre-computations are performed in the Maple program for P[RDiv] shown in Fig. 2.

6.2 Whole Shift and Inverse in \(R[x; \sigma , \delta ]\)

It is possible to define left and right analogs of the whole shift and whole shifted inverse for skew polynomials. In general, the left and right operations give different values.

Definition 3

(Left and right whole n-shift in \(R[x; \sigma , \delta ]\)). Given \(u \in R[x; \sigma , \delta ]\) and \(n \in \mathbb Z\), the left whole n-shift of u is

$$\begin{aligned} \textrm{lshift}_{n,x}\,u = \sum _{i+n \ge 0} x^{i+n} \,{}_{i}u, \end{aligned}$$

the right whole n-shift of u is

$$\begin{aligned} \textrm{rshift}_{n,x}\,u = \sum _{i+n \ge 0} u_{i} x^{i+n} \end{aligned}$$

When x is clear by context, we write \(\textrm{lshift}_n u\) and \(\textrm{rshift}_n u\).

figure f

Definition 4

(Left and right whole n-shifted inverse in \(R[x; \sigma , \delta ]\)). Given \(n \in \mathbb Z_{\ge 0}\) and \(v \in R[x; \sigma , \delta ]\), the left whole n-shifted inverse of v with respect to x is

$$\begin{aligned} \textrm{lshinv}_{n,x}\,v = x^n \,\textrm{lquo}\, v \end{aligned}$$

the right whole n-shifted inverse of v with respect to x is

$$\begin{aligned} \textrm{rshinv}_{n,x}\,v = x^n\,\textrm{rquo}\,v \end{aligned}$$

When x is clear by context, we write \(\textrm{lshinv}_n\,v\) and \(\textrm{rshinv}_n\,v\).

Modified Newton-Schulz Iteration. For monic \(v \in R[x; \sigma , \delta ]\), the whole shifted inverses may be computed using modified Newton-Schulz iterations with \(g=1\) guard places as follows:

$$\begin{aligned} \begin{aligned} {w_\text {l}}_{(0)}&= {w_\text {r}}_{(0)}= x^{h-k+g} - v_{k-1} x^{h-k-1+g}\\ {w_\text {l}}_{(i+1)}&= {w_\text {l}}_{(i)} + \textrm{rshift}_{-h} \big ( {w_\text {l}}_{(i)} \times (\textrm{rshift}_h\,1 - v \times {w_\text {l}}_{(i)}) \big ), \\ {w_\text {r}}_{(i+1)}&= {w_\text {r}}_{(i)} + \,\textrm{lshift}_{-h} \big ( (\textrm{lshift}_h\,1 - {w_\text {r}}_{(i)} \times v) \times {w_\text {r}}_{(i)} \big ), \\&\textrm{rshift}_{-g}\,{w_\text {l}}_{(i)} \rightarrow \textrm{lshinv}_h\,v\\&\textrm{lshift}_{-g}\,{w_\text {r}}_{(i)} \rightarrow \textrm{rshinv}_h\,v. \end{aligned} \end{aligned}$$
(19)

These generalize \(\text {D.Refine1}\) in Algorithm 3. For \(\text {D.Refine2}\) and \(\text {D.Refine3}\), the shifts that reduce the size of intermediate expressions are combined into one pre- and one post-shift in R[x]. But on \(R[x;\sigma , \delta ]\) we do not expect these simplifications of shift expressions to be legitimate.

Even though (19) can be used to compute whole shifted inverses, it does not give any benefit over classical division. In the special case of \(R[x,\delta ]\), the multiplications by v and then by w cause each iteration to produce only one new correct term, so \(h-k\) iterations are required rather than \(\log _2(h-k)\). In other skew polynomial rings, e.g. linear difference operators, the iteration (19) can still converge, but with multiple iterations required for each degree of the quotient. It is therefore simpler to compute \(\textrm{lshinv}\) and \(\textrm{rshinv}\) by classical division.

6.3 Quotients from Whole Shifted Inverses in \(R[x; \sigma , \delta ]\)

It is possible to compute left and right quotients from the right and left whole shifted inverses in \(R[x;\sigma ,\delta ]\). Although computing whole shifted inverses is not asymptotically fast as it is in R[x], once a whole shifted inverse is obtained it can be used to compute multiple quotients and hence remainders, each requiring only one multiplication. This is useful, e.g., when working with differential ideals. In some cases this multiplication of skew polynomials is asymptotically fast [8].

Theorem 7

(Quotients from whole shifted inverses in \(R[x; \sigma , \delta ]\)). Let \(u, v \in R[x; \sigma , \delta ]\), with R a ring, \(k = \textrm{degree}\,v\), \(h = \textrm{degree}\,u\), and \(v_{k}\) invertible in R. Then

$$\begin{aligned} u\,\textrm{rquo}\,v&= \textrm{rshift}_{-h} ( u \times \textrm{lshinv}_h\,v ) \end{aligned}$$
(20)
$$\begin{aligned} u\,\textrm{lquo}\,v&= \textrm{lshift}_{-h} ( \textrm{rshinv}_h v \times \,u). \end{aligned}$$
(21)

Proof

We first prove (20). For \(h \ge k\), we proceed by induction on \(h-k\). Suppose \(h-k = 0\). Since \( u - (u_{h} \times 1/v_{k}) \times v \) has no term of degree h, we have

$$\begin{aligned} u\,\textrm{rquo}\,v = u_{h} \times 1/ v_{k}. \end{aligned}$$

On the other hand, when \(h=k\), \(\textrm{lshinv}_h\,v = 1/v_{k}\) so

$$\begin{aligned} \textrm{rshift}_{-h}(u\times \textrm{lshinv}_h\,v) = u_{h} \times 1/v_{k} \end{aligned}$$

and (20) holds. For the inductive step, we assume that (20) holds for \(h-k < N\). For \(h-k=N\), let \(u = q \times v + o(x^k)\) and let Q, \(\hat{q}\) and \(\hat{u}\) be given by

$$\begin{aligned} u&= (Q x^{h-k} + \hat{q}) \times v + r, \quad \quad Q \in R,\;\; \hat{q}\in o(x^{h-k}),\;\; r \in o(x^k), \\ \hat{u}&= u - Q x^{h-k} \times v. \end{aligned}$$

With this, \(\hat{u}\) has degree at most \(h-1\). The inductive hypothesis gives \( \hat{u}\,\textrm{rquo}\, v = \textrm{rshift}_{-h}(\hat{u}\times \textrm{lshinv}_h\,v). \) Therefore,

$$\begin{aligned} \hat{u}= u - Q x^{h-k}\times v&= (\hat{u}\,\textrm{rquo}\, v) \times v + \hat{r}, \quad \hat{r}\in o(x^k) \\&= \textrm{rshift}_{-h}(\hat{u}\times \textrm{lshinv}_h\,v) \times v + \hat{r}\\ \Rightarrow \quad u&= \big (\textrm{rshift}_{-h}(\hat{u}\times \textrm{lshinv}_h\,v) + Q x^{h-k}\big ) \times v + \hat{r}\\&= \textrm{rshift}_{-h}(\hat{u}\times \textrm{lshinv}_h\,v + Q x^{2h-k}) \times v + \hat{r}. \end{aligned}$$

From this, we have

$$\begin{aligned} u \,\textrm{rquo}\, v&= \textrm{rshift}_{-h}(\hat{u}\times \textrm{lshinv}_h\,v + Q x^{2h-k}) \\&= \textrm{rshift}_{-h}\big ( (u- Q x^{h-k} \times v) \times \textrm{lshinv}_h\,v + Q x^{2h-k}\big ) \\&= \textrm{rshift}_{-h}\big ( u\times \textrm{lshinv}_h\,v - Q x^{h-k} \times v \times \textrm{lshinv}_h\,v + Q x^{2h-k}\big ) \\&= \textrm{rshift}_{-h}\big ( u\times \textrm{lshinv}_h\,v - Q x^{h-k} \times v \times (x^h \,\textrm{lquo}\, v) + Q x^{2h-k}\big ) \\&= \textrm{rshift}_{-h}\big ( u\times \textrm{lshinv}_h\,v - Q x^{h-k} \times (x^h + o(x^k)) + Q x^{2h-k}\big ) \\&= \textrm{rshift}_{-h}\big ( u\times \textrm{lshinv}_h\,v + Q \times o(x^h) \big ) = \textrm{rshift}_{-h}( u\times \textrm{lshinv}_h\,v). \end{aligned}$$

This completes the inductive step and the proof of (20). Equation (21) is proven as above, mutatis mutandis.    \(\Box \)
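The induction can be cross-checked by a short direct computation. The following is a sketch, not part of the original argument. It assumes, as the proof above does, that \(\textrm{rshift}_{-h}\) is additive, discards every term of degree below h, and maps \(q\,x^h\) to q when the coefficients of q are written on the left. The symbols r and \(\rho \) are introduced here only for illustration: write \(u = q \times v + r\) with \(r \in o(x^k)\), and \(v \times \textrm{lshinv}_h\,v = x^h - \rho \) with \(\rho \in o(x^k)\). Then

$$\begin{aligned} u \times \textrm{lshinv}_h\,v&= q \times v \times \textrm{lshinv}_h\,v + r \times \textrm{lshinv}_h\,v \\&= q \times (x^h - \rho ) + r \times \textrm{lshinv}_h\,v \\&= q\,x^h + o(x^h), \end{aligned}$$

because q and \(\textrm{lshinv}_h\,v\) both have degree at most \(h-k\), so that \(q \times \rho \) and \(r \times \textrm{lshinv}_h\,v\) have degree below h. Applying \(\textrm{rshift}_{-h}\) to both sides gives \(\textrm{rshift}_{-h}(u \times \textrm{lshinv}_h\,v) = q = u\,\textrm{rquo}\,v\), in agreement with (20).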

As in the commutative case, it may be more efficient to compute only the required top part of the product in (20) and (21) rather than to compute the whole product and then shift by \(-h\).

7 Skew Polynomial Examples

7.1 Differential Operators

We take \(F_{7}[y, \partial _y]\) as a first example of using whole shifted inverses to compute quotients of skew polynomials. We use Algorithm 4 to compute the left and right whole shifted inverses, and then Theorem 7 to obtain the quotients. We start with u and v

$$\begin{aligned} u&= (3y+6) \partial _y^5 + (3y+1) \partial _y^4 + 6 y \partial _y^3 + 4 y \partial _y^2 + (2y+1) \partial _y + (2y+5) \\ v&= 4 \partial _y^2 + (2y+ 5) \partial _y + (4 y + 6). \end{aligned}$$

The whole shifted inverses \(\textrm{lshinv}_5\,v = \partial _y^5\,\textrm{lquo}\,v\) and \(\textrm{rshinv}_5\,v = \partial _y^5\,\textrm{rquo}\,v\) are computed by Algorithm 4:

$$\begin{aligned} \textrm{lshinv}_5&= 2 \partial _y^3 + (6y+1) \partial _y^2 + (4 y^2 + 4 y + 3) \partial _y + (5 y^3 + y^2 + 3 y + 2) \\ \textrm{rshinv}_5&= 2 \partial _y^3 + (6y+ 1) \partial _y^2 + (4 y^2 + 4 y + 5) \partial _y + (5 y^3 + y^2 + y + 1). \end{aligned}$$

Then \(q_\text {l}= \textrm{lshift}_{-5}(\textrm{rshinv}_5 v \times u)\) and \(q_\text {r}= \textrm{rshift}_{-5}(u \times \textrm{lshinv}_5\,v)\) so

[Figure g: the resulting quotients \(q_\text {l}\) and \(q_\text {r}\)]

A proof-of-concept Maple implementation for generic skew polynomials is given in Fig. 2. The program is intended to resolve any ambiguities and makes no serious attempt at efficiency. The setup for the above example is

[Figure h: Maple setup for the differential operator example]
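For readers without Maple, the same computation can be sketched in Python. This is an independent illustration, not the code of Fig. 2, and it covers only this example's ring \(F_7[y][\partial _y; 1, d/dy]\). All names (`smul`, `quo`, `shinv`, `rshift_neg`, `lshift_neg`) are hypothetical. It assumes that skew polynomials are stored with coefficients on the left, that \(\textrm{rshift}_{-h}\) drops the terms of degree below h in that form, and that \(\textrm{lshift}_{-h}\,w\) can be obtained as the left quotient of w by \(\partial _y^h\). Classical division stands in for Algorithm 4.

```python
P = 7  # coefficient ring F_7[y]; a polynomial in y is a list of ints, lowest degree first

def trim(a):
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def padd(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)])

def pmul(a, b):
    out = [0] * max(len(a) + len(b) - 1, 0)
    for i, x in enumerate(a):
        for j, z in enumerate(b):
            out[i + j] += x * z
    return trim(out)

def pdiff(a):
    return trim([i * c for i, c in enumerate(a)][1:])

# A skew polynomial sum_i a_i D^i (coefficients on the left) is the list [a_0, a_1, ...].
def strim(u):
    u = [trim(c) for c in u]
    while u and not u[-1]:
        u.pop()
    return u

def sadd(u, w, sign=1):
    """u + sign * w."""
    coeff = lambda s, i: s[i] if i < len(s) else []
    n = max(len(u), len(w))
    return strim([padd(coeff(u, i), [sign * c for c in coeff(w, i)]) for i in range(n)])

def d_times(w):
    """D * w, using the rule D * b = b * D + b' (sigma = identity, delta = d/dy)."""
    out = [[] for _ in range(len(w) + 1)]
    for j, b in enumerate(w):
        out[j + 1] = padd(out[j + 1], b)
        out[j] = padd(out[j], pdiff(b))
    return strim(out)

def smul(u, w):
    """u * w = sum_i a_i * (D^i * w)."""
    acc, t = [], w
    for a in u:
        acc = sadd(acc, [pmul(a, c) for c in t])
        t = d_times(t)
    return acc

def quo(u, v, side):
    """Classical division: side 'r' gives q with u = q*v + r, side 'l' gives q with
    u = v*q + r, where deg r < deg v. Assumes v's leading coefficient is a nonzero constant."""
    k, inv = len(v) - 1, [pow(v[-1][0], P - 2, P)]
    q, r = [], strim(u)
    while len(r) - 1 >= k:
        t = [[]] * (len(r) - 1 - k) + [pmul(r[-1], inv)]
        q = sadd(q, t)
        r = sadd(r, smul(t, v) if side == 'r' else smul(v, t), -1)
    return q

def x_pow(h):
    return [[]] * h + [[1]]

def shinv(v, h, side):
    return quo(x_pow(h), v, side)        # 'l': D^h lquo v, 'r': D^h rquo v

def rshift_neg(w, h):
    return strim(w[h:])                  # drop terms of degree < h, coefficients on the left

def lshift_neg(w, h):
    return quo(w, x_pow(h), 'l')         # w = D^h * q + (terms of degree < h)

# u and v from the example, lowest power of D first; e.g. [6, 4] is 4y + 6
u = [[5, 2], [1, 2], [0, 4], [0, 6], [1, 3], [6, 3]]
v = [[6, 4], [5, 2], [4]]
h = len(u) - 1

q_r = rshift_neg(smul(u, shinv(v, h, 'l')), h)   # equation (20)
q_l = lshift_neg(smul(shinv(v, h, 'r'), u), h)   # equation (21)
assert q_r == quo(u, v, 'r') and q_l == quo(u, v, 'l')
```

The final assertion compares the quotients obtained from the whole shifted inverses with those from classical division. The values returned by `shinv(v, 5, 'l')` and `shinv(v, 5, 'r')` can be compared directly with \(\textrm{lshinv}_5\) and \(\textrm{rshinv}_5\) displayed above. Once the two inverses are stored, each further quotient by the same v costs one call to `smul` and one shift.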

7.2 Difference Operators

We use linear ordinary difference operators as a second example, this time with \(\sigma \) not being the identity. We construct \(F_7[y, \varDelta _y]\) as \(F_7[y][\varDelta _y; E, E - 1]\). As before, we use Algorithm 4 to compute the left and right whole shifted inverses, and then Theorem 7 to obtain the quotients. We take u and v to be

$$\begin{aligned} u&= y \varDelta _y^5 + (3 y + 6) \varDelta _y^4 + (6 y + 5) \varDelta _y^3 + 3 y \varDelta _y^2 + (2 y + 1) \varDelta _y + 5 y \\ v&= 4 \varDelta _y^2 + (6 y + 1) \varDelta _y + (6 y + 6). \end{aligned}$$

The whole shifted inverses \(\textrm{lshinv}_5\,v = \varDelta _y^5\,\textrm{lquo}\,v\) and \(\textrm{rshinv}_5\,v = \varDelta _y^5\,\textrm{rquo}\,v\) are computed by Algorithm 4:

$$\begin{aligned} \textrm{lshinv}_5&= 2 \varDelta _y^3 + (4 y + 2) \varDelta _y^2 + (y^2 + 4 y) \varDelta _y + (2 y^3 + 6 y^2 + y) \\ \textrm{rshinv}_5&= 2 \varDelta _y^3 + (4y + 1) \varDelta _y^2 + (y^2 + 2) \varDelta _y + (2 y^3 + y^2 + 4 y + 1). \end{aligned}$$

Then \(q_\text {l}= \textrm{lshift}_{-5}(\textrm{rshinv}_5 v \times u)\) and \(q_\text {r}= \textrm{rshift}_{-5}(u \times \textrm{lshinv}_5\,v)\) so

[Figure i: the resulting quotients \(q_\text {l}\) and \(q_\text {r}\)]

The Maple setup for this example is

[Figure j: Maple setup for the difference operator example]
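The construction \(F_7[y][\varDelta _y; E, E - 1]\) used in this example rests on the commutation rule \(\varDelta _y\, r = E(r)\,\varDelta _y + (E-1)(r)\) for \(r \in F_7[y]\). The following Python sketch is an illustration only, not the Maple setup above; the helper names are hypothetical. It checks the rule by letting both sides act on a sample polynomial f, where E is taken to be the shift \(y \mapsto y+1\) and \(\varDelta _y\) acts as \(E - 1\).

```python
P = 7  # work in F_7[y]; a polynomial is a list of coefficients, lowest degree first

def trim(a):
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def add(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)])

def sub(a, b):
    return add(a, [-c for c in b])

def mul(a, b):
    out = [0] * max(len(a) + len(b) - 1, 0)
    for i, x in enumerate(a):
        for j, z in enumerate(b):
            out[i + j] += x * z
    return trim(out)

def E(a):
    """sigma: the shift y -> y + 1, evaluated by Horner's rule."""
    out = []
    for c in reversed(a):
        out = add(mul(out, [1, 1]), [c])
    return out

def delta(a):
    """delta = E - 1, the forward difference."""
    return sub(E(a), a)

r = [1, 6]          # 6y + 1, a sample coefficient
f = [5, 0, 3, 2]    # 2y^3 + 3y^2 + 5, a sample polynomial to act on

lhs = delta(mul(r, f))                              # Delta applied to r*f
rhs = add(mul(E(r), delta(f)), mul(delta(r), f))    # (E(r) Delta + delta(r)) applied to f
assert lhs == rhs
```

Because \(\sigma = E\) is not the identity here, left division must also undo the shift: cancelling a leading coefficient a against \(v \times c\,\varDelta _y^j\) requires \(v_k\,\sigma ^k(c) = a\), so c is obtained by applying \(E^{-k}\), the shift \(y \mapsto y - k\), to \(a / v_k\).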

7.3 Difference Operators with Matrix Coefficients

As a final example, we take quotients in \(F_7^{2\times 2}[y, \varDelta _y]\) to underscore the genericity of this method.

[Figures k, l, m and n]

The Maple setup for this example is the same as for the previous example but with F := SquareMatrix(2, GaloisField(7)).

Fig. 1. Maple code for fast generic polynomial \(\textrm{shinv}\) and left and right division

Fig. 2. Maple code for generic skew polynomials

8 Conclusions

We have extended earlier work on efficient computation of quotients in a generic setting to the case of non-commutative univariate polynomial rings. We have shown that when the polynomial variable commutes with the coefficients, the whole shift and whole shifted inverse are well-defined and may be used to compute left and right quotients. The whole shifted inverse may be computed by a modified Newton method in exactly the same way as when the coefficients are commutative, and the number of iterations is logarithmic in the degree of the result. When the polynomial variable does not commute with the coefficients, left and right whole shifted inverses exist and may be computed by classical division. Once a left or right whole shifted inverse is obtained, several right or left quotients with that divisor may be computed, each with a single multiplication.