Keywords

1 Introduction

One step in the evolution of a numeric abstract domain is when the domain, originally conceived for idealised, arbitrary-precision arithmetic, is adapted to machine arithmetic to better suit its working environment. This adaption is more often a leap than a step because the domain operations typically need to be fundamentally reimagined to model modular arithmetic. It has taken more than two decades for each of the classical abstract domains of ranges [7, 15], difference constraints [9] and linear equalities [20], to be adjusted to a modular setting, as realised in, respectively, sign agnostic range analysis [11], modular differences [12] and linear equalities modulo a power of two [27]. The tenor of these works is that operating over modular integers is not a restriction, but rather the natural domain for deriving invariants over fixed-width integers, which are the norm in mainstream programming languages.

Modular Polynomial Abstract Domain. For inferring polynomial invariants, one might be forgiven for considering the additional complexity of modular arithmetic to be an irritation, justified only by the desire to faithfully model machine integers and avoid missing invariants. In this paper we challenge this view by demonstrating how Modular Polynomial Abstract Domain (MPAD) can simplify the discovery of polynomial equalities. Contrary to non-modular approaches [4, 8, 17, 22, 23, 25, 29,30,31], MPAD is a finite lattice. The finiteness of modular polynomials has been observed before [33], and exploited in a backwards analysis [33] over programs equipped with polynomial assignment, non-deterministic assignment and negative guards (discussed in Sect. 7). Our work takes modular polynomials in a new direction, literally forwards, enabling MPAD to be combined [5] with classic numeric domains [24]. Like [33], MPAD obviates the need to specify the shape of an invariant up-front (in a template) [29, 31], or limit the syntactic form of the program [17, 18, 23], or drop high-degree polynomials [30].

Closure of Modular Systems. Fundamental to MPAD is the concept of a closed polynomial system. A system of polynomials is closed if it cannot be further augmented with polynomials without restricting its solution set. Ensuring a closed representation is essential to ensure that entailment of a given constraint can always be checked. Morever, mirroring a construction used for the Octagon domain [24], we demonstrate that join and projection can be calculated, without omitting polynomials that actually hold, when they are applied to closed systems. It follows that MPAD can infer all modular polynomial invariants for programs with polynomial and non-deterministic assignments, and non-deterministic branching. Though preserved by join and projection, closedness is lost when intersecting two polynomial systems to compute their meet. To resolve this, we present a divide-and-conquer algorithm for computing closure, thus ensuring a closed representation throughout the analysis.

Expressiveness. MPAD can model positive and negative polynomial guards, that is, \({\textsf {assume}}~(p = 0)\) and \({\textsf {assume}}~(p \ne 0)\) statements where p is a polynomial. Support for negative guards is a direct consequence of working with fixed-width integers: an integer assumes a non-zero value if and only if one of its bits is set, a property that can be expressed in MPAD. The finiteness of MPAD also allows a best transformer [28] to be mechanically calculated. For instance, the best transformer for the 3-bit bitvector operation is the system S:

figure b

where each polynomial \(p \in S\) is satisfied by every assignment of the form \(x, y \in \{ 0, \ldots , 7 \}\) and \( z = (x \& y) \mod 8\). This, and other best abstractions, can only be calculated [28] because MPAD satisfies the ascending chain condition.

Contributions. To summarise, this paper makes the following contributions:

  • We introduce closure for MPAD, showing that it is preserved by join and projection but must be re-established after meet to retain all invariants;

  • We present a divide-and-conquer algorithm for computing closure, introducing reductions and shortcuts that simplify its calculation;

  • We show how redundant calculation can be removed from the algorithm (of Buchberger for modular polynomials [2]) which sits behind closure;

  • We show that using MPAD in forwards analysis can derive invariants that cannot be derived with existing domains (because of its support for guards).

Roadmap. Section 2 introduces MPAD, providing the minimum of detail for following the example of Sect. 6.5. The domain operations of MPAD are built atop of Gröbner bases, which are introduced in Sect. 3. (The detail of Sect. 3.5 can be skipped on first reading since it is only necessary for Sect. 6.6.) Sect. 4 explains how join can be calculated in terms of variable elimination and Gröbner bases. Section 5 introduces covers of polynomial systems, providing an algorithm for computing them. It also shows how meet can be reduced to closure. Section 6 presents correctness and precision results for MPAD over polynomial programs, and concludes with an illustrative example. Section 7 reviews related work and Sect. 8 concludes.

2 Modular Polynomial Abstract Domain

This section abstractly specifies MPAD, and its domain operations, with minimal mathematical machinery. The problems of how to finitely represent the elements of MPAD and compute meet, join and projection are deferred to later sections.

2.1 Modular Arithmetic

Let \(\omega \ge 1\), \(m = 2^\omega \) and \(\mathbb {Z}_m = \{0, \ldots , m - 1\}\) be an abstraction of machine arithmetic over \(\omega \)-bit integers [26, 27]. The relation \(\equiv _m \subseteq \mathbb {Z}\times \mathbb {Z}\) is defined by \(x \equiv _m y\) if there exists \(k \in \mathbb {Z}\) such that \(x - y = km\). Atop, the operation \(\cdot \; (\bmod \; m): \mathbb {Z}\rightarrow \mathbb {Z}_m\) is defined \(x\; (\bmod \; m) = y\) where \(y \in \mathbb {Z}_m\) uniquely satisfies \(x \equiv _m y\). The unary operation \(-:\mathbb {Z}_m \rightarrow \mathbb {Z}_m\) and the dyadic operations \(+, \cdot :\mathbb {Z}_m \times \mathbb {Z}_m \rightarrow \mathbb {Z}_m\) are then defined: \(-x = (\hat{-}x)\; (\bmod \; m)\), \(x + y = (x\; \hat{+}\; y)\; (\bmod \; m)\) and \(x \cdot y = (x\; \hat{\cdot }\; y)\; (\bmod \; m)\) where \(\hat{-}, \hat{+}, \hat{\cdot }\) denote the classical operations over \(\mathbb {Z}\). If \(x \in \mathbb {Z}_m\) then \(y \in \mathbb {Z}_m\) is a multiplicative inverse of x if \(x\cdot y = 1\). Note that \(x \in \mathbb {Z}_m\) has a multiplicative inverse iff it is odd, in which case the inverse is unique. In particular, if \(\omega > 1\) then \(\mathbb {Z}_m\) is not a field, since 2 has no multiplicative inverse.

2.2 Polynomials

Let \(\textbf{x} = \langle x_1, \ldots , x_d \rangle \) be a vector of variables. A monomial over \(\textbf{x}\) is an expression \(\textbf{x}^{\mathbf {\alpha }} = x_1^{\alpha _1}\cdots x_d^{\alpha _d}\) where \(\mathbf {\alpha } = \langle \alpha _1, \ldots , \alpha _d \rangle \in \mathbb {N}^d\). A term over \(\textbf{x}\) is an expression \(t = c\textbf{x}^{\mathbf {\alpha }}\) where \(c \in \mathbb {Z}_m\) and \(\textbf{x}^{\mathbf {\alpha }}\) is a monomial. A polynomial over \(\textbf{x}\) is an expression \(t_1 + \cdots + t_s\) where each \(t_i\) is a term over \(\textbf{x}\), the case \(s = 0\) corresponding to the 0 polynomial. The set of polynomials over \(\textbf{x}\) is denoted \(\mathbb {Z}_m[\textbf{x}]\).

A polynomial \(p = t_1 + \cdots + t_s\) is normalised if either \(s = 0\) or else for all \(t_i= c_i\textbf{x}^{\mathbf {\alpha }_i}\) and \(t_j = c_j\textbf{x}^{\mathbf {\alpha }_j}\) it holds that \(c_i \ne 0\) and if \(i \ne j\) then \(\mathbf {\alpha }_i \ne \mathbf {\alpha }_j\). By repeatedly combining the coefficients of terms with equal monomials, and deleting terms with coefficient 0, a polynomial can be transformed into a normalised form. Two polynomials are considered equal if they have equal normal forms, up to the ordering of terms. If \(c\textbf{x}^{\mathbf {\alpha }}\) is a term then \(\textsf{vars}(c\textbf{x}^{\mathbf {\alpha }}) = \{ x_i \mid \alpha _i > 0 \}\), which is extended to polynomials by \(\textsf{vars}(p) = \bigcup _{t \in p} \textsf{vars}(t)\).

For \(p \in \mathbb {Z}_m[\textbf{x}]\), \(d = |\textbf{x}|\) and \(\textbf{a} \in \mathbb {Z}_m^d\) let \(\llbracket p \rrbracket _{ \textbf{x}}(\textbf{a})\) denote evaluating p at \(\textbf{a}\) by substituting each \(a_i\) for \(x_i\) in p, and calculating the resulting arithmetical expression. Through this definition, a set of polynomials in \(\mathbb {Z}_m[\textbf{x}]\) is a symbolic description of a set of points, interpreted by \(\gamma _{\textbf{x}}\) as follows:

Definition 1

The concretisation map \(\gamma _{\textbf{x}}: \wp (\mathbb {Z}_m[\textbf{x}]) \rightarrow \wp (\mathbb {Z}_m^d)\) where \(d = |\textbf{x}|\) is defined: \( \gamma _{\textbf{x}}(P) = \{ \textbf{a} \in \mathbb {Z}_m^d \mid \llbracket p \rrbracket _{ \textbf{x}}(\textbf{a}) = 0\ \text {for all}\ p \in P \} \)

The set of points \(\gamma _{\textbf{x}}(P)\) is the solution (or zero) set of the set P of polynomials over \(\textbf{x}\). For a single \(p \in \mathbb {Z}_m[\textbf{x}]\), let \(\gamma _{\textbf{x}}(p) = \gamma _{\textbf{x}}(\{p\})\).

Example 1

Let \(\textbf{x} = \langle x, y \rangle \) and \(Q_1, Q_2 \subseteq \mathbb {Z}_{256}[\textbf{x}]\) where

figure c

The solutions sets \(\gamma _{\textbf{x}}(Q_1)\) and \(\gamma _{\textbf{x}}(Q_2)\) are plotted as points in \([0,255]^2\) in Fig. 1(a) and Fig. 1(b) respectively. Here, the grid lines represent increments of 32. Although \(Q_1\) is linear it has 4 solutions, namely (31, 28), (95, 28), (159, 28) and (223, 28), because \(31\cdot 4 \equiv _{256} 95\cdot 4 \equiv _{256} 159\cdot 4 \equiv _{256} 223\cdot 4 \equiv _{256} 124 \equiv _{256} -132\).

Fig. 1.
figure 1

Dyadic join with and without closure

2.3 Closure

Suppose \(P \subseteq \mathbb {Z}_m[\textbf{x}]\), \(p \in \mathbb {Z}_m[\textbf{x}]\) and \(\gamma _{\textbf{x}}(P) \subseteq \gamma _{\textbf{x}}(p)\). Then \(\gamma _{\textbf{x}}(P \cup \{p\}) = \gamma _{\textbf{x}}(P)\), thus P can be augmented with p without restricting its solution set. This is the intuition behind the following definition:

Definition 2

The operator \(\uparrow _{\textbf{x}} : \wp (\mathbb {Z}_m[\textbf{x}]) \rightarrow \wp (\mathbb {Z}_m[\textbf{x}])\) is defined by: \( \uparrow _{\textbf{x}} P = \{ p \in \mathbb {Z}_m[\textbf{x}] \mid \gamma _{\textbf{x}}(P) \subseteq \gamma _{\textbf{x}}(p) \} \)

The following result collects fundamental properties of \(\uparrow _{\textbf{x}}\). The first three together imply that \(\uparrow _{\textbf{x}}\) is a closure operator on \(\langle \wp (\mathbb {Z}_m[\textbf{x}]), \subseteq \rangle \). The fourth implies that \(\uparrow _{\textbf{x}}\) constructs a canonical representation of a system of polynomials. The fifth shows that the canonical representation preserves the solution set.

Proposition 1

The operator \(\uparrow _{\textbf{x}}\) satisfies the following: (1) \(P \subseteq {\uparrow _{\textbf{x}} P}\) (extensive); (2) if \(P_1 \subseteq P_2\) then \({\uparrow _{\textbf{x}} P_1} \subseteq {\uparrow _{\textbf{x}} P_2}\) (monotonic); (3) \({\uparrow _{\textbf{x}} {\uparrow _{\textbf{x}} P}} = {\uparrow _{\textbf{x}} P}\) (idempotent); (4) \(\gamma _{\textbf{x}}(P_1) = \gamma _{\textbf{x}}(P_2)\) iff \({\uparrow _{\textbf{x}} P_1} = {\uparrow _{\textbf{x}} P_2}\); (5) \(\gamma _{\textbf{x}}({\uparrow _{\textbf{x}} P}) = \gamma _{\textbf{x}}(P)\).

The closure operator \(\uparrow _{\textbf{x}}\) yields a canonical representation of a given set of polynomials, yet the representation is not finite. The concept of a basis is introduced, which serves as the starting point for a compact, finite representation:

Definition 3

If \(B \subseteq \mathbb {Z}_m[\textbf{x}]\) then \( \langle B \rangle _{ \textbf{x}} \,{=}\, \left\{ \left. \sum _{i=1}^{s} u_ip_i \right| s \in \mathbb {N}, p_i \,{\in }\, B, u_i \,{\in }\, \mathbb {Z}_m[\textbf{x}] \right\} \)

The set of polynomials \(\langle B \rangle _{ \textbf{x}}\) is an ideal in that it is closed under addition with a polynomial from B and multiplication with an arbitrary polynomial (not necessarily drawn from B). The ideal \(\langle B \rangle _{ \textbf{x}}\) is said to be generated by B, which is called the basis. In particular the solutions of \(\langle B \rangle _{ \textbf{x}}\) are those of B itself and the sets found by applying closure are ideals themselves:

Lemma 1

(1) \(\gamma _{\textbf{x}}(\langle B \rangle _{ \textbf{x}}) = \gamma _{\textbf{x}}(B)\) and (2) If \(P = {\uparrow _{\textbf{x}} P}\) then \(P = \langle P \rangle _{ \textbf{x}}\).

Generating P from itself is enough to show that P is an ideal, but does not provide the necessary finite representation. However, it has long been known that ideals of polynomials admit a finite basis [16], at least for polynomials whose coefficients are drawn from a field. This classical result, which can be interpreted as a statement on representation, adapts naturally to the setting of polynomials over modular integers, as will be explained in Sect. 3.

Example 2

Returning to Example 1, \({\uparrow _{\textbf{x}} Q_1}\) and \({\uparrow _{\textbf{x}} Q_2}\) admit the finite representations \({\uparrow _{\textbf{x}} Q_1} = \langle Q'_1 \rangle _{ \textbf{x}}\) and \({\uparrow _{\textbf{x}} Q_2} = \langle Q_2 \rangle _{ \textbf{x}}\) where \(Q'_1 = \{ x^2 + 2x + 1, 4x + 132, y + 228 \}\). Observe \(31^2 + 2\cdot 31 + 1 = 1024 \equiv _{256} 0\). Similarly it follows \(\gamma _{\textbf{x}}(x^2 + 2x + 1)\) \(\supseteq \) \(\{ (31, y), (95, y), (159, y), (233, y) \mid y \in \mathbb {Z}_{256} \}\). Thus \(x^2 + 2x + 1 \in {\uparrow _{\textbf{x}}} Q_1\). However, \(x^2 + 2x + 1 \not \in \langle Q_1 \rangle _{ \textbf{x}}\). To see this, consider the expansion of the polynomial \(p(4x + 132) + q(y + 228) = 4(xp + 33p + 57q) + yq\). Observe that any term t occurring in this polynomial that is independent of y must be a term of \(4(xp + 33p + 57q)\). But then, the coefficient of t must be a multiple of 4. In particular, there cannot exist pq for which \(x^2 + 2x + 1 = p(4x + 132) + q(y + 228)\), since \(x^2\) (and in fact 2x and 1 as well) is independent of y but has coefficient 1. Hence \(Q_1\) must be enlarged to obtain a basis for \({\uparrow _{\textbf{x}} Q_1}\).

2.4 MPAD

The closure operator characterises the elements of our abstract domain:

Definition 4

\(\textsf {MPAD}_{m}[\textbf{x}] = \{ P \subseteq \mathbb {Z}_m[\textbf{x}] \mid {\uparrow _{\textbf{x}} P} = P \}\)

Elements of \(\textsf {MPAD}_{m}[\textbf{x}]\) are said to be closed. If \(P_1 \subseteq P_2\) then \(\gamma _{\textbf{x}}(P_1) \supseteq \gamma _{\textbf{x}}(P_2)\) thus to align with \(\langle \wp (\mathbb {Z}_m^d), \subseteq \rangle \) the domain \(\textsf {MPAD}_{m}[ \textbf{x}]\) adopts superset ordering:

Proposition 2

\(\langle \textsf {MPAD}_{m}[ \textbf{x}], \sqsubseteq , \bot , \top , \sqcap , \sqcup \rangle \) is a finite lattice, where

$$ \sqsubseteq \; =\; \supseteq \quad \bot = \mathbb {Z}_m[\textbf{x}] \quad \top = {\uparrow _{\textbf{x}} \emptyset } \quad P_1 \sqcap P_2 = {\uparrow _{\textbf{x}} (P_1 \cup P_2)} \quad P_1 \sqcup P_2 = P_1 \cap P_2 $$

Join and meet are specified set theoretically rather than algorithmically. Observe too that MPAD is finite even though there are no bounds, a priori, put on the degree of any polynomial. This follows from the finiteness of \(\mathbb {Z}_m\) and the closure construction that underlies MPAD. To observe this, consider the function space F = \(\{ \llbracket p \rrbracket _{\textbf{x}} \mid p \in \mathbb {Z}_m[\textbf{x}] \} \subseteq \mathbb {Z}_m^d \rightarrow \mathbb {Z}_m\). Since the space \(\mathbb {Z}_m^d \rightarrow \mathbb {Z}_m\) is finite there exists \(p_1, \ldots , p_{\ell } \in \mathbb {Z}_m[\textbf{x}]\) such that \(F = \{ \llbracket p_i \rrbracket _{ \textbf{x}} \mid i \in [1, \ell ] \}\). To see how F determines the structure of \(\textsf {MPAD}_{m}[ \textbf{x}]\), define \(p \equiv q\) iff \(\llbracket q \rrbracket _{ \textbf{x}}(\textbf{a}) = \llbracket p \rrbracket _{ \textbf{x}}(\textbf{a})\) for all \(\textbf{a} \in \mathbb {Z}_m^d\). Let \(P \in \textsf {MPAD}_{m}[ \textbf{x}]\) and \(p \in P\). Observe \(p \equiv p_i\) for some \(i \in [1, \ell ]\) and \(\gamma _{\textbf{x}}(P) \subseteq \gamma _{\textbf{x}}(p) = \gamma _{\textbf{x}}(p_i)\) hence \(p_i \in P\). Conversely, if \(p_j \in P\) and \(p_j \equiv q\) then \(q \in P\). Therefore there exists \(I \subseteq [1, \ell \)] such that \(P = \{ q \in \mathbb {Z}_m[\textbf{x}] \mid q \equiv p_i, i \in I \}\). Thus \(\textsf {MPAD}_{m}[ \textbf{x}]\) only has a finite number of elements.

Example 3

Continuing from Example 2, let

figure d

Then, \(\langle Q'_1 \rangle _{ \textbf{x}} \cap \langle Q_2 \rangle _{ \textbf{x}} = \langle Q' \rangle _{ \textbf{x}}\) and \(\langle Q_1 \rangle _{ \textbf{x}} \cap \langle Q_2 \rangle _{ \textbf{x}} = \langle Q \rangle _{ \textbf{x}}\). Again, we defer the discussion of how Q and \(Q'\) are calculated. Observe from Figs. 1(a), 1(b) and 1(c) that \(\gamma _{\textbf{x}}(\langle Q'_1 \rangle _{ \textbf{x}}) \cup \gamma _{\textbf{x}}(\langle Q_2 \rangle _{ \textbf{x}}) \subseteq \gamma _{\textbf{x}}(\langle Q' \rangle _{ \textbf{x}})\) as required, the diamond points indicating those introduced by join itself. The diamonds in Fig. 1(d) are extraneous points introduced by calculating \(\langle Q_1 \rangle _{ \textbf{x}} \cap \langle Q_2 \rangle _{ \textbf{x}}\) rather than \(\langle Q'_1 \rangle _{ \textbf{x}} \cap \langle Q_2 \rangle _{ \textbf{x}}\). This illustrates that operating on arbitrary bases is not generally sufficient to maintain precision, thus motivating the need for closure.

Finally, the following result asserts that MPAD enjoys mathematical properties that simplify the application of abstract interpretation:

Proposition 3

\(\langle \wp (\mathbb {Z}_m^d), \subseteq \rangle \begin{array}{c} \alpha _{\textbf{x}} \\ [-0.75ex] \rightleftharpoons \\ [-1.25ex] \gamma _{\textbf{x}} \end{array} \langle \textsf {MPAD}_{m}[\textbf{x}], \sqsubseteq \rangle \) is a Galois insertion, where \( \alpha _{\textbf{x}}(A) = \{ p \in \mathbb {Z}_m[\textbf{x}] \mid A \subseteq \gamma _{\textbf{x}}(p) \} \)

2.5 Null Polynomials

Recall \(\top = {\uparrow _{\textbf{x}} \emptyset } = \{ p \in \mathbb {Z}_m[\textbf{x}] \mid \gamma _{\textbf{x}}(\emptyset ) \subseteq \gamma _{\textbf{x}}(p) \}\). It follows \(\top = \{ p \in \mathbb {Z}_m[\textbf{x}] \mid \forall \textbf{a}\in \mathbb {Z}_m^d . \llbracket p \rrbracket _{ \textbf{x}}(\textbf{a}) = 0 \}\) because \(\gamma _{\textbf{x}}(\emptyset ) = \mathbb {Z}_m^d\). Such polynomials are referred to as vanishing or null polynomials [14] and represent universally valid constraints.

Example 4

Let \(\textbf{x} = \langle x, y \rangle \). Then in \(\mathbb {Z}_{16}[\textbf{x}]\), \(\top = \langle N \rangle _{\textbf{x}}\) where

figure e

The \(p_i\) annotations are for future reference (Example 22).

Somewhat surprisingly an algorithm exists for finitely enumerating the set of null polynomials, hence computing \(\top \), for any given number of variables and bit-width [14, Theorem 3.3]. It is tempting to remove null polynomials from bases, since they are vacuous as constraints. Unfortunately, this is not generally possible without sacrificing the canonical representation property of closure.

3 Gröbner Bases

This section provides a primer on Gröbner bases over modulo integers.

3.1 Rank and Divisibility in \(\mathbb {Z}_m\)

Let \(\vert \subseteq \mathbb {Z}^2\) denote the divisibility relation over integers: \(a\, \vert \, b\) iff b is divisible by a. The rank [26] of \(a \in \mathbb {Z}_m\) is defined: \(\textsf {rank}_{\omega }(a) = \max \{ j \in \mathbb {N}\mid 2^j \; \vert \;a\}\) if \(a > 0\) otherwise \(\omega \), and can be computed by counting the number of trailing zeros in the binary representation of a [34].

Example 5

In \(\mathbb {Z}_{256}\) where \(\omega = 8\), \(\textsf {rank}_{8}(0) = 8\), \(\textsf {rank}_{8}(15) = 0\) and \(\textsf {rank}_{8}(56) = 3\).

If \(a \in \mathbb {Z}_m\) then \(a = 2^{\textsf {rank}_{\omega }(a)}d\) for some odd d. If \(a \ne 0\) then \(d = a/2^{\textsf {rank}_{\omega }(a)}\) is unique and the expression \(2^{\textsf {rank}_{\omega }(a)}d\) is referred to as the rank decomposition of a. For completeness, we declare \(0 = 2^{\omega }\cdot 1\) be the rank decomposition of 0.

Example 6

In \(\mathbb {Z}_{256}\), \(0 = 2^{8} \cdot 1\), \(15 = 2^0 \cdot 15\) and \(56 = 2^3 \cdot 7\) are rank decompositions.

For \(a_1 \in \mathbb {Z}_m\) and \(a_2 \in \mathbb {Z}_m \setminus \{0\}\), \(a_1\) is divisible by \(a_2\) if \(a_1 = ba_2\) for some divisor \(b \in \mathbb {Z}_m\). This occurs iff \(\textsf {rank}_{\omega }(a_1) \ge \textsf {rank}_{\omega }(a_2)\), in which case, if \(a_i = 2^{k_i}d_i\) is the rank decomposition of each \(a_i\), then \(b = 2^{k_1 - k_2}d_1{d_2}^{-1}\) where \({d_2}^{-1}\) is the multiplicative inverse of \(d_2\) (which exists since \(d_2\) is odd).

3.2 Monomial Orderings

Gröbner bases are founded on the concept of reduction, which simplifies a polynomial with respect to a set of polynomials. To define reduction it is necessary to order the terms in a polynomial, leading to the concept of monomial ordering:

Definition 5

A total order \(\prec \) over monomials \(\textbf{x}^{\mathbf {\alpha }}\) is a monomial ordering if: (1) \(1 \prec \textbf{x}^{\mathbf {\alpha }}\) for all \(\mathbf {\alpha } > \textbf{0}\) and (2) if \(\textbf{x}^{\mathbf {\alpha }_1} \prec \textbf{x}^{\mathbf {\alpha }_2}\) then \(\textbf{x}^{\mathbf {\alpha }_1}\textbf{x}^{\mathbf {\beta }} \prec \textbf{x}^{\mathbf {\alpha }_2}\textbf{x}^{\mathbf {\beta }}\) for all \(\textbf{x}^{\mathbf {\alpha }_1}\), \(\textbf{x}^{\mathbf {\alpha _2}}\) and \(\textbf{x}^{\mathbf {\beta }}\).

If \(\prec \) is a monomial ordering then \(\preceq \) will denote its non-strict version. Note that monomial orderings are well-orderings, hence there is no infinite decreasing chain \(\textbf{x}^{\mathbf {\alpha }_1} \succ \textbf{x}^{\mathbf {\alpha }_2} \succ \cdots \) of monomials.

Example 7

Let \(\textbf{y} = \langle x_{j_1}, \ldots , x_{j_d} \rangle \) be a permutation of \(\textbf{x}\) and < denote lexicographical ordering over \(\mathbb {N}^d\). Then, the lexicographical ordering \(\prec _{\textbf{y}}\), defined by \(\textbf{x}^{\mathbf {\alpha }} \prec _{\textbf{y}} \textbf{x}^{\mathbf {\beta }}\) iff \(\langle \alpha _{j_1}, \ldots , \alpha _{j_d} \rangle < \langle \beta _{j_1}, \ldots , \beta _{j_d} \rangle \), is a monomial ordering.

Monomial orderings add structure to polynomials: specifically, if \(p \ne 0\) then p can be uniquely expressed as \(p = c\textbf{x}^{\mathbf {\alpha }} + q\) where \(c \ne 0\) and all monomials \(\textbf{x}^{\mathbf {\beta }}\) in q satisfy \(\textbf{x}^{\mathbf {\beta }} \prec \textbf{x}^{\mathbf {\alpha }}\). Making use of this additional structure we define:

Definition 6

Let \(\prec \) be a monomial ordering over \(\textbf{x}\) and \(p = c\textbf{x}^{\mathbf {\alpha }} + q\) where \(c \ne 0\) and all monomials \(\textbf{x}^{\mathbf {\beta }}\) in q satisfy \(\textbf{x}^{\mathbf {\beta }} \prec \textbf{x}^{\mathbf {\alpha }}\). Then, (1) \(\textsf {lt}_{\prec }(p) = c\textbf{x}^{\mathbf {\alpha }}\), (2) \(\textsf {lm}_{\prec }(p) = \textbf{x}^{\mathbf {\alpha }}\) and (3) \(\textsf {lc}_{\prec }(p) = c\) are respectively the leading term, monomial and coefficient of p with respect to \(\prec \).

3.3 Reduction

Reduction is analogous to integer division with remainder:

Definition 7

Let \(p, q, r \in \mathbb {Z}_m[\textbf{x}]\), \(p \ne 0\), \(q \ne 0\) and \(\prec \) a monomial ordering. Then, p is \(\prec \)-reducible by q to r, denoted \(p \rightarrow _{\prec ,q} r\), if \(\textsf {lt}_{\prec }(p) = t \, \textsf {lt}_{\prec }(q)\) and \(p = tq + r\) for some term t.

Reducibility lifts to sets \(B \subseteq \mathbb {Z}_m[\textbf{x}]\) by \(\rightarrow _{\prec ,B}\ = \bigcup _{p \in B} \rightarrow _{\prec ,p}\). Furthermore, let \(\rightarrow ^{+}_{\prec ,B}\) (resp. \(\rightarrow ^{*}_{\prec ,B}\)) denote the transitive (resp. transitive, reflexive) closure of \(\rightarrow _{\prec ,B}\). If \(p \rightarrow ^{+}_{\prec ,B} r\) for some r then p is said to be \(\prec \)-reducible by B, otherwise \(\prec \)-irreducible by B, denoted \(p \not \rightarrow _{\prec ,B}\).

Example 8

Let \(\textbf{x} = \langle x, y, a \rangle \) and \(B \subseteq \mathbb {Z}_{16}[\textbf{x}]\) where

figure f

Now, let \(p = xa + 15 \in \mathbb {Z}_{16}[\textbf{x}]\) and \(\prec \; =\; \prec _{\textbf{x}}\). Then, \(\textsf {lt}_{\prec }(p) = xa = a\,\textsf {lt}_{\prec }(p_1)\) and \(p = ap_1 + r_1\) where \(r_1 = 15a^3 + 9a^2 + 9a + 15\), hence \(p \rightarrow _{\prec ,p_1} r_1\). Similarly, \(\textsf {lt}_{\prec }(r_1) = 15a^3 = 15\,\textsf {lt}_{\prec }(p_3)\) and \(r_1 = 15p_3 + r_2\) where \(r_2 = 10a^2 + 6\), hence \(r_1 \rightarrow _{\prec ,p_3} r_2\). Finally, \(\textsf {lt}_{\prec }(r_2) = 10a^2 = 5\,\textsf {lt}_{\prec }(p_4)\) and \(r_2 = 5p_4 + r_3\) where \(r_3 = 0\), hence \(r_2 \rightarrow _{\prec ,p_4} r_3\). Thus, \(p \rightarrow _{\prec , p_1} r_1 \rightarrow _{\prec , p_3} r_2 \rightarrow _{\prec , p_4} r_3\), hence \(p \rightarrow ^{+}_{\prec , B} 0\).

Note p is \(\prec \)-reducible by B iff \(\textsf {lt}_{\prec }(p)\) is divisible by \(\textsf {lt}_{\prec }(q)\) for some \(q \in B\), where a term \(t_1\) is divisible by a term \(t_2\) if \(t_1 = t_2t_3\) for some term \(t_3\). Moreover, reduction eliminates the leading term of a polynomial, leaving a residue polynomial comprised of strictly smaller terms with respect to \(\prec \):

Lemma 2

If \(p \rightarrow ^{+}_{\prec ,B} r \ne 0\) then \(\textsf {lm}_{\prec }(r) \prec \textsf {lm}_{\prec }(p)\).

Since monomial orderings are well-orderings, the previous result implies that a sequence of reductions cannot continue ad infinitum and must eventually terminate with the 0 polynomial. In this case, it follows that \(p \in \langle B \rangle _{\textbf{x}}\), hence reduction provides a test for membership in an ideal:

Proposition 4

If \(p \rightarrow ^{*}_{\prec ,B} 0\) then \(p \in \langle B \rangle _{\textbf{x}}\).

But reduction against an arbitrary basis B does not lead to a complete test for membership in \(\langle B \rangle _{\textbf{x}}\), hence motivating Gröbner bases.

3.4 Gröbner Bases

With reduction in place, the concept of Gröbner basis can be introduced:

Definition 8

Let \(B \subseteq \mathbb {Z}_m[\textbf{x}]\) and \(\prec \) a monomial ordering over \(\textbf{x}\). Then, \(G \subseteq \langle B \rangle _{\textbf{x}}\) is a Gröbner basis for \(\langle B \rangle _{\textbf{x}}\) with respect to \(\prec \) if for all \(p \in \langle B \rangle _{\textbf{x}}\), if \(p \ne 0\) then p is \(\prec \)-reducible by G.

Gröbner bases provide a complete test for membership in \(\langle B \rangle _{\textbf{x}}\), as asserted by:

Lemma 3

If G is a Gröbner basis for \(\langle B \rangle _{\textbf{x}}\) with respect to \(\prec \) then for all \(p \in \langle B \rangle _{\textbf{x}}\), \(p \rightarrow ^{*}_{\prec ,G} 0\).

Example 9

Let \(\textbf{x} = \langle x, y \rangle \), \(\prec \; = \;\prec _{\textbf{x}}\) and \(p \in \mathbb {Z}_{16}[\textbf{x}]\) where \(p = 4x\). Moreover, let \(B = \{p_1, p_2 \} \subseteq \mathbb {Z}_{16}[\textbf{x}]\) where \(p_1 = 2x^2y + 2x^2 + 6xy + x\) and \(p_2 = 4y + 4\). Then, \(p = 12p_1 + (10x^2 + 10x)p_2 \in \langle B \rangle _{ \textbf{x}}\), yet \(p \not \rightarrow _{\prec ,B}\), thus B is not a Gröbner basis with respect to \(\prec \). However, it can be shown if \(p_3 = 6x\) and \(p_4 = 3x\) then \(G = \{p_1, p_2, p_3, p_4\}\) is a Gröbner basis for \(\langle B \rangle _{\textbf{x}}\) with respect to \(\prec \). Note that p is \(\prec \)-reducible by \(p_4 \in G\). Indeed, \(p \rightarrow _{\prec ,p_4} 0\), hence \(p \rightarrow ^{*}_{\prec ,G} 0\), as predicted by the previous result.

3.5 Buchberger’s Algorithm

Classically [3], Gröbner bases are computed by evaluating S-polynomials:

Definition 9

Let \(\prec \) be a monomial ordering over \(\textbf{x}\). The S-polynomial of \(p_1, p_2\in \mathbb {Z}_m[\textbf{x}]\) with respect to \(\prec \) is defined:

$$ \textsf {S}_{\prec }(p_1, p_2) = d_22^{k - k_1}\textbf{x}^{\mathbf {\alpha } - \mathbf {\alpha }_1}p_1 - d_12^{k - k_2}\textbf{x}^{\mathbf {\alpha } - \mathbf {\alpha }_2}p_2 $$

where, if \(p_i = 0\) then \(k_i = \omega \), \(d_i = 1\) and \(\mathbf {\alpha }_i = \textbf{0}\), else \(2^{k_i}d_i\) is the rank decomposition of \(\textsf {lc}_{\prec }(p_i)\) and \(\textbf{x}^{\mathbf {\alpha _i}} = \textsf {lm}_{\prec }(p_i)\), \(k = \max (k_1, k_2)\) and \(\mathbf {\alpha } = \max (\mathbf {\alpha _1}, \mathbf {\alpha _2})\).

Example 10

If \(p_1 = 2x^2y + 2x^2 + 6xy + x\) and \(p_2 = 4y + 4\) then it follows \(\textsf {S}_{\prec }(p_1, p_2) = 2(2xy^2 + 6xy + 2y^2 + y) - y^2(4x + 4) = 12xy + 2y\) and \(\textsf {S}_{\prec }(p_1, 0) = 8(2xy^2 + 6xy + 2y^2 + y) - xy^2(0) = 8y\).

Note that \(\textsf {lt}_{\prec }(d_22^{k - k_1}\textbf{x}^{\mathbf {\alpha } - \mathbf {\alpha }_1}p_1) = \textsf {lt}_{\prec }(d_12^{k - k_2}\textbf{x}^{\mathbf {\alpha } - \mathbf {\alpha }_2}p_2)\), hence the S-polynomial \(\textsf {S}_{\prec }(p_1, p_2)\) leads to a cancellation of leading terms. In particular, the S-polynomial \(\textsf {S}_{\prec }(p_1, 0)\) eliminates the leading term of \(p_1\), and possible other terms as well. This deviates from the classical case of fields, where only multiplying by 0 can eliminate a leading term. S-polynomials then yield an effective criterion [2, Theorem 30] to determine if a given basis is a Gröbner basis.

Theorem 1 (Buchberger’s criterion)

Let \(\prec \) be a monomial ordering and \(B = \{p_1, \ldots , p_s\} \subseteq \mathbb {Z}_m[\textbf{x}]\). If \(\textsf {S}_{\prec }(p_i, p_j) \rightarrow ^{*}_{\prec , B} 0\) and \(\textsf {S}_{\prec }(p_i, 0) \rightarrow ^{*}_{\prec , B} 0\) for all \(1 \le i < j \le s\) then B is a Gröbner basis for \(\langle B \rangle _{\textbf{x}}\) with respect to \(\prec \).

Fig. 2.
figure 2

Gröbner basis algorithm over integers modulo \(2^{\omega }\)

Buchberger’s criterion justifies Buchberger’s algorithm for constructing Gröbner bases. Figure 2 presents a version of Buchberger’s algorithm [2] that takes \(B \subseteq \mathbb {Z}_m[\textbf{x}]\) and a monomial ordering \(\prec \) over \(\textbf{x}\) and returns a Gröbner basis for \(\langle B \rangle _{\textbf{x}}\) with respect to \(\prec \). The algorithm maintains a basis G, initialised to B, and a set of unverified S-polynomials S. The algorithm attempts to verify that G is a Gröbner basis by reducing each S-polynomial pair in S against it. If some S-polynomial does not reduce, it yields a new element which is added to G, and generates further S-polynomials. The algorithm terminates when all S-polynomials for the current basis reduce to 0, at which point Buchberger’s criterion applies to show the result, henceforth denoted \(\textsf {gb}_{\prec }(B)\), is a Gröbner basis. Observe that \(B \subseteq G\) on each iteration of the while loop hence \(B \subseteq \textsf {gb}_{\prec }(B)\).

4 Calculating Variable Elimination and Join

This section explains how variable elimination can be computed using Gröbner bases, and how variable elimination can be combined with a relaxation to compute the join of two ideals finitely represented as bases.

4.1 Variable Elimination

A generic projection function \(\pi _i(\langle a_1, \ldots , a_{\ell } \rangle )\) = \(\langle a_1, \ldots , a_{i-1}, a_{i+1}, \ldots , a_{\ell } \rangle \) is used to formulate elimination. The presentation of elimination commences with a syntactic version which removes polynomials that contains a given variable:

Definition 10

(Syntactic) variable elimination is an operation \(\textsf {elim}[x_j]\) where \(\textsf {elim}[x_j] : \wp (\mathbb {Z}_{m}[\textbf{x}]) \rightarrow \wp (\mathbb {Z}_{m}[\pi _j(\textbf{x})])\) defined by \(\textsf {elim}[x_j](P) = P \cap \mathbb {Z}_m[\pi _j(\textbf{x})]\)

The following result demonstrates that abstraction and elimination commute. The result is formulated in terms of the natural lifting of \(\pi _j\) from the function space \(\mathbb {Z}_m^d \rightarrow \mathbb {Z}_m^{d-1}\) to \(\wp (\mathbb {Z}_m^d) \rightarrow \wp (\mathbb {Z}_m^{d-1})\).

Proposition 5

If \(A \subseteq \mathbb {Z}_m^d\) then \(\textsf {elim}[x_j](\alpha _{\textbf{x}}(A)) = \alpha _{\pi _j(\textbf{x})}(\pi _j(A))\).

It follows from this result that elimination preserves closure:

Corollary 1

If \(P \in \textsf {MPAD}_{m}[\textbf{x}]\) then \(\textsf {elim}[x_j](P) \in \textsf {MPAD}_{m}[\pi _j(\textbf{x})]\).

Example 11

Consider \(B = \{ wx + 10w, 15wx^2 + wx + x^2 + 15x \} \subseteq \mathbb {Z}_{16}[w, x]\) and observe \(\textsf {elim}[w](B) = \emptyset \). However \((x^2 + 7x + 8)(wx + 10w) + (x + 2)(15wx^2 + wx + x^2 + 15x) = x^3 + x^2 + 14x\) hence \(x^3 + x^2 + 14x \in \langle B \rangle _{ \langle w, x\rangle }\). Since \(w \not \in \textsf{vars}(x^3 + x^2 + 14x)\) it follows \(x^3 + x^2 + 14x \in \textsf {elim}[w](\langle B \rangle _{ \langle w, x \rangle })\). In particular, \(\textsf {elim}[w](\langle B \rangle _{ \langle w, x\rangle }) \ne \{ 0 \} = \langle \emptyset \rangle _{\langle x \rangle } = \langle \textsf {elim}[w](B) \rangle _{ \langle x \rangle }\).

The previous example shows that syntactic variable elimination is not well-behaved with respect to ideal generation, thus motivating the following:

Definition 11

(Semantic) variable elimination is a relation \(\rightarrow _{\textsf {elim}[x_j]}\) where \(\rightarrow _{\textsf {elim}[x_j]} \subseteq \wp (\mathbb {Z}_{m}[\textbf{x}]) \times \wp (\mathbb {Z}_m[\pi _j(\textbf{x})])\) defined by \(B \rightarrow _{\textsf {elim}[x_j]} B' \text { iff } \textsf {elim}[x_j](\langle B \rangle _{ \textbf{x}}) = \langle B' \rangle _{ \pi _j(\textbf{x})}\)

Proposition 6

Let \(B \subseteq \mathbb {Z}_{m}[\textbf{x}]\) and \(B'\) be a Gröbner basis for \(\langle B \rangle _{ \textbf{x}}\) with respect to \(\prec _{\textbf{y}}\) where \(\textbf{y}\) is a permutation of \(\textbf{x}\) and \(y_1 = x_j\). Then \(B \rightarrow _{\textsf {elim}[x_j]} \textsf {elim}[x_j](B')\).

The previous result can be stated more generally in terms of elimination orderings [1]; the restriction to lexicographical ordering is adopted merely to simplify the presentation. Consistent with this choice, \(\textsf {gb}_{\prec _{\textbf{y}}}\) is henceforth abbreviated to \(\textsf {gb}_{\textbf{y}}\), again purely to streamline the exposition.

Example 12

Let \(B = \{ wx + 10w, 15wx^2 + wx + x^2 + 15x \} \subseteq \mathbb {Z}_{16}[w, x, y]\). Then, \( \textsf {gb}_{\langle w,x,y \rangle }(B) = B \cup \{ wx + 3x^2 + 13x, 2w + x^2 + 15x, x^3 + x^2 + 14x \} \). It therefore follows \(B \rightarrow _{\textsf {elim}[w]} \{ x^3 + x^2 + 14x \}\).

Example 13

Let \(B = \{ w(x + 3), w(y + 9), (1-w)(x+6), (1-w)(y+2) \}\). Then,

figure g

Thus \(B \rightarrow _{\textsf {elim}[w]} B'\) and \(B \rightarrow _{\textsf {elim}[w]} B''\) where \(B' = \{ x + 5y, y^2 + 11y + 2 \}\) and \(B'' = \{ y + 13x, x^2 + 9x + 2 \}\) illustrating why \(\rightarrow _{\textsf {elim}[w]}\) is defined as a relation. To see \(\langle B' \rangle _{ \langle x, y \rangle } = \langle B'' \rangle _{ \langle x, y \rangle }\) observe \(x + 5y \rightarrow _{y + 13x} 0\) and

figure h

Similarly, \(p \rightarrow _{B'} 0\) for all \(p \in B''\).

4.2 Join

Once variable elimination is in place, join can be calculated by adapting a standard relaxation [1] to the current setting. The result, which provides a way of intersecting ideals, hence calculating join, is stated in terms of a lifted product \(qP = \{ qp \mid p \in P \}\) where \(P \subseteq \mathbb {Z}_m[\textbf{x}]\) and \(q \in \mathbb {Z}_m[\textbf{x}]\):

Proposition 7

Let \(\langle B_1 \rangle _{ \textbf{x}}, \langle B_2 \rangle _{ \textbf{x}} \in \textsf {MPAD}_{m}[\textbf{x}]\). If \(w \not \in \textsf{vars}(B_1 \cup B_2)\) then \(\langle B_1 \rangle _{ \textbf{x}} \cap \langle B_2 \rangle _{ \textbf{x}} = \langle B \rangle _{ \textbf{x}}\) whenever \( wB_1 \cup (1-w)B_2 \rightarrow _{\textsf {elim}[w]} B \)

Example 14

Let \(\textbf{x} = \langle x, y \rangle \) and \(B_1, B_2 \subseteq \mathbb {Z}_{16}[\textbf{x}]\) where \(B_1 = \{ x + 10 \}\), \(B_2 = \{ x^2 + 15x \}\) and \(I_i = \langle B_i \rangle _{ \textbf{x}}\). Both \(I_i\) are closed, that is, . Let

$$ B = wB_1 \cup (1 - w)B_2 = \{ wx + 10w, 15wx^2 + wx + x^2 + 15x \} $$

By Example 11, \(B \rightarrow _{\textsf {elim}[w]} \{ x^3 + x^2 + 14x \}\), hence \(\langle B_1 \rangle _{ \textbf{x}} \sqcup \langle B_2 \rangle _{ \textbf{x}}\) = \(\langle x^3 + x^2 + 14x \rangle _{ \textbf{x}}\).

Figures 3(a), 3(b) and 3(i) depict \(\gamma _{\textbf{x}}(I_1)\), \(\gamma _{\textbf{x}}(I_2)\) and \(\gamma _{\textbf{x}}(I_1 \sqcup I_2)\) respectively. Observe \((8, y) \in \gamma _{\textbf{x}}(I_1 \sqcup I_2)\) but \((8, y) \not \in \gamma _{\textbf{x}}(I_1) \cup \gamma _{\textbf{x}}(I_2)\) for any \(y \in \mathbb {Z}_{16}\). These additional points, which are introduced by join itself, stem not from the relaxation \(wB_1 \cup (1 - w)B_2\) which introduces w, but the elimination of w from \(\textsf {gb}_{\langle w,x,y \rangle }(B)\) which derives a unary polynomial representation over x alone. To see this, observe \(B[x \mapsto 8] = \{ 8w, 8w + 8 \}\) and \(B[x \mapsto 14] = \{ 14w+12, 10w + 6 \}\) both have no solutions.

Fig. 3.
figure 3

Examples of join on \(\mathbb {Z}_{16}[\textbf{x}]\) for \(\textbf{x} = \langle x, y \rangle \)

Example 15

Figure 3 presents a series of examples of join on \(\mathbb {Z}_{16}[\textbf{x}]\) for \(\textbf{x} = \langle x, y \rangle \). Figures 3(a)–(h) depict \(\gamma _{\textbf{x}}(I_i)\) for \(I_i = \langle B_i \rangle _{ \textbf{x}}\) where and \(B_i\) are as follows:

figure k

For comparison, the yellow points give the best abstraction of \(\gamma _{\textbf{x}}(I_i)\) using systems of linear congruences modulo 16 (linear polynomials).

Figures 3(i)–(p) depict \(\gamma _{\textbf{x}}(I_i \sqcup I_j)\) for various combinations of \(i, j \in \{ 1, \ldots , 8 \}\), illustrating where a polynomial representation introduces additional points via join. Observe that join induces a loss of information as witnessed by additional points of the form (8, y) and (14, y) in \(\gamma _{\textbf{x}}(I_1 \sqcup I_2)\). Again, the yellow points give the join of the best linear abstractions, which can be computed by combining a relaxation with variable elimination [21]. To illustrate the working, consider \(B_3\) and \(B_4\) rewritten as follows:

$$ B_3 = \left\{ \begin{array}{l} x \equiv _{16} -3, \quad y \equiv _{16} -9 \end{array} \right\} \qquad B_4 = \left\{ \begin{array}{l} x \equiv _{16} -6, \quad y \equiv _{16} -2 \end{array} \right\} . $$

The relaxation introduces fresh variables \(x', y', x'', y''\) and \(\mu \):

$$ \begin{array}{l} x \equiv _{16} x' + x'' \\ y \equiv _{16} y' + y'' \end{array} \quad \begin{array}{l} x' \equiv _{16} -3\mu \\ y' \equiv _{16} -9\mu \end{array} \quad \begin{array}{l} x'' \equiv _{16} -6(1 - \mu ) \\ y'' \equiv _{16} -2(1 - \mu ) \end{array} $$

Eliminating \(x', y', x''\) and \(y''\) gives a system of two congruences: \(x \equiv _{16} 3\mu - 6\) and \(y \equiv _{16} -7\mu - 2\). Rearranging for \(\mu \) gives \(\mu \equiv _{16} 2 - 5x\) hence \(y \equiv _{16} 3x\) as illustrated in Fig. 3(k). The other linear joins are computed likewise.

Note the loss of precision in using linear, rather than polynomial, abstractions. For instance, the set \(\gamma _{\textbf{x}}(I_2)\) can only be approximated by a trivial (unconstrained) linear system, which loses all information. Moreover, as demonstrated in Figs. 3(j)–(k) and Figs. 3(n)–(p), even if the arguments to a (polynomial) join are representable via linear systems, the result may not be.

5 Calculating Closure and Meet

This section addresses how to finitely compute closure. The problem is reduced to that of computing a cover of a system of polynomials. A cover provides a way to decompose closure to sub-problems for which closure can be computed directly. A divide-and-conquer algorithm is introduced for computing a cover, which exploits a simplification procedure based on Gröbner bases, to avoid superfluous work. The section concludes by showing how meet can be computed using closure.

Fig. 4.
figure 4

Covers of \(F_1\) over \(\langle w_1 \rangle \) and \(F_2\) over \(\langle w_1, w_2 \rangle \)

5.1 Covering

An algorithm for closure is formulated in terms of the concept of a cover, which is itself defined through a lifting of polynomial evaluation \(\llbracket p \rrbracket _{ \textbf{x}}(\textbf{a})\) to a vector of polynomials \(\textbf{p} = \langle p_1, \ldots , p_n \rangle \) by \(\llbracket \textbf{p} \rrbracket _{ \textbf{x}}(\textbf{a}) = \langle \llbracket p_1 \rrbracket _{ \textbf{x}}(\textbf{a}), \ldots , \llbracket p_n \rrbracket _{ \textbf{x}}(\textbf{a}) \rangle \).

Definition 12

Let \(\mathcal {W} \subseteq \mathbb {Z}_m[\textbf{w}]^d\), \(A \subseteq \mathbb {Z}_m^d\) and \(F \subseteq \mathbb {Z}_m[\textbf{x}]\). Then

  • \(\mathcal {W}\) is a cover of A over \(\textbf{w}\) iff \(A = \{ \llbracket \textbf{W} \rrbracket _{ \textbf{w}}(\textbf{a}) \mid \textbf{W} \in \mathcal {W} \wedge \textbf{a} \in \mathbb {Z}_m^{|\textbf{w}|} \}\)

  • \(\mathcal {W}\) is a cover of F over \(\textbf{w}\) iff \(\mathcal {W}\) is a cover of \(\gamma _{\textbf{x}}(F)\) over \(\textbf{w}\)

Example 16

Figures 4(a) and (g) depict \(\gamma _{\textbf{x}}(F_1)\) and \(\gamma _{\textbf{x}}(F_2)\) for \(\textbf{x} = \langle x, y \rangle \) where

figure l

Figures 4(b), (c) and (d) illustrate \(A_i = \{ \llbracket \textbf{W}_i \rrbracket _{ \textbf{w}}(\textbf{a}) \mid \textbf{a} \in \mathbb {Z}_m^1 \}\) for \(\textbf{w} = \langle w_1 \rangle \) where \(\textbf{W}_1 = \langle 4w_1 + 6, 4w_1 \rangle \), \(\textbf{W}_2 = \langle 8, 8w_1 + 1 \rangle \) and \(\textbf{W}_3 = \langle 12, 8w_1 + 7 \rangle \). Observe \(\{\textbf{W}_i\}\) is a cover of \(A_i\) and since \(\gamma _{\textbf{x}}(F_1) = A_1 \cup A_2 \cup A_3\), \(\{ \textbf{W}_1, \textbf{W}_2, \textbf{W}_3 \}\) is a cover of \(F_1\) over \(\textbf{w}\). The set of 4 vectors \(\{ \textbf{W}_1, \textbf{W}_2, \textbf{W}_4, \textbf{W}_5 \}\) where \(\textbf{W}_4 = \langle 12, 7 \rangle \) and \(\textbf{W}_5 = \langle 12, 15 \rangle \) is also a cover of \(F_1\), illustrating that covers are not unique. The polynomial vectors \(\textbf{W}_4\) and \(\textbf{W}_5\) define single points and suggest how a cover can be constructed for an arbitrary \(F \subseteq \mathbb {Z}_m[\textbf{w}]\) by putting \(\mathcal {W} = \{ \textbf{a} \mid \textbf{a} \in \gamma _{\textbf{x}}(F) \}\). The vector \(\textbf{w}\) is not necessarily unary as the cover \(\{ \textbf{W}_6 \}\) of \(F_2\) over \(\textbf{w} = \langle w_1, w_2 \rangle \) illustrates where \(\textbf{W}_6 = \langle 8w_1+3, 4w_2+1 \rangle \) and \(\gamma _{\textbf{x}}(F_2) = A_6 = \{ \llbracket \textbf{W}_6 \rrbracket _{ \textbf{w}}(\textbf{a}) \mid \textbf{a} \in \mathbb {Z}_m^2 \}\), and \(\gamma _{\textbf{x}}(F_2)\) and \(A_6\) are illustrated in Figs. 4(g) and (h) respectively.

Fig. 5.
figure 5

The cover algorithm

The challenge is to compute a cover over some \(\textbf{w}\) for arbitrary \(F \subseteq \mathbb {Z}_m[\textbf{x}]\) without naively enumerating all points of \(\gamma _{\textbf{x}}(F)\). To this end, Fig. 5 presents a divide-and-conquer algorithm that recursively decomposes \(\gamma _{\textbf{x}}(F)\) into subsets following the structure of F. Ultimately the function computes a cover \(\mathcal {W} \subseteq \mathbb {Z}_m[\textbf{w}]^d\) for F over \(\textbf{w}\) where \(|\textbf{w}| = d = |\textbf{x}|\). The function cover depends on three auxiliary functions, simplify, constrain and safe all of which are listed in Fig. 6. The function cover and its auxiliaries operate on pairs \(S = \langle \textbf{W}, F \rangle \) where \(\textbf{W} \in \mathbb {Z}_m[\textbf{w}]^d\) is a vector of polynomials and \(F \subseteq \mathbb {Z}_m[\textbf{w}]\) is a system. The vector \(\textbf{W}\) provides a lens to interpret the solutions of F, as is formalised below:

Definition 13

The concretisation map \(\gamma _{\textbf{w}} : \mathbb {Z}_m[\textbf{w}]^d \times \wp (\mathbb {Z}_m[\textbf{w}]) \rightarrow \wp (\mathbb {Z}_m^d)\) is defined: \(\gamma _{\textbf{w}}(\langle \textbf{W}, F \rangle ) = \{ \llbracket \textbf{W} \rrbracket _{ \textbf{w}}(\textbf{a}) \mid \textbf{a} \in \gamma _{\textbf{w}}(F) \}\)

Fig. 6.
figure 6

The simplify, constrain and safe functions

Example 17

Consider \(S_b = \langle \textbf{W}_b, F_b \rangle \) and \(S_c = \langle \textbf{W}_c, F_c \rangle \), where \(\textbf{W}_b = \langle w_1, 2w_2 \rangle \), \(\textbf{W}_c = \langle w_1, 4w_2 \rangle \) and

$$ F_b = \left\{ \begin{array}{@{}c@{}} w_1^2 + w_1 + 6w_2 + 12, \\ 2w_1w_2 + 4w_1, \quad 4w_2^2, \quad 8w_2 \end{array} \right\} \qquad F_c = \left\{ \begin{array}{@{}c@{}} w_1^2 + w_1 + 12w_2 + 12, \\ 4w_1w_2 + 4w_1 \end{array}\right\} $$

Figure 8(b) illustrates \(\gamma _{\textbf{w}}(F_b)\) as translucent points and \(\gamma _{\textbf{w}}(S_b)\) as opaque points. Observe \(\langle 8, 2 \rangle , \langle 8, 10 \rangle \in \gamma _{\textbf{w}}(F_b)\) and \(\llbracket \textbf{W}_b \rrbracket _{ \textbf{w}}(\langle 8, 2 \rangle )\) = \(\langle 8, 4 \rangle \) = \(\llbracket \textbf{W}_b \rrbracket _{ \textbf{w}}(\langle 8, 10 \rangle )\). Hence, in general, there is a many-to-one relationship between \(\gamma _{\textbf{w}}(F_b)\) and \(\gamma _{\textbf{w}}(S_b)\). Figure 8(c) depicts \(\gamma _{\textbf{w}}(F_c)\) and \(\gamma _{\textbf{w}}(S_c)\) using the same convention. Observe too that \(\gamma _{\textbf{w}}(S_b)\) = \(\gamma _{\textbf{w}}(S_c)\) but the cardinality of \(\gamma _{\textbf{w}}(F_c)\) is 4-fold that of \(\gamma _{\textbf{w}}(S_c)\) since \(\textbf{W}_c = \langle w_1, 4w_2 \rangle \).

Fig. 7.
figure 7

Solution sets for F, \(F_3\) and \(F_4\)

Observe that if \(\mathcal {W} \subseteq \mathbb {Z}_m[\textbf{w}]^d\) is a cover for \(F \subseteq \mathbb {Z}_m[\textbf{x}]\) over \(\textbf{w}\) then \(\gamma _{\textbf{x}}(F)\) = \(\cup \{ \gamma _{\textbf{w}}(\langle \textbf{W}, \emptyset \rangle ) \mid \textbf{W} \in \mathcal {W} \}\). Thus a cover is formed from pairs \(\langle \textbf{W}, F \rangle \) that are degenerate in that \(F = \emptyset \). The rationale behind cover is thus to decompose a single pair \(\langle \textbf{W}, F \rangle \) where \(\textbf{W} = \textbf{w}\) into a collection of degenerate pairs:

Example 18

Consider computing a cover for the system

$$ F = \left\{ \begin{array}{@{}c@{}} x^2 + x + 7y^2 + 11y + 12, \quad xy + 4x + 10y^2 \end{array} \right\} $$

over \(\textbf{w} = \langle w_1, w_2 \rangle \). The set \(\gamma _{\textbf{x}}(F)\) is plotted in Fig. 7(a). The top-level cover function expresses F as the pair \(S_a = \langle \textbf{W}_a, F_a \rangle \) where

$$ \textbf{W}_a = \textbf{w} \qquad F_a = \left\{ \begin{array}{@{}c@{}} w_1^2 + w_1 + 7w_2^2 + 11w_2 + 12, \quad w_1w_2 + 4w_1 + 10w_2^2 \end{array} \right\} $$

Since \(\llbracket \textbf{W}_a \rrbracket _{ \textbf{w}}(\textbf{b}) = \textbf{b}\) for all \(\textbf{b} \in \mathbb {Z}_m^2\), it follows \(\gamma _{\textbf{w}}(S_a) = \gamma _{\textbf{x}}(F)\).

The cover function invokes both simplify and constrain. The function simplify performs simplification, either returning nil, indicating \(\gamma _{\textbf{w}}(\langle \textbf{W}, F \rangle ) = \emptyset \), or \(S' = \langle \textbf{W}', F' \rangle \) where \(\gamma _{\textbf{w}}(S) = \gamma _{\textbf{w}}(S')\) (possibly with \(S = S'\)). The first substantive action of simplify is to calculate a Gröbner basis \(F'\) for the ideal \(\langle F \rangle _{ \textbf{w}}\) using the variable ordering \(\textbf{w}\). If there exists a constant polynomial \(c \in F'\) such that \(c \ne 0\) then this reveals \(\gamma _{\textbf{w}}(F) = \gamma _{\textbf{w}}(F') = \emptyset \) hence \(\gamma _{\textbf{w}}(S) = \emptyset \). Otherwise, constrain is invoked if \(F'\) contains a polynomial of the form \(2^{\omega - j}(w_i + r)\) where \(r \in \mathbb {Z}_m[w_{i+1}, \ldots , w_d]\), \(0 < j \le \omega \) and the safety check \({\textsf {safe}}(\textbf{W}, w_i, r)\) is satisfied. The added polynomial \(w_i - 2^jw + r\) asserts that \(w_i + r\) is a multiple of \(2^j\), which is a direct consequence of \(2^{\omega - j}(w_i + r)\). The safety check ensures that the addition of \(2^{\omega - j}(w_i + r)\) does not induce a coupling between the variables of \(\textbf{w}\), specifically those arising in r, that would compromise the termination argument behind simplify and cover. The safety check is vacuously satisfied if \(\textsf{vars}(r) = \emptyset \).

Simplification is used in tandem with splitting, the latter employed by cover only when the former cannot infer new information. When constrain is invoked from cover, two pairs \(S'_0\) and \(S'_1\) are derived from \(S' = \langle \textbf{W}', F' \rangle \) for which \(\gamma _{\textbf{w}}(S') = \gamma _{\textbf{w}}(S'_0) \cup \gamma _{\textbf{w}}(S'_1)\). The pairs \(S'_0\) and \(S'_1\) are formed by adding \(w_i -2w + 0\) and \(w_i - 2w + 1\) to \(F'\), which stipulate, respectively, whether \(w_i\) takes an even or an odd value. Note, in this case, \({\textsf {constrain}}(S', 1, w_i, r)\) is called with \(\textsf{vars}(r) = \emptyset \), hence \({\textsf {safe}}(\textbf{W}, w_i, r)\) holds independently of \(\textbf{W}\) and \(w_i\) and need not be deployed within the body of cover itself. The cover function is then recursively applied to \(S'_0\) and \(S'_1\) to compute two covers, which are combined by set union. The function returns a singleton set \(\{ \textbf{W} \}\) when \(F = \emptyset \).

Fig. 8.
figure 8

Covering F: \(\gamma _{\textbf{y}}(F_n)\) (large, translucent points) and \(\gamma _{\textbf{y}}(S_n)\) (small, opaque points) for \(S_n = \langle \textbf{W}_n, F_n \rangle \)

Example 19

Figure 9 presents the simplification and splitting actions that arise during a run of the algorithm on the pair \(S_a = \langle \textbf{W}_a, F_a \rangle \) introduced in Example 18. The actions are presented as a tree rooted at node a where the leaves, nodes e, h and k, are each decorated with a single polynomial vector. Together these 3 vectors constitute the cover. Figure 8 augments Fig. 9 with details of \(S_n = \langle \textbf{W}_n, F_n \rangle \) for each node n of the tree: \(\textbf{W}_n\) written above \(F_n\). In each diagram \(\gamma _{\textbf{w}}(F_n)\) is represented as large, translucent points and \(\gamma _{\textbf{w}}(S_n)\) as small, opaque points. Observe that \(F_a\) does not contain any polynomial of the general form \(2^{\omega - j}(w_i + r)\) hence cover immediately splits the problem into calculating a cover for \(\langle \textbf{W}_b, F_b \rangle \) and a cover for \(\langle \textbf{W}_i, F_i \rangle \). Note how splitting doubles a leading constant: \(\textbf{W}_a = \textbf{w}\) whereas \(\textbf{W}_b = \langle w_1, 2w_2 \rangle \) and \(\textbf{W}_i = \langle w_1, 2w_2 + 1 \rangle \). This form of scaling by a power of 2 is a general pattern. By comparing the number of small, opaque points in Fig. 8(a) against those in (b) and (i), observe that the solutions of \(\gamma _{\textbf{w}}(S_a)\) are preserved by the split, that is, \(\gamma _{\textbf{w}}(S_a)\) = \(\gamma _{\textbf{w}}(S_b) \cup \gamma _{\textbf{w}}(S_i)\).

Fig. 9.
figure 9

Covering F: the simplification and splitting actions

The system \(F_b\) contains \(8w_2 = 2^{4 - 1}(w_2 + r)\) where \(r = 0\) hence cover deploys simplification to derive \(S_c = \langle \textbf{W}_c, F_c \rangle \) from \(S_b\). Since \(\textsf{vars}(r) = \emptyset \) the check \({\textsf {safe}}(\textbf{W}, w_i, r)\) is vacuously satisfied. Recall from Example 17 that \(\gamma _{\textbf{w}}(S_b)\) = \(\gamma _{\textbf{w}}(S_c)\). Observe too how a leading constant is again doubled, with a commensurate doubling in the cardinality of \(\gamma _{\textbf{w}}(F_c)\) over \(\gamma _{\textbf{w}}(F_b)\). Since \(F_c\) does not contain any polynomial \(2^{\omega - j}(w_i + r)\) splitting is again applied to give a total of three branches that emanate from a. Observe \(F_e = F_g = F_h = \emptyset \) hence the pairs \(\langle \textbf{W}_e, F_e \rangle \), \(\langle \textbf{W}_g, F_g \rangle \) and \(\langle \textbf{W}_h, F_h \rangle \) are degenerate and thereby define the final cover \(\{ \textbf{W}_e, \textbf{W}_g, \textbf{W}_h \}\) over \(\textbf{w}\).

Example 20

Figure 9 illustrates the application of the check \({\textsf {safe}}(\textbf{W}, w_i, r)\) within simplify. Observe that \(\textsf{vars}(r) = \emptyset \) in all but one of the simplification steps. For the step that applies \(2(w_1 + 6w_2 + 6)\), \(r = 6w_2 + 6\) and \(\textbf{W} = \langle 2^{1}w_1, 2^{2}w_2 \rangle \). The polynomial r contains a single term \(6w_2\), which contains the single variable \(w_2\). The test \({\textsf {safe}}(\textbf{W}, w_1, r)\) thus reduces to a single inequality \(k_1 + \textsf {rank}(6) < k_2\) which is false since \(k_1 = 1\), \(\textsf {rank}(6) = 1\) and \(k_2 = 2\). Thus \({\textsf {safe}}\) returns true.

The cover function, and its auxiliaries, are justified by two independent sets of results, the first establishing termination of simplify and cover  and the second proving that cover is indeed a cover. The headline results are stated below:

Theorem 2

simplify and cover terminate.

Theorem 3

Let \(F \subseteq \mathbb {Z}_m[\textbf{x}]\) and \({\textsf {cover}}(F) = \mathcal {W} \subseteq \mathbb {Z}_m^{d}[\textbf{w}]\). Then \(\mathcal {W}\) is a cover of F over \(\textbf{w}\).

Example 21

Returning again to \(F_1\) and \(F_2\) of Example 16, cover computes \(\mathcal {W}_1 = \{ \langle 4w_2 + 6, 4w_2 \rangle , \langle 8, 8w_2 + 1 \rangle , \langle 12, 8w_2 + 7 \rangle \} \) and \( \mathcal {W}_2 = \{ \langle 8w_1 + 3, 4w_2 + 1 \rangle \} \) over \(\textbf{w} = \langle w_1, w_2 \rangle \), where the 3 vectors of \(\mathcal {W}_1\) corresponds to \(A_1\), \(A_2\) and \(A_3\) respectively and the single vector constituting \(\mathcal {W}_2\) corresponds to \(A_6\) of Fig. 4, but with a different parametric variable \(w_2\) from \(w_1\) used in Example 16.

5.2 Closure

This section explains how a cover provides a vehicle for computing closure. A closed set of polynomials can be represented by different bases, and therefore a relation is introduced to express when one basis represents the closure of another:

Definition 14

The relation \(\rightarrow _{\textsf {cl}[\textbf{x}]}\; \subseteq \wp (\mathbb {Z}_m[\textbf{x}])^2\) is defined \(B \rightarrow _{\textsf {cl}[\textbf{x}]} B'\) iff \(\uparrow _{\textbf{x}} \langle B \rangle _{\textbf{x}} = \langle B' \rangle _{ \textbf{x}}\).

The following lemma provides a method for computing \({\uparrow _{\textbf{x}} \langle F \rangle _{ \textbf{x}}}\) when \(\{ \textbf{W} \}\) is a singleton cover for F. The lemma is stated by lifting the elimination relation to vectors of variables defined thus \(B \rightarrow _{\textsf {elim}[\epsilon ]} B\) and \(B \rightarrow _{\textsf {elim}[y:\textbf{y}]} B''\) iff \(B \rightarrow _{\textsf {elim}[y]} B'\) and \(B' \rightarrow _{\textsf {elim}[\textbf{y}]} B''\). The computational tactic given in the lemma amounts to augmenting null polynomials with d polynomials which equate each variable \(x_{\ell }\) with \(W_{\ell }\) and then applying variable elimination:

Lemma 4

Suppose \(\textbf{W} \in \mathbb {Z}_m[\textbf{w}]^{d}\), \(\top = \langle N \rangle _{ \textbf{w}}\) and \(\{x_1 - W_1, \ldots , x_d - W_d\} \cup N \rightarrow _{\textsf {elim}[\textbf{w}]} B \subseteq \mathbb {Z}_m[\textbf{x}]\). Then, \(\langle B \rangle _{ \textbf{x}} = \alpha _{\textbf{x}}(\{ \llbracket \textbf{W} \rrbracket _{ \textbf{w}}( \textbf{a}) \mid \textbf{a} \in \mathbb {Z}_m^{\vert \textbf{w} \vert } \})\).

Example 22

To illustrate this tactic, recall from Example 21 that \(\{ \textbf{W} \}\) is a cover of \(F_2\) over \(\textbf{w} = \langle w_1, w_2 \rangle \) where \(\textbf{W} = \langle 8w_1 + 3, 4w_2 + 1 \rangle \). Recall too from Example 4 that \(\top = \langle N \rangle _{ \textbf{x}}\) where \(\textbf{x} = \langle x, y \rangle \). Observe \(N' = N[x \mapsto w_1, y \mapsto w_2]\) is also a set of nulls, albeit over \(\mathbb {Z}_m[\textbf{w}]\). Let \(p'_i = p_i[x \mapsto w_1, y \mapsto w_2]\) using the abbreviations of Example 4. Then \(\textsf {gb}_{\textbf{w}:\textbf{x}}(\{x - W_1, y - W_2\} \cup N']) = B\) where

figure m

The three regions delineate polynomials depending on both \(w_1\) and \(w_2\) (top), \(w_2\) but not \(w_1\) (middle) and neither \(w_1\) nor \(w_2\) (bottom). It thus follows that \(\{x - W_1, y - W_2 \} \cup N' \rightarrow _{\textsf {elim}[w_1]} B'\) where

figure n

\(B'\) is a Gröbner basis (with respect to \(\prec _{\langle w, x, y \rangle }\)), hence \(B' \rightarrow _{\textsf {elim}[w_2]} B''\) where

$$ B'' = \{x^2 + 7, \quad xy + x + y + 9, \quad 2x + 10, \quad y^2 + 2y + 13, \quad 4y + 12 \} $$

Composing the two eliminations yields \(\{x - W_1, y - W_2\} \cup N' \rightarrow _{\textsf {elim}[\textbf{w}]} B''\). Note that it is only necessary to compute a single Gröbner basis to derive \(B''\). Observe that each polynomial of \(B''\) satisfies the points of \(\gamma _{\textbf{x}}(F_2)\) illustrated in Fig. 4(g).

The following theorem generalises this tactic to arbitrary covers:

Theorem 4

Let \(B \subseteq \mathbb {Z}_m[\textbf{x}]\), \(\top = \langle N \rangle _{ \textbf{w}}\) and \(\mathcal {W} \subseteq \mathbb {Z}_m[\textbf{w}]^d\) be a cover for B over \(\textbf{w}\). Suppose for each \(\textbf{W} \in \mathcal {W}\), \(\{ x_1 - W_1, \ldots , x_d - W_d \} \cup N \rightarrow _{\textsf {elim}[\textbf{w}]} B_{\textbf{W}}\) and \(\langle B' \rangle _{ \textbf{x}} = \bigsqcup _{\textbf{W} \in \mathcal {W}} \langle B_{\textbf{W}} \rangle _{ \textbf{x}}\). Then, \(B \rightarrow _{\textsf {cl}[\textbf{x}]} B'\).

Example 23

Now recall from Example 19 that \(\{ \textbf{W}_e, \textbf{W}_h, \textbf{W}_k \}\) is a cover of

$$ F = \left\{ \begin{array}{@{}c@{}} x^2 + x + 7y^2 + 11y + 12, \quad xy + 4x + 10y^2 \end{array} \right\} $$

over \(\textbf{w} = \langle w_1, w_2 \rangle \) where \(\textbf{W}_e = \langle 4w_2 + 4, 4w_2 \rangle \), \(\textbf{W}_h = \langle 15, 12 \rangle \) and \(\textbf{W}_k = \langle 14, 1 \rangle \). To apply the theorem, \(B_{\textbf{W}_e}\) is derived by \(\{x - (4w_2 + 4), y - 4w_2\} \cup N \rightarrow _{\textsf {elim}[\textbf{w}]} B_{\textbf{W}_e}\). Since \(\textbf{W}_e\) depends only on \(w_2\), \(B_{\textbf{W}_e}\) can be computed by \(\{x - (4w_2 + 4), y - 4w_2\} \cup N' \rightarrow _{\textsf {elim}[w_2]} B_{\textbf{W}_e}\) where \(\top = \langle N' \rangle _{ \langle w_2 \rangle }\). To that end, note \(\textsf {gb}_{\langle w_2,x,y \rangle }(\{x - (4w_2 - 4), y - 4w_2\} \cup N') = B'_{\textbf{W}_e}\) where

figure o

thus \(B_{\textbf{W}_e} = \{x + 3y + 12, y^2, 4y\}\) is computed avoiding nulls containing \(w_1\).

The bases \(B_{\textbf{W}_h}\) and \(B_{\textbf{W}_k}\) can be derived without recourse to elimination or any nulls since \(\textbf{W}_h\) and \(\textbf{W}_k\) are independent of \(w_1\) and \(w_2\) hence put

$$ B_{\textbf{W}_h} = \{x - 15, y - 12\} = \{x + 1, y + 4\} \qquad B_{\textbf{W}_k} = \{x - 14, y - 1\} = \{x + 2, y + 15\} $$

By Theorem 4\({\uparrow _{\textbf{x}} \langle F \rangle _{ \textbf{x}}} = \langle B \rangle _{ \textbf{x}}\) where \(\langle B \rangle _{\textbf{x}} = \langle B_{\textbf{W}_e} \rangle _{ \textbf{x}} \sqcup \langle B_{\textbf{W}_h} \rangle _{ \textbf{x}} \sqcup \langle B_{\textbf{W}_k} \rangle _{ \textbf{x}}\) giving

$$ B = \left\{ \begin{array}{@{}c@{}} x^2 + x + 7y^2 + 11y + 12, \quad xy + 4x + 10y^2, \quad y^3 + 7y^2 + 8y, \quad 4y^2 + 12y \end{array} \right\} $$

All the polynomials of B satisfy the points \(\gamma _{\textbf{x}}(F)\) plotted in Fig. 7(a). Observe

$$ F' = \left\{ \begin{array}{@{}c@{}} x^2 + x + 7y^2 + 11y + 12, \quad xy + 4x + 10y^2, \quad y^3 + 7y^2 + 8y \end{array} \right\} $$

is a Gröbner basis for \(\langle F \rangle _{\textbf{x}}\) with respect to \(\prec _{\textbf{x}}\). Since \(4y^2 + 12y\) is irreducible by \(F'\) it follows \(4y^2 + 12y \notin \langle F \rangle _{ \textbf{x}}\) which is why closure augments F with \(4y^2 + 12y\).

5.3 Meet

Despite the central importance of meet, this section is relatively short, since the following proposition demonstrates how meet can be reduced to closure:

Proposition 8

Let . If \(B_1 \cup B_2 \rightarrow _{\textsf {cl}[\textbf{x}]} B\) then \(\langle B_1 \rangle _{ \textbf{x}} \sqcap \langle B_2 \rangle _{ \textbf{x}} = \langle B \rangle _{\textbf{x}}\).

Example 24

Consider \(F_3, F_4 \subseteq \mathbb {Z}_{16}[x,y]\) where \(F_3 = \{ x^2 + x + 7y^2 + 11y + 12 \}\) and \(F_4 = \{ xy + 4x + 10y^2 \}\) and let \(F = F_3 \cup F_4\). The solution sets \(\gamma _{\textbf{x}}(F)\), \(\gamma _{\textbf{x}}(F_3)\) and \(\gamma _{\textbf{x}}(F_4)\) are plotted in Figs. 7(a), (b) and (c) respectively. The diamond points in Figs. 7(b) and (c) are those contained in both \(\gamma _{\textbf{x}}(F_3)\) and \(\gamma _{\textbf{x}}(F_4)\) and show \(\gamma _{\textbf{x}}(F) = \gamma _{\textbf{x}}(F_1) \cap \gamma _{\textbf{x}}(F_2)\). Now, Example 23 shows \(F \rightarrow _{\textsf {cl}[\textbf{x}]} B\) where

$$ B = \left\{ \begin{array}{@{}c@{}} x^2 + x + 7y^2 + 11y + 12, \quad xy + 4x + 10y^2, \quad y^3 + 7y^2 + 8y, \quad 4y^2 + 12y \end{array} \right\} $$

thus \(\langle F_3 \rangle _{ \textbf{x}} \sqcap \langle F_4 \rangle _{ \textbf{x}} = \langle B \rangle _{ \textbf{x}}\). As noted in Example 23, \(4y^2 + 12y \notin \langle F \rangle _{ \textbf{x}}\) hence \(\langle F \rangle _{ \textbf{x}} \ne \langle B \rangle _{ \textbf{x}}\).

6 Forwards Analysis of Polynomial Programs

In this section, the class of polynomial programs is introduced for which a concrete semantics is defined over sets of points drawn from \(\mathbb {Z}_m^d\). The corresponding abstract semantics over MPAD defines a forwards analysis. The development builds to show the soundness of the analysis, as well as state a precision result for programs consisting solely of polynomial assignments, non-deterministic assignments and non-deterministic branching. The section concludes with an illustrative example for a program which computes the modular inverse.

6.1 Polynomial Programs

Let \(\textbf{x} = \langle x_1, \ldots , x_d \rangle \) denote a vector of program variables. A polynomial program over \(\textbf{x}\) is a graph \(G = \langle N, E, n^{*} \rangle \) where N is a finite set of program points, \(E \subseteq N \times \textsf {Stmt} \times N\) is a finite set of annotated edges and \(n^{*} \in N\) is the entry point into G. The set \(\textsf {Stmt}\) of program statements is defined:

$$ x_j {:}{=}p \quad \mid \quad x_j {:}{=}* \quad \mid \quad {\textsf {assume}}~(p = 0) \quad \mid \quad {\textsf {assume}}~(p \ne 0) $$

where \(x_j {:}{=}*\) and \(x_j {:}{=}p\) denote, respectively, non-deterministic assignment to the variable \(x_j\) and polynomial assignment to \(x_j\) for some \(p \in \mathbb {Z}_m[\textbf{x}]\). The assume statements for \(p = 0\) and \(p \ne 0\) provide a linguistic abstraction for positive and negative guards, respectively expressing that p is satisfied, and conversely p is not satisfied, by an assignment to \(\textbf{x}\).

6.2 Concrete Semantics

To define the concrete semantics, let \(\textbf{a}[j \mapsto c] = \langle a_1, \ldots , a_{j - 1}, c, a_{j+1},\ldots , a_d\rangle \) for \(\textbf{a} \in \mathbb {Z}_m^d\), \(c \in \mathbb {Z}_m\) and j a variable index denote a vector update. The concrete semantics is then formulated in terms of a set of (concrete) transfer functions \(\llbracket s \rrbracket :\wp (\mathbb {Z}_m^d) \rightarrow \wp (\mathbb {Z}_m^d)\), one for each program statement s, defined as follows:

$$ \begin{array}{rcl} \llbracket x_j {:}{=}p \rrbracket (A) &{} = &{} \{ (\textbf{a})[j \mapsto c] \mid \textbf{a} \in A, c = \llbracket p \rrbracket _{ \textbf{x}}(\textbf{a}) \} \\ \llbracket x_j {:}{=}* \rrbracket (A) &{} = &{} \{ (\textbf{a})[j \mapsto c] \mid \textbf{a} \in A, c \in \mathbb {Z}_m \} \\ \llbracket {\textsf {assume}}~(p = 0) \rrbracket (A) &{} = &{} \{ \textbf{a} \in A \mid \llbracket p \rrbracket _{ \textbf{x}}(\textbf{a}) = 0 \} \\ \llbracket {\textsf {assume}}~(p \ne 0) \rrbracket (A) &{} = &{} \{ \textbf{a} \in A \mid \llbracket p \rrbracket _{ \textbf{x}}(\textbf{a}) \ne 0 \} \\ \end{array} $$

Observe that the function space \(N\rightarrow \wp (\mathbb {Z}_m^d)\) is ordered point-wise: \(\theta \sqsubseteq \theta '\) iff \(\theta (n) \subseteq \theta '(n)\) for all \(n \in N\). Thus the concrete semantics can be defined as follows:

Definition 15

The concrete semantics for \(G = \langle N, E, n^{*} \rangle \) is the least map \(\theta : N\rightarrow \wp (\mathbb {Z}_m^d)\) satisfying:

  • \(\mathbb {Z}_m^d \subseteq \theta (n^{*})\)

  • \(\llbracket s \rrbracket (\theta (n)) \subseteq \theta (n')\) for all \(\langle n, s, n' \rangle \in E\)

6.3 Abstract Semantics

Analogous to the concrete semantics, the abstract semantics for \(G = \langle N, E, n^{*} \rangle \) is defined in terms of a set of (abstract) transfer functions. For a statement s, is defined thus:

$$ \begin{array}{rcl} \llbracket x_j {:}{=}p \rrbracket (P) &{} = &{} \{ q \in \mathbb {Z}_{m}[\textbf{x}] \mid q[p/x_j] \in P \} \\ \llbracket x_j {:}{=}* \rrbracket (P) &{} = &{} \uparrow _{\textbf{x}} \textsf {elim}[x_j](P) \\ \llbracket {\textsf {assume}}~(p = 0) \rrbracket (P) &{} = &{} \uparrow _{\textbf{x}}(\{p\} \cup P ) \\ \llbracket {\textsf {assume}}~(p \ne 0) \rrbracket (P) &{} = &{} \bigsqcup _{k=1}^{\omega } \llbracket {\textsf {assume}}~(2^{\omega - k}p + 2^{\omega - 1} = 0) \rrbracket (P) \\ \end{array} $$

Here, the notation \(q[p/x_j]\) denotes the polynomial constructed by substituting p for every instance of \(x_j\) in q. To comprehend the encoding for \({\textsf {assume}}(p \ne 0)\), suppose \(\textbf{a} \in \mathbb {Z}_m\) such that \(\llbracket p \rrbracket _{ \textbf{x}}(\textbf{a}) \ne 0\). Observe there exists some \(1 \le k \le \omega \) such that \(\llbracket p \rrbracket _{\textbf{x}}(\textbf{a})\) has 1 in its k-th lowest bit position and 0 in all lower bits. Therefore \(\llbracket 2^{\omega - k}p + 2^{\omega - 1} \rrbracket _{ \textbf{x}}(\textbf{a}) = 0\). Conversely, if \(\llbracket 2^{\omega - k}p + 2^{\omega - 1} \rrbracket _{ \textbf{x}}(\textbf{a}) = 0\) then \(\llbracket p \rrbracket _{\textbf{x}}(\textbf{a})\) has 1 in its k-th lowest bit position hence \(\llbracket p \rrbracket _{ \textbf{x}}(\textbf{a}) \ne 0\).

The function space \(N\rightarrow \textsf {MPAD}_{m}[\textbf{x}]\) is likewise ordered point-wise: \(\sigma \sqsubseteq \sigma '\) iff \(\sigma (n) \sqsubseteq \sigma '(n)\) for all \(n \in N\), allowing the abstract semantics to be defined thus:

Definition 16

The abstract semantics for \(G = \langle N, E, n^{*} \rangle \) is the least map \(\sigma :N\rightarrow \textsf {MPAD}_{m}[\textbf{x}]\) satisfying:

  • \(\top \sqsubseteq \sigma (n^{*})\)

  • \(\llbracket s \rrbracket (\theta (n)) \sqsubseteq \sigma (n')\) for all \(\langle n, s, n' \rangle \in E\)

Since MPAD is finite, the abstract semantics can be concretely computed by fixed-point iteration; an example of such a procedure is illustrated in Sect. 6.5. The relationship between the concrete and abstract semantics is developed in the following section. The following result details how the abstract transfer functions are actually computed:

Proposition 9

Let \(\langle B \rangle _{\textbf{x}} \in \textsf {MPAD}_{m}[\textbf{x}]\) and \(w \notin \textsf{vars}(B)\). Then:

  • If \(B \cup \{w - p\} \rightarrow _{\textsf {elim}[x_j]} B'\) and \(B' \cup \{x_j - w\} \rightarrow _{\textsf {elim}[w]} B''\) then it follows \(\llbracket x_j {:}{=}p \rrbracket (\langle B \rangle _{\textbf{x}}) = \langle B'' \rangle _{\textbf{x}}\).

  • If \(B \rightarrow _{\textsf {elim}[x_j]} B'\) and \(B' \rightarrow _{\textsf {cl}[\textbf{x}]} B''\) then \(\llbracket x_j {:}{=}* \rrbracket (\langle B \rangle _{\textbf{x}}) = \langle B'' \rangle _{\textbf{x}}\).

  • If \(\{p\} \cup B \rightarrow _{\textsf {cl}[\textbf{x}]} B'\) then \(\llbracket {\textsf {assume}}~(p = 0) \rrbracket (\langle B \rangle _{\textbf{x}}) = \langle B' \rangle _{\textbf{x}}\).

6.4 Correctness and Precision

Key to establishing correctness and precision of the analysis are the following results. The first elucidates the relationship between concrete and abstract join using the abstraction map \(\alpha _{\textbf{x}}:\wp (\mathbb {Z}_m^d) \rightarrow \textsf {MPAD}_{m}[\textbf{x}]\) introduced in Proposition 3:

Proposition 10

Suppose \(P_1, P_2 \in \textsf {MPAD}_{m}[\textbf{x}]\) where \(P_1 = \alpha _{\textbf{x}}(A_1),P_2 = \alpha _{\textbf{x}}(A_2)\) for some \(A_1, A_2 \subseteq \mathbb {Z}_m^d\). Then \(P_1 \sqcup P_2 = \alpha _{\textbf{x}}(A_1 \cup A_2)\).

The second result demonstrates soundness of each of the abstract transfer functions, as well as optimality for the two assignment operations:

Proposition 11

Let \(A \subseteq \mathbb {Z}_m^d\) and \(P = \alpha _{\textbf{x}}(A)\) so that \(P \in \textsf {MPAD}_{m}[\textbf{x}]\). Then

$$ \begin{array}{rcl} \alpha _{\textbf{x}}(\llbracket x_j {:}{=}p \rrbracket (A)) &{} = &{} \llbracket x_j {:}{=}p \rrbracket (P) \\ \alpha _{\textbf{x}}(\llbracket x_j {:}{=}* \rrbracket (A)) &{} = &{} \llbracket x_j {:}{=}* \rrbracket (P) \\ \alpha _{\textbf{x}}(\llbracket {\textsf {assume}}~(p=0) \rrbracket (A)) &{} \sqsubseteq &{} \llbracket {\textsf {assume}}~(p=0) \rrbracket (P) \\ \alpha _{\textbf{x}}(\llbracket {\textsf {assume}}~(p \ne 0) \rrbracket (A)) &{} \sqsubseteq &{} \llbracket {\textsf {assume}}~(p \ne 0) \rrbracket (P) \end{array} $$

With these results in the place, the following theorem can be demonstrated:

Theorem 5

If the concrete and the abstract semantics for \(G = \langle N, E, n^{*}\rangle \) are \(\theta :N\rightarrow \wp (\mathbb {Z}_m^d)\) and \(\sigma :N \rightarrow \textsf {MPAD}_{m}[\textbf{x}]\) respectively then:

$$ \alpha _{\textbf{x}}(\theta (n)) \sqsubseteq \sigma (n) \text { for all } n \in N $$

If G is free from \({\textsf {assume}}~(p = 0)\) and \({\textsf {assume}}~(p \ne 0)\) statements then:

$$ \alpha _{\textbf{x}}(\theta (n)) = \sigma (n) \text { for all } n \in N $$

In particular, MPAD provides a sound analysis for polynomial programs, and moreover finds all modular polynomial invariants for programs consisting of polynomial and non-deterministic assignments and non-deterministic branching.

6.5 Illustrative Example

To illustrate how MPAD can be applied, consider the algorithm [34] listed in Fig. 10(a). The algorithm computes the multiplicative inverse of an (odd) modular integer \(a \in \mathbb {Z}_m\). The variables x, y and a all store a \(\omega \)-bit (unsigned) integer. The algorithm is abstracted by the polynomial program represented in Fig. 10(b) where \(\textbf{x} = \langle x, y, a \rangle \), the nodes are \(N = \{0, \ldots , 7\}\) and the entry node is \(0\). Each edge is decorated with a polynomial assignment or an assumption involving a polynomial equality or a polynomial disequality.

Fig. 10.
figure 10

An algorithm and flow graph for computing the multiplicative inverse

The statement \({\textsf {assume}}~(a\ \text {odd})\) is rendered as \({\textsf {assume}}~(2^{\omega - 1}a - 2^{\omega - 1} = 0)\), where the (linear) polynomial \(2^{\omega - 1}a - 2^{\omega - 1} = 0\) expresses that a is odd. The control-flow for the do ... while is represented as two edges decorated with \({\textsf {assume}}~(x - y \ne 0)\) and \({\textsf {assume}}~(x - y = 0)\), which, respectively, encode the loop condition \(x \ne y\) and its negation. The control flow for the assert statement is expressed through two edges decorated with \({\textsf {assume}}~(ax - 1 = 0)\) and \({\textsf {assume}}~(ax - 1 \ne 0)\), where the node \(7\) is reached if the assertion fails.

Figure 11 presents each \(\sigma _k\) computed by a work-list algorithm primed with the edges that flow from \(0\). The second column displays the worklist \(w_k\). The selected edge \(\langle n, n' \rangle \in w_k\) is always the first listed in \(w_k\). For instance, at step 4, the edge \(\langle 4, 2 \rangle \) is selected, rather than \(\langle 4, 5 \rangle \). The third column displays \(\sigma _{k+1}\) as a function of the \(\sigma _k\): \(\sigma _k\) if no update occurred, else \(\sigma _k[n' \mapsto P]\) where \(P \in \textsf {MPAD}_{m}[ \textbf{x}]\). Polynomials that appear multiply are referenced with a label: \(p_a\), \(p_b\), etc. Most significantly, the table demonstrates \(\sigma _{14}(7) = \bot \) thus \(\textsf {assert}~(a*x = 1)\) must succeed (this can be seen from \(ax - 1 \rightarrow ^{*}_{\prec ,\sigma _{13}(5)} 0\)). No other abstract domain can verify this code because the invariants are both polynomial and modular and the analysis requires polynomial guards; indeed the manual proof of correctness of the algorithm [34] relies on both polynomial manipulation and observing \(a^2 (2^\omega ) = 0 \mod 2^\omega \).

Fig. 11.
figure 11

Updates to the state map

Since a positive polynomial guard is modelled by meet, detailed commentaries are only given for a polynomial assignment and a negative guard:

Polynomial Assignment. When \(k = 1\), the edge \(\langle 1, 2 \rangle \) is selected, corresponding to the statement \(y {:}{=}1\). From the table it follows \(\sigma _1(1) = P_0 = \langle B_0 \rangle _{\textbf{x}}\). To model the assignment, first the basis \(B_0\) is adjoined with the polynomial \(w - 1\). Here, w is a new variable that represents the value of y after the assignment and the polynomial \(w - 1\) expresses that this value must equal 1. Then, y is eliminated from \(B_0 \cup \{w - 1\}\), to reflect that y is overwritten during the assignment. This elimination step is achieved in two phases. First, a Gröbner basis is computed for \(\langle B_0 \cup \{w - 1\} \rangle _{ \textbf{x}}\) with respect to a lexicographical ordering \(\langle y, x, w, a \rangle \) over the variables, yielding

$$ \left\{ \begin{array}{@{}c@{}} y^4a + y^4 + 2y^3a + 2y^3 + 3y^2a + 3y^2 + 2ya + 2y, \\ 2y^2x^2a + 2y^2x^2 + 2y^2xa + 2y^2x + 2yx^2a + 2yx^2 + 2yxa + 2yx, \\ y^2a^2 + 7y^2 + ya^2 + 7y, \quad 4y^2a + 4y^2 + 4ya + 4y, \\ x^4a + x^4 + 2x^3a + 2x^3 + 3x^2a + 3x^2 + 2xa + 2x, \quad x^2a^2 + 7x^2 + xa^2 + 7x, \\ 4x^2a + 4x^2 + 4xa + 4x, \quad w + 15, \quad a^3 + a^2 + 7a + 7, \quad 2a^2 + 14, \quad 8a + 8 \end{array} \right\} $$

Then, all polynomials involving y are deleted:

$$ B_0' = \left\{ \begin{array}{@{}c@{}} x^4a + x^4 + 2x^3a + 2x^3 + 3x^2a + 3x^2 + 2xa + 2x, \\ x^2a^2 + 7x^2 + xa^2 + 7x, \quad 4x^2a + 4x^2 + 4xa + 4x, \\ w + 15, \quad a^3 + a^2 + 7a + 7, \quad 2a^2 + 14, \quad 8a + 8 \end{array} \right\} $$

the Gröbner basis ensuring that this deletion does not lose information. To finalise the assignment, a Gröbner basis is computed for \(\langle B_0' \cup \{w - y\} \rangle _{\textbf{x}}\) with respect to the lexicographical ordering \(\langle w, y, x, a \rangle \), then all constraints containing w are deleted, yielding:

$$ B_1 = \left\{ \begin{array}{@{}l@{}} x^4a + x^4 + 2x^3a + 2x^3 + 3x^2a + 3x^2 + 2xa + 2x, \\ x^2a^2 + 7x^2 + xa^2 + 7x, \quad 4x^2a + 4x^2 + 4xa + 4x, \\ y + 15, \quad a^3 + a^2 + 7a + 7, \quad 2a^2 + 14, \quad 8a + 8 \end{array} \right\} $$

Negative Polynomial Guard. When \(k = 4\), the edge \(\langle 4, 2 \rangle \) is selected, the edge corresponding to the statement \({\textsf {assume}}~(x - y \ne 0)\). From the table it follows \(\sigma _3(4) = \langle B_3 \rangle _{\textbf{x}}\) where

$$ B_3 = \left\{ \begin{array}{@{}c@{}} x + 15, \quad y + a + 14, \quad a^3 + a^2 + 7a + 7, \quad 2a^2 + 14, \quad 8a + 8 \end{array} \right\} $$

To effect the operation, closure is separately applied to four bases:

$$ \begin{array}{lcl} B_3 \cup \{ 8(x - y) + 8 \} &{} \rightarrow _{\textsf {cl}[\textbf{x}]} &{} B_{4,1} = \{1\} \\ B_3 \cup \{ 4(x - y) + 8 \} &{} \rightarrow _{\textsf {cl}[\textbf{x}]} &{} B_{4,2} = \{ x + 15, \; y + a + 14, \; a^2 + 2a + 1, \; 4a + 4\} \\ B_3 \cup \{ 2(x - y) + 8 \} &{} \rightarrow _{\textsf {cl}[\textbf{x}]} &{} B_{4,3} = \{x + 15, \; y + a + 14, \; a^2 + 7, \; 2a + 6\} \\ B_3 \cup \{ (x - y) + 8 \} &{} \rightarrow _{\textsf {cl}[\textbf{x}]} &{} B_{4,4} = \{ x + 15, \; y + 7, \; a + 7 \} \end{array} $$

The intuition is that each \(\gamma _{\textbf{x}}(B_{4,k})\) is the subset of \(\textbf{a} \in \gamma _{\textbf{x}}(B_3)\) for which the k least-significant bits of \(\llbracket x - y \rrbracket _{ \textbf{x}}( \textbf{a})\) store the value \(2^{k-1}\). Thus \(\gamma _{\textbf{x}}(B_{4,1})\) is the subset of \(\textbf{a} \in \gamma _{\textbf{x}}(B_3)\) for which the least bit of \(\llbracket x - y \rrbracket _{ \textbf{x}}( \textbf{a})\) is \(2^0 = 1\); \(\gamma _{\textbf{x}}(B_{4,2})\) is the subset for which the 2 least bits of \(\llbracket x - y \rrbracket _{ \textbf{x}}( \textbf{a})\) store \(2^1 = 2\), etc. Since \(\llbracket x - y \rrbracket _{ \textbf{x}}( \textbf{a}) \ne 0\) holds precisely when at least one bit is set, it follows \(P_4 = \bigsqcup _{k=1}^4 \langle B_{4,k} \rangle _{\textbf{x}} \in \textsf {MPAD}_{m}[\textbf{x}]\) satisfies the property above. In fact, in this case \(P_4 = P_3\), hence the abstract execution of \({\textsf {assume}}~(x - y \ne 0)\) does not strengthen the polynomial constraints even though \(B_{4,1} = \{1\}\) reveals that the difference between x and y is never odd.

6.6 Implementation

Buchberger’s algorithm, like many in symbolic computation, has poor worst-case complexity [19]. Performance can be dramatically improved, however, by factoring out redundant s-polynomial calculations (and the ensuing reductions) using the so-called Gebauer and Möller rules [13]. To reestablish these rules for modular polynomials, let \(B = \{p_1, \ldots , p_s\} \subseteq \mathbb {Z}_{m}[\textbf{x}] \setminus \{0\}\) and put \(\textbf{p} = \langle p_1, \ldots , p_s \rangle \) and \(\textbf{t} = \langle \textsf {lt}_{\prec }(p_1), \ldots , \textsf {lt}_{\prec }(p_s) \rangle \). A vector \(\textbf{q} \in \mathbb {Z}_{m}[\textbf{x}]^s\) is a syzygy of \(\textbf{t}\) iff \(\textbf{q} \cdot \textbf{t} = 0\). The set \(\textsf {syz}(\textbf{t})\) of syzygies of \(\textbf{t}\) forms a module. It can be shown [13] that if \(\textbf{q} \cdot \textbf{p} \rightarrow _{\prec , B} 0\) for all syzygies \(\textbf{q}\) in a module-basis for \(\textsf {syz}(\textbf{t})\), then B is a Gröbner basis. Letting \(\{\textbf{e}_1, \ldots , \textbf{e}_s\}\) denote the standard basis for \(\mathbb {Z}_{m}[\textbf{x}]^s\), \(k_i = \textsf {rank}(\textsf {lc}_{\prec }(p_i))\) and \(t_{i,j} = \textsf {lcm}(t_i, t_j)/t_i\), the principle syzygies \(\pi _i\) and \(\pi _{i,j}\) are defined:

$$ \pi _i = 2^{\omega - k_i}\textbf{e}_i \quad \text {and} \quad \pi _{i,j} = t_{i,j}\textbf{e}_i - t_{j,i}\textbf{e}_j \text { where } 1 \le i < j \le s $$

The following result yields a condition for detecting redundant principle syzygies:

Proposition 12

Given \(\textbf{t} = \langle t_1, \ldots , t_s \rangle \) be a vector of non-zero terms. Then

$$ \frac{\textsf {lcm}(t_i,t_j,t_k)}{\textsf {lcm}(t_i,t_j)}\mathbf {\pi }_{i,j} + \frac{\textsf {lcm}(t_i,t_j,t_k)}{\textsf {lcm}(t_j,t_k)}\mathbf {\pi }_{j,k} + \frac{\textsf {lcm}(t_i,t_j,t_k)}{\textsf {lcm}(t_k,t_i)}\mathbf {\pi }_{k,i} = 0 $$

and, in particular, if \(t_k\) divides \(\textsf {lcm}(t_i, t_j)\) then \(\mathbf {\pi }_{i,j}\) is in the submodule generated by \(\mathbf {\pi }_{j,k}\) and \(\mathbf {\pi }_{k,i}\)

Principle syzygies align with S-polynomials: \(\pi _{i}\) and \(\pi _{i,j}\) correspond to \(\textsf {S}_{\prec }(p_i, p_i)\) and \(\textsf {S}_{\prec }(p_i, p_j)\) respectively. Thus, the above result can be reinterpreted for S-polynomials as follows: given polynomials \(p_i, p_j,p_k \in \mathbb {Z}_{m}[\textbf{x}]\) such that \(\textsf {lt}_{\prec }(p_i)\) divides \(\textsf {lcm}(\textsf {lt}_{\prec }(p_j), \textsf {lt}_{\prec }(p_k))\) then the S-polynomial \(\textsf {S}_{\prec }(p_i, p_k)\) can be dropped (from S in Fig. 2) if \(\textsf {S}_{\prec }(p_i, p_k)\) and \(\textsf {S}_{\prec }(p_i, p_j)\) are (eventually) computed. Although this rule mirrors the triple criteria of [13], modular polynomials offer additional redundancy rules. Together these rules reduce the running time from hours on the above 4-bit example to 510 ms on a 2.5 GHz 16 GB Macbook.

Our implementation is 9293 LOC of Scala3 and is stratified in 3 layers: the worklist-driven fixpoint engine, the domain operations and the underlying Gröbner basis solver. For bit-widths of 8, 12 and 16, the running times are 1,398 ms 5,894 ms and 54,019 ms respectively (though the actual target for our work is AVR micro-controller code which is merely 8-bit). To scale to these higher bit-widths, it is necessary to reduce the numbers of terms in each polynomial. Rather surprisingly, this can be achieved on-the-fly as a polynomial is generated term-by-term rather than as a post-processing step which is applied to some (huge) polynomial derived from an arithmetic operation. To illustrate, consider summing two polynomials \(q_1\) and \(q_2\) where \(q_1 = t_1+r_1\) and \(q_2=t_2+r_2\) and \(t_1\) and \(t_2\) are leading terms with the same powers. The term \(t=t_1+t_2\) is computed then reduced (whenever possible) with a null polynomial whose leading term divides t to give a simplified polynomial q. Then we apply the same tactic to sum \(q + r_1\) to give a polynomial s and apply the same tactic yet again to sum \(s + r_2\). A null polynomial whose leading terms divides t can be found directly, provided one exists, without resorting to search or even a lookup table. In fact, given t, the null can be found simply by multiplying terms together [32, definition 9]. For example, if \(\omega = 4\) and \(t = 3x^2y^3\) then the null polynomial \(3x^2y^3 + 7x^2y^2 + 6x^2y + 13xy^3 + 9xy^2 + 10xy\) is found by expanding \(3x(x - 1)y(y - 1)(y - 2)\) so that \(q = 9x^2y^2 + 10x^2y + 3xy^3 + 7xy^2 + 6xy\). The degree of the leading term of \(q_1 + q_2\) is then reduced from \(x^2y^3\) to \(x^2y^2\). Applying this tactic repeatedly keeps the number of terms small in all arithmetic operations. (The tactic is not mentioned in [2] but appears to be a dynamic version of the technique used in [33] that applies null (vanishing) polynomials to statically bound the representation of polynomials.)

7 Related Work

Momentum for migrating abstract domains from idealised arithmetic to machine arithmetic is growing [10,11,12, 21, 26, 27], driven by the desire to soundly model program behaviour and low-level code. Some of these domains [10, 21, 26, 27] satisfy the ascending chain condition, which is key to computing best transformers [28] though, to our knowledge, this observation has not been previously made.

Early approaches to deriving (non-modular) polynomial invariants employed forwards [29] and backwards [25] abstract interpretation over domains of polynomial ideals. In the former case, termination of the analysis requires a widening operator to remove polynomials of high degree, since polynomial ideals over \(\mathbb {Q}\) do not satisfy the ascending chain condition. In the latter case, the analysis is primed with a template polynomial of bounded degree; linear systems are then solved to find assignments to the (symbolic) coefficients of the templates, which yield the polynomial invariants. An alternative approach [31], also employing template polynomials, directly encodes the conditions for a given template polynomial to be an invariant as a parametric linear system, which can then be solved with suitable methods [6]. None of these analyses is complete and [25] concludes, “It is a challenging open problem whether or not the set of all polynomial relations can be computed not just ones of some given form”.

This challenge [25] has motivated subsequent work [17, 18, 23, 30], which restrict the form of programs that can be analysed, either to those containing only simple loops [30], P-solvable loops [23] or affine programs [17] (where a variable is assigned to an affine expression). State-of-the-art in computing all polynomial relations focuses on affine programs [17] where the problem is reduced to that of computing the Zariski closure of the semigroup generated by a finite set of rational square matrices. However, it is not clear how this approach extends to general polynomial assignments, particularly those in a modular setting.

A more promising line of enquiry in this vein [33] seeks to adapt backwards abstract interpretation to inferring non-modular polynomial invariants [25] to a modular setting. The insight is that it is possible to bound the degree of the template polynomial without losing precision, by exploiting the fact that any modular polynomial is semantically equivalent to one with degree at most \(1.5(\omega + d)\). Building on this bound, the analysis of [33, pp. 311] seeks to infer all polynomial invariants for programs consisting of polynomial and non-deterministic assignments, non-deterministic branching and polynomial disequality guards. For disequality guards, the weakest precondition transformer is defined as \(\llbracket p \ne 0 \rrbracket ^T q = \{\, pq\, \}\) [33, pp. 306]. The subtlety of the modular setting is that the pre-condition pq can vanish, compromising soundness. To illustrate, put \(p = 2x\) and \(q = 128x\) in \(\mathbb {Z}_{256}[x]\). Then \(pq = (2x)(128x) = 256x^2 = 0\), which holds vacuously. Now observe that the assignment \(x = 1\), which satisfies 0, passes the disequality guard \(2x \ne 0\) but violates q. Thus the weakest precondition transformer is actually unsound. This not only illustrates the delicacy of modular reasoning, but suggests that the ability to reason about disequalities, even imprecisely, is a key advantage of the present work.

By design, the analysis [33] does not support equality guards, as these are not readily handled [33, pp. 301] by weakest precondition transformers. However, handling equalities is sometimes necessary, as demonstrated by the worked example, where \(x - y = 0\) is required for inferring \(ax - 1 = 0\) at program exit. Interestingly, the analysis [33] does not rely on Gröbner basis computations, but rather only exploits reduction and properties of null polynomials to detect fixed points. Although this finesses the need for Gröbner bases, it is not clear how they can be avoided when computing join in forwards analysis.

8 Conclusions

Working over modular integers is not merely more realistic, but reshapes the domain operations which can and need be applied. Widening is unnecessary since modular integers induce a domain of polynomial invariants which satisfies the ascending chain condition. Negative polynomial guards can be supported by partitioning the solution set of a polynomial disequality into sets of integers whose least bits equal a power of two. MPAD extends the scope of invariant discovery as demonstrated on an algorithm for calculating a multiplicative inverse.