1 Introduction

Gentry’s breakthrough ideal-lattice-based homomorphic encryption system at STOC 2009 [29] was shown several years later to be breakable by a fast quantum algorithm if the underlying number field (see Footnote 1) is chosen as a cyclotomic field (with “small \(h^+\)”, a condition very frequently satisfied). Cyclotomic fields were considered in Gentry’s paper (“As an example, \(f(x)=x^n\pm 1\)”), in a faster cryptosystem from Smart–Vercauteren [38], and in an even faster cryptosystem from Gentry–Halevi [31]. Cyclotomic fields were used in all of the experiments reported in [31, 38]. Cyclotomic fields are also used much more broadly in the literature on lattice-based cryptography, although many cryptosystems are stated for more general number fields.

The secret key in the systems of Gentry, Smart–Vercauteren, and Gentry–Halevi is a short element g of the ring of integers \(\mathcal {O}\) of the number field. The public key is the ideal \(g\mathcal {O}\) generated by g. The attack has two stages:

  • Find some generator of \(g\mathcal {O}\), using an algorithm of Biasse and Song [10], building upon a unit-group algorithm of Eisenträger, Hallgren, Kitaev, and Song [24]. This is the stage that uses quantum computation. The best known pre-quantum attacks reuse ideas from NFS, the number-field sieve for integer factorization, and take time exponential in \(N^{c+o(1)}\) for a real number c with \(0<c<1\) where N is the field degree. If N is chosen as an appropriate power of the target security level then the pre-quantum attacks take time exponential in the target security level, but the Biasse–Song attack takes time polynomial in the target security level.

  • Reduce this generator to a short generator, using an algorithm introduced by Campbell, Groves, and Shepherd [17, p. 4]: “A simple generating set for the cyclotomic units is of course known. The image of \(\mathcal{O}^{\times }\) under the logarithm map forms a lattice. The determinant of this lattice turns out to be much bigger than the typical log-length of a private key \(\alpha \) [i.e., g], so it is easy to recover the causally short private key given any generator of \(\alpha \mathcal{O}\) e.g. via the LLL lattice reduction algorithm.” (See Footnote 2.) This is the stage that relies on the field being cyclotomic.

A quantum algorithm for the first stage was stated in [17] before [10], but the effectiveness of this algorithm was disputed by Biasse and Song (see [9]) and was not defended by the authors of [17]. The algorithm in [17, p. 4] quoted above for the second stage does not rely on quantum computers, and its effectiveness is easily checked by experiment.

It is natural to ask whether quantum computers play an essential role in this polynomial-time attack. It is also natural to ask whether the problem of finding g given \(g\mathcal {O}\) is weak for all number fields, or whether there is something that makes cyclotomic fields particularly weak.

1.1 Why Focus on the Problem of Finding g Given \(g\mathcal {O}\)?

There are many other lattice-based cryptosystems that are not broken by the Biasse–Song–Campbell–Groves–Shepherd attack. For example, the attack does not break a more complicated homomorphic encryption system introduced in Gentry’s thesis [28, 30]; it does not break the classic NTRU system [32]; and it does not break the BCNS [12] and New Hope [3] systems. But the simple problem of finding g given \(g\mathcal {O}\) remains of interest for several reasons.

First, given the tremendous interest in Gentry’s breakthrough paper, the scientific record should make clear whether Gentry’s original cryptosystem is completely broken, or is merely broken for some special number fields.

Second, despite burgeoning interest in post-quantum cryptography, most cryptographic systems today are chosen for their pre-quantum security levels. Fast quantum attacks have certainly not eliminated the interest in RSA and ECC, and also do not end the security analysis of Gentry’s system.

Third, the problem of finding a generator of a principal ideal has a long history of being considered hard—even if the ideal actually has a short generator, and even if the output is allowed to be a long generator. There is a list of five “main computational tasks of algebraic number theory” in [19, p. 214], and the problem of finding a generator is the fifth on the list. Smart and Vercauteren describe their key-recovery problem as an “instance of a classical and well studied problem in algorithmic number theory”, point to the Buchmann–Maurer–Möller cryptosystem [13] a decade earlier relying on the hardness of this problem, and summarize various slow solutions.

Fourth, this problem has been reused in various attempts to build secure multilinear maps, starting with the Garg–Gentry–Halevi construction [27]. We do not mean to overstate the security or applicability of multilinear maps (see, e.g., [18, 21]), but there is a clear pattern of this problem appearing in the design of advanced cryptosystems. Future designers need to understand whether this problem should simply be discarded, or whether it can be a plausible foundation for security.

Fifth, even when cryptosystems rely on more complicated problems, it is natural for cryptanalysts to begin by studying the security of simpler problems. Successful attacks on complicated problems are usually outgrowths of successful attacks on simpler problems. As explained in Appendix B (in the full version of this paper), the Biasse–Song–Campbell–Groves–Shepherd attack has already been reused to attack a more complicated problem.

1.2 Contributions of This Paper

We introduce a pre-quantum algorithm that, for a large class of number fields, computes a short g given \(g\mathcal {O}\). Plausible heuristic assumptions imply that, for a wide range of number fields in this class, this algorithm (1) has success probability converging rapidly to 100% as the field degree increases and (2) takes time quasipolynomial in the field degree.

One advantage of building pre-quantum algorithms is that the algorithms can be tested experimentally. We have implemented our algorithm within the Sage computer-algebra system; the resulting measurements are consistent with our analysis of the performance of the algorithm.

The number fields that we target are multiquadratics, such as the degree-256 number field \(\mathbb {Q}(\sqrt{2},\sqrt{3},\sqrt{5},\sqrt{7},\sqrt{11},\sqrt{13},\sqrt{17},\sqrt{19})\), or more generally any \(\mathbb {Q}(\sqrt{d_{1}},\sqrt{d_{2}},\ldots ,\sqrt{d_{n}})\). Sometimes we impose extra constraints for the sake of simplicity: for example, in a few steps we require \(d_{1},\ldots ,d_{n}\) to be coprime and squarefree, and in several steps we require them to be positive.

A preliminary step in the attack (see Sect. 5.1) is to compute a full-rank subgroup of “the unit group of” the number field (which by convention in algebraic number theory means the unit group of the ring of integers of the field): namely, the subgroup generated by the units of all real quadratic subfields. We dub this subgroup the set of “multiquadratic units” by analogy to the standard terminology “cyclotomic units”, with the caveat that “multiquadratic units” (like “cyclotomic units”) are not guaranteed to be all units.

The degree-256 example above has exactly 255 real quadratic subfields

$$\begin{aligned} \mathbb {Q}(\sqrt{2}),\mathbb {Q}(\sqrt{3}),\mathbb {Q}(\sqrt{6}),\ldots , \mathbb {Q}(\sqrt{2\cdot 3\cdot 5\cdot 7\cdot 11\cdot 13\cdot 17\cdot 19}). \end{aligned}$$

Each of these has a unit group quickly computable by standard techniques. For example, the units of \(\mathbb {Q}(\sqrt{2})\) are \(\pm (1+\sqrt{2})^{\mathbb {Z}}\), and the units of the last field are \(\pm (69158780182494876719 + 22205900901368228 \sqrt{2\cdot 3\cdot 5\cdot 7\cdot 11\cdot 13\cdot 17\cdot 19})^{\mathbb {Z}}\).

This preliminary step generally becomes slower as \(d_{1},\ldots ,d_{n}\) grow, but it takes time quasipolynomial in the field degree N, assuming that \(d_{1},\ldots ,d_{n}\) are quasipolynomial in N.

In the next step (the rest of Sect. 5) we go far beyond the multiquadratic units: we quickly compute the entire unit group of the multiquadratic field. This is important because the gap between the multiquadratic units and all units would interfere, potentially quite heavily, with the success probability of our algorithm, the same way that a “large \(h^+\)” (a large gap between cyclotomic units and all units) would interfere with the success probability of the cyclotomic attacks. Note that computing the unit group is another of the five “main computational tasks of algebraic number theory” listed in [19]. There is an earlier algorithm by Wada [42] to compute the unit group of a multiquadratic field, but that algorithm takes exponential time.

We then go even further (Sect. 6), quickly computing a generator of the input ideal. The generator algorithm uses techniques similar to, but not the same as, the unit-group algorithm. The unit-group computation starts from unit groups computed recursively in three subfields, while the generator computation starts from generators computed recursively in those subfields and from the unit group of the top field.

There is a very easy way to extract short generators when \(d_{1},\ldots ,d_{n}\) are large enough, between roughly N and any quasipolynomial bound. This condition is satisfied by a wide range of fields of each degree.

We do more work to extend the applicability of our attack to allow smaller \(d_{1},\ldots ,d_{n}\), using LLL to shorten units and indirectly generators. Analysis of this extension is difficult, but experiments suggest that the success probability converges to 1 even when \(d_{1},\ldots ,d_{n}\) are chosen to be as small as the first n primes starting from \(n^2\).

There are many obvious opportunities for precomputation in our algorithm, and in particular the unit group can be reused for attacking many targets \(g\mathcal {O}\) in the same field. We separately measure the cost of computing the unit group and the cost of subsequently finding a generator.

1.3 Why Focus on Multiquadratics?

Automorphisms and subfields play critical roles in several strategies to attack discrete logarithms. These strategies complicate security analysis, and in many cases they have turned into successful attacks. For example, small-characteristic multiplicative-group discrete logarithms are broken in quasipolynomial time; there are ongoing disputes regarding a strategy to attack small-characteristic ECC; and very recently pairing-based cryptography has suffered a significant drop in security level, because of new optimizations in attacks exploiting subfields of the target field. See, e.g., [5, 26, 33].

Do automorphisms and subfields also damage the security of lattice-based cryptography? We chose multiquadratics as an interesting test case because they have a huge number of subfields, presumably amplifying and clarifying any impact that subfields might have upon security.

A degree-\(2^n\) multiquadratic field is Galois: i.e., it has \(2^n\) automorphisms, the maximum possible for a degree-\(2^n\) field. The Galois group, the group of automorphisms, is isomorphic to \((\mathbb {Z}/2)^n\). The number of subfields of the field is the number of subgroups of \((\mathbb {Z}/2)^n\), i.e., the number of subspaces of an n-dimensional vector space over \(\mathbb {F}_2\). The number of k-dimensional subspaces is the 2-binomial coefficient

$$\begin{aligned} \binom{n}{k}_2= \frac{(2^n-1)(2^{n-1}-1)\cdots (2^1-1)}{(2^k-1)(2^{k-1}-1)\cdots (2^1-1)\,(2^{n-k}-1)(2^{n-k-1}-1)\cdots (2^1-1)}, \end{aligned}$$

which is approximately \(2^{n^2/4}\) for \(k\approx n/2\). This turns out to be overkill from the perspective of our attack: as illustrated in Figs. 5.1 and 5.2, the number of subfields we use ends up essentially linear in \(2^n\).
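To make these counts concrete, the following short Python snippet (ours, purely illustrative) evaluates the 2-binomial coefficients for the degree-256 example \(n=8\):

```python
# Count subspaces of F_2^n, i.e. subfields of a degree-2^n multiquadratic field.
def binom2(n, k):
    """2-binomial coefficient: number of k-dimensional subspaces of an n-dimensional F_2-space."""
    num = den = 1
    for i in range(k):
        num *= 2 ** (n - i) - 1
        den *= 2 ** (k - i) - 1
    return num // den          # exact: Gaussian binomials are integers

n = 8
counts = [binom2(n, k) for k in range(n + 1)]
print(counts)                                   # 1, 255, 10795, ..., 255, 1
print(sum(counts))                              # total number of subfields of the degree-256 example
print(binom2(n, n // 2), 2 ** (n * n // 4))     # middle coefficient vs. the 2^(n^2/4) approximation
```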

2 Multiquadratic Fields

A multiquadratic field is, by definition, a field that can be written in the form \(\mathbb {Q}(\sqrt{r_1},\ldots ,\sqrt{r_m})\) where \((r_1,\ldots ,r_m)\) is a finite sequence of rational numbers. The notation \(\mathbb {Q}(\sqrt{r_1},\ldots ,\sqrt{r_m})\) means the smallest subfield of \(\mathbb {C}\), the field of complex numbers, that contains \(\sqrt{r_1},\ldots ,\sqrt{r_m}\).

When we write \(\sqrt{r}\) for a nonnegative real number r, we mean specifically the nonnegative square root of r. When we write \(\sqrt{r}\) for a negative real number r, we mean specifically \(i\sqrt{-r}\), where i is the standard square root of \(-1\) in \(\mathbb {C}\); for example, \(\sqrt{-2}\) means \(i\sqrt{2}\). These choices do not affect the definition of \(\mathbb {Q}(\sqrt{r_1},\ldots ,\sqrt{r_m})\), but many other calculations rely on each \(\sqrt{r}\) having a definite value.

See full version of paper on multiquad.cr.yp.to for proofs.

Theorem 2.1

Let n be a nonnegative integer. Let \(d_{1},\ldots ,d_{n}\) be integers such that, for each nonempty subset \(J\subseteq \mathord {\left\{ {1,\ldots ,n}\right\} }\), the product \(\prod _{j\in J}d_{j}\) is not a square. Then the \(2^n\) complex numbers \(\prod _{j\in J}\sqrt{d_{j}}\) for all subsets \(J\subseteq \mathord {\left\{ {1,\ldots ,n}\right\} }\) form a basis for the multiquadratic field \(\mathbb {Q}(\sqrt{d_{1}},\ldots ,\sqrt{d_{n}})\) as a \(\mathbb {Q}\)-vector space. Furthermore, for each \(j\in \mathord {\left\{ {1,\ldots ,n}\right\} }\) there is a unique field automorphism of \(\mathbb {Q}(\sqrt{d_{1}},\ldots ,\sqrt{d_{n}})\) that preserves \(\sqrt{d_{1}},\ldots ,\sqrt{d_{n}}\) except for mapping \(\sqrt{d_{j}}\) to \(-\sqrt{d_{j}}\).

Consequently \(\mathbb {Q}(\sqrt{d_{1}},\ldots ,\sqrt{d_{n}})\) is a degree-\(2^n\) number field.

Theorem 2.2

Every multiquadratic field can be expressed in the form of Theorem 2.1 with each \(d_{j}\) squarefree.

3 Fast Arithmetic in Multiquadratic Fields

See full version of paper on multiquad.cr.yp.to.

4 Recognizing Squares

This section explains how to recognize squares in a multiquadratic field \(L=\mathbb {Q}(\sqrt{d_{1}},\ldots ,\sqrt{d_{n}})\). The method does not merely check whether a single element \(u\in L\) is a square: given nonzero \(u_1,\dots ,u_r\in L\), the method rapidly identifies the set of exponent vectors \((e_1,\dots ,e_r)\in \mathbb {Z}^r\) such that \(u_1^{e_1}\cdots u_r^{e_r}\) is a square.

The method here was introduced by Adleman [2] as a speedup to NFS. The idea is to apply a group homomorphism \(\chi \) from \(L^\times \) to \(\mathord {\left\{ {-1,1}\right\} }\), or more generally from T to \(\mathord {\left\{ {-1,1}\right\} }\), where T is a subgroup of \(L^\times \) containing \(u_1,\dots ,u_r\). Then \(\chi \) reveals a linear constraint, hopefully nontrivial, on \((e_1,\dots ,e_r)\) modulo 2. Combining enough constraints reveals the space of \((e_1,\dots ,e_r)\) mod 2.

One choice of \(\chi \) is the sign of a real embedding of L, but this is a limited collection of \(\chi \) (and empty if L is complex). Adleman suggested instead taking \(\chi \) as a quadratic character defined by a prime ideal. There is an inexhaustible supply of prime ideals, and thus of these quadratic characters.

Section 3.6 (in the full version of this paper) used this idea for \(L=\mathbb {Q}\), but only for small r (namely \(r=n\)), where one can afford to try \(2^r\) primes. This section handles arbitrary multiquadratics and allows much larger r.

4.1 Computing Quadratic Characters

Let q be an odd prime number modulo which all the \(d_{i}\) are nonzero squares. For each i, let \(s_i\) be a square root of \(d_{i}\) modulo q. The map \(\mathbb {Z}[x_1,\dots , x_n] \rightarrow \mathbb {F}_q\) defined by \(x_i \mapsto s_i\) and reducing coefficients modulo q induces a homomorphism \(\mathbb {Z}[x_1,\dots , x_n]/(x_1^2-d_{1},\dots ,x_n^2-d_{n}) \rightarrow \mathbb {F}_q\), or equivalently a homomorphism \(\varphi :\mathbb {Z}[\sqrt{d_{1}},\ldots ,\sqrt{d_{n}}]\rightarrow \mathbb {F}_q\).

Let \(\mathfrak {P}\) be the kernel of \(\varphi \). Then \(\mathfrak {P}\) is a degree-1 prime ideal of \(\mathbb {Z}[\sqrt{d_{1}},\ldots ,\sqrt{d_{n}}]\) above q, i.e., a prime ideal of prime norm q. Write \(\mathcal {O}_L\) for the ring of integers of \(L\); then \(\mathfrak {P}\) extends to a unique degree-1 prime ideal of \(\mathcal {O}_L\). The map \(\varphi \) extends to the set \(R_\varphi \) of all \(u\in L\) having nonnegative valuation at this prime ideal. For each \(u\in R_\varphi \) define \(\chi (u)\in \mathord {\left\{ {-1,0,1}\right\} }\) as the Legendre symbol of \(\varphi (u)\in \mathbb {F}_q\). Then \(\chi (uu')=\chi (u)\chi (u')\), since \(\varphi (uu')=\varphi (u)\varphi (u')\) and the Legendre symbol is multiplicative. In particular, \(\chi (u^2)\in \mathord {\left\{ {0,1}\right\} }\).

More explicitly: Given a polynomial \(u\in \mathbb {Z}[x_1,\dots , x_n]/(x_1^2-d_{1}, \dots ,x_n^2-d_{n})\) represented as coefficients of 1, \(x_1\), \(x_2\), \(x_1x_2\), etc., first take all coefficients modulo q to obtain \(u\bmod q\in \mathbb {F}_q[x_1,\dots , x_n]/(x_1^2-d_{1},\dots ,x_n^2-d_{n})\). Then substitute \(x_n\mapsto s_n\): i.e., write \(u\bmod q\) as \(u_0+u_1x_n\), where \(u_0,u_1\in \mathbb {F}_q[x_1,\dots , x_{n-1}]/(x_1^2-d_{1},\dots ,x_{n-1}^2-d_{n-1})\), and compute \(u_0+u_1s_n\). Inside this result substitute \(x_{n-1}\mapsto s_{n-1}\) similarly, and so on through \(x_1\mapsto s_1\), obtaining \(\varphi (u)\in \mathbb {F}_q\). Finally compute the Legendre symbol modulo q to obtain \(\chi (u)\).
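The following Python sketch models this computation; it is illustrative rather than our Sage implementation, it evaluates \(\varphi (u)\) by direct substitution rather than by the variable-by-variable substitution described above, and the dictionary-of-subsets representation and function names are our choices here.

```python
def legendre(a, q):
    """Legendre symbol (a/q) for an odd prime q, as an element of {-1, 0, 1}."""
    t = pow(a % q, (q - 1) // 2, q)
    return -1 if t == q - 1 else t

def chi(coeffs, s, q):
    """Quadratic character of u = sum_J coeffs[J] * prod_{j in J} sqrt(d_j).

    coeffs: dict mapping frozensets J of indices to integer coefficients.
    s: list where s[j] is a square root of d_j modulo q.
    Returns the Legendre symbol of phi(u) mod q, i.e. chi(u) in {-1, 0, 1}.
    """
    val = 0
    for J, c in coeffs.items():
        prod = c % q
        for j in J:
            prod = prod * s[j] % q
        val = (val + prod) % q
    return legendre(val, q)

# toy example in Q(sqrt(2), sqrt(3)):  u = 1 + sqrt(2) + 5*sqrt(6)
# q = 23: 2 = 5^2 and 3 = 7^2 mod 23, so s = [5, 7]
u = {frozenset(): 1, frozenset({0}): 1, frozenset({0, 1}): 5}
print(chi(u, [5, 7], 23))        # -1
```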

As in Sect. 3 (in the full version of this paper), assume that each coefficient of u has at most B bits, and choose q (using the \(\mathop {\text {GoodPrime}}\nolimits \) function from Sect. 3.2) to have \(n^{O(1)}\) bits. Then the entire computation of \(\chi (u)\) takes time essentially NB, mostly to reduce coefficients modulo q. The substitutions \(x_j\mapsto s_j\) involve a total of O(N) operations in \(\mathbb {F}_q\), and the final Legendre-symbol computation takes negligible time.

More generally, any element of L is represented as u/h for a positive integer denominator h. Assume that q is coprime to h; this is true with overwhelming probability when q is chosen randomly. (It is also guaranteed to be true for any \(u/h\in \mathcal {O}_L\) represented in lowest terms, since q is coprime to \(2d_1\cdots d_n\).) Then \(\varphi (u/h)\) is simply \(\varphi (u)/h\), and computing the Legendre symbol produces \(\chi (u/h)\).

4.2 Recognizing Squares Using Many Quadratic Characters

Let \(\chi _{1}, \ldots , \chi _{m}\) be quadratic characters. Define \(T\) as the subset of \(L\) on which all \(\chi _{i}\) are defined and nonzero. Then \(T\) is a subgroup of \(L^\times \), the intersection of the unit groups of the rings \(R_\varphi \) defined above. Define a group homomorphism \(X:T\rightarrow (\mathbb {Z}/2)^m\) as \(u\mapsto (\log _{-1}\chi _{1}(u),\dots ,\log _{-1}\chi _{m}(u))\).

Given nonzero \(u_1,\dots ,u_r\in L\), choose m somewhat larger than r, and then choose \(\chi _{1},\dots ,\chi _{m}\) randomly using \(\mathop {\text {GoodPrime}}\nolimits \). Almost certainly \(u_1,\dots ,u_r\in T\); if any \(\chi (u_j)\) turns out to be undefined or zero, simply switch to another prime.

Define U as the subgroup of \(T\) generated by \(u_1,\dots ,u_r\). If a product \(\pi =u_1^{e_1}\cdots u_r^{e_r}\) is a square in L then its square root is in \(T\) so \(X(\pi )=0\), i.e., \(e_1 X(u_1)+\dots +e_r X(u_r)=0\). Conversely, if \(X(\pi )=0\) and m is somewhat larger than r then almost certainly \(\pi \) is a square in L, as we now explain.

The group \(U/(U\cap L^2)\) is an \(\mathbb {F}_2\)-vector space of dimension at most r, so its dual group \({{\mathrm{Hom}}}(U/(U\cap L^2),\mathbb {Z}/2)\) is also an \(\mathbb {F}_2\)-vector space of dimension at most r. As in [16, Sect. 8], we heuristically model \(\log _{-1}\chi _{1},\dots ,\log _{-1}\chi _{m}\) as independent uniform random elements of this dual; then they span the dual with probability at least \(1-1/2^{m-r}\) by [16, Lemma 8.2]. If they do span the dual, then any \(\pi \in U\) with \(X(\pi )=0\) must have \(\pi \in U\cap L^2\).

The main argument for this heuristic is the fact that, asymptotically, prime ideals are uniformly distributed across the dual. Restricting to degree-1 prime ideals does not affect this heuristic: prime ideals are counted by norm, so asymptotically 100% of all prime ideals have degree 1. Beware that taking more than one prime ideal over a single prime number q would not justify the same heuristic.

Computing \(X(u_1),\dots ,X(u_r)\) involves \(mr\approx r^2\) quadratic-character computations, each taking time essentially NB. We do better by using remainder trees to merge the reductions of B-bit coefficients mod q across all r choices of q; this reduces the total time from essentially \(r^2 N B\) to essentially \(r N (r+B)\).
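Here is a minimal Python sketch of the remainder-tree idea (ours, illustrative only): build one product tree over the chosen primes, then push a single B-bit coefficient down the tree, so that it is reduced modulo all of the primes with far less work than r independent reductions.

```python
def product_tree(moduli):
    """Levels of a product tree: level 0 is the list of moduli, the last level is their product."""
    levels = [list(moduli)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prev[i] * prev[i + 1] if i + 1 < len(prev) else prev[i]
                       for i in range(0, len(prev), 2)])
    return levels

def remainder_tree(x, moduli):
    """x mod q for every q in moduli, reducing x once at the root and then down the tree."""
    levels = product_tree(moduli)
    rems = [x % levels[-1][0]]
    for level in reversed(levels[:-1]):
        rems = [rems[i // 2] % m for i, m in enumerate(level)]
    return rems

primes = [101, 103, 107, 109]
x = 10**50 + 12345                               # stand-in for a B-bit coefficient
assert remainder_tree(x, primes) == [x % q for q in primes]
```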

We write \(\mathop {\text {EnoughCharacters}}\nolimits (L,(v_1,\dots ,v_s))\) for a list of m randomly chosen characters that are defined and nonzero on \(v_1,\dots ,v_s\). In higher-level algorithms in this paper, the group \(\left\langle v_1,\dots ,v_s\right\rangle \) can always be expressed as \(\left\langle u_1,\dots ,u_r\right\rangle \) with \(r\le N+1\), and we choose m as \(N+64\), although asymptotically one should replace 64 by, e.g., \(\sqrt{N}\). The total time to compute \(X(u_1),\dots ,X(u_r)\) is essentially \(N^2(N+B)\). The same heuristic states that these characters have probability at most \(1/2^{63}\) (or asymptotically at most \(1/2^{\sqrt{N}-1}\)) of viewing some non-square \(u_1^{e_1}\cdots u_r^{e_r}\) as a square. Our experiments have not encountered any failing square-root computations.

5 Computing Units

This section presents a fast algorithm to compute the unit group \(\mathcal {O}_{L}^\times \) of a multiquadratic field \(L\). For simplicity we assume that \(L\) is real, i.e., that \(L\subseteq \mathbb {R}\). Note that a multiquadratic field is real if and only if it is totally real, i.e., every complex embedding \(L\rightarrow \mathbb {C}\) has image in \(\mathbb {R}\). For \(L=\mathbb {Q}(\sqrt{d_{1}},\ldots ,\sqrt{d_{n}})\) this is equivalent to saying that each \(d_{j}\) is nonnegative.

Like Wada [42], we recursively compute unit groups for three subfields \(K_\sigma \), \(K_\tau \), \(K_{\sigma \tau }\), and then use the equation \(u^2=N_{L:K_\sigma }(u)N_{L:K_\tau }(u)/\sigma (N_{L:K_{\sigma \tau }}(u))\) to glue these groups together into a group U between \(\mathcal {O}_{L}^\times \) and \((\mathcal {O}_{L}^\times )^2\). At this point Wada resorts to brute-force search to identify the squares in U, generalizing an approach taken by Kubota in [34] for degree-4 multiquadratics (“biquadratics”). We reduce exponential time to polynomial time by using quadratic characters as explained in Sect. 4.

5.1 Fundamental Units of Quadratic Fields

A quadratic field is, by definition, a degree-2 multiquadratic field; i.e., a field of the form \(\mathbb {Q}(\sqrt{d})\), where d is a non-square integer.

Fix a positive non-square integer d. Then \(L=\mathbb {Q}(\sqrt{d})\) is a real quadratic field, and the unit group \(\mathcal {O}_{L}^\times \) is

$$\begin{aligned} \mathord {\left\{ {\ldots ,-\varepsilon ^2,-\varepsilon ,-1,-\varepsilon ^{-1},-\varepsilon ^{-2},\ldots ,\varepsilon ^{-2},\varepsilon ^{-1},1,\varepsilon ,\varepsilon ^2,\ldots }\right\} } \end{aligned}$$

for a unique \(\varepsilon \in \mathcal {O}_{L}^\times \) with \(\varepsilon >1\). This \(\varepsilon \), the smallest element of \(\mathcal {O}_{L}^\times \) larger than 1, is the normalized fundamental unit of \(\mathcal {O}_L\). For example, the normalized fundamental unit is \(1+\sqrt{2}\) for \(d=2\); \(2+\sqrt{3}\) for \(d=3\); and \((1+\sqrt{5})/2\) for \(d=5\).

Sometimes the literature says “fundamental unit” instead of “normalized fundamental unit”, but sometimes it defines all of \(\varepsilon , -\varepsilon , 1/\varepsilon , -1/\varepsilon \) as “fundamental units”. The phrase “normalized fundamental unit” is unambiguous.

The size of the normalized fundamental unit \(\varepsilon \) is conventionally measured by the regulator \(R=\ln (\varepsilon )\). A theorem by Hua states that \(R<\sqrt{d}(\ln (4d)+2)\), and experiments suggest that R is typically \(d^{1/2+o(1)}\), although it is often much smaller. Write \(\varepsilon \) as \(a+b\sqrt{d}\) with \(a,b\in \mathbb {Q}\); then both 2a and \(2b\sqrt{d}\) are very close to \(\exp (R)\), and there are standard algorithms that compute a and b in time essentially R, i.e., at most essentially \(d^{1/2}\). See generally [36, 43].
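As one concrete instance of such standard techniques, the plain-Python sketch below (ours, not the subroutine used in our software) computes the fundamental unit of the order \(\mathbb {Z}[\sqrt{d}]\) from the continued-fraction expansion of \(\sqrt{d}\). For \(d\equiv 1\pmod 4\) the maximal order can have a smaller unit, e.g. \((1+\sqrt{5})/2\) for \(d=5\). The number of continued-fraction steps is proportional to R, but the convergents grow to roughly R bits, so this simple version is slower than the essentially-R bound quoted above; it is only meant to make the objects concrete.

```python
from math import isqrt

def fundamental_unit(d):
    """Smallest unit a + b*sqrt(d) > 1 of the order Z[sqrt(d)], for d > 1 not a square.

    Walks the continued-fraction expansion of sqrt(d); the first convergent p/q with
    p^2 - d*q^2 = +-1 gives the fundamental solution.
    """
    a0 = isqrt(d)
    assert a0 * a0 != d, "d must not be a square"
    m, den, a = 0, 1, a0
    p_prev, p = 1, a0            # convergent numerators
    q_prev, q = 0, 1             # convergent denominators
    while p * p - d * q * q not in (1, -1):
        m = den * a - m
        den = (d - m * m) // den
        a = (a0 + m) // den
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
    return p, q

print(fundamental_unit(2))                           # (1, 1):  1 + sqrt(2)
print(fundamental_unit(3))                           # (2, 1):  2 + sqrt(3)
print(fundamental_unit(2*3*5*7*11*13*17*19))         # should match the large unit in Sect. 1.2
```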

For our time analysis we assume that d is quasipolynomial in N, i.e., \(\log d\in (\log N)^{O(1)}\). Then the time to compute \(\varepsilon \) is also quasipolynomial in N.

Take, for example, \(d=d_{1}\cdots d_{n}\), where \(d_{1},\dots ,d_{n}\) are the first n primes, and write \(N=2^n\). The product of primes \({\le }y\) is approximately \(\exp (y)\), so \(\ln d\approx n\ln n= (\log _2 N)\ln \log _2 N\). As a larger example, if \(d_{1},\dots ,d_{n}\) are primes between \(N^3\) and \(N^4\), and again \(d=d_{1}\cdots d_{n}\), then \(\log _2 d\) is between \(3n^2\) and \(4n^2\), i.e., between \(3(\log _2 N)^2\) and \(4(\log _2 N)^2\). In both of these examples, d is quasipolynomial in N.

Subexponential Algorithms. There are much faster algorithms that compute \(\varepsilon \) as a product of powers of smaller elements of \(L\). There is a deterministic algorithm that provably takes time essentially \(R^{1/2}\), i.e., at most essentially \(d^{1/4}\); see [11]. Heuristic algorithms take subexponential time \(\exp ((\ln (d))^{1/2+o(1)})\), and thus time polynomial in N if \(\ln (d)\in O((\log N)^{2-\epsilon })\); see [1, 15, 19, 40]. Quantum algorithms are even faster, as mentioned in the introduction, but in this paper we focus on pre-quantum algorithms.

This representation of units is compatible with computing products, quotients, quadratic characters (see Sect. 4), and automorphisms, but we also need to be able to compute square roots. One possibility here is to generalize from “product of powers” to any algebraic algorithm, i.e., any chain of additions, subtractions, multiplications, and divisions. This seems adequate for our square-root algorithm in Sect. 3.7 (in the full version of this paper): for example, \(h_0\) inside Algorithm 3.3 can be expressed as the chain \((h+\sigma (h))/2\) for an appropriate automorphism \(\sigma \), and the base case involves square roots of small integers that can be computed explicitly. However, it is not clear whether our recursive algorithms produce chains of polynomial size. We do not explore this possibility further.

5.2 Units in Multiquadratic Fields

Let \(d_{1},\ldots ,d_{n}\) be integers satisfying the conditions of Theorem 2.1. Assume further that \(d_{1},\dots ,d_{n}\) are positive. Then \(L=\mathbb {Q}(\sqrt{d_{1}},\ldots ,\sqrt{d_{n}})\) is a real multiquadratic field.

This field has \(N-1=2^n-1\) quadratic subfields, all of which are real. Each quadratic subfield is constructed as follows: take one of the \(N-1\) nonempty subsets \(J\subseteq \{1,\dots ,n\}\); define \(d_{J}= \prod _{j\in J}d_{j}\); the subfield is \(\mathbb {Q}(\sqrt{d_{J}})\). We write the normalized fundamental units of these \(N-1\) quadratic subfields as \(\varepsilon _{1},\dots ,\varepsilon _{N-1}\).

The set of multiquadratic units of \(L\) is the subgroup \(\left\langle -1,\varepsilon _{1},\dots ,\varepsilon _{N-1}\right\rangle \) of \(\mathcal {O}_{L}^\times \); equivalently, the subgroup of \(\mathcal {O}_{L}^\times \) generated by \(-1\) and all units of rings of integers of quadratic subfields of \(L\). (The “\(-1\) and” can be suppressed except for \(L=\mathbb {Q}\).) A unit in \(\mathcal {O}_L\) is not necessarily a multiquadratic unit, but Theorem 5.2 states that its Nth power must be a multiquadratic unit.

The group \(\mathcal {O}_{L}^\times \) is isomorphic to \((\mathbb {Z}/2)\times \mathbb {Z}^{N-1}\) by Dirichlet’s unit theorem. For \(N\ge 2\) this isomorphism takes the Nth powers to \(\mathord {\left\{ {0}\right\} }\times (N\mathbb {Z})^{N-1}\), a subgroup having index \(2^{1+n(N-1)}\). The index of the multiquadratic units in \(\mathcal {O}_{L}^\times \) is therefore a divisor of \(2^{1+n(N-1)}\). One corollary is that \(\varepsilon _{1},\dots ,\varepsilon _{N-1}\) are multiplicatively independent: if \(\prod \varepsilon _{j}^{a_j}=1\), where each \(a_j\in \mathbb {Z}\), then each \(a_j=0\).

Lemma 5.1

Let \(L\) be a real multiquadratic field and let \(\sigma ,\tau \) be distinct non-identity automorphisms of \(L\). Define \(\sigma \tau = \sigma \circ \tau \). For \(\ell \in \{\sigma ,\tau ,\sigma \tau \}\) let \(K_{\ell }\) be the subfield of \(L\) fixed by \(\ell \). Define \(U = \mathcal {O}_{K_{\sigma }}^\times \cdot \mathcal {O}_{K_{\tau }}^\times \cdot \sigma (\mathcal {O}_{K_{\sigma \tau }}^\times )\). Then

$$\begin{aligned} (\mathcal {O}_{L}^\times )^2\le U \le \mathcal {O}_{L}^\times . \end{aligned}$$

Proof

\(\mathcal {O}_{K_{\sigma }}^\times \), \(\mathcal {O}_{K_{\tau }}^\times \), and \(\mathcal {O}_{K_{\sigma \tau }}^\times \) are subgroups of \(\mathcal {O}_{L}^\times \). The automorphism \(\sigma \) preserves \(\mathcal {O}_{L}^\times \), so \(\sigma (\mathcal {O}_{K_{\sigma \tau }}^\times )\) is a subgroup of \(\mathcal {O}_{L}^\times \). Hence U is a subgroup of \(\mathcal {O}_{L}^\times \).

For the first inclusion, let \(u \in \mathcal {O}_{L}^\times \). Then \(N_{L:K_{\ell }}(u) \in \mathcal {O}_{K_{\ell }}^\times \) for \(\ell \in \{\sigma ,\tau ,\sigma \tau \}\). Each non-identity automorphism of \(L\) has order 2, so in particular each \(\ell \in \mathord {\left\{ {\sigma ,\tau ,\sigma \tau }\right\} }\) has order 2 (if \(\sigma \tau \) is the identity then \(\sigma =\sigma \sigma \tau =\tau \), contradiction), so \(N_{L:K_{\ell }}(u)=u\cdot \ell (u)\). We thus have

$$\begin{aligned} \frac{N_{L:K_{\sigma }}(u)N_{L:K_{\tau }}(u)}{\sigma (N_{L:K_{\sigma \tau }}(u))} = \frac{u\cdot \sigma (u)\cdot u\cdot \tau (u)}{\sigma (u\cdot \sigma \tau (u))} = u^2. \end{aligned}$$

Hence \(u^2= N_{L:K_{\sigma }}(u)N_{L:K_{\tau }}(u) \sigma (N_{L:K_{\sigma \tau }}(u^{-1})) \in U\). This is true for each \(u\in \mathcal {O}_{L}^\times \), so \((\mathcal {O}_{L}^\times )^2\) is a subgroup of U.    \(\square \)

Theorem 5.2

Let \(L\) be a real multiquadratic field of degree N. Let Q be the group of multiquadratic units of \(L\). Then \(\mathcal {O}_{L}^\times =Q\) if \(N=1\), and \((\mathcal {O}_{L}^\times )^{N/2}\le Q\) if \(N\ge 2\). In both cases \((\mathcal {O}_{L}^\times )^N\le Q\).

Proof

Induct on N. If \(N=1\) then \(L=\mathbb {Q}\) so \(\mathcal {O}_{L}^\times =\left\langle -1\right\rangle =Q\). If \(N=2\) then \(L\) is a real quadratic field so \(\mathcal {O}_{L}^\times =\left\langle -1,\varepsilon _{1}\right\rangle =Q\) where \(\varepsilon _{1}\) is the normalized fundamental unit of \(L\).

Assume from now on that \(N\ge 4\). By Theorem 2.2, \(L\) can be expressed as \(\mathbb {Q}(\sqrt{d_{1}},\ldots ,\sqrt{d_{n}})\) where \(d_1,\dots ,d_n\) are positive integers meeting the conditions of Theorem 2.1 and \(N=2^n\).

Define \(\sigma \) as the automorphism of \(L\) that preserves \(\sqrt{d_{1}},\dots ,\sqrt{d_{n}}\) except for negating \(\sqrt{d_{n}}\). The field \(K_{\sigma }\) fixed by \(\sigma \) is \(\mathbb {Q}(\sqrt{d_{1}},\ldots ,\sqrt{d_{n-1}})\), a real multiquadratic field of degree N/2. Write \(Q_\sigma \) for the group of multiquadratic units of \(K_{\sigma }\). By the inductive hypothesis, \((\mathcal {O}_{K_{\sigma }}^\times )^{N/4}\le Q_\sigma \le Q\).

Define \(\tau \) as the automorphism of \(L\) that preserves \(\sqrt{d_{1}},\dots ,\sqrt{d_{n}}\) except for negating \(\sqrt{d_{n-1}}\). Then the field \(K_{\tau }\) fixed by \(\tau \) is \(\mathbb {Q}(\sqrt{d_{1}},\dots ,\sqrt{d_{n-2}},\sqrt{d_{n}})\), and the field \(K_{\sigma \tau }\) fixed by \(\sigma \tau \) is \(\mathbb {Q}(\sqrt{d_{1}},\dots ,\sqrt{d_{n-2}},\sqrt{d_{n-1}d_{n}})\). Both of these are real multiquadratic fields of degree N/2, so \((\mathcal {O}_{K_{\tau }}^\times )^{N/4}\le Q\) and \((\mathcal {O}_{K_{\sigma \tau }}^\times )^{N/4}\le Q\). The automorphism \(\sigma \) preserves Q, so \(\sigma (\mathcal {O}_{K_{\sigma \tau }}^\times )^{N/4}\le Q\).

By Lemma 5.1, \((\mathcal {O}_{L}^\times )^2\le \mathcal {O}_{K_{\sigma }}^\times \cdot \mathcal {O}_{K_{\tau }}^\times \cdot \sigma (\mathcal {O}_{K_{\sigma \tau }}^\times )\). Simply take (N / 4)th powers: \((\mathcal {O}_{L}^\times )^{N/2} \le (\mathcal {O}_{K_{\sigma }}^\times )^{N/4}\cdot (\mathcal {O}_{K_{\tau }}^\times )^{N/4}\cdot \sigma (\mathcal {O}_{K_{\sigma \tau }}^\times )^{N/4}\le Q\).    \(\square \)

5.3 Representing Units: Logarithms and Approximate Logarithms

Sections 5.4 and 5.5 will use Lemma 5.1, quadratic characters, and square-root computations to obtain a list of generators for \(\mathcal {O}_{L}^\times \). However, this is usually far from a minimal-size list of generators. Given this list of generators we would like to produce a basis for \(\mathcal {O}_{L}^\times \). This means a list of \(N-1\) elements \(u_1,\dots ,u_{N-1}\in \mathcal {O}_{L}^\times \) such that each element of \(\mathcal {O}_{L}^\times \) can be written uniquely as \(\zeta u_1^{e_1}\cdots u_{N-1}^{e_{N-1}}\) where \(\zeta \) is a root of unity; i.e., as \(\pm u_1^{e_1}\cdots u_{N-1}^{e_{N-1}}\). In other words, it is a list of independent generators of \(\mathcal {O}_{L}^\times /\mathord {\left\{ {\pm 1}\right\} }\).

A basis \(u_1,\dots ,u_{N-1}\) for \(\mathcal {O}_{L}^\times \) is traditionally viewed as a lattice basis in the usual sense: specifically, as the basis \(\text {Log }u_1,\dots ,\text {Log }u_{N-1}\) for the lattice \(\text {Log }\mathcal {O}_{L}^\times \), where \({{\mathrm{Log}}}\) is Dirichlet’s logarithm map. However, this view complicates the computation of a basis. We instead view a basis \(u_1,\dots ,u_{N-1}\) for \(\mathcal {O}_{L}^\times \) as a basis \(\text {ApproxLog }u_1,\dots ,\text {ApproxLog }u_{N-1}\) for the lattice \(\text {ApproxLog }\mathcal {O}_{L}^\times \), where \({{\mathrm{ApproxLog}}}\) is an “approximate logarithm map”. We define our approximate logarithm map here, explain why it is useful, and explain how we use the approximate logarithm map in our representation of units. In Sect. 5.5 we use \({{\mathrm{ApproxLog}}}\) to reduce a list of generators to a basis.

Dirichlet’s Logarithm Map. Let \(\sigma _{1}, \sigma _{2}, \dots , \sigma _{N}\) be (in some order) the embeddings of \(L\) into \(\mathbb {C}\), i.e., the ring homomorphisms \(L\rightarrow \mathbb {C}\). Since \(L\) is Galois, these are exactly the automorphisms of \(L\). Dirichlet’s logarithm map \({{\mathrm{Log}}}: L^\times \rightarrow \mathbb {R}^N\) is defined as follows:

$$\begin{aligned} {{\mathrm{Log}}}(u) = (\ln |\sigma _{1} (u)|, \ln |\sigma _{2} (u)|, \dots , \ln |\sigma _{N}(u)|). \end{aligned}$$

This map has several important properties. It is a group homomorphism from the multiplicative group \(L^\times \) to the additive group \(\mathbb {R}^N\). The kernel of \({{\mathrm{Log}}}\) restricted to \(\mathcal {O}^\times _L\) is the cyclic group of roots of unity in \(L\), namely \(\mathord {\left\{ {1,-1}\right\} }\). The image \({{\mathrm{Log}}}(\mathcal {O}^\times _L)\) forms a lattice of rank \(N-1\), called the log-unit lattice.

Given units \(u_1,\dots ,u_b\) generating \(\mathcal {O}_{L}^\times \), one can compute \({{\mathrm{Log}}}(u_1),\dots ,{{\mathrm{Log}}}(u_b)\) in \(\mathbb {R}^N\), and then reduce these images to linearly independent vectors in \(\mathbb {R}^N\) by a chain of additions and subtractions, obtaining a basis for the log-unit lattice. Applying the corresponding chain of multiplications and divisions to the original units produces a basis for \(\mathcal {O}_{L}^\times \).

However, elements of \(\mathbb {R}\) are conventionally represented as nearby rational numbers. “Computing” \({{\mathrm{Log}}}(u_1),\dots ,{{\mathrm{Log}}}(u_b)\) thus means computing nearby vectors of rational numbers. The group generated by these vectors usually has rank larger than \(N-1\): instead of producing \(N-1\) linearly independent vectors and \(b-(N-1)\) zero vectors, reduction can produce as many as b linearly independent vectors.

One can compute approximate linear dependencies by paying careful attention to floating-point errors. An alternative is to use p-adic techniques as in [7]. Another alternative is to represent logarithms in a way that allows all of the necessary real operations to be carried out without error: for example, one can verify that \(\text {Log }u>\text {Log }v\) by using interval arithmetic in sufficiently high precision, and one can verify that \(\text {Log }u=\text {Log }v\) by checking that u/v is a root of unity.

Approximate Logarithms. We instead sidestep these issues by introducing an approximate logarithm function \({{\mathrm{ApproxLog}}}\) as a replacement for the logarithm function \({{\mathrm{Log}}}\). This new function is a group homomorphism from \(\mathcal {O}_{L}^\times \) to \(\mathbb {R}^N\). Its image is a lattice of rank \(N-1\), which we call the approximate unit lattice. Its kernel is the group of roots of unity in L. The advantage of \({{\mathrm{ApproxLog}}}\) over \({{\mathrm{Log}}}\) is that all the entries of \({{\mathrm{ApproxLog}}}(u)\) are rationals, allowing exact linear algebra.

To define \({{\mathrm{ApproxLog}}}\), we first choose N linearly independent vectors

$$\begin{aligned} (1,1,\ldots ,1),{{\mathrm{ApproxLog}}}(\varepsilon _{1}),\dots ,{{\mathrm{ApproxLog}}}(\varepsilon _{N-1})\in \mathbb {Q}^N, \end{aligned}$$

where \(\varepsilon _{1},\dots ,\varepsilon _{N-1}\) are the normalized fundamental units of the quadratic subfields of \(L\) as before; \((1,1,\ldots ,1)\) is included here to simplify other computations. We then extend the definition by linearity to the group \(\left\langle -1,\varepsilon _{1},\dots ,\varepsilon _{N-1}\right\rangle \) of multiquadratic units: if

$$\begin{aligned} u = \pm \prod _{j=1}^{N-1} \varepsilon _{j}^{e_{j}} \end{aligned}$$

then we define \({{\mathrm{ApproxLog}}}(u)\) as \(\sum _j e_j{{\mathrm{ApproxLog}}}(\varepsilon _{j})\). Finally, we further extend the definition by linearity to all of \(\mathcal {O}_{L}^\times \): if \(u\in \mathcal {O}_{L}^\times \) then \(u^N\) is a multiquadratic unit by Theorem 5.2, and we define \({{\mathrm{ApproxLog}}}(u)\) as \({{\mathrm{ApproxLog}}}(u^N)/N\). It is easy to check that \({{\mathrm{ApproxLog}}}\) is a well-defined group homomorphism.

For example, one can take \({{\mathrm{ApproxLog}}}(\varepsilon _{1})=(1,0,\ldots ,0,0)\), \({{\mathrm{ApproxLog}}}(\varepsilon _{2})=(0,1,\ldots ,0,0)\), and so on through \({{\mathrm{ApproxLog}}}(\varepsilon _{N-1})=(0,0,\ldots ,1,0)\). Then \({{\mathrm{ApproxLog}}}(u)=(e_1/N,e_2/N,\ldots ,e_{N-1}/N,0)\) if \(u^N=\pm \varepsilon _{1}^{e_1}\varepsilon _{2}^{e_2}\cdots \varepsilon _{N-1}^{e_{N-1}}\). In other words, write each unit modulo \(\pm 1\) as a product of powers of \(\varepsilon _{1},\dots ,\varepsilon _{N-1}\); \({{\mathrm{ApproxLog}}}\) is then the exponent vector.

We actually define \({{\mathrm{ApproxLog}}}\) to be numerically much closer to \({{\mathrm{Log}}}\). We choose a precision parameter \(\beta \), and we choose each entry of \({{\mathrm{ApproxLog}}}(\varepsilon _{j})\) to be a multiple of \(2^{-\beta }\) within \(2^{-\beta }\) of the corresponding entry of \({{\mathrm{Log}}}(\varepsilon _{j})\). Specifically, we build \({{\mathrm{ApproxLog}}}(\varepsilon _{j})\) as follows:

  • Compute the regulator \(R=\ln (\varepsilon _{j})\) to slightly more than \(\beta +\log _2 R\) bits of precision.

  • Round the resulting approximation to a (nonzero) multiple \(R'\) of \(2^{-\beta }\).

  • Build a vector with \(R'\) at the N/2 positions i for which \(\sigma _i(\varepsilon _{j})=\varepsilon _{j}\), and with \(-R'\) at the remaining N/2 positions i.

The resulting vectors \({{\mathrm{ApproxLog}}}(\varepsilon _{1}),\dots ,{{\mathrm{ApproxLog}}}(\varepsilon _{N-1})\) are orthogonal to each other and to \((1,1,\ldots ,1)\).
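The construction above can be sketched in a few lines of Python (ours, illustrative; the precision bookkeeping in our implementation is more careful). The point is that every entry is an exact rational multiple of \(2^{-\beta }\), so all later linear algebra on \({{\mathrm{ApproxLog}}}\) vectors is exact.

```python
from decimal import Decimal, getcontext
from fractions import Fraction

def approxlog_quadratic_unit(a, b, d, signs, beta=64):
    """ApproxLog vector (exact rationals) for a quadratic unit eps = a + b*sqrt(d) > 1.

    signs[i] is +1 at the N/2 embeddings fixing eps and -1 at the others.
    Every entry is a multiple of 2^-beta within 2^-beta of the corresponding entry of Log(eps).
    """
    getcontext().prec = beta // 3 + 30                       # comfortably more than beta bits
    R = (Decimal(a) + Decimal(b) * Decimal(d).sqrt()).ln()   # regulator ln(eps)
    Rq = Fraction(int((R * 2 ** beta).to_integral_value()), 2 ** beta)
    return [s * Rq for s in signs]

# eps_1 = 1 + sqrt(2) inside the degree-4 field Q(sqrt(2), sqrt(3)); with the embeddings
# ordered so that the first two fix sqrt(2), the vector is (R', R', -R', -R'):
print(approxlog_quadratic_unit(1, 1, 2, [1, 1, -1, -1]))
```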

How Units are Represented. Each unit in Algorithms 5.1 and 5.2 is implicitly represented as a pair consisting of (1) the usual representation of an element of \(L\) and (2) the vector \({{\mathrm{ApproxLog}}}(u)\). After the initial computation of \(\ln (\varepsilon _{j})\) for each j, all subsequent units are created as products (or quotients) of previous units, with sums (or differences) of the \({{\mathrm{ApproxLog}}}\) vectors; or square roots of previous units, with the \({{\mathrm{ApproxLog}}}\) vectors multiplied by 1/2. This approach ensures that we do not have to compute \(\ln |\sigma (u)|\) for the subsequent units u.

As mentioned in Sect. 5.1, we assume that each quadratic field \(\mathbb {Q}(\sqrt{d})\) has \(\log d\in (\log N)^{O(1)}=n^{O(1)}\), so \(\log R\in n^{O(1)}\). We also take \(\beta \in n^{O(1)}\), so each entry of \({{\mathrm{ApproxLog}}}(\varepsilon _{j})\) has \(n^{O(1)}\) bits. One can deduce an \(n^{O(1)}\) bound on the number of bits in any entry of any \({{\mathrm{ApproxLog}}}\) vector used in our algorithms, so adding two such vectors takes time \(n^{O(1)}N\), i.e., essentially N.

For comparison, recall that multiplication takes time essentially NB, where B is the maximum number of bits in any coefficient of the field elements being multiplied. For normalized fundamental units, this number of bits is essentially R, i.e., quasipolynomial in N, rather than \(\log R\), i.e., polynomial in n.

5.4 Pinpointing Squares of Units Inside Subgroups of the Unit Group

Algorithm 5.1, UnitsGivenSubgroup, is given generators \(u_1,\dots ,u_b\) of any group U with \((\mathcal {O}_{L}^\times )^2\le U\le \mathcal {O}_{L}^\times \). It outputs generators of \(\mathcal {O}_{L}^\times /\mathord {\left\{ {\pm 1}\right\} }\).

The algorithm begins by building enough characters \(\chi _1,\dots ,\chi _m\) that are defined and nonzero on U. Recall from Sect. 4.2 that m is chosen to be slightly larger than N.

For each \(u\in U\) define X(u) as the vector \((\log _{-1}(\chi _1(u)),\dots ,\log _{-1}(\chi _m(u)))\in (\mathbb {Z}/2)^m\). If \(u\in (\mathcal {O}_{L}^\times )^2\) then \(X(u)=0\). Conversely, if \(u\in U\) and \(X(u)=0\) then (heuristically, with overwhelming probability) \(u=v^2\) for some \(v\in L\); this v must be a unit, so \(u\in (\mathcal {O}_{L}^\times )^2\).

The algorithm assembles the rows \(X(u_1),\dots ,X(u_b)\) into a matrix M; computes a basis S for the left kernel of M; lifts each element \((S_{i1},\dots ,S_{ib})\) of this basis to a vector of integers, each entry 0 or 1; and computes \(s_i=u_1^{S_{i1}}\cdots u_b^{S_{ib}}\). By definition \(X(s_i)=S_{i1}X(u_1)+\cdots +S_{ib}X(u_b)=0\), so \(s_i\in (\mathcal {O}_{L}^\times )^2\). The algorithm computes a square root \(v_i\) of each \(s_i\), and it outputs \(u_1,\dots ,u_b,v_1,v_2,\dots \).

To see that \(-1,u_1,\dots ,u_b,v_1,v_2,\dots \) generate \(\mathcal {O}_{L}^\times \), consider any \(u\in \mathcal {O}_{L}^\times \). By definition \(u^2\in (\mathcal {O}_{L}^\times )^2\), so \(u^2\in U\), so \(u^2=u_1^{e_1}\cdots u_b^{e_b}\) for some \(e_1,\dots ,e_b\in \mathbb {Z}\). Furthermore \(X(u^2)=0\) so \(e_1X(u_1)+\cdots +e_bX(u_b)=0\); i.e., the vector \((e_1\bmod 2,\dots ,e_b\bmod 2)\) in \((\mathbb {Z}/2)^b\) is in the left kernel of M. By definition S is a basis for this left kernel, so \((e_1\bmod 2,\dots ,e_b\bmod 2)\) is a linear combination of the rows of S modulo 2; i.e., \((e_1,\dots ,e_b)\) is some \((2f_1,\dots ,2f_b)\) plus a linear combination of the rows of S; i.e., \(u^2\) is \(u_1^{2f_1}\cdots u_b^{2f_b}\) times a product of powers of \(s_i\); i.e., u is \(\pm u_1^{f_1}\cdots u_b^{f_b}\) times a product of powers of \(v_i\).

Algorithm 5.1: UnitsGivenSubgroup.
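A minimal Python model of the kernel step in Algorithm 5.1 follows (ours; the actual implementation uses Sage's linear algebra). Each character vector \(X(u_j)\) is packed into an integer, one bit per character, and incremental Gaussian elimination over \(\mathbb {F}_2\) records which combination of rows produced each dependency; the returned combinations are the exponent vectors \((S_{i1},\dots ,S_{ib})\).

```python
def left_kernel_gf2(rows):
    """Basis of the left kernel over GF(2) of the character matrix M.

    rows[j] is X(u_{j+1}) packed into an integer, one bit per character.
    Each returned integer S has bit j set iff u_{j+1} appears in the corresponding product s_i;
    by construction the sum of the selected rows is 0, i.e. X(s_i) = 0.
    """
    kernel, pivots = [], {}          # pivots: column -> (reduced row, combination)
    for j, row in enumerate(rows):
        comb = 1 << j                # start with the single generator u_{j+1}
        for col in sorted(pivots, reverse=True):
            if (row >> col) & 1:
                prow, pcomb = pivots[col]
                row ^= prow
                comb ^= pcomb
        if row == 0:
            kernel.append(comb)      # dependency: this combination has all characters +1
        else:
            pivots[row.bit_length() - 1] = (row, comb)
    return kernel

# toy example: X(u_3) = X(u_1) + X(u_2), so u_1*u_2*u_3 is (heuristically) a square
rows = [0b1011, 0b0110, 0b1101]
print([bin(s) for s in left_kernel_gf2(rows)])   # ['0b111']
```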

Complexity Analysis and Improvements. Assume that the inputs \(u_1,\dots ,u_b\) have at most B bits in each coefficient. Each of the products \(s_1,s_2,\dots \) is a product of at most b inputs, and thus has, at worst, essentially bB bits in each coefficient.

Computing the character matrix M takes time essentially \(bN(b+B)\); see Sect. 4.2. Computing S takes \(O(N^3)\) operations by Gaussian elimination over \(\mathbb {F}_2\); one can obtain a better asymptotic exponent here using fast matrix multiplication, but this is not a bottleneck in any case. Computing one product \(s_i\) takes time essentially bNB with a product tree, and computing its square root \(v_i\) takes time essentially \(bN^{\log _2 3}B\). There are at most b values of i.

Our application of this algorithm has \(b\in \varTheta (N)\). The costs are essentially \(N^3+N^2B\) for characters, \(N^3\) for kernel computation, \(N^3 B\) for products, and \(N^{2+\log _2 3}B\) for square roots.

These bounds are too pessimistic, for three reasons. First, experiments show that products often have far fewer factors, and are thus smaller and faster to compute. Second, one can enforce a limit upon output size by integrating the algorithm with lattice-basis reduction (see Sect. 5.5), computing products and square roots only after reduction. Third, we actually use the technique of Sect. 3.5 (in the full version of this paper) to compute products of powers.

5.5 A Complete Algorithm to Compute the Unit Group

Algorithm 5.2 computes a basis for \(\mathcal {O}_{L}^\times \), given a real multiquadratic field \(L\).

As usual write N for the degree of \(L\). There is no difficulty if \(N=1\). For \(N=2\), the algorithm calls standard subroutines cited in Sect. 5.1. For \(N\ge 4\), the algorithm calls itself recursively on three subfields of degree N/2; merges the results into generators for a subgroup \(U\le \mathcal {O}_{L}^\times \) such that \((\mathcal {O}_{L}^\times )^2\le U\); calls UnitsGivenSubgroup to find generators for \(\mathcal {O}_{L}^\times \); and then uses lattice-basis reduction to find a basis for \(\mathcal {O}_{L}^\times \). A side effect of lattice-basis reduction is that the basis is short, although it is not guaranteed to be minimal.

Algorithm 5.2: Units.

The Subgroup and the Generators. Lemma 5.1 defines \(U = \mathcal {O}_{K_{\sigma }}^\times \cdot \mathcal {O}_{K_{\tau }}^\times \cdot \sigma (\mathcal {O}_{K_{\sigma \tau }}^\times )\) where \(\sigma ,\tau \) are distinct non-identity automorphisms of \(L\).

The three subfields used in the algorithm are \(K_{\sigma }\), \(K_{\tau }\), and \(K_{\sigma \tau }\). The recursive calls produce lists of generators for \(\mathcal {O}_{K_{\sigma }}^\times /\mathord {\left\{ {\pm 1}\right\} }\), \(\mathcal {O}_{K_{\tau }}^\times /\mathord {\left\{ {\pm 1}\right\} }\), and \(\mathcal {O}_{K_{\sigma \tau }}^\times /\mathord {\left\{ {\pm 1}\right\} }\) respectively. The algorithm builds a list G that contains each element of the first list; each element of the second list; \(\sigma \) applied to each element of the third list; and \(-1\). Then G generates U. As a speedup, we sort G to remove duplicates.

We cache the output of \(\mathop {\text {Units}}\nolimits (L)\) for subsequent reuse (without saying so explicitly in Algorithm 5.2). For example, if \(L=\mathbb {Q}(\sqrt{2},\sqrt{3},\sqrt{5})\), then the three subfields might be \(\mathbb {Q}(\sqrt{2},\sqrt{3})\), \(\mathbb {Q}(\sqrt{2},\sqrt{5})\), and \(\mathbb {Q}(\sqrt{2},\sqrt{15})\), and the next level of recursion involves \(\mathbb {Q}(\sqrt{2})\) three times. We perform the \(\mathop {\text {Units}}\nolimits (\mathbb {Q}(\sqrt{2}))\) computation once and then simply reuse the results the next two times.

The overall impact of caching depends on how \(\sigma \) and \(\tau \) are chosen (which is also not specified in Algorithm 5.2). We use the following specific strategy. As usual write \(L\) as \(\mathbb {Q}(\sqrt{d_{1}},\ldots ,\sqrt{d_{n}})\), where \(d_{1},\dots ,d_{n}\) are integers meeting the conditions of Theorem 2.1. Assume that \(0<d_{1}<\dots <d_{n}\). Choose \(\sigma \) and \(\tau \) such that \(K_\sigma = \mathbb {Q}(\sqrt{d_{1}}, \sqrt{d_{2}}, \dots , \sqrt{d_{n-1}})\) and \(K_\tau = \mathbb {Q}(\sqrt{d_{1}}, \sqrt{d_{2}}, \dots , \sqrt{d_{n-2}}, \sqrt{d_{n}})\). We depict the resulting set of subfields in Figs. 5.1 and 5.2. Notice that, in Figs. 5.1 and 5.2, the leftmost field in each horizontal layer is a subfield used by all fields in the horizontal layer above it.

Fig. 5.1. How to pick subfields for the recursive algorithm for multiquadratic fields of degree 8.

Fig. 5.2. How to pick subfields for the recursive algorithm for multiquadratic fields of degree 16.

With this strategy, the recursion reaches exactly \(2^{n-\ell +1} - 1\) subfields of degree \(2^\ell \), namely the subfields of the form \(\mathbb {Q}(\sqrt{d_{1}},\dots ,\sqrt{d_{\ell -1}},\sqrt{D})\) where D is a product of a nonempty subset of \(\mathord {\left\{ {d_{\ell },\dots ,d_{n}}\right\} }\). With a less disciplined strategy, randomly picking 3 subfields of degree N/2 at each step, we would instead end up with nearly \(3^{n- \ell }\) subfields of degree \(2^\ell \). “Nearly” accounts for accidental collisions and for the limited number of subfields of low degree.
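This count is easy to check by simulating the recursion. The Python model below (ours) represents a field abstractly as a tuple of generators, each generator being the set of indices j whose \(d_{j}\) occur in it, and reproduces the \(2^{n-\ell +1}-1\) counts, including the effect of caching.

```python
def subfields_reached(n):
    """Fields on which Units() is invoked by the sigma/tau strategy of Sect. 5.5.

    A field is a tuple of generators; each generator is a frozenset J of indices,
    standing for sqrt of the product of the d_j with j in J.
    """
    seen = set()

    def recurse(gens):
        if gens in seen:            # caching: Units is computed once per field
            return
        seen.add(gens)
        if len(gens) <= 1:          # degree 1 or 2: handled by standard subroutines
            return
        *rest, g1, g2 = gens
        recurse(tuple(rest) + (g1,))          # K_sigma:     drop the last generator
        recurse(tuple(rest) + (g2,))          # K_tau:       drop the next-to-last one
        recurse(tuple(rest) + (g1 ^ g2,))     # K_sigma_tau: their product (up to squares)

    recurse(tuple(frozenset({j}) for j in range(n)))
    return seen

for n in range(2, 7):
    by_degree = {}
    for f in subfields_reached(n):
        by_degree[2 ** len(f)] = by_degree.get(2 ** len(f), 0) + 1
    print(n, sorted(by_degree.items()))   # 2^(n-l+1)-1 fields of degree 2^l for each l < n
```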

Finding Short Bases Given Generators. Applying Pohst’s modified LLL algorithm [37] to the vectors \({{\mathrm{ApproxLog}}}(u_1),\ldots ,{{\mathrm{ApproxLog}}}(u_b)\) would find \(b-(N-1)\) zero vectors and \(N-1\) independent short combinations of the input vectors. The algorithm is easily extended to produce an invertible \(b\times b\) transformation matrix T that maps the input vectors to the output vectors. (The algorithm in [37] already finds the part of T corresponding to the zero outputs.) We could simply use the entries of any such T as exponents of \(u_j\) in our algorithm. It is important to realize, however, that there are many possible choices of T (except in the extreme case \(b=N-1\)), and the resulting computations are often much slower than necessary. For example, if \(u_3=u_1u_2\), then an output \(u_1/u_2\) might instead be computed as \(u_1^{1001}u_2^{999}/u_3^{1000}\).

We instead apply LLL to the matrix A shown in Algorithm 5.2. This has three effects. First, if H is chosen sufficiently large, then the right side of A is reduced to \(b-(N-1)\) zero vectors and \(N-1\) independent short combinations of the vectors \(H\cdot {{\mathrm{ApproxLog}}}(u_1),\dots ,H\cdot {{\mathrm{ApproxLog}}}(u_b)\). (We check that there are exactly \(b-(N-1)\) zero vectors.) Second, the left side of A keeps track of the transformation matrix that is used. Third, this transformation matrix is automatically reduced: short coefficients are found for the \(b-(N-1)\) zero vectors, and these coefficients are used to reduce the coefficients for the \(N-1\) independent vectors.

An upper bound on LLL cost can be computed as follows. LLL in dimension N, applied to integer vectors where each vector has O(B) bits, uses \(O(N^4 B)\) arithmetic operations on integers with O(NB) bits; see [35, Proposition 1.26]. The total time is bounded by essentially \(N^5 B^2\). To bound B one can bound each \(H\cdot {{\mathrm{ApproxLog}}}(\cdots )\). To bound H one can observe that the transformation matrix has, at worst, essentially N bits in each coefficient (see, e.g., [39]), while the required precision of \({{\mathrm{ApproxLog}}}\) is essentially 1, so it suffices to take essentially N bits in H. The total time is, at worst, essentially \(N^7\).

Our experiments show much better LLL performance for these inputs. We observe LLL actually using very few iterations; evidently the input vectors are already very close to being reduced. It seems plausible to conjecture that the entries of the resulting transformation matrix have at most \(n^{O(1)}\) bits, and that it suffices to take H with \(n^{O(1)}\) bits, producing B bounded by \(n^{O(1)}\). The total time might be as small as essentially \(N^3\), depending on how many iterations there are.

6 Finding Generators of Ideals

This section presents the main contribution of this paper: a fast pre-quantum algorithm to compute a nonzero g in a multiquadratic ring, given the ideal generated by g. For simplicity we focus on the real case, as in Sect. 5. The algorithm takes quasipolynomial time under reasonable heuristic assumptions if \(d_{1},\dots ,d_{n}\) are quasipolynomial.

The algorithm reuses the equation \(g^2=N_{L:K_\sigma }(g)N_{L:K_\tau }(g)/\sigma (N_{L:K_{\sigma \tau }}(g))\) that was used for unit-group computation in Sect. 5. To compute \(N_{L:K}(g)\), the algorithm computes the corresponding norm of the input ideal, and then calls the same algorithm recursively.

The main algebraic difficulty here is that there are many generators of the same ideal: one can multiply g by any unit, such as \(-1\) or \(1+\sqrt{2}\), to obtain another generator. What the algorithm actually produces is some ug where u is a unit. This means that the recursion produces unit multiples of \(N_{L:K_\sigma }(g)\) etc., and thus produces some \(vg^2\) rather than \(g^2\). The extra unit v might not be a square, so we cannot simply compute the square root of \(vg^2\). Instead we again use the techniques of Sect. 4, together with the unit group computed in Sect. 5, to find a unit u such that \(u(vg^2)\) is a square, and we then compute the square root.

6.1 Representing Ideals and Computing Norms of Ideals

Let \(L\) be a real multiquadratic field of degree \(N=2^n\). Let \(\mathcal {R}\) be an order inside \(L\), such as \(\mathbb {Z}[\sqrt{d_{1}},\ldots ,\sqrt{d_{n}}]\) inside \(\mathbb {Q}(\sqrt{d_{1}},\ldots ,\sqrt{d_{n}})\). Our algorithm does not require \(\mathcal {R}\) to be the ring of integers \(\mathcal {O}_L\), although its output allows arbitrary units from the ring of integers; i.e., if the input is a principal ideal \(\mathcal {I}\) of \(\mathcal {R}\) then the output is some \(g\in \mathcal {O}_L\) such that \(g\mathcal {O}_L=\mathcal {I}\mathcal {O}_L\). Equivalently, one can (with or without having computed \(\mathcal {O}_L\)) view \(\mathcal {I}\) as representing the ideal \(\mathcal {I}\mathcal {O}_L\) of \(\mathcal {O}_L\).

We consider three representations of an ideal \(\mathcal {I}\) of \(\mathcal {R}\):

  • One standard representation is as a \(\mathbb {Z}\)-basis \(\omega _{1}, \omega _{2}, \dots , \omega _{N}\in \mathcal {R}\), i.e., a basis of \(\mathcal {I}\) as a lattice.

  • A more compact standard representation is the “two-element representation” \((\alpha _1,\alpha _2)\) representing \(\mathcal {I}=\alpha _1 \mathcal {R}+\alpha _2 \mathcal {R}\), typically with \(\alpha _1\in \mathbb {Z}\). If \(\mathcal {R}\ne \mathcal {O}_L\) then \(\mathcal {I}\) might not have a two-element representation, but failure to convert \(\mathcal {I}\) to a two-element representation reveals a larger order.

  • Our target cryptosystem in Appendix A uses another representation that works for many, but certainly not all, ideals of \(\mathcal {R}=\mathbb {Z}[\sqrt{d_{1}},\ldots ,\sqrt{d_{n}}]\): namely, \((q,s_1,\dots ,s_n)\in \mathbb {Z}^{n+1}\), where each \(s_j\) is a nonzero square root of \(d_{j}\) modulo q and where q is odd, representing \(\mathcal {I}=q\mathcal {R}+(\sqrt{d_{1}}-s_1)\mathcal {R}+\dots +(\sqrt{d_{n}}-s_n)\mathcal {R}\).

Our algorithm works with any representation that allows basic ideal operations, such as ideal norms, which we discuss next. Performance depends on the choice of representation.

Let \(\sigma \) be a nontrivial automorphism of \(L\), and let K be its fixed field; then K is a subfield of \(L\) with \([L:K]=2\). Assume that \(\sigma (\mathcal {R})=\mathcal {R}\), and let \(\mathcal {S}\) be the order \(K\cap \mathcal {R}\) inside K. For example, if \(\mathcal {R}=\mathbb {Z}[\sqrt{d_{1}},\ldots ,\sqrt{d_{n}}]\) and \(\sigma \) preserves \(\sqrt{d_{1}},\dots ,\sqrt{d_{n-1}}\) while negating \(\sqrt{d_{n}}\), then \(\mathcal {S}=\mathbb {Z}[\sqrt{d_{1}},\ldots ,\sqrt{d_{n-1}}]\); if \(\mathcal {R}=\mathbb {Z}[\sqrt{d_{1}},\ldots ,\sqrt{d_{n}}]\) and \(\sigma \) preserves \(\sqrt{d_{1}},\dots ,\sqrt{d_{n-2}}\) while negating \(\sqrt{d_{n-1}},\sqrt{d_{n}}\), then \(\mathcal {S}=\mathbb {Z}[\sqrt{d_{1}},\ldots ,\sqrt{d_{n-2}},\sqrt{d_{n-1}d_{n}}]\).

The relative norm \(N_{L:K}(\mathcal {I})\) is, by definition, \(\mathcal {I}\sigma (\mathcal {I})\cap K\), which is the same as \(\mathcal {I}\sigma (\mathcal {I})\cap \mathcal {S}\). This is an ideal of \(\mathcal {S}\). It has two important properties: it is not difficult to compute; and if \(\mathcal {I}=g\mathcal {R}\) then \(N_{L:K}(\mathcal {I})=N_{L:K}(g)\mathcal {S}\). See, e.g., [20].

Given a \(\mathbb {Z}\)-basis of \(\mathcal {I}\), one can compute a \(\mathbb {Z}\)-basis of \(N_{L:K}(\mathcal {I})\) by computing \(\{\omega _{i}\cdot \sigma (\omega _{j})\mid 1\le i\le j\le N\}\), transforming this into a Hermite-Normal-Form (HNF) basis for \(\mathcal {I}\sigma (\mathcal {I})\), and intersecting with \(\mathcal {S}\). A faster approach appears in [6]: compute a two-element representation of \(\mathcal {I}\); multiply the two elements by a \(\mathbb {Z}\)-basis for \(\sigma (\mathcal {I})\); convert to HNF form; and intersect with \(\mathcal {S}\), obtaining a \(\mathbb {Z}\)-basis for \(N_{L:K}(\mathcal {I})\). This takes total time essentially \(N^5 B\).

The \((q,s_1,\dots ,s_n)\) representation allows much faster norms, and is used in our software. The norm to \(\mathbb {Z}[\sqrt{d_{1}},\ldots ,\sqrt{d_{n-1}}]\) is simply \((q,s_1,\dots ,s_{n-1})\), and the norm to \(\mathbb {Z}[\sqrt{d_{1}},\ldots ,\sqrt{d_{n-2}},\sqrt{d_{n-1}d_{n}}]\) is simply \((q,s_1,\dots ,s_{n-2},s_{n-1}s_n)\).
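As a concrete illustration (our own sketch, not the paper's software; the tuple layout and helper names are invented for exposition), these two norms amount to dropping an entry or multiplying the last two entries modulo q:

```python
# Sketch: relative norms in the (q, s_1, ..., s_n) ideal representation.
# An ideal is stored as (q, [s_1, ..., s_n]) with each s_j a square root
# of d_j modulo q.  Names and layout are illustrative only.

def norm_drop_last(ideal):
    """Norm to Z[sqrt(d_1), ..., sqrt(d_{n-1})]: forget s_n."""
    q, s = ideal
    return (q, s[:-1])

def norm_merge_last_two(ideal):
    """Norm to Z[sqrt(d_1), ..., sqrt(d_{n-2}), sqrt(d_{n-1} d_n)]:
    replace s_{n-1}, s_n by s_{n-1} * s_n mod q."""
    q, s = ideal
    return (q, s[:-2] + [s[-2] * s[-1] % q])

# Example with d = (2, 3) and q = 23: 5^2 = 2 and 7^2 = 3 (mod 23).
I = (23, [5, 7])
print(norm_drop_last(I))       # (23, [5])
print(norm_merge_last_two(I))  # (23, [12]); 12^2 = 144 = 6 = 2*3 (mod 23)
```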

6.2 Computing a Generator of \(\mathcal {I}\) from a Generator of \(\mathcal {I}^2\)

Assume now that we have a nonzero principal ideal \(\mathcal {I}\subseteq \mathcal {O}_L\), and a generator h for \(\mathcal {I}^2\). To find a generator \(g\) for \(\mathcal {I}\), it is sufficient to find a square generator for \(\mathcal {I}^2\) and take its square root. To this end we seek a unit \(u \in \mathcal {O}_{L}^\times \) such that \(uh = g^2\) for some g. Applying the map X from Sect. 4.2 to this equation, we obtain

$$\begin{aligned} X(uh) = X(g^2) = 2X(g) = 0. \end{aligned}$$

Therefore \(X(u) = X(h)\), since \(X(uh)=X(u)+X(h)\) and X takes values modulo 2.

We start by computing X(h) from h. We then compute a basis \(u_1,\ldots ,u_{N-1}\) for \(\mathcal {O}_{L}^\times \), and we define \(u_0=-1\), so \(u_0,u_1,\dots ,u_{N-1}\) generate \(\mathcal {O}_{L}^\times \). We then solve the matrix equation

$$\begin{aligned}{}[e_0, e_1, \ldots , e_{N-1}] \begin{bmatrix} X(u_0)\\ X(u_1)\\ \vdots \\ X(u_{N-1})\\ \end{bmatrix} = X(h) \end{aligned}$$

for \([e_0, e_1, \ldots , e_{N-1}] \in (\mathbb {Z}/2)^N\) and set \(u = \prod _j u_j^{e_j}\). Then uh is (almost certainly) a square, so its square root g is a generator of \(\mathcal {I}\). This algorithm is summarized in Algorithm 6.1.

[Algorithm 6.1: IdealSqrt]

The subroutine SolveLeft(M, V) solves the matrix equation \(eM = V\) for the vector e. One can save time by precomputing the inverse of an invertible square submatrix of M, and using only the corresponding characters.
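For illustration, the following Python sketch is a minimal stand-in for such a mod-2 solver (Gaussian elimination over \(\mathbb {F}_2\)); it shows only the linear algebra and is not the paper's implementation:

```python
# Sketch: solve e * M = V over GF(2) by Gaussian elimination.
# Rows of M play the role of the character vectors X(u_0), ..., X(u_{N-1});
# V plays the role of X(h).  Stand-in for SolveLeft, for illustration only.

def solve_left_mod2(M, V):
    k, m = len(M), len(V)
    # Column i of M gives the equation sum_j e_j * M[j][i] = V[i] (mod 2).
    rows = [[M[j][i] for j in range(k)] + [V[i]] for i in range(m)]
    pivots, r = [], 0
    for c in range(k):
        p = next((i for i in range(r, m) if rows[i][c]), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(m):
            if i != r and rows[i][c]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    if any(row[-1] for row in rows[r:]):
        return None                       # inconsistent: no exponent vector exists
    e = [0] * k
    for i, c in enumerate(pivots):
        e[c] = rows[i][-1]
    return e

M = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]     # toy character matrix
V = [1, 1, 0]
print(solve_left_mod2(M, V))              # a vector e with e*M = V over GF(2)
```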

Note that for this computation to work we need a basis of the full unit group. If we instead use units \(v_1,\dots ,v_{N-1}\) generating, e.g., the group \(U=(\mathcal {O}_{L}^\times )^2\), and if \(h=vg^2\) for some \(v\in \mathcal {O}_{L}^\times -U\), then uh cannot be a square for any \(u\in U\): if it were then h would be a square (since every \(u\in U\) is a square), so v would be a square, so v would be in U, contradiction.

There are several steps in this algorithm beyond the unit-group precomputation. Characters for \(u_0,\ldots ,u_{N-1}\) take time essentially \(N^3+N^2 B\) and can also be precomputed. Characters for h take time essentially \(N^2+NB\). Linear algebra mod 2 takes time essentially \(N^3\), or better with fast matrix multiplication; most of this can be precomputed, leaving time essentially \(N^2\) to multiply a precomputed inverse by X(h). The product of powers takes time essentially \(N^2B\), and the square root takes time essentially \(N^{1+\log _2 3}B\), although these bounds are too pessimistic for the reasons mentioned in Sect. 5.4.

6.3 Shortening

Algorithm 6.2, \(\mathrm{ShortenGen}\), finds a bounded-size generator g of a nonzero principal ideal \(\mathcal {I}\subseteq \mathcal {O}_L\), given any generator h of \(\mathcal {I}\). See Sect. 8 for analysis of the success probability of this algorithm at finding the short generators used in a cryptosystem.

Recall the log-unit lattice \({{\mathrm{Log}}}(\mathcal {O}_{L}^\times )\) defined in Sect. 5.3. The algorithm finds a lattice point \(\text {Log }u\) close to \(\text {Log }h\), and then computes \(g=h/u\).

In more detail, the algorithm works as follows. Start with a basis \(u_1,\dots ,u_{N-1}\) for \(\mathcal {O}_{L}^\times \). Compute \(\text {Log }h\), and write \(\text {Log }h\) as a linear combination of the vectors \({{\mathrm{Log}}}(u_1),\dots ,{{\mathrm{Log}}}(u_{N-1}),(1,1,\dots ,1)\); recall that \((1,1,\dots ,1)\) is orthogonal to each \({{\mathrm{Log}}}(u_j)\). Round the coefficients in this combination to integers \((e_1,\dots ,e_N)\). Compute \(u=u_1^{e_1}\cdots u_{N-1}^{e_{N-1}}\) and \(g=h/u\).

The point here is that \(\text {Log }h\) is close to \(e_1{{\mathrm{Log}}}(u_1)+\cdots +e_{N-1}{{\mathrm{Log}}}(u_{N-1})+e_N(1,1,\dots ,1)\), and thus to \(\text {Log }u + e_N(1,1,\dots ,1)\). The gap \(\text {Log }g=\text {Log }h-\text {Log }u\) is between \(-0.5\) and 0.5 in each of the \({{\mathrm{Log}}}(u_j)\) directions, plus some irrelevant amount in the \((1,1,\dots ,1)\) direction.
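The rounding step can be sketched in a few lines of Python/numpy (an illustration only; the basis vectors are passed in as plain arrays, and the toy check reuses the \(\mathbb {Q}(\sqrt{2},\sqrt{3})\) unit logarithms from Sect. 8.1):

```python
# Sketch: recover exponents by rounding the coordinates of Log(h) on the basis
# Log(u_1), ..., Log(u_{N-1}), (1, ..., 1).  Illustration only.
import numpy as np

def shorten_exponents(log_h, log_units):
    """log_units: rows Log(u_1), ..., Log(u_{N-1}); log_h: Log(h)."""
    N = log_h.shape[0]
    B = np.vstack([log_units, np.ones(N)])   # basis vectors as rows
    coeffs = np.linalg.solve(B.T, log_h)     # write Log(h) on this basis
    e = np.rint(coeffs[:-1]).astype(int)     # round; drop the (1,...,1)-coefficient
    return e                                 # then g = h / prod_j u_j**e[j]

# Toy check in Q(sqrt(2), sqrt(3)) with the MQ unit basis:
x, y, z = np.log(1 + 2**0.5), np.log(2 + 3**0.5), np.log(5 + 2 * 6**0.5)
units = np.array([[x, -x, x, -x], [y, y, -y, -y], [z, -z, -z, z]])
log_h = 1 * units[0] - 2 * units[1] + np.array([0.1, -0.2, 0.3, -0.1])
print(shorten_exponents(log_h, units))       # -> [ 1 -2  0]
```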

Normally the goal is to find a generator that is known in advance to be short. If the logarithm of this target generator is between \(-0.5\) and 0.5 in each of the \({{\mathrm{Log}}}(u_j)\) directions then this algorithm will find this generator (modulo \(\pm 1\)). See Sect. 8 for further analysis of this event.

Approximations. The algorithm actually computes \(\text {Log }h\) only approximately, and uses \(\text {ApproxLog }u_j\) instead of \(\text {Log }u_j\), at the expense of marginally adjusting the 0.5 bounds mentioned above.

Assume that h has integer coefficients with at most B bits. (We discard the denominator in any case: it affects only the irrelevant coefficient of \((1,1,\dots ,1)\).) Then \(|\sigma _j(h)|\le 2^B\prod _i (1+\sqrt{|d_i|})\), so \(\ln |\sigma _j(h)|\le B\ln 2+\sum _i \ln (1+\sqrt{|d_i|})\). By assumption each \(d_i\) is quasipolynomial in N, so \(\ln |\sigma _j(h)|\le B\ln 2+n^{O(1)}\).

To put a lower bound on \(\ln |\sigma _j(h)|\), consider the product of the other conjugates of h. Each coefficient of this product is between \(-2^C\) and \(2^C\) where C is bounded by essentially NB. Dividing this product by the absolute norm of h, a nonzero integer, again produces coefficients between \(-2^C\) and \(2^C\), but also produces exactly \(1/\sigma _j(h)\). Hence \(\ln |1/\sigma _j(h)|\le C\ln 2+n^{O(1)}\).

In short, \(\ln |\sigma _j(h)|\) is between essentially \(-NB\) and B, so an approximation to \(\ln |\sigma _j(h)|\) within \(2^{-\beta }\) uses roughly \(\beta +\log (NB)\) bits. We use interval arithmetic with increasing precision to ensure that we are computing \(\text {Log }h\) accurately; the worst-case precision is essentially NB. Presumably it would save time here to augment our representation of ideal generators to include approximate logarithms, the same way that we augment our representation of units, but we have not implemented this yet.

Other Reduction Approaches. Finding a lattice point close to a vector, with a promised bound on the distance, is called the Bounded-Distance Decoding Problem (BDD). There are many BDD algorithms in the literature more sophisticated than simple rounding: for example, Babai’s nearest-plane algorithm [4]. See generally [25].

Our experiments show that, unsurprisingly, failures in rounding are triggered most frequently by the shortest vectors in our lattice bases. One cheap way to eliminate these failures is to enumerate small combinations of the shortest vectors.

[Algorithm 6.2: ShortenGen]

6.4 Finding Generators of Ideals for Quadratics

We now have all the ingredients for the attack algorithm. The algorithm works recursively; in this subsection we treat the base case, namely real quadratic fields.

[Algorithm 6.3: QPIP]

Recall from Sect. 5.1 that there are standard algorithms to compute the normalized fundamental unit \(\varepsilon _{d}\) of a real quadratic field \(\mathbb {Q}(\sqrt{d})\) in time essentially \(R=\ln (\varepsilon _{d})\), which is quasipolynomial under our assumptions. There is, similarly, a standard algorithm to compute a generator of a principal ideal of \(\mathcal {O}_{\mathbb {Q}(\sqrt{d})}\) in time essentially \(R+B\), where B is the number of bits in the coefficients used in the ideal. We call this algorithm FindQGen.

There are also algorithms that replace R by something subexponential in d; see [8, 14, 41]. As in Sect. 5.1, these algorithms avoid large coefficients by working with products of powers of smaller field elements, raising other performance questions in our context.

Algorithm 6.3, QPIP, first calls FindQGen to find a generator h, and then calls ShortenGen from Sect. 6.3 to find a short generator g. For quadratics this is guaranteed to find a generator with a minimum-size logarithm, up to the limits of the approximations used in computing logarithms.

6.5 Finding Generators of Ideals for Multiquadratics

Algorithm 6.4 recursively finds generators of principal ideals of orders in real multiquadratic fields. The algorithm works as follows.

Assume, as usual, that \(d_{1},\dots ,d_{n}\) are positive integers meeting the conditions of Theorem 2.1. Let \(L\) be the real multiquadratic field \(\mathbb {Q}(\sqrt{d_{1}},\ldots ,\sqrt{d_{n}})\) of degree \(N=2^n\). Let \(\mathcal {I}\) be a principal ideal of an order inside \(L\), for which we want to find a generator.

If \(N=1\) then there is no difficulty. If \(N = 2\), we find the generator with the \(\mathrm{QPIP}\) routine of the previous section. Assume from now on that \(N\ge 4\).

As in Sect. 5.5, choose distinct non-identity automorphisms \(\sigma , \tau \) of \(L\), and let \(K_{\sigma }, K_{\tau }, K_{\sigma \tau }\) be the fields fixed by \(\sigma ,\tau ,\sigma \tau \) respectively. These are fields of degree N/2.

For each \(\ell \in \{\sigma ,\tau ,\sigma \tau \}\), compute \(\mathcal {I}_\ell = N_{L:K_{\ell }}(\mathcal {I})\) as explained in Sect. 6.1, and call MQPIP\((K_{\ell },\mathcal {I}_\ell )\) recursively to compute a generator \(g_\ell \) for each \(\mathcal {I}_\ell \mathcal {O}_{K_{\ell }}\). Notice that if \(g\) is a generator of \(\mathcal {I}\mathcal {O}_L\), then \(g\ell (g)\) generates \(\mathcal {I}_\ell \mathcal {O}_{K_{\ell }}\), so \(g_\ell = u_\ell g\ell (g)\) for some \(u_\ell \in \mathcal {O}_{K_{\ell }}^\times \). Therefore

$$\begin{aligned} \frac{g_\sigma g_\tau }{\sigma (g_{\sigma \tau })} = \frac{u_\sigma g\sigma (g) u_\tau g\tau (g)}{\sigma (u_{\sigma \tau } g\sigma \tau (g))} = g^2 u_\sigma u_\tau \sigma (u_{\sigma \tau }^{-1}), \end{aligned}$$

so that \(h = g_\sigma g_\tau / \sigma (g_{\sigma \tau })\) is a generator of \(\mathcal {I}^2\mathcal {O}_L\). Now use IdealSqrt to find a generator of \(\mathcal {I}\mathcal {O}_L\), and ShortenGen to find a bounded-size generator.
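As a toy check of this identity, assume for simplicity that all \(u_\ell =1\), take \(L=\mathbb {Q}(\sqrt{2},\sqrt{3})\) with \(g=1+\sqrt{2}+\sqrt{3}\), and let \(\sigma \) negate \(\sqrt{3}\) while \(\tau \) negates \(\sqrt{2}\). The following sympy snippet (ours, not part of the paper's software) verifies that \(g_\sigma g_\tau /\sigma (g_{\sigma \tau })\) is then exactly \(g^2\):

```python
# Worked check of h = g_sigma * g_tau / sigma(g_{sigma tau}) = g^2 (times a unit)
# in Q(sqrt(2), sqrt(3)), with g = 1 + sqrt(2) + sqrt(3), assuming the recursive
# calls return exactly g * ell(g) (all u_ell = 1).  Exact arithmetic via sympy;
# illustration only.
from sympy import sqrt, expand

g = 1 + sqrt(2) + sqrt(3)
g_sigma    = expand(g * (1 + sqrt(2) - sqrt(3)))   # g*sigma(g)       = 2*sqrt(2)
g_tau      = expand(g * (1 - sqrt(2) + sqrt(3)))   # g*tau(g)         = 2 + 2*sqrt(3)
g_sigmatau = expand(g * (1 - sqrt(2) - sqrt(3)))   # g*(sigma tau)(g) = -4 - 2*sqrt(6)

# sigma fixes sqrt(2) and negates sqrt(3), hence negates sqrt(6):
sigma_of_g_sigmatau = g_sigmatau.subs(sqrt(6), -sqrt(6))

h = g_sigma * g_tau / sigma_of_g_sigmatau
print((h - g**2).equals(0))   # True: h is exactly g^2 when all u_ell are 1
```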

[Algorithm 6.4: MQPIP]

Table 6.1 summarizes the scalability of the subroutines inside MQPIP. Many of the costs are in precomputations that we share across many ideals \(\mathcal {I}\), and these costs involve larger powers of N than the per-ideal costs. On the other hand, the per-ideal costs can dominate when the ideals have enough bits B per coefficient.

Table 6.1. Complexities of subroutines at the top and bottom levels of recursion of MQPIP. Logarithmic factors are suppressed. B is assumed to be at least as large as regulators. “UGS” means UnitsGivenSubgroup; “IS” means IdealSqrt; “SG” means ShortenGen. “Precomp” means that the results of the computation can be reused for many inputs \(\mathcal {I}\).

7 Timings

This section reports experiments on the timings of our software for our algorithms: specifically, the number of seconds used for various operations in the Sage [23] computer-algebra system on a single core of a 4 GHz AMD FX-8350 CPU.

7.1 Basic Subroutine Timings

Table 7.1 shows the time taken for multiplication, squaring, etc., rounded to the nearest 0.0001 s: e.g., 0.0627 s to multiply two elements of a degree-256 multiquadratic ring, each element having random 1000-bit coefficients. The table is consistent with the analysis earlier in the paper: e.g., doubling the degree approximately doubles the cost of multiplication, and approximately triples the cost of square roots.

We have, for comparison, also explored the performance of multiquadratics using Sage’s tower-field functions, Sage’s absolute-number-field functions (using the polynomial F defined in Appendix A), and Sage’s ring constructors. The underlying polynomial-arithmetic code inside Sage is written in C, avoiding Python overhead, but suffers from poor algorithm scalability. Sage’s construction of degree-2 relative extensions (in towers of number fields or in towers of rings) uses Karatsuba arithmetic, losing a factor of 3 for each extension, with no obvious way to enable FFTs. Working with one variable modulo F produces good scalability for multiplication but makes norms difficult. Division is very slow in any case: for example, it takes 0.2 s, 2.8 s, and 93 s in degrees 32, 64, and 128 respectively using the tower-field representation, and it takes 0.15 s, 1.16 s, and 11.3 s in degrees 32, 64, and 128 respectively using the single-variable representation, while we use under 0.06 s in degree 256.

Table 7.1. Observed time for basic operations in \(\mathbb {Z}[\sqrt{d_{1}},\ldots ,\sqrt{d_{n}}]\), with \(d_{1}=2\), \(d_{2}=3\), \(d_{3}=5\), etc., and \(\lambda =64\). The “mult” column is the time to compute \(h=fg\) where f and g each have coefficients chosen randomly between \(-2^{1000}\) and \(2^{1000}-1\). The “square” column is the time to compute \(f^2\). The “relnorm” column is the time to compute \(f\sigma (f)\) where \(\sigma \) is any of the automorphisms in Theorem 2.1. The “absnorm” column is the time to compute \(N_{\mathbb {Q}(\sqrt{d_{1}},\ldots ,\sqrt{d_{n}}):\mathbb {Q}}f\). The “div” column is the time to divide \(h=fg\) by g, recovering f. The “sqrt” column is the time to recover \(\pm f\) from \(f^2\). Each timing is the median of 21 measurements.
Table 7.2. Observed time to compute (once) the unit group of \(\mathbb {Z}[\sqrt{d_{1}},\ldots ,\sqrt{d_{n}}]\), with \(d_{1}=2\), \(d_{2}=3\), \(d_{3}=5\), etc.; and to break the cryptosystem presented in Appendix A. The “tower” column is the time used by Sage’s tower-field unit-group functions (with proof=False); for \(n=6\) these functions ran out of memory after approximately 710000 s. The “absolute” column is the time used by Sage’s absolute-field unit-group functions (also with proof=False), starting from the polynomial F defined in Appendix A. The “new” column is the time used by this paper’s unit-group algorithm. The “attack” column is the time to find a generator of the public key, after the unit group is precomputed. In “new2” and “attack2” the same timings are given for the field with the first n consecutive primes after n. In “new3” and “attack3” the same timings are given for the field with the first n consecutive primes after \(n^2\).

7.2 Timings to Compute the Unit Group and Generators

The difference in scalability is much more striking for unit-group computation, as shown in Table 7.2. Our algorithm uses 2.34 s for degree 16, 7.80 s for degree 32, 26.62 s for degree 64, 146.60 s for degree 128, etc., slowing down by a factor considerably below \(2^5\) for each doubling in the degree. Sage’s internal C library uses 4.8 s for degree 32, but we did not see it successfully compute a unit group for degree 64.

Table 7.2 also shows that our short-generator algorithm has similar scaling to our unit-group algorithm, as one would expect from the structure of the algorithms. As inputs we used public keys from a Gentry-style multiquadratic cryptosystem;Footnote 3 see Appendix A. The number of bits per coefficient in this cryptosystem grows almost linearly with \(2^n\), illustrating another dimension of scalability of our algorithm. See Sect. 8 for analysis of the success probability of the algorithm as an attack against the cryptosystem.

8 Key-Recovery Probabilities

In this section we analyze the success probability of our algorithm recovering the secret key g in a Gentry-style multiquadratic cryptosystem.

The specific system that we target is the system defined in Appendix A (in the full version of this paper), the same system used for timings in Sect. 7.2. The secret key g in this cryptosystem is \(g_0+g_1\sqrt{d_{1}}+g_2\sqrt{d_{2}}+g_3\sqrt{d_{1}}\sqrt{d_{2}}+\cdots +g_{N-1}\sqrt{d_{1}}\cdots \sqrt{d_{n}}\), where \(g_0,g_1,g_2,\ldots \) are independent random integers chosen from intervals \([-G,G], [-G/\sqrt{d_{1}},G/\sqrt{d_{1}}], [-G/\sqrt{d_{2}},G/\sqrt{d_{2}}], \ldots \). The distribution within each interval is uniform, except for various arithmetic requirements (e.g., g must have odd norm) that do not appear to have any impact on the performance of our attack.
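For illustration only, the shape of this distribution can be sketched as follows; the helper below is our own simplification, omits the arithmetic requirements, and is not the Appendix A key generator:

```python
# Sketch of the secret-key shape: the integer coefficient of sqrt(D) is drawn
# from [-G/sqrt(D), G/sqrt(D)], so each term contributes at most G to every
# conjugate.  This omits the arithmetic requirements (e.g., odd norm) and is
# only an illustration of the distribution, not the Appendix A key generator.
import itertools, math, random

def sample_secret(ds, G):
    """ds = [d_1, ..., d_n]; returns coefficients indexed by subsets of ds."""
    coeffs = {}
    for r in range(len(ds) + 1):
        for subset in itertools.combinations(range(len(ds)), r):
            D = math.prod(ds[i] for i in subset) if subset else 1
            bound = int(G / math.sqrt(D))
            coeffs[subset] = random.randint(-bound, bound)
    return coeffs

print(sample_secret([2, 3, 5], 2**10))
```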

Section 8.1 presents heuristics for the expected size of \(\text {Log }g\) on the basis \(\text {Log }\varepsilon _{1},\dots ,\text {Log }\varepsilon _{N-1}\) of the lattice of logarithms of multiquadratic units, a sublattice of the log-unit lattice. Section 8.2 presents experimental data confirming these heuristics. Section 8.3 presents experimental data regarding the size of \(\text {Log }g\) on the basis that we compute for the full log-unit lattice. Section 8.4 presents an easier-to-analyze way to find g when \(\text {Log }\varepsilon _{1},\dots ,\text {Log }\varepsilon _{N-1}\) are large enough.

8.1 MQ Unit Lattice: Heuristics for \(\text {Log }g\)

Write \(U_L\) for the group of multiquadratic units in L. Recall that \(U_L\) is defined as the group \(\left\langle -1,\varepsilon _{1},\ldots ,\varepsilon _{N-1}\right\rangle \), where \(\varepsilon _{1},\dots ,\varepsilon _{N-1}\) are the normalized fundamental units of the \(N-1\) quadratic subfields \(\mathbb {Q}(\sqrt{D_1}),\dots ,\mathbb {Q}(\sqrt{D_{N-1}})\).

The logarithms \(\text {Log }\varepsilon _{1},\dots ,\text {Log }\varepsilon _{N-1}\) form a basis for the MQ unit lattice \(\text {Log }U_L\). This is an orthogonal basis: for example, for \(\mathbb {Q}(\sqrt{2},\sqrt{3})\), the basis vectors are \((x,-x,x,-x)\), \((y,y,-y,-y)\), and \((z,-z,-z,z)\) with \(x=\log (1+\sqrt{2})\), \(y=\log (2+\sqrt{3})\), and \(z=\log (5+2 \sqrt{6})\). The general pattern (as in Sect. 5.3) is that \(\text {Log }\varepsilon _{j}\) is a vector with \(R_j=\ln \varepsilon _{j}\) at N/2 positions and \(-R_j\) at the other N/2 positions, specifically with \(R_j\) at position i if and only if \(\sigma _i(\varepsilon _{j})=\varepsilon _{j}\).
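A quick numerical check of this orthogonality for the \(\mathbb {Q}(\sqrt{2},\sqrt{3})\) example (illustration only):

```python
# Check that the Q(sqrt(2), sqrt(3)) basis above is orthogonal and that
# |Log eps_j|^2 = N * R_j^2 with N = 4.  Illustration only.
import numpy as np

x, y, z = np.log(1 + np.sqrt(2)), np.log(2 + np.sqrt(3)), np.log(5 + 2 * np.sqrt(6))
B = np.array([[x, -x,  x, -x],
              [y,  y, -y, -y],
              [z, -z, -z,  z]])
print(np.round(B @ B.T, 12))   # diagonal matrix with entries 4*x^2, 4*y^2, 4*z^2
```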

One consequence of orthogonality is that rounding on this basis is a perfect solution to the closest-vector problem for the MQ unit lattice. If 0 is the closest lattice point to \(\text {Log }g\), and u is any multiquadratic unit, then rounding \(\text {Log }gu\) produces \(\text {Log }u\). One can decode beyond the closest-vector problem by enumerating some combinations of basis vectors, preferably the shortest basis vectors, but for simplicity we skip this option.

Write \(c_j\) for the coefficient of \(\text {Log }g\) on the jth basis vector \(\text {Log }\varepsilon _{j}\); note that if each \(c_j\) is strictly between \(-0.5\) and 0.5 then 0 is the closest lattice point to \(\text {Log }g\). Another consequence of orthogonality is that \(c_j\) is simply the dot product of \(\text {Log }g\) with \(\text {Log }\varepsilon _{j}\) divided by the squared length of \(\text {Log }\varepsilon _{j}\); i.e., the dot product of \(\text {Log }g\) with a pattern of N/2 copies of \(R_j\) and N/2 copies of \(-R_j\), divided by \(NR_j^2\); i.e., \(Y/(NR_j)\), where Y is the dot product of \(\text {Log }g\) with a pattern of N/2 copies of 1 and N/2 copies of \(-1\).

We heuristically model \(g_0\) as a uniform random real number from the interval \([-G,G]\); \(g_1\) as a uniform random real number from \([-G/\sqrt{d_{1}},G/\sqrt{d_{1}}]\); etc. In this model, each conjugate \(\sigma _i(g)\) is a sum of N independent uniform random real numbers from \([-G,G]\). For large N, the distribution of this sum is close to a Gaussian distribution with mean 0 and variance \(G^2N/3\); i.e., the distribution of \((G\sqrt{N/3})\mathcal {N}\), where \(\mathcal {N}\) is a normally distributed random variable with mean 0 and variance 1. The distribution of \(\ln |\sigma _i(g)|\) is thus close to the distribution of \(\ln (G\sqrt{N/3})+\ln |\mathcal {N}|\).

Recall that \({{\mathrm{Log}}}(g)\) is the vector of \(\ln |\sigma _i(g)|\) over all i, so Y is \(\ln |\sigma _1(g)|-\ln |\sigma _2(g)|+\cdots \) modulo an irrelevant permutation of indices. The mean of \(\ln |\sigma _1(g)|\) is close to the mean of \(\ln (G\sqrt{N/3})+\ln |\mathcal {N}|\), while the mean of \(-\ln |\sigma _2(g)|\) is close to the mean of \(-\ln (G\sqrt{N/3})-\ln |\mathcal {N}|\), etc., so the mean of Y is close to 0. (For comparison, the mean of the sum of entries of \({{\mathrm{Log}}}(g)\) is close to \(N\ln (G\sqrt{N/3})+Nc\). Here c is a universal constant, the average of \(\ln |\mathcal {N}|\).)

To analyze the variance of Y, we heuristically model \(\sigma _1(g),\dots ,\sigma _N(g)\) as independent. Then the variance of Y is the variance of \(\ln |\sigma _1(g)|\) plus the variance of \(-\ln |\sigma _2(g)|\) etc. Each term is close to the variance of \(\ln |\mathcal {N}|\), a universal constant V, so the variance of Y is close to VN. The deviation of Y is thus close to \(\sqrt{VN}\), and the deviation of \(c_j=Y/(NR_j)\) is close to \(\sqrt{V}/(\sqrt{N}R_j)\approx 1.11072/(\sqrt{N}R_j)\).

To summarize, this model predicts that the coefficient of \(\text {Log }g\) on the jth basis vector \(\text {Log }\varepsilon _{j}\) has average approximately 0 and deviation approximately \(1.11072/(\sqrt{N}R_j)\), where \(R_j=\ln \varepsilon _{j}\). Recall that \(R_j\) typically grows as \(D_j^{1/2+o(1)}\).
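The constant 1.11072 is the deviation \(\sqrt{V}\) of \(\ln |\mathcal {N}|\), which numerically matches \(\sqrt{\pi ^2/8}\). A quick Monte Carlo check of the prediction (our own illustration; the degree and regulator at the end are made-up example values):

```python
# Monte Carlo check of the constant used above: the standard deviation of
# ln|N| for a standard normal N is about 1.11072 (numerically sqrt(pi^2/8)),
# so the j-th coefficient of Log g is predicted to have deviation about
# 1.11072 / (sqrt(N) * R_j).  Illustration only.
import numpy as np

samples = np.log(np.abs(np.random.standard_normal(10**7)))
print(samples.std())                       # ~1.1107
print(np.sqrt(np.pi**2 / 8))               # 1.1107...

N, R_j = 64, 25.0                          # made-up example degree and regulator
print(samples.std() / (np.sqrt(N) * R_j))  # predicted deviation of c_j
```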

Fig. 8.1. Blue dots: for \(n=4,5,6\), the observed average absolute coefficient of \({{\mathrm{Log}}}(g)\) in the direction of the basis vector corresponding to \(\mathbb {Q}(\sqrt{D})\). Yellow dots: predicted values. (Color figure online)

8.2 MQ Unit Lattice: Experiments for \(\text {Log }g\)

The experiments in Fig. 8.1 confirm the prediction of Sect. 8.1. For each n, we took all choices of n consecutive primes \(d_1,\dots ,d_n\) below 100. For each corresponding multiquadratic field, there are \(N-1\) blue dots. For each D in \(\{d_1,d_2,d_1d_2,\dots ,d_1d_2\cdots d_n\}\), one of these \(N-1\) dots is at horizontal position D. The vertical position is the observed average absolute coefficient of \(\text {Log }g\) in the direction of the basis vector corresponding to D, where g ranges over 1000 secret keys for the \(\mathbb {Q}(\sqrt{d_1},\dots ,\sqrt{d_n})\) cryptosystem. There is also a yellow dot at the same horizontal position and at vertical position \(1.11\sqrt{2/\pi }/(\sqrt{N}\cdot \ln {\varepsilon _{D}})\); here \(\sqrt{2/\pi }\) accounts for the average of \(|\mathcal {N}|\).

In all experiments the blue dots (observations) closely match the yellow dots (predictions). The match is even clearer after rescaling the horizontal axis from D to \(1.11\sqrt{2/\pi }/(\sqrt{N}\cdot \ln {\varepsilon _{D}})\), where \(\varepsilon _{D}\) is again the normalized fundamental unit of \(\mathbb {Q}(\sqrt{D})\). This rescaling of the blue dots is shown in Fig. 8.2, together with the line \(x=y\) in purple.

Fig. 8.2. Rescaling of the experiments of Fig. 8.1, also including \(n=3\).

Fig. 8.3. Curves: \(n=2,3,4,5,6,7,8\). Horizontal axis: \(d_1\), specifying n consecutive primes \(d_1,\dots ,d_n\). Vertical axis: observed probability, for 1000 randomly drawn secret keys g in the cryptosystem, that \(\text {Log }g\) is successfully rounded to 0 in the MQ unit lattice.

After exploring these geometric aspects of the MQ unit lattice, we ran experiments on the success probability of rounding in the lattice. Figure 8.3 shows how often \({{\mathrm{Log}}}(g)\) is rounded to 0 (by simple rounding without enumeration) in our basis for the MQ unit lattice.

This graph shows a significant probability of failure if \(d_1\) and n are both small. Fields that contain the particularly short unit \((1+\sqrt{5})/2\) seem to be the worst case, as one would expect from our heuristics. However, even in this case, failures disappear as n increases. The success probability seems to be uniformly bounded away from 0; it appears to be above 90% for all fields with \(d_1\ge 7\) and \(n\ge 4\), and above 90% for all fields with \(n\ge 7\).

8.3 Full Unit Lattice: Experiments for \(\text {Log }g\)

Analyzing the full unit lattice is difficult, so we proceed directly to experiments. We first compare the MQ unit lattice basis numerically to the full unit lattice basis; the results are shown in Table 8.1. The index of \({{\mathrm{Log}}}(U_L)\) in \({{\mathrm{Log}}}(\mathcal {O}_{L}^\times )\) seems to grow as roughly \(N^{0.3N}\).
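The index of a full-rank sublattice equals the ratio of the lattice volumes, i.e., the square root of the ratio of Gram determinants; the following numpy sketch (with made-up toy bases, not the experimental data) illustrates one way such an index can be computed:

```python
# Sketch: compute the index of a full-rank sublattice from lattice volumes,
# vol = sqrt(det(B B^T)) for a basis B given as rows.  Toy bases only;
# in the experiments the bases are the ones described in the table caption.
import numpy as np

def lattice_volume(B):
    return np.sqrt(np.linalg.det(B @ B.T))

def index(B_sub, B_full):
    return lattice_volume(B_sub) / lattice_volume(B_full)

B_full = np.array([[1., -1., 0., 0.], [0., 1., -1., 0.], [0., 0., 1., -1.]])
B_sub  = np.array([[2., -2., 0., 0.], [0., 1., -1., 0.], [0., 3., 0., -3.]])
print(index(B_sub, B_full))   # approximately 6 for this toy pair
```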

Table 8.1. Experimental comparison of the MQ unit lattice \({{\mathrm{Log}}}(U_L)\), with basis formed by logarithms of the fundamental units of the quadratic subfields, and the full unit lattice \({{\mathrm{Log}}}(\mathcal {O}_{L}^\times )\), with basis produced by Algorithm 5.2. For each dimension \(2^n\), \(U_L\) and \(\mathcal {O}_{L}^\times \) were computed for 1130 random multiquadratic fields \(L= \mathbb {Q}(\sqrt{d_{1}}, \dots , \sqrt{d_{n}})\) (except for \(n=8\), where we used the first 832 fields for which the computation finished), with \(d_{i}\) primes bounded by \(2n^2\). The first row shows the average over these fields of \(\log _2||u^*||\), where \(||u^*||\) is the length of the smallest Gram–Schmidt vector of the basis for \(U_L\). The second row shows the same for \(\mathcal {O}_{L}^\times \). The third row shows the average of \(\log _2\) of the index of \({{\mathrm{Log}}}(U_L)\) in \({{\mathrm{Log}}}(\mathcal {O}_{L}^\times )\).
Table 8.2. Observed attack success probabilities for various multiquadratic fields. By definition \(L_j = \mathbb {Q}(\sqrt{d_{1}}, \dots , \sqrt{d_{n}})\), with \(d_{i}\) the first n consecutive primes larger than or equal to j; and \(p_\mathrm{suc}(L_{j})\) is the fraction of keys, out of at least 1000 trials (except 100 for \(n=8\), \(j=1\)), that were successfully recovered without any enumeration by our attack on field \(L_j\). The table covers \(j\in \{1,n,n^2\}\).

In Table 8.2 we see the total success probability of the attack, with public keys provided as inputs, and with each successful output verified to match the corresponding secret key times \(\pm 1\).

We see that as the size and the number of the primes grow, the success probability increases, as was the case for the MQ unit basis. Specifically for the first n primes after \(n^2\) the success probability seems to rapidly converge towards 1, as was mentioned in Sect. 1.

8.4 Full Unit Lattice: An Alternative Strategy

The following alternative method of computing g is easier to analyze asymptotically, because it does not require understanding the effectiveness of reduction in the full unit lattice. It does require \(d_1,\ldots ,d_n\) to be large enough compared to N, say larger than \(N^{1.03}\), and it will obviously fail for many smaller \(d_i\) where our experiments succeed, but it still covers a wide range of real multiquadratic number fields.

The point of requiring \(d_1,\ldots ,d_n\) to be larger than \(N^{1.03}\) is that, for sufficiently large N and most such choices of \(d_1,\dots ,d_n\), the n corresponding regulators \(\log \varepsilon _{d_{i}}\) are heuristically expected to be larger than \(N^{0.51}\), and the remaining regulators for \(d_1d_2\) etc. are heuristically expected to be even larger. The coefficients of \(\text {Log }g\) on the MQ unit basis are then predicted to have deviation at most \(1.11072/N^{1.01}\); see Sect. 8.1. We will return to this in a moment.

Compute, by our algorithm, some generator gu of the public key \(\mathcal {I}\). From Theorem 5.2 we know that \(u^N\) is an MQ unit. Compute \(N\text {Log }gu\) and round in the MQ unit lattice. The coefficients of \(N\text {Log }g\) on the MQ unit basis are predicted to have deviation at most \(1.11072/N^{0.01}\), so for sufficiently large N these coefficients have negligible probability of reaching 0.5 in absolute value. Rounding thus produces \({{\mathrm{Log}}}(u^N)\) with high probability, revealing \({{\mathrm{Log}}}(g^N)\) and thus \(\pm g^N\). Use a quadratic character to deduce \(g^N\), compute the square root \(\pm g^{N/2}\), use a quadratic character to deduce \(g^{N/2}\), and so on through \(\pm g\).

One can further extend the range of applicability of this strategy by finding a smaller exponent e such that \(u^e\) is always an MQ unit. Theorem 5.2 says N/2 for \(N\ge 2\). By computing the MQ units for a particular field one immediately sees the minimum value of e for that field; our computations suggest that N/2 is usually far from optimal.