1 Introduction

Starting around the turn of the century, a significant push, both theoretical and computational, arose to answer some of the most basic yet frustrating questions in the theory of packing problems (for background on packing problems, see [3, 6, 10]). Along with the proof and formal verification of the Kepler conjecture [12], a number of other results on sphere packing in higher dimensions have proved illuminating [4, 23, 24]. For packings by congruent anisotropic bodies, sharp results are limited mostly to the plane, where the best packings of all centrally-symmetric bodies are achieved by lattices [8], and a series of sparse results in higher dimensions [2, 5].

Among general convex bodies, the problem of finding the best packing of regular pentagons serves as a toy model for harder problems, like finding the best packing of regular tetrahedra. However, the pentagon problem is still not a tractable one. Explicit upper bounds for the packing density of regular tetrahedra and octahedra are better than the trivial unity upper bound by minuscule margins [9]. A semidefinite programming (SDP) approach has been suggested by de Oliveira and Vallentin to calculate improved upper bounds [20]. Though the SDP method has not yet yielded a nontrivial upper bound for packing of tetrahedra, it has been used to obtain an upper bound of 0.98103 on the density of regular pentagon packings. A new computer assisted proof gives the improved upper bound of 0.9611 [13]. There remains a sizable gap between the highest density achieved for pentagon packings and this upper bound.

In [17], Kuperberg and Kuperberg describe a recipe for finding the double lattice packing of congruent planar convex bodies with maximal density by solving an optimization problem over inscribed parallelograms. This optimization problem is usually tractable, and in the case of convex polygons can be solved by an algorithm with running time linear in the number of vertices [19]. As examples, Kuperberg and Kuperberg construct the densest double lattice packing for both regular pentagons and regular heptagons and show that these packings have densities of \((5-\sqrt{5})/3 = 0.92131\dots \) and \(0.8926\dots \) respectively. These are the current records and possibly the best general packings of the plane by regular pentagons and by regular heptagons.

A long-open problem, still wide open even in the plane, asks for the pessimal convex body for packing, that is, the shape that has the lowest maximum packing density [2, 3, 14]. In the class of centrally-symmetric bodies in the plane, it is Reinhardt [21] who conjectured that a smoothed octagon is the minimizer and this problem is still open. In the class of general convex bodies in the plane, it is conjectured to be the regular heptagon [15]. However, even though the maximum-density double lattice is conjectured to achieve the maximum packing density for the regular heptagon, no sharp upper bound has been proved.

The regular pentagon and heptagon are cases of special interest, and we initially sought out to investigate whether their optimal double-lattice packing can be shown to be also optimal among a broader class of packings. We were able to show that these packings are optimal at least in some neighborhood in the space of all packings. Furthermore, we discovered that our method can be generalized to all convex polygons. We demonstrate that, while double lattices are in general not globally optimal, they are always at least locally optimal.

Theorem

If a double lattice packing is an isolated local maximum for density among double lattices and is not one of two exceptional cases, then it is a local maximum for density among all packings.

The precise meanings of the terms used will be elucidated in the rest of the paper.

2 Theoretical Preparations

2.1 Local Optimality

We will look at packings of congruent copies of a body K. That is, every element of the packing is given by \(\xi (K)\), where \(\xi \in E(n)\) is an isometry of Euclidean space. It will be convenient to assume that the reference body K is situated so that its interior contains the origin. The isometry group E(n) of \(\mathbb {R}^n\) can be considered as a subgroup of \(GL_{n+1}(\mathbb {R})\) that stabilizes the plane \((x_1,\ldots ,x_n,1)\in \mathbb {R}^{n+1}\). This identification gives us the Frobenius norm \(N(\xi ) = \Vert \xi - \mathrm {Id}\Vert \) for \(\xi \in E(n)\). Some useful inequalities for this norm that we will use are

$$\begin{aligned} \begin{aligned} N(\xi ^{-1})&\le \Vert \xi ^{-1}\Vert N(\xi ),\\ N(\xi \xi ')&\le N(\xi )+N(\xi ')+N(\xi )N(\xi '),\\ N(\psi \xi \psi ^{-1})&\le \Vert \psi \Vert \Vert \psi ^{-1}\Vert N(\xi ),\\ \Vert \xi (0)\Vert \le N(\xi )&\le \Vert \xi (0)\Vert +2n^{1/2}\text{. } \end{aligned} \end{aligned}$$
(1)

Definition 2.1

Let \(\Xi \) be a set of isometries. The limit

$$\begin{aligned} d(\Xi )=\lim _{t\rightarrow \infty } \frac{\mathrm {vol}\,B(0,t)}{|\{\xi \in \Xi : \xi (0)\in B(0,t)\}|}\text{, } \end{aligned}$$
(2)

if it exists, is its mean volume. The limits superior and inferior are its upper and lower mean volumes, denoted \(\overline{d}(\Xi )\) and \(\underline{d}(\Xi )\). We say \(\Xi \) is a (rR)-set if the point set \(\{\xi (0):\xi \in \Xi \}\) has a packing radius at least r and a covering radius at most R.

Definition 2.2

Let K be a compact set with interior (hence, a body). We say that \(\Xi \) is admissible for K if the interiors of \(\xi (K)\) and \(\psi (K)\) are disjoint for any two distinct isometries \(\xi ,\psi \in \Xi \). We say furthermore that \(\Xi \) is saturated if there is no \(\psi \in E(n)\setminus \Xi \) such that \(\Xi \cup \{\psi \}\) is again admissible.

There are r(K) and R(K) such that when \(\Xi \) is admissible and saturated, then \(\Xi \) is a (r(K), R(K))-set. As a consequence, such sets are countable.

Definition 2.3

Given two sets \(\Xi \) and \(\Xi '\) of isometries, we define the premetric

(3)

The infimum is over all bijections \((\cdot )':\Xi \rightarrow \Xi '\).

When at least one set has covering radius no greater than R, \(\delta _R(\Xi ,\Xi ')=0\) if and only if \(\xi = \psi \xi '\) for some \(\psi \in E(n)\) and some bijection. Consider a body K. When \(R\ge R(K)\), \(\delta _R(\Xi ,\Xi ')\) induces a metric on the space of admissible (rR)-sets up to overall isometry, which includes the saturated sets as a subset.

Definition 2.4

We say an admissible and saturated set \(\Xi \) is strongly extreme for K if there is \(R>0\) and \(\varepsilon >0\), such that whenever \(\delta _R(\Xi ,\Xi ')<\varepsilon \), then either \(\Xi '\) is inadmissible or \(\underline{d}(\Xi ')\ge \overline{d}(\Xi )\).

We stop to discuss our choice of definitions. A packing of K is usually described as the collection \(\{\xi (K):\xi \in \Xi \}\) generated by an admissible set of isometries. We choose to work in the space of admissible sets instead of the space of packings. When the symmetry group of K is trivial, there is a one-to-one correspondence between the two spaces. Even when K has a nontrivial symmetry group (discrete or continuous), a strongly extreme admissible set generates a locally maximum-density packing in the induced topology. We henceforth implicitly identify admissible sets of isometries and the packings they generate.

We now discuss the choice of the topology we define in this space. A naive choice of topology is the one given by the metric

$$\begin{aligned} \delta _H(\Xi ,\Xi ') = \inf _{\text {bij.},\psi \in E(n)}\sup _{\xi \in \Xi } N(\xi ^{-1}\psi \xi ')\text{, } \end{aligned}$$
(4)

which we call the Hausdorff metric. However, under the Hausdorff topology the mean volume is locally constant, and any packing is trivially locally optimal. For the topology to allow the mean volume to vary locally, it must allow elements that are increasingly far apart to move by the action of increasingly different isometries. One metric satisfying this criterion, similar to the one used in [16], is

$$\begin{aligned} \inf _{\text {bij.},\psi \in E(n)} \sup _{\xi \in \Xi } \min \left( N(\xi ^{-1}\psi \xi '),1/\min (\Vert \xi (0)\Vert ,\Vert \xi '(0)\Vert )\right) \text{. } \end{aligned}$$
(5)

While suited for the discussions of recurrence there, here this metric will yield the result that any packing that is not as dense as the densest packing of K is not even locally optimal, since the density can be improved by a finite amount at arbitrarily small distance by completely changing the packing outside a ball of arbitrarily large radius. Under our definition, for a family of packings to have a reference packing as a limit, it must be the case that in every bounded region of space, there is a packing in the family that agrees with the reference packing to an arbitrarily small Hausdorff distance.

Importantly, there are deformations that intuitively feel continuous but are not continuous under our topology. One example is constructed by decorating a cylinder with a screw on its top base and a corresponding screw hole boring into its bottom base. Consider the packing where each cylinder is screwed into a cylinder above it, in such a way that the two cylinders are related to each other by a translation, and the screw is not completely screwed in. This creates a column of cylinders, parallel copies of which we arrange in a penny-packing pattern (see Fig. 1). Since every screw is not completely screwed in, the density of the packing can be increased by screwing each screw in further. Since the interlayer spacing is related to the relative rotation between cylinders on the two layers, even an arbitrarily small consistent decrease in interlayer spacing will cause some layers to be rotated by at least some finite angle. Because the orientation of the penny-packing grid remains unchanged, there is always a cylinder whose arrangement of neighbors, from the frame of reference in which it remains fixed, has changed by a finite extent. Therefore, this motion is not continuous in the topology we defined. As we prove some packing arrangements strongly extreme, it is worth keeping in mind what kinds of local improvement such a result rules out and what kinds are not ruled out.

Fig. 1
figure 1

Sample of the packing arrangement of cylinders with screws and screw holes described in the text. The arrangement can be condensed, but in a way that is not continuous in the topology given by \(\delta _R\)

Nevertheless, compared to some previously introduced notions of local optimality, our notion of strong extremality is both stronger and more widely applicable. The notions of an extreme lattice packing [18] and a periodic-extreme periodic packing [22] apply only to special classes of packings. We show that strong extremality, which applies more generally, implies extremality and periodic-extremality in these special classes.

Definition 2.5

A set of isometries \(\Lambda \) is called a (full rank) lattice if it is an (rR)-set for some \(r>0\) and \(R<\infty \), it consists only of translations, and it is closed under composition and inversion.

Definition 2.6

A lattice \(\Lambda \) is extreme for a body K if it is admissible for K and there exists \(\varepsilon >0\) such that for all \(T\in GL_n(\mathbb {R})\), either \(T[\Lambda ]=\{T[\lambda ]=T\lambda T^{-1}:\lambda \in \Lambda \}\) is inadmissible for K, \(\det T\ge 1\), or \(\Vert T-\mathrm {Id}\Vert >\varepsilon \).

Theorem 2.1

If a lattice \(\Lambda \) is strongly extreme for K, then \(\Lambda \) is extreme for K.

Proof

If \(\Lambda \) is not extreme for K, then for all \(\varepsilon >0\) there exists \(T\in GL_n(\mathbb {R})\) such that \(\Vert T-\mathrm {Id}\Vert \le \varepsilon \), \(\det T<1\), and \(T[\Lambda ]\) is admissible for K. We have \(N(\lambda _1^{-1}\lambda _2(T[\lambda _2])^{-1}T[\lambda _1]) = N((T-\mathrm {Id})[\lambda _1\lambda _2^{-1}]) \le \varepsilon \Vert \lambda _1(0)-\lambda _2(0)\Vert \). Thus, \(\delta _R(\Lambda ,T\Lambda )<2R\varepsilon \), and for arbitrarily small \(\varepsilon \), \(T\Lambda \) is an admissible packing of lower mean volume in an arbitrary neighborhood of \(\Lambda \). Therefore, \(\Lambda \) is not strongly extreme.

Definition 2.7

A set \(\Xi \subset E(n)\) is periodic if it is of the form \(\Lambda \Psi =\{\lambda \psi :\lambda \in \Lambda ,\psi \in \Psi \}\), where \(\Lambda \) is a lattice and \(\Psi \) is finite.

Definition 2.8

A periodic set \(\Xi =\Lambda \Psi \) is periodic-extreme for K if it is admissible for K and whenever \(\tilde{\Lambda }\subseteq \Lambda \) is a sublattice of \(\Lambda \) and \(\tilde{\Psi }\) is a set of \(|\Lambda /\tilde{\Lambda }|\) translations such that \(\Lambda = \tilde{\Lambda }\tilde{\Psi }\), there exists \(\varepsilon >0\) such that for all \(T\in GL_n(\mathbf {R})\) and \(\phi :\tilde{\Psi }\times \Psi \rightarrow E(n)\) we have either \(\{T[\tilde{\lambda }]\tilde{\psi }\psi \phi (\tilde{\psi },\psi ):\tilde{\lambda }\in \tilde{\Lambda },\tilde{\psi }\in \tilde{\Psi },\psi \in \Psi \}\) is inadmissible for K, \(\det T\ge 1\), \(\Vert T-\mathrm {Id}\Vert >\varepsilon \), or \(\Vert \phi (\tilde{\psi },\psi )-\mathrm {Id}\Vert >\varepsilon \) for some \(\tilde{\psi }\in \tilde{\Psi },\psi \in \Psi \).

Theorem 2.2

If a periodic set \(\Xi \) is strongly extreme for K, then it is periodic-extreme for K.

Proof

If \(\Xi \) is not periodic-extreme, then there exists \(\tilde{\Lambda }\subseteq \Lambda \) and \(\tilde{\Psi }\) as in Definition 2.8, such that for all \(\varepsilon >0\) there exists \(T\in GL_n(\mathbb {R})\) and \(\phi :\tilde{\Psi }\times \Psi \rightarrow E(n)\) such that \(\Xi '=\{T[\tilde{\lambda }]\tilde{\psi }\psi \phi (\tilde{\psi },\psi ):\tilde{\lambda }\in \tilde{\Lambda },\tilde{\psi }\in \tilde{\Psi },\psi \in \Psi \}\) is admissible for K, \(\det T < 1\), \(\Vert T-\mathrm {Id}\Vert \le \varepsilon \), and \(\Vert \phi (\tilde{\psi },\psi )-\mathrm {Id}\Vert \le \varepsilon \) for all \(\tilde{\psi }\in \tilde{\Psi },\psi \in \Psi \). Since \(\Psi \times \Psi '\) is finite, we have some bound \(\Vert \psi \tilde{\psi }\Vert ,\Vert \tilde{\psi }^{-1}\psi ^{-1}\Vert <M\). Consider the elements \(\xi _1=\tilde{\lambda }_1\tilde{\psi }_1\psi _1\), \(\xi _2=\tilde{\lambda }_2\tilde{\psi }_2\psi _2\), \(\xi _1,\xi _2\in \Xi \) and the corresponding elements \(\xi _1'=T[\tilde{\lambda }_1]\tilde{\psi }_1\psi _1\phi (\tilde{\psi }_1,\psi _1)\) \(\xi _2'=T[\tilde{\lambda }_2]\tilde{\psi }_2\psi _2\phi (\tilde{\psi }_2,\psi _2)\), \(\xi _1',\xi _2'\in \Xi '\). Using the inequalities (1), it is fairly straightforward to see that \(\Vert \xi _1^{-1}\xi _2(\xi _2')^{-1}\xi _1'\Vert \le C\varepsilon \), where C depends on M and R. Therefore \(\delta _R(\Xi ,\Xi ')\) can be made arbitrarily small, \(\Xi '\) is admissible, and \(d(\Xi ') = \det T d(\Xi )<d(\Xi )\), so \(\Xi \) is not strongly extreme. \(\square \)

We now derive a general method for proving strong extremality that we will use in the following sections.

Definition 2.9

Let \(\Xi \) be an (rR)-set of isometries. Let \(\mathcal T\) be a simplicial complex whose underlying space is \(\overline{\mathcal T}=\mathbb {R}^n\), whose vertices are in \(\{\xi (0):\xi \in \Xi \}\), and whose simplices s have underlying space \(\overline{s} = \mathrm {conv}_{\xi (0)\in s} \xi (0)\) with diameter uniformly bounded from above and inradius uniformly bounded from below. Let \(p:\mathcal T_n\rightarrow \Xi \) be a labeling of the full-dimensional simplices, such that

  • \(\xi (0)\in s\) whenever \(p(s) =\xi \).

  • \(\mathrm {vol}\,\mathcal P_\xi = v\) for all \(\xi \), where \(\mathcal P_\xi = \overline{p^{-1}(\xi )}=\bigcup _{s\text { s.t. }p(s)=\xi }\overline{s}\).

Then \((\mathcal T,p)\) is called a honeycomb of \(\Xi \), and \(\mathcal P_\xi \), \(\xi \in \Xi \), are the cells of the honeycomb.

It is easy to verify that if \(\Xi \) has a honeycomb with cells of volume v, then its mean volume is \(d(\Xi )=v\). We denote by \(\Xi _s,\Xi _\xi \subset \Xi \), the set of elements \(\xi \in \Xi \), such that \(\xi (0)\) is a vertex of \(s\in \mathcal T\), or respectively a vertex of any simplex in \(p^{-1}(\xi )\). When we consider another set \(\Xi '\) in bijection \((\cdot )':\Xi \rightarrow \Xi '\) with \(\Xi \), such that \(\delta _R(\Xi ,\Xi ')<r/2\), the triangulation \(\mathcal T\) gives us a new triangulation \(\mathcal T' = \{s' = \{\xi '(0): \xi (0)\in s\} : s\in \mathcal T\}\). The new triangulation gives new cells \(\mathcal P'_{\xi } = \bigcup _{s, p(s)=\xi } \overline{s}'\).

For a specified honeycomb \((\mathcal T,p)\) of \(\Xi \), we can consider the finite volume-minimization problem for the individual cells \(\mathcal P_\phi \):

$$\begin{aligned} \begin{aligned} \text {minimize }&\mathrm {vol}\,\mathcal P'_\phi =\sum _{s,p(s) = \phi } \mathrm {vol}\,\mathrm {conv}_{\xi (0)\in s} \xi '(0)\text{, } \\ \text {over }&(\cdot )':\Xi _\phi \rightarrow E(n)\text{, }\\ \text {subj. to }&\mathrm {int}\,\xi '_1(K)\cap \mathrm {int}\,\xi '_2(K) = \emptyset \quad \text { for all }\xi _1,\xi _2\in \Xi _\phi \text{, }\\&N(\xi ^{-1}\xi ')<\varepsilon \quad \text { for all }\xi \in \Xi _\phi \text{. } \end{aligned} \end{aligned}$$
(6)

Since, for a fixed \(\xi \in \Xi \), \(\xi '\) may appear in more than one cell-restricted problems, \(\Xi \) might be strongly extreme for K without the restriction of \(\mathrm {Id}:\Xi \rightarrow E(n)\) optimizing any of these restricted problems. However, it is reasonable to expect, and in fact we prove, that if the restriction of the identity optimizes all of these problems, then \(\Xi \) is strongly extreme.

Theorem 2.3

Let \(\Xi \) be admissible and saturated for K and let \((\mathcal T,p)\) be a honeycomb of \(\Xi \). If there exists \(\varepsilon >0\) such that \(\xi '=\xi \), \(\xi \in \Xi _\phi \), minimizes (6) for every cell \(\mathcal P_\phi \), then \(\Xi \) is strongly extreme.

Proof

Consider a set \(\Xi '\) admissible for K with \(\delta _{M}(\Xi ,\Xi ')<\varepsilon '\), where M is a uniform bound on the distance between two vertices in the same cell. When \(\varepsilon '\) is small enough, then \(N(\xi ^{-1} \phi (\phi ')^{-1} \xi ') < \varepsilon \) for all \(\xi \in \Xi _\phi \). Let \(\xi '' = \phi (\phi ')^{-1}\xi '\) for all \(\xi \in \Xi _\phi \), then \((\cdot )''\) is feasible for (6) and we have \(\mathrm {vol}\,\mathcal P'_\phi = \mathrm {vol}\,\mathcal P''_\phi \ge \mathrm {vol}\,\mathcal P_\phi = v\).

We now wish to bound the lower mean volume \(\underline{d}(\Xi ')\) from below. From the definition, \(\underline{d}(\Xi ') = \lim \inf _{t\rightarrow \infty } \mathrm {vol}\,B(0,t) / |\Xi '_t|\), where \(\Xi '_t = \{\xi \in \Xi :\xi '(0)\in B(0,t)\}\). The limit does not change if we replace t in the numerator \(\mathrm {vol}\,B(0,t)\) by \(t+2M\). Since \(B(0,t+2M)\) includes all the cells \(\mathcal P'_\xi \) for \(\xi \in \Xi '_t\), the volume of the ball must be at least \(v |\Xi '_t|\), and \(\underline{d}(\Xi ')\ge v = d(\Xi )\). Therefore, \(\Xi \) is strongly extreme.\(\square \)

Theorem 2.3 will be strong enough to prove that the densest known packing of regular pentagons is strongly extreme. However, to prove the same for regular heptagons, we will need a stronger version that allows us to introduce auxiliary objectives. We will consider instead of (6), a modified optimization problem:

$$\begin{aligned} \begin{aligned} \text {minimize }&\mathrm {vol}\,\mathcal P'_\phi + f_\phi ( (\xi ')_{\xi \in \Xi _\phi } )\text{, } \\ \text {over }&(\cdot )':\Xi _\phi \rightarrow E(n)\text{, }\\ \text {subj. to }&\mathrm {int}\,\xi '_1(K)\cap \mathrm {int}\,\xi '_2(K) = \emptyset \quad \text { for all }\xi _1,\xi _2\in \Xi _\phi \text{, }\\&N(\xi ^{-1}\xi ')<\varepsilon \quad \text { for all }\xi \in \Xi _\phi \text{. } \end{aligned} \end{aligned}$$
(7)

We say that the set of auxiliary functions is negligible in the aggregate (cf. [11, p. 149]) if there exist \(R,\varepsilon ,C,\) and T such that whenever \(d_R(\Xi ,\Xi ')<\varepsilon \), we have \(\sum _{\phi \in \Xi '_t}f_\phi ( (\xi ')_{\xi \in \Xi _\phi } ) \le C t^{n-1}\) for all \(t>T\). The auxiliary function is isometry-invariant if \(f_\phi ( (\psi \xi ')_{\xi \in \Xi _\phi } ) = f_\phi ( (\xi ')_{\xi \in \Xi _\phi } )\) for all \(\psi \in E(n)\). It is straightforward to extend the proof of Theorem 2.3 to obtain

Theorem 2.4

Let \(\Xi \) be admissible and saturated for K, let \((\mathcal T,p)\) be a honeycomb of \(\Xi \), and let \(f_\phi \), \(\phi \in \Xi \), be negligible in the aggregate and isometry-invariant. If there exists \(\varepsilon >0\) such that \(\xi '=\xi \), \(\xi \in \Xi _\phi \), minimizes (7) for each cell \(\mathcal P_\phi \), then \(\Xi \) is strongly extreme.

2.2 Double Lattices

In this section, we focus on convex bodies in 2 dimensions, and specifically, convex polygons. We recapitulate some of the theory of double lattices, due to Kuperberg and Kuperberg [17] and Mount [19].

Definition 2.10

A chord of a convex body K is a line segment whose endpoints lie on the boundary of K. A chord is an affine diameter if there is no longer chord parallel to it.

Definition 2.11

The convex hull of two parallel chords that are half the length of the parallel affine diameter is called a half-length parallelogram.

An affine diameter of K does not in general uniquely determine a half-length parallelogram (e.g., when a parallel segment of sufficient length exists on the boundary), but it always uniquely determines its area.

Definition 2.12

A set \(\Lambda \subset E(n)\) is called a (full rank) double lattice if it is an (rR)-set for some \(r>0\) and \(R<\infty \), it consists of translations and point reflections, it is closed under composition and inversion, and it is not a lattice (that is, includes at least one point reflection).

An n-dimensional double lattice is generated by a lattice and a point reflection, or alternatively by reflections about \(n+1\) affine-independent points or by reflections about the 2n vertices of a parallelepiped.

Theorem 2.5

(Kuperberg and Kuperberg, Mount). For a planar convex body K, an admissible double lattice of smallest mean area is generated by reflection about the vertices of a half-length parallelogram.

Kuperberg and Kuperberg use extensive parallelograms, inscribed parallelograms with all edge lengths greater than half the length of the parallel affine diameter. They later restrict the analysis to the set of half-length parallelograms [17]. Mount gives an explicit proof that it suffices to consider only the half-length parallelograms [19].

Fig. 2
figure 2

Half-length parallelograms in the regular 9-gon. That the minimum at \(y=0\) is not the global minimum, and therefore that the densest double-lattice packing is not given by the symmetric arrangement as for pentagons and heptagons, was possibly first noticed by Graaf et al. [7]

Figure 2 illustrates some affine diameters and half-length parallelograms in the regular 9-gon. We denote one endpoint of the affine diameter by \(\mathbf {p}_1\) and the parallelogram vertices and other affine diameter endpoint as \(\mathbf {p}_i\), \(i=2,\ldots ,6\) in counterclockwise order. The parallelogram edges \(\mathbf {p}_2\mathbf {p}_3\) and \(\mathbf {p}_5\mathbf {p}_6\) are the ones parallel to the affine diameter \(\mathbf {p}_1\mathbf {p}_4\). The space of such labeled half-length parallelograms of a body K is a circle, in the sense that we can continuously parameterize \(\mathbf {p}_i(t)\), \(i=1,\ldots ,6\), \(t\in S^1\). In fact, when the body is a polygon, as we henceforth assume, this parameterization can be piecewise linear [19]. This parameterization specifies an affine diameter, a specification that does not change the half-length parallelogram or the generated double lattice. In the interior of each linear piece of the parameterization, either (1) one endpoint of the affine diameter is stationary at a vertex of K, and the other points either move at a constant speed (possibly zero) along the interior of an edge or are stationary at a vertex or (2) all parallelogram vertices are stationary while the affine diameter moves along two parallel edges. The piecewise linear parameterization of affine-diameter-augmented parallelograms can be converted to a piecewise linear parameterization of parallelograms by eliminating the intervals of type 2. A half-length parallelogram is called pivotal if it sits at the boundary of two linear pieces, that is, there is a discontinuity in the direction of motion of at least one point \(\mathbf {p}_i\).

Fig. 3
figure 3

Three examples of the exceptional cases defined in Definition 2.13. Left and middle: exceptional half-length parallelograms of type I. Right: exceptional half-length parallelogram of type II

In order to prove our main theorem in Sect. 3.3, we will need to know that the double lattice we wish to show is strongly extreme does not have any of the vertices of the half-length parallelogram coincident with vertices of the polygon K. This will generically follow from the fact that the double lattice packing is an isolated minimum. In some exceptional cases, which must be treated separately, it does not. We define the following exceptional types of parallelograms, illustrated in Fig. 3:

Definition 2.13

  1. 1.

    If \(\mathbf {p}_2\) and \(\mathbf {p}_6\) are in the interiors of edges that meet at \(\mathbf {p}_1\) and \(\mathbf {p}_3\) and \(\mathbf {p}_5\) are in the interiors of edges that meet at \(\mathbf {p}_4\), then the parallelogram is exceptional of type I.

  2. 2.

    If the affine diameter parallel to \(\mathbf {p}_2\mathbf {p}_3\) is not unique, then if \(\mathbf {p}_2\) and \(\mathbf {p}_3\) are in the interior of edges that have as endpoints the endpoints of one such affine diameter and \(\mathbf {p}_5\) and \(\mathbf {p}_6\) are in the interior of edges that have as endpoints the endpoints of another such affine diameter, the parallelogram is also be considered exceptional of type I.

  3. 3.

    If \(\mathbf {p}_3\) and \(\mathbf {p}_4\) are in the interior of the same edge, and \(\mathbf {p}_2\) is at a vertex, then the parallelogram is exceptional of type II. It can also be exceptional of type II if the same situation occurs with \((\mathbf {p}_2,\mathbf {p}_1,\mathbf {p}_3)\), \((\mathbf {p}_5,\mathbf {p}_4,\mathbf {p}_6)\), or \((\mathbf {p}_6,\mathbf {p}_1,\mathbf {p}_5)\) in place of \((\mathbf {p}_3,\mathbf {p}_4,\mathbf {p}_2)\).

Theorem 2.6

When the half-length parallelogram \(\mathbf {p}_2\mathbf {p}_3\mathbf {p}_5\mathbf {p}_6\) is an isolated local minimum of area in the space of half-length parallelograms of a convex polygon K, then either (1) it is not pivotal, or (2) it is exceptional of type I.

Proof

Assume that the given parallelogram is pivotal. If the parallel affine diameter is unique, then at least one endpoint is a vertex of K. If only one endpoint is a vertex let it be \(\mathbf {p}_1\). Otherwise, let \(\mathbf {p}_1\) be the vertex that makes the smaller angle between the adjoining counterclockwise edge and the affine diameter. If the affine diameter is not unique, all of the subsequent analysis is equivalent if we remove from K the parallelogram traced out by the affine diameters and identify the two extreme affine diameters. In the modified version of K, the affine diameter is unique. We may pick our orientation, scale, and origin without loss of generality so that the affine diameter is horizontal, of length 2, and bisected by the origin.

We can evolve the parameterization forward (counterclockwise) or backward (clockwise) from the given parallelogram. For each \(i=1,\ldots ,6\), let \(\mathbf {v}_i=(\cos \phi _i,\sin \phi _i)\) be a unit vector pointing, in the counterclockwise direction, along the edge of K that \(\mathbf {p}_i\) belongs to. If \(\mathbf {p}_i\) is at the intersection of two edges, we use the counterclockwise adjoining edge. Let \(\mathbf {u}_i=(\cos \chi _i,\sin \chi _i)\) be defined exactly the same way, except we use the clockwise adjoining edge in case of intersection. The forward evolution is by velocities \(c_i\), \(i=1,\ldots ,6\) satisfying

$$\begin{aligned} c_1 \mathbf {v}_1 - c_4 \mathbf {v}_4 = 2 c_2\mathbf {v}_2-2 c_3\mathbf {v}_3 = 2 c_6\mathbf {v}_6-2 c_5\mathbf {v}_5\text{. } \end{aligned}$$
(8)

Together with the fact that \(c_1=0\), which follows from our choice of which affine diameter endpoint to label as \(\mathbf {p}_1\), this equation determines \(c_i\), \(i=1,\ldots ,6\), up to multiplication by a common factor. The points evolves as \(\mathbf {q}_i(t) = \mathbf {p}_i+t c_i\mathbf {v}_i\).

The area of the parallelogram is a quadratic function of t. If its slope for forward evolution is negative, then the parallelogram is not a local minimum, and we are done. Therefore, let it be nonnegative. We now extend the forward evolution to negative times. We obtain parallelograms, not necessarily inscribed, of area \(A(t) \le A(0) +O(t^2)\), together with parallel segments not necessarily affine diameters, but double the length of the parallel sides. Since the initial parallelogram is pivotal, at least one point \(\mathbf {q}_i(t)\), \(i=2,\ldots ,5\) will go into the exterior of K, and the distance between it and K will grow linearly with t. We will argue that from these parallelograms, we can construct half-length parallelograms of area \(A'(t) \le A(t) + a t\), with \(a>0\). When we have shown those exist, it follows that the given parallelogram is not a local minimum.

We first treat the case where exactly one point becomes exterior. Suppose the point that becomes exterior is \(\mathbf {q}_2(t)\) (Fig. 4, left). We find the points \(\mathbf {q}_2'(t) = \mathbf {p}_2 + t c_2'\mathbf {u}_2\) and \(\mathbf {q}_3'(t) = \mathbf {p}_3 + t c_3'\mathbf {v}_3\) that form a segment that is parallel to and of the same length as \(\mathbf {q}_2(t)\mathbf {q}_3(t)\). A trigonometric calculation gives \(c_2'= c_2 \sin (\phi _3-\phi _2)/\sin (\phi _3-\chi _2)\) and \(c_3'=c_3 - c_2\sin (\phi _2-\chi _2)/\sin (\phi _3-\chi _2)\). At least up to some finite negative time \(-T\le t<0\), \(\mathbf {q}_2'(t)\) is on the clockwise edge adjoining \(\mathbf {p}_2\) (\(c_2'>0\)), and \(\mathbf {q}_3(t)\) is either on the interior of the same edge as \(\mathbf {p}_3\), if the latter is in the interior of an edge, or else, if \(\mathbf {p}_3\) is at a vertex and stationary for forward evolution, then \(c_3=0\) and \(c_3'<0\), so \(\mathbf {q}_3'(t)\) is on the counterclockwise adjoining edge to \(\mathbf {p}_3\). In any case, the parallelogram \(\mathbf {q}_2'(t)\mathbf {q}_3'(t)\mathbf {q}_5(t)\mathbf {q}_6(t)\) is inscribed (therefore a half-length parallelogram), and its area is \(A'(t) = A(t) - c_2 t \sin \phi _3 \sin (\phi _2-\chi _2)/\sin (\phi _3-\chi _2) + O(t^2)\). If \(\phi _3>\pi \), the linear term is negative for negative times, and we have our desired parallelogram. Otherwise, \(\phi _3=\pi \) and the edge \(\mathbf {p}_2\mathbf {p}_3\) can be moved left while staying in K, so the given parallelogram is not an isolated minimum. The same argument works, mutatis mutandis, when any one of the other three parallelogram vertices becomes exterior.

Fig. 4
figure 4

Extending the forward evolution of the half-length parallelogram to negative times at a pivotal parallelogram. Left: When one of the parallelogram vertices becomes external, we construct the parallel chord of the same length. Right: When an endpoint of the affine diameter becomes external and the intersection with the polygon is still an affine diameter, we construct the two parallel chords of half the length of that affine diameter (bottom chord omitted)

Now suppose that \(\mathbf {q}_4(t)\) becomes exterior for \(t<0\) (Fig. 4, right). We distinguish two cases, (i) if \(\chi _4-\chi _1\ge \pi \), then \(\mathbf {p}_1\mathbf {q}'_4\), the intersection of \(\mathbf {p}_1\mathbf {q}_4(t)\) with K, is an affine diameter, and (ii) otherwise, there is a parallel affine diameter \(\mathbf {q}'_1(t)\mathbf {p}_4\), with \(\mathbf {q}'_1(t)\) on the clockwise adjoining edge to \(\mathbf {p}_1\). In case (i), we have \(\mathbf {q}'_4(t) = \mathbf {p}_4 + l_4(t) \mathbf {u}_4\), where \(l_4(t) = c_4 t \sin \phi _4/(\sin \chi _4 + \tfrac{1}{2} c_4 t \sin (\phi _4-\chi _4))\). We find points \(\mathbf {q}_i'(t) = \mathbf {q}_i(t) + l_i(t) \mathbf {v}_i\), \(i=2,5\) and \(\mathbf {q}_i'(t) = \mathbf {q}_i(t) + l_i(t) \mathbf {u}_i\), \(i=3,6\), such that \(2(\mathbf {q}_2'(t) - \mathbf {q}_3'(t)) = 2(\mathbf {q}_6'(t) - \mathbf {q}_5'(t)) = \mathbf {p}_1 - \mathbf {q}_4'(t)\). For small enough negative times, this construction will yield a half-length parallelogram, and its area is

$$\begin{aligned} A'(t) = A(t) + \tfrac{1}{2}c_4 \frac{\sin (\phi _4-\chi _4)}{\sin \chi _4} \left[ h - \frac{1}{\cot \chi _3-\cot \phi _2} - \frac{1}{\cot \chi _6-\cot \phi _5}\right] + O(t^2)\text{, } \end{aligned}$$
(9)

where h is the height of the parallelogram. Since we chose 2 for the length of the affine diameter \(\mathbf {p}_1\mathbf {p}_4\), we have

$$\begin{aligned} h\le \frac{1}{\cot \chi _3-\cot \phi _2} + \frac{1}{\cot \chi _2-\cot \phi _5}, \end{aligned}$$

with equality only if the parallelogram is exceptional of type I. In case (ii), we have \(\mathbf {q}'_1(t) = \mathbf {p}_1 + l_1(t)\mathbf {u}_1\), where \(l_1(t) = c_4 t \sin \phi _4/(\sin (\chi _1+\pi ) + \tfrac{1}{2} c_4 t \sin (\phi _4-\chi _1-\pi ))\). Again we find the points \(\mathbf {q}_i'(t)\), \(i=2,3,5,6\), that make up a half-length parallelogram with the new affine diameter, and the area of this parallelogram is

$$\begin{aligned} A'(t)= & {} A(t) + \tfrac{1}{2}c_4 \frac{\sin (\phi _4-\chi _1-\pi )}{\sin (\chi _1-\pi )} \left[ h - \frac{1}{\cot \chi _3-\cot \phi _2} - \frac{1}{\cot \chi _6-\cot \phi _5}\right] \nonumber \\&+ \, O(t^2)\text{. } \end{aligned}$$
(10)

In this case too, the given parallelogram is either not a local minimum or is exceptional of type I.

To handle the case of multiple points becoming exterior upon extension of the forward evolution to negative times, we can simply compose the operations we performed in each of the cases of a single exterior point. In each step of the composed operation, we pretend that the exterior points we have not yet handled and we are not handling right now are actually on the boundary of the polygon. Since in each step of the operation we obtain a linear-order reduction to the area of the parallelogram, the composition also provides a linear-order reduction. Therefore, the given parallelogram cannot be an isolated local minimum. \(\square \)

Theorem 2.7

If the half-length parallelogram \(\mathbf {p}_2\mathbf {p}_3\mathbf {p}_5\mathbf {p}_6\) is not pivotal and is an isolated local minimum of area in the space of half-length parallelograms of a convex polygon K then either (1) all its vertices and at least one affine diameter endpoint are in the interior of polygon edges, or (2) it is exceptional of type II.

Proof

Since the parallelogram is not pivotal, forward and backward evolution use the same linear velocities. The solution (up to common factor) to (8) is

$$\begin{aligned} \begin{aligned} c_1&= 0,\\ c_2&= \sin (\phi _4-\phi _3)\sin (\phi _6-\phi _5),\\ c_3&= \sin (\phi _4-\phi _2)\sin (\phi _6-\phi _5),\\ c_4&= 2\sin (\phi _3-\phi _2)\sin (\phi _6-\phi _5),\\ c_5&= \sin (\phi _3-\phi _2)\sin (\phi _6-\phi _4),\\ c_6&= \sin (\phi _3-\phi _2)\sin (\phi _5-\phi _4)\text{. } \end{aligned} \end{aligned}$$
(11)

When the affine diameter is not unique, we perform the same modification as in the previous proof. The only way one of the labeled points except \(\mathbf {p}_1\) can be at a vertex, is if its velocity \(c_i\) vanishes. If \(\phi _3=\phi _2\) or \(\phi _5=\phi _6\), then one of the parallelogram edges is strictly contained in a polygon edge and the parallelogram is not a local minimum. Therefore, the only way a labeled point other than \(\mathbf {p}_1\) can be stationary is if \(\phi _4=\phi _2+\pi ,\phi _3,\phi _5\), or \(\phi _6-\pi \). If \(\phi _4=\phi _2+\pi \), then necessarily also \(\phi _1=\phi _2\), the affine diameter is not unique, reaching a contradiction. Similarly, we do not have \(\phi _4=\phi _6-\pi \). If \(\phi _4=\phi _3\) and \(\mathbf {p}_2\) is stationary at a vertex or if \(\phi _4=\phi _5\) and \(\mathbf {p}_5\) is stationary at a vertex then the parallelogram is exceptional of type II.\(\square \)

We will prove in Sect. 3.3 that whenever the least-area half-length parallelogram of a polygon K is an isolated local minimum of area and is not one of the two exceptional types, the resulting double-lattice is strongly extreme. However, the cases where the least-area half-length parallelogram is not an isolated minimum or is exceptional do not arise generically: it can be easily seen that when these cases arise, a generic small displacement of one boundary point removes the polygon from that case.

2.3 Honeycomb Construction

We now describe a honeycomb associated with any double lattice packing. Let K be a convex polygon and let \(\mathbf p_2\mathbf p_3\mathbf p _5\mathbf p_6\) be a half-length parallelogram, such that \(\mathbf p_2\mathbf p_3\) and \(\mathbf p_6\mathbf p_5\) are half the length of and parallel to the affine diameter \(\mathbf p_1\mathbf p_4\). The double lattice generated by reflections about the vertices of the parallelogram is \(\Xi \) and the subgroup of translations is the lattice \(\Lambda \). Let \(\xi _0 = \mathrm {Id}\), \(\xi _1 = \mathrm {Tran}_{\mathbf p_1-\mathbf p_4}\), \(\xi _2 = \mathrm {Ref}_{\mathbf {p}_2}\), and \(\xi _6 = \mathrm {Ref}_{\mathbf {p}_6}\), where \(\mathrm {Ref}_\mathbf {r}\) is a reflection about \(\mathbf {r}\) and \(\mathrm {Tran}_\mathbf {r}\) is a translation by \(\mathbf {r}\). Let \(s_2=\{\xi _0(0),\xi _1(0),\xi _2(0)\}\) and \(s_6=\{\xi _0(0),\xi _6(0), \xi _1(0)\}\), then \(\mathcal {T}_2 = \{ \xi (s_2), \xi (s_6): \xi \in \Xi \}\) are the full-dimensional simplices of a triangulation \(\mathcal T\) of \(\mathbb {R}^2\), and if \(p(\xi (s_2))=p(\xi (s_6))=\xi \), then \((\mathcal T,p)\) is a honeycomb of the double lattice \(\Xi \) (Fig. 5). The optimization problem of minimizing \(\mathrm {vol}\,\mathcal P'_\phi \) over \((\cdot )':\Xi _\phi \rightarrow E(n)\), is equivalent for every \(\phi \in \Xi \). Therefore, to show that Theorem 2.3 applies, it suffices to show that the restriction of the identity is optimal for the problem associated with the cell \(\mathcal P_{\xi _0}\).

For every convex body K and double lattice \(\Xi \) we now have a concrete optimization problem to solve: we wish to minimize the area of the quadrilateral \(\xi '_0(0)\xi '_6(0)\xi '_1(0)\xi '_2(0)\) subject to the constraints that \(\xi '_i(K)\) and \(\xi '_j(K)\) do not overlap. Since the objective and the constraints are invariant under common isometry, we may fix \(\xi '_i=\xi _i\) for one i. We parametrize \(\xi _i' = \mathrm {Tran}_{\mathbf {r}_i}\xi _i \mathrm {Rot}_{\theta _i}\), where \(\mathrm {Rot}_\theta \) is a rotation by \(\theta \) about the origin. Since we are only interested in certifying that the initial configuration is a local minimum, we can replace the constraints with ones that are equivalent in a neighborhood.

Lemma 2.1

Let K and \(K'\) be two polygons that intersect at a segment, which is not identical with a full edge of K or of \(K'\). The endpoints of the segments are \(\mathbf {x}\) a vertex of K and \(\mathbf {y}\) a vertex of \(K'\). Let \(\mathbf {y}\mathbf {y}'\) and \(\mathbf {x}\mathbf {x}'\) be the edges of K and \(K'\) containing the intersection. Let \(\mathbf {x}'\mathbf {y}\mathbf {x}\mathbf {y}'\) be oriented counterclockwise from the point of view of the interior of K (otherwise switch K and \(K'\)). There is some \(\varepsilon >0\) such that whenever \(N(\xi ),N(\xi ')<\varepsilon \), then \(\xi (K)\) and \(\xi '(K')\) have disjoint interiors if and only if \(\alpha (\xi (\mathbf {x})\xi (\mathbf {x}')\xi '(\mathbf {y}))\ge 0\) and \(\alpha (\xi '(\mathbf {y}')\xi '(\mathbf {y})\xi (\mathbf {x}))\ge 0\), where \(\alpha \) is the signed area of the oriented triangle.

Fig. 5
figure 5

Honeycomb construction for the regular 9-gon. Each honeycomb cell (hatched parallelogram) is composed of two triangles. The area minimization problem for each cell involves only the four 9-gons centered at the cell vertices

Lemma 2.2

Let K and \(K'\) be two polygons that intersect at a point and not at a segment. The intersection point \(\mathbf y\) is a vertex of one polygon, which we let be \(K'\), and sits in the relative interior of the segment \(\mathbf {x}'\mathbf {x}\), which in turn is contained in an edge of K. We assume the segment \(\mathbf {x}'\mathbf {x}\) is oriented counterclockwise from the point of view of the interior of K. There is some \(\varepsilon >0\) such that whenever \(N(\xi ),N(\xi ')<\varepsilon \), then \(\xi (K)\) and \(\xi '(K')\) have disjoint interiors if and only if \(\alpha (\xi (\mathbf {x})\xi (\mathbf {x}')\xi '(\mathbf {y}))\ge 0\).

Note that two cases are not treated: the case of an intersection at a point that is a vertex of both polygons and the case of an intersection at a full edge of one or both polygons. In the optimal double-lattice packings of the first two bodies we treat, the regular pentagon and the regular heptagon, there are no such intersections. For the application to more general convex polygons, we have already shown that the first intersection case only arises in exceptional cases. The constraint arising in the second case can be relaxed to a constraint of the form of Lemma 2.1 without affecting the solution.

We will show that optimization problems that arise fall into a convenient form, where linear stability holds along all but one direction. Along the direction of vanishing linear stability, the theory of Kuperberg and Kuperberg will be shown to guarantee stability.

Consider the nonlinear optimization problem

$$\begin{aligned} \begin{aligned} \text {minimize }&f(x)\text{, }\\ \text {over }&x\in \mathbb {R}^n\text{, } \\ \text {subject to }&g_r(x)\ge 0, \quad r\in I\text{, }\\&\Vert x\Vert \le \varepsilon \text{. } \end{aligned} \end{aligned}$$
(12)

We will prove that the following conditions are sufficient for the origin to be the unique solution of (12) for some \(\varepsilon >0\). Let \(e_1\) denote the standard unit vector \((1, 0,\dots ,0)\in \mathbb {R}^n\), \(E=\{te_1\}\) its span, and H the orthogonal complement, so that \(\mathbb {R}^n = E\oplus H\).

Conditions 2.1

  1. 1.

    I is a finite set.

  2. 2.

    f(x) and \(g_r(x)\), \(r\in I\), are continuously differentiable. Denote their derivatives \(F(t) = \nabla f (te_1)\) and \(G_r(t) = \nabla g_r(te_1)\).

  3. 3.

    \(f(0) = g_r(0) = 0\) for all r in I.

  4. 4.

    The linear program

    $$\begin{aligned} \text {minimize }_{x\in \mathbb {R}^n}F(0)\cdot x \quad \text { subject to }G_r(0)\cdot x \ge 0,\quad r\in I, \end{aligned}$$
    (13)

    has E as the set of solutions.

  5. 5.

    There is an \(\varepsilon > 0\) so the functions \(g_r(te_1) = 0\) for all \(-\varepsilon<t<\varepsilon \), \(r\in I\).

  6. 6.

    \(f(te_1)\) has an isolated local minimum at \(t=0\).

Theorem 2.8

Given Conditions 2.1, there exists \(\varepsilon >0\) such that the origin is the unique solution of (12).

Proof

Consider the sliced problem with t fixed,

$$\begin{aligned} \begin{aligned} \text {minimize }&f(x)\text{, }\\ \text {over }&x\in H+t e_1\text{, }\\ \text {subject to }&g_r(x)\ge 0, \quad r\in I,\\&\Vert x\Vert \le \varepsilon \text{. } \end{aligned} \end{aligned}$$
(14)

For \(t=0\), the local minimum at \(x=0\) is guaranteed by the LP (13) [1]. The LP implies, by duality, that F(0) lies in the interior (relative to H) of the cone finitely generated by \(G_r(0)\), \(r\in I\). By continuity, we have that F(t) also lies in the interior of the cone generated by \(G_r(t)\), \(r\in I\), for all \(|t|<\varepsilon \), for some \(\varepsilon >0\). Therefore, in each slice, the minimum of (14) occurs at \(x=te_1\). From the final condition of 2.1, we conclude that the origin is the unique solution of (12) for some \(\varepsilon >0\).\(\square \)

3 Calculations

3.1 Pentagons

Let us fix a regular pentagon \(K=\mathrm {conv} \{\mathbf {k}_i:i=0,\ldots 4\}\), where \(\mathbf {k}_i = \mathrm {Rot}_{2\pi i/5} (1,0)\). In this subsection, we do all the calculations in the extension field \(\mathbb {Q}(u,v)\), where \(u=\cos \pi /5\) and \(v=\sin \pi /5\).

Fig. 6
figure 6

Half-length parallelograms in the regular pentagon

One minimum-area half-length parallelogram corresponds to the affine diameter \(\mathbf {p}_1\mathbf {p}_4\), where \(\mathbf {p}_1 = \mathbf {k}_0\) and \(\mathbf {p}_4 = \tfrac{1}{2} (\mathbf {k}_2+\mathbf {k}_3)\). The vertices of the parallelogram are given by \(\mathbf {p}_2=\tfrac{1}{4}\mathbf k_0+\tfrac{3}{4}\mathbf k_1\), \(\mathbf {p}_3=\tfrac{3-2u}{4}\mathbf k_1+\tfrac{1+2u}{4}\mathbf k_2\), \(\mathbf {p}_5=\tfrac{1+2u}{4}\mathbf k_3+\tfrac{3-2u}{4}\mathbf k_4\), and \(\mathbf {p}_6=\tfrac{3}{4}\mathbf k_4+\tfrac{1}{4}\mathbf k_0\). This half-length parallelogram is illustrated in Fig. 6.

The four pentagons that surround our primitive honeycomb cell are \(\xi _i(K)\), \(i=0,1,2,6\), where \(\xi _0=\mathrm {Id}\), \(\xi _1=\mathrm {Tran}_{\mathbf {p}_1-\mathbf {p}_4}\), \(\xi _2=\mathrm {Ref}_{\mathbf {p}_2}\), and \(\xi _6=\mathrm {Ref}_{\mathbf {p}_6}\). We are interested in showing that the assignment \(\xi _i'=\xi _i\), \(i=0,1,2,6\), locally minimizes the area of the quadrilateral \(\xi _0(0)\xi _6(0)\xi _1(0)\xi _2(0)\), subject to the nonoverlap constraints. As explained in the previous section, we may fix \(\xi '_1=\xi _1\) and replace the nonoverlap constraints by signed area constraints. We obtain the following optimization problem:

$$\begin{aligned} \begin{aligned} \text {minimize } f(z)&=\alpha (\xi '_0(0),\xi '_1(0),\xi '_2(0))+\alpha (\xi '_0(0),\xi '_6(0),\xi '_1(0) )\text{, } \\ \text {subj. to } g_1(z)&=\alpha (\xi '_0(\mathbf {k}_1),\xi '_0(\mathbf {k}_0),\xi '_2(\mathbf {k}_1))\ge 0,\\ g_2(z)&=\alpha (\xi '_2(\mathbf {k}_1),\xi '_2(\mathbf {k}_0),\xi '_0(\mathbf {k}_1))\ge 0,\\ g_3(z)&=\alpha (\xi '_0(\mathbf {k}_0),\xi '_0(\mathbf {k}_4),\xi '_6(\mathbf {k}_4))\ge 0,\\ g_4(z)&=\alpha (\xi '_6(\mathbf {k}_0),\xi '_6(\mathbf {k}_4),\xi '_0(\mathbf {k}_4))\ge 0,\\ g_5(z)&=\alpha (\xi '_1(\mathbf {k}_2),\xi '_1(\mathbf {k}_1),\xi '_2(\mathbf {k}_2))\ge 0,\\ g_6(z)&=\alpha (\xi '_2(\mathbf {k}_2),\xi '_2(\mathbf {k}_1),\xi '_1(\mathbf {k}_2))\ge 0,\\ g_7(z)&=\alpha (\xi '_1(\mathbf {k}_4),\xi '_1(\mathbf {k}_3),\xi '_6(\mathbf {k}_3))\ge 0,\\ g_8(z)&=\alpha (\xi '_6(\mathbf {k}_4),\xi '_6(\mathbf {k}_3),\xi '_1(\mathbf {k}_3))\ge 0,\\ g_9(z)&=\alpha (\xi '_1(\mathbf {k}_3),\xi '_1(\mathbf {k}_2),\xi '_0(\mathbf {k}_0))\ge 0\text{, } \end{aligned} \end{aligned}$$
(15)

where \(\xi '_i=\mathrm {Tran}_{(x_i,y_i)}\xi _i\mathrm {Rot}_\theta \) for \(i=0,2,6\), and \(\xi '_1=\xi _1\). We adopt a condensed notation for the free variables \(z=(x_0,y_0,\theta _0,x_2,y_2,\theta _2,x_6,y_6,\theta _6)\).

We consider the linearization of (15) around the point \(z=0\in \mathbb {R}^9\). This gives a problem of the form

$$\begin{aligned} \text {minimize } c\cdot z \quad \text { subject to } G z\ge 0\text{, } \end{aligned}$$
(16)

where \(c\in \mathbb R^9\), \(G\in \mathbb {R}^{9\times 9}\) and we use the linear programming notation \(\ge 0\) to denote a vector lying in the closed positive orthant. The exact numeric values of G and c are given in Table 1. We can show by direct calculation a vector \(\eta >0\) lying in the open positive orthant exists such that \(c=\eta ^T G\). Such a vector is given in Table 1. By the fundamental theorem of linear algebra, this observation implies that \(G z\ge 0\) and \(c\cdot z\le 0\) if and only if \(G z = 0\) and \(c\cdot z=0\), and so the program (16) is minimized exactly by the null space of G and is suboptimal elsewhere in the cone \(Gz\ge 0\). We calculate the rank of G to be 8, and so the null space is one-dimensional, and it is generated by the vector \(z_0\) given in Table 1. The null space corresponds precisely to the rearrangement given by evolving the half-length parallelogram according to the locally linear parameterization discussed in Sect. 2.2. Importantly, this motion involves no rotations. We can verify directly that \(f(tz_0)\) is a quadratic function of t minimized at \(t=0\), and that \(g_r(tz_0)=0\) identically for \(r=1,\ldots ,9\). Indeed, perturbing the half-length parallelogram away from the minimum-area one increases the area of the resulting cell and maintains all the contacts.

Table 1 Constraint Jacobian and objective gradient for the pentagon packing problem

Therefore, (15), after a change in coordinates, satisfies all the conditions of Theorem 2.8, and following directly from Theorem 2.3 we have:

Theorem 3.1

The optimal double-lattice packing of regular pentagons, illustrated in Fig. 6, is strongly extreme.

3.2 Heptagons

The calculation for the regular heptagon starts out in the same manner as the calculation presented above for regular pentagons. However, it will turn out that the linear program equivalent to (16) is in this case not minimized at \(z=0\), and so we will need to add an auxiliary objective to the area as allowed for in Theorem 2.4.

We fix a regular heptagon \(K=\mathrm {conv} \{\mathbf {k}_i:i=0,\ldots 6\}\), where \(\mathbf {k}_i = \mathrm {Rot}_{2\pi i/7} (1,0)\). In this case, our calculations are performed in the extension field \(\mathbb {Q}(u,v)\), where \(u=\cos \pi /7\) and \(v=\sin \pi /7\). A minimum-area half-length parallelogram corresponds to the affine diameter \(\mathbf {p}_1\mathbf {p}_4\), where \(\mathbf {p}_1 = \mathbf {k}_0\) and \(\mathbf {p}_4 = \tfrac{1}{2} (\mathbf {k}_3+\mathbf {k}_4)\). The vertices of the parallelogram are given by \(\mathbf {p}_2=(1-a)\mathbf k_1+a\mathbf k_2\), \(\mathbf {p}_3=(1-b)\mathbf k_2+b\mathbf k_3\), \(\mathbf {p}_5=b\mathbf k_4+(1-b)\mathbf k_5\), and \(\mathbf {p}_6=a\mathbf k_5+(1-a)\mathbf k_6\). This half-length parallelogram is illustrated in Fig. 7.

Fig. 7
figure 7

Half-length parallelograms in the regular heptagon. The dashed configuration is the local minimum at \(y>0\)

Following the same steps as in the previous sections, we obtain the following optimization problem:

$$\begin{aligned} \begin{aligned} \text {minimize }&f(z)=\alpha (\xi '_0(0),\xi '_1(0),\xi '_2(0))+\alpha (\xi '_0(0),\xi '_6(0),\xi '_1(0) )\text{, }\\ \text {subj. to }&g_1(z)=\alpha (\xi '_0(\mathbf {k}_2),\xi '_0(\mathbf {k}_1),\xi '_2(\mathbf {k}_1))\ge 0,\\&g_2(z)=\alpha (\xi '_2(\mathbf {k}_2),\xi '_2(\mathbf {k}_1),\xi '_0(\mathbf {k}_1))\ge 0,\\&g_3(z)=\alpha (\xi '_0(\mathbf {k}_6),\xi '_0(\mathbf {k}_5),\xi '_6(\mathbf {k}_6))\ge 0,\\&g_4(z)=\alpha (\xi '_6(\mathbf {k}_6),\xi '_6(\mathbf {k}_5),\xi '_0(\mathbf {k}_6))\ge 0,\\&g_5(z)=\alpha (\xi '_1(\mathbf {k}_3),\xi '_1(\mathbf {k}_2),\xi '_2(\mathbf {k}_3))\ge 0,\\&g_6(z)=\alpha (\xi '_2(\mathbf {k}_3),\xi '_2(\mathbf {k}_2),\xi '_1(\mathbf {k}_3))\ge 0,\\&g_7(z)=\alpha (\xi '_1(\mathbf {k}_5),\xi '_1(\mathbf {k}_4),\xi '_6(\mathbf {k}_4))\ge 0,\\&g_8(z)=\alpha (\xi '_6(\mathbf {k}_5),\xi '_6(\mathbf {k}_4),\xi '_1(\mathbf {k}_4))\ge 0,\\&g_9(z)=\alpha (\xi '_1(\mathbf {k}_4),\xi '_1(\mathbf {k}_3),\xi '_0(\mathbf {k}_0))\ge 0\text{. } \end{aligned} \end{aligned}$$
(17)

The linearization of (17) around \(z=0\) gives

$$\begin{aligned} \text {minimize } c\cdot z \quad \text { subject to } G z\ge 0\text{. } \end{aligned}$$
(18)

The values of G and c are given in Table 2. Unfortunately, (18) is unbounded. This can be shown by producing some \(z_u\) such that \(c\cdot z_u<0\) and \(G z_u\ge 0\). In the dual setting, this implies that there is no \(\eta \) such that \(c=\eta ^T G\) and \(\eta >0\).

Table 2 Constraint Jacobian and objective gradient for the heptagon packing problem

Due to Theorem 2.4, we are allowed to modify the cost function f(z) by adding auxiliary functions as long as they are negligible in the aggregate and isometry-invariant. In order for the new problem to be locally minimized, we need the new gradient \(c'\) to lie in the cone \(\{\eta ^T G:\eta >0\}\). We will take the following simple form for our modified problem

$$\begin{aligned} \begin{aligned} \text {minimize }&f'(z) = f(z) +\sum _{r=1}^9 \mu _r g_r(z)\text{, }\\ \text {subj. to }&\text {same constraints as in (17).} \end{aligned} \end{aligned}$$
(19)

For a cell \(\mathcal P_\xi \) other than the primitive cell \(\mathcal P_{\xi _0}\), the modified version is the same as (19), except we replace \(\xi '_i\) everywhere with \(\xi \circ \xi '_i\). Since the problem is invariant under common isometry, the problem is equivalent for all the cells and it is enough to show that (19) is locally minimized. Note that \(\xi _2\circ \xi _0=\xi _2\) and \(\xi _2\circ \xi _2=\xi _0\), so \(g_1^{\mathcal P_{\xi _0}}(z) = g_2^{\mathcal P_{\xi _2}}(z)\) and \(g_2^{\mathcal P_{\xi _0}}(z) = g_1^{\mathcal P_{\xi _2}}(z)\). Therefore, if \(\mu _1=-\mu _2\), the auxiliary addition \(\mu _1 g_1(z)+\mu _2 g_2(z)\) is equal in magnitude and opposite in sign for the two cells \(\mathcal P_{\xi _0}\) and \(\mathcal P_{\xi _2}\) sharing the edge \(\xi _0(0)\xi _2(0)\). Similar cancellation occurs across the three other edges if \(\mu _3=-\mu _4\), \(\mu _5=-\mu _6\), and \(\mu _7=-\mu _8\). The term \(\mu _9 g_9(z)\) does not cancel with any neighboring cells, and so we set \(\mu _9=0\). Therefore, when \(\mu _i=-\mu _{i+1}\) for \(i=1,3,5,7\) and \(\mu _9=0\), the sum of the auxiliary objectives of a collection of cells only has contributions from unmatched edges at the boundary of the collection and the auxiliary objectives are negligible in the aggregate. A choice for \(\mu _r\) satisfying these conditions and such that \(c'\) lies in the cone \(\{\eta ^T G:\eta >0\}\) exists if and only if there is some \(\eta \) such that \(c=\eta ^T G\) and \(\eta _1+\eta _2>0\), \(\eta _3+\eta _4>0\), \(\eta _5+\eta _6>0\), \(\eta _7+\eta _8>0\), and \(\eta _9>0\). We can show directly that such \(\eta \) exists, and we give an example in Table 2.

We now have that \(G z\ge 0\) and \(c'\cdot z\le 0\) if and only if \(G z = 0\) and \(c'\cdot z=0\). The rank of G is again 8, and so the program (19) is minimized exactly at the one-dimensional null space of G, which is generated by the vector \(z_0\) given in Table 2. We can verify directly that \(f(tz_0)\) is a quadratic function of t minimized at \(t=0\), and that \(g_r(tz_0)=0\) identically for \(r=1,\ldots ,9\).

Therefore, (19) satisfies all the conditions of Theorem 2.8, and we have:

Theorem 3.2

The optimal double-lattice packing of regular heptagons, illustrated in Fig. 7, is strongly extreme.

3.3 General Polygons

The structure of the solution in the cases of pentagons and heptagons suggests that it might be possible to extend the result to general convex polygons. We will consider a general convex polygon and a double-lattice packing generated by a half-length parallelogram. We will assume that this half-length parallelogram is an isolated local minimum of the area among half-length parallelograms. Due to Theorems 2.6 and 2.7, this assumption implies that, except in the exceptional cases, all the contacts in the double-lattice packing fall into the vertex-to-edge and the edge-to-edge types. Moreover, when the affine diameter is not unique, the contact between K and \(\mathrm {Tran}_{\mathbf {p}_1-\mathbf {p}_4}(K)\) is edge-to-edge, but the associated constraint can be relaxed to a vertex-to-edge constraint so that our analysis may follow the generic case. We find that the only data about the polygon that enters into the calculation are the following:

  1. 1.

    The coordinates of vertices of the minimum-area half-length parallelogram and the endpoints of a corresponding affine diameter, with at least one endpoint in the interior of an edge. Without loss of generality, we may assume the affine diameter is horizontal, of length 2, and bisected by the origin. The remaining data are encoded into the following parameters: h is the height of parallelogram and a is the horizontal shift between the top and bottom sides of the parallelogram. The position of the parallelogram center, not determined by these parameters, is eliminated during the calculation.

  2. 2.

    The angle of the polygon edges on which the vertices of the parallelogram and the endpoint of the affine diameter lie, \(\phi _i\), \(i=2,3,4,5,6\).

  3. 3.

    The distance and direction along the edge from those points to the nearest polygon vertex, that is, half the length of the contact. We denote these distances \(l_i\), \(i=2,3,4,5,6\). For the directions, we will assume the nearest polygon vertex is counterclockwise from the parallelogram vertex, but we will argue later that this assumption has no effect on the subsequent analysis.

The assumption that the area of the half-length parallelogram is minimized can be written as

$$\begin{aligned} a= h \cot \phi _4 - \frac{\sin \phi _3 \sin (\phi _4 - \phi _2)}{\sin \phi _4 \sin (\phi _3-\phi _2)} + \frac{\sin \phi _5 \sin (\phi _6 - \phi _4)}{\sin \phi _4 \sin (\phi _6-\phi _5)}. \end{aligned}$$
(20)

As in the previous sections the objective is given by the area of the quadrilateral \(\xi '_0(0)\xi '_6(0)\xi '_1(0)\xi '_2(0)\), and we parameterize the search space using \(z=(x_0,\ldots ,\theta _6)\in \mathbb {R}^9\) and \(\xi '_1=\xi _1\). The oriented triangles to be used to represent the nonoverlap constraints depend on the directions of \(l_i\), \(i=2,3,5,6\). With the contact directions assumed, the optimization problem is:

$$\begin{aligned} \begin{aligned} \text {minimize }&f(z)=\alpha (\xi '_0(0),\xi '_1(0),\xi '_2(0))-\alpha (\xi '_0(0),\xi '_6(0),\xi '_1(0) )\text{, }\\ \text {subj. to }&g_1(z)=\alpha (\xi '_0(\mathbf {p}_2+l_2\mathbf {u}_2),\xi '_0(\mathbf {p}_2-l_2\mathbf {u}_2),\xi '_2(\mathbf {p}_2+l_2\mathbf {u}_2))/l_2^2\ge 0,\\&g_2(z)=\alpha (\xi '_2(\mathbf {p}_2+l_2\mathbf {u}_2),\xi '_2(\mathbf {p}_2-l_2\mathbf {u}_2),\xi '_0(\mathbf {p}_2+l_2\mathbf {u}_2))/l_2^2\ge 0,\\&g_3(z)=\alpha (\xi '_2(\mathbf {p}_3+l_3\mathbf {u}_3),\xi '_2(\mathbf {p}_3-l_3\mathbf {u}_3),\xi '_1(\mathbf {p}_3+l_3\mathbf {u}_3))/l_3^2\ge 0,\\&g_4(z)=\alpha (\xi '_1(\mathbf {p}_3+l_3\mathbf {u}_3),\xi '_1(\mathbf {p}_3-l_3\mathbf {u}_3),\xi '_2(\mathbf {p}_3+l_3\mathbf {u}_3))/l_3^2\ge 0,\\&g_5(z)=\alpha (\xi '_1(\mathbf {p}_5+l_5\mathbf {u}_5),\xi '_1(\mathbf {p}_5-l_5\mathbf {u}_5),\xi '_6(\mathbf {p}_5+l_5\mathbf {u}_5))/l_5^2\ge 0,\\&g_6(z)=\alpha (\xi '_6(\mathbf {p}_5+l_5\mathbf {u}_5),\xi '_6(\mathbf {p}_5-l_5\mathbf {u}_5),\xi '_1(\mathbf {p}_5+l_5\mathbf {u}_5))/l_5^2\ge 0,\\&g_7(z)=\alpha (\xi '_6(\mathbf {p}_6+l_6\mathbf {u}_6),\xi '_6(\mathbf {p}_6-l_6\mathbf {u}_6),\xi '_0(\mathbf {p}_6+l_6\mathbf {u}_6))/l_6^2\ge 0,\\&g_8(z)=\alpha (\xi '_0(\mathbf {p}_6+l_6\mathbf {u}_6),\xi '_0(\mathbf {p}_6-l_6\mathbf {u}_6),\xi '_6(\mathbf {p}_6+l_6\mathbf {u}_6))/l_6^2\ge 0,\\&g_9(z)=\alpha (\xi '_1(\mathbf {p}_4+l_4\mathbf {u}_4),\xi '_1(\mathbf {p}_4-l_4\mathbf {u}_4),\xi '_0(\mathbf {p}_1))/l_4^2\ge 0\text{. } \end{aligned} \end{aligned}$$
(21)
Table 3 Right and left null space generators of the constraint matrix in the general polygon packing problem

We linearize the problem to obtain a problem of the form (18). The constraint matrix G is singular, and we can obtain right and left null space vectors \(z_0\) and \(\eta _0\), whose values are given in Table 3. In fact, the null spaces are one-dimensional, as can be seen by calculating \(\det (G-\eta _0 z_0^T)/(z_0\cdot z_0) = 2^{14}\sin (\phi _3-\phi 2)\sin (\phi _6-\phi _5)/(l_2 l_3 l_4 l_5 l_6) \ne 0\). Note that the variables associated with the rotations \(\theta _i\) in \(z_0\) are zero, and the assignment \(z=tz_0\) corresponds to evolving the half-length parallelogram. We therefore will have that \(f(tz_0)\) has an isolated local minimum at \(t=0\) and that \(g_r(tz_0)=0\) for all t. We note also that \(z_0\cdot c = 0\) and so c is contained in the row space of G. We solve for the vector \(\eta \) such that \(\eta \cdot G = c\) and \(\eta \cdot \eta _0=0\). We obtain the following values:

$$\begin{aligned} \begin{aligned} \eta _1+\eta _2&= -l_2\sin \phi _3/\sin (\phi _3-\phi _2),\\ \eta _3+\eta _4&= l_3\sin \phi _2/\sin (\phi _3-\phi _2),\\ \eta _5+\eta _6&= l_5\sin \phi _6/\sin (\phi _6-\phi _5),\\ \eta _7+\eta _8&= -l_6\sin \phi _5/\sin (\phi _6-\phi _5),\\ \eta _9&= -\frac{l_4}{\sin \phi _4}\bigg (h - \frac{1}{\cot \phi _3-\cot \phi _2} - \frac{1}{\cot \phi _6-\cot \phi _5}\bigg )\text{. } \end{aligned} \end{aligned}$$
(22)

Since all the above are positive, we can proceed as in the case of heptagons to include auxiliary functions that would make all the \(\eta _i\)’s individually positive and be negligible in the aggregate and isometry-invariant. Therefore, we have shown that the packing is strongly extreme.

To see that the directions of the contacts do not matter, simply relax the correct signed-area constraints to the signed-area constraints derived from a proper subinterval of the contact with the contact direction as assumed in the proof. This change is in fact a relaxation in an appropriate neighborhood. Since the relaxed problem is still solved uniquely at \(\xi _i'=\xi _i\), the same is true of the original. The same argument also takes care of contacts at full edges.

We have therefore proved that any double-lattice packing that is an isolated local minimum among double-lattice packings and is not one of the exceptional types of Definition 2.13 is also strongly extreme. We expect that the optimal double-lattice packing is strongly extreme even when it is not an isolated local minimum or when it is one of the exceptional types. However, these cases require a more careful analysis. In practice, we have a procedure which, when combined with the Mount algorithm, takes an arbitrary convex polygon as input, produces the densest double lattice packing, and certifies it as local maximum among general packings.