Abstract
The problem of packing ellipsoids is considered in the present work. Usually, the computational effort associated with numerical optimization methods devoted to packing ellipsoids grows quadratically with respect to the number of ellipsoids being packed. The reason is that the number of variables and constraints of ellipsoids’ packing models is associated with the requirement that every pair of ellipsoids must not overlap. As a consequence, it is hard to solve the problem when the number of ellipsoids is large. In this paper, we present a nonlinear programming model for packing ellipsoids that contains a linear number of variables and constraints. The proposed model finds its basis in a transformation-based non-overlapping model recently introduced by Birgin et al. (J Glob Optim 65(4):709–743, 2016). For solving large-sized instances of ellipsoids’ packing problems with up to 1000 ellipsoids, a multi-start strategy that combines clever initial random guesses with a state-of-the-art (local) nonlinear optimization solver is presented. Numerical experiments show the efficiency and effectiveness of the proposed model and methodology.
1 Introduction
Ellipsoids’ packing techniques are important tools for several physical-chemistry and condensed matter physics applications (see [20,21,22,23,24, 35] and the references therein). In a robotic problem addressed in [18], a robot arm and other elements in a scene are approximated by three-dimensional ellipsoids, which allows the authors to explore the relation between the overlapping of ellipsoids and the overlapping of free-form objects. In [4, 20], simulation techniques are employed to study the properties of random packings of ellipsoids. In both works, rectangular boxes with periodic boundaries are considered in order to simulate the three-dimensional infinite space, and properties like density, orientation, and contacts among the ellipsoids are analyzed.
In the last few decades, many works [5, 7, 14,15,16, 19, 29, 41,42,43,44] addressed the problem of packing non-overlapping spheres in the n-dimensional space within a variety of fixed- or variable-dimensions containers of different shapes by means of nonlinear programming (NLP) models and methods. However, to the best of our knowledge, only eight very recent works [8, 25, 30,31,32,33, 38, 40] have considered the problem of packing non-overlapping ellipses, spheroids, or ellipsoids using nonlinear programming models and optimization techniques.
In 2013, the problem of packing as many identical ellipses as possible within a rectangular container was considered in [25]. By restricting the ellipses to have their semi-axes parallel to the Cartesian axes and allowing the centers of the ellipses to belong to a finite set of points in the plane (grid), the problem was modelled as a mixed-integer linear programming problem (MILP). As expected, only small instances of the MILP model could be solved to optimality within an affordable time.
In 2014, the problem of packing a given set of (non-necessarily identical) freely-rotated ellipses within a rectangle of minimum area was considered [31]. The non-overlapping between the ellipses was modelled using the idea of separating lines. State-of-the-art global optimization solvers were able to find solutions for instances with up to 14 ellipses. For instances with up to 100 ellipses, the authors presented solutions obtained by a constructive heuristic method. The same problem was addressed in [40]. The problem was modelled using “quasi-phi-functions”, an extension of the phi-functions [17] that have been extensively and successfully used to model a large variety of complicated packing problems. As in [31], the non-overlapping between ellipses was modelled based on the idea of separating lines. Models were tackled by a local optimization solver combined with ad hoc initial points and a multi-start strategy. Most of the solutions presented in [31] were improved in [40], where numerical experiments with additional instances with up to 120 ellipses were also shown.
In 2015, the work [30] extended the ideas presented for the two-dimensional case in [31] to deal with the three-dimensional problem of packing a given set of (non-necessarily identical) freely-rotated ellipsoids within a rectangular container of minimum volume. The idea of separating lines to model the non-overlapping between ellipses was naturally extended to separating planes to model the non-overlapping between ellipsoids. Resulting NLP models are non-convex and highly complex. Numerical experiments in [30], that considered instances with up to 100 ellipsoids, showed that state-of-the-art global optimization solvers were only able to deliver feasible solutions within an affordable prescribed CPU time limit. Heuristic methods were also proposed in [30]. Also in 2015, Pankratov et al. [38] extended the methods and methodology proposed in [40] from the two- to the three-dimensional case; but only spheroids, instead of arbitrary ellipsoids, were considered in the three-dimensional case. In that work, quasi-phi-functions were defined, NLP models proposed (based on separating planes), and solutions were delivered by applying a multi-start strategy associated with a local NLP solver. Illustrative numerical experiments in [38] describe solutions obtained for instances with up to 12 spheroids.
Still in 2015, continuous and differentiable NLP models for n-dimensional ellipsoids’ packing problems were proposed in [8]. The non-overlapping between ellipsoids was formulated in two different ways: (i) based on separating hyperplanes (and, in this sense, similarly to the already mentioned approaches) and (ii) based on linear transformations. In the latter case, the non-overlapping between a pair of ellipsoids reduces to the non-overlapping between a sphere and an ellipsoid. The solution to this simpler problem was inspired by the models proposed in [5] for packing circles within an ellipse. The non-overlapping models proposed in [8] were employed in the same work, as an illustration of their applicability, to tackle five variations of two- and three-dimensional ellipsoids’ packing problems: (a) packing the maximum number of identical ellipses with given semi-axis lengths within a rectangular container with given length and width; (b) finding the rectangular container with minimum area that can pack a given set of ellipses; (c) finding the elliptical container with minimum area that can pack a given set of identical ellipses; (d) finding the spherical container with minimum volume that can pack a given set of identical ellipsoids; and (e) finding the cuboid with minimum volume that can pack a given set of identical ellipsoids. In all cases, a multi-start strategy combined with a local nonlinear programming solver was employed with the aim of finding good quality solutions. Numerical experiments showed that, when applied to the instances of problem (b) presented in [31], the proposed models and methodology improved most of the solutions presented in [31, 40]. When applied to problem (e), models proposed in [8] can deal with instances with up to 100 ellipsoids in the sense that good quality solutions can be found within an affordable CPU time limit.
On the other hand, only small-sized instances of the NLP models proposed in [30, 38] can be handled by off-the-shelf local or global nonlinear solvers.
In 2016, the problems of packing ellipses within circles and within regular polygons of minimum area were considered in [32, 33], respectively. The introduced models include variables classified as “primary variables”, which define the ellipses’ positions (center and orientation), and “secondary variables”, which are auxiliary variables required to model the non-overlapping between the ellipses and the container fitting constraints. The latter variables can be seen as “embedded Lagrange multipliers” since they are, in fact, Lagrange multipliers of underlying optimization subproblems related to the non-overlapping and fitting constraints (this idea is also present in the models proposed in [8]). In [33], illustrative numerical examples that consider the ellipses of the instances presented in [31] are given (all with up to 14 ellipses). In [32], instances with up to 20 ellipses are considered. It is worth noting that the authors mention that the application of non-dedicated deterministic global optimization strategies could become prohibitively expensive for larger instances and that heuristic approaches might be useful in that case.
Nonlinear programming models introduced in [8, 30,31,32,33, 38, 40] share the following property: If m ellipsoids are considered, the cost of evaluating the constraints that define the models is \(O(m^2)\). This means that, independently of variations in the modelling process that may affect the difficulties that appear when a local or global solver is employed, they all share the quadratic complexity in the number of ellipsoids. This inhibits their applicability to large-sized instances of any kind of ellipsoids’ packing problem. For the two-dimensional case, this difficulty was addressed in [40] by means of a heuristic approach that solves a sequence of NLP problems with expected O(m) variables and constraints. Starting from a given initial point, for each ellipse, a square that contains the ellipse is considered and an additional constraint that says that the ellipse must belong to the square is added to an underlying model with \(O(m^2)\) variables and constraints (based on quasi-phi-functions). Since a pair of ellipses that must belong to non-overlapping squares cannot overlap, the associated non-overlapping constraint and auxiliary variables can be deleted from the model. The resulting NLP model is solved and the process is repeated. Hopefully, all subproblems have a linear number of variables and constraints and the sequence of their solutions converges to a solution to the original problem. This strategy was also extended in [39] to tackle several related three-dimensional packing problems.
In the present work, we introduce non-overlapping models with an evaluation cost that is linear in the number of ellipsoids. As a consequence, models for a variety of ellipses’ and ellipsoids’ packing problems are introduced and numerical experiments with up to 1000 ellipsoids are presented. To fix ideas, suppose that our problem consists of packing m different ellipsoids in a given container in \(\mathbb {R}^n\) without overlapping. Assume that, in principle, each ellipsoid is placed with its center in the origin. The natural variables of the problem turn out to be the displacements of the ellipsoids, by means of which each ellipsoid is placed inside the container and intersections between (the interiors of the) ellipsoids do not occur. Each displacement is a combination of a translation and a rotation. Since the translation is represented by n real parameters and the rotation is represented by \(n(n-1)/2\) rotation angles, it is natural to think that the problem has \(m [n + n(n-1)/2]\) variables. However, practical mathematical formulations usually involve \(O(m^2)\) unknowns. This is because there exists one non-overlapping requirement for each pair of ellipsoids and each non-overlapping requirement involves additional specific variables. For example, in order to verify whether two ellipsoids overlap, one finds the points in the ellipsoids that realize the minimum distance. These points are “specific” variables that we would like to eliminate. The main idea of this paper is to eliminate the additional \(O(m^2)\) variables by making them implicit by means of suitable geometrical transformations. Although the non-overlapping model we present has a linear number of variables and constraints, it is still a very-hard-to-solve nonlinear model. In order to find good quality solutions for the ellipsoids’ packing problem, we consider a multi-start method with ad hoc initial guesses.
The rest of this work is organized as follows. A formal definition of the problem is given in Sect. 2. The model introduced in [8] that formulates the non-overlapping problem in terms of its natural variables plus \(O(m^2)\) additional variables is briefly presented in Sect. 3. This model is then used to derive a new non-overlapping model with implicit variables in Sect. 4. The new model contains a number of variables and constraints that is linear in the number of ellipsoids to be packed. In Sect. 5, we show how to efficiently evaluate the constraints of the non-overlapping model. In Sect. 6, we briefly discuss the applicability of existing state-of-the-art deterministic global optimization tools to tackle packing problems based on the proposed model. In Sect. 7, we present some numerical experiments that show that the non-overlapping model introduced in the present work can be used to pack a large number of ellipsoids. Finally, we draw some conclusions in Sect. 8. The computer implementation of the models and methods introduced in the current work and the solutions reported in Sect. 7 are freely available at http://www.ime.usp.br/~lobato/.
2 Statement of the problem
An ellipsoid in \(\mathbb {R}^n\) is a set of the form \(\{ x \in \mathbb {R}^n \mid (x - c)^{\top }M^{-1} (x-c) \le 1\}\), where \(M \in \mathbb {R}^{n \times n}\) is a symmetric and positive definite matrix. The vector \(c \in \mathbb {R}^n\) is the center of the ellipsoid. The eigenvectors of \(M^{-1}\) determine the principal axes of the ellipsoid and the eigenvalues of \(M^{\frac{1}{2}}\) are the lengths of the semi-axes of the ellipsoid. In this work, we deal with the problem of packing ellipsoids in the n-dimensional space. This problem can be stated as follows. Let m be the number of ellipsoids to be packed. We are given the lengths \(l_i^1, \dots , l_i^n\) of the semi-axes of the i-th ellipsoid for each \(i \in \{1,\dots ,m\}\) and we are also given a set \(\mathcal {C}\subset \mathbb {R}^n\), which we call the container. The objective of the problem is to find ellipsoids \(\mathcal {E}_1, \dots , \mathcal {E}_m\) such that:
1. \(\mathcal {E}_i\) has semi-axes with lengths \(l_i^1, \dots , l_i^n\) for all \(i \in \{1,\dots ,m\}\);
2. \(\text {int}(\mathcal {E}_i) \cap \text {int}(\mathcal {E}_j) = \emptyset \), for each \(i,j \in \{1,\dots ,m\}\) with \(i \ne j\);
3. \(\mathcal {E}_i \subseteq \mathcal {C}\) for all \(i \in \{1,\dots ,m\}\).
The first constraint states that the ellipsoids must have the given lengths of the semi-axes. The second requirement is that the ellipsoids must not overlap each other, which means that the interiors of the ellipsoids must be mutually disjoint. The last constraint says that the ellipsoids must be inside the container. This is a feasibility problem where one must determine the center and rotation of each ellipsoid so that the constraints 1, 2, and 3 are satisfied. We also consider an optimization version of this problem in which the volume of the container must be minimized.
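The quadratic-inequality description of an ellipsoid given above can be tested directly. The following sketch (illustrative Python, not the implementation distributed with the paper) checks membership of a point, using the fact that the eigenvalues of \(M^{\frac{1}{2}}\) are the semi-axis lengths:

```python
import numpy as np

def in_ellipsoid(x, c, M):
    """True iff (x - c)^T M^{-1} (x - c) <= 1."""
    d = x - c
    return float(d @ np.linalg.solve(M, d)) <= 1.0

# Axis-aligned ellipse with semi-axes 2 and 1 centered at the origin:
# M = diag(4, 1), so M^{1/2} = diag(2, 1) carries the semi-axis lengths.
M = np.diag([4.0, 1.0])
c = np.zeros(2)
print(in_ellipsoid(np.array([2.0, 0.0]), c, M))  # frontier point -> True
print(in_ellipsoid(np.array([2.1, 0.0]), c, M))  # exterior point -> False
```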
3 Transformation-based non-overlapping model
In [8], continuous and differentiable nonlinear programming models for packing ellipsoids within polyhedra and ellipsoids were introduced. Two models for the non-overlapping constraints were given. One of them is based on a linear transformation that, for every pair of ellipsoids, converts one of the ellipsoids into a ball. Since this model forms the basis for the non-overlapping model that will be introduced in this work, we briefly describe it below. For a more detailed explanation, see [8].
Let \(I = \{1,\dots ,m\}\) be the set of indices of the ellipsoids to be packed. For each \(i \in I\), we denote by \(P_i^{\frac{1}{2}}\) the \(n \times n\) diagonal matrix whose diagonal entries are the lengths of the semi-axes of the ellipsoid \(\mathcal {E}_i\). Then, for each \(i \in I\), we can represent the ellipsoid \(\mathcal {E}_i\) as
where \(c_i \in \mathbb {R}^n\) is the center of the ellipsoid and \(Q_i \in \mathbb {R}^{n \times n}\) is an orthogonal matrix that determines the orientation of the ellipsoid. For example, for \(n = 2\), we can represent \(Q_i\) as
whereas, for \(n = 3\), we can represent \(Q_i\) as
The parameters \(\theta _i\) (when \(n=2\)) and \(\theta _i\), \(\phi _i\), and \(\psi _i\) (when \(n = 3\)) are called “rotation angles” of the ellipsoid. Similar parametrizations can be made for \(n > 3\).
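The displayed matrices for \(Q_i\) are not reproduced above; the sketch below uses the standard planar rotation for \(n = 2\) and, for \(n = 3\), a product of elementary rotations about the coordinate axes (the particular ordering of the three angles is an assumption here, one common convention). Any such product is orthogonal, as required of \(Q_i\):

```python
import numpy as np

def rotation_2d(theta):
    """Planar rotation by the angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def rotation_3d(theta, phi, psi):
    """Product of elementary rotations about the z-, y-, and x-axes
    (one common convention; the paper's exact ordering is not shown)."""
    cz, sz = np.cos(theta), np.sin(theta)
    cy, sy = np.cos(phi), np.sin(phi)
    cx, sx = np.cos(psi), np.sin(psi)
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return Rz @ Ry @ Rx

# Orthogonality check: Q^T Q = I.
Q = rotation_3d(0.3, -0.7, 1.1)
print(np.allclose(Q.T @ Q, np.eye(3)))  # True
```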
Let \(i,j \in I\) be such that \(i < j\) and consider the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\). Let \(T_{ij}: \mathbb {R}^n \rightarrow \mathbb {R}^n\) be the linear transformation defined by
Let \(\mathcal {E}_i^{ij}\) and \(\mathcal {E}_j^{ij}\) be the ellipsoids obtained when the transformation \(T_{ij}\) defined in (3) is applied to \(\mathcal {E}_i\) and \(\mathcal {E}_j\), respectively, i.e.,
and
where
Therefore, \(\mathcal {E}_i^{ij}\) is a unit-radius ball centered at \(P_i^{-\frac{1}{2}}Q_i^{\top }(c_i-c_j)\) and \(\mathcal {E}_j^{ij}\) is an ellipsoid centered at the origin (since \(S_{ij} = V_{ij}^{\top }V_{ij}\) with \(V_{ij} = P_j^{-\frac{1}{2}}Q_j^{\top }Q_iP_i^{\frac{1}{2}}\)). By Lemma 3.1 below (that is a restatement of Lemma 3.1 in [8]), we have that the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) overlap if and only if the ellipsoids \(\mathcal {E}_i^{ij}\) and \(\mathcal {E}_j^{ij}\) overlap.
Lemma 3.1
Consider the ellipsoids \(\mathcal {E}_i\), \(\mathcal {E}_j\), \(\mathcal {E}_i^{ij}\), and \(\mathcal {E}_j^{ij}\) defined above. Then, the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) overlap if and only if the ellipsoids \(\mathcal {E}_i^{ij}\) and \(\mathcal {E}_j^{ij}\) overlap.
Proof
For any \(x \in \mathbb {R}^n\), we have
Therefore, we have that \(x \in \text {int}(\mathcal {E}_i)\) if and only if \(T_{ij}(x) \in \text {int}(\mathcal {E}_i^{ij})\). Furthermore,
Then, \(x \in \text {int}(\mathcal {E}_j)\) if and only if \(T_{ij}(x) \in \text {int}(\mathcal {E}_j^{ij})\). Hence, \(\text {int}(\mathcal {E}_i) \cap \text {int}(\mathcal {E}_j) \ne \emptyset \) if and only if \(\text {int}(\mathcal {E}_i^{ij}) \cap \text {int}(\mathcal {E}_j^{ij}) \ne \emptyset \). In other words, the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) overlap if and only if the ellipsoids \(\mathcal {E}_i^{ij}\) and \(\mathcal {E}_j^{ij}\) overlap.\(\square \)
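Lemma 3.1 can also be checked numerically. Since the displayed definition (3) of \(T_{ij}\) is not reproduced above, the sketch below assumes the form \(T_{ij}(x) = P_i^{-\frac{1}{2}}Q_i^{\top }(x - c_j)\), which is consistent with the stated centers of \(\mathcal {E}_i^{ij}\) and \(\mathcal {E}_j^{ij}\):

```python
import numpy as np

def rot2(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Two planar ellipses E_i, E_j given by centers, rotations, and semi-axes.
P_i_sqrt, P_j_sqrt = np.diag([3.0, 1.0]), np.diag([2.0, 0.5])
Q_i, Q_j = rot2(0.4), rot2(-1.2)
c_i, c_j = np.array([5.0, 1.0]), np.array([-2.0, 3.0])

# Assumed form of the transformation T_ij (see the lead-in above).
T = lambda x: np.linalg.inv(P_i_sqrt) @ Q_i.T @ (x - c_j)

# Frontier points of E_i map to the unit sphere around T(c_i) ...
u = np.array([0.6, 0.8])                      # any unit vector
x_on_Ei = c_i + Q_i @ P_i_sqrt @ u
print(np.isclose(np.linalg.norm(T(x_on_Ei) - T(c_i)), 1.0))  # True

# ... and frontier points of E_j map to the frontier of {z : z^T S_ij z <= 1},
# with S_ij = V^T V and V = P_j^{-1/2} Q_j^T Q_i P_i^{1/2}.
V = np.linalg.inv(P_j_sqrt) @ Q_j.T @ Q_i @ P_i_sqrt
S_ij = V.T @ V
z = T(c_j + Q_j @ P_j_sqrt @ u)
print(np.isclose(z @ S_ij @ z, 1.0))          # True
```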
Two propositions given in [8] are now required. The frontier of a set \(\mathcal {E}\) is denoted by \(\partial \mathcal {E}\).
Proposition 3.1
Let \(\mathcal {E}= \{x \in \mathbb {R}^n \mid x^{\top }Mx \le 1\}\), where \(M \in \mathbb {R}^{n \times n}\) is symmetric and positive definite. Then, for each \(y \in \mathbb {R}^n \setminus \text {int}(\mathcal {E})\), there exist unique \(x^* \in \mathbb {R}^n\) and \(\mu ^* \in \mathbb {R}\) such that \(y = x^* + \mu ^* M x^*\) and \(x^*\) is the projection of y onto \(\mathcal {E}\). Moreover, \(x^* \in \partial \mathcal {E}\) and \(\mu ^* \in \mathbb {R}_+\).
Proof
See the proof of Proposition 4.1 in [8]. \(\square \)
Proposition 3.2
Let \(\mathcal {E}= \{x \in \mathbb {R}^n \mid x^{\top }Mx \le 1\}\), where \(M \in \mathbb {R}^{n \times n}\) is symmetric and positive definite. Then, for each \(x \in \partial \mathcal {E}\) and \(\mu > 0\), we have \((x + \mu M x)^{\top }M(x + \mu Mx) > 1\).
Proof
See the proof of Proposition 4.2 in [8]. \(\square \)
For the ball \(\mathcal {E}_i^{ij}\) not to overlap with \(\mathcal {E}_j^{ij}\), the distance between the center \(c_{i}^{ij}\) of the unit-radius ball \(\mathcal {E}_i^{ij}\) and the ellipsoid \(\mathcal {E}_j^{ij}\) must be at least one. In particular, \(c_{i}^{ij}\) must not belong to \(\mathcal {E}_j^{ij}\). If \(c_{i}^{ij} \notin \mathcal {E}_j^{ij}\), then, by Proposition 3.1, there exist unique \(x_{ij} \in \partial \mathcal {E}_j^{ij}\) and \(\mu _{ij} \in \mathbb {R}_+\) such that
Moreover, \(x_{ij}\) is the projection of \(c_{i}^{ij}\) onto \(\mathcal {E}_j^{ij}\). Hence, the distance \(d(c_{i}^{ij},\mathcal {E}_j^{ij})\) between \(c_{i}^{ij}\) and \(\mathcal {E}_j^{ij}\) is given by
By Proposition 3.2, if \(x_{ij} \in \partial \mathcal {E}_j^{ij}\) and \(\mu _{ij} > 0\), then \(c_{i}^{ij} \notin \mathcal {E}_j^{ij}\). Therefore, we obtain the following non-overlapping model:
Figure 1 illustrates the transformation-based non-overlapping model. The unit-radius ball \(\mathcal {E}_i^{ij}\) centered at \(c_i^{ij}\) is shown to the left; while the ellipsoid \(\mathcal {E}_j^{ij}\) is shown to the right. The point \(c_i^{ij}\) can be written as the sum of \(x_{ij}\) (which is the orthogonal projection of \(c_i^{ij}\) onto the frontier of \(\mathcal {E}_j^{ij}\)) and a positive multiple of the vector \(S_{ij}x_{ij}\).
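The roles of \(x_{ij}\) and \(\mu _{ij}\) can be made concrete: given the ball center \(c\) and the ellipsoid \(\{x \mid x^{\top }Sx \le 1\}\), the pair solves \(c = (I + \mu S)x\) with \(x^{\top }Sx = 1\) and \(\mu \ge 0\). The sketch below recovers \(\mu\) by bisection for an exterior center (an illustrative choice of method, not the solver used in the paper):

```python
import numpy as np

def project_onto_ellipsoid(c, S):
    """Projection of an exterior point c onto {x : x^T S x <= 1}, together
    with the multiplier mu >= 0 satisfying c = x + mu * S x."""
    n = len(c)
    def x_of(mu):
        return np.linalg.solve(np.eye(n) + mu * S, c)
    def phi(mu):
        x = x_of(mu)
        return x @ S @ x - 1.0
    lo, hi = 0.0, 1.0
    while phi(hi) > 0.0:          # bracket the root; phi decreases with mu
        hi *= 2.0
    for _ in range(100):          # plain bisection
        mid = 0.5 * (lo + hi)
        if phi(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    mu = 0.5 * (lo + hi)
    return x_of(mu), mu

S = np.diag([1.0, 4.0])           # ellipse x_1^2 + 4 x_2^2 <= 1
c = np.array([3.0, 0.0])          # exterior ball center
x, mu = project_onto_ellipsoid(c, S)
print(np.isclose(x @ S @ x, 1.0))        # x lies on the frontier
print(np.linalg.norm(c - x) >= 1.0)      # non-overlap: distance at least 1
```

In this example the projection is \((1,0)\) and \(\mu = 2\), so the unit ball around \(c\) does not overlap the ellipse.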
Notice that \(\mu _{ij}\) must be positive in order to satisfy constraints (5) and (7). Proposition 3.3 below provides a positive lower bound \(\epsilon _{ij}\) on \(\mu _{ij}\).
Proposition 3.3
Any solution to the system (4)–(7) is such that \(\mu _{ij} \ge \epsilon _{ij}\) for all \(i < j\), where
Proof
See the proof of Proposition 4.3 in [8]. \(\square \)
By manipulating the equalities and inequalities in model (4)–(7) and including the positive bound on \(\mu _{ij}\), we obtain the following equivalent non-overlapping model:
The numbers of variables and constraints of this non-overlapping model grow quadratically with the number of ellipsoids to be packed. Thus, when the number of ellipsoids is relatively large, this model becomes very hard to solve numerically. The introduction of a new model that does not suffer from this inconvenience starts by reducing the number of variables and constraints of model (8)–(11). Although those reductions do not reduce the complexity of evaluating the model, they are the first step in the direction of developing a more tractable model. In order to reduce the number of constraints, we will combine all constraints from the non-overlapping model (8)–(11) into a linear number of constraints by simply summing up the squared infeasibilities. To reduce the number of variables, we will replace the variables \(x_{ij}\) and \(\mu _{ij}\) from model (8)–(11) with functions that play the same roles as these variables.
4 Non-overlapping model with implicit variables
In this section, we present a non-overlapping model with implicit variables. The model is derived from the transformation-based non-overlapping model briefly described in the previous section. In Sect. 4.1, we show how to reduce the number of constraints and in Sect. 4.2 we show how to reduce the number of variables. The complete model with implicit variables, that contains a linear number of variables and constraints, is presented in Sect. 4.3.
4.1 Reduction of the number of constraints
Consider the non-overlapping model (8)–(11). By replacing each of the inequality constraints of this model with its squared infeasibility measure, we obtain the following model:
This model is equivalent to the model (8)–(11), in the sense that any solution to (12)–(15) is a solution to the model (8)–(11) and vice-versa.
For each \(i \in I\), let \(\Omega _i \in \mathbb {R}^q\) denote the vector of rotation angles of the i-th ellipsoid. For each \(i,j \in I\) such that \(i < j\), define the sum of the squared infeasibilities \(o: \mathbb {R}^{3n+2q+1} \rightarrow \mathbb {R}_+\) by
Observe that the set of constraints (12)–(15) is equivalent to the constraints
In order to obtain a model with a linear number of constraints, we can combine the constraints (17) in the following way
or even combining them into a single constraint:
The constraints (18) (or the constraint (19)) are equivalent to the constraints (17). Therefore, we can replace the constraints (8)–(11) with constraints (18) (or constraint (19)) and obtain an equivalent model for the non-overlapping of ellipsoids.
Although this new model has a linear number of constraints that model the non-overlapping of ellipsoids, the total number of terms in the summations still grows quadratically with the number of ellipsoids to be packed. Thus, the computational cost of evaluating the constraints (18) at a given point is practically the same as the cost of evaluating the constraints (8)–(11). In Sect. 5, we will see how to efficiently evaluate these constraints.
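The quadratic evaluation cost can be seen in code. The displayed definition (16) of o is not reproduced above, so the function below is a stand-in built from the squared infeasibilities of the constraints described in Sect. 3 (the equality \(c = x + \mu Sx\), the frontier condition \(x^{\top }Sx = 1\), the distance requirement, and \(\mu \ge \epsilon\)); the combined constraint then sums o over all pairs \(i < j\), which is \(O(m^2)\) terms:

```python
import numpy as np

def o(c, S, x, mu, eps):
    """Stand-in for the sum of squared infeasibilities of one pair
    (the paper's definition (16) is not reproduced here)."""
    r_eq = c - x - mu * (S @ x)                     # equality c = x + mu S x
    r_frontier = x @ S @ x - 1.0                    # x^T S x = 1
    r_dist = max(0.0, 1.0 - np.linalg.norm(c - x))  # ||c - x|| >= 1
    r_mu = max(0.0, eps - mu)                       # mu >= eps
    return r_eq @ r_eq + r_frontier**2 + r_dist**2 + r_mu**2

def combined_infeasibility(pair_data, eps):
    """A single combined constraint in the spirit of (19): the sum over
    all pairs must vanish.  O(m^2) terms for m ellipsoids."""
    return sum(o(c, S, x, mu, eps) for (c, S, x, mu) in pair_data)

# A non-overlapping pair (projection (1, 0), multiplier 2) contributes zero.
S = np.diag([1.0, 4.0])
pair_data = [(np.array([3.0, 0.0]), S, np.array([1.0, 0.0]), 2.0)]
print(combined_infeasibility(pair_data, eps=0.1))  # 0.0
```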
4.2 Reduction of the number of variables
Consider the constraints (18) and suppose that the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) do not overlap. In this case, we know that there exist values for \(x_{ij}\) and \(\mu _{ij}\) such that the term \(o(c_i, c_j, \Omega _i, \Omega _j, x_{ij}, \mu _{ij}; P_i, P_j)\) as defined in (16) vanishes and, therefore, does not contribute to the summation in (18). We can simply take \(x_{ij}\) as the projection of \(c_{i}^{ij}\) onto the ellipsoid \(\mathcal {E}_j^{ij}\) and \(\mu _{ij}\) as the nonnegative scalar that satisfies \(c_{i}^{ij} = x_{ij} + \mu _{ij} S_{ij} x_{ij}\). The projection of \(c_{i}^{ij}\) onto the ellipsoid \(\mathcal {E}_j^{ij}\) is the solution to the problem
However, taking \(x_{ij}\) as the solution to the problem (20) does not lead to a good overlapping measure when ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) do overlap. If \(c_{i}^{ij} \in \text {int}(\mathcal {E}_j^{ij})\), then the solution to the problem (20) is \(c_{i}^{ij}\) and we must have \(\mu _{ij} = 0\). Then, the term associated with the distance between \(c_{i}^{ij}\) and \(\mathcal {E}_j^{ij}\) will be constant for any \(c_{i}^{ij} \in \text {int}(\mathcal {E}_j^{ij})\), as well as the term associated with the positivity of \(\mu _{ij}\) in (18). Consider the following problem:
If \(c_{i}^{ij} \notin \text {int}(\mathcal {E}_j^{ij})\), problems (20) and (21) are equivalent and they both have a single solution. Notice that the null vector is not a feasible solution to problem (21). Thus, since problem (21) has a single constraint and the matrix \(S_{ij}\) is positive definite, the gradient of the constraint is nonzero at every feasible point. Therefore, any solution to this problem satisfies the linear independence constraint qualification. This means that the Karush–Kuhn–Tucker optimality conditions of problem (21) (see, for example, Proposition 3.3.1 in [3]) are satisfied by every solution to problem (21). Thus, if \(x^*\) is a solution to this problem, then there exists \(\mu ^* \in \mathbb {R}\) such that
If \(c_{i}^{ij} \notin \text {int}(\mathcal {E}_j^{ij})\) then, by Proposition 3.1, there exist a unique \(x^* \in \partial \mathcal {E}_j^{ij}\) and a unique \(\mu ^*\) that satisfy (22). Moreover, \(\mu ^* \ge 0\). On the other hand, if \(c_{i}^{ij} \in \text {int}(\mathcal {E}_j^{ij})\) then problem (21) may have more than one solution. But, by Proposition 4.2, the Lagrange multiplier associated with the constraint of this problem is the same for any solution and belongs to the interval \([-1/\lambda _{\max }(S_{ij}),0]\).
We restate here Lemma 4.1 and Proposition 4.1 that were proved in [8]. Lemma 4.1 will be used in the proof of Lemma 4.2 and Proposition 4.1 will be used in the proof of Proposition 4.2. Lemma 4.2 and Proposition 4.2 are introduced in the present work. Lemma 4.2 will be used in the proof of Proposition 4.2, which shows that the Lagrange multiplier associated with the constraint of problem (21) is the same for any solution to this problem. Lemma 4.2 is a particular case of Proposition 4.2.
Lemma 4.1
Consider ellipsoid \(\mathcal {E}= \{z \in \mathbb {R}^n \mid z^{\top } D z \le 1\}\), where \(D \in \mathbb {R}^{n \times n}\) is a positive definite diagonal matrix. For each \(y \in \mathcal {E}\), there exist \(x \in \partial \mathcal {E}\) and \(\alpha \in [-1/\lambda _{\max }(D), 0]\) such that \(y = x + \alpha Dx\).
Proof
See the proof of Lemma 5.1 in [8]. \(\square \)
Proposition 4.1
Consider ellipsoid \(\mathcal {E}= \{z \in \mathbb {R}^n \mid z^{\top } S z \le 1\}\), where \(S \in \mathbb {R}^{n \times n}\) is symmetric and positive definite. For each \(y \in \mathcal {E}\), there exist \(x \in \partial \mathcal {E}\) and \(\alpha \in [-1/\lambda _{\max }(S), 0]\) such that \(y = x + \alpha Sx\).
Proof
See the proof of Proposition 5.1 in [8]. \(\square \)
Lemma 4.2
Consider the ellipsoid \(\mathcal {E}= \{z \in \mathbb {R}^n \mid z^{\top } D z \le 1\}\), where \(D \in \mathbb {R}^{n \times n}\) is diagonal and positive definite. Given \(y \in \mathcal {E}\), there exists a unique \(\alpha \in [-1/\lambda _{\max }(D), 0]\) and there exists \(x \in \partial \mathcal {E}\) such that \(y = x + \alpha Dx\). Moreover, if \(\alpha \in (-1/\lambda _{\max }(D), 0]\) then \(x \in \partial \mathcal {E}\) is unique.
Proof
Let \(\mathcal {I} = \{1,\dots ,n\}\). For each \(i \in \mathcal {I}\), denote the i-th diagonal element of matrix D by \(d_i\). Consider the system
By Lemma 4.1, the system (23)–(25) has at least one solution. Let \((x^*,\alpha ^*)\) be a solution to this system and suppose that \(\alpha ^* > -1/\lambda _{\max }(D)\). We shall prove that this is the only solution to this system. Notice that this is enough to prove the lemma.
Since \(\alpha ^* > -1/\lambda _{\max }(D)\), we have that
By (23), we have \(y_i = (1 + \alpha ^*d_i)x_i^*\) for each \(i \in \mathcal{I}\). Then, (23) and (26) imply that, for each \(i \in \mathcal{I}\), \(x^*_i = 0\) if \(y_i = 0\). However, since \(x^* = 0\) does not satisfy (24), there must exist \(i \in \mathcal{I}\) such that \(y_i \ne 0\).
Since \(1+\alpha ^*d_i > 0\) for each \(i \in \mathcal{I}\), by (23) we have that
In order to derive a contradiction, suppose that the system has a solution \((\bar{x},\bar{\alpha }) \ne (x^*,\alpha ^*)\). If \(\bar{\alpha } = \alpha ^*\), then \(\bar{x} = x^*\) by (27). Thus, we must have \(\bar{\alpha } \ne \alpha ^*\). We shall divide the proof into the cases where \(\bar{\alpha } > -1/\lambda _{\max }(D)\) and \(\bar{\alpha } = -1/\lambda _{\max }(D)\).
Case 1 Suppose that \(\bar{\alpha } > -1/\lambda _{\max }(D)\). Then, \(1+\bar{\alpha }d_i > 0\) for each \(i \in \mathcal{I}\) and, therefore,
By (24), we have that
If \(\bar{\alpha } < \alpha ^*\), then \(1 + \alpha ^*d_i> 1+\bar{\alpha }d_i > 0\) for each \(i \in \mathcal{I}\) and
If \(\bar{\alpha } > \alpha ^*\), then \(0< 1 + \alpha ^*d_i < 1+\bar{\alpha }d_i\) for each \(i \in \mathcal{I}\) and
In both cases we have that \(\bar{x}^{\top }D\bar{x} \ne 1\) and, therefore, \((\bar{x},\bar{\alpha })\) is not a solution to the system (23)–(25).
Case 2 Suppose that \(\bar{\alpha } = -1/\lambda _{\max }(D)\). Let \(\mathcal{I}^+ = \{i \in \mathcal{I} \mid d_i = \lambda _{\max }(D)\}\) and \(\mathcal{I}^- = \mathcal{I} \setminus \mathcal{I}^+\). By (23), we must have that \(y_i = 0\) for each \(i \in \mathcal{I}^+\), since \(1 + \bar{\alpha }d_i = 0\) for each \(i \in \mathcal{I}^+\). Thus, since \(\alpha ^* > -1/\lambda _{\max }(D)\), we must have that \(x_i^* = 0\) for each \(i \in \mathcal{I}^+\). Hence, since \({x^*}^{\top }Dx^* = 1\), there exists \(i \in \mathcal{I}^-\) such that \(x_i^* \ne 0\) and, consequently, \(y_i \ne 0\), since \(y_i = (1 + \alpha ^*d_i)x_i^*\). Since \(1+\bar{\alpha }d_i > 0\) for each \(i \in \mathcal{I}^-\), by (23) we have that
Therefore,
Thus,
that is, \((\bar{x},\bar{\alpha })\) is not a solution to the system (23)–(25).
Hence, \((x^*,\alpha ^*)\) is the only solution to the system (23)–(25). \(\square \)
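For a diagonal matrix, the system of the lemma collapses to a scalar equation in \(\alpha\): substituting \(x_i = y_i/(1+\alpha d_i)\) into \(x^{\top }Dx = 1\) gives \(\sum _i d_i y_i^2/(1+\alpha d_i)^2 = 1\), whose left-hand side is strictly decreasing in \(\alpha\) on \((-1/\lambda _{\max }(D), 0]\) when \(y \ne 0\), so the root is unique, in line with the lemma. An illustrative bisection (not the paper's code):

```python
import numpy as np

D = np.diag([1.0, 4.0])
d = np.diag(D)
y = np.array([0.3, 0.2])            # interior point: y^T D y = 0.25 < 1

def psi(alpha):
    """Frontier condition after eliminating x: psi(alpha) = 0."""
    return np.sum(d * y**2 / (1.0 + alpha * d)**2) - 1.0

# psi > 0 near -1/lambda_max(D) and psi(0) = y^T D y - 1 < 0, so bisect.
lo, hi = -1.0 / d.max() + 1e-12, 0.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if psi(mid) > 0.0:
        lo = mid
    else:
        hi = mid
alpha = 0.5 * (lo + hi)
x = y / (1.0 + alpha * d)           # recover x from (23)
print(np.isclose(x @ D @ x, 1.0))   # x lies on the frontier
print(-1.0 / d.max() < alpha <= 0.0)
```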
Proposition 4.2
Consider the ellipsoid \(\mathcal {E}= \{z \in \mathbb {R}^n \mid z^{\top } S z \le 1\}\), where \(S \in \mathbb {R}^{n \times n}\) is a symmetric and positive definite matrix. Given \(y \in \mathcal {E}\), there exists a unique \(\alpha \in [-1/\lambda _{\max }(S), 0]\) and there exists \(x \in \partial \mathcal {E}\) such that \(y = x + \alpha Sx\). Moreover, if \(\alpha \in (-1/\lambda _{\max }(S), 0]\) then \(x \in \partial \mathcal {E}\) is unique.
Proof
By Proposition 4.1, there exist \(x^* \in \partial \mathcal {E}\) and \(\alpha ^* \in [-1/\lambda _{\max }(S), 0]\) such that
Suppose that \(\alpha ^* \in (-1/\lambda _{\max }(S), 0]\). We shall prove that there do not exist \(\bar{x} \in \partial \mathcal {E}\) and \(\bar{\alpha } \in [-1/\lambda _{\max }(S), 0]\) such that \(y = \bar{x} + \bar{\alpha } S\bar{x}\) and \((\bar{x}, \bar{\alpha }) \ne (x^*, \alpha ^*)\). In order to derive a contradiction, suppose that there exist \(\bar{x} \in \partial \mathcal {E}\) and \(\bar{\alpha } \in [-1/\lambda _{\max }(S), 0]\) such that
and \((\bar{x}, \bar{\alpha }) \ne (x^*, \alpha ^*)\).
Since S is symmetric, there exist an orthogonal matrix \(Q \in \mathbb {R}^{n \times n}\) and a diagonal matrix \(D \in \mathbb {R}^{n \times n}\) formed by the eigenvalues of S such that \(S = QDQ^{\top }\) and \(\lambda _{\max }(S) = \lambda _{\max }(D)\) (see, for example, Theorem 8.1.1 in [26]). Consider the ellipsoid \(\mathcal {E}' = \{z \in \mathbb {R}^n \mid z^{\top } D z \le 1\}\).
Then, \(Q^{\top }y \in \mathcal {E}'\), \(Q^{\top }x^* \in \partial \mathcal {E}'\), \(Q^{\top }\bar{x} \in \partial \mathcal {E}'\). Moreover, since \(\lambda _{\max }(S) = \lambda _{\max }(D)\), we have that \(\alpha ^* \in (-1/\lambda _{\max }(D), 0]\) and \(\bar{\alpha } \in [-1/\lambda _{\max }(D),0]\). By left multiplying both sides of Eqs. (28) and (29) by Q, we obtain
and
By Lemma 4.2, if \(\bar{\alpha } = \alpha ^*\), we must have \(Q^{\top }\bar{x} = Q^{\top }x^*\) and, consequently, \(\bar{x} = x^*\), which contradicts the hypothesis that \((\bar{x}, \bar{\alpha }) \ne (x^*, \alpha ^*)\). If \(\bar{\alpha } \ne \alpha ^*\), then (30) and (31) contradict Lemma 4.2.
Hence, if \(\alpha ^* \in (-1/\lambda _{\max }(S), 0]\), then the system
has a unique solution. \(\square \)
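The decomposition of Proposition 4.2 can also be computed numerically. The sketch below works in the eigenbasis of \(S\) (so that \(S\) becomes the diagonal matrix \(D\) with positive entries \(d_i\)) and finds \(\alpha\) by bisection on the scalar equation \(x(\alpha)^{\top} D x(\alpha) = 1\), where \(x_i(\alpha) = y_i/(1+\alpha d_i)\). The function name, tolerance, and bisection scheme are our own illustrative choices, not part of the paper's implementation.

```python
import numpy as np

def boundary_decomposition(y, d, tol=1e-12):
    # Given y in the ellipsoid {z : sum(d * z**2) <= 1}, with d the positive
    # eigenvalues of S expressed in its eigenbasis, find alpha in
    # (-1/max(d), 0] and a boundary point x such that y = x + alpha * d * x.
    # Sketch only: assumes the solution satisfies 1 + alpha * d_i > 0, so
    # that x_i = y_i / (1 + alpha * d_i) as in Case 1 of the proof.
    d = np.asarray(d, dtype=float)
    y = np.asarray(y, dtype=float)

    def phi(alpha):
        # phi(alpha) = x(alpha)^T D x(alpha) - 1; decreasing on the bracket
        return np.sum(d * (y / (1.0 + alpha * d)) ** 2) - 1.0

    if phi(0.0) >= 0.0:              # y already lies on the boundary
        return y, 0.0
    lo = -1.0 / d.max() + 1e-12      # phi -> +infinity near -1/lambda_max
    hi = 0.0                         # phi(0) < 0 for an interior point
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    x = y / (1.0 + alpha * d)
    return x, alpha
```

For a general \(S = QDQ^{\top}\), one would apply the routine to \(Q^{\top}y\) and map the result back with \(Q\), exactly as in the proof of Proposition 4.2.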
Equation (22) implies that
Since \({x^*}^{\top } S_{ij} x^* = 1\), this implies that
Therefore, since \(c_{i}^{ij} = P_i^{-\frac{1}{2}}Q_i^{\top }(c_i-c_j)\), any solution \(x^*\) to problem (21) together with the corresponding Lagrange multiplier \(\mu ^*\) satisfy
Thus, if we take \(\mathcal {X}_{ij}\) as a solution to problem (21) and \(\mathcal {U}_{ij}\) as the corresponding Lagrange multiplier, the constraints (18) become
Thus, the variables \(x_{ij}\) and \(\mu _{ij}\) cease to be part of the non-overlapping model. In (32), we define \(\mathcal {X}_{ij}\) to be a mapping whose value is a solution to problem (21) and \(\mathcal {U}_{ij}\) is the function whose value is the Lagrange multiplier associated with the value of \(\mathcal {X}_{ij}\).
4.3 The model
To make it clearer that \(x_{ij}\) and \(\mu _{ij}\) are no longer variables of the model but functions of the centers and rotation angles of the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\), we shall rewrite (32) in the following way:
where
\(\mathcal {X}(c_i,c_j,\Omega _i,\Omega _j; P_i, P_j)\) is a solution to the problem
and \(\mathcal {U}(c_i,c_j,\Omega _i,\Omega _j; P_i, P_j)\) is the Lagrange multiplier associated with this solution.
This new non-overlapping model has \(m-1\) constraints given by (33) and its variables are the centers of the ellipsoids (\(c_i \in \mathbb {R}^n\) for each \(i \in \{1,\dots ,m\}\)) and their rotation angles (\(\Omega _i \in \mathbb {R}^q\) for each \(i \in \{1,\dots ,m\}\)). Therefore, the numbers of variables and constraints of this model are linear in the number of ellipsoids to be packed. The following lemma shows that the constraints (33) indeed constitute a non-overlapping model.
Lemma 4.3
The constraints (33) are satisfied if and only if the ellipsoids do not overlap.
Proof
Firstly, notice that the function \(f(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\) is nonnegative at every point. Let \(i,j \in \{1,\dots ,m\}\) be such that \(i < j\). Suppose that the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) do not overlap. Thus, we have that the ellipsoids \(\mathcal {E}_i^{ij}\) and \(\mathcal {E}_j^{ij}\) do not overlap either. So, the distance from the center \(c_{i}^{ij}\) of the unit-radius ball \(\mathcal {E}_i^{ij}\) to the ellipsoid \(\mathcal {E}_j^{ij}\) is at least one. Since in this case \(c_{i}^{ij} \notin \mathcal {E}_j^{ij}\), we have that \(\mathcal {X}(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\) is the projection of \(c_{i}^{ij}\) onto the ellipsoid \(\mathcal {E}_j^{ij}\). Therefore, recalling that \(c_{i}^{ij} = P_i^{-\frac{1}{2}}Q_i^{\top }(c_{i} - c_{j})\), we have
Consequently, we have that
Since \(\mathcal {X}(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\) is the solution to problem (21) and \(\mathcal {U}(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\) is the Lagrange multiplier associated with this solution and satisfies (22), we have that
by Proposition 3.1. By Proposition 3.3, we have that
Therefore,
Hence, if the ellipsoids do not overlap, \(f(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\) vanishes.
Suppose that there exist two ellipsoids that overlap. Let \(i,j \in \{1,\dots ,m\}\) such that \(i < j\) be the indices of those ellipsoids. Let \(c_{i}^{ij}\) be the center of the ball \(\mathcal {E}_i^{ij}\), that is, \(c_{i}^{ij} = P_i^{-\frac{1}{2}}Q_i^{\top }(c_{i} - c_{j})\). Let us consider two cases. Suppose that \(c_{i}^{ij} \notin \text {int}(\mathcal {E}_j^{ij})\). In this case, \(\mathcal {X}(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\) is the projection of \(c_{i}^{ij}\) onto ellipsoid \(\mathcal {E}_j^{ij}\). Since \(\mathcal {E}_i\) and \(\mathcal {E}_j\) overlap, we have that the ellipsoids \(\mathcal {E}_i^{ij}\) and \(\mathcal {E}_j^{ij}\) also overlap. Thus,
Therefore,
Now, suppose that \(c_{i}^{ij} \in \text {int}(\mathcal {E}_j^{ij})\). Since \(c_{i}^{ij} \in \text {int}(\mathcal {E}_j^{ij})\) and \(\mathcal {X}(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j) \in \partial \mathcal {E}_j^{ij}\), by (22) and by Proposition 3.2, we must have \(\mathcal {U}(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j) \le 0\). Moreover, since \(c_{i}^{ij} \ne \mathcal {X}(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\), we must have \(\mathcal {U}(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j) \ne 0\). Consequently,
Then,
Therefore, if the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) overlap, \(f(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\) is strictly positive.
\(\square \)
We now show that the overlapping measure f of two ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) given by (34) does not depend on the scaling of the ellipsoids.
Definition 1
Consider the set \(\mathcal {E}\subseteq \mathbb {R}^n\). For each \(\nu \in \mathbb {R}_{++}\), we define \(\nu \mathcal {E}= \{\nu x \mid x \in \mathcal {E}\}\).
Consider the ellipsoid \(\mathcal {E}_i = \{ x \in \mathbb {R}^n \mid (x - c_i)^{\top }Q_iP_i^{-1}Q_i^{\top } (x-c_i) \le 1\}\). Let \(\nu \in \mathbb {R}_{++}\). According to Definition 1, we have
Thus, \(\nu \mathcal {E}_{i}\) is an ellipsoid defined by the tuple \((\nu c_i,\Omega _i,\nu ^2 P_i)\). It is centered at \(\nu c_i\), its semi-axis lengths form the diagonal of \(\nu P_i^{\frac{1}{2}}\), and it has the same orientation as ellipsoid \(\mathcal {E}_{i}\).
Lemma 4.4
The function defined in (34) is invariant with respect to the scaling of the ellipsoids. That is, \(f(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j) = f(\nu c_i,\nu c_j,\Omega _i,\Omega _j;\nu ^2P_i,\nu ^2P_j)\) for each \(\nu \in \mathbb {R}_{++}\).
Proof
Consider the ellipsoids \(\mathcal {E}_i = \{ x \in \mathbb {R}^n \mid (x - c_i)^{\top }Q_iP_i^{-1}Q_i^{\top } (x-c_i) \le 1\}\) and \(\mathcal {E}_j = \{ x \in \mathbb {R}^n \mid (x - c_j)^{\top }Q_jP_j^{-1}Q_j^{\top } (x-c_j) \le 1\}\). Let \(\nu \in \mathbb {R}_{++}\). According to Definition 1, we have that
Notice that the constant \(\epsilon _{ij}\) given by Proposition 3.3 for the pair of ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) is the same for the pair of ellipsoids \(\nu \mathcal {E}_i\) and \(\nu \mathcal {E}_j\), since
Thus, we have that
Let \((\nu \mathcal {E}_i)^{ij}\) be the set obtained by applying the transformation \(T_{ij}\) to the ellipsoid \(\nu \mathcal {E}_i\), that is,
Thus, \((\nu \mathcal {E}_i)^{ij} = \mathcal {E}_i^{ij}\). Similarly, we obtain \((\nu \mathcal {E}_j)^{ij} = \mathcal {E}_j^{ij}\). Therefore, since \(\mathcal {X}(c_i, c_j,\Omega _i,\Omega _j;P_i,P_j)\) is a projection of the center of \(\mathcal {E}_i^{ij}\) onto the frontier of \(\mathcal {E}_j^{ij}\) and \(\mathcal {X}(\nu c_i, \nu c_j,\Omega _i,\Omega _j;\nu ^2 P_i,\nu ^2 P_j)\) is a projection of the center of \((\nu \mathcal {E}_i)^{ij}\) onto the frontier of \((\nu \mathcal {E}_j)^{ij}\), \((\nu \mathcal {E}_i)^{ij} = \mathcal {E}_i^{ij}\) and \((\nu \mathcal {E}_j)^{ij} = \mathcal {E}_j^{ij}\) imply that
by Proposition 4.2, and it is possible to take
Consequently,
Hence, the function defined in (34) is invariant with respect to the scaling of the ellipsoids. \(\square \)
Let \(c_{i}^{ij} \in \mathbb {R}^n\) and let \(S_{ij} \in \mathbb {R}^{n \times n}\) be a symmetric and positive definite matrix. We have defined \(\mathcal {X}\) to be the mapping that returns a solution to problem (35) and \(\mathcal {U}\) to be the mapping that returns the Lagrange multiplier associated with that solution. By Proposition 4.2, any solution to problem (35) is associated with the same Lagrange multiplier. Therefore, no matter which optimal solution to problem (35) is returned by the mapping \(\mathcal {X}\), \(\mathcal {U}\) behaves like a function.
If \(c_{i}^{ij} \notin \text {int}(\mathcal {E}_j^{ij})\) then problem (35) has only one solution. On the other hand, if \(c_{i}^{ij} \in \text {int}(\mathcal {E}_j^{ij})\) then problem (35) may have multiple optimal solutions. Figure 2a illustrates some of those cases. This picture shows an ellipse with semi-axis lengths a and b with \(a > b\). The projection of the point \(y^1\) onto the frontier of the ellipse is uniquely determined: the point \(x^1\). However, for any point \(y^2\) in the set \(\{y \in \mathbb {R}^2 \mid b^2/a - a< y_1 < a - b^2/a, y_2 = 0\}\), there are two projections: \(\bar{x}^2\) and \(\underline{x}^2\). If the ellipse is a circle, that is, \(a = b\), then every point in the frontier of the circle is a projection of the center of the circle onto the frontier. Even if we were able to determine a single solution to be returned by \(\mathcal {X}\) in the case in which multiple solutions exist (which is an easy task, since it can be accomplished by tackling (35) with a deterministic nonlinear programming solver), \(\mathcal {X}\) would be discontinuous in those cases (see Fig. 2b). In any case, since the set of points for which problem (35) has multiple optimal solutions has zero measure, this does not appear to be an issue in practice.
Since the constraints (33) depend on \(\mathcal {X}\) and \(\mathcal {U}\) whose values are given by the solution of an optimization problem, the computation of their derivatives (at the points where they are continuous and differentiable) is nontrivial. In “Appendix”, we show how to compute the first and second order derivatives of the functions that define the constraints (33).
5 Evaluation of the constraints
5.1 Evaluation of the overlapping measure
In order to evaluate the constraints (33) at a given point, we need to find the values of \(\mathcal {X}(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\) and \(\mathcal {U}(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\). As we have seen, if the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) do not overlap, then \(f(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\) in the summation in (33) is zero. Thus, if we know that the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) do not overlap, then we do not need to evaluate the functions \(\mathcal {X}\) and \(\mathcal {U}\).
Suppose that we do not know whether the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) overlap. In this case, we need to compute the values of \(\mathcal {X}(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\) and \(\mathcal {U}(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\). For this, we need to solve the problem
where \(c_{i}^{ij} = P_i^{-\frac{1}{2}}Q_i^{\top }(c_i-c_j)\). By performing the change of variable \(w = S_{ij}^{\frac{1}{2}}x\), this problem is equivalent to the problem
The objective function of problem (36) is a convex quadratic function and the feasible set is the frontier of the unit-radius ball centered at the origin. This problem can be numerically solved by the algorithm proposed in [36].
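In the paper, problem (36) is solved with the algorithm proposed in [36]. As an illustration only, the sketch below solves the same kind of subproblem (a convex quadratic \(\frac{1}{2}w^{\top}Aw - b^{\top}w\) minimized over the unit sphere) by a generic secular-equation approach in the spirit of trust-region subproblem solvers; it is not the method of [36], and the degenerate "hard case" is not handled.

```python
import numpy as np

def min_quadratic_on_sphere(A, b, tol=1e-12):
    # Minimize 0.5 * w^T A w - b^T w subject to ||w|| = 1, for symmetric
    # positive definite A, by solving the secular equation
    # ||(A + mu I)^{-1} b|| = 1 for mu > -lambda_min(A).  Illustrative
    # sketch; the "hard case" (b orthogonal to the eigenspace of the
    # smallest eigenvalue) is not handled.
    lam, V = np.linalg.eigh(A)           # A = V diag(lam) V^T
    beta = V.T @ b                       # b in the eigenbasis of A

    def wnorm(mu):
        return np.sqrt(np.sum((beta / (lam + mu)) ** 2))

    lo = -lam.min() + 1e-12              # ||w(mu)|| -> infinity at -lambda_min
    hi = max(1.0, np.linalg.norm(b))     # ||w(mu)|| < 1 once mu >= ||b||
    while wnorm(hi) > 1.0:               # safety: enlarge the bracket if needed
        hi = 2.0 * hi + 1.0
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        if wnorm(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    mu = 0.5 * (lo + hi)
    w = V @ (beta / (lam + mu))
    return w / np.linalg.norm(w), mu
```

In the setting of problem (36), one would take \(A = S_{ij}^{-1}\) and \(b = S_{ij}^{-\frac{1}{2}} c_{i}^{ij}\) and recover the projection as \(x = S_{ij}^{-\frac{1}{2}} w\); this correspondence is our reading of the change of variable \(w = S_{ij}^{\frac{1}{2}}x\) and should be checked against [36].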
5.2 Efficient evaluation of the constraints
The total number of terms present in the constraints (33) is \(O(m^2)\). However, most of these terms do not need to be computed when the constraints are evaluated at a point that is almost feasible, that is, a point at which most of the ellipsoids do not overlap each other. In a feasible solution, only a constant number of ellipsoids may touch a given ellipsoid. For example, suppose that the ellipsoids are identical balls. In the two-dimensional case, at most six balls can touch a given ball; in the three-dimensional case, this number is twelve. For identical ellipsoids, the number of ellipsoids that can touch a given one depends on their eccentricities. In any case, in an (almost) feasible solution, only O(m) terms need to be evaluated.
If the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) do not overlap, then the term associated with this pair of ellipsoids in the summation (33) is zero and does not need to be evaluated. A sufficient condition for the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) not to overlap is that their enclosing balls do not overlap. An enclosing ball of a set is a ball that contains that set. The minimal enclosing ball of \(\mathcal {E}_i\) is the ball with radius \(r_i = \lambda _{\max }(P_i^{\frac{1}{2}})\) centered at \(c_i\). Therefore, if
then the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) do not overlap. Let
Thus, every ellipsoid \(\mathcal {E}_i\) is contained in a ball with radius R centered at \(c_i\). Therefore, if
then the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) do not overlap.
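The enclosing-ball test can be stated in a few lines. The sketch below assumes, as in the paper's instances, that each \(P^{\frac{1}{2}}\) is diagonal with the semi-axis lengths on its diagonal, so that \(\lambda _{\max }(P^{\frac{1}{2}})\) is simply the largest diagonal entry; the function name is illustrative.

```python
import numpy as np

def surely_disjoint(ci, cj, P_sqrt_i, P_sqrt_j):
    # Sufficient (not necessary) condition for E_i and E_j not to overlap:
    # their minimal enclosing balls are disjoint.  Assumes P^{1/2} is
    # diagonal with the semi-axis lengths on its diagonal, so that
    # lambda_max(P^{1/2}) is the largest entry.
    ri = float(np.max(np.diag(P_sqrt_i)))   # radius of the ball enclosing E_i
    rj = float(np.max(np.diag(P_sqrt_j)))   # radius of the ball enclosing E_j
    ci = np.asarray(ci, dtype=float)
    cj = np.asarray(cj, dtype=float)
    return np.linalg.norm(ci - cj) >= ri + rj
```

When the test returns `True`, the corresponding term of (33) is certainly zero and need not be evaluated; when it returns `False`, the ellipsoids may or may not overlap.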
The method described here to identify the pairs of ellipsoids that do not meet the condition (37) has been used in other works, such as [15, 37], to identify pairs of balls that may overlap. Let \(l = 2R\) and consider a hypercube with edge length L that contains the container. This hypercube can be covered by \({\lceil L/l \rceil }^n\) hypercubes with edge length l whose interiors are mutually disjoint. We refer to each of these hypercubes with edge length l as a region. Two regions are adjacent if they share at least one vertex. Suppose that \(\mathcal {E}_i\) and \(\mathcal {E}_j\) have their centers in non-adjacent regions. In this case, since each region is a hypercube with edge length 2R, we must have \(\left\| c_i - c_j\right\| \ge 2R\), that is, the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) do not overlap. Therefore, if two ellipsoids are in non-adjacent regions, they do not overlap. If two ellipsoids are in the same region or in adjacent regions, they may or may not overlap. Hence, considering all the terms that appear in the constraints (33), we can evaluate only those that are associated with pairs of ellipsoids that lie in the same region or in adjacent regions.
Each ellipsoid can be assigned to a region in constant time based on its center. The region of the ellipsoid \(\mathcal {E}_i\) is defined as the tuple
where
and \(N_\mathrm{reg} = {\lceil L/l \rceil }\). The method to determine which pairs of ellipsoids should be considered works as follows. First, an n-dimensional array with \(N_\mathrm{reg}\) entries for each dimension is created. Each element of this array is associated with a region and stores a list with the indices of the ellipsoids that belong to that region. This structure can be constructed in O(m) time. Also, there is a list of the non-empty regions (regions that contain at least one ellipsoid). This list is also constructed in O(m) time. Then, for each non-empty region and for each ellipsoid \(\mathcal {E}_i\) in that region, the term \(f(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\) associated with the ellipsoids \(\mathcal {E}_i\) and \(\mathcal {E}_j\) is computed for each ellipsoid \(\mathcal {E}_j\) in that region and in adjacent regions. Considering the case where all the ellipsoids have approximately the same size, each region will contain only a constant number of ellipsoids in an almost feasible solution. In this case, the algorithm runs in O(m) time.
6 Solvability of the proposed model
In the present paper, solutions to large-sized instances of ellipsoids’ packing problems are being sought. Nonlinear programming models (that involve a quadratic number of variables and constraints) for several variants of the problem of packing ellipsoids have already been introduced in the literature [8, 30,31,32,33, 38, 40]. Numerical experiments in all the mentioned papers show that state-of-the-art deterministic global optimization methods are able to find global solutions with a certificate of optimality for instances (of the models with a quadratic number of variables and constraints) with only 3 (three) ellipsoids in the two- and three-dimensional spaces. In all the papers, the authors highlight how fast the difficulty of the problems increases with the number of ellipsoids. In [32], it is suggested that global optimization strategies could become prohibitively expensive for an arbitrarily increasing number of ellipsoids and that large-sized instances might be tackled with heuristic approaches.
A non-overlapping model with a linear number of variables and constraints was introduced in Sect. 4. A procedure for evaluating the model in linear expected cost was described in Sect. 5. Those two ingredients overcome the quadratic cost inherent to all the previously introduced models. However, considering the poor performance reported in the literature, this improvement is not enough to expect that state-of-the-art deterministic global optimization solvers would be able to tackle the large-sized instances that are the focus of the present work.
In any case, assuming that applying existing deterministic global optimization tools to a given problem requires the analytical expressions of the problem to be written in an algebraic modelling language, two characteristics of the proposed non-overlapping model hinder the application of those tools to a packing problem based on it. First, the non-overlapping constraints (33) are constituted, as a whole, by the summation of a quadratic number of non-negative terms. Their efficient evaluation consists in determining, at linear expected cost, which terms may be strictly positive, discarding those that are certainly null. This procedure (described in Sect. 5) uses complex data structures and algorithms and, therefore, to the best of the authors' knowledge, it cannot be efficiently expressed in an algebraic modelling language. Second, once the terms that need to be evaluated have been identified, their evaluation requires the computation of implicit variables that are the (primal and dual) solutions to nonlinear optimization problems of the form (35). Since those subproblems are solved numerically, no analytical expressions for the implicit variables are available and, once again, this situation cannot be handled by an algebraic modelling language.
For the reasons given in the previous paragraphs, we concluded that the adequate tool for enhancing the probability of finding good quality solutions to large-sized instances of ellipsoids' packing problems is a stochastic multi-start global optimization method. The considered method has two main ingredients: (a) specially devised random initial guesses and (b) a robust local solver developed with the intention of actively pursuing good quality local minimizers. The considered local solver, as well as the ad hoc initial guesses, which depend on the packing problem being considered, are described in detail in the following section.
7 Numerical experiments
In this section, we present some experiments to show that the non-overlapping model (33) with implicit variables introduced in the present work can be used to solve instances larger than the ones solved by the transformation-based model (8)–(11) introduced in [8]. Since the former model can be solved faster (by "solved" we mean that a stationary point can be found), it opens the possibility of using a multi-start strategy to potentially obtain better quality solutions. On the one hand, this strategy could not be applied with the original transformation-based model for medium- or large-sized instances due to the excessive amount of time spent to perform a single local minimization starting from a given initial guess. For example, a single local minimization of the problem of packing 100 three-dimensional ellipsoids within a minimum volume ball took almost 20 h in the numerical experiments reported in [8]. On the other hand, the model with implicit variables is not suitable for small-sized instances because of the overhead of evaluating the non-overlapping constraints, which were designed for medium- and large-sized instances; in that case, the models introduced in [8] should be preferred. For a given ellipsoids' packing problem, the decision on whether to use the model introduced in the present work or the model proposed in [8] depends on how different the ellipsoids being packed are and on their number. If there are O(m) ellipsoids that are much smaller than the largest ellipsoid, the "efficient evaluation" of the constraints (33) described in Sect. 5.2 may cost \(O(m^2)\). If this is not the case, numerical experiments in [34] (not included here for lack of space) suggest that \(m \ge 50\) is the rule of thumb for using the model with implicit variables introduced in the present work.
We have implemented the two- and three-dimensional versions of the introduced non-overlapping model with implicit variables in Fortran 2003, as well as the optimization procedure. To solve the nonlinear programming problems, we used Algencan [1, 11] version 3.0.0, which is available for download at the TANGO Project web page (http://www.ime.usp.br/~egbirgin/tango/). The models, the optimization procedure, and Algencan were compiled with the GNU Fortran compiler (GCC) 4.7.2 with the -O3 option enabled. The experiments were run on a 2.4 GHz Intel® Core™ i7-3770 with 16 GB of RAM running the Debian GNU/Linux 7.8 (Linux version 3.2.0-4-amd64) operating system. Our computer implementation and the solutions reported in this section are freely available at http://www.ime.usp.br/~lobato/.
Algencan is an augmented Lagrangian method for nonlinear programming that solves the bound-constrained subproblems using Gencan [2, 9, 10], an active-set method for bound-constrained minimization. Gencan adopts the leaving-face criterion described in [9], which employs the spectral projected gradients defined in [12, 13]. For the internal-to-the-face minimization, Gencan uses an unconstrained algorithm that depends on the dimension of the problem and the availability of second-order derivatives. For small problems with available Hessians, a Newtonian trust-region approach is used (see [2]); for medium- and large-sized problems with available Hessians, a Newtonian line-search method that combines backtracking and extrapolation is used. When second-order derivatives are not available, each step of Gencan computes the direction inside the face using a line-search truncated-Newton approach with incremental quotients to approximate the matrix-vector products and memoryless BFGS preconditioners (this is the case of the problems considered in the present section, for which only first-order derivatives were coded). In all the experiments described in the present section, the local solver Algencan was run using its default parameters, while the feasibility and optimality tolerances \(\varepsilon _{\text{ feas }}\) and \(\varepsilon _{\text{ opt }}\) (parameters that must be provided by the user) were both set to \(10^{-4}\). Those tolerances, related to the stopping criteria, are used to determine whether a solution (stationary point) to the optimization problem being solved has been found. See [11, pp. 116–117] for details.
Although Algencan is a local nonlinear programming solver, it was designed in such a way that global minimizers of subproblems are actively pursued, independently of the fulfillment of approximate stationarity conditions in the subproblems. In other words, Algencan's subproblem solvers always try to find the lowest possible function values, even when this is not necessary for obtaining approximate local minimizers. As a consequence, the practical behavior of Algencan is usually well explained by the properties of its global-optimization counterparts [6]. The "preference for global minimizers" of Algencan has been discussed in [1]. This has also been observed in papers devoted to numerical experiments concerning Algencan and other solvers (see, for example, [27, 28] and the references therein). This does not mean at all that Algencan is always able to find global minimizers.
7.1 Two-dimensional packing
In this section, we consider the problem of packing ellipses within a minimum area container. Each instance is defined by the lengths of the semi-axes of the m ellipses to be packed and the type of the container. For illustration purposes, we have considered circles, ellipses, and rectangles as containers. We used the models introduced in [8] for modelling the confinement of ellipses within circles, ellipses, and rectangles. Assuming that the lengths of semi-axes of the i-th ellipse are \(a_i\) and \(b_i\), for each \(i \in I\), \(P_i^{\frac{1}{2}}\) is the diagonal matrix whose diagonal entries are \(a_i\) and \(b_i\). For each \(i \in I\), the rotation matrix for the i-th ellipse is given by (1). Therefore, the variables of the problem that determine the center and orientation of the i-th ellipse are \(c_i \in \mathbb {R}^2\) and \(\theta _i \in \mathbb {R}\), respectively, for \(i \in I\). When the container is a circle, the additional variable r determines the radius of the circular container. When the container is an ellipse, the additional variables a and b determine the lengths of semi-axes of the elliptical container. When the container is a rectangle, the additional variables l and w determine the length and width of the rectangular container. The containment model may have additional variables depending on the type of the container. For more details about the containment models, see [8].
7.1.1 Identical ellipses
Since this is a non-convex optimization problem, we employ a multi-start strategy in order to enhance the probability of finding good quality solutions. At each iteration of the multi-start strategy, an initial solution is constructed and then the NLP local solver Algencan is used to solve the problem starting from this initial solution. In the initial solution of the first iteration, the ellipses to be packed are not rotated and their centers are arranged in a generalisation of the hexagonal lattice for circles (see Fig. 3a). For each of the subsequent iterations, the initial solution is given by a random perturbation of the most recent feasible solution found. The centers and rotation angles of the ellipses are perturbed by at most 2.5%. A CPU time limit of 48 h for running the multi-start strategy on each considered instance was imposed.
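The multi-start loop described above can be sketched as follows. This is only a skeleton under assumptions of our own: `local_solve`, `objective`, and `is_feasible` are hypothetical placeholders standing in for the NLP solver (Algencan in the paper) and the packing model, and the structured first guess is supplied by the caller.

```python
import numpy as np

def multi_start(initial_guess, local_solve, objective, is_feasible,
                max_iters=100, rel_pert=0.025, rng=None):
    # Skeleton of the multi-start strategy of Sect. 7.1.1: the first local
    # minimization starts from a structured arrangement; every subsequent
    # one starts from the most recent feasible solution with each variable
    # perturbed by at most 2.5% (rel_pert).  local_solve, objective, and
    # is_feasible are placeholders; all names here are illustrative.
    if rng is None:
        rng = np.random.default_rng(0)
    current = np.asarray(initial_guess, dtype=float)
    last_feasible = current
    best, best_val = None, float("inf")
    for _ in range(max_iters):
        sol = np.asarray(local_solve(current), dtype=float)
        if is_feasible(sol):
            last_feasible = sol
            val = objective(sol)
            if val < best_val:
                best, best_val = sol, val
        # perturb the latest feasible point by at most rel_pert (2.5%)
        factors = 1.0 + rel_pert * rng.uniform(-1.0, 1.0,
                                               size=last_feasible.shape)
        current = last_feasible * factors
    return best, best_val
```

In practice the loop would be stopped by a wall-clock limit (48 h in the experiments) rather than by a fixed iteration count.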
Table 1 shows the results obtained for the problem of packing \(m \in \{100, 200, 300, 400, 500,\) \(1000 \}\) identical ellipses with semi-axis lengths 2 and 1 within a minimum area ellipse. The first column shows the number m of ellipses packed. The second and third columns show the area of the container and the density of the packing in the best solution found, respectively. Areas and densities are rounded to 5 decimal places (results up to the machine precision can be found in http://www.ime.usp.br/~lobato/). The fourth and fifth columns show the number of local minimizations and the total time spent until the best solution was found, respectively. The sixth column shows the total number of (completed) local minimizations performed within the imposed CPU time limit and the last column shows the average time per local minimization.
Figure 3 shows graphical representations of the initial and the best solutions found for the problem of packing 100 ellipses within a minimum area ellipse. Figure 3a depicts the initial solution of the first iteration. Figure 3b represents the initial solution of the 542nd iteration and Fig. 3c shows the solution found in the 542nd local minimization. Figure 4 illustrates the best solutions found for the problem of packing m ellipses, for \(m \in \{200, 300, 400, 500, 1000\}\), within a minimum area ellipse.
The results we obtained for the problem of packing identical ellipses within a minimum area rectangle are shown in Table 2. Figure 5 depicts the best solutions found within the imposed CPU time limit.
7.1.2 Non-identical ellipses
We also dealt with the problem \(\mathcal {P}_C\) of packing m non-identical ellipses within a minimum area circle. We considered an instance formed by \(m=231\) ellipses with mutually different pairs of semi-axis lengths belonging to the set \(\{(1 + 0.2\alpha , 1 + 0.2\beta ) \mid \alpha ,\beta \in \{0,1,\dots ,20\}, \alpha \le \beta \}\). In order to create an initial solution to this problem, we considered another problem, called \(\mathcal {P}_S\), whose solution will serve as a starting guess to problem \(\mathcal {P}_C\). Problem \(\mathcal {P}_S\) is the problem of packing the ellipses without overlap so that the sum of the squared distances of the ellipses’ centers to the origin of the Cartesian coordinate system is minimum. For solving problem \(\mathcal {P}_S\), we considered a multi-start strategy similar to the one used for solving the problem of packing ellipses within a minimum area ellipse. We imposed a CPU time limit of 1 h to solve problem \(\mathcal {P}_S\). Figure 6a depicts the solution found to this problem within the given time limit. This is the first initial solution to problem \(\mathcal {P}_C\). (The circle that appears around the ellipses in Fig. 6a is not related to problem \(\mathcal {P}_S\) but it is part of the initial solution for problem \(\mathcal {P}_C\).) For problem \(\mathcal {P}_C\), the same multi-start strategy was used and Fig. 6b shows the best solution found after 48 h (and 135 local minimizations). In this solution, the container has radius 49.39860, approximately.
7.2 Three-dimensional packing
In this section, we consider the problems of packing ellipsoids within a minimum volume ball and within a minimum volume cuboid. We used the models introduced in [8] for modelling the confinement of ellipsoids within balls and cuboids. We deal with instances formed by \(m \in \{100,200,300,400,500\}\) identical ellipsoids with semi-axis lengths \(l_1 = 1\), \(l_2 = 0.75\), and \(l_3 = 0.5\). For each \(i \in I\), \(P_i^{\frac{1}{2}}\) is the diagonal matrix whose diagonal entries are \(l_1\), \(l_2\), and \(l_3\). For each \(i \in I\), the rotation matrix for the i-th ellipsoid is given by (2). As in the two-dimensional experiments, we use a multi-start strategy in order to find good quality solutions. The initial solution is defined as follows. The ellipsoids are not rotated and their centers correspond to m points in the set \(\{(\delta _1 l_1, \delta _2 l_2, \delta _3 l_3) \mid \delta _1, \delta _2, \delta _3 \in \mathbb {Z}\}\) that are closest to the origin. In this way, the ellipsoids do not overlap in the initial solution. The containers (ball and cuboid) in the initial solution are the smallest ones that contain the enclosing balls of each ellipsoid. For each of the subsequent iterations, the initial solution is given by a random perturbation of the most recent feasible solution found. The centers and rotation angles of the ellipsoids were perturbed by at most 10%.
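The construction of the initial centers can be sketched as follows. The sketch is our own illustration, not the paper's code: we take the m points of a rectangular lattice closest to the origin, and with the assumed `spacing_factor = 2` the center-to-center distance along each coordinate axis is twice the corresponding semi-axis, which guarantees that unrotated identical ellipsoids placed at these points are pairwise disjoint; `spacing_factor` and the enumeration window `span` are illustrative parameters.

```python
import numpy as np
from itertools import product

def lattice_initial_centers(m, semi_axes, spacing_factor=2.0, span=6):
    # Initial centers for the three-dimensional multi-start: the m points of
    # a rectangular lattice closest to the origin.  With spacing_factor = 2,
    # in coordinates rescaled by the semi-axes the ellipsoids become unit
    # balls with centers at least 2 apart, hence pairwise disjoint.  span is
    # assumed large enough for the window to contain the m closest points.
    step = spacing_factor * np.asarray(semi_axes, dtype=float)
    grid = np.array(list(product(range(-span, span + 1), repeat=3)),
                    dtype=float) * step
    order = np.argsort(np.linalg.norm(grid, axis=1), kind="stable")
    return grid[order[:m]]
```

The containers of the first iteration would then be chosen as the smallest ball and cuboid containing the enclosing balls of the ellipsoids placed at these centers.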
Table 3 exhibits the results for the minimization of the volume of the spherical container, while Table 4 presents the results for the minimization of the volume of the cuboidal container. The first column shows the number m of packed ellipsoids. The second and third columns show, respectively, the volume of the container and the packing density in the best solution found. The fourth and fifth columns show the number of local minimizations (number of multi-start iterations) and the total time required to find that solution. The sixth and seventh columns show the number of (completed) local minimizations and the average time per local minimization within the CPU time limit of 48 h.
Figure 7 depicts some solutions that appeared during the multi-start procedure for the problem of packing \(m=100\) identical ellipsoids within a minimum volume ball. Figure 7a illustrates the first initial solution that was constructed for this problem. Figure 7b shows the initial solution for the 369th iteration and Fig. 7c displays the solution found at iteration 369. Figure 8 illustrates the solutions for the problem of packing \(m \in \{200,300,400,500\}\) identical ellipsoids within a minimum volume ball; while Fig. 9 exhibits the solutions for the problem of packing \(m \in \{100,200,300,400,500\}\) identical ellipsoids within a minimum volume cuboid.
8 Concluding remarks
To the best of the authors’ knowledge, all published nonlinear programming models for avoiding the overlapping between ellipsoids have a number of variables and constraints that is quadratic in the number of ellipsoids to be packed. Therefore, when the number of ellipsoids is relatively large, solving these models becomes a computationally prohibitive task. In order to alleviate this shortcoming, we proposed a model with implicit variables, inspired by the transformation-based non-overlapping model introduced in [8], that has a linear number of variables and constraints. Moreover, a clever algorithm for evaluating the whole model in linear time was also proposed. Numerical experiments showed that, by considering the proposed non-overlapping model, it is possible to tackle problems one or two orders of magnitude larger than the instances previously addressed in the literature.
References
Andreani, R., Birgin, E.G., Martínez, J.M., Schuverdt, M.L.: On augmented Lagrangian methods with general lower-level constraints. SIAM J. Optim. 18(4), 1286–1309 (2007)
Andretta, M., Birgin, E.G., Martínez, J.M.: Practical active-set Euclidian trust-region method with spectral projected gradients for bound-constrained minimization. Optimization 54(3), 305–325 (2005)
Bertsekas, D.P.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont (1999)
Bezrukov, A., Stoyan, D.: Simulation and statistical analysis of random packings of ellipsoids. Part. Part. Syst. Charact. 23(5), 388–398 (2007)
Birgin, E.G., Bustamante, L.H., Callisaya, H.F., Martínez, J.M.: Packing circles within ellipses. Int. Trans. Oper. Res. 20(3), 365–389 (2013)
Birgin, E.G., Floudas, C.A., Martínez, J.M.: Global minimization using an augmented Lagrangian method with variable lower-level constraints. Math. Program. 125(1), 139–162 (2010)
Birgin, E.G., Gentil, J.M.: New and improved results for packing identical unitary radius circles within triangles, rectangles and strips. Comput. Oper. Res. 37(7), 1318–1327 (2010)
Birgin, E.G., Lobato, R.D., Martínez, J.M.: Packing ellipsoids by nonlinear optimization. J. Glob. Optim. 65(4), 709–743 (2016)
Birgin, E.G., Martínez, J.M.: A box-constrained optimization algorithm with negative curvature directions and spectral projected gradients. Computing [Suppl] 15(1), 49–60 (2001)
Birgin, E.G., Martínez, J.M.: Large-scale active-set box-constrained optimization method with spectral projected gradients. Comput. Optim. Appl. 23(1), 101–125 (2002)
Birgin, E.G., Martínez, J.M.: Practical Augmented Lagrangian Methods for Constrained Optimization. Society for Industrial and Applied Mathematics, Philadelphia (2014)
Birgin, E.G., Martínez, J.M., Raydan, M.: Nonmonotone spectral projected gradient methods on convex sets. SIAM J. Optim. 10(4), 1196–1211 (2000)
Birgin, E.G., Martínez, J.M., Raydan, M.: Algorithm 813: SPG-software for convex-constrained optimization. ACM Trans. Math. Softw. 27(3), 340–349 (2001)
Birgin, E.G., Martínez, J.M., Ronconi, D.P.: Optimizing the packing of cylinders into a rectangular container: a nonlinear approach. Eur. J. Oper. Res. 160(1), 19–33 (2005)
Birgin, E.G., Sobral, F.N.C.: Minimizing the object dimensions in circle and sphere packing problems. Comput. Oper. Res. 35(7), 2357–2375 (2008)
Castillo, I., Kampas, F.J., Pintér, J.D.: Solving circle packing problems by global optimization: numerical results and industrial applications. Eur. J. Oper. Res. 191(3), 786–802 (2008)
Chernov, N., Stoyan, Yu., Romanova, T.: Mathematical model and efficient algorithms for object packing problem. Comput. Geom. 43(5), 535–553 (2010)
Choi, Y.K., Chang, J.W., Wang, W., Kim, M.S., Elber, G.: Continuous collision detection for ellipsoids. IEEE Trans. Vis. Comput. Graph. 15(2), 311–324 (2009)
Conway, J.H., Sloane, N.J.A.: Sphere Packings, Lattices and Groups, Volume 290 of Grundlehren der mathematischen Wissenschaften. Springer, New York (1999)
Donev, A., Cisse, I., Sachs, D., Variano, E., Stillinger, F.H., Connelly, R., Torquato, S., Chaikin, P.M.: Improving the density of jammed disordered packings using ellipsoids. Science 303(5660), 990–993 (2004)
Donev, A., Connelly, R., Stillinger, F.H., Torquato, S.: Underconstrained jammed packings of nonspherical hard particles: ellipses and ellipsoids. Phys. Rev. E 75(5), 051304 (2007)
Donev, A., Stillinger, F.H., Chaikin, P.M., Torquato, S.: Unusually dense crystal packings of ellipsoids. Phys. Rev. Lett. 92, 255506 (2004)
Donev, A., Torquato, S., Stillinger, F.H.: Neighbor list collision-driven molecular dynamics simulation for nonspherical hard particles. I. Algorithmic details. J. Comput. Phys. 202(2), 737–764 (2005)
Donev, A., Torquato, S., Stillinger, F.H.: Neighbor list collision-driven molecular dynamics simulation for nonspherical hard particles. II. Applications to ellipses and ellipsoids. J. Comput. Phys. 202(2), 765–793 (2005)
Galiev, Sh.I., Lisafina, M.S.: Numerical optimization methods for packing equal orthogonally oriented ellipses in a rectangular domain. Comput. Math. Math. Phys. 53(11), 1748–1762 (2013)
Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. Johns Hopkins University Press, Baltimore (1996)
Izmailov, A.F., Solodov, M.V., Uskov, E.I.: Global convergence of augmented Lagrangian methods applied to optimization problems with degenerate constraints, including problems with complementarity constraints. SIAM J. Optim. 22(4), 1579–1606 (2012)
Izmailov, A.F., Solodov, M.V., Uskov, E.I.: Combining stabilized SQP with the augmented Lagrangian algorithm. Comput. Optim. Appl. 62(2), 405–429 (2015)
Kallrath, J.: Cutting circles and polygons from area-minimizing rectangles. J. Glob. Optim. 43(2), 299–328 (2009)
Kallrath, J.: Packing ellipsoids into volume-minimizing rectangular boxes. J. Glob. Optim. (2015). doi:10.1007/s10898-015-0348-6
Kallrath, J., Rebennack, S.: Cutting ellipses from area-minimizing rectangles. J. Glob. Optim. 59(2), 405–437 (2014)
Kampas, F.J., Castillo, I., Pintér, J.D.: General ellipse packings in optimized regular polygons. http://www.optimization-online.org/DB_HTML/2016/03/5348.html (2016)
Kampas, F.J., Pintér, J.D., Castillo, I.: General ellipse packings in an optimized circle using embedded Lagrange multipliers. http://www.optimization-online.org/DB_HTML/2016/01/5293.html (2016)
Lobato, R.D.: Ellipsoid packing. PhD thesis, University of Sao Paulo (2015)
Man, W., Donev, A., Stillinger, F.H., Sullivan, M.T., Russel, W.B., Heeger, D., Inati, S., Torquato, S., Chaikin, P.M.: Experiments on random packings of ellipsoids. Phys. Rev. Lett. 104, 185501 (2010)
Martínez, J.M.: Local minimizers of quadratic functions on Euclidean balls and spheres. SIAM J. Optim. 4(1), 159–176 (1994)
Martínez, L., Andrade, R., Birgin, E.G., Martínez, J.M.: Packmol: a package for building initial configurations for molecular dynamics simulations. J. Comput. Chem. 30(13), 2157–2164 (2009)
Pankratov, A., Romanova, T., Khlud, O.: Quasi-phi-functions in packing problem of ellipsoids. Radioelectron. Inform. 68(1), 37–41 (2015)
Stoyan, Y., Romanova, T., Pankratov, A., Chugay, A.: Optimized object packings using quasi-phi-functions. In: Fasano, G., Pintér, J.D. (eds.) Optimized Packings with Applications, Volume 105 of Springer Optimization and Its Applications, Chapter 13, pp. 265–293. Springer, Cham (2015)
Stoyan, Y.G., Pankratov, A., Romanova, T.: Quasi-phi-functions and optimal packing of ellipses. J. Glob. Optim. 65(2), 283–307 (2016)
Stoyan, Y.G., Yas’kov, G.: A mathematical model and a solution method for the problem of placing various-sized circles into a strip. Eur. J. Oper. Res. 156(3), 590–600 (2004)
Stoyan, Y.G., Yas’kov, G.: Packing congruent hyperspheres into a hypersphere. J. Glob. Optim. 52(4), 855–868 (2012)
Stoyan, Y.G., Yas’kov, G.: Packing equal circles into a circle with circular prohibited areas. Int. J. Comput. Math. 89(10), 1355–1369 (2012)
Stoyan, Y.G., Zlotnik, M.V., Chugay, A.: Solving an optimization packing problem of circles and non-convex polygons with rotations into a multiply connected region. J. Oper. Res. Soc. 63(3), 379–391 (2012)
Acknowledgements
The authors are indebted to the anonymous referees whose comments helped to improve this paper.
Additional information
This work was supported by PRONEX-CNPq/FAPERJ E-26/111.449/2010-APQ1, FAPESP (Grants 2010/10133-0, 2013/03447-6, 2013/05475-7, 2013/07375-0, and 2012/23916-8), and CNPq (Grants 309517/2014-1, 303750/2014-6).
Appendix: Derivatives
The computation of the derivatives of the function defined in (33) is nontrivial. That is because this function depends on the functions \(\mathcal {X}\) and \(\mathcal {U}\) whose values are given by the solution of an optimization problem. Firstly, we will show the derivatives of the terms that compose the function defined in (33) in terms of the derivatives of the functions \(\mathcal {X}\) and \(\mathcal {U}\). Next, we will show how to compute the derivatives of the functions \(\mathcal {X}\) and \(\mathcal {U}\). To simplify the notation, we will denote by \(\mathcal {X}_{ij}\) the value \(\mathcal {X}(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\) and by \(\mathcal {U}_{ij}\) the value \(\mathcal {U}(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\).
1.1 First order derivatives
Let \(i,j \in \{1,\dots ,m\}\) be such that \(i < j\). We have that \(\mathcal {X}(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\) is a solution to the problem
where \(c_{i}^{ij} = P_i^{-\frac{1}{2}}Q_i^{\top }(c_i-c_j)\), and \(\mathcal {U}(c_i,c_j,\Omega _i,\Omega _j;P_i,P_j)\) is the corresponding Lagrange multiplier. According to the Karush–Kuhn–Tucker first-order necessary conditions for problem (38), we have
Thus, by defining the function \(F: \mathbb {R}^{n} \times \mathbb {R}^{n} \times \mathbb {R}^{q} \times \mathbb {R}^{q} \rightarrow \mathbb {R}^{n+1}\) as
we have that \(F(c_i,c_j,\Omega _i,\Omega _j) = 0\) for all \(c_i,c_j \in \mathbb {R}^n\) and for all \(\Omega _i,\Omega _j \in \mathbb {R}^q\). That is, F is an identically zero function. Therefore, we have that the derivative of function F is also an identically zero function. Hence, for each variable v of the function F and for each component \(\ell \in \{1,\dots ,n+1\}\) of F, we have
Once the values of \(\mathcal {X}_{ij}\) and \(\mathcal {U}_{ij}\) are known, we have, for each \(\ell \in \{1,\dots ,n+1\}\), analytical expressions for \(\frac{\partial F_{\ell }}{\partial v}\), \(\frac{\partial F_{\ell }}{\partial \mathcal {U}_{ij}}\), and \(\frac{\partial F_{\ell }}{\partial [\mathcal {X}_{ij}]_k}\) for each \(k \in \{1,\dots ,n\}\). On the other hand, the values of \(\frac{\text {d}\mathcal {U}_{ij}}{\text {d}v}\) and \(\frac{\text {d}[\mathcal {X}_{ij}]_k}{\text {d}v}\) for each \(k \in \{1,\dots ,n\}\) are unknown, but can be computed by solving the linear system provided by (40):
Then, for each \(i, j \in \{1,\dots ,m\}\) such that \(i < j\), we need to solve \(2(n+q)\) linear systems with \(n+1\) equations and \(n+1\) variables (one linear system for each variable among \(c_i\), \(c_j\), \(\Omega _i\) and \(\Omega _j\)).
Once i and j are fixed, observe that the \(2(n+q)\) linear systems have the same coefficient matrix. The only difference between these systems is their right-hand sides. Thus, in order to solve these linear systems, we can factorize the coefficient matrix only once and then, for each right-hand side, solve the corresponding system with the already factorized matrix.
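The factorize-once strategy can be illustrated numerically. In the sketch below the shared coefficient matrix and the right-hand sides are random stand-ins for the actual KKT-derived systems (with n = 3, so each system has n + 1 = 4 equations); stacking all right-hand sides as columns of a single matrix lets one call to `numpy.linalg.solve` factorize the matrix once and back-substitute per column.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3                                      # each system has n + 1 equations
# Shared coefficient matrix (random stand-in, shifted to be well-conditioned).
A = rng.standard_normal((n + 1, n + 1)) + (n + 1) * np.eye(n + 1)
# Ten right-hand sides, one per differentiation variable, stacked as columns.
B = rng.standard_normal((n + 1, 10))

# One factorization, many back-substitutions: LAPACK factorizes A once
# and solves for every column of B.
X = np.linalg.solve(A, B)
assert np.allclose(A @ X, B)
```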
1.2 Second order derivatives
For each variable v of the function F defined in (39) and for each component \(\ell \) of F, we define the function \(G^v_\ell \) as the total derivative of the function \(F_\ell \) with respect to v:
Since the function F is identically zero, its derivative is also identically zero. Then, for each variable u of the function \(G^v_\ell \), we have
Next, we present the partial derivatives of the function \(G^v_{\ell }\) that appear in expression (41). The partial derivative of \(G^v_{\ell }\) with respect to the variable u is given by
The partial derivative of \(G^v_{\ell }\) with respect to \(\mathcal {U}_{ij}\) is given by
Finally, the partial derivative of \(G^v_{\ell }\) with respect to \(\mathcal {X}_{ij}\) is given by
The simplifications in the expressions of the derivatives of \(G^v_{\ell }\) with respect to \(\mathcal {U}_{ij}\) and \(\mathcal {X}_{ij}\) come from the removal of null elements.
Considering that the values of the first order derivatives of the function F are known, the equation (41) provides the following linear system
for each variable u and for each variable v of the function F. The components of the right-hand side of this system are given by
for each \(\ell \in \{1,\dots ,n+1\}\). Notice that the coefficient matrix of this linear system does not depend on the variables u and v. Therefore, for each \(i,j \in \{1,\dots ,m\}\) such that \(i < j\), we have \((n+q)(2(n+q) + 1)\) linear systems (one for each pair of variables u and v of the function F) with \(n+1\) variables each, all of them sharing the same coefficient matrix.
Cite this article
Birgin, E.G., Lobato, R.D. & Martínez, J.M. A nonlinear programming model with implicit variables for packing ellipsoids. J Glob Optim 68, 467–499 (2017). https://doi.org/10.1007/s10898-016-0483-8