1 Introduction

Problems in the optimization of structures frequently require the use of binary decision variables. Examples include a nonlinear 0–1 formulation to minimize the mass of load-carrying structures (Stolpe and Sandal 2018) and the optimal design of frame structures (Van Mellaert et al. 2018). Solving these problems exactly demands high computational effort, which precludes the solution of large-scale instances. One strategy to reduce the computational burden is to relax the binary variables and add constraints that drive the relaxed variables to binary values. An ideal relaxation technique would obtain binary solutions using easy-to-handle constraints while reducing the overall computational effort. The quest for such a technique has been a core topic of research in binary optimization.

For instance, binary variables xi ∈ {−1, 1} can be relaxed to the interval [−1, 1] by adding a set of constraints \({x_{i}^{2}}=1\) (Kochenberger et al. 2014). Another procedure relaxes the binary variables xi ∈ {0, 1} to xi ∈ [0, 1], with the addition of the constraints xi(xi − 1) = 0 (Kochenberger et al. 2014). A third technique is the solid isotropic material with penalization (SIMP) method (Bendsøe 1989), for problems with xi ∈ {0, 1} variables.

Because SIMP may fail to obtain binary solutions in some simple counterexamples, Martínez (2005) proposed a set of conditions to overcome this issue, including the addition of the constraint \(\sum \limits _{i=1}^{n}x_{i}\leq V, V \in \{1, \dots , n\}\). However, the need for the upper bound V restricts the range of applications of this SIMP approach. This note proposes a new way of relaxing the binary variables that avoids the need for such an upper bound.

The proposed approach maps the original problem with binary variables xi ∈ {0, 1} into an equivalent continuous problem with relaxed variables xi ∈ [0, 1], using supplementary continuous variables yi ∈ [0, 1] and only one additional constraint.

The value of the proposed relaxation is assessed using a new formulation for the unassigned distance geometry problem (uDGP). The uDGP seeks to determine the structure of particles or proteins, i.e., the 3D position of each atom (vertex) of these structures. The information available consists of the number of vertices and a list of distances between them, which are provided by experimental techniques such as nuclear magnetic resonance (NMR) or X-ray (Liberti and Lavor 2018).

The main theoretical results to prove the equivalence of the proposed approach and the original binary optimization problem are discussed in the next section. Section 3 presents a new formulation for the unassigned distance geometry problem and the computational experiments. Conclusions follow.

2 Binary relaxation

Consider the optimization problem,

$$ \begin{array}{@{}rcl@{}} \max && f(x); \ \text{s.t.} \ x \in \varOmega \end{array} $$
(1)

where \(f:\mathbb {R}^{n} \longrightarrow \mathbb {R}\), \(\varOmega \subset \mathbb {R}^{n}\) represents the constraint set, and xi ∈ {0, 1}, \(i= 1, {\dots } , n\) the optimization variables.

The following three results show that Problem (1) can be converted into an equivalent continuous problem adding a single quadratic constraint.

Lemma 1

The binary variables xi in Problem (1) can be relaxed to xi ∈ [0, 1], by adding a set of continuous variables yi ∈ [0, 1] and a set of constraints \((x_{i}-y_{i})^{2}=1\) (\(i= 1, {\dots } , n\)).

Proof

Indeed, the only solutions of \((x_{i}-y_{i})^{2}=1\) for xi ∈ [0, 1] and yi ∈ [0, 1] are xi = 0 and yi = 1, or xi = 1 and yi = 0; in either case, the solutions are binary. □
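As a quick numerical illustration of Lemma 1, the sketch below scans a uniform grid on the unit square and collects the points where \((x-y)^{2}=1\). The function name `unit_gap_points` and the grid resolution are assumptions made for this example only; exact rational arithmetic is used so that the equality test is not affected by rounding. A grid scan is not a proof, but it agrees with the argument above.

```python
from fractions import Fraction
from itertools import product


def unit_gap_points(steps=10):
    """Return all grid points (x, y) in [0, 1]^2 with (x - y)^2 == 1.

    The grid has steps + 1 equally spaced values per axis (an assumption
    of this sketch); Fractions keep the equality test exact.
    """
    grid = [Fraction(k, steps) for k in range(steps + 1)]
    return [(x, y) for x, y in product(grid, repeat=2) if (x - y) ** 2 == 1]


points = unit_gap_points()
# Only the two complementary binary pairs survive: (0, 1) and (1, 0).
print([(int(x), int(y)) for x, y in points])
```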

The following lemma extends this result by showing that the set of n constraints \((x_{i}-y_{i})^{2}=1\) can be packed into a single quadratic constraint.

Lemma 2

Assume the binary variables xi ∈ {0, 1} relaxed as described in Lemma 1. The set of n quadratic constraints \((x_{i}-y_{i})^{2}=1\) is equivalent to the single quadratic constraint \(\sum \limits _{i=1}^{n}(x_{i}-y_{i})^{2}=n\).

Proof

Note that the maximal value of \((x_{i}-y_{i})^{2}\) for xi ∈ [0, 1] and yi ∈ [0, 1] is equal to 1. Therefore, the maximal value of \(\sum \limits _{i=1}^{n}(x_{i}-y_{i})^{2}\) for xi ∈ [0, 1] and yi ∈ [0, 1] is equal to n. In other words, the constraint \(\sum \limits _{i=1}^{n}(x_{i}-y_{i})^{2}=n\) is satisfied only when each term \((x_{i}-y_{i})^{2}\) reaches its maximum value of 1. By Lemma 1, the solution is binary. □
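The aggregation in Lemma 2 can be checked in the same spirit for a small n. The sketch below enumerates a coarse grid on \([0,1]^{n}\times [0,1]^{n}\) and keeps the points satisfying \(\sum _{i=1}^{n}(x_{i}-y_{i})^{2}=n\). The name `aggregated_feasible`, the choice n = 2, and the grid step are assumptions of this example, not part of the original text.

```python
from fractions import Fraction
from itertools import product


def aggregated_feasible(n=2, steps=4):
    """Grid points (x, y) in [0, 1]^n x [0, 1]^n with sum((x_i - y_i)^2) == n."""
    grid = [Fraction(k, steps) for k in range(steps + 1)]
    feasible = []
    for x in product(grid, repeat=n):
        for y in product(grid, repeat=n):
            if sum((xi - yi) ** 2 for xi, yi in zip(x, y)) == n:
                feasible.append((x, y))
    return feasible


feasible = aggregated_feasible()
# Every surviving point is binary with y_i = 1 - x_i, so there are 2^n of them.
print(len(feasible))
```

For n = 2 the scan returns four points, one for each binary vector x paired with its complement y.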

Now consider Problem (2),

$$ \begin{array}{@{}rcl@{}} \max & \ f(x)\\ \text{s.t.} & \sum\limits_{i=1}^{n}(x_{i}-y_{i})^{2}=n\\ & x \in \varOmega, y \in \varOmega \end{array} $$
(2)

where \(f:\mathbb {R}^{n} \longrightarrow \mathbb {R}\), \(\varOmega \subset \mathbb {R}^{n}\), and xi ∈ [0, 1], yi ∈ [0, 1], \(i= 1, {\dots } ,n\).

Theorem 1

The maximum value of Problem (2) is equal to the maximum value of Problem (1).

Proof

Lemmas 1 and 2 show that any feasible solution for Problem (2) is binary. Theorem 1 proves the additional result that there is a unique transformation that maps a feasible solution for Problem (1) into a feasible solution for Problem (2) with the same value for the objective function, f(x), and conversely.

Assume that \(\hat {x}\) is a feasible solution for Problem (1). Using the rule \(\tilde {x}_{i}= \hat {x}_{i}\) and \(\tilde {y}_{i} = 1 - \hat {x}_{i}\) (\(i= 1, \dots , n\)), it is possible to build a feasible solution (\(\tilde {x}, \tilde {y}\)) for Problem (2), with the same value for the objective function, \(f(\hat {x})\).

Conversely, suppose that (\(\hat {x}, \hat {y}\)) is a feasible solution for Problem (2). From Lemmas 1 and 2, (\(\hat {x}, \hat {y}\)) is binary. Therefore, \(\hat {x}\) is a feasible solution for Problem (1), with the same value for the objective function, \(f(\hat {x})\). □
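The two directions of the proof can be sketched as a pair of small helpers. The names `lift`, `project`, and `satisfies_constraint` are hypothetical and introduced only for illustration; the sketch checks the quadratic constraint alone and leaves membership in the set Ω, which depends on the problem at hand, out of scope.

```python
from itertools import product


def lift(x_hat):
    """Map a binary vector x_hat to the pair (x, y) with y_i = 1 - x_i."""
    x = list(x_hat)
    y = [1 - xi for xi in x]
    return x, y


def project(x, y):
    """Map a feasible pair (x, y) back to the binary vector x."""
    return list(x)


def satisfies_constraint(x, y):
    """Check the single quadratic constraint sum((x_i - y_i)^2) == n."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y)) == len(x)


# Every binary vector of length 3 lifts to a pair meeting the constraint,
# and projecting the pair recovers the original vector.
all_ok = all(
    satisfies_constraint(*lift(x)) and project(*lift(x)) == list(x)
    for x in product((0, 1), repeat=3)
)
print(all_ok)
```

Because the objective depends on x only, the lifted pair and the original vector share the same objective value.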

Lemmas 1 and 2 and Theorem 1 show that the relaxed Problem (2) is equivalent to the original Problem (1). The next result proves that, under the assumption of continuity for the function f, the quadratic constraint can be moved into the objective function without loss of the integrality properties.

Consider Problem (3),

$$ \begin{array}{@{}rcl@{}} \max && f(x) - c \cdot g(x, y)\\ \text{ s.t.} && x \in \varOmega , \ y\in\varOmega \end{array} $$
(3)

where f is continuous, \(g(x, y) = n - \sum \limits _{i=1}^{n}(x_{i}-y_{i})^{2} \), c ≥ 0, \(\varOmega \subset \mathbb {R}^{n}\), xi ∈ [0, 1], and yi ∈ [0, 1], \( i=1, \dots , n\).

Theorem 2

Problem (3) is equivalent to Problem (1) for a suitable value of c.

Proof

Note that g(x,y) is continuous; also, from Lemmas 1 and 2, it follows immediately that, for xi ∈ [0, 1] and yi ∈ [0, 1], g(x,y) ≥ 0, and that g(x,y) = 0 if and only if all xi and yi are binary with yi = 1 − xi. Therefore, given the continuity of the function f, a penalty function approach shows that Problem (2) and Problem (3) are equivalent for a suitable value of c (Luenberger and Ye 2003, Chapter 13); in addition, by Theorem 1, Problem (3) is equivalent to Problem (1). □
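To make the penalty construction concrete, the sketch below evaluates \(g(x, y) = n - \sum _{i=1}^{n}(x_{i}-y_{i})^{2}\) and the penalized objective of Problem (3). The toy objective `f_toy` and the sample points are assumptions of this example; the value c = 500 is borrowed from the computational experiments reported later in the text and is not claimed to be suitable in general.

```python
def g(x, y):
    """Penalty function g(x, y) = n - sum((x_i - y_i)^2); zero only at
    binary points with y_i = 1 - x_i, positive elsewhere in the unit box."""
    return len(x) - sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def penalized(f, x, y, c):
    """Objective of the penalized problem: f(x) - c * g(x, y)."""
    return f(x) - c * g(x, y)


def f_toy(x):
    """Hypothetical objective used only for illustration."""
    return sum(x)


c = 500  # illustrative value; a suitable c depends on the problem
binary_value = penalized(f_toy, [1, 0], [0, 1], c)              # penalty vanishes
fractional_value = penalized(f_toy, [0.5, 0.5], [0.5, 0.5], c)  # heavily penalized
print(binary_value, fractional_value)
```

In a maximization setting, the fractional point is dominated by the binary one as soon as c is large enough, which is the mechanism the proof relies on.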

Another property concerning g(x,y) that can be useful to assure global convergence of the optimization algorithms in some of the fields of application is its concavity. Indeed, the function g(x,y) can be expressed as g(x,y) = g(z) = n − 〈z,A z〉, where z = [x y], \( \textbf {A}=\left [\begin {array}{lll} \text {\textbf {I}}&\text {- \textbf {I}} \\ \text {- \textbf {I}} & \text {\textbf {I}} \end {array}\right ]\) is a block matrix, and I is the identity matrix of dimension n. Applying a singular value decomposition (SVD) for matrix A,

$$ \textbf{A} = \left[\begin{array}{ll} \text{\textbf{I}}& \text{- \textbf{I}} \\ \text{- \textbf{I}} & \text{\textbf{I}} \end{array}\right] = \left[\begin{array}{lll} \frac{\sqrt[]{2}}{2}\text{\textbf{I}}&\frac{\sqrt[]{2}}{2}\text{\textbf{I}}\\ -\frac{\sqrt[]{2}}{2}\text{\textbf{I}}&\frac{\sqrt[]{2}}{2}\text{\textbf{I}} \end{array}\right]\left[\begin{array}{ll} 2\text{\textbf{I}}& \textbf{0} \\ \textbf{0}&0\text{\textbf{I}} \end{array}\right] \left[\begin{array}{ll} \frac{\sqrt[]{2}}{2}\text{\textbf{I}}&-\frac{\sqrt[]{2}}{2}\text{\textbf{I}}\\ \frac{\sqrt[]{2}}{2}\text{\textbf{I}} & \ \frac{\sqrt[]{2}}{2}\text{\textbf{I}} \end{array}\right]. $$
(4)

where 0 is a zero matrix of dimension n × n. Using this result, it is straightforward to see that A has n eigenvalues equal to 0 and n eigenvalues equal to 2. Therefore, the matrix A is positive semidefinite, and the function g(x,y) is concave.
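The spectrum of A can also be confirmed directly, without computing a decomposition, by exhibiting eigenvectors. In the sketch below, the helper names `build_A`, `matvec`, `unit`, and `quad_form` are assumptions introduced for this example. For each canonical vector \(e_{i}\), the stacked vector \([e_{i}, -e_{i}]\) is an eigenvector with eigenvalue 2 and \([e_{i}, e_{i}]\) is an eigenvector with eigenvalue 0, which gives n eigenvalues of each kind.

```python
def build_A(n):
    """Block matrix [[I, -I], [-I, I]] of size 2n x 2n as nested lists."""
    A = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        A[i][i] = 1
        A[i + n][i + n] = 1
        A[i][i + n] = -1
        A[i + n][i] = -1
    return A


def matvec(A, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def unit(n, i):
    """Canonical basis vector e_i of length n."""
    e = [0] * n
    e[i] = 1
    return e


def quad_form(A, z):
    """Quadratic form <z, A z>."""
    return sum(zi * wi for zi, wi in zip(z, matvec(A, z)))


n = 3
A = build_A(n)
for i in range(n):
    e = unit(n, i)
    v_two = e + [-t for t in e]   # eigenvector for eigenvalue 2
    v_zero = e + e                # eigenvector for eigenvalue 0
    assert matvec(A, v_two) == [2 * t for t in v_two]
    assert matvec(A, v_zero) == [0] * (2 * n)

# With z = [x y], the quadratic form equals sum((x_i - y_i)^2).
x, y = [1, 0, 1], [0, 0, 1]
print(quad_form(A, x + y), sum((a - b) ** 2 for a, b in zip(x, y)))
```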

The formulation for the unassigned distance geometry problem (uDGP) proposed in the next section illustrates the computational benefits of these results. The advantage of applying these ideas to the uDGP is twofold: the intrinsic difficulty of the uDGP makes it a severe testbed (Liberti and Lavor 2018), and improvements in solution strategies for this problem are valuable in their own right, given its applications in robotics (Porta et al. 2005; Rojas and Thomas 2013), design of structures, nanotechnology, and bio-engineering (Liberti and Lavor 2018).

3 Unassigned distance geometry problem

The uDGP (Billinge et al. 2016) seeks the best assignment of each vertex of a molecule to a position in 3D Euclidean space, given the number of vertices and the distances between them (Duxbury et al. 2016). The literature on the uDGP is still incipient, making it an open research area in distance geometry (Liberti and Lavor 2018).

The proposed formulation for the uDGP merges the problem of assigning distances to all single pairs of vertices with the problem of positioning the vertices in the Euclidean space. Because a distance value may occur repeatedly, each entry of the distance list contains the value of the distance (\(d_{a}, a=1, \dots , m\)) and its multiplicity (\(m_{a}\text {,} \ a=1, \dots , m\)).
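The input format described above, a list of distance values with their multiplicities, can be illustrated by building such a list from known vertex positions. This is a minimal sketch, not the instance generator used in the experiments: the function name `distance_list`, the rounding precision, and the unit-square example are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def distance_list(points, ndigits=6):
    """Return sorted (distance, multiplicity) pairs over all vertex pairs i < j.

    Distances are rounded to ndigits decimals so that equal distances are
    grouped despite floating-point noise (an assumption of this sketch).
    """
    counts = Counter(
        round(math.dist(p, q), ndigits) for p, q in combinations(points, 2)
    )
    return sorted(counts.items())


# Unit square: four sides of length 1 and two diagonals of length sqrt(2).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
entries = distance_list(square)
print(entries)
```

In the uDGP, only such a list is given and the positions must be recovered, so the pairing between distances and vertex pairs is unknown.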

As the data usually come from experimental methods, inaccuracies and missing data should be expected. The case addressed in model (5)–(9) considers inaccuracies in the distance values (da) and underestimations of the distance frequency (ma). The first aspect is handled by adding positive and negative deviations to the distance value. The second aspect is handled by treating the multiplicity data as a lower bound for ma.

The model comprises three sets of variables: \(x_{i}\in \mathbb {R}^{k}\), representing the position of the vertices \(i=1, \dots , n\) in the Euclidean space of dimension k; yaij ∈ {0, 1}, assigning to the vertices i, j the distance da; \(p_{ij} \in \mathbb {R}_{+}\) and \(n_{ij} \in \mathbb {R}_{+}\), which are, respectively, the positive or negative deviations of da from the real distance between xi, xj.

Screening the model for some symmetries allows the reduction in the number of variables: since the distances between two vertices i and j are symmetrical, only one of these distances needs to be represented; also, for i = j, the distance between them is zero and the variable yaij = 0.

Equations (5)–(9) summarize the mathematical model.

$$ \begin{array}{@{}rcl@{}} \min &&\sum\limits_{i=1}^{n-1}\sum\limits_{j=i}^{n}(p_{ij} +n_{ij}) \end{array} $$
(5)
$$ \begin{array}{@{}rcl@{}} \text{s.t.} && \sum\limits_{i=1}^{n-1}\sum\limits_{j=i}^{n}y_{aij}(\|x_{i} - x_{j}\| - d_{a} + p_{ij} - n_{ij}) = 0 , \ \forall a \end{array} $$
(6)
$$ \begin{array}{@{}rcl@{}} &&\sum\limits_{a=1}^{m} y_{aij} \leq 1, \ \forall i, j \end{array} $$
(7)
$$ \begin{array}{@{}rcl@{}} && \sum\limits_{i=1}^{n-1}\sum\limits_{j=i}^{n} y_{aij} \geq m_{a}, \ \forall a \end{array} $$
(8)
$$ \begin{array}{@{}rcl@{}} && x_{i}\geq 0, n_{ij} \geq 0, p_{ij} \geq 0, y_{aij} \in \{0, 1\} \end{array} $$
(9)

The objective function minimizes the sum of the positive and negative deviations from the distance between two vertices. The lower bound for the optimal value of the objective function is zero; in the cases for which the lower bound is attained, the solution delivers an exact assignment for the data provided.

The constraint set (6) computes the positive and negative deviations for each pair i, j assigned to the distance da. Because the number of equations in the constraint set (6) increases with the number of different distances da, a high multiplicity decreases the computational cost of solving the problem. The constraint set (7) expresses that at most one distance can be assigned to a single pair of vertices. The constraint set (8) sets ma as the lower bound for the multiplicity of each distance da. It should also be observed that the binary variables yaij in model (5)–(9) play the role of the variables xi in Problem (1).
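The roles of the constraint sets can be illustrated by evaluating a candidate solution. The sketch below is a simplified checker, not the AMPL model: the name `evaluate_assignment` and the dictionary representation of the assignment are assumptions of this example. It also assumes that each term of constraint (6) vanishes individually, which is one sufficient way of satisfying the summed constraint, and it picks the smallest deviations compatible with that choice.

```python
import math


def evaluate_assignment(points, dists, mults, assign):
    """Return (total deviation, multiplicity check) for a candidate solution.

    assign maps a pair (i, j) with i < j to a distance index a, so at most
    one distance per pair holds by construction (constraint set (7)).
    """
    total_dev = 0.0
    used = [0] * len(dists)
    for (i, j), a in assign.items():
        # Per-term version of (6): p_ij - n_ij = d_a - ||x_i - x_j||.
        gap = dists[a] - math.dist(points[i], points[j])
        p_ij = max(gap, 0.0)    # positive deviation
        n_ij = max(-gap, 0.0)   # negative deviation
        total_dev += p_ij + n_ij
        used[a] += 1
    # Constraint set (8): each distance is used at least m_a times.
    meets_multiplicity = all(u >= m for u, m in zip(used, mults))
    return total_dev, meets_multiplicity


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
dists = [1.0, math.sqrt(2.0)]
mults = [4, 2]
exact = {(0, 1): 0, (1, 2): 0, (2, 3): 0, (0, 3): 0, (0, 2): 1, (1, 3): 1}
dev, ok = evaluate_assignment(square, dists, mults, exact)
print(dev < 1e-12, ok)
```

An assignment that swaps a side with a diagonal would still be admissible under (7), but it yields a strictly positive deviation and may violate (8).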

Using the relaxation strategy proposed in the previous section, the model described by Equations (5)–(9) can be restated as (10)–(14),

$$ \begin{array}{@{}rcl@{}} \min && \sum\limits_{i=1}^{n-1}\sum\limits_{j=i}^{n}(p_{ij} + n_{ij})\\ &&+{c} \cdot \left( \frac{mn(n - 1)}{2} - \sum\limits_{a=1}^{m}\sum\limits_{i=1}^{n-1}\sum\limits_{j=i}^{n}(y_{aij} - w_{aij})^{2} \right) \end{array} $$
(10)
$$ \begin{array}{@{}rcl@{}} \text{s.t.} && \sum\limits_{i=1}^{n-1}\sum\limits_{j=i}^{n}y_{aij}(\|x_{i} - x_{j}\| - d_{a} + p_{ij} - n_{ij}) = 0 , \ \forall a \end{array} $$
(11)
$$ \begin{array}{@{}rcl@{}} && {\sum}_{a=1}^{m} y_{aij} \leq 1, \ \forall i, j \end{array} $$
(12)
$$ \begin{array}{@{}rcl@{}} && {\sum}_{i=1}^{n-1}{\sum}_{j=i}^{n} y_{aij} \geq m_{a}, \ \forall a \end{array} $$
(13)
$$ \begin{array}{@{}rcl@{}} && x_{i} \geq 0, n_{ij} \geq 0, \ p_{ij} \geq 0, \ y_{aij} \in [0, 1], \ w_{aij} \in [0, 1] \end{array} $$
(14)

From the last result of Section 2, the inclusion of the relaxation term in the objective function does not bring additional difficulties to the problem. Also note that the continuous variables waij in model (10)–(14) play the role of the variables yi in Problem (2).
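The relaxation term added to the objective (10) can be evaluated as sketched below. The name `penalty_term` and the dictionary layout keyed by (a, i, j) are assumptions of this example. The sketch also assumes that only pairs with i < j carry variables, which matches the constant \(mn(n-1)/2\) in (10).

```python
def penalty_term(y, w, m, n, c):
    """c * (m*n*(n-1)/2 - sum((y_aij - w_aij)^2)) over keys (a, i, j), i < j."""
    spread = sum((y[key] - w[key]) ** 2 for key in y)
    return c * (m * n * (n - 1) / 2 - spread)


m, n, c = 1, 3, 500  # c = 500 is the value reported in the experiments
keys = [(0, i, j) for i in range(n) for j in range(i + 1, n)]

# Binary y with complementary w: the term vanishes.
y_bin = {key: 1.0 for key in keys}
w_bin = {key: 0.0 for key in keys}

# Fractional y equal to w: every pair contributes the full penalty.
y_frac = {key: 0.5 for key in keys}
w_frac = {key: 0.5 for key in keys}

print(penalty_term(y_bin, w_bin, m, n, c), penalty_term(y_frac, w_frac, m, n, c))
```

Since (10) is a minimization, a positive term of this size pushes the solver away from fractional assignments.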

The following computational tests evaluate both models in solving molecular conformation instances of the uDGP. Four classes of instances with 5, 7, 10, and 20 vertices were generated using the method proposed by Lavor (2006). Each class contains ten instances, for which 30% of the distances were randomly removed.

The problems were coded in the modeling language AMPL™ (Fourer et al. 1990) and solved with the Knitro™ package for nonlinear optimization (Byrd et al. 2006) on a desktop PC running the Linux operating system, with an Intel Core i7 processor and 16 GB of RAM. The maximum allowed execution time was 3600 s. Preliminary computational experiments indicated 500 as a suitable value for the penalty constant c, providing feasible solutions without causing numerical instabilities.

Table 1 presents the computational results for the model described by Equations (5)–(9), named Integer, and for the model described by Equations (10)–(14), named Relaxed. The column “Vert” gives the number of vertices for each instance; the column “Bin Var” contains the number of binary variables in the “Integer” model; the column “Solved” gives the number of instances solved with each model; the column “Deviat.” presents the average deviation for the instances, computed as \(\sum \limits _{i=1}^{n-1}\sum \limits _{j=i}^{n}(p_{ij} +n_{ij})\).

Table 1 Data about instances and solutions

The results in Table 1 show that both the Integer and the Relaxed models provide exactly solvable approaches to the uDGP, an open problem for which there are only a few heuristics available (Duxbury et al. 2016). However, there is a clear advantage of the Relaxed model, illustrating the benefits of relying on the binary relaxation strategy developed in Section 2; the computational complexity of the uDGP severely restricted the instances solvable with the Integer approach, which could address only one of the 31 instances solved with the proposed approach.

As a final remark, note that the deviations obtained with the Integer approach should not be compared with the deviations obtained with the Relaxed approach in Table 1. Indeed, not only could the Integer approach address just a single instance out of the 31 solved with the Relaxed approach, but it is also not possible to ensure that optimal binary solutions exist for all these 31 instances.

4 Conclusions

The main strength of the relaxation proposed here lies in how it achieves generality while remaining essentially uncomplicated. The model proposed for the unassigned distance geometry problem (uDGP) was a severe testbed to evaluate these ideas. Being a nonlinear and nonconvex problem with a large number of binary variables, the uDGP has all the ingredients of a very difficult combinatorial optimization problem. By being able to address the uDGP, the proposed approach broadens the prospects for solving other difficult engineering combinatorial optimization problems with binary variables.