Unassigned distance geometry and molecular conformation problems

Duxbury, Phil; Lavor, Carlile; Liberti, Leo; de Salles-Neto, Luiz Leduino

doi:10.1007/s10898-021-01023-0

Published: 15 April 2021

Volume 83, pages 73–82, (2022)
Journal of Global Optimization

Phil Duxbury¹, Carlile Lavor², Leo Liberti³ & Luiz Leduino de Salles-Neto⁴ (ORCID: orcid.org/0000-0001-8938-5370)

Abstract

3D protein structures and nanostructures can be obtained by exploiting distance information provided by experimental techniques, such as nuclear magnetic resonance and the pair distribution function method. These are examples of instances of the unassigned distance geometry problem (uDGP), where the aim is to calculate the positions of some points using a list of associated distance values not previously assigned to pairs of points. We propose new mathematical programming formulations and a new heuristic to solve the uDGP related to molecular structure calculations. In addition to theoretical results, computational experiments are also provided.

1 Introduction

Distance geometry (DG) started when Menger characterized geometric concepts using the idea of distance [35]. In the majority of applications of DG, the input data consists of an incomplete list of distance values and the output is a set of positions in some Euclidean space realizing those given distances [6]. Other applications are given in [36, 39].

When distances are pre-assigned to pairs of objects, we have the assigned Distance Geometry Problem (aDGP), also called just DGP [28, 37], defined as follows:

Definition 1

Given a simple undirected graph $G=(V,E,\delta )$, whose edges are weighted by $ \delta :E\rightarrow (0,\infty )$, and an integer $K>0$, find a function $x:V\rightarrow {\mathbb {R}}^{K}$ such that

$$\begin{aligned} \forall \{i,j\}\in E,\text { }||x_{i}-x_{j}||_{2}=\delta _{i,j}, \end{aligned}$$

(1)

where $x_{i}=x(i)$, $x_{j}=x(j)$, and $\delta _{i,j}=\delta (\{i,j\})$.
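As a minimal illustration of Definition 1, the following sketch checks whether a candidate map x satisfies condition (1) up to a numerical tolerance. The function name `satisfies_adgp`, the dictionary encoding of $\delta$, and the tolerance `tol` are assumptions made for this example and are not part of the original formulation.

```python
import math

def satisfies_adgp(x, delta, tol=1e-9):
    """Return True if every weighted edge {i, j} has ||x_i - x_j||_2 = delta_ij.

    x     : dict mapping a vertex to its coordinates (a tuple in R^K)
    delta : dict mapping a frozenset {i, j} to the required distance
    tol   : absolute tolerance (an assumption of this sketch)
    """
    for edge, dist in delta.items():
        i, j = tuple(edge)
        if abs(math.dist(x[i], x[j]) - dist) > tol:
            return False
    return True

# A right triangle with unit legs, embedded in R^3, with all three edges given.
x = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0), 3: (0.0, 1.0, 0.0)}
delta = {
    frozenset({1, 2}): 1.0,
    frozenset({1, 3}): 1.0,
    frozenset({2, 3}): math.sqrt(2.0),
}
print(satisfies_adgp(x, delta))
```

The check only verifies a given realization; finding one is the hard part of the problem.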

Depending on the application, the embedding space can be very general, but for problems related to molecular geometry, we will fix it to ${\mathbb {R}}^{3} $. For example, 3D protein structures and nanostructures can be obtained by exploiting distance information between atom pairs provided by experimental techniques, such as nuclear magnetic resonance (NMR) [28] and the pair distribution function (PDF) method [15], respectively.

In general, in the context of molecular conformations, it is considered that the graph G is known a priori, but the information that is actually given by NMR experiments and PDF methods consists of a list of distance values that are only subsequently assigned to atom pairs [7].

In other words, while the distance is given, we do not know the two vertices having such a distance. That is, the associated graph is actually unknown and the only input is a vertex set and a list of distance values. This is the unassigned Distance Geometry Problem (uDGP), which has received much less attention than the aDGP [4, 6].

The formal definition of the uDGP is the following (since $\delta _{i,j}=\delta _{j,i}$, we will write $\{i,j\}$ instead of (i, j)):

Definition 2

Given a set of vertices V and a list of associated distance values $ d_{1},\ldots ,d_{m}$, find an injective function $g:\{1,\ldots ,m\}\rightarrow V\times V$ and a function $x:V\rightarrow {\mathbb {R}}^{3}$ such that, $ \forall \{i,j\}\in g(\{1,\ldots ,m\})$,

$$\begin{aligned} \delta _{i,j}=d_{g^{-1}(\{i,j\})}\text { } \end{aligned}$$

(2)

and

$$\begin{aligned} ||x_{i}-x_{j}||_{2}=\delta _{i,j}. \end{aligned}$$

Note that g is an assignment function that defines a set $E\subset V\times V$, the edges of the graph associated with the uDGP.
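To make Definition 2 concrete, the sketch below verifies a candidate pair (g, x): it checks that g is injective on unordered pairs and that each assigned distance is realized by x. The name `is_udgp_solution`, the 0-based list encoding of g, and the tolerance are illustrative assumptions.

```python
import math

def is_udgp_solution(d, g, x, tol=1e-9):
    """Check a candidate pair (g, x) against Definition 2.

    d : list of distance values d_1..d_m (stored 0-based here)
    g : list where g[k] is the vertex pair assigned to d[k]
    x : dict mapping each vertex to a point in R^3
    """
    pairs = [frozenset(p) for p in g]
    if len(pairs) != len(d) or any(len(p) != 2 for p in pairs):
        return False
    # g must be injective: no unordered pair receives two distances.
    if len(set(pairs)) != len(pairs):
        return False
    for k, pair in enumerate(pairs):
        i, j = tuple(pair)
        if abs(math.dist(x[i], x[j]) - d[k]) > tol:
            return False
    return True

x = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0), 3: (0.0, 1.0, 0.0)}
d = [math.sqrt(2.0), 1.0, 1.0]       # distance list without vertex labels
g_good = [(2, 3), (1, 2), (1, 3)]
g_bad = [(1, 2), (2, 3), (1, 3)]     # sqrt(2) assigned to a unit-length pair
print(is_udgp_solution(d, g_good, x), is_udgp_solution(d, g_bad, x))
```

Pairs are treated as unordered, in line with the convention $\{i,j\}$ adopted above.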

For historical notes and surveys on methods to solve DGPs, see [6, 7, 28, 29], respectively. Also see the recent books [21, 22, 30].

In 1979, Saxe proved that the aDGP is NP-hard [40]. The uDGP is even more challenging in practice, because the graph itself and the graph realization must both be determined at the same time.

Although the uDGP is not new to mathematics [41], the literature focuses predominantly on one-dimensional problems motivated by DNA sequencing, often called partial digest problems [12].

For nanostructure calculations, there are two heuristics that have been proposed: TRIBOND [11, 14] and LIGA [15]. Both methods are based on build-up approaches and suppose that sufficient distance constraints are available to ensure a unique solution at each step of the procedure.

In the context of molecular geometry, we propose mathematical programming formulations for the uDGP, one of the open problems in DG mentioned in [31]. In addition to theoretical results related to these formulations, we also propose a new heuristic for the problem (Sect. 2). Section 3 presents computational experiments and Sect. 4 concludes the paper with new research directions.

2 New formulations for the uDGP

In this section, we present mathematical programming models for the uDGP, with associated theoretical results, and a heuristic to solve it.

2.1 Mathematical programming models

To take into account the assignment function g (Definition 2), we introduce binary variables $a_{i,j}^{k}$ such that

$$\begin{aligned} a_{i,j}^{k}=1\Leftrightarrow \text {distance }d_{k}\text { is assigned to the pair }(i,j)\in V\times V\text {.} \end{aligned}$$

Considering vertices $v_{1},\ldots ,v_{n}\in V$ and distance values $d_{1},\ldots ,d_{m}$ related to a uDGP instance, we propose the following model for the uDGP:

$$\begin{aligned} \begin{array}{c} \begin{array}{c} \begin{array}{rc} \underset{x_{1},\ldots ,x_{n},a_{i,j}^{k}}{\min } &{} \overset{n-1}{\underset{i=1}{ \sum }}\overset{n}{\underset{j=i+1}{\sum }}\left( \overset{m}{\underset{k=1}{ \sum }}\left( a_{i,j}^{k}\left( ||x_{i}-x_{j}||_{2}^{2}-d_{k}^{2}\right) ^{2}\right) \right) \\ \text {s.t.} &{} \begin{array}{c} \overset{n-1}{\underset{i=1}{\sum }}\overset{n}{\underset{j=i+1}{\sum }} a_{i,j}^{k}=1,\text { }k=1,\ldots ,m, \\ \overset{m}{\underset{k=1}{\sum }}a_{i,j}^{k}\le 1,\text { }i=1,\ldots ,n-1, \text { }j=i+1,\ldots ,n,\ \\ x_{i}\in {\mathbb {R}}^{3},\text { }a_{i,j}^{k}\in \{0,1\},\text { } \\ k=1,\ldots ,m,\text { }i=1,\ldots ,n-1,\text { }j=i+1,\ldots ,n. \end{array} \end{array} \end{array} \end{array} \end{aligned}$$

(3)
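The following sketch evaluates the objective function and the constraints of model (3) for given values of x and a. It is meant only to make the indexing explicit and does not solve the model. The array layout (`a[i, j, k]` with 0-based indices, used only for `i < j`) and the function names are assumptions of this example.

```python
import itertools
import numpy as np

def model3_objective(x, a, d):
    """Sum over i < j and k of a[i, j, k] * (||x_i - x_j||^2 - d_k^2)^2."""
    total = 0.0
    for i, j in itertools.combinations(range(len(x)), 2):
        sq = float(np.sum((x[i] - x[j]) ** 2))
        total += float(np.sum(a[i, j, :] * (sq - d ** 2) ** 2))
    return total

def model3_feasible(a):
    """Each distance is used exactly once and each pair receives at most one distance."""
    n = a.shape[0]
    upper = a[np.triu_indices(n, k=1)]          # shape: (number of pairs, m)
    binary = np.all((upper == 0) | (upper == 1))
    return bool(binary
                and np.all(upper.sum(axis=0) == 1)
                and np.all(upper.sum(axis=1) <= 1))

x = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
d = np.array([np.sqrt(2.0), 1.0, 1.0])
a = np.zeros((3, 3, 3))
a[1, 2, 0] = 1   # sqrt(2) is assigned to the pair of the second and third points
a[0, 1, 1] = 1
a[0, 2, 2] = 1
print(model3_feasible(a), model3_objective(x, a, d))
```

For this small triangle the correct assignment gives an objective value of zero up to rounding, while a wrong assignment leaves a positive residual.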

Our first result states a relationship between a uDGP solution and a solution to model (3).

Theorem 1

A pair (g, x) is a solution for a uDGP instance associated to a graph $ G=(V,E)$, with $|V|=n$, $|E|=m$, $g:\{1,\ldots ,m\}\rightarrow V\times V$, and $ x:V\rightarrow {\mathbb {R}}^{3}$, if and only if (x, a) is a global optimal solution to (3).

Proof

If (g, x) is a solution for a given uDGP associated to a graph $G=(V,E)$, with $|V|=n$, $|E|=m$, $g:\{1,\ldots ,m\}\rightarrow V\times V$, $x:V\rightarrow {\mathbb {R}}^{3}$, and distance values $d_{1},\ldots ,d_{m}$ related to binary variables $a_{i,j}^{k}$, such that

$$\begin{aligned} ||x_{i}-x_{j}||_2=\delta _{i,j}, \end{aligned}$$

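where $\delta _{i,j}=d_{g^{-1}(\{i,j\})}$, set $a_{i,j}^{k}=1$ if $g(k)=\{i,j\}$ and $a_{i,j}^{k}=0$ otherwise. Since g is injective, a satisfies the constraints of (3), and every term of the objective function vanishes. As the objective function is nonnegative, (x, a) is a global optimal solution to (3).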

Considering now that $(x_{1},\ldots ,x_{n},a_{i,j}^{k})$ is a global optimal solution to (3) with objective function value zero (the optimal value whenever the instance is feasible), for $i=1,\ldots ,n-1$, $j=i+1,\ldots ,n$, and $k=1,\ldots ,m$, we have that

$$\begin{aligned} \overset{m}{\underset{k=1}{\sum }}\left( a_{i,j}^{k}\left( ||x_{i}-x_{j}||_{2}^{2}-d_{k}^{2}\right) ^{2}\right) =0 \end{aligned}$$

and, by constraints of problem (3), the values $a_{i,j}^{k}$ assign pairs $(i,j)\in V\times V$ such that

$$\begin{aligned} ||x_{i}-x_{j}||_2=d_{k}\text {, }k=1,\ldots ,m\text {.} \end{aligned}$$

(4)

This implicitly defines a weighted graph $G=(V,E,d)$, $d:E\rightarrow {\mathbb {R}}$, with vertices and edges related to $x_{1},\ldots ,x_{n}$ and pairs (i, j), respectively, an injective function $g:\{1,\ldots ,m\}\rightarrow V\times V$, such that

$$\begin{aligned} \delta _{i,j}=d_{g^{-1}(\{i,j\})}, \end{aligned}$$

and a realization of G, $x:V\rightarrow {\mathbb {R}}^{3}$, that satisfies (4). Thus, (g, x) is a solution of the uDGP associated with the distance values $d_{1},\ldots ,d_{m}$ and the graph $G=(V,E,d)$. $\square $

In order to avoid the huge number of binary variables of the model (3) and inspired by the Solid Isotropic Material with Penalization (SIMP) method [5] and the ideas proposed in [34], we introduce a new formulation with only continuous variables:

$$\begin{aligned} \begin{array}{c} \begin{array}{c} \begin{array}{rc} \underset{t,x_{1},\ldots ,x_{n},a_{i,j}^{k}}{\min } &{} t-\overset{n-1}{\underset{ i=1}{\sum }}\overset{n}{\underset{j=i+1}{\sum }}\left( \overset{m}{\underset{ k=1}{\sum }}\left( a_{i,j}^{k}\right) ^{2}\right) \\ \text {s.t.} &{} \begin{array}{c} \overset{n-1}{\underset{i=1}{\sum }}\overset{n}{\underset{j=i+1}{\sum }} \left( \overset{m}{\underset{k=1}{\sum }}\left( a_{i,j}^{k}\left( ||x_{i}-x_{j}||_2^{2}-d_{k}^{2}\right) ^{2}\right) \right) =t, \\ \overset{n-1}{\underset{i=1}{\sum }}\overset{n}{\underset{j=i+1}{\sum }} a_{i,j}^{k}=1,\text { }k=1,\ldots ,m, \\ \overset{m}{\underset{k=1}{\sum }}a_{i,j}^{k}\le 1,\text { }i=1,\ldots ,n-1, \text { }j=i+1,\ldots ,n,\text { } \\ t\ge 0,\text { }x_{i}\in {\mathbb {R}}^{3},\text { }0\le a_{i,j}^{k}\le 1, \text { } \\ k=1,\ldots ,m,\text { }i=1,\ldots ,n-1,\text { }j=i+1,\ldots ,n. \end{array} \end{array} \end{array} \end{array} \end{aligned}$$

(5)
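As a rough illustration of model (5), the sketch below computes t from the equality constraint and then the objective value $t-\sum (a_{i,j}^{k})^{2}$ for given x and a. The function name `model5_value` and the 0-based array layout are assumptions of this example; no solver is involved.

```python
import itertools
import numpy as np

def model5_value(x, a, d):
    """Return (t, objective) for model (5), with t fixed by its equality constraint."""
    t = 0.0
    penalty = 0.0
    for i, j in itertools.combinations(range(len(x)), 2):
        sq = float(np.sum((x[i] - x[j]) ** 2))
        t += float(np.sum(a[i, j, :] * (sq - d ** 2) ** 2))
        penalty += float(np.sum(a[i, j, :] ** 2))   # rewards values of a close to 0 or 1
    return t, t - penalty

x = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
d = np.array([np.sqrt(2.0), 1.0, 1.0])
a = np.zeros((3, 3, 3))
a[1, 2, 0] = 1
a[0, 1, 1] = 1
a[0, 2, 2] = 1
t, objective = model5_value(x, a, d)
print(t, objective)
```

With the correct binary assignment of the m = 3 distances, t vanishes and the objective equals -3 up to rounding, the value $-m$ that appears in the next result.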

The next result gives a relationship between a uDGP solution and a solution to model (5).

Theorem 2

A pair (g, x) is a solution for a feasible uDGP instance associated to a graph $G=(V,E)$, with $|V|=n$, $|E|=m$, $g:\{1,\ldots ,m\}\rightarrow V\times V$ , and $x:V\rightarrow {\mathbb {R}}^{3}$, if and only if (t, x, a) is a global optimal solution to (5) with globally optimal objective function value equal to $-m$.

Proof

If (g, x) is a solution for a given uDGP associated to a graph $G=(V,E)$, with $|V|=n$, $|E|=m$, $g:\{1,\ldots ,m\}\rightarrow V\times V$, $x:V\rightarrow {\mathbb {R}}^{3}$, and distance values $d_{1},\ldots ,d_{m}$ related to binary variables $a_{i,j}^{k}$, such that

$$\begin{aligned} ||x_{i}-x_{j}||=\delta _{i,j}, \end{aligned}$$

where $\delta _{i,j}=d_{g^{-1}(\{i,j\})}$, we obtain, from Theorem 1, that (x, a) is a global optimal solution to (3). Considering this solution in model (5), we have $a_{i,j}^{k}\in \{0,1\}$ and

$$\begin{aligned} \overset{n-1}{\underset{i=1}{\sum }}\overset{n}{\underset{j=i+1}{\sum }} \left( \overset{m}{\underset{k=1}{\sum }}\left( a_{i,j}^{k}\left( ||x_{i}-x_{j}||^{2}-d_{k}^{2}\right) ^{2}\right) \right) =0, \end{aligned}$$

which implies that $t=0$ and that (0, x, a) is also a global optimum solution to model (5), with globally optimal objective function value equal to $-m$.

Let us consider the other direction of the theorem. For a global optimal solution (t, x, a) of the model (5), if there exist positive integers $l_{1},l_{2},l_{3}$ with $l_{1}\le n-1$, $l_{2}\le n$, $l_{3}\le m$, such that

$$\begin{aligned} 0<a_{l_{1},l_{2}}^{l_{3}}<1, \end{aligned}$$

then

$$\begin{aligned} \overset{n-1}{\underset{i=1}{\sum }}\overset{n}{\underset{j=i+1}{\sum }} \left( \overset{m}{\underset{k=1}{\sum }}\left( a_{i,j}^{k}\right) ^{2}\right) <\overset{n-1}{\underset{i=1}{\sum }}\overset{n}{\underset{j=i+1}{\sum }}\left( \overset{m}{\underset{k=1}{\sum }}a_{i,j}^{k}\right) . \end{aligned}$$

(6)

Since we are considering a feasible uDGP, for $k=1,\ldots ,m$,

$$\begin{aligned} \overset{n-1}{\underset{i=1}{\sum }}\overset{n}{\underset{j=i+1}{\sum }} a_{i,j}^{k}=1\Rightarrow \overset{n-1}{\underset{i=1}{\sum }}\overset{n}{ \underset{j=i+1}{\sum }}\left( \overset{m}{\underset{k=1}{\sum }} a_{i,j}^{k}\right) =m, \end{aligned}$$

implying that, from (6) and $t\ge 0$,

$$\begin{aligned} \overset{n-1}{\underset{i=1}{\sum }}\overset{n}{\underset{j=i+1}{\sum }} \left( \overset{m}{\underset{k=1}{\sum }}\left( a_{i,j}^{k}\right) ^{2}\right) <m\Rightarrow t-\overset{n-1}{\underset{i=1}{\sum }}\overset{n}{ \underset{j=i+1}{\sum }}\left( \overset{m}{\underset{k=1}{\sum }}\left( a_{i,j}^{k}\right) ^{2}\right) >t-m\ge -m, \end{aligned}$$

which is a contradiction, because we already know that $-m$ is the optimal value for model (5). Thus,

$$\begin{aligned} a_{i,j}^{k}\in \{0,1\}, \end{aligned}$$

for $i=1,\ldots ,n-1$, $j=i+1,\ldots ,n$, and $k=1,\ldots ,m$. From the constraints of problem (5), the values $a_{i,j}^{k}$ assign pairs $(i,j)\in V\times V$ such that

$$\begin{aligned} ||x_{i}-x_{j}||=d_{k}\text {, }k=1,\ldots ,m\text {.} \end{aligned}$$

(7)

This implicitly defines a weighted graph $G=(V,E,\delta )$, $\delta :E\rightarrow {\mathbb {R}}$, with vertices and edges related to $ x_{1},\ldots ,x_{n}$ and pairs (i, j), respectively, an injective function $ g:\{1,\ldots ,m\}\rightarrow V\times V$, such that

$$\begin{aligned} \delta _{i,j}=d_{g^{-1}(\{i,j\})}, \end{aligned}$$

and a realization of G, $x:V\rightarrow {\mathbb {R}}^{3}$, that satisfies (7). Thus, (g, x) is a solution of the uDGP associated with the distance values $d_{1},\ldots ,d_{m}$ and the graph $G=(V,E,\delta )$. $\square $

From the proof above, note that model (5) also provides a “certificate” of infeasibility of the uDGP instance if the globally optimal objective function value is strictly greater than $-m$.
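To see concretely why fractional values are penalized, consider, as a hypothetical illustration that is not part of the original argument, a single distance $d_{k}$ whose unit weight is split equally between two pairs $\{i,j\}$ and $\{i',j'\}$:

$$\begin{aligned} a_{i,j}^{k}=a_{i',j'}^{k}=\tfrac{1}{2}\ \Rightarrow \ \left( a_{i,j}^{k}\right) ^{2}+\left( a_{i',j'}^{k}\right) ^{2}=\tfrac{1}{2}<1=a_{i,j}^{k}+a_{i',j'}^{k}. \end{aligned}$$

If the remaining $m-1$ distances are assigned with binary values, the objective function of model (5) equals $t-(m-1)-\tfrac{1}{2}=t-m+\tfrac{1}{2}\ge -m+\tfrac{1}{2}>-m$, so such a point cannot attain the value $-m$.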

2.2 A heuristic approach

Model (5) can solve larger instances, compared to model (3), but to solve instances with hundreds of atoms, we propose a new heuristic inspired by the TRIBOND method [14] and model (5).

First, we find a “core” (positions in ${\mathbb {R}}^{3}$ for five vertices, with ten associated distances taken from the list of distance values) by solving model (5) restricted to five points. We then increase its size by adding one vertex position at a time, solving a modification of model (5) in which four randomly chosen points (already fixed) are used to find the next position:

1. Find a core $x_{1},\ldots ,x_{5}\in {\mathbb {R}}^{3}$ by solving the problem
$$\begin{aligned} \begin{array}{c} \begin{array}{c} \begin{array}{rc} \underset{t,x_{1},\ldots ,x_{5},a_{i,j}^{k}}{\min } &{} t-\overset{4}{\underset{i=1 }{\sum }}\overset{5}{\underset{j=i+1}{\sum }}\left( \overset{m}{\underset{k=1 }{\sum }}\left( a_{i,j}^{k}\right) ^{2}\right) \\ \text {s.t.} &{} \begin{array}{c} \overset{4}{\underset{i=1}{\sum }}\overset{5}{\underset{j=i+1}{\sum }}\left( \overset{m}{\underset{k=1}{\sum }}\left( a_{i,j}^{k}\left( ||x_{i}-x_{j}||_2^{2}-d_{k}^{2}\right) ^{2}\right) \right) =t, \\ \overset{4}{\underset{i=1}{\sum }}\overset{5}{\underset{j=i+1}{\sum }} a_{i,j}^{k}=1,\text { }k=1,\ldots ,m, \\ \overset{m}{\underset{k=1}{\sum }}a_{i,j}^{k}\le 1,\text { }i=1,\ldots ,4,\text { }j=i+1,\ldots ,5,\ \ \\ t\ge 0,\text { }x_{1},\ldots ,x_{5}\in {\mathbb {R}}^{3},\text { }0\le a_{i,j}^{k}\le 1,\text { }k=1,\ldots ,m. \end{array} \end{array} \end{array} \end{array} \end{aligned}$$
(8)
2. For $i=6,\ldots ,n$, solve the problem
$$\begin{aligned} \begin{array}{c} \begin{array}{c} \begin{array}{rc} \underset{t,x_{i},a_{i,j}^{k}}{\min } &{} t-\overset{m_{i}}{\underset{k=1}{ \sum }}\left( \underset{j\in J}{\sum }\left( a_{i,j}^{k}\right) ^{2}\right) \\ \text {s.t.} &{} \begin{array}{c} \underset{j\in J}{\sum }\overset{m_{i}}{\underset{k=1}{\sum }}\left( a_{i,j}^{k}\left( ||x_{i}-y_{j}||_2^{2}-d_{k}^{2}\right) ^{2}\right) =t \\ \underset{j\in J}{\sum }a_{i,j}^{k}=1, \ k=1,\ldots ,m_{i}, \\ \overset{m_{i}}{\underset{k=1}{\sum }}a_{i,j}^{k}\le 1,\text { }j\in J, \\ t\ge 0,\text { }x_{i}\in {\mathbb {R}}^{3},\text { }0\le a_{i,j}^{k}\le 1, \text { }k=1,\ldots ,m_{i},\text { }j\in J, \end{array} \end{array} \end{array} \end{array} \end{aligned}$$
(9)
where $x_{i}\in {\mathbb {R}}^{3}$ is the position to be determined, J is a random set with four indices related to already fixed points $y_{j}\in {\mathbb {R}}^{3},$ $j\in J\subset \{1,\ldots ,i-1\}$, and $m_{i}$ is the number of available distances.
3. If a set of compatible distances cannot be found for some $i=6,\ldots ,n$, find a new core (go to Step 1) and restart.

A core in Step 1 is important because it allows the reconstruction of the molecular structure to start correctly with high probability [14]. After finding a core, the geometric idea of Step 2 is to intersect four spheres [32] (centered at points $y_{j}$), which gives one point if there are consistent distance values (radii of the spheres) in the list of distances.
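The geometric idea behind Step 2 can be sketched as follows. The first function intersects four spheres by subtracting the first sphere equation from the other three, which gives a linear system. The second function is a brute-force stand-in that tries ordered 4-tuples from the distance list; it is not the authors' method, which solves model (9) with a global solver. Function names, the tolerance, and the assumption that the four centers are not coplanar are choices made for this example.

```python
import itertools
import numpy as np

def intersect_four_spheres(y, r, tol=1e-6):
    """Return the common point of four spheres, or None if the radii are inconsistent.

    y : (4, 3) array of sphere centers (assumed not coplanar)
    r : four radii
    Subtracting the first sphere equation from the others gives
    2 (y_j - y_0) . x = |y_j|^2 - |y_0|^2 - r_j^2 + r_0^2 for j = 1, 2, 3.
    """
    y = np.asarray(y, dtype=float)
    r = np.asarray(r, dtype=float)
    A = 2.0 * (y[1:] - y[0])
    b = np.sum(y[1:] ** 2, axis=1) - np.sum(y[0] ** 2) - r[1:] ** 2 + r[0] ** 2
    x = np.linalg.solve(A, b)
    # The linear system is only necessary, so verify all four radii explicitly.
    if np.all(np.abs(np.linalg.norm(y - x, axis=1) - r) <= tol):
        return x
    return None

def place_next_point(y, distances, tol=1e-6):
    """Brute-force search: return (point, indices) for the first consistent 4-tuple, else None."""
    for idx in itertools.permutations(range(len(distances)), 4):
        x = intersect_four_spheres(y, [distances[k] for k in idx], tol)
        if x is not None:
            return x, idx
    return None

y = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
target = np.array([0.3, 0.4, 0.5])
radii = np.linalg.norm(y - target, axis=1)
print(intersect_four_spheres(y, radii))
```

The enumeration grows quickly with the length of the distance list, which is one reason an optimization model is attractive for this step.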

3 Computational results

We generate uDGP instances in the following way. We consider a sequence of covalently connected atoms indexed by $1,\ldots ,n.$ The 3D structure of the instance can be defined in terms of the lengths of the covalent bonds $d_{1,2},\ldots ,d_{n-1,n}$, covalent angles $\theta _{1,3},\ldots ,\theta _{n-2,n}$ (formed by three consecutive atoms), and torsion angles $ \omega _{1,4},\ldots ,\omega _{n-3,n}$ (formed by four consecutive atoms).

By fixing the lengths of the covalent bonds ($d_{i-1,i}=1.0$) and the values of the covalent angles ($\theta _{i-2,i}=2.0$ radians), a 3D molecular structure is determined by the torsion angles $\omega _{1,4},\ldots ,\omega _{n-3,n}\in [0,2\pi ]$, randomly chosen from the set $\{\frac{\pi }{3},\pi ,\frac{5\pi }{3}\}$. More details about instance generation are given in [16, 20].
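A minimal sketch of this instance generation is given below. It places atoms one at a time from the bond length, the bond angle, and a random torsion angle, and then returns the list of all pairwise distances with the pair labels discarded. The function names, the placement of the first three atoms, and the sign convention for the torsion angle are assumptions of this example; the generators described in [16, 20] may differ in these details.

```python
import itertools
import math
import random

import numpy as np

TORSIONS = (math.pi / 3, math.pi, 5 * math.pi / 3)

def build_chain(n, rng, bond=1.0, angle=2.0):
    """Place n >= 3 atoms with fixed bond lengths and bond angles and random torsions."""
    x = np.zeros((n, 3))
    x[1] = [bond, 0.0, 0.0]
    x[2] = x[1] + bond * np.array([-math.cos(angle), math.sin(angle), 0.0])
    for i in range(3, n):
        a, b, c = x[i - 3], x[i - 2], x[i - 1]
        omega = rng.choice(TORSIONS)
        bc = (c - b) / np.linalg.norm(c - b)
        nrm = np.cross(b - a, bc)
        nrm /= np.linalg.norm(nrm)
        m = np.cross(nrm, bc)
        # Coordinates of the new bond in the orthonormal frame (bc, m, nrm).
        local = bond * np.array([-math.cos(angle),
                                 math.sin(angle) * math.cos(omega),
                                 math.sin(angle) * math.sin(omega)])
        x[i] = c + local[0] * bc + local[1] * m + local[2] * nrm
    return x

def unassigned_distances(x):
    """Return all n(n-1)/2 pairwise distances, sorted so that pair labels are lost."""
    return sorted(float(np.linalg.norm(x[i] - x[j]))
                  for i, j in itertools.combinations(range(len(x)), 2))

rng = random.Random(0)
chain = build_chain(8, rng)
dist_list = unassigned_distances(chain)
print(len(dist_list))
```

Only `dist_list` would be passed to a uDGP solver; `chain` plays the role of the hidden reference structure.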

Unlike with the TRIBOND and LIGA methods, instances with an incomplete list of distances, i.e. $m<\frac{n(n-1)}{2}$, can be easier to solve with the proposed approach, since model (5) will then have many global optimal solutions. Thus, in order to guarantee a unique solution, we consider instances with all the distances $d_{1},\ldots ,d_{m}$, where $m=\frac{n(n-1)}{2}$.

We used AMPL with the solver BARON 17.4.1 on a Lenovo notebook with 6 MB RAM and an Intel Celeron 1.6 GHz processor to solve the three proposed models: M1 (3), M2 (5), and M3 (9).

For all values of n, we generated 5 random instances according to the procedure described above.

For models M1 and M2, we stopped at $n=5$ and $n=10$, respectively, because no solution was found within the time limit of 3000 minutes (see Table 1).

Table 1 Number of solutions found with time limit = 3,000 min

Table 2 Computational time in minutes

For each n, Table 2 shows the average computational time, in minutes, needed to solve the 5 randomly generated instances.

From Tables 1 and 2, we notice that the proposed heuristic for solving problem (5) finds the global optimum solutions, in all the cases, in a reasonable time. This means that, during the execution of the heuristic, for a core found in Step 1, problem (9) was solved in Step 2, for all $i=6,\ldots ,n$.

4 Conclusions

TRIBOND and LIGA are the first-generation methods for solving the uDGP applied to molecular conformation problems. The computational results presented in this paper point to different approaches that could be starting points for new research directions.

We are particularly interested in problems of this kind related to protein structure calculations using distance information given by NMR experiments, called the Molecular DGP (MDGP) [28]. Many methods applied to the MDGP suppose that the available distances are pre-assigned to the pairs of atoms. However, as mentioned in the Introduction, the data actually provided by NMR consist of just a list of distances.

The geometry of proteins allows us to define vertex orders $v_{1},\ldots ,v_{n}$ on the associated graph $G=(V,E)$ [8, 25] such that

1. The first four vertices can be fixed in ${\mathbb {R}}^{3}$, since they define a clique;
2. Each vertex with rank greater than four is adjacent to at least two contiguous predecessors, i.e.
$$\begin{aligned} \forall i>4,\{v_{i-2},v_{i}\},\{v_{i-1},v_{i}\}\in E. \end{aligned}$$

Property 1 can help to define the core (instead of solving problem (8)) and property 2 can be useful in the definition of the set J in the model (9).
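The two properties of such a vertex order can be checked mechanically. The sketch below does so for a graph given as a set of edges; the function name `is_protein_like_order` and the data encoding are assumptions made for this illustration.

```python
def is_protein_like_order(order, edges):
    """Check the two vertex-order properties for a graph given by its edge set.

    order : list of vertices v_1..v_n in the proposed order
    edges : iterable of vertex pairs
    """
    E = {frozenset(e) for e in edges}
    if len(order) < 4:
        return False
    # Property 1: the first four vertices form a clique.
    first = order[:4]
    for p in range(4):
        for q in range(p + 1, 4):
            if frozenset((first[p], first[q])) not in E:
                return False
    # Property 2: every later vertex is adjacent to its two immediate predecessors.
    for i in range(4, len(order)):
        if frozenset((order[i - 2], order[i])) not in E:
            return False
        if frozenset((order[i - 1], order[i])) not in E:
            return False
    return True

clique = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
edges = clique + [(3, 5), (4, 5), (4, 6), (5, 6)]
print(is_protein_like_order([1, 2, 3, 4, 5, 6], edges))
```

In the unassigned setting the edge set is not known in advance, so a check like this applies to a candidate assignment rather than to the raw distance list.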

In [23], the authors propose a new vertex order for protein molecules where pairs $\{v_{i-3},v_{i}\}$ can also be included in the set of edges of the associated graph. When distances $d_{i-3,i}$ are also known, the search space of the problem can be discretized and if the assignment function g is defined in advance, we have the so-called Discretizable MDGP (DMDGP) [17, 18], allowing the application of a combinatorial method, called Branch-and-Prune (BP) [27].

As mentioned in the recent survey on DGPs [7], experience with TRIBOND and LIGA, together with recent results on BP methods for DGPs [13, 26, 33], emphasizes the importance of vertex orders in molecular reconstruction from distance information.

Our main research direction now is to consider protein vertex orders in the models proposed in this work to deal with uncertainties in the NMR distance information, already discussed in many approaches to the aDGP [1, 2, 3, 9, 10, 19, 24, 38, 42].

The proposed models can also incorporate uncertainties representing distance values as interval distances $[{\underline{d}}_{k},{\overline{d}}_{k}]$, $0< {\underline{d}}_{k}\le {\overline{d}}_{k}$, where precise distance values $d_{k} $ are replaced by ${\underline{d}}_{k}+\lambda _{k}w_{k}$, with $0\le \lambda _{k}\le 1$ and $w_{k}={\overline{d}}_{k}-{\underline{d}}_{k}$ [38], implying that $\lambda _{k}$ would be new variables and $w_{k}$ new input data.
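The interval parametrization can be written out in a few lines. This is a sketch only; the function name `interval_distance` and the input validation are assumptions of the example.

```python
def interval_distance(lower, upper, lam):
    """Return lower + lam * (upper - lower) for 0 < lower <= upper and 0 <= lam <= 1."""
    if not (0 < lower <= upper):
        raise ValueError("need 0 < lower <= upper")
    if not (0.0 <= lam <= 1.0):
        raise ValueError("need 0 <= lam <= 1")
    width = upper - lower          # w_k, new input data of the extended model
    return lower + lam * width     # value that replaces the precise distance d_k

print(interval_distance(2.0, 3.0, 0.5))
```

In an extended model, `lam` would be a decision variable bounded between 0 and 1 rather than a fixed number.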

References

1. Alves, R., Lavor, C.: Geometric algebra to model uncertainties in the discretizable molecular distance geometry problem. Adv. Appl. Clifford Algebr. 27, 439–452 (2017)
2. Alves, R., Lavor, C., Souza, C., Souza, M.: Clifford algebra and discretizable distance geometry. Math. Methods Appl. Sci. 41, 3999–4346 (2018)
3. Baez-Sanchez, A., Lavor, C.: On the estimation of unknown distances for a class of Euclidean distance matrix completion problems with interval data. Linear Algebr. Appl. 592, 287–305 (2020)
4. Bartmeyer, P., Lyra, C.: A new quadratic relaxation for binary variables applied to the distance geometry problem. Struct. Multidiscip. Optim. 62, 2197–2201 (2020)
5. Bendsoe, M., Sigmund, O.: Topology Optimization: Theory, Methods and Applications. Springer, New York (2003)
6. Billinge, S., Duxbury, P., Gonçalves, D., Lavor, C., Mucherino, A.: Assigned and unassigned distance geometry: applications to biological molecules and nanostructures. 4OR 14, 337–376 (2016)
7. Billinge, S., Duxbury, P., Gonçalves, D., Lavor, C., Mucherino, A.: Recent results on assigned and unassigned distance geometry with applications to protein molecules and nanostructures. Ann. Oper. Res. 271, 161–203 (2018)
8. Cassioli, A., Gunluk, O., Lavor, C., Liberti, L.: Discretization vertex orders in distance geometry. Discret. Appl. Math. 197, 27–41 (2015)
9. Costa, T., Bouwmeester, H., Lodwick, W., Lavor, C.: Calculating the possible conformations arising from uncertainty in the molecular distance geometry problem using constraint interval analysis. Inform. Sci. 415–416, 41–52 (2017)
10. Dambrosio, C., Ky, V., Lavor, C., Liberti, L., Maculan, N.: New error measures and methods for realizing protein graphs from distance data. Discret. Comput. Geom. 57, 371–418 (2017)
11. Duxbury, P., Granlund, L., Gujarathi, S., Juhas, P., Billinge, S.: The unassigned distance geometry problem. Discret. Appl. Math. 204, 117–132 (2016)
12. Fontoura, L., Martinelli, R., Poggi, M., Vidal, T.: The minimum distance superset problem: formulations and algorithms. J. Glob. Optim. 72, 27–53 (2018)
13. Gonçalves, D., Mucherino, A., Lavor, C., Liberti, L.: Recent advances on the interval distance geometry problem. J. Glob. Optim. 69, 525–545 (2017)
14. Gujarathi, S., Farrow, C., Glosser, C., Granlund, L., Duxbury, P.: Ab-initio reconstruction of complex Euclidean networks in two dimensions. Phys. Rev. E 89, 053311 (2014)
15. Juhás, P., Cherba, D., Duxbury, P., Punch, W., Billinge, S.: Ab initio determination of solid-state nanostructure. Nature 440, 655–658 (2006)
16. Lavor, C.: On generating instances for the molecular distance geometry problem. In: Liberti, L., Maculan, N. (eds.) Global Optimization: From Theory to Implementation, pp. 405–414. Springer, Berlin (2006)
17. Lavor, C., Liberti, L., Maculan, N., Mucherino, A.: Recent advances on the discretizable molecular distance geometry problem. Eur. J. Oper. Res. 219, 698–706 (2012)
18. Lavor, C., Liberti, L., Maculan, N., Mucherino, A.: The discretizable molecular distance geometry problem. Comput. Optim. Appl. 52, 115–146 (2012)
19. Lavor, C., Liberti, L., Mucherino, A.: The interval BP algorithm for the discretizable molecular distance geometry problem with interval data. J. Glob. Optim. 56, 855–871 (2013)
20. Lavor, C., Alves, R., Figueiredo, W., Petraglia, A., Maculan, N.: Clifford algebra and the discretizable molecular distance geometry problem. Adv. Appl. Clifford Algebr. 25, 925–942 (2015)
21. Lavor, C., Liberti, L., Lodwick, W., Mendonça da Costa, T.: An Introduction to Distance Geometry applied to Molecular Geometry. SpringerBriefs, New York (2017)
22. Lavor, C., Xambó-Descamps, S., Zaplana, I.: A Geometric Algebra Invitation to Space-Time Physics, Robotics and Molecular Geometry. SpringerBriefs, New York (2018)
23. Lavor, C., Liberti, L., Donald, B., Worley, B., Bardiaux, B., Malliavin, T., Nilges, M.: Minimal NMR distance information for rigidity of protein graphs. Discret. Appl. Math. 256, 91–104 (2019)
24. Lavor, C., Alves, R.: Oriented conformal geometric algebra and the molecular distance geometry problem. Adv. Appl. Clifford Algebr. 29, 1–19 (2019)
25. Lavor, C., Souza, M., Mariano, L., Liberti, L.: On the polynomiality of finding $^{K}$DMDGP re-orders. Discret. Appl. Math. 267, 190–194 (2019)
26. Lavor, C., Souza, M., Mariano, L., Gonçalves, D., Mucherino, A.: Improving the sampling process in the interval Branch-and-Prune algorithm for the discretizable molecular distance geometry problem. Appl. Math. Comput. 389, 125586 (2021)
27. Liberti, L., Lavor, C., Maculan, N.: A branch-and-prune algorithm for the molecular distance geometry problem. Int. Trans. Oper. Res. 15, 1–17 (2008)
28. Liberti, L., Lavor, C., Maculan, N., Mucherino, A.: Euclidean distance geometry and applications. SIAM Rev. 56, 3–69 (2014)
29. Liberti, L., Lavor, C.: Six mathematical gems from the history of distance geometry. Int. Trans. Oper. Res. 23, 897–920 (2016)
30. Liberti, L., Lavor, C.: Euclidean Distance Geometry: An Introduction. Springer, New York (2017)
31. Liberti, L., Lavor, C.: Open research areas in distance geometry. In: Pardalos, P., Migdalas, A. (eds.) Open Problems in Optimization and Data Analysis, pp. 183–223. Springer, New York (2018)
32. Maioli, D., Lavor, C., Gonçalves, D.: A note on computing the intersection of spheres in ${\mathbb{R}}^{n}$. ANZIAM J. 59, 271–279 (2017)
33. Malliavin, T., Mucherino, A., Lavor, C., Liberti, L.: Systematic exploration of protein conformational space using a distance geometry approach. J. Chem. Inform. Model. 59, 4486–4503 (2019)
34. Martínez, J.M.: A note on the theoretical convergence properties of the SIMP method. Struct. Multidiscipl. Optim. 29, 319–323 (2005)
35. Menger, K.: Untersuchungen über allgemeine Metrik. Mathematische Annalen 100, 75–163 (1928)
36. Moreira, N., Duarte, L., Lavor, C., Torezzan, C.: A novel low-rank matrix completion approach to estimate missing entries in Euclidean distance matrix. Comput. Appl. Math. 37, 4989–4999 (2018)
37. Mucherino, A., Lavor, C., Liberti, L., Maculan, N. (eds.): Distance Geometry: Theory, Methods, and Applications. Springer, New York (2013)
38. Neto, L.S., Lavor, C., Lodwick, W.: A constrained interval approach to the generalized distance geometry problem. Optim. Lett. 14, 483–492 (2020)
39. Santiago, C., Lavor, C., Monteiro, S., Kroner-Martins, A.: A new algorithm for the small-field astrometric point-pattern matching problem. J. Glob. Optim. 72, 55–70 (2018)
40. Saxe, J.: Embeddability of weighted graphs in k-space is strongly NP-hard. In: Proceedings of the 17th Allerton Conference in Communications, Control and Computing, pp. 480–489 (1979)
41. Skiena, S., Smith, W., Lemke, P.: Reconstructing sets from interpoint distances. In: Proceedings of the Sixth ACM Symposium on Computational Geometry, pp. 332–339 (1990)
42. Worley, B., Delhommel, F., Cordier, F., Malliavin, T., Bardiaux, B., Wolff, N., Nilges, M., Lavor, C., Liberti, L.: Tuning interval branch-and-prune for protein structure determination. J. Glob. Optim. 72, 109–127 (2018)

Acknowledgements

We would like to thank the Brazilian research agencies CNPq and FAPESP, for their financial support, and the reviewers for their valuable comments.

Author information

Authors and Affiliations

Michigan State University, Natural Science Building, 288 Farm Lane, East Lansing, MI, USA
Phil Duxbury
IMECC-Unicamp, Cidade Universitaria Zeferino Vaz, Campinas, Brazil
Carlile Lavor
CNRS LIX, Ecole Polytechnique, 91128, Palaiseau, France
Leo Liberti
Science and Technology Institute, Federal University of Sao Paulo, Sao Jose dos Campos, Brazil
Luiz Leduino de Salles-Neto

Corresponding author

Correspondence to Luiz Leduino de Salles-Neto.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Fapesp and CNPq-Brazil.

About this article

Cite this article

Duxbury, P., Lavor, C., Liberti, L. et al. Unassigned distance geometry and molecular conformation problems. J Glob Optim 83, 73–82 (2022). https://doi.org/10.1007/s10898-021-01023-0

Received: 07 February 2020
Accepted: 03 April 2021
Published: 15 April 2021
Issue Date: May 2022
DOI: https://doi.org/10.1007/s10898-021-01023-0
