1 Introduction

The fundamental matrix has received great interest in the computer vision community (see for instance [17]). This \((3\times 3)\) rank-two matrix encapsulates the epipolar geometry, the projective motion between two uncalibrated perspective cameras, and serves as a basis for 3D reconstruction, motion segmentation and camera self-calibration, to name a few applications. Given \(n\) point matches \(({\mathsf {\mathbf{{{q}} }}}_i,{\mathsf {\mathbf{{{q}} }}}_i'),\, i=1,\ldots ,n\) between two images, the fundamental matrix may be estimated in two phases. The initialization phase finds some suboptimal estimate, while the refinement phase iteratively minimizes an optimal but nonlinear and nonconvex criterion. The gold standard uses the eight-point algorithm and projective bundle adjustment for these two phases, respectively. A ‘good enough’ initialization is necessary to avoid, as much as possible, local minima at the refinement phase. The main goal of this article is to improve the current state of the art regarding the initialization phase. We here focus on input point matches that do not contain mismatches (pairs of points incorrectly associated); the problem of mismatches has been specifically addressed in the literature by the use of robust methods.

The eight-point algorithm follows two steps [1]. In its first step, it relaxes the rank-deficiency constraint and solves the following convex problem:

$$\begin{aligned} \tilde{{\varvec{\mathbf {F}}}} = \underset{{\varvec{\mathbf {F}}} \in \mathbb {R}^{3\times 3}}{\arg \min }\; C({\varvec{\mathbf {F}}}) \text{ s.t. } ||{\varvec{\mathbf {F}}} ||^2 =1, \end{aligned}$$
(1)

where \(C\) is a convex, linear least squares cost, hereinafter called the algebraic cost:

$$\begin{aligned} C({\varvec{\mathbf {F}}}) = \displaystyle \sum _{i=1}^n \left( {\mathsf {\mathbf{{{q}} }}}_i'^\top {\varvec{\mathbf {F}}}{\mathsf {\mathbf{{{q}} }}}_i \right) ^2. \end{aligned}$$
(2)

This minimization is subject to the normalization constraint \(||{\varvec{\mathbf {F}}} ||^2 =1\), which avoids the trivial solution \({\varvec{\mathbf {F}}}={\varvec{\mathbf {0}}}\). Normalization will be further discussed in Sect. 3. The estimated matrix \(\tilde{{\varvec{\mathbf {F}}}}\) is thus not a fundamental matrix yet. In its second step, the eight-point algorithm computes the closest rank-deficient matrix to \(\tilde{{\varvec{\mathbf {F}}}}\) as:

$$\begin{aligned} {\varvec{\mathbf {F}}}_{\text {8pt}} = \underset{{\varvec{\mathbf {F}}} \in \mathbb {R}^{3\times 3}}{\arg \min }\; \displaystyle ||{\varvec{\mathbf {F}}} - \tilde{{\varvec{\mathbf {F}}}} ||^2 \text{ s.t. } \det ({\varvec{\mathbf {F}}})=0. \end{aligned}$$
(3)

Both steps are easily solved. The first step is a simple linear least squares problem and the second step is solved by zeroing the smallest singular value of \(\tilde{{\varvec{\mathbf {F}}}}\). It has been shown [4] that this simple algorithm performs extremely well in practice, provided that the image point coordinates are standardized by simply rescaling them so that they lie in \([-\sqrt{2};\sqrt{2}]^2\).
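To make the two steps concrete, here is a minimal numpy sketch of the normalized eight-point algorithm; the helper names and the exact rescaling rule are our own choices, not those of [1, 4]:

```python
import numpy as np

def eight_point(q, qp):
    """Minimal sketch of the normalized eight-point algorithm.

    q, qp: (n, 2) arrays of matched pixel coordinates in images 1 and 2.
    Returns a rank-two estimate of the fundamental matrix."""
    def standardize(pts):
        # Center and rescale so the coordinates lie roughly in [-sqrt(2), sqrt(2)]^2.
        c = pts.mean(axis=0)
        s = np.sqrt(2) / np.abs(pts - c).max()
        T = np.array([[s, 0, -s * c[0]], [0, s, -s * c[1]], [0, 0, 1]])
        return (pts - c) * s, T

    qn, T = standardize(q)
    qpn, Tp = standardize(qp)

    # Step 1: each match gives one row of A, so that C(F) = ||A vec(F)||^2.
    A = np.column_stack([
        qpn[:, 0] * qn[:, 0], qpn[:, 0] * qn[:, 1], qpn[:, 0],
        qpn[:, 1] * qn[:, 0], qpn[:, 1] * qn[:, 1], qpn[:, 1],
        qn[:, 0], qn[:, 1], np.ones(len(qn))])
    # The minimizer of ||A f||^2 subject to ||f|| = 1 is the right singular
    # vector of A associated with the smallest singular value.
    F_tilde = np.linalg.svd(A)[2][-1].reshape(3, 3)

    # Step 2: closest rank-deficient matrix, by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F_tilde)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt

    F = Tp.T @ F @ T            # undo the standardization
    return F / np.linalg.norm(F)
```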

Our main contribution in this paper is an approach that solves for the fundamental matrix by minimizing the algebraic cost under the rank-deficiency constraint. In other words, we find the global minimum of:

$$\begin{aligned} {\varvec{\mathbf {F}}}_{Gp} = \underset{{\varvec{\mathbf {F}}} \in \mathbb {R}^{3\times 3}}{\arg \min }\; C({\varvec{\mathbf {F}}}) \text{ s.t. } \det ({\varvec{\mathbf {F}}})=0 \text{ and } ||{\varvec{\mathbf {F}}} ||^2 =1. \end{aligned}$$
(4)

Perhaps more importantly, we also quantify the impact that each of \({\varvec{\mathbf {F}}}_{\text {8pt}}\) and \({\varvec{\mathbf {F}}}_{Gp}\) has when used as an initial estimate in bundle adjustment. Each initial estimate leads bundle adjustment to its own refined estimate. The two final estimates may thus differ since, as the difference between the two initial estimates grows larger, the probability that they lie in different basins of attraction increases. Our measure quantifies:

  1. how far apart these two basins of attraction are,

  2. how many iterations bundle adjustment takes to converge.

The proposed algorithm uses polynomial global optimization [8, 9]. Previous attempts [10–12] in the literature differ in terms of optimization strategy and parameterization of the fundamental matrix. None solves problem (4) optimally for a general parameterization: they either do not guarantee global optimality [11, 12] or restrict the admissible camera configurations [10–12] (typically requiring that the epipole in the first camera does not lie at infinity). Furthermore, the main criticism leveled at the optimization method we use is that it requires solving a hierarchy of convex linear problems of increasing size, which is computationally inefficient and numerically unstable. The proposed solution overcomes this drawback: experiments show that, in most cases, the proposed algorithm only requires solving the second relaxation of the sequence.

Our experimental evaluation on simulated and real datasets compares the eight-point algorithm and our method when used as initialization to bundle adjustment. We observe that (i) bundle adjustment consistently converges in fewer iterations with our initialization and (ii) bundle adjustment always achieves an equal or lower reprojection error with our initialization. We provide numerous examples of real image pairs from standard datasets. They all illustrate practical cases for which our initialization method allows bundle adjustment to reach a better local minimum than the eight-point algorithm.

2 State of the Art

Accurately and automatically estimating the fundamental matrix from a pair of images has received a lot of attention. We first review a four-class categorization of existing methods, and specifically investigate the details of existing global methods. We finally state the improvements brought by our global method.

2.1 Categorizing Methods

A classification of the different methods into three categories (linear, iterative and robust) was proposed in [2]. Linear methods directly optimize a linear least squares cost. They include the eight-point algorithm [1], SVD resolution [2] and variants [3–6]. Iterative methods iteratively optimize a nonlinear and nonconvex cost. They require, and are sensitive to the quality of, an initial estimate. A first group of iterative methods minimizes the distances between points and epipolar lines [13, 14]. A second group minimizes some approximation of the reprojection error [15–18]. A third group minimizes the reprojection error itself, and is equivalent to two-view projective bundle adjustment. Iterative methods typically use a nonlinear parameterization of the fundamental matrix which guarantees that the rank-deficiency constraint is met. For instance, a minimal 7-parameter update can be used over a consistent orthogonal representation [7]. Finally, robust methods estimate the fundamental matrix while classifying each point match as inlier or outlier. They use M-estimators [19], least median of squares (LMedS) [16] or random sample consensus (RANSAC) [20]. Both LMedS and RANSAC are stochastic.

To these three categories, we propose to add a fourth one: global methods. Global methods attempt to find the global minimum of a nonconvex problem. Convex relaxations have been used to combine a convex cost with the rank-deficiency constraint [11]. However, these relaxations do not converge to a global minimum and the solution's optimality is not certified.

2.2 Global Methods

In theory, for a constrained optimization problem, global optimization methods do not require an initial guess and may be guaranteed to reach the global minimum, thereby certifying optimality. Such global methods can be separated into two classes. The methods of the first class describe the search space as exhaustively as possible in order to test as many candidate solutions as possible. This class includes methods such as Monte Carlo sampling, which tests random elements satisfying the constraints, and reactive tabu search [21, 22], which continues searching even after a local minimum has been found. The major drawback of these methods is the prohibitive computation time required to reach a sufficiently high probability of success. Moreover, even in case of convergence, there is no certificate of global optimality. Contrary to the methods of the first class, methods of the second class provide a certificate of global optimality based on the mathematical theory from which they are built. Branch and bound algorithms [23] and global optimization by interval analysis [24, 25] are some examples. However, although these methods can be faster than those of the first class, their major drawback is their lack of generality: they are usually dedicated to one particular type of cost function because they rely on highly specific computational mechanisms to be as efficient as possible. A review of global methods may be found in [26].

A good deal of research has been conducted over the last few decades on applying global optimization methods to polynomial minimization problems under polynomial constraints. The major drawback of these applications has been the difficulty of taking constraints into account; by solving simplified problems, these approaches have thus mainly been used to find a starting point for local iterative methods. However, recent results in the areas of convex and polynomial optimization have facilitated the emergence of new approaches, which have attracted great interest in the computer vision community. In particular, global polynomial optimization [8, 9] has been used in combination with a finite-epipole nonlinear parameterization of the fundamental matrix [10]. Consequently, this method does not cover camera setups where the epipole lies at infinity. A global convex relaxation scheme [8, 9] was also used to minimize the Sampson distance [12]. Because this implies minimizing a sum of many rational functions, the generic optimization method had to be specifically adapted and lost the property of certified global optimality.

2.3 The Proposed Method

The proposed method lies in the fourth category: it is a global method. Similarly to the eight-point algorithm, it minimizes the algebraic cost, but explicitly enforces the nonlinear rank-deficiency constraint. Contrary to previous global methods [10–12], the proposed method handles all possible camera configurations (it does not make an assumption on the epipoles being finite or infinite) and certifies global optimality. Moreover, the presented algorithm is based on the resolution of a very short sequence of convex linear problems and is therefore computationally efficient.

A large number of attempts to introduce global optimization have been made in the literature. In [11], a dedicated hierarchy of convex relaxations is defined in order to globally solve the problem of fundamental matrix estimation.

In [10], Lasserre’s hierarchy is used jointly with the introduction of the singularity constraint in the problem description. In [12], the authors minimize the Sampson distance (which theoretically gives better results) by solving a specific hierarchy of convex relaxations built upon an epigraph formulation. Finally, in a very recent work [27], the algebraic error is globally minimized thanks to the resolution of seven subproblems. Each subproblem is reduced to a polynomial equation system solved via a Gröbner basis solver. The singularity constraint is satisfied thanks to the right-epipole parametrization. Although this parametrization ensures that \({\varvec{\mathbf {F}}}\) is singular while using the minimum number of parameters, the method is not practical since it would be necessary to solve 126 subproblems in order to cover all the 18 possible parameter sets [16]. It is therefore preferable to introduce the singularity constraint directly in the problem description rather than via some parametrization of \({\varvec{\mathbf {F}}}\).

3 Polynomial Global Optimization

3.1 Introduction

Given a real-valued polynomial \(f :\mathbb {R}^n \rightarrow \mathbb {R}\), we are interested in solving the problem:

$$\begin{aligned} f^{\star }= \underset{x \in { K }}{\inf }\; f(x) \end{aligned}$$
(5)

where \({ K }\subseteq \mathbb {R}^n\) is a (not necessarily convex) compact set defined by polynomial inequalities: \(g_j(x)\ge 0, \; j = 1,\ldots ,m\). Our optimization method is based on an idea first described in [28]. It consists in reformulating the nonconvex global optimization problem (5) as the equivalent convex linear programming problem:

$$\begin{aligned} \widehat{f}= \underset{\mu \in \mathcal {P}({ K })}{\inf }\; \displaystyle \int _{ K }f(x) d\mu , \end{aligned}$$
(6)

where \(\mathcal {P}({ K })\) is the set of probability measures supported on \({ K }\). Note that this reformulation is true for any continuous function (not necessarily polynomial) and any compact set \(K\subseteq \mathbb {R}^n\). Indeed, as \(f^\star \le f(x)\), then \(f^\star \le \int _{ K }f d\mu \) and thus \(f^\star \le \widehat{f}\). Conversely, if \(x^\star \) is a global minimizer of (5), then the probability measure \(\mu ^\star \mathop {=}\limits ^{\vartriangle }\delta _{x^\star }\) (the Dirac measure at \(x^\star \)) is admissible for (6). Moreover, because \(\widehat{f}\) is the optimal value of (6), the inequality \(\int _{ K }f(x) d\mu \ge \widehat{f}\) holds for all \(\mu \in \mathcal {P}({ K })\), and thus \(f^\star =\int _{K} f(x) \, \delta _{x^\star } \ge \widehat{f}\). Instead of optimizing over the finite-dimensional Euclidean set \({ K }\), we optimize over the infinite-dimensional set of probability measures \(\mathcal {P}({ K })\). Thus, Problem (6) is, in general, not easier to solve than Problem (5). However, in the special case of \(f\) being a polynomial and \({ K }\) being defined by polynomial inequalities, we will show how Problem (6) can be reduced to solving a (generically finite) sequence of convex linear matrix inequality (LMI) problems.

3.2 Notations and Definitions

First, given vectors \(\alpha = (\alpha _1,\ldots ,\alpha _n)^\top \in \mathbb {N}^n\) and \(x = (x_1,\ldots ,x_n)^\top \in \mathbb {R}^n\), we define the monomial \(x^\alpha \) by:

$$\begin{aligned} x^\alpha \mathop {=}\limits ^{\vartriangle }x_1^{\alpha _1}x_2^{\alpha _2}\ldots x_n^{\alpha _n} \end{aligned}$$
(7)

and its degree by \(\mathrm {deg}(x^\alpha )\mathop {=}\limits ^{\vartriangle }\displaystyle ||\alpha ||_1 = \sum _{i=1}^n \alpha _i\). For \(t\in {\mathbb {N}}\), we define \(\mathbb {N}^n_t\), the set of \(n\)-dimensional integer vectors with 1-norm at most \(t\), as:

$$\begin{aligned} \mathbb {N}^n_t \mathop {=}\limits ^{\vartriangle }\left\{ \alpha \in \mathbb {N}^n \;\; | \;\; ||\alpha ||_1 \le t \right\} . \end{aligned}$$
(8)

Then, consider the family:

$$\begin{aligned} \left\{ x^\alpha \right\} _{\alpha \in \mathbb {N}_t^n}&=\left\{ 1,x_1,x_2,\ldots ,x_n,x_1^2,x_1x_2,\ldots ,\right. \\&\quad \left. x_1x_n,x_2x_3,\ldots ,x_n^2,\ldots ,x_1^t,\ldots ,x_n^t\right\} \nonumber \end{aligned}$$
(9)

of all the monomials \(x^\alpha \) of degree at most \(t\), which has cardinality \(s(t)\mathop {=}\limits ^{\vartriangle }\dfrac{(n+t)!}{t!\,n!}\). These monomials form the canonical basis of the vector space \(\mathbb {R}_t[x]\) of real-valued multivariate polynomials of degree at most \(t\). A polynomial \(p \in \mathbb {R}_t[x]\) is then understood as a linear combination of monomials of degree at most \(t\):

$$\begin{aligned} p(x) = \sum _{\alpha \in \mathbb {N}_t^n} p_\alpha x^\alpha , \end{aligned}$$
(10)

and \(\mathbf {p}\mathop {=}\limits ^{\vartriangle }(p_\alpha )_{||\alpha ||_1 \le t}\in {\mathbb {R}}^{\mathbb {N}_t^n}\simeq {\mathbb {R}}^{s(t)}\) is the vector of its coefficients in the monomial basis \(\left\{ x^\alpha \right\} _{\alpha \in \mathbb {N}_t^n}\). Its degree is equal to \(\mathrm {deg}(p)\mathop {=}\limits ^{\vartriangle }\max {\left\{ ||\alpha ||_1 \; | \; p_\alpha \ne 0\right\} } \) and \(\mathrm {d}_p\) denotes the smallest integer not lower than \(\dfrac{\mathrm {deg}(p)}{2}\).

Example

The polynomial

$$\begin{aligned} x\in {\mathbb {R}}^2&\mapsto p(x)=1+2x_2 +3x^2_1 +4x_1x_2 \end{aligned}$$
(11)

has a vector of coefficients \(\mathbf {p}\in {\mathbb {R}}^6\) with entries \(p_{00}=1,\, p_{10}=0,\, p_{01}=2,\, p_{20}=3,\, p_{11}=4\) and \(p_{02}=0\).
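The indexing conventions above can be made concrete in a few lines of Python. This is a sketch; the ordering of monomials within a degree is a convention we fix so that it matches the example:

```python
from itertools import product
from math import comb

def monomials(n, t):
    """Exponent vectors alpha in N^n_t, graded by degree, as in the basis (9)."""
    alphas = [a for a in product(range(t + 1), repeat=n) if sum(a) <= t]
    return sorted(alphas, key=lambda a: (sum(a), tuple(-x for x in a)))

n, t = 2, 2
basis = monomials(n, t)
assert len(basis) == comb(n + t, t)    # s(t) = (n + t)! / (t! n!) = 6

# Coefficient vector of p(x) = 1 + 2*x2 + 3*x1^2 + 4*x1*x2 in this basis.
p = {(0, 0): 1, (0, 1): 2, (2, 0): 3, (1, 1): 4}
print(basis)                           # [(0,0), (1,0), (0,1), (2,0), (1,1), (0,2)]
print([p.get(a, 0) for a in basis])    # [1, 0, 2, 3, 4, 0]
```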

Next, given \(\mathbf {y}=(y_\alpha )_{\alpha \in {\mathbb {N}}^n} \in {\mathbb {R}}^{{\mathbb {N}}^n}\), we define the Riesz functional \(L_\mathbf {y}\) by the linear form:

$$\begin{aligned} L_\mathbf {y}&: {\mathbb {R}}\left[ x \right] \rightarrow \mathbb {R} \nonumber \\&\quad p=\sum _{\alpha \in \mathbb {N}^n} p_\alpha x^\alpha \mapsto \mathbf {y}^\top \mathbf {p}= \sum _{\alpha \in \mathbb {N}^n} p_\alpha y_\alpha . \end{aligned}$$
(12)

Thus, the Riesz functional can be seen as an operator that linearizes polynomials.

Example

For the polynomial (11), the Riesz functional reads

$$\begin{aligned} p(x)&= 1+2x_2 +3x^2_1 +4x_1x_2 \mapsto L_\mathbf {y}(p)\nonumber \\&= y_{00}+2y_{01}+3y_{20}+4y_{11}. \end{aligned}$$
(13)
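Stored this way, the Riesz functional is a one-line dictionary lookup (a sketch; the moment values below are arbitrary placeholders):

```python
def riesz(p, y):
    """L_y(p): replace each monomial x^alpha of p by the moment y_alpha."""
    return sum(c * y[a] for a, c in p.items())

# p(x) = 1 + 2*x2 + 3*x1^2 + 4*x1*x2, as in (11).
p = {(0, 0): 1, (0, 1): 2, (2, 0): 3, (1, 1): 4}
y = {(0, 0): 1.0, (0, 1): 0.5, (2, 0): 0.25, (1, 1): -0.1}  # arbitrary moments
print(riesz(p, y))   # y00 + 2*y01 + 3*y20 + 4*y11, as in (13)
```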

For \(t\in {\mathbb {N}}\) and \(\mathbf {y}\in {\mathbb {R}}^{{\mathbb {N}}^n_{2t}}\), the matrix \(M_t(\mathbf {y})\) of size \(s(t)\) defined by:

$$\begin{aligned} (M_t(\mathbf {y}))_{\alpha ,\beta }= L_\mathbf {y}(x^\alpha x^\beta )= y_{\alpha +\beta } \quad \forall \alpha ,\beta \in \mathbb {N}_{t}^n \end{aligned}$$
(14)

is called the moment matrix of order \(t\) of \(\mathbf {y}\). By construction, this matrix is symmetric and linear in \(\mathbf {y}\). Then, given \(q\in {\mathbb {R}}_t\left[ x \right] \) and \(\mathbf {q}\in {\mathbb {R}}^{{\mathbb {N}}^n_{t}}\) the vector of its coefficients in the monomial basis, the vector:

$$\begin{aligned} q\mathbf {y} \mathop {=}\limits ^{\vartriangle }M_t(\mathbf {y})\mathbf {q} \;\in \; {\mathbb {R}}^{{\mathbb {N}}^n_{t}} \end{aligned}$$
(15)

is called the shifted vector with respect to \(\mathbf {q}\). \(M_t(q\mathbf {y})\), the moment matrix of order \(t\) of \(q\mathbf {y}\), is called the localizing matrix of degree \(t\) of \(q\). This matrix is also symmetric and linear in \(\mathbf {y}\).

Example

If \(n=2\) then:

$$\begin{aligned} M_1(\mathbf {y})= \begin{pmatrix} y_{00} & y_{10} & y_{01} \\ y_{10} & y_{20} & y_{11} \\ y_{01} & y_{11} & y_{02} \end{pmatrix} \end{aligned}$$

(16)

and if \(q(x)=a+2x_1^2+3x_2^2\) then:

$$\begin{aligned} M_1(q\mathbf {y})= \begin{pmatrix} ay_{00}+2y_{20}+3y_{02} & ay_{10}+2y_{30}+3y_{12} & ay_{01}+2y_{21}+3y_{03} \\ ay_{10}+2y_{30}+3y_{12} & ay_{20}+2y_{40}+3y_{22} & ay_{11}+2y_{31}+3y_{13} \\ ay_{01}+2y_{21}+3y_{03} & ay_{11}+2y_{31}+3y_{13} & ay_{02}+2y_{22}+3y_{04} \end{pmatrix} \end{aligned}$$

(17)
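These definitions translate directly into code. The following numpy sketch (helper names ours) builds \(M_1(\mathbf {y})\) and the localizing matrix from the moments of a Dirac measure \(\delta _{x^\star }\), for which the moment matrix has rank one:

```python
import numpy as np
from itertools import product

def monomials(n, t):
    return sorted((a for a in product(range(t + 1), repeat=n) if sum(a) <= t),
                  key=lambda a: (sum(a), tuple(-x for x in a)))

def moment_matrix(y, n, t):
    """(M_t(y))_{alpha,beta} = y_{alpha+beta}, as in (14)."""
    b = monomials(n, t)
    return np.array([[y[tuple(i + j for i, j in zip(al, be))] for be in b] for al in b])

def localizing_matrix(q, y, n, t):
    """(M_t(q y))_{alpha,beta} = L_y(q(x) x^alpha x^beta)."""
    b = monomials(n, t)
    return np.array([[sum(c * y[tuple(g + i + j for g, i, j in zip(ga, al, be))]
                          for ga, c in q.items()) for be in b] for al in b])

# Moments of the Dirac measure at x* = (0.5, -1.0): y_alpha = (x*)^alpha.
xs = (0.5, -1.0)
y = {a: xs[0] ** a[0] * xs[1] ** a[1] for a in product(range(9), repeat=2)}

M1 = moment_matrix(y, n=2, t=1)
q = {(0, 0): 1.0, (2, 0): 2.0, (0, 2): 3.0}       # q(x) = 1 + 2*x1^2 + 3*x2^2
L1 = localizing_matrix(q, y, n=2, t=1)
print(np.linalg.matrix_rank(M1))                   # 1: rank one, as for any Dirac
print(np.all(np.linalg.eigvalsh(L1) >= -1e-9))     # True: PSD since q(x*) >= 0
```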

Finally, recall that a symmetric matrix \(F\in \mathbb {S}^n\) is positive semidefinite, denoted by \(F\succeq 0\), if and only if \(x^\top F x\ge 0, \; \forall x \in {\mathbb {R}}^n\) or equivalently, if and only if the minimum eigenvalue of \(F\) is non-negative. A linear matrix inequality (LMI) is a convex constraint:

$$\begin{aligned} F_0+\displaystyle \sum _{k=1}^n x_kF_k&\succeq 0, \end{aligned}$$
(18)

on a vector \(x\in {\mathbb {R}}^n\), where matrices \(F_k \in \mathbb {S}^m,\, k = 0,\ldots ,n\) are given.

3.3 Optimization Method

Let \(f\) be a real-valued multivariate polynomial. Problem (6) can then be reduced to a convex linear programming problem. Indeed, if \(f(x)=\sum _{\alpha \in {\mathbb {N}}^n} f_\alpha x^\alpha \) then:

$$\begin{aligned} \int _K f\; d\mu \!=\! \int _K \sum _{\alpha \in \mathbb {N}^n} f_\alpha x^\alpha d\mu \!=\! \sum _{\alpha \in \mathbb {N}^n} f_\alpha \int _K x^\alpha d\mu = L_\mathbf {y}(f) \end{aligned}$$
(19)

where each coordinate \(y_\alpha \) of the infinite sequence \(\mathbf {y}\in {\mathbb {R}}^{\mathbb {N}^n}\) is equal to \(\displaystyle \int _K x^\alpha \, \mu (dx)\), also called the moment of order \(\alpha \). Consequently, if \(f\) is polynomial, then Problem (6) is equivalent to:

$$\begin{aligned} \widehat{f}=\inf \; L_\mathbf {y}(f) \quad \text {s.t. }\; y_0=1, \;\; \mathbf {y} \in \mathcal {M}_\textit{K}, \end{aligned}$$
(20)

with:

$$\begin{aligned} \mathcal {M}_\textit{K}&\mathop {=}\limits ^{\vartriangle }\left\{ \mathbf {y} \in \mathbb {R}^{\mathbb {N}^n}\;\mid \; \exists \mu \in \mathcal {M}_+(K)\text { such that }\right. \nonumber \\&\quad \left. y_\alpha = \int _K x^\alpha d\mu \; \; \forall \alpha \in \mathbb {N}^n\right\} , \end{aligned}$$
(21)

and \(\mathcal {M}_+(K)\) is the space of finite Borel measures supported on \({ K }\). Remark that the constraint \(y_0=1\) is added in order to impose that if \(\mathbf {y} \in \mathcal {M}_\textit{K}\) then \(\mathbf {y}\) represents a measure in \(\mathcal {P}({ K })\) (and no longer merely in \(\mathcal {M}_+(K)\)). Although Problem (20) is a convex linear programming problem, it is difficult to describe the convex cone \(\mathcal {M}_\textit{K}\) with simple constraints on \(\mathbf {y}\). However, the problem \(\mathbf {y}\in \mathcal {M}_\textit{K}\), also called the \({ K }\)-moment problem, is solved when \({ K }\) is a basic semi-algebraic set, namely:

$$\begin{aligned} K \mathop {=}\limits ^{\vartriangle }\left\{ x \in \mathbb {R}^n \vert g_1(x)\geqslant 0,\ldots ,g_m(x)\geqslant 0 \right\} \end{aligned}$$
(22)

where \(g_j \in {\mathbb {R}}[x], \; \forall j=1,\ldots ,m\). Note that \({ K }\) is assumed to be compact. Then, without loss of generality, we assume that one of the polynomial inequalities \(g_j(x)\geqslant 0\) is of the form \(R^2-||x ||_2^2\geqslant 0\), where \(R\) is a sufficiently large positive constant. This makes it possible to apply a theorem on positivity by Putinar [29, 30] and to model \(\mathcal {M}_\textit{K}\) with LMI conditions:

$$\begin{aligned} \mathcal {M}_\textit{K}&= \mathcal {M}_\succeq (g_1,\ldots ,g_m), \end{aligned}$$
(23)

where:

$$\begin{aligned} \mathcal {M}_\succeq (g_1,\ldots ,g_m)&\mathop {=}\limits ^{\vartriangle }\left\{ \mathbf {y} \in \mathbb {R}^{\mathbb {N}^n} \;\vert \; M_t(\mathbf {y}) \succeq 0,\, M_t(g_j\mathbf {y}) \succeq 0 \right. \nonumber \\&\quad \left. \forall j =1,\ldots ,m,\; \; \forall t \in {\mathbb {N}}\right\} . \end{aligned}$$
(24)

Then, Problem (6) is equivalent to:

$$\begin{aligned} \widehat{f}=\underset{\mathbf {y}\, \in \, \mathbb {R}^{\mathbb {N}^n}}{\inf }L_\mathbf {y}(f) \nonumber \\ \text {s.t. } y_0=1 \nonumber \\ M_t(\mathbf {y}) \succeq 0 \nonumber \\ M_t(g_j\mathbf {y}) \succeq 0 \quad j =1,\ldots ,m \quad \forall t \in {\mathbb {N}}. \end{aligned}$$
(25)

To summarize, if \(f\) is polynomial and \({ K }\) a semi-algebraic set, then Problem (5) is equivalent to a convex linear programming problem with an infinite number of linear constraints on an infinite number of decision variables. Now, for \(t\ge \mathrm {d}_{{ K }}\mathop {=}\limits ^{\vartriangle }\max (\mathrm {d}_f,\mathrm {d}_{g_1},\ldots , \mathrm {d}_{g_m})\) consider the finite-dimensional truncations of Problem (25):

$$\begin{aligned} \mathcal {Q}_t\mathop {=}\limits ^{\vartriangle }\left\{ \begin{array}{l}\widehat{f}_t\mathop {=}\limits ^{\vartriangle }\underset{\mathbf {y}\, \in \, \mathbb {R}^{\mathbb {N}_{2t}^n}}{\min }L_\mathbf {y}(f)\\ \text { s.t. } y_0 =1 \\ M_t(\mathbf {y}) \succeq 0,\\ M_{t-\mathrm {d}_{g_j}}(g_j\mathbf {y}) \succeq 0\; \; \forall j \in \left\{ 1,\ldots ,m\right\} . \end{array}\right. \end{aligned}$$
(26)

By construction, \(\mathcal {Q}_t,\, t \in \mathbb {N}\), generates a hierarchy of LMI relaxations of Problem (25) [8], where each \(\mathcal {Q}_t\) involves moment and localizing matrices of fixed order \(t\). Each relaxation (26) can be solved using public-domain implementations of primal-dual interior point algorithms for semidefinite programming (SDP) [31–35]. When the relaxation order \(t\in {\mathbb {N}}\) tends to infinity, we obtain the following results [8, 36]:

$$\begin{aligned} \widehat{f}_{t} \le \widehat{f}_{t+1} \le \widehat{f} \text { and } \underset{t\rightarrow +\infty }{\lim } \widehat{f}_t = \widehat{f}. \end{aligned}$$
(27)

Practice reveals that this convergence is fast and very often finite, i.e. there exists a finite \(t_0\) such that \(\widehat{f}_t = \widehat{f},\; \forall t\ge t_0\). In fact, finite convergence is guaranteed in a number of cases (e.g. discrete optimization) and very recent results by Nie [36] show that the finite convergence of the sequence \((\widehat{f}_t)_{t\in {\mathbb {N}}}\) as well as the existence of an optimal solution \(\mathbf {y}^\star _t\) of (26) are generically guaranteed.

Example

Consider the polynomial optimization problem

$$\begin{aligned} \widehat{f}=\min _{x \in {\mathbb {R}}^2}&-x_2 \nonumber \\ \mathrm {s.t.}\;&3+2x_2-x_1^2-x_2^2 \ge 0 \nonumber \\&-x_1-x_2-x_1x_2 \ge 0 \nonumber \\&1+x_1x_2 \ge 0. \end{aligned}$$
(28)

The first LMI relaxation \(\mathcal {Q}_1\) is

$$\begin{aligned} \widehat{f}_1 =\min _{\mathbf {y}\in {\mathbb {R}}^6}&-y_{01} \nonumber \\ \mathrm {s.t.}\;&y_{00}=1\nonumber \\&\begin{pmatrix} y_{00} & y_{10} & y_{01} \\ y_{10} & y_{20} & y_{11} \\ y_{01} & y_{11} & y_{02} \end{pmatrix}\succeq 0\nonumber \\&3y_{00}+2y_{01}-y_{20}-y_{02} \ge 0 \nonumber \\&-y_{10}-y_{01}-y_{11} \ge 0 \nonumber \\&y_{00}+y_{11} \ge 0, \end{aligned}$$
(29)

and the second LMI relaxation \(\mathcal {Q}_2\) is

$$\begin{aligned} \widehat{f}_2 =\min _{\mathbf {y}\in {\mathbb {R}}^{15}}&-y_{01} \nonumber \\ \mathrm {s.t.}\;&y_{00}=1 \nonumber \\&\begin{pmatrix} y_{00} & y_{10} & y_{01} & y_{20} & y_{11} & y_{02} \\ y_{10} & y_{20} & y_{11} & y_{30} & y_{21} & y_{12} \\ y_{01} & y_{11} & y_{02} & y_{21} & y_{12} & y_{03} \\ y_{20} & y_{30} & y_{21} & y_{40} & y_{31} & y_{22} \\ y_{11} & y_{21} & y_{12} & y_{31} & y_{22} & y_{13} \\ y_{02} & y_{12} & y_{03} & y_{22} & y_{13} & y_{04} \end{pmatrix}\succeq 0,\nonumber \\&\begin{pmatrix} 3y_{00}+2y_{01}-y_{20}-y_{02} & 3y_{10}+2y_{11}-y_{30}-y_{12} & 3y_{01}+2y_{02}-y_{21}-y_{03} \\ 3y_{10}+2y_{11}-y_{30}-y_{12} & 3y_{20}+2y_{21}-y_{40}-y_{22} & 3y_{11}+2y_{12}-y_{31}-y_{13} \\ 3y_{01}+2y_{02}-y_{21}-y_{03} & 3y_{11}+2y_{12}-y_{31}-y_{13} & 3y_{02}+2y_{03}-y_{22}-y_{04} \end{pmatrix}\succeq 0\nonumber \\&\begin{pmatrix} -y_{10}-y_{01}-y_{11} & -y_{20}-y_{11}-y_{21} & -y_{11}-y_{02}-y_{12} \\ -y_{20}-y_{11}-y_{21} & -y_{30}-y_{21}-y_{31} & -y_{21}-y_{12}-y_{22} \\ -y_{11}-y_{02}-y_{12} & -y_{21}-y_{12}-y_{22} & -y_{12}-y_{03}-y_{13} \end{pmatrix}\succeq 0\nonumber \\&\begin{pmatrix} y_{00}+y_{11} & y_{10}+y_{21} & y_{01}+y_{12} \\ y_{10}+y_{21} & y_{20}+y_{31} & y_{11}+y_{22} \\ y_{01}+y_{12} & y_{11}+y_{22} & y_{02}+y_{13} \end{pmatrix}\succeq 0. \end{aligned}$$
(30)

It can be checked that \(\widehat{f}_1=-2\le \widehat{f}_2=\widehat{f}=-\frac{1+\sqrt{5}}{2}\). Note that the constraint \(3+2x_2-x_1^2-x_2^2 \ge 0\) certifies boundedness of the feasibility set.
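For illustration, the first relaxation (29) can be assembled directly with cvxpy; this is a sketch under the assumption that an SDP-capable solver (e.g. SCS) is installed:

```python
import cvxpy as cp

# Order-1 moment matrix in the basis (1, x1, x2):
# M1 = [[y00, y10, y01],
#       [y10, y20, y11],
#       [y01, y11, y02]]
M1 = cp.Variable((3, 3), symmetric=True)
y00, y10, y01 = M1[0, 0], M1[0, 1], M1[0, 2]
y20, y11, y02 = M1[1, 1], M1[1, 2], M1[2, 2]

constraints = [
    y00 == 1,
    M1 >> 0,                                # moment matrix PSD
    3 * y00 + 2 * y01 - y20 - y02 >= 0,     # L_y(g1) >= 0
    -y10 - y01 - y11 >= 0,                  # L_y(g2) >= 0
    y00 + y11 >= 0,                         # L_y(g3) >= 0
]
prob = cp.Problem(cp.Minimize(-y01), constraints)
prob.solve()        # delegates to an installed SDP solver, e.g. SCS
print(prob.value)   # about -2 = f_hat_1, a lower bound on f_hat = -(1+sqrt(5))/2
```

At this first order the bound is strict and the rank condition (31) presented below fails; solving the second relaxation \(\mathcal {Q}_2\) closes the gap, as stated above.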

However, we do not know a priori at which relaxation order \(t_0\) convergence occurs. In practice, to detect whether the optimal value is attained, we can use conditions on the ranks of the moment and localizing matrices. Indeed, let \(\mathbf {y}_t^\star \in \mathbb {R}^{\mathbb {N}_{2t}^n}\) be a solution of Problem (26) at a given relaxation order \(t\ge \mathrm {d}_{{ K }}\); if:

$$\begin{aligned} \mathrm {rank}(M_{t}(\mathbf {y}_t^\star ))=\mathrm {rank}(M_{t-\mathrm {d}_K}(\mathbf {y}_t^\star )) \end{aligned}$$
(31)

then \(\widehat{f}_t = \widehat{f}\). In particular, if \(\mathrm {rank}(M_{t}(\mathbf {y}_t^\star ))=1\) then condition (31) is satisfied. Moreover, if these rank conditions are satisfied, then we can use numerical linear algebra to extract \(\mathrm {rank}(M_{t}(\mathbf {y}_t^\star ))\) global optima for Problem (5). We do not describe the algorithm in this article, but the reader can refer to [29, Sect. 4.3] for more advanced information. Figure 1 summarizes the optimization process.

Fig. 1: Polynomial optimization process; see the main text for details
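When the rank test succeeds with rank one, the extraction step is elementary: since \(y_0=1\), the first row of the moment matrix contains the first-order moments, i.e. the coordinates of the global minimizer. A numpy sketch of this special case (the general extraction algorithm is the one of [29, Sect. 4.3]):

```python
import numpy as np

def extract_rank_one_minimizer(M, n, tol=1e-6):
    """Read the minimizer off a (numerically) rank-one moment matrix M_t(y*).

    If M = v v^T with v = (1, x1*, ..., xn*, higher monomials), then, because
    y_0 = 1, the first row of M is (1, x1*, ..., xn*, higher moments)."""
    s = np.linalg.svd(M, compute_uv=False)
    if s[1] > tol * s[0]:
        raise ValueError("rank > 1: use the general extraction algorithm")
    return M[0, 1:n + 1]   # the first-order moments y_{e_1}, ..., y_{e_n}
```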

A Matlab interface called GloptiPoly [37] has been designed to construct Lasserre’s LMI relaxations in a format understandable by any SDP solver interfaced via YALMIP [38]. It can be used to construct an LMI relaxation (26) of a given order corresponding to a polynomial optimization problem (5) with given polynomial data entered symbolically. A numerical algorithm is implemented in GloptiPoly to detect global optimality of an LMI relaxation, using the rank tests (31). The algorithm also extracts numerically the global optima from the moment matrix. Then, a practical algorithm is given by Algorithm 1. This approach has been successfully applied to globally solve various polynomial optimization problems (see [29] for an overview of results and applications). In computer vision this approach was first introduced in [39] and used in [12].

Algorithm 1 (listing not reproduced): construct the LMI relaxation (26) at increasing orders \(t\), solve it with an SDP solver, test the rank condition (31) and, upon success, extract the global optima from the moment matrix.

3.4 Application to Fundamental Matrix Estimation

This first paragraph relates the theory to its practical application in the context of fundamental matrix estimation. More generally, applying the presented algorithm requires paying attention to three key points.

Firstly, a necessary condition for the convergence of the presented polynomial optimization method is the compactness of the feasible set. In the context of fundamental matrix estimation, the problem is homogeneous. Hence, an additional normalization constraint is needed to avoid the trivial solution \({\varvec{\mathbf {F}}}=\mathbf 0 \). A classical mistake is to assume that any normalization constraint satisfies the compactness condition. Indeed, a commonly used normalization consists in setting one of the coefficients of the \({\varvec{\mathbf {F}}}\) matrix to 1. However, the other \({\varvec{\mathbf {F}}}\) coefficients are then not bounded, and the compactness of the feasible set is not guaranteed. Moreover, such a normalization a priori excludes some geometric configurations. A way to proceed is to add the normalization constraint \(\Vert {\varvec{\mathbf {F}}}\Vert ^2=1\).

Fig. 2: For problem (28) with an additional constraint of the form \(x_1^n = x_1+x_2\), total number of moments after substitutions against the relaxation order, for a fixed \(n\)

Secondly, the applicability of the presented algorithm is directly linked to the number of variables (i.e. the length of vector \(\mathbf {y}\)) in the LMI relaxation (26). Indeed, for a polynomial \(f\) of \(n\) variables, the size of the vector \(\mathbf {y}\) in the first relaxation equals \(s(2t)=\frac{(n+2t)!}{(2t)!\,n!}\) with \(t=\mathrm {d}_f\). The number of variables \(n\) being fixed, \(s(2t)\) grows in \(O(t^n)\), that is, polynomially in the relaxation order \(t\). Clearly, the smaller the degree of \(f\), the smaller the number of variables of the first relaxations in the hierarchy \((\mathcal {Q}_t)_{t \in {\mathbb {N}}}\). Thus, in the context of fundamental matrix estimation, the goal is to include the singularity constraint in the optimization problem in a manner which minimizes the degree of the polynomial criterion. As an alternative to a direct inclusion in the constraints, the singularity constraint can be enforced by parameterizing the \({\varvec{\mathbf {F}}}\) matrix using one or two epipoles. This latter approach is not only arbitrary but also increases the degree of the cost function. For instance, the parameterization with one epipole:

$$\begin{aligned} {\varvec{\mathbf {F}}}= \left[ \begin{array}{c@{\quad }c@{\quad }c} f_{11} &{} f_{12} &{} f_{13} \\ f_{21} &{} f_{22} &{} f_{23} \\ \alpha f_{11} + \beta f_{21} \; &{} \; \alpha f_{12} + \beta f_{22}\; &{} \; \alpha f_{13} + \beta f_{23} \end{array}\right] , \end{aligned}$$
(32)

leads to a cost function of degree 4, while the parameterization with two epipoles:

$$\begin{aligned} {\varvec{\mathbf {F}}}= \left[ \begin{array}{c@{\quad }c@{\quad }c} f_{11} &{} f_{12} &{} {\mathsf {\mathbf{{{e}} }}}_1 f_{11} + {\mathsf {\mathbf{{{e}} }}}_2 f_{12}\\ f_{21} &{} f_{22} &{} {\mathsf {\mathbf{{{e}} }}}_1 f_{21} + {\mathsf {\mathbf{{{e}} }}}_2 f_{22} \\ {\mathsf {\mathbf{{{e}} }}}_1' f_{11} +{\mathsf {\mathbf{{{e}} }}}_2' f_{21}\; &{} \; {\mathsf {\mathbf{{{e}} }}}_1' f_{12}+{\mathsf {\mathbf{{{e}} }}}_2' f_{22}\; &{} \; ({\mathsf {\mathbf{{{e}} }}}_1 f_{11}+{\mathsf {\mathbf{{{e}} }}}_2 f_{12}){\mathsf {\mathbf{{{e}} }}}_1' + ({\mathsf {\mathbf{{{e}} }}}_1 f_{21}+{\mathsf {\mathbf{{{e}} }}}_2 f_{22}){\mathsf {\mathbf{{{e}} }}}_2' \end{array} \right] \end{aligned}$$
(33)

leads to a cost function of degree 6.
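These degree claims can be checked mechanically. A sympy sketch for the one-epipole parameterization (32), with an arbitrary match whose numeric coordinates are placeholders:

```python
import sympy as sp

f11, f12, f13, f21, f22, f23, a, b = sp.symbols('f11 f12 f13 f21 f22 f23 alpha beta')
# Parameterization (32): the third row is a combination of the first two,
# so F is singular by construction.
F = sp.Matrix([[f11, f12, f13],
               [f21, f22, f23],
               [a * f11 + b * f21, a * f12 + b * f22, a * f13 + b * f23]])
assert sp.expand(F.det()) == 0

q, qp = sp.Matrix([1, 2, 1]), sp.Matrix([3, 1, 1])   # one arbitrary match
term = sp.expand(((qp.T * F * q)[0, 0]) ** 2)
print(sp.total_degree(term))   # 4: the cost (2) becomes quartic in the unknowns
```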

Thirdly, in the case of polynomial equalities, several explicit moment substitutions can be performed, which significantly reduces the number of variables and constraints in the LMI relaxations, as described in Sect. 5.12 of [37]. More precisely, from an equality constraint, it is sometimes possible to express a variable \(x_k\) as a function of \(x_1,\ldots ,x_{k-1},x_{k+1},\ldots ,x_n\):

$$\begin{aligned} x_k^{\alpha } = s(x_1,\ldots ,x_{k-1},x_{k+1},\ldots ,x_n), \end{aligned}$$

with \(x_k^{\alpha }\) a monomial and \(s\) a polynomial. The following example illustrates that, if the degrees of the monomial and of the polynomial are high, then only a few explicit moment substitutions can be carried out. If an equality constraint of the form \(x_1^n = x_1+x_2\) is added to Problem (28), one can plot the total number of moments after substitutions against the relaxation order for a fixed \(n\). Figure 2 shows that the total number of moments in Problem (28) increases with \(n\). In the context of fundamental matrix estimation, the possible substitutions are given by the rank constraint and the normalization constraint, namely:

$$\begin{aligned}&f_{11}f_{22}f_{33}-f_{11}f_{32}f_{23}-f_{21}f_{12}f_{33} +f_{21}f_{32}f_{13}\\&\quad +f_{31}f_{12}f_{23}-f_{31}f_{22}f_{13} =0\\&f_{11}^2 +f_{12}^2 +f_{13}^2 +f_{21}^2 +f_{22}^2 +f_{23}^2\\&\quad +f_{31}^2 +f_{32}^2 + f_{33}^2 =1. \end{aligned}$$

Thus, due to the complexity of this equation system, too few substitutions are possible to significantly improve the performance of the proposed algorithm.
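To relate these key points to problem (4), the following sympy sketch (function names ours) assembles the polynomial data that a moment-relaxation tool such as GloptiPoly is given: a degree-2 cost, the degree-3 determinant equality and the degree-2 norm equality above:

```python
import sympy as sp

f = sp.symbols('f11 f12 f13 f21 f22 f23 f31 f32 f33')
F = sp.Matrix(3, 3, f)

def algebraic_cost(matches):
    """C(F) = sum (q'^T F q)^2: a degree-2 polynomial in the nine unknowns f_ij.
    matches: list of ((x, y), (x', y')) pairs, ideally rescaled to [-sqrt(2), sqrt(2)]^2."""
    cost = 0
    for (x, y), (xp, yp) in matches:
        q, qp = sp.Matrix([x, y, 1]), sp.Matrix([xp, yp, 1])
        cost += ((qp.T * F * q)[0, 0]) ** 2
    return sp.expand(cost)

g_det = sp.expand(F.det())              # degree-3 equality constraint det(F) = 0
g_norm = sum(v ** 2 for v in f) - 1     # ||F||^2 = 1: keeps the feasible set compact

cost = algebraic_cost([((0.1, 0.2), (0.3, -0.1)), ((-0.4, 0.5), (0.2, 0.3))])
print(sp.total_degree(cost), sp.total_degree(g_det), sp.total_degree(g_norm))  # 2 3 2
```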

This second paragraph focuses on previous attempts to solve the fundamental matrix estimation problem with a hierarchy of convex relaxations. The method described in [10] directly applies the presented hierarchy without bounding the \({\varvec{\mathbf {F}}}\) coefficients. Indeed, the trivial solution is avoided by fixing, a priori, one of the \({\varvec{\mathbf {F}}}\) coefficients to 1. Consequently, as explained in the first key point above, there is no guarantee that the sequence of solutions \((\widehat{f}_t)_{t\in \mathbb {N}}\) converges to the global minimum. In [11], a dedicated hierarchy of convex relaxations is defined. The rank constraint is not directly added to the problem description, but is accounted for through additional optimization variables. The resulting optimization algorithm is not generic and, contrary to the presented hierarchy, there is no proof that the sequence of solutions of this specific hierarchy converges to the global minimum (the obtained solution could be only a lower bound). Finally, in [12], an extension of the presented hierarchy is defined in order to minimize the Sampson distance (which theoretically gives better results [19]). Indeed, the Sampson distance being a sum of many rational functions, the presented hierarchy cannot be applied directly. However, thanks to an epigraph formulation, the authors are able to include the denominators in the constraints and thus solve a polynomial problem. In addition to adding as many variables as matched points, the price to pay for this formulation is the loss of linearity of the constraints: the linear matrix inequalities become polynomial matrix inequalities (PMI). To handle the case of PMI constraints, an adapted moment-SOS approach with convergence guarantees is described in [40]; however, this method is not implemented in [12]. Consequently, no asymptotic convergence of this particular hierarchy to the global optimum can be guaranteed. Moreover, even if the rank correction is achieved using the singularity constraint, the normalization is replaced by setting, a priori, the coefficient \(f_{33}\) to 1, which discards the compactness condition.

To sum up, although the methods presented in [11, 12] are based on a theory close to the one presented above, they have no guarantee of convergence to the global minimum; only a lower bound can be ensured. The method presented in [10] is a direct application of the presented hierarchy, but without ensuring the compactness of the feasible set. For all these reasons, we chose not to compare the presented algorithm to these methods. The presented method is summarized in Algorithm 2 below. Its main features are:

  • In contrast with [12, 39] the optimization problem is formulated with an explicit Frobenius norm constraint on the decision variables. This enforces compactness of the feasibility set which is included in the Euclidean ball of radius \(1\). We have observed that enforcing this Frobenius norm constraint has a dramatic influence on the overall numerical behavior of the SDP solver, especially with respect to convergence and extraction of global minimizers.

  • We have chosen the SDPT3 solver [34, 41] since our experiments revealed that for our problem it was the most efficient and reliable solver.

  • We force the interior-point algorithm to increase the accuracy as much as possible, overruling the default parameter set in SDPT3. Then the solver runs as long as it can make progress.

  • The presented numerical experiments show that the moment matrix almost always has rank one (which certifies global optimality) at the second SDP relaxation of the hierarchy. This suggests that the problem of fundamental matrix estimation has a unique global minimizer. Note that, in some (very few) cases, due to the numerical extraction, the global minimum is not fully accurate but still largely satisfactory.

Algorithm 2 (listing not reproduced): estimate the fundamental matrix by solving the hierarchy of LMI relaxations of problem (4), with the explicit Frobenius norm constraint, until the rank condition (31) certifies global optimality.

4 Experimental Results

This section presents results obtained with the 8-point method and our global method, following the test procedure described below. First, criteria to evaluate the performance of a fundamental matrix estimate are described. Next, the evaluation methodology is detailed. Experiments were then carried out on synthetic data to test the sensitivity to noise and to the number of point matches. Finally, experiments on real data were performed to confirm the previous results and to study the influence of the type of motion between the two images.

4.1 Evaluation Criteria

Various evaluation criteria have been proposed in the literature [16] to evaluate the quality of a fundamental matrix estimate. Driven by practice, a fundamental matrix estimate \({\varvec{\mathbf {F}}}\) is evaluated here with respect to the behavior of its subsequent refinement by projective bundle adjustment. Bundle adjustment from two uncalibrated views is formulated as a minimization problem whose cost function is the RMS reprojection error and whose unknowns are the 3D points \({\varvec{\mathbf {Q}}}_i, \; i=1,\ldots ,n\) and the projection matrices \({\varvec{\mathbf {P}}}\) and \({\varvec{\mathbf {P}}}'\). The criteria we use are:

  1. The initial reprojection error, written \(e_{\text {Init}}({\varvec{\mathbf {F}}})\).

  2. The final reprojection error, \(e_{\text {BA}}({\varvec{\mathbf {F}}})\).

  3. The number of iterations taken by bundle adjustment to converge, \(\text {Iter}({\varvec{\mathbf {F}}})\).

These three criteria assess whether the estimates provided by the two methods, denoted by \({\varvec{\mathbf {F}}}_{8pt}\) and \({\varvec{\mathbf {F}}}_{Gp}\), are in a ‘good’ basin of attraction. Indeed, the number of iterations gives an indication of the distance between the estimate and the optimum, while \(e_{\text {BA}}({\varvec{\mathbf {F}}})\) gives an indication of the quality of the optimum (Fig. 3).
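For reference, a minimal numpy sketch (ours) of the cost underlying the first two criteria, evaluated for given cameras and 3D points:

```python
import numpy as np

def rms_reprojection_error(P1, P2, Q, q1, q2):
    """RMS reprojection error over both views: the bundle adjustment cost.

    P1, P2: (3, 4) projection matrices; Q: (n, 3) 3D points;
    q1, q2: (n, 2) measured image points."""
    def reproject(P, Q):
        Qh = np.hstack([Q, np.ones((len(Q), 1))])
        q = (P @ Qh.T).T
        return q[:, :2] / q[:, 2:]
    r = np.vstack([reproject(P1, Q) - q1, reproject(P2, Q) - q2])
    return np.sqrt(np.mean(np.sum(r ** 2, axis=1)))
```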

4.2 Evaluation Method

Figure 3 summarizes our evaluation method. The initial projective cameras are computed using [42], while the initial 3D points are calculated using [43]. The two-view uncalibrated bundle adjustment used is described in [19].

Fig. 3: Evaluation method

Fig. 4: Projection of the cube in the camera at the initial position \(\mathsf {(I)}\) and in the camera after applying the rigid transformation \(\left[ {\varvec{\mathbf {R}}}_1 \; {\mathsf {\mathbf{{{t}} }}}_1\right] \) \(\mathsf {(II)}\) and the rigid transformation \(\left[ {\varvec{\mathbf {R}}}_2\; {\mathsf {\mathbf{{{t}} }}}_2\right] \) \(\mathsf {(III)}\)

4.3 Experiments on Simulated Data

4.3.1 Simulation Procedure

For each simulation series and for each parameter of interest (noise, number of points and number of motions), the same methodology is applied with the following four steps:

  1. For two given motions between two successive images (\(\left[ {\varvec{\mathbf {R}}}_k \; {\mathsf {\mathbf{{{t}} }}}_k \right] \)) and for a given matrix \({\varvec{\mathbf {K}}}\) of internal parameters, a set of 3D points \(({\varvec{\mathbf {Q}}}_i)_{i}, \;i=1,\ldots ,n\) is generated and two projection matrices \({\varvec{\mathbf {P}}}\) and \({\varvec{\mathbf {P}}}'\) are defined. In practice, the rotation matrices \({\varvec{\mathbf {R}}}_1\) and \({\varvec{\mathbf {R}}}_2\) of the two motions are defined by:

    $$\begin{aligned} {\varvec{\mathbf {R}}}_k\mathop {=}\limits ^{\vartriangle } \left[ \begin{array}{c@{\quad }c@{\quad }c} \cos (\theta _k) &{} 0 &{} \sin (\theta _k)\\ 0 &{} 1 &{} 0\\ -\sin (\theta _k) &{} 0 &{} \cos (\theta _k) \\ \end{array} \right] \text { with } \left\{ \begin{array}{l} \theta _1 = \dfrac{\pi }{3} \\ \text { and }\\ \theta _2 = \dfrac{\pi }{6} \\ \end{array} \right. \end{aligned}$$
    (34)

    and their translation vectors by \({\mathsf {\mathbf{{{t}} }}}_1=(20,0,5)^\top \) and \({\mathsf {\mathbf{{{t}} }}}_2=(6,0,0)^\top \). These matrices are chosen such that \(\left[ {\varvec{\mathbf {R}}}_1, {\mathsf {\mathbf{{{t}} }}}_1 \right] \) is a large movement and \(\left[ {\varvec{\mathbf {R}}}_2, {\mathsf {\mathbf{{{t}} }}}_2 \right] \) is a small movement (see Fig. 4). We simulated points lying in a cube with a 10 m side length. The first camera looks at the center of the cube and is located 15 m from it. The focal length of the camera is 700 pixels and the resolution is \(640\times 480\) pixels.

  2. Using the projection matrices \({\varvec{\mathbf {P}}}={\varvec{\mathbf {K}}}\left[ {\varvec{\mathbf {R}}}_1 ,{\varvec{\mathbf {t}}}_1\right] \) and \({\varvec{\mathbf {P}}}'={\varvec{\mathbf {K}}}\left[ {\varvec{\mathbf {R}}}_2 ,{\varvec{\mathbf {t}}}_2\right] \), the set of 3D points \(({\mathsf {\mathbf{{{Q}} }}}_i)_{i}\) is projected into the two images as \(({\mathsf {\mathbf{{{q}} }}}_i,{\mathsf {\mathbf{{{q}} }}}'_i)_{i}\), and centered Gaussian noise with variance \(\sigma ^2\) is added to each pixel coordinate (see the sketch after this list). In order to have statistical evidence, the results are averaged over 100 trials.

  3. The resulting noisy points \((\widetilde{{\mathsf {\mathbf{{{q}} }}}_i},\widetilde{{\mathsf {\mathbf{{{q}} }}}'_i})_{i}\) are used to estimate \({\varvec{\mathbf {F}}}\) by our method (\({\varvec{\mathbf {F}}}_{Gp}\)) and by the reference 8-point method (\({\varvec{\mathbf {F}}}_{8pt}\)).

  4. Finally, via our evaluation procedure, we evaluate the estimation error with respect to the noise standard deviation \(\sigma \) and the number of points \(n\).
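A condensed numpy sketch of one trial of this simulation follows; placing the principal point at the image center is our assumption, since the text only specifies the focal length and the resolution:

```python
import numpy as np

rng = np.random.default_rng(0)

def rot_y(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

# Internal parameters: f = 700 px, 640 x 480 images (principal point assumed centered).
K = np.array([[700, 0, 320], [0, 700, 240], [0, 0, 1.0]])
Q = rng.uniform(-5, 5, size=(50, 3))       # 3D points in a 10 m cube around the origin

# P = K [R1 t1] and P' = K [R2 t2], with the two motions of (34).
P1 = K @ np.hstack([rot_y(np.pi / 3), np.array([[20.0], [0], [5]])])
P2 = K @ np.hstack([rot_y(np.pi / 6), np.array([[6.0], [0], [0]])])

def project(P, Q, sigma):
    Qh = np.hstack([Q, np.ones((len(Q), 1))])
    q = (P @ Qh.T).T
    q = q[:, :2] / q[:, 2:]
    return q + rng.normal(0.0, sigma, q.shape)   # centered Gaussian pixel noise

q1, q2 = project(P1, Q, np.sqrt(0.5)), project(P2, Q, np.sqrt(0.5))  # variance 0.5
```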

Fig. 5: For the two movements \(\left[ {\varvec{\mathbf {R}}}_1 \; {\mathsf {\mathbf{{{t}} }}}_1 \right] \) (left column) and \(\left[ {\varvec{\mathbf {R}}}_2 \; {\mathsf {\mathbf{{{t}} }}}_2 \right] \) (right column), reprojection errors and number of iterations measured against image noise

4.3.2 Sensitivity to Noise

We tested in two simulation series the influence of \(\sigma \), ranging from 0 to 2 pixels. The number of simulated points is \(50\). The first (resp. second) simulation series is based on the first motion \(\left[ {\varvec{\mathbf {R}}}_1 \; {\mathsf {\mathbf{{{t}} }}}_1 \right] \) (resp. the second motion \(\left[ {\varvec{\mathbf {R}}}_2 \; {\mathsf {\mathbf{{{t}} }}}_2 \right] \)). Figure 5 gathers the influence of noise on the evaluation criteria. The first line shows the reprojection errors before, \(e_{\text {Init}}({\varvec{\mathbf {F}}})\), and after, \(e_{\text {BA}}({\varvec{\mathbf {F}}})\), refinement through bundle adjustment with respect to the noise standard deviation. The second line shows the number of iterations \(\text {Iter}({\varvec{\mathbf {F}}})\) of the bundle adjustment versus the noise standard deviation. The first (resp. second) column concerns the first (resp. second) motion.

Fig. 6: For the two movements \(\left[ {\varvec{\mathbf {R}}}_1 \; {\mathsf {\mathbf{{{t}} }}}_1 \right] \) (left column) and \(\left[ {\varvec{\mathbf {R}}}_2 \; {\mathsf {\mathbf{{{t}} }}}_2 \right] \) (right column), reprojection errors and number of iterations measured against the number of points, for Gaussian noise with variance fixed to 0.5

Fig. 7: For the movement \(\left[ {\varvec{\mathbf {R}}}_1 \; {\mathsf {\mathbf{{{t}} }}}_1 \right] \), reprojection errors and number of iterations measured against the number of points, for Gaussian noise with variance fixed to 1 (left and right)

Fig. 8: Reprojection error before (\(e_{\text {Init}}({\varvec{\mathbf {F}}})\)) and after bundle adjustment (\(e_{\text {BA}}({\varvec{\mathbf {F}}})\)), number of iterations (\(\text {Iter}({\varvec{\mathbf {F}}})\)), and CPU time to compute \({\varvec{\mathbf {F}}}\) (Time), obtained when combining pairs of images to obtain epipoles close to the images or toward infinity

Fig. 9: Reprojection error before (\(e_{\text {Init}}({\varvec{\mathbf {F}}})\)) and after bundle adjustment (\(e_{\text {BA}}({\varvec{\mathbf {F}}})\)), number of iterations (\(\text {Iter}({\varvec{\mathbf {F}}})\)), and CPU time to compute \({\varvec{\mathbf {F}}}\) (Time), obtained when combining pairs of images of the Library series

For the two motions, the reprojection errors \(e_{\text {Init}}({\varvec{\mathbf {F}}})\) and \(e_{\text {BA}}({\varvec{\mathbf {F}}})\) increase with the same slope as the noise level increases. Notice that, for both movements, the bundle adjustment step does not improve the results. Indeed, the Gaussian noise is added to the projections \(({\mathsf {\mathbf{{{q}} }}}_i,{\mathsf {\mathbf{{{q}} }}}'_i)_{i}\); this is the kind of noise which, in practice, would be produced by the point extraction process. The solution produced by solving the linear system is therefore very close to the optimum and hardly needs to be refined: the initial solution provided by the triangulation step is very close to a local minimum of the bundle adjustment problem. Moreover, the reprojection errors before (\(8pt-Init\) and \(Gp-Init\)) and after (\(8pt-BA\) and \(Gp-BA\)) bundle adjustment vary linearly with the noise standard deviation. However, the number of iterations needed for convergence differs between the two methods. The initial estimate of the triangulation computed from \({\varvec{\mathbf {F}}}_{Gp}\) is closer to the local minimum than that obtained from \({\varvec{\mathbf {F}}}_{8pt}\). For the first motion (large displacement between cameras 1 and 2), the number of iterations of the global method (in green) remains smaller than for the 8-point method (in blue), even though the difference seems to decrease when the noise level is high \((\sigma >1)\). For a significant displacement, the quality of the estimate \({\varvec{\mathbf {F}}}\) by the global method remains better, even though the difference in quality diminishes with the noise level. Conversely, for the second motion (small displacement between cameras 1 and 2), both methods are equivalent, since the difference in quality is only significant for a high level of noise (\(\sigma >1\)). This is logical as the movement is smaller. As a conclusion, the 8-point method provides a solution equivalent to that of the global method when the displacement is small. For more significant movements, the solution it provides is not as close, even though it is still in the basin of attraction of the same local minimum.

4.3.3 Influence of the Number of Points

In this experiment, we kept the noise level constant, with the variance fixed to \(\sigma ^2=0.5\). We tested the influence of the number of matches \(({\mathsf {\mathbf{{{q}} }}}_i,{\mathsf {\mathbf{{{q}} }}}'_i)_{i}\) on the quality of the resulting estimate of \({\varvec{\mathbf {F}}}\). The number of points \(N\) varied from 10 to 100. Two simulation series were again carried out with the two motions.

Figure 6 gathers the evaluation criteria with the same organization as before. It displays the influence of the number of matches used for estimating \({\varvec{\mathbf {F}}}\) on the reprojection errors and on the number of iterations. For both motions and for a sufficiently high number of matches (\(N>50\)), the reprojection errors, before and after refinement with bundle adjustment, and the number of iterations all converge to the same asymptote. For a high number of matches, the initial estimates from triangulation computed with \({\varvec{\mathbf {F}}}_{8pt}\) and with \({\varvec{\mathbf {F}}}_{Gp}\) are thus both in the same basin of attraction of the bundle adjustment problem. However, for fewer than \(50\) matches, the number of iterations needed to converge is smaller for similar reprojection errors, and the quality of the estimate provided by the global method seems better: the initial estimate from triangulation computed with \({\varvec{\mathbf {F}}}_{8pt}\) drifts away from the basin of convergence, whereas the one computed with \({\varvec{\mathbf {F}}}_{Gp}\) remains inside it.

4.3.4 Influence of the Number of Points with Wide Baseline

To confirm the previous behavior, the influence of the number of matching points on the reprojection errors and on the number of iterations was tested for a noise variance of \(\sigma ^2=1\) and the significant displacement \(\left[ {\varvec{\mathbf {R}}}_1 \; {\mathsf {\mathbf{{{t}} }}}_1 \right] \). In this difficult context, Fig. 7 demonstrates that the initial estimate computed from \({\varvec{\mathbf {F}}}_{Gp}\) is always closer to the local minimum than that computed from \({\varvec{\mathbf {F}}}_{8pt}\): whatever the number of matching points, the number of iterations needed to converge is always smaller.

Fig. 10: Reprojection error before (\(e_{\text {Init}}({\varvec{\mathbf {F}}})\)) and after bundle adjustment (\(e_{\text {BA}}({\varvec{\mathbf {F}}})\)), number of iterations (\(\text {Iter}({\varvec{\mathbf {F}}})\)), and CPU time to compute \({\varvec{\mathbf {F}}}\) (Time), obtained when combining pairs of images of the Merton1 series

Fig. 11: Reprojection error before (\(e_{\text {Init}}({\varvec{\mathbf {F}}})\)) and after bundle adjustment (\(e_{\text {BA}}({\varvec{\mathbf {F}}})\)), number of iterations (\(\text {Iter}({\varvec{\mathbf {F}}})\)), and CPU time to compute \({\varvec{\mathbf {F}}}\) (Time), obtained when combining pairs of images of the Merton2 series

Fig. 12: Reprojection error before (\(e_{\text {Init}}({\varvec{\mathbf {F}}})\)) and after bundle adjustment (\(e_{\text {BA}}({\varvec{\mathbf {F}}})\)), number of iterations (\(\text {Iter}({\varvec{\mathbf {F}}})\)), and CPU time to compute \({\varvec{\mathbf {F}}}\) (Time), obtained when combining pairs of images of the Merton3 series

As a conclusion, the quality of the solutions obtained by both methods is almost identical when the movement is not too large, the number of matching points is sufficiently high, and the noise level is not too strong. However, when one of these three parameters deteriorates, the 8-point method lacks precision, whereas the global method still allows bundle adjustment to converge to the global minimum. This behavior is consistent with the nature of the two methods: the 8-point method computes the projection of an unconstrained minimizer onto the feasible set, whereas the global method provides a global minimizer of the constrained optimization problem. It is in fact remarkable that, for favorable values of the three parameters, the two solutions are not far apart; there is no reason to expect this to hold for worse values.

Fig. 13: Reprojection error before (\(e_{\text {Init}}({\varvec{\mathbf {F}}})\)) and after bundle adjustment (\(e_{\text {BA}}({\varvec{\mathbf {F}}})\)), number of iterations (\(\text {Iter}({\varvec{\mathbf {F}}})\)), and CPU time to compute \({\varvec{\mathbf {F}}}\) (Time), obtained when combining pairs of images of the Cylinder series. The matched points are located in blue bounding boxes

Fig. 14: Reprojection error before (\(e_{\text {Init}}({\varvec{\mathbf {F}}})\)) and after bundle adjustment (\(e_{\text {BA}}({\varvec{\mathbf {F}}})\)), number of iterations (\(\text {Iter}({\varvec{\mathbf {F}}})\)), and CPU time to compute \({\varvec{\mathbf {F}}}\) (Time), obtained when combining pairs of images of the Endoscope series

Fig. 15: Initial reprojection errors measured against movement amplitude for the Dinosaur series

Fig. 16: Final reprojection errors measured against movement amplitude for the Dinosaur series

Fig. 17: Number of iterations performed by bundle adjustment to converge, measured against movement amplitude for the Dinosaur series

Fig. 18: Initial reprojection errors measured against movement amplitude for the House series

Fig. 19: Final reprojection errors measured against movement amplitude for the House series

Fig. 20: Number of iterations performed by bundle adjustment to converge, measured against movement amplitude for the House series

4.4 Experiments on Real Data

The evaluation criteria remain the same, \(e_{\text {Init}}({\varvec{\mathbf {F}}}),\, e_{\text {BA}}({\varvec{\mathbf {F}}})\) and \(\text {Iter}({\varvec{\mathbf {F}}})\), with the computation time added. Two experiments were carried out with two sets of images that illustrate different motions between two successive images.

4.4.1 Experiment 1

The first set of four images (see Fig. 8) covers all possible epipolar configurations (right or left epipole at infinity, etc.). With four images, six motions between pairs of images are possible: \(\mathsf {A}\)-\(\mathsf {B}\), \(\mathsf {A}\)-\(\mathsf {C}\), \(\mathsf {A}\)-\(\mathsf {D}\), \(\mathsf {B}\)-\(\mathsf {C}\) (infinite epipoles correspond, e.g., to pure translation motions), \(\mathsf {B}\)-\(\mathsf {D}\) and \(\mathsf {C}\)-\(\mathsf {D}\). For every pair of images, 60 matches are available to compute an estimate of \({\varvec{\mathbf {F}}}\). The values of the evaluation criteria are summarized in Fig. 8. No matter which pair of images is used, the reprojection errors and the number of iterations are almost always better when \({\varvec{\mathbf {F}}}_{Gp}\) is used as the initial guess. In addition, for three motions (\(\mathsf {A}\)-\(\mathsf {C}\), \(\mathsf {A}\)-\(\mathsf {D}\) and \(\mathsf {C}\)-\(\mathsf {D}\)), the initial guess from the 8-point method, in contrast with \({\varvec{\mathbf {F}}}_{Gp}\), does not lie in as good a basin of attraction. This may explain why the initial reprojection errors \(e_{\text {Init}}({\varvec{\mathbf {F}}})\) are sometimes larger for \({\varvec{\mathbf {F}}}_{Gp}\): its initial guess may lie in a better basin of attraction, yet with a larger initial reprojection error. For the three other motions (\(\mathsf {A}\)-\(\mathsf {B}\), \(\mathsf {B}\)-\(\mathsf {C}\) and \(\mathsf {B}\)-\(\mathsf {D}\)), both initializations are in the same basin of attraction, but the number of iterations demonstrates that the initial guess from the global method is always closer to the local minimizer. Finally, even though the computation time of the latter is significantly larger than for the 8-point method, it remains compatible with practical use.

4.4.2 Experiment 2

The second experiment compares the two methods on large motions, using several series of images. First, we test our algorithm on the classic Library, Merton, Dinosaur and House series, available at www.robots.ox.ac.uk/~vgg/data/data-mview.html. For the sets of three images of the Library and Merton series, Figs. 9, 10, 11 and 12 demonstrate that the quality of the solution achieved by the global method is always better than with the 8-point method (in some cases both solutions are very close).

We also conducted the same tests on other pairs of images. For the first pair, we used images of a standard cylinder graciously provided by the company NOOMEO. This cylinder is used to evaluate the accuracy of 3D reconstructions. Matched points are computed with a digital image correlation method and are located in a window inside the cylinder. We thus have 6609 pairs \(({\mathsf {\mathbf{{{q}} }}}_i,{\mathsf {\mathbf{{{q}} }}}'_i)_i\) matched with sub-pixel precision. Results are presented in Fig. 13. We observe that the computation time of the 8-point method exceeds one second. This is due to the large number of matched points, which leads to the resolution of a large linear system. However, as the points are precisely matched, this system is well conditioned. Still, the quality of the fundamental matrix estimated with the 8-point method is not sufficient to properly initialize bundle adjustment, since the final reprojection error is 1.47 pixels. At the same time, even if the number of iterations is larger, our global method supplies a good estimate, since the final reprojection error is 0.25 pixels. Furthermore, the calculation time remains constant at approximately 2 seconds. For the second pair, we used images taken by an endoscope. Figure 14 shows the results obtained in this difficult case. As for the previous example, the fundamental matrix estimated by our global method is of good quality, with a final error of 0.93 pixels, whereas bundle adjustment takes more iterations to converge to a less accurate solution when initialized with \({\varvec{\mathbf {F}}}_{8pt}\).

For the set of 36 images of the Dinosaur series and 9 images of the House series, we tested the influence of the motion amplitude between a pair of images on the quality of the estimates obtained by both methods. For this purpose, we computed both estimates for all possible motions \(((0,1), (1,2), (2,3),\ldots )\) with a 1-image distance, then all possible motions \(((0,2), (1,3),(2,4),\ldots )\) with a 2-image distance, and so on. With this process, we can measure the influence of the average angle on the quality of the fundamental matrix estimated by both methods. Figures 15, 16, 17, 18, 19 and 20 show the average reprojection errors and the average number of iterations with respect to the average angle for the two series. The reprojection error after bundle adjustment is always smaller with the global method, and always with fewer iterations. Moreover, the larger the movement, the more the solutions of both methods deteriorate, but the deterioration is larger for the 8-point method than for the global method. One may also observe that, in some cases, the reprojection error before bundle adjustment is in favor of the 8-point method. In analogy with the real cases studied before, this may be due to the fact that the initial guess \({\varvec{\mathbf {F}}}_{Gp}\) lies in a basin of attraction with a better local minimum than the one associated with \({\varvec{\mathbf {F}}}_{8Pt}\), but the ‘distance’ between the initial guess and the corresponding local minimizer is larger for \({\varvec{\mathbf {F}}}_{Gp}\) than for \({\varvec{\mathbf {F}}}_{8Pt}\). Indeed, in such cases the number of iterations is larger for \({\varvec{\mathbf {F}}}_{Gp}\) than for \({\varvec{\mathbf {F}}}_{8Pt}\).

5 Conclusion

We have studied the problem of globally estimating the fundamental matrix over its nine parameters, under rank and normalization constraints. We have proposed a polynomial-based approach which estimates the fundamental matrix with good precision. More generally, we have shown how to formulate the constraints so that a numerical certificate of optimality can be obtained together with fast and robust convergence. The method converges in a reasonable amount of time compared to other global optimization methods.

From computational experiments conducted on both simulated and real data, we conclude that the global method always provides an accurate initial estimate for the subsequent bundle adjustment step. Moreover, we have shown that although the eight-point method has a lower computational cost, its estimate often lies further away from the global optimum obtained with the global method.