Verifying Global Minima for L 2 Minimization Problems in Multiple View Geometry

Hartley, Richard; Kahl, Fredrik; Olsson, Carl; Seo, Yongduek

doi:10.1007/s11263-012-0569-9

Verifying Global Minima for L ₂ Minimization Problems in Multiple View Geometry

Published: 04 October 2012

Volume 101, pages 288–304, (2013)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

International Journal of Computer Vision Aims and scope Submit manuscript

Verifying Global Minima for L ₂ Minimization Problems in Multiple View Geometry

Download PDF

Richard Hartley¹,
Fredrik Kahl²,
Carl Olsson² &
…
Yongduek Seo³

1240 Accesses
12 Citations
Explore all metrics

Abstract

We consider the least-squares (L2) minimization problems in multiple view geometry for triangulation, homography, camera resectioning and structure-and-motion with known rotation, or known plane. Although optimal algorithms have been given for these problems under an L-infinity cost function, finding optimal least-squares solutions to these problems is difficult, since the cost functions are not convex, and in the worst case may have multiple minima. Iterative methods can be used to find a good solution, but this may be a local minimum. This paper provides a method for verifying whether a local-minimum solution is globally optimal, by providing a simple and rapid test involving the Hessian of the cost function. The basic idea is that by showing that the cost function is convex in a restricted but large enough neighbourhood, a sufficient condition for global optimality is obtained.

The method is tested on numerous problem instances of real data sets. In the vast majority of cases we are able to verify that the solutions are optimal, in particular, for small to medium-scale problems.

A Consistently Fast and Globally Optimal Solution to the Perspective-n-Point Problem

Match Selection and Refinement for Highly Accurate Two-View Structure from Motion

Minimal Solutions for Pose Estimation of a Multi-Camera System

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Modern methods for the solution of geometric reconstruction problems in Computer Vision may be viewed as starting with the paper of Longuet-Higgins (1981). However, after nearly three decades of research, the problem of finding guaranteed optimal least-squares (L ₂) solutions to such problems is not solved. In fact, papers such as Nistér et al. (2007a, 2007b) suggest that such algorithms do not exist.

Even for the simplest of geometric vision problems, the triangulation problem, no ideal method has been given that guarantees an optimal least-squares solution. This paper, however, tries a totally different approach, by giving a procedure for verifying whether a solution obtained through standard methods such as bundle-adjustment actually is the global optimum. Usually, in fact, it is. Though not useful for all geometric Vision problems, the method given in this paper is applicable to a substantial number of the common reconstruction problems, as will be shown theoretically and by experiment in this paper.

Perhaps the simplest of all 3D reconstruction problems is the triangulation problem, which nevertheless shares many of the difficulties of more complex tasks. We begin our description of our approach by considering the triangulation problem, before generalizing later to a wider class of problems. The condition we provide for the verification is a sufficient but not necessary condition for the solution to be a global optimum. However, for triangulation it works in almost all cases. In the rare cases where the condition fails it is usually because the point has large noise, in which case in a large-scale reconstruction problem, the best option is just to remove the point from consideration. Alternatively it may be possible to apply one of the recent (considerably more time-consuming) optimal algorithms (Lu and Hartley 2007; Kahl et al. 2008) for solving the problem; see also the survey (Hartley and Kahl 2007).

How Hard is the Triangulation Problem, Really?

It was shown in Hartley and Sturm (1997) that the least-squares two-view triangulation problem is solvable in closed form. However, up to three local minima may exist. Much more recently, it has been shown (Stewenius et al. 2005) that the solution for three views involves the solution of a polynomial of degree 47, and higher degree polynomials are required with more views. The degree of the polynomial involved in the solution translates into numbers of possible local minima. It is certainly possible to find explicit examples with multiple local minima (Hartley and Sturm 1997). This suggests that the problem is difficult.

On the other hand, it was shown in Hartley and Schaffalitzky (2004) that a single local (and hence global) minimum occurs if an L _∞ (minimax) solution is sought, instead of the least-squares solution. However, the least-squares problem remains the problem of primary interest. The L _∞ solution is useful as an initialization procedure for the least squares, but it does not guarantee an optimal L ₂ solution. In a real data test it seems that common algorithms get the right answer most of the time. So, perhaps the problem is not so hard after all—if only one knows whether one has the right solution. That problem is effectively solved in this paper.

Working algorithms with practical accuracy have been reported for geometric vision problems since the eight-point algorithm for two view reconstruction (Hartley and Zisserman 2004). Although these known methods do not guarantee a globally optimal solution, nevertheless, simple methods based on initialization, followed by iterative refinement usually work very well. They depend on the initial solution being within the basin of attraction of the optimal solution. However, until now there has been no way of verifying this requirement. In this paper, we address this problem and give a fast and very satisfactory method of verifying that a given local minimum is the global minimum. By experiments on a very large set of triangulation problems, the test is seen to have a 99.9 % success rate and averages 0.5 ms per triangulated point on a modern 2.66 GHz laptop computer, substantially faster for small problems.

The new technique for proving convexity and thus the global optimality is applicable to not only triangulation but also other geometric vision problems. We show that the same analysis for the triangulation problem applies to any optimization problem with quasi-convex residual functions (Ke and Kanade 2007; Kahl and Hartley 2008), since the location of the global minimum can be constrained to a convex set.

The class of problems that our framework covers is the same as the quasiconvex problems in the L _∞-norm (Ke and Kanade 2007; Kahl and Hartley 2008), which includes n-view triangulation, homography estimation, camera resectioning to name a few.

Our main contributions may be listed as follows:

An easy and fast method to verify the global optimality of the solution under the L ₂ norm for a whole class of multiview geometry problems.
We perform an extensive set of experiments on several data sets to explore the limitations of the method. Results are given for triangulation, homography estimation, camera resectioning and structure and motion with known rotations. Our main conclusion is that in most cases the basic verification test succeeds, especially for small scale problems.

2 Triangulation—Simple Convexity Test

We will derive conditions for optimality of the solutions for a class of cost functions encountered in the solution of geometric vision problems. These are the cost functions composed of sum of squares of measurement errors, where each error term is a quasi-convex cost function of Second Order Cone Programming (SOCP) type (Kahl and Hartley 2008). This type of cost function is encountered in many different problems, many of which are listed in the survey (Hartley and Kahl 2007). However, to allow a simple introduction to the techniques introduced in this paper, we will first apply them to the triangulation problem, which has been shown to be representative of the more general problems considered later.

2.1 Problem Description

We consider a calibrated camera model and think of image points as being represented by points on a sphere, that is unit vectors, rather than points in an image plane. The triangulation problem is characterized by a set of unit measurement vectors w _i based at projection centres C _i, all pointing to a common point X in space, which is to be determined. Thus, nominally X=C _i+d _i w _i, where d _i is the depth of the point X with respect to the i-th projection.

With error in the measurements, the rays from centre C _i in direction d _i do not exactly intersect, so the required point X is the one that minimizes the sum of squares error. The image error is normally the angular difference between a measured direction w and the direction vector from the camera centre to the point X. For simplification, our cost function will be the sum of squares of the tangent of the angular error, rather than the squared angle itself. Since errors are usually quite small, the difference is insignificant, and the analysis is simpler. Eventually, we will extend the result to minimization of squared angle error (see Appendix A).

Considered more generally, the desired solution is a point X _opt that represents a global minimum of the cost function $C(\mathbf{X}) = \sum_{i=1}^{N} f^{2}_{i}(\mathbf{X})$, where $f^{2}_{i}$ represents a squared residual error measurement in the i-th of N images. If we use the tangent of the angle to measure the residual error, then the total cost function is given by

(1)

Bounding the Solution

Now, suppose that X ₀ is a proposed solution to this triangulation problem with residual given by $C(\mathbf{X}_{0}) = \sum_{i=1}^{N} f^{2}_{i}(\mathbf{X}_{0}) = \epsilon^{2}$. This observation gives a constraint on the position of the optimal solution, which must satisfy

$$ C( \mathbf{X}_{\mathrm {opt}}) = \sum_{i=1}^N f^2_i( \mathbf{X}_{\mathrm {opt}}) \le\epsilon^2 . $$

(2)

Since the sum of terms $f^{2}_{i}( \mathbf{X}_{\mathrm {opt}})$ must be less than ϵ ², so must each individual term. Thus, for all i, $f^{2}_{i}( \mathbf{X}_{\mathrm {opt}}) \le\epsilon^{2}$. The set of points X in ℝ³ satisfying this condition for some i consists of those points that map into a circular neighbourhood of the image point. This constitutes a cone in ℝ³, as observed in Hartley and Schaffalitzky (2004). Since this condition must be satisfied by each of the projections, it follows that X _opt must lie in the intersection of all the cones. This is a convex region of ℝ³, since each cone is a convex set.

Let us define the convex set $\mathcal {D}$ of points for the optimal solution as

$$ \mathcal {D}= \bigl\{\mathbf{X} \, | \, f^2_i( \mathbf{X}) \le\epsilon^2; \, {\mathbf{w}_i}^{\top}(\mathbf{X} - \mathbf{C}_i) \ge0,\ i=1,\ldots,N\bigr\}, $$

(3)

where the second condition w _i ^⊤(X−C _i)≥0 restricts the domain to points in front of the camera.

Our overall strategy of finding X _opt (or verifying X ₀=X _opt) is to provide tests that allow us to prove that the cost function must be convex on this convex region. If this is so, then finding the optimal solution X _opt may be carried out using standard convex optimization methods. More significantly, if X ₀ already is a local minimum of the cost function, found by any geometric optimization technique, then it must be a global minimum.

The Hessian of the Cost Function

A sufficiently differentiable function is convex on a convex region if and only if its Hessian is positive semi-definite. Hence, we are led in this section to consider the Hessian of the cost function. The total cost function is a sum of several different terms, one for each measurement. We will first consider a single term in the cost function, corresponding to a single image measurement.

To begin with, we consider the Hessian expressed in a suitable camera-centred coordinate frame. Consider a vector w pointing from the origin in the direction of the z-axis, and let X=(x,y,z) be a point lying close to the positive z axis, such that the vector from the origin to X makes an angle ϕ from the vector w. Consider the error function given by

$$ f^2(x, y, z) = \tan^2 \phi= \bigl(x^2 + y^2\bigr) / z^2 . $$

(4)

This represents the squared projection error of the point X with respect to the “measured direction”, w.

The Hessian matrix of f ² with respect to x, y and z is easily computed to be

(5)

(6)

where we introduce the notation τ _x, τ _y, τ and d defined by the terms in corresponding position. Note that $\tau^{2} = \tau_{x}^{2} + \tau_{y}^{2}$, and d represents the depth of the point in the viewing direction. Neglecting the constant factor 2/d ², the eigenvalues of this matrix are easily computed to be 1 and $a \pm\sqrt{a^{2} + \tau^{2}}$, where a=(3τ ²+1)/2. There are two positive and one negative eigenvalue.

A Bound on the Hessian

As the point X=(x,y,z) moves over the region of interest $\mathcal {D}$, the Hessian changes. It is our purpose to find a lower bound for the Hessian matrix for all points X in $\mathcal {D}$.

Consider the matrix given by

$$ \mathtt{H}' = \frac{2}{d^2} \left [ \arraycolsep=5pt \begin{array}{@{}ccc@{}} 1/3 & 0 & 0 \\ 0 & 1/3 & 0 \\ 0 & 0 & -3 \tau^2 \end{array} \right ]. $$

(7)

The reasoning that led to the choice of this matrix H′ will be explained later, but its properties are simple enough. The key property of this matrix is the observation that

$$\mathtt{H} - \mathtt{H}' = \frac{4}{3d^2} \left [ \arraycolsep=5pt \begin{array}{@{}ccc@{}} 1 & 0 & -3\tau_{x} \\ 0 & 1 & -3\tau_{y} \\ -3\tau_{x} & -3\tau_{y} & 9 \tau^2 \end{array} \right ] $$

is positive semi-definite. Indeed, it is easily seen that H−H′ has eigenvalues (4/3d ²) times the following:

All eigenvalues are non-negative. We write H−H′⪰0, or H⪰H′. Matrix H′ is a lower bound for the Hessian.

The eigenvalues of H′ are obviously the diagonal entries. We see that H′ has two positive and one negative eigenvalue. The eigenvector corresponding to the negative eigenvalue is directed along the z axis, and the other two eigenvectors are in the plane perpendicular to the z axis. However, since the two corresponding eigenvalues are equal, the eigenvectors may be taken as any two orthogonal vectors in the xy plane.

The matrix H′ is more easily handled than H, since it is dependent on the point X only through the values of d and τ, both of which can be bounded, as we shall see. We may write H′ as

$$\mathtt{H}' = \bigl(2/3d^2\bigr) \operatorname {diag}\bigl(1,1,- 9 \tau^2\bigr) . $$

We assume that the point X lies in a cone with angle arctanτ _max. Thus, τ≤τ _max, and we see that

Note that this matrix is a lower bound (in the semi-definite partial ordering) for the Hessian of f at any point X lying in the cone. It depends only on the depth d of the point from the vertex of the cone.

Generally Placed Cone

The above computation was carried out for the case where the vector w points directly along the z axis. We now consider a cone with axis represented by an arbitrary unit vector w and with vertex at a point C. This corresponds to rotating the camera by R to align the Z direction to w, followed by a translation of the origin. The depth d(X) of a point X relative to the vertex is given by w ^⊤(X−C). The Hessian for the function (4) can be transformed easily to this rotated coordinate frame, according to

$$ \mathtt{H} \mapsto \mathtt{R}\mathtt{H} \mathtt{R}^{\top }\succeq \frac{2}{3 d(\mathbf{X})^2} \bigl( \mathbf{u} \mathbf{u}^{\top}+ \mathbf{v}\mathbf{v}^{\top}- 9 \tau _{\mathrm {max}}^2\mathbf{w}\mathbf{w}^{\top} \bigr) , $$

(8)

where u, v and w are the three columns of the rotation matrix. This gives a lower bound for the Hessian, depending only on the point X through its depth d(X) from the vertex.

Bounded Depths

We may remove this final dependency of the Hessian on the point X if we assume that the point X lies in a bounded domain. Thus, suppose the depth d(X) is bounded on the feasible domain $\mathcal {D}$, and that we can find bounds L and U such that for all $\mathbf{X} \in \mathcal {D}$,

$$ L \leq1/d(\mathbf{X}) \leq U . $$

(9)

In this case, one can find a bound on the Hessian that holds over the complete domain $\mathcal {D}$, and yet is independent of X, namely (8) and (9) together give

$$ \mathtt{H} \succeq (2/3) \bigl(L^2 \bigl(\mathbf{u} \mathbf{u}^{\top}+ \mathbf{v}\mathbf{v}^{\top}\bigr) -9 \,U^2\, \tau _{\mathrm {max}}^2 \mathbf{w}\mathbf{w}^{\top} \bigr), $$

(10)

where the inequality holds for the Hessian of f ² computed at any point in $\mathcal {D}$. Thus, the matrix on the right may be thought of as a lower bound for the Hessian evaluated in this region.

Summing the Hessians

Now, we consider a point X, subject to several measurements, represented by vectors w _i. We do not care where the vertex of the cone (corresponding to the camera centre) is located, but only that the depth of the point X in the i-th cone is d _i. We suppose that the point X is situated in the intersection of cones with angle arctanτ _max. Let $f^{2}_{i}(\mathbf{X})$ be $\tau_{i}^{2}$ where arctanτ _i is the angle of X from the axis of the i-th cone. The L ₂ error associated with the point X is given by $f^{2}(\mathbf{X}) = \sum_{i} f^{2}_{i} (\mathbf{X})$ and the Hessian of f ² is H=∑_i H _i. Now applying the inequality (8) to each H _i, we get

$$ \mathtt{H} \succeq \sum_i \bigl(2/3d_i(\mathbf{X})^2\bigr) \bigl( \mathbf{u}_i {\mathbf{u}_i}^{\top}+ {\mathbf{v}_i \mathbf{v}_i}^{\top} - 9\tau _{\mathrm {max}}^2\mathbf{w}_i {\mathbf{w}_i}^{\top} \bigr). $$

(11)

Although each of the individual summands H _i in this expression cannot be positive semi-definite, we hope that by adding up the contributions for different i, the positive eigenvalues will dominate the negative eigenvalues, resulting in a positive definite or semi-definite matrix.

Summarizing the previous discussion, we may state our basic convexity theorem for the 3D triangulation problem.

Theorem 1

Let the convex domain $\mathcal {D}$ be contained in the intersection of a set of N cones with vertices C _i and with axes represented by unit vectors w _i, and angle bounded by arctanτ _max. Let u _i and v _i be unit vectors orthogonal to each other and to w _i. For a point $\mathbf{X} \in \mathcal {D}$, let d _i(X)=w _i ^⊤(X−C _i) represent its depth from the vertex of the i-th cone in the direction w _i. Let U _i and L _i be upper and lower bounds for the value of 1/d _i(X) on $\mathcal {D}$. Then

$$ \sum_{i=1}^N \bigl( L_i^2 \bigl(\mathbf{u}_i { \mathbf{u}_i}^{\top}+ \mathbf{v}_i{ \mathbf{v}_i}^{\top}\bigr)-9U_i^2\, \tau _{\mathrm {max}}^2 \mathbf{w}_i {\mathbf{w}_i}^{\top} \bigr) \succeq \mathtt{0} $$

(12)

is a sufficient condition for the least-squares error function (1) to be convex on $\mathcal {D}$.

3 Weighted Depths

It is possible to obtain a better condition than that given in Theorem 1, as will be shown next. Observe that the summand in (12) consists of two parts, which are multiplied by $L_{i}^{2}$ and $U_{i}^{2}$ respectively. Whereas u _i u _i ^⊤+v _i v _i ^⊤ is positive semi-definite, $-9 \tau_{\mathrm{max}}^{2} \mathbf{w}_{i} {\mathbf{w}_{i}}^{\top}$ is negative semi-definite. In order for the total sum to be positive semi-definite, it is advantageous if the factor $U_{i}^{2}$ is kept as small as possible. Though U _i can be no smaller than L _i, it is best if the bounds L _i and U _i are as close to each other (tight) as possible.

Looked at another way, the condition (12) resulted from obtaining a lower bound for the Hessian over the whole region $\mathcal {D}$. Since the Hessian of each individual term $f_{i}^{2}$ scales as 1/d ² along any ray through C _i, any constant lower bound over the region $\mathcal {D}$ will not be a good bound if the depth of the region varies greatly, so that L _i and U _i are widely different. An alternative is to introduce some positive function α(X) and find a lower bound for the scaled Hessian α(X)H over the region $\mathcal {D}$. If this lower bound is positive definite, then so is H everywhere in $\mathcal {D}$.

Consider a positive function α(X) and suppose that there exist constants $L^{\alpha}_{i}$ and $U^{\alpha}_{i}$, such that the following conditions holds for all i, and all $\mathbf{X} \in \mathcal {D}$:

$$ L^{\alpha}_i \le \frac{\alpha(\mathbf{X})}{d_i(\mathbf{X})} \le U^{\alpha}_i . $$

(13)

Then, referring to (11), we find that

(14)

So to establish convexity it is sufficient to verify that

$$ \sum_i \bigl( \bigl({L^{\alpha}_i} \bigr)^2 \bigl( \mathbf{u}_i {\mathbf{u}_i}^{\top}+ \mathbf{v}_i{\mathbf{v}_i}^{\top}\bigr) - 9 \bigl({U^{\alpha}_i} \bigr)^2 \tau _{\mathrm {max}}^2 \mathbf{w}_i{\mathbf{w}_i}^{\top} \bigr) \succeq \mathtt{0}. $$

(15)

Observe that this condition is identical with (12) except that bounds L _i and U _i are replaced by $L^{\alpha}_{i}$ and $U^{\alpha}_{i}$. If these new bounds are tighter (that is more nearly equal) than (L _i,U _i), then we can expect a better result; we are more likely to be able to establish convexity using condition (15) than by using (12).

If $L^{\alpha}_{i}$ and $U^{\alpha}_{i}$ are not greatly different, this condition specifies that all the depths d _i(X) are approximately proportional to some “average” depth value α(X).

A useful choice for the function α(X) in the above discussion is the average depth of point X with respect to all views, that is

$$\alpha(\mathbf{X}) = (1/N) \sum_{i=1}^N d_i(\mathbf{X}) . $$

If the depth d _i(X) of a point is reasonably approximated by the average depth over all views, then α(X)/d _i(X) does not vary greatly over the domain $\mathcal {D}$, so the bounds $L^{\alpha}_{i}$ and $U^{\alpha}_{i}$ are close together. Even for domains $\mathcal {D}$ reaching to infinity, the ratio α(X)/d _i(X) will remain within reasonable maximum and minimum bounds, since the average depth α(X) increases to infinity at the same rate as any specific depth d _i. This is illustrated in Fig. 1.

Average depth is a good choice for the function α(X) in the case where all the points have approximately the same depth in all views. In the case when the depths are substantially different, then it may be better to choose a weighted depth average. For instance, let μ _i=d _i(X ₀), where $\mathbf{X}_{0}\in \mathcal {D}$ is a proposed solution to the problem. Then, a good choice for the function α(X) is

$$\alpha(\mathbf{X}) = \frac{1}{N} \sum_{i=1}^N \frac{d_i(\mathbf{X})}{\mu_i} . $$

We now may summarize the previous discussion by stating the general convexity theorem.

Theorem 2

With the notation as in Theorem 1, let α(X) be any positive real valued function defined on $\mathcal {D}$, and let $U^{\alpha}_{i}$ and $L^{\alpha}_{i}$ be upper and lower bounds for the value of α(X)/d _i(X) on $\mathcal {D}$. Then

$$ \sum_{i=1}^N \bigl( \bigl({L^{\alpha}_i} \bigr)^2 \bigl( \mathbf{u}_i {\mathbf{u}_i}^{\top}+ \mathbf{v}_i {\mathbf{v}_i}^{\top} \bigr) - 9 \bigl({U^{\alpha}_i} \bigr)^2\,\tau _{\mathrm {max}}^2 \mathbf{w}_i {\mathbf{w}_i}^{\top} \bigr) \succeq \mathtt{0} $$

(16)

is a sufficient condition for the least-squares error function (1) to be convex on $\mathcal {D}$.

3.1 Computing the Bounds

For conditions of this type to be useful, it is necessary that the bounds $L^{\alpha}_{i}$ and $U^{\alpha}_{i}$ can be calculated. We can find $U^{\alpha}_{i}$ by maximizing α(X)/d _i(X) for $\mathbf{X} \in \mathcal {D}$. Any value $U^{\alpha}_{i}$ no smaller than this maximum will be a suitable bound.

The region $\mathcal {D}$ of ℝ³ over which we need to minimize and maximize α(X)/d _i(X) is the intersection of second order cones. Using second order cones leads us into Second Order Cone Programming (SOCP), which for efficiency (and the non-availability of public domain code in C++) we wish to avoid. It is easier to allow $\mathcal {D}$ to be a polygonal region (if necessary by enlarging it slightly). This polygonal region can be expressed in terms of linear constraints, resulting in the use of Linear Programming to optimize α(X)/d _i(X). More details of this are given in the algorithm description in Sect. 3.2.

In the case α(X)=1, the problem reduces to finding the maximum and minimum values of depth on $\mathcal {D}$. Thus, let

$$B = \min_{\mathbf{X} \in \mathcal {D}} d_i(\mathbf{X}) = \min_{\mathbf{X} \in \mathcal {D}} {\mathbf{w}_i}^{\top}(\mathbf{X} - \mathbf{C}_i) . $$

Then $U^{\alpha}_{i}$ can be chosen as any value greater than 1/B. If the region $\mathcal {D}$ is polygonal, the value of B may be found by linear programming. If $\mathcal {D}$ is an intersection of cones, then we may use SOCP to find the minimum.

If α(X) is a linear function of the different depths, say $\alpha(\mathbf{X}) = \alpha_{0} + \sum_{j=1}^{n} \alpha_{j} d_{j}(\mathbf{X})$, then it is also possible to find the bounds $U^{\alpha}_{i}$ and $L^{\alpha}_{i}$. The required optimization problem to find $U^{\alpha}_{i}$ is

where the maximization takes place over all values of B and $\mathbf{X} \in \mathcal {D}$, and $U^{\alpha}_{i}$ is subsequently chosen as any value no smaller than B. In this case, however, writing d _i(X)=w _i ^⊤(X−C _i) results in a constraint involving both B and the point X; this is not a linear constraint. Nevertheless, for a fixed value of B, the constraint is linear, and instead of maximizing B, we may solve a feasibility problem to determine whether there exists X to satisfy the constraints. The problem can be rewritten, essentially equivalently, as

By a process of binary search over the range of B one may solve the optimization problem to determine the optimal B. This requires repeated solution of the feasibility problem. We write the problem in this way to emphasize that the required value of B is one in which the constraint problem is infeasible. It is of some importance not to stop the bisection process at a value B where the constraints are satisfied.

A good starting point is the point $\mathbf{X}_{0}\in \mathcal {D}$, which can be used to specify a value B for which the problem is feasible. Since we may not need a tight upper bound B, the bisection process may not need to continue to high accuracy. One may set $U^{\alpha}_{i}$ to some value of B for which the constraint problem is infeasible.

3.2 Algorithm

We give a summary of the algorithm for proving convexity of the cost function.

Consider a set of cameras with centres C _i, and let w _i be a direction vector representing the measured direction of an observed point X from C _i. Let u _i and v _i be two unit vectors orthogonal to w _i constituting (along with w _i) an orthogonal coordinate frame.

1.
Bounding the region. Let X be a 3D point constituting a potential solution to the triangulation problem. The cost of this point is the value of the cost function
(17)
Let the value of this cost for the given point X be ϵ ².

We then define a region of space in which the optimal point X _opt lies according to the inequalities.
$$ -\epsilon\le\frac{{\mathbf{u}_i}^{\top}( \mathbf{X}_{\mathrm {opt}}- \mathbf{C}_i)}{{\mathbf{w}_i}^{\top}( \mathbf{X}_{\mathrm {opt}}- \mathbf{C}_i)} \le \epsilon $$
(18)
and similar inequalities involving v _i instead of u _i. Since w _i ^⊤(X _opt−C _i)>0 (the cheirality constraint that the point must lie in the direction it is observed), we can multiply out by w _i ^⊤(X _opt−C _i) to obtain a total of four linear inequalities in the positions of the point X _opt, constraining it to a polyhedral region of space, $\mathcal {D}$.
2.
Finding depth bounds. The next step is to find minimum and maximum values of d _i on the region $\mathcal {D}$. Since d _i is defined to be w _i ^⊤(X−C _i), determining its minimum and maximum over the polyhedral region $\mathcal {D}$ is simply a pair of linear programming problems.
3.
Performing the test. We can now form the matrix in (12) and test its smallest eigenvalue to determine if it is positive-definite. If it is, then the cost function is convex on the region $\mathcal {D}$. If the initial estimate X is a local minimum, then it is also a global minimum.
4.
α -test. If the test in the previous step failed, then repeat steps 2 and 3 to find bounds (13), and then condition (16) to test for convexity.

4 General SOCP Cost Functions

We now consider the same problem in the more general context of the type of cost function that arises in many geometric Vision problems. The type of cost function that we consider is of the type that arises in SOCP. Thus, let f:ℝ^m→ℝ be a function of the form

$$ f^2(\mathbf{X}) = \frac{\| \mathtt{A}^{\top }\mathbf{X} + \mathbf{b}\|^2}{(\mathbf{c}^{\top} \mathbf{X} + k)^2} , $$

(19)

where A is an m×n matrix. Suppose that for $\mathbf{X} \in \mathcal {D}$ the value of the function is bounded, so that $f^{2}(\mathbf{X}) < \tau_{\mathrm{max}}^{2}$ for $\mathbf{X} \in \mathcal {D}$.

Define the vector v=(A ^⊤ X+b)/(c ^⊤ X+k), and observe that $\|\mathbf{v}\|^{2} = f^{2}(\mathbf{X}) < \tau_{\mathrm{max}}^{2}$. The Hessian of f ²(X) can now be computed to equal

$$ \mathtt{H} = \frac{2}{(\mathbf{c}^{\top} \mathbf{X} + k)^2} [ \arraycolsep=5pt \begin{array}{@{}cc@{}} \mathtt{A} & \mathbf{c} \end{array} ] \left [ \arraycolsep=5pt \begin{array}{@{}cc@{}} \mathtt{I}_{n\times n} & -2 \mathbf{v} \\ -2 \mathbf{v}^{\top} & 3 \| \mathbf{v}\|^2 \end{array} \right ] \left [ \begin{array}{c} \mathtt{A}^{\top}\\ \mathbf{c}^{\top} \end{array} \right ]. $$

(20)

Apart from the multiplication by [A c] and its transpose, this is the same as what was obtained previously in (5). As before, we see that

Indeed, it is straight-forward to verify that the difference between these two matrices,

$$\frac{2}{3} \left [ \arraycolsep=5pt \begin{array}{@{}cc@{}} \mathtt{I}_{n\times n} & -3 \mathbf{v} \\ -3 \mathbf{v}^{\top} & 9 \|\mathbf{v}\|^2 \end{array} \right ] $$

has eigenvectors 0, 2/3 (repeated n−1 times) and (2/3)(1+9∥v∥²). It results from this that

$$ \mathtt{H} \succeq \frac{2}{3 d(\mathbf{X})^2} \bigl(\mathtt{A} \mathtt{A}^{\top}- 9\tau_{\mathrm{max}}^2 \mathbf{c} \mathbf{c}^{\top}\bigr), $$

(21)

where d(X)=c ^⊤ X+k. If we can bound 1/d(X)² so that L ²≤1/d(X)²≤U ², then we obtain a bound that does not depend at all on the point X, namely

$$ \mathtt{H} \succeq (2/3) \bigl( L^2 \mathtt{A} \mathtt{A}^{\top}- 9 U^2\,\tau_{\mathrm{max}}^2 \mathbf{c} \mathbf{c}^{\top} \bigr) . $$

(22)

In addition, as with the triangulation problem, we may define a function α(X) on $\mathcal {D}$ to obtain a general condition.

Theorem 3

Let $\mathcal {D}$ be a domain in ℝ^m and for i=1,…,N let $f^{2}_{i} : \mathcal {D}\rightarrow \mathbb{R}$ be a function of the form

$$f^2_i(\mathbf{X}) = \frac{\| {\mathtt{A}_i}^{\top} \mathbf{X} + \mathbf{b}_i \| ^2}{({\mathbf{c}_i}^{\top} \mathbf{X} + k_i)^2} $$

such that $0 \le f^{2}_{i}(\mathbf{X}) \le\tau_{i}^{2}$ on $\mathcal {D}$. Let α be a positive real valued function defined on $\mathcal {D}$ and $(L^{\alpha}_{i}, U^{\alpha}_{i})$ be lower and upper bounds for α(X)/(c _i ^⊤ X+k _i) on $\mathcal {D}$. Then

(23)

is a sufficient condition for the function $\sum_{i=1}^{N} f^{2}_{i}(\mathbf{X})$ to be convex on $\mathcal {D}$.

An important special case is when α is identically equal to unity, and (L _i,U _i) are bounds for the inverse “depth”, 1/(c _i ^⊤ X+k _i). Also note that if τ _max=max_i τ _i, then we can replace τ _i by τ _max in the (23) to get a slightly weaker (but more simple) sufficient condition.

4.1 Application to Other Problems

We now consider the way many other problems in Multiple-View Geometry may be expressed in terms of cost functions of the form given in (19), and hence are susceptible to the methods developed in this paper. Problems that can be addressed in this way are essentially the same as those that can be solved in L _∞ norm using SOCP or related methods. A survey of such problems was given in Hartley and Kahl (2007). We consider the most important such problems here. In all cases we consider, it is a simple exercise to write the cost functions as a sum of terms of the form (19).

4.2 Multiple View Reconstruction

The cost function (17) involves only one point X. Suppose we have several points X _j; j=1,…,M and also several points C _i; i=1,…,N, and that a direction vector w _ij nominally pointing from C _i to X _j is known for some subset $\mathcal{S}$ of pairs (i,j). One may write a cost-function

(24)

which is to be minimized over all choices of the X _j and C _i. To remove the gauge freedoms from this problem, it is best to set one of the C _i to be at the origin, and for one point X _j to be chosen to satisfy w _ij ^⊤(X _j−C _i)=1.

Now, if we define a vector X to be a concatenation of all the X _j and C _i, then each of the terms in (24) is easily expressible in the form (19). The matrix A _ij, and the vector c _ij ^⊤ that will appear in the place of A _i and c _i in (19) will have 3(M+N) columns, but will be quite sparse, having non-zero entries only in columns corresponding to the points X _i and C _j involved in the term.

Defined in this way, the problem satisfies the conditions of Theorem 3, and the bound (23) may be used to check convexity of the cost function.

For large problems, the number of measurements is quite large and the total cost $\sum f_{ij}^{2} (\mathbf{X}_{0}) = \epsilon^{2}$ will be high. This value determines the value of τ that must be used to verify optimality. For this reason, the region $\mathcal {D}$ can be large, and the test can fail. An alternative is to set a smaller value of τ. If the convexity test succeeds for this smaller value, then we cannot rule out the existence of an optimum X _opt not in the region $\mathcal {D}$; however, it such an optimum exists, then the residual $f_{ij}^{2}( \mathbf{X}_{\mathrm {opt}})$ must be greater than τ for at least one measurement. If τ is reasonably large, then this may be considered an unlikely occurrence, particularly if the measurement set is outlier-free.

4.3 Reconstruction Using Image Measurements

Consider a projective camera with camera matrix P _i acting on a point X _j, and let the corresponding measured image point be x _ij, defined for $(i, j) \in\mathcal{S}$, that is for some subset of all point—image pairs. If we decompose the camera matrix as

$$\mathtt{P}_i = \left [ \arraycolsep=5pt \begin{array}{@{}cc@{}} \mathtt{A}_i & \mathbf{b}_i \\ {\mathbf{a}_i}^{\top} & b_i \end{array} \right ] $$

where A _i is a 2×3 matrix and a _i and b _i are vectors, then the projection of a point X _j is given by (A _i X _j+b _i)/(a _i ^⊤ X _j+b _i). The squared image error is then given by

$$f_{ij}^2(\mathbf{X}_j, \mathbf{b}_i, b_i) = \biggl \Vert \frac{\mathtt{A}_i \mathbf{X}_j + \mathbf {b}_i}{{\mathbf{a}_i}^{\top} \mathbf{X}_j + b_i} - \mathbf{x}_{ij} \biggr \Vert ^2 $$

which is easily put into the form (19).

If the camera matrices are completely known, and there is only one point X _j that needs to be found, then this is simply the triangulation problem, expressed in terms of image coordinates.

With several points and cameras, we suppose that the first three columns of P _i, consisting of matrix A _i and vector a ^⊤ are known, whereas X _j, b _i and b _i are unknown. Then, A _i X _j+b _i and a _i ^⊤ X _j+b _i are linear in the unknowns X _j, b _i and b _i. As in Sect. 4.2 we define a vector X by concatenating all the X _j, b _i and b _i. Each term $f^{2}_{ij} (\mathbf{X})$ of the cost function is then of the required form (19) and (23) may be used to prove convexity of the cost function.

Together, A _i and a _i ^⊤ make up the left-hand 3×3 block of the camera matrix. There are two important situations in which this left-hand block of P _i may be known. The first is when the cameras are calibrated, and the block constitutes the rotation matrix associated with the camera. This rotation matrix may have been obtained from a prior reconstruction step in which rotations are computed. This is the same problem as was considered in Sect. 4.2, expressed now in image coordinates.

The second situation is that considered by Rother and Carlsson (2002), in which a plane in the scene allows us to compute inter-image homographies induced by that plane. These homographies are represented by the left-hand block of the camera matrices. Note that the solution given in Rother and Carlsson (2002) was a linear solution to the reconstruction problem, not the sort of least-squares solution that we wish to verify here.

4.4 Homographies and Camera Resectioning

In the camera resectioning problem, we are given 3-dimensional points X _i and measured corresponding points x _i related by an unknown 3×4 projection matrix P, which is to be estimated. This is also known as the projective camera pose estimation problem. For no apparent reason, this has also recently become known as the PnP problem, a neologism which we abjure. The homography estimation is essentially the same, except that the points X _i are 2-dimensional points, and the matrix P is a 3×3 matrix. Since these two problems are essentially the same, we will concentrate on the pose-estimation problem.

If we denote the k-th row of P by p _k, then the squared error associated with the measurement of point x _i is given by

$$f^2(\mathtt{P}) = \biggl( \frac{{\mathbf{p}_1}^{\top}\tilde{\mathbf{X}}_i}{{\mathbf{p}_3}^{\top}\tilde{\mathbf{X}}_i} - x_i \biggr)^2 + \biggl( \frac{{\mathbf{p}_2}^{\top}\tilde{\mathbf{X}}_i}{{\mathbf{p}_3}^{\top}\tilde{\mathbf{X}}_i} - y_i \biggr)^2 $$

where $\tilde{\mathbf{X}}_{i}$ represents the homogeneous vector (x,y,z,1). Here, the situation is slightly different from the previous cases of triangulation and multiple view reconstruction, in that the entries of the matrix P are the unknowns, and not the coordinates of the points X _i. Nevertheless, it is easily observed that the squared error term may be written in the usual form (19).

5 Projective Coordinate Change

If the convexity test described previously should fail to give a positive answer, then there are further things that can be tried to attempt to prove convexity. Our first method of attack is to try a projective coordinate change.

By changing projective coordinate systems, we will reparametrize the domain of the error function f ²(X). This may turn a non-convex function into a convex function. It is easy to see that affine coordinate changes do not affect convexity so it is enough to restrict our attention to transformations that change the plane at infinity. Cheirality conditions also apply, that is, all 3D-points must remain in front of the cameras. See Hartley and Zisserman (2004) for more details on projective geometry.

We consider the use of the inequalities given previously to demonstrate that the function is positive semi-definite on a region. The matrix H′ defined in (7) has one small negative eigenvalue in the direction along the principal ray of the camera, and two larger positive eigenvalues oriented in directions orthogonal to the principal ray. It gives a uniform lower bound on the Hessian in a region of space. The general idea is that if a point is seen from two different directions, then the Hessian of the combined error function is simply the sum of the individual Hessians. If the two principal rays are not aligned, then the idea is that the negative eigenvalue in the principal direction for one camera will be outweighed by the contribution of the positive eigenvalues for the other camera.

This will be true if the view directions for the different cameras are sufficiently different. If on the other hand, the view directions for the different views are the same, then no cancelling will take place, and we will not be able to conclude that the Hessian is positive definite in any region around the point X of interest.

In this case, the situation can be saved by the application of a projective transformation. If the view directions for several cameras are similar, this implies that the point X is near to infinity. We apply a projective transformation that maps this point to a point closer to the cameras, so that the viewing rays are no longer near parallel. Then, consider the Hessian of the error function in this new coordinate system. The general idea is best illustrated by an example.

Consider cameras with matrices P _a=[I|(a,0,0)^⊤], where a is a variable. This camera is placed at the position x=−a on the x-axis, and the principal rays for all these cameras point in the direction of the z-axis. Suppose we want to find the Hessian of the error function on the intersection of error-cones, defined by τ, around the principal rays of the cameras. This situation will occur when the correct triangulation point lies near infinity. Now, according to (7), the Hessian at a point with depth d may be bounded as follows:

$$H_a \succeq 2/3d^2 \operatorname {diag}\bigl(1, 1, -9\tau^2 \bigr) $$

and this bound is the same for all a. Adding any number of such Hessians H _a for different cameras P _a will not result in a positive-definite matrix. Hence, we cannot easily conclude that the Hessian is positive-definite.

It is instructive to note here in passing that since the region of interest $\mathcal {D}$ stretches to infinity, the bound L _a in condition (12) will be zero. Hence the matrix in (12) will be primitive. This problem will be partly alleviated by using the α-test (16) with α(X) equalling the average depth of a point over different images. Since the depth is the same in all images, the ratio α(X)/d(X)=1, so $L^{\alpha}_{i} = U^{\alpha}_{i} = 1$, and the matrix to test is $\sum_{i=1}^{N} 2/3d^{2} \operatorname {diag}(1, 1, -9\tau^{2})$, which as remarked above is still not positive-definite. The problem is that the axes of the cones are parallel, which gives no scope for the positive eigenvalues of the individual Hessians to cancel the negative ones.

However, if we apply a projective transformation represented by matrix

$$\mathtt{T} = \left [ \arraycolsep=5pt \begin{array}{@{}cccc@{}} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 \end{array} \right ] $$

which takes the point (0,0,1,0)^⊤, the point at infinity in the z direction, to (0,0,1,1)^⊤, a finite point. The axes of different cones pointing to this point will no longer be parallel.

This action transforms the camera matrix P _a to

$$\mathtt{P}_a \mathtt{T}^{-1} = \left [ \arraycolsep=5pt \begin{array}{@{}cccc@{}} 1 & 0 & -a & a \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{array} \right ]. $$

Note that the point (0,0,1,1) still maps to the origin in all images. Now, according to (21) the Hessian for a point at depth d will be bounded by

Now, summing two such Hessians for a=1 and a=−1, we find that

$$\mathtt{H}_1 + \mathtt{H}_{-1} = \frac{4}{3d^2} \left [ \arraycolsep=5pt \begin{array}{@{}ccc@{}} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & a^2 - 9 \tau^2 \end{array} \right ], $$

where we use the fact that (as before) the depth d is the same for all images. This matrix is obviously positive definite even for moderately large values of τ. This proves that the Hessian is positive definite on the intersection of τ-cones, and shows that the region over which it is certain that there is a single minimum of the cost function is relatively large.

Algorithm

Even if there exists a transformation T:ℙ^m↦ℙ^m that makes the error function f ²(X) convex over the transformed domain $T(\mathcal {D})$ (where m is the dimension of the unknown), it may be hard to find it automatically. Note that the maximum and minimum depths change when the plane at infinity is transformed, so it is required that the bounding depths are recomputed as well.

In addition, there is an essential subtlety to observe in selecting candidates for the plane at infinity π _∞. We limit the discussion for the triangulation problems, but the similar considerations apply for other problems. First, it is not advisable to select π _∞ to be a plane cutting the region of interest $\mathcal {D}$, for in this case, points in $\mathcal {D}$ will be moved to infinity, which defeats our purpose of selecting non-parallel viewing directions. More important is a second consideration that if the selected plane at infinity separates a camera centre from the region of interest $\mathcal {D}$, then cheirality of points in $\mathcal {D}$ with respect to that camera is reversed (Hartley 1998). This causes complications in defining depth with the right sign.

In the experiments, we have found that by choosing the candidate plane at infinity as a plane parallel to one of the cameras’ image planes, and with an offset such that all 3D points and camera centres are just in front of this plane, then good verification rates are obtained. This choice corresponds to picking a principal plane to one of the cameras and moving it slightly backwards. Recall that the coordinates of a principal plane is extracted from the last row of the camera matrix.

Suppose the candidate plane at infinity is parametrized by v ^⊤ X+1=0 where v is a m-vector (for triangulation m=3). Then, the matrices A _i and vectors c _i of Theorem 3 should be updated according to A _i↦A _i−vb _i ^⊤ and c _i↦c _i−k _i v. The next step is to recompute the bounding depths, and finally test whether condition (16) holds.

6 Experiments

Test Procedure

In all experiments described below, the following test procedure is run until a definitive answer is returned (unless otherwise stated) in order to determine convexity and hence to verify global optimality.

1.
Compute a local minimizer X _local with bundle adjustment.
2.
Compute lower and upper bounds, L _i and U _i, respectively, according to Theorem 1.
3.
Test if the convexity condition in (12) holds (referred to as primary test).
4.
Compute lower and upper bounds, $L^{\alpha}_{i}$ and $U^{\alpha}_{i}$, respectively, according to Theorem 2.
5.
Test if the stronger convexity condition in (16) holds (referred to as α-test).
6.
Apply a projective transformation and test convexity (see Sect. 5). This is only done for triangulation.

Notre Dame

This data set was originally created in Snavely et al. (2006). It consists of 595 images with 277,877 reconstructed 3D-points. Each point is visible in at least 2 images up to as many as 216 images according to the following distribution: 2 images 58 %, 3 images 19 %, 4–10 images 18 % and more than 10 images 5 %.

Out of 277,887 triangulated 3D-points, 265 instances were unsuccessful with our primary test in (12), and after applying the stronger α-test derived in Sect. 3, 158 cases remained. By transforming the plane at infinity, another 156 cases could be proven to be optimal. Hence, only 2 cases could not be verified to be globally optimal.

See Table 1 for a summary of the results. Each problem instance takes on the average less than 0.5 milliseconds to verify on a 2.66 GHz Pentium written in C++.

Table 1 Results for Notre Dame

Full size table

Dinosaur

This turn-table sequence consists of 36 images with 328 given point features. The complete 3D reconstruction was computed with standard structure from motion routines (including refinements by bundle adjustment).

The multiview geometry problems tested were triangulation, camera resectioning and structure and motion with known rotations. See Table 2 for a summary of results. Both camera resectioning and triangulation work very well for this type of scene and camera motion. It was not possible to prove optimality for the whole sequence (assuming known rotations), only for the first 22 images. Note that this configuration is still a large structure and motion problem: 22 cameras and several hundreds of 3D points. When going beyond 22 images, the primary test was not sufficient. We did not apply the α-test due to the sheer size of the configuration.

Table 2 Results for Dinosaur

Full size table

The large number of image measurements in this data set makes the total cost C(X ₀) relatively large. And since this value determines the value of τ for the test, it is not surprising that the convexity test fails. One may ask to what pixel threshold is it possible to verify optimality? We ran a series of tests with different values τ’s using bisection and found that it is possible to verify the whole configuration for up to 2.59 pixels. It cannot be ruled out the existence of an optimum X _opt with lower total cost C(X _opt), but at the same time such a solution must have at least one residual with more than 2.59 pixels of error. Given that the data set is outlier free and typical residual errors are much lower, this seems like an unlikely occurrence. Hence, we may conclude that with high probability we have verified the globally optimal solution. Similar ideas have recently been developed for one-dimensional vision problems in Olsson et al. (2009).

We also tried verifying 100 random pairs of images with common feature points in this pair, and similarly 100 random triples with all common feature points, with 100 % success rate. Only pairs and triples with at least 10 feature points were selected.

The Christ Statue

The images were collected from various tourist photographs of this well-known statue and reconstructed with standard structure and motion routines. Similar experiments as for the Dinosaur sequences were performed with similar success rates, see Table 3.

Table 3 Results for the Christ Statue

Full size table

Corridor

The forward camera motion for this 11-image sequence is quite different from the other sequences. Verifying global optimality for triangulation and camera resectioning turned out to be no problem, but for the other multiview problems it was more difficult, see Table 4. We did not apply the α-test to the known rotation cases due to the large size of the configurations.

Table 4 Results for Corridor

Full size table

There are several 3D planes in the scene and we took all the point features on the left frontal wall and computed all pairwise 3×3 homographies for this plane using bundle adjustment. Out of the $\binom{11}{2}=55$ image pairs, 27 were successfully verified with the primary test, and another 3 with the α-test.

The known rotation case turned out to be more difficult, which is consistent with the findings in Vedaldi et al. (2007). Not a single test was successful for the primary test. We did not apply the other methods due to the size of the setup. The whole sequence has 11 cameras and more than 4000 image points, so just computing all the depth tests takes time. (The size for the LP constraint set is approximately 4×4000=16000 since each image coordinate gives rise to two inequality constraints).

Arnold

A final experiment with 6 images of a poster was tested, see Fig. 2 and Table 5. In this scene, all 3D feature points are on a plane, and hence uncalibrated camera resectioning is not possible. Triangulation of all 3D points (in total, 1639 points) as well as all pairwise homographies (15 in total) induced by the plane were successfully verified by the primary test. The results with known rotation were as follows: 9 out of 15 pairwise images were successfully verified and 17 out 20 image triples were verified. Again, no other steps in the test procedure were applied due to the size of these configurations.

Table 5 Results for Arnold

Full size table

Discussion

The method works very successfully (with almost 100 % success) on the triangulation and resection problems. However for large problems, such as full 3D reconstruction the method is less reliable. The main reason for this is the approximation made after (2), where it is deduced that each f _i(X _opt)≤ϵ ². This is the worst-case hypothesis, that all the error is incurred by one measurement. The effect is to set an unrealistically high bound τ _max in the convexity conditions, which causes the tests to fail. For this reason, the tests derived here do not work well for very large problems. It is only possible to verify subproblems for optimality.

The other main failure case for the primary and α-tests is when points are a long way away compared to the base line, that is, close to infinity. This is because the two bounds L and U in (10) or $L_{i}^{\alpha}$ and $U_{i}^{\alpha }$ in (15) are far apart—the intersection volume is too elongated. Indeed, it is easily seen that if U or $U^{\alpha}_{i}$ is too large then the matrix in (10) or (15) cannot be positive definite. However, in this case, as shown, the projective change method succeeds in all but two cases.

In practice, the three tests, the primary test, the α-test and projective change are run sequentially until one of them returns a positive answer. Since the tests are successively more expensive in terms of time, this is the most efficient approach. The total time for triangulation verification of all the points in the Notre Dame sequence was 2 m 12 s (less than 0.5 ms per point) on a 2.66 GHz laptop, single threaded, programmed in C++.

Comparing the primary test and the α-test, it was observed (on the Notre Dame set) that the α-test is more accurate. There were only two cases where the α-test failed but the primary test succeeded. However the α-test takes about 5 times as long to run as the primary test (9 m 40 s). Since in practice it is only run when the primary test fails (about one test in 1000, this longer run time is not important).

As a tool to verify the optimality of the triangulation and resection results for a 3D reconstruction, this method requires insignificant time in the context of the complete reconstruction task. Even for a large reconstruction of 10⁶ points, the time to verify that all points and cameras are triangulated or resectioned correctly will take no more than 500 seconds.

To summarize, reconstruction problems with small dimensions can be handled well with a small computational effort (the primary test generally works) while for larger problems the picture is mixed. The success rate depends on the geometry of the scene as well as the motion of the camera. Images taken with a wide baseline seem to be more easily verifiable compared to other camera motions. As the number of residuals increases, it becomes harder to verify optimality—just as expected.

7 Conclusion

The tests described here are extremely effective at verifying convexity, and hence global optimality of a local minimum for various multiple view geometry problems. The convexity test for the primary bound seems to do almost as well as α-bound. As the second bound is harder to compute, it is often advantageous to try the simple bound first. Experimental evaluations on several data sets showed the practical efficacy of the tests.

Failure of verifying optimality is quite rare. In the case of triangulation, the success rate is almost 100 % and verification time is about 0.5 ms per point. The main reason for failure is that the viewing rays are being parallel and the domain of interest becomes elongated, as illustrated in Fig. 1. For larger problems with many residuals, the success rate decreases, and it is possible only to verify subproblems. We have been able to verify optimality for reconstructions with more than 1500 image points and 20 cameras.

This work has shown how to deal with a large class of multiple view geometry problems under the L ₂ cost function. On the other hand, many geometric problems involve optimizations including rotations, for example, the estimation of two-view relative motion. Such problems cannot be handled with the presented techniques and are left as a challenge for future work.

Notes

By analogy with Sect. 2.1 one may be tempted to define $\mathtt{H}' = (2/r^{2}) \operatorname {diag}(1/3, 1/3, -3 \phi^{2})$ which gives a slightly better bound, but this does not work in this case.

References

Hartley, R. I. (1998). Chirality. International Journal of Computer Vision, 26(1), 41–61.
Article Google Scholar
Hartley, R., & Kahl, F. (2007). Optimal algorithms in multiview geometry. In Asian conf. computer vision, Tokyo, Japan.
Google Scholar
Hartley, R., & Schaffalitzky, F. (2004). L_∞ minimization in geometric reconstruction problems. In Conf. computer vision and pattern recognition, Washington DC, USA (Vol. I, pp. 504–509).
Google Scholar
Hartley, R., & Sturm, P. (1997). Triangulation. Computer Vision and Image Understanding, 68(2), 146–157.
Article Google Scholar
Hartley, R. I., & Zisserman, A. (2004). Multiple view geometry in computer vision (2nd ed.). Cambridge: Cambridge University Press.
Book MATH Google Scholar
Kahl, F., & Hartley, R. (2008). Multiple view geometry under the L _∞-norm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(9), 1603–1617.
Article Google Scholar
Kahl, F., Agarwal, S., Chandraker, M. K., Kriegman, D. J., & Belongie, S. (2008). Practical global optimization for multiview geometry. International Journal of Computer Vision, 79(3), 271–284.
Article Google Scholar
Ke, Q., & Kanade, T. (2007). Quasiconvex optimization for robust geometric reconstruction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(10), 1834–1847.
Article Google Scholar
Longuet-Higgins, H. C. (1981). A computer algorithm for reconstructing a scene from two projections. Nature, 293, 133–135.
Article Google Scholar
Lu, F., & Hartley, R. (2007). A fast optimal algorithm for L ₂ triangulation. In Asian conf. computer vision, November (Vol. 2, pp. 279–288).
Google Scholar
Nistér, D., Hartley, R., & Stewénius, H. (2007a). Using Galois theory to prove structure from motion algorithms are optimal. In Conf. computer vision and pattern recognition, Minneapolis, USA.
Google Scholar
Nistér, D., Kahl, F., & Stewénius, H. (2007b). Structure from motion with missing data is NP-hard. In Int. conf. computer vision, Rio de Janeiro, Brazil.
Google Scholar
Olsson, C., Byröd, M., & Kahl, F. (2009). Globally optimal least squares solutions for quasiconvex 1d vision problems. In Scandinavian conf. on image analysis, Oslo, Norway.
Google Scholar
Rother, C., & Carlsson, S. (2002). Linear multi view reconstruction and camera recovery using a reference plane. International Journal of Computer Vision, 49(2/3), 117–141.
Article MATH Google Scholar
Snavely, N., Seitz, S. M., & Szeliski, R. (2006). Photo tourism: exploring photo collections in 3d. ACM Transactions on Graphics, 25(3), 835–846.
Article Google Scholar
Stewenius, H., Schaffalitzky, F., & Nister, D. (2005). How hard is 3-view triangulation really? In Proc. international conference on computer vision (pp. 686–693).
Google Scholar
Vedaldi, A., Guidi, G., & Soatto, S. (2007). Moving forward in structure from motion. In Conf. computer vision and pattern recognition, Minneapolis, USA.
Google Scholar

Download references

Acknowledgement

This research was supported by (i) NICTA, a research centre funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy, (ii) the Australian Research Council through the ICT Centre of Excellence program, (iii) the Swedish Research Council (Grant No. 2007-6476), (iv) the Swedish Foundation for Strategic Research (SSF) through the programmes Future Research Leaders and Wearable Visual Information Systems, (v) the European Research Council (GlobalVision Grant No. 209480), and (vi) Sogang University Research Grant of 2012 (201210029.1).

Author information

Authors and Affiliations

Australian National University and NICTA, Canberra, Australia
Richard Hartley
Centre for Mathematical Sciences, Lund University, Lund, Sweden
Fredrik Kahl & Carl Olsson
Department of Media Technology, Sogang University, Seoul, Korea
Yongduek Seo

Authors

Richard Hartley
View author publications
You can also search for this author in PubMed Google Scholar
Fredrik Kahl
View author publications
You can also search for this author in PubMed Google Scholar
Carl Olsson
View author publications
You can also search for this author in PubMed Google Scholar
Yongduek Seo
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Richard Hartley.

Appendices

Appendix A: Minimizing Squared Angle

In the derivations in Sect. 2, the error function used was the sum of squares of the tangents of the angles, as given by (4). This leads to a relatively simple result in terms of computing and bounding the Hessian. On the other hand, it would be more natural to wish to minimize the sum of squares of the error angles, and not their tangents. The difference is very small, but for exactness, we now derive a similar result for this error function. It is worth noting here that although in many computer-vision applications, pixel error is the natural metric for evaluating a solution, this is not clearly true of omnidirectional or wide-angle images.

In a different application, target tracking and localization, the position of a target may be measured from several observation points resulting in angular (alt-azimuth) readings. The task of determining the position of the target is precisely the orientation problem. In this context, it is natural to wish to minimize squared angular error, which is what we do in this appendix.

In the following discussion, we give the outline of the argument. To verify the details, the reader may need to use a computer algebra system, such as Mathematica. Now, under the same conditions as before, we define the error function

$$ f^2(x, y, z) = \phi^2 = \arctan\biggl( \frac{\sqrt{x^2 + y^2}}{z} \biggr)^2 . $$

(25)

We compute the Hessian at a point with (x,y,z) with respect to the coordinates x, y and z. Subsequently, evaluating with x≥0 and y=0, and making the substitutions ϕ for arctan(x/z), radial distance r for $\sqrt{x^{2} + z^{2}}$ and tanϕ for x/z, we arrive after some computation at the expression for the Hessian given in Fig. 3. This matrix has eigenvalues 2ϕ/(r ²tan(ϕ)) and $(1 \pm\sqrt{1 + 4 \phi^{2}} ) / r^{2}$, namely two positive and one negative eigenvalue.

Similarly as before,^{Footnote 1} let H′ be the matrix $(2 / r^{2}) \operatorname {diag}(1/4,\allowbreak 1/4, -4 \phi^{2})$. We claim that H−H′ is positive semi-definite. Unfortunately, the proof is a little more intricate than before.

First, we observe that (0,1,0)^⊤ is an eigenvector of H−H′, with eigenvalue (2/r ²)(ϕ/tan(ϕ)−1/4), which is positive at least for ϕ<1. The other two eigenvalues are the eigenvalues of the reduced-size matrix obtained by eliminating the second row and column from H−H′. This matrix will have two positive eigenvalues, as long as its trace and determinant are both positive. Apart from the factor 1/r ² the trace is equal to 8ϕ ²+3/2 which is positive. The determinant is equal to

$$D(\phi) = -1 + \bigl(1 + 16 \phi^2\bigr) \bigl(\cos(2 \phi) - 2 \phi \sin(2 \phi)\bigr) . $$

This function is positive for ϕ<0.3 as may be shown by plotting the function D(ϕ)/ϕ ² (see Fig. 4). A more formal proof can be given by computing the series expansion of this function.

The result of this computation is the following result.

Lemma 1

If H is the Hessian of the error function (25) evaluated at a point with error less than angle ϕ<0.3, then

$$\mathtt{H} \succeq \bigl(2/r^2\bigr) \operatorname {diag}\bigl(1/4, 1/4, -4 \phi^2\bigr) . $$

From here on, conditions for convexity of the error function in the intersection of a set of cones with angle bounded by ϕ _max proceeds just the same as previously.

Appendix B: Minimum Bounds for Hessians

The matrix H′ given in (7) was presented without motivation. In this section, we will give some theory that justifies the choice. This will be developed in a more general context, to allow for possible applications to other problems. In what follows, all matrices are assumed to be symmetric, even if not stated.

Notation

We need to develop some properties of the relationship H⪰0 which has been used throughout this paper. It means that H is positive semi-definite, possibly zero. The notation H⪰H′ means H−H′⪰0. This defines a partial ordering relationship between matrices. We use the notation H≻0 to mean that H is positive semi-definite, but not zero. It could be interpreted as meaning that H is positive definite, but not semi-definite, but that is not what we mean. If we need to specify that H is strictly positive-definite, this will be specifically stated.

Consider a family of symmetric matrices H _v defined in terms of a parameter vector v. We say that a matrix G is a lower bound for this set if H _v⪰G for all v, and it is a maximal lower bound if there is no other lower bound G′ such that G′≻G. There may be more than one maximal lower bound for a family of matrices. Clearly if we want to define a lower bound for a set of Hessian matrices, it is advantageous to identify the maximal lower bounds.

We are particularly interested in the lower bounds for matrices of the form

$$ \mathtt{H}_\mathbf {v}= \left [ \arraycolsep=5pt \begin{array}{@{}cc@{}} \mathtt{I}_{n\times n} & -2\mathbf{v} \\ -2\mathbf{v}^{\top}& 3 \tau _{\mathrm{max}}^2 \end{array} \right ] $$

(26)

as in (5) or (20). We may determine the complete set of maximal lower bounds for the set of all matrices H _v where ∥v∥ is bounded.

Theorem 4

Consider the set of matrices

$$ \mathcal{H} = \bigl\{ \mathtt{H}_\mathbf {v}~|~\| \mathbf{v} \|^2 \le\tau_{\mathrm{max}}^2 \bigr\} . $$

(27)

The maximal lower bounds for this set are the matrices of the form $\mathtt{G}_{\lambda}= \operatorname {diag}(1-1/\lambda, 1-1/\lambda, (3-4\lambda) \tau_{\mathrm{max}}^{2})$ with λ>0.

This theorem will follow from a more general theorem, which we will state shortly, in terms of maximal ellipsoids in ℝⁿ.

The matrices of interest are of the form

(28)

where v ranges over a set V of vectors in ℝⁿ, and A ₀ is a symmetric matrix. Compare this with the form of (26).

The following simple observation reduces finding lower bounds of such matrices A ₀−S(v) to finding bounds for the set S(v).

Lemma 2

The (maximal) lower bounds of A ₀−S(v) for all v∈V are exactly those matrices A ₀−G, where G is a (minimal) upper bound for the matrices S(v).

We now consider a connection between matrices and ellipsoids in ℝⁿ that will help us identify a connection between maximal lower bounds for sets of matrices and ellipsoids. Let G be a matrix of the form

$$ \mathtt{G} = \left [ \arraycolsep=5pt \begin{array}{@{}cc@{}} \mathtt{A} & \mathbf{r} \\ \mathbf{r}^{\top} & q \end{array} \right ]\quad\mbox{with } \mathtt{A} \succeq 0 \mbox{ and } q \ge0 $$

(29)

Define also

$$ \mathtt{G}_{\lambda}= \left [ \arraycolsep=5pt \begin{array}{@{}cc@{}} \mathtt{A}/\lambda& \mathbf{r} \\ \mathbf{r}^{\top} & q \lambda \end{array} \right ]\quad\mbox{for } \lambda> 0. $$

(30)

Definition 1

Let G be a matrix of the form (29). We define the ellipsoid C(G) associated with G to be the set

$$C(\mathtt{G}) = \bigl\{ \mathtt{K} \mathbf{w} + \mathbf{r}~|~\| \mathbf{w} \|^2 \le q \bigr\} $$

where K is a real symmetric n×n matrix such that A=K ².

Thus, C(G) is the image of a ball in ℝⁿ under an affine mapping, and hence an ellipsoid, possibly degenerate (lying in a subspace of dimension less than n) if A does not have full rank. It is clear that C(G) and C(G _λ) define the same ellipsoid for all λ.

Even when A is not of full rank such a matrix K exists, since writing A=UD ² U ^⊤, where D is diagonal and U orthogonal, we can choose K=UDU ^⊤, which is symmetric. It is easy to verify that the set C(G) is independent of the choice of K, depending only on A=K ².

We give another useful characterization of ellipsoids. For a matrix A, define the span of A to be $\operatorname {Sp}(\mathtt{A}) = \{ \mathtt{A} \mathbf{w} \}$ for all vectors w. Further, given a symmetric matrix A with Singular Value Decomposition A=UDU ^⊤, define the pseudo-inverse A ⁺=UD ⁺ U ^⊤, where $\mathtt{D}^{+}_{ii} = \mathtt{D}_{ii}^{-1}$ when D _ii=0 and $\mathtt{D}^{+}_{ij} = 0$ otherwise. Then, for a matrix of the form (29),

$$ C(\mathtt{G}) = \bigl\{ \mathbf{x}~|~( \mathbf{x} - \mathbf{r}) \in \operatorname {Sp}(\mathtt{A}) \mbox{ and } (\mathbf{x} - \mathbf{r})^{\top} \mathtt{A}^+ (\mathbf{x} - \mathbf{r}) \le q \bigr\}. $$

(31)

Our main theorem on lower bounds of matrices can now be stated.

Theorem 5

Let V be a bounded set of points in ℝⁿ. The maximal lower bounds for the set of matrices A ₀−S(v) with v∈V, are the matrices of the form A ₀−G, where G is of the form (29) and C(G) is a minimal ellipsoid containing the set V.

Application

Before giving the proof of this theorem, we show how it may be used to prove Theorem 4 as a special case. Thus, let $\mathtt{A}_{0} = \operatorname {diag}(\mathtt{I}_{n\times n}, 3 \tau_{\mathrm{max}}^{2})$. According to Theorem 5, the maximal lower bounds for the set of matrices $\mathcal{H}$ are those matrices of the form A ₀−G _λ where C(G) is a minimal ellipsoid containing the set of vectors V={2v | ∥v∥≤τ ²}. The minimal ellipsoid containing V is clearly a sphere of radius 2τ, having equation x ^⊤ Ix ^⊤≤4τ ². Therefore, $\mathtt{G}_{\lambda}= \operatorname {diag}(\mathtt{I} / \lambda, 4 \lambda\tau^{2})$. Consequently, the least lower bounds for the set $\mathcal{H}$ are the matrices $\operatorname {diag}(1-1/\lambda, \ldots, 1 - 1/\lambda, (3 - 4\lambda) \tau ^{2})$ with λ>0, which is the conclusion of Theorem 4.

If λ<1, these lower bounds are not very useful, since the eigenvalues 1−1/λ are negative. Our strategy in this paper is to select the value of λ that leads to the largest absolute value of the ratio of the largest to smallest eigenvalues of this matrix. Hence, we seek to maximize (1−1/λ)/(4λ−3)τ ² with λ>1. The maximum occurs when λ=3/2, and the corresponding optimal bound is $\operatorname {diag}(1/3, 1/3, - 3 \tau^{2})$. This is the value that is used in (7).

Proof of Theorem

We now start the proof of Theorem 5, which will be completed through a sequence of lemmas. The connection of ellipsoids with lower bounds of sets of matrices is given by the following lemma.

Lemma 3

G⪰S(v) if and only if G is of the form (29) and v∈C(G). Consequently, G is an upper bound for a set of matrices S(v), v∈V if and only if C(G) contains the set V.

Proof

First, we prove the if part of this lemma. Writing G as in (29), choose K such that K ²=A with K symmetric. If v∈C(G), then we can write v−r=K w with w ^⊤ w≤q. Substituting for v−r in G−S(v) we see

Therefore x ^⊤(G−S(v))x≥0 for any x, and it follows that G−S(v)⪰0.

For the converse, assume that G−S(v)⪰0. Then G can be written as in (29) where A⪰0 and q≥0, since any principal minor of a positive semi-definite matrix must be semi-definite.

If v−r does not lie in the span of the columns of K, then there exists a vector x such that K x=0 and of suitable length such that (v−r)^⊤ x>q. Writing $\hat{\mathbf{x}} = (\mathbf{x}^{\top}, 1)^{\top}$, we see that

$$\hat{\mathbf{x}}^{\top}\bigl(\mathtt{G} - \mathtt{S}(\mathbf{v})\bigr) \hat{\mathbf{x}} = q - ( \mathbf{v} - \mathbf{r})^{\top} \mathbf{x} < 0 , $$

so G−S(v) can not be positive semi-definite.

Consequently, v−r must lie in $\operatorname {Sp}(\mathtt{K})$ and we may write v−r=K w. Select the w of minimum norm that satisfies this equation. This is given by w=K ⁺(v−r). Then

$$\mathbf{w} = \mathtt{K}^+ (\mathbf{v} - \mathbf{r}) = \mathtt{K} \bigl(\mathtt{K}^+ \mathtt{K}^+ ( \mathbf{v} - \mathbf{r})\bigr) = \mathtt{K} \mathbf{x} $$

where x is defined by this formula. Let $\hat{\mathbf{x}} = (\mathbf{x}^{\top}, 1)^{\top}$. Then, and it follows that $\hat{\mathbf{x}}^{\top}(\mathtt{G} - \mathtt{S}(\mathbf{v})) \hat{\mathbf{x}} = q - \mathbf{w}^{\top}\mathbf{w}$. Since by assumption G−S(v)⪰0, it follows that ∥w∥²≤q, and so v=K w+r is in the ellipsoid C(G), and the proof is complete. □

This identifies the upper bounds for a set of matrices S(v). We need to identify the least upper bounds. This will be done by the following lemma.

Lemma 4

Let G and G′ be matrices as in (29). If G′≺G then C(G′)⊂C(G). Conversely, if C(G′) is a proper subset of C(G) then there exists a number λ such that $\mathtt{G}'_{\lambda} \prec \mathtt{G}$.

Note that the ellipsoids represented by $\mathtt{G}'_{\lambda}$ are the same for all λ≠0. It is clearly not true that $\mathtt{G}'_{\lambda} \preceq \mathtt{G}$ for all λ, since by making λ large or small enough, this condition will surely be broken.

A corollary of this lemma is

Lemma 5

Let V be a bounded set of points in ℝⁿ. Matrix G is a least upper bound for matrices S(v), v∈V if and only if G is of the form (29) and C(G) is a minimal ellipsoid containing the set V.

Proof of the first part of Lemma 4 is easy. Let v be a point in C(G′). Then, by Lemma 3, G′⪰S(v). If G≻G′ then G⪰S(v). Again by Lemma 3, this implies that v∈C(G). Hence C(G′)⊂C(G).

Now, we turn to the converse statement, which is considerably more difficult. First, note that it is sufficient to prove this result in the case where the matrices A and A′ belonging to G and G′ are invertible. If this is not the case, then the ellipsoids they define are degenerate, lying in a lower-dimensional subspace. By an appropriate affine transformation of the matrices, we can reduce them to matrices of lower dimension after eliminating rows and columns of zeros, and subsequently assume that the matrices are invertible.

We simplify this problem a little by simplifying the form of the matrices. The first observation is that we can assume that one of the two vectors r and r′ is zero. Thus, we can assume that r′ is replaced by r′−r, and r is replaced by the zero vector.

The second simplification is to apply an affine transformation to reduce to the case where A′=I. There exists a matrix P such that P ^⊤ A′P=I. Essentially, this affine transformation rescales non-isometrically so that the second ellipsoid becomes a spherical ball. We may also assume that q=q′=1.

It is now easily verified that if we prove Lemma 4 in the special case where A′ is the identity and r=0, then it will hold generally. Thus, given

$$\mathtt{G} = \left [ \arraycolsep=5pt \begin{array}{@{}cc@{}} \mathtt{A} & \mathbf{0} \\ \mathbf{0}^{\top} & 1 \end{array} \right ];\qquad \mathtt{G}' = \left [ \arraycolsep=5pt \begin{array}{@{}cc@{}} \mathtt{I} & \mathbf{r}' \\ \mathbf{r}^{\prime \top} & 1 \end{array} \right ] $$

our task is to show that

$$ \mathtt{G} - \mathtt{G}'_{\lambda}= \left [ \arraycolsep=5pt \begin{array}{@{}cc@{}} \mathtt{A} - \mathtt{I}/\lambda& - \mathbf{r}' \\ - \mathbf{r}^{\prime\top} & 1 - \lambda \end{array} \right ]\succ \mathtt{0} $$

(32)

for some λ, under the assumption that the ellipsoid C(G) contains the sphere C(G′).

However, this is equivalent, according to Lemma 3 to showing that A−I/λ⪰0, and 1−λ≥0, and that the ellipsoid x ^⊤(A−I/λ)⁺ x≤(1−λ) contains the point r′. This is shown in Fig. 5.

In the worst case, we may assume that the sphere C(G′) is touching the ellipsoid C(G) tangentially at some point, which we denote by y. The inward-pointing normal to the ellipsoid at y is obtained by differentiating y ^⊤ A ⁻¹ y. It is −2A ⁻¹ y and the unit vector is −A ⁻¹ y/λ where we define λ=∥A ⁻¹ y∥. Therefore, the centre of the unit sphere touching the ellipsoid at point y is given by

$$ \mathbf{r}' = \mathbf{y} - \mathtt{A}^{-1} \mathbf{y} / \lambda= (\mathtt{A} - \mathtt{I}/\lambda) \mathtt{A}^{-1} \mathbf{y} $$

(33)

where we define λ=∥A ⁻¹ y∥. This shows that $\mathbf{r}' \in \operatorname {Sp}(\mathtt{A} - \mathtt{I} / \lambda)$. We need to show that r′ lies in the ellipsoid x ^⊤(A−I/λ)⁺ x≤(1−λ). We compute

(34)

as required.

We also need to show that the matrix A−I/λ is positive semi-definite. This will be true if and only if 1/λ≤λ _min(A), the smallest eigenvalue of A. Equivalently, our task is to show that

$$ \lambda\ge\lambda_{\mathrm{max}} \bigl( \mathtt{A}^{-1}\bigr) $$

(35)

under the assumption that the unit sphere touching the ellipsoid x ^⊤ A ⁻¹ x=1 at the point y lies entirely inside the ellipsoid. Now, let u be the eigenvector of A ⁻¹ such that A ⁻¹ u=λ _max u, and suppose that the condition (35) is violated. With the centre r′ of the sphere given by (33) we verify that

with c<0. This means that r′ and y are on opposite sides of the principal plane of the ellipsoid defined by points x such that u ^⊤ x=0. We argue that this is impossible.

Let y′ be the point symmetrically opposite to y, reflected across this plane. This point also lies on the boundary of the ellipsoid C(G). Since y′ is on the same side of the plane as r′, we claim that r′ is closer to y′ than it is to y. This means that y′ lies closer than unit distance from r′ and so the unit ball centred at r′ is not contained completely inside the ellipsoid. This contradiction indicates that (35) must be true, and hence A−I/λ is positive semi-definite, as required. This being so, it follows directly from (34) that also (1−λ)≥0, as required.

This completes the proof of Lemma 4. The truth of Theorem 5 follows from Lemmas 2 and 5.

Remarks on the Lower Bound Matrix

It was assumed throughout most of this paper that the errors in any particular are bounded within a circle of radius τ _max. This allows an easy derivation of a good maximal lower bound for the Hessians. The justification for this is that the region $\mathcal {D}$, formed as an intersection of cones, must project to points within the stated error bound. However, in reality the projection of the region $\mathcal {D}$ in each image may be significantly smaller, and not circular in shape. In this case, we could think of using the more general form of the maximal lower bound Hessians for each image given by Theorem 5. This would be expected to give better results, at the cost of greater computation. The improvement may, however, not be worth the extra computational effort.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Hartley, R., Kahl, F., Olsson, C. et al. Verifying Global Minima for L ₂ Minimization Problems in Multiple View Geometry. Int J Comput Vis 101, 288–304 (2013). https://doi.org/10.1007/s11263-012-0569-9

Download citation

Received: 26 August 2010
Accepted: 03 September 2012
Published: 04 October 2012
Issue Date: January 2013
DOI: https://doi.org/10.1007/s11263-012-0569-9

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Verifying Global Minima for L 2 Minimization Problems in Multiple View Geometry

Abstract

Similar content being viewed by others

A Consistently Fast and Globally Optimal Solution to the Perspective-n-Point Problem

Match Selection and Refinement for Highly Accurate Two-View Structure from Motion

Minimal Solutions for Pose Estimation of a Multi-Camera System

Explore related subjects

1 Introduction

How Hard is the Triangulation Problem, Really?

2 Triangulation—Simple Convexity Test

2.1 Problem Description

Bounding the Solution

The Hessian of the Cost Function

A Bound on the Hessian

Generally Placed Cone

Bounded Depths

Summing the Hessians

Theorem 1

3 Weighted Depths

Theorem 2

3.1 Computing the Bounds

3.2 Algorithm

4 General SOCP Cost Functions

Theorem 3

4.1 Application to Other Problems

4.2 Multiple View Reconstruction

4.3 Reconstruction Using Image Measurements

4.4 Homographies and Camera Resectioning

5 Projective Coordinate Change

Algorithm

6 Experiments

Test Procedure

Notre Dame

Dinosaur

The Christ Statue

Corridor

Arnold

Discussion

7 Conclusion

Notes

References

Acknowledgement

Author information

Authors and Affiliations

Corresponding author

Appendices

Appendix A: Minimizing Squared Angle

Lemma 1

Appendix B: Minimum Bounds for Hessians

Notation

Theorem 4

Lemma 2

Definition 1

Theorem 5

Application

Proof of Theorem

Lemma 3

Proof

Lemma 4

Lemma 5

Remarks on the Lower Bound Matrix

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation

Verifying Global Minima for L ₂ Minimization Problems in Multiple View Geometry