1 Introduction

The emergence of affordable 3D sensors and high quality 2D cameras has triggered a growing interest in combining both imaging modalities. 3D sensors provide faithful 3D scene models in the form of dense 3D point clouds while images are used to extract texture information. High quality 3D models with mapped texture can be obtained so long as the 2D and 3D sensors are registered in a common reference frame. The two modalities are generally registered off-line and the 2D and 3D sensors kept rigidly attached during acquisition. Doing so, however, may prove impractical considering that suitable acquisition conditions for one sensor may not be adequate for the other (e.g. lighting conditions for cameras, surface orientation for 3D sensors, etc.). Some application-specific requirements (e.g. camera on a drone and a 3D scanner on a terrestrial vehicle) may altogether prohibit the sensors to be rigidly attached. When the 2D and 3D sensors are free to move with respect to each other, reliable methods for registering the two modalities, i.e. establishing inter-modality correspondences and estimating the rigid transformation aligning their reference frames, are highly desirable.

Structure-from-Motion (SfM) techniques compute 3D point coordinates from pixel correspondences across images. It is thus tempting to regard the problem of registering 3D and 2D sensors as that of aligning two 3D point sets: one set induced by the images and the other obtained from scanner measurements. Registering 3D point clouds is a well-studied problem. Most methods use the Iterative-Closest-Point (ICP) algorithm (or its variants) (Bartoli and Castellani 2012; Fitzgibbon 2001; Rusinkiewicz and Levoy 2001) when the correspondences between two 3D point sets are not known. While ICP is a local method, Breuel (2003) investigated its global version using the Branch-and-Bound (BnB) algorithmic paradigm in several settings, mostly focusing upon the registration in 2D case. For 3D point sets registration, Enqvist et al. (2009) employed BnB to maximize the inlier set from the given point-to-point correspondences. Similar works also exist in the context of absolute pose estimation problem (Jurie 1999; Enqvist and Kahl 2008). A thorough insight about globally optimal solutions for geometric reconstruction problems can be found in Kahl and Henrion (2007). Recent work by Yang et al. (2013) (Go-ICP) provides a practical method for retrieving the globally optimal solution to same-scale point set registration. However, because SfM reconstructions suffer from a scale ambiguity, methods devised for registering same-scale data cannot be employed.

Most methods handling the scale ambiguity rely on establishing correspondences either between the 3D measurements obtained by both modalities or directly between scanned data and images (Liu and Stamos 2005; Christy et al. 1999; Ferraz et al. 2014). The sought transformation parameters are then obtained by either minimizing the registration loss function or maximizing the consensus set of inliers. Note that Random Sample Consensus (RANSAC) (Fischler and Bolles 1981) is the most widely used method for finding the set of maximum inliers. RANSAC, although not usually optimal, is a well established standard method for model estimation in the case of large amounts of outliers. Key to its success relies on its explicit inlier set maximization approach. Alternative approaches include robust estimation methods that minimize a robust loss function, such as M-estimator. Methods based on loss function minimization are more prone to outliers than their inlier-set-maximization counterparts (Bazin et al. 2013). Such methods may provide satisfactory results only when outliers are relatively few (Fraundorfer and Scaramuzza 2012).

Some approaches exploit scene knowledge or the Manhattan World assumption. For instance, methods have been devised based on line segment matching (Liu and Stamos 2005), target segmentation (Taneja et al. 2013), repeated patterns detection (Schindler et al. 2008), mutual information maximization (Mastin et al. 2009), and extended Chamfer matching (Zhang and Chen 2014). Registration methods relying on establishing correspondences may be undermined by unreliable visual feature descriptors. Alternative methods, not establishing initial correspondences, have also been proposed (Paudel et al. 2014; Du et al. 2007; Corsini et al. 2013; Plotz and Roth 2015). The methods in Paudel et al. (2014) and Du et al. (2007) use variants of the ICP algorithm and hence remain susceptible to partial scene overlap, scene occlusion, and high levels of outliers. The one in Corsini et al. (2013) employs a RANSAC-based inlier set maximization in which the scale problem is handled by an extension of the 4-point congruent sets algorithm. A recent approach presented in Plotz and Roth (2015) computes the average gradient magnitude over all lighting directions under Lambertian shading. These gradients are then matched with the image gradient to obtain a coarse 2D–3D registration.

As far as maximizing the set of inliers is concerned, RANSAC is non-deterministic and provides no guarantee with respect to the optimality of its solution. Globally optimal inlier set maximization methods (Li 2009; Bazin et al. 2013) have recently been proposed for problems formulated through linear equations. Extensions to problems with nonlinear equations (Yang et al. 2014) is problem-specific, difficult and may result in much more complicated (possibly numerically intractable) mathematical formulations. Note that a variety of methods for solving systems of nonlinear polynomial equations exist. Some are based on Gröbner bases or homotopy continuation (Verschelde 1999; Habed et al. 2012). Others use Polynomial Sum-of-Squares (SoS) optimization (Schweighofer and Pinz 2008; Lasserre 2000; Parrilo 2000; Olsson et al. 2006; Chandraker et al. 2007; Chesi et al. 2002). However, such methods are dedicated to solving outlier-free systems and dealing with outliers is carried out through RANSAC.

In this paper, we address the problem of registering a 3D scan and a set of images of a structured scene captured by calibrated cameras. Our assumption is that the scene is structured in the sense that it can be segmented into, and represented by, planes (or planar patches). Such representation is compact (Borrmann et al. 2011) and can also be useful for scene knowledge-based refinement methods (Tamaazousti et al. 2011). The plane-based assumption is particularly valid when dealing with man-made environments, including (but not limited to) Manhattan World, urban and indoor scenes that are abundant with planes. In our approach, we seek the metric transformation relating the scene planes and the SfM-induced 3D structure. The SfM structure may either be represented by a sparse set of 3D points, obtained from sparse matching across images, or by planes should a sufficiently dense matching between images be obtained and the resulting point cloud segmented. When the SfM 3D structure is a sparse set of points, registration is carried out by establishing point-to-plane correspondences. Point-to-plane registration methods are known to perform better than their point-to-point counterparts (Segal et al. 2009). When the SfM-induced 3D structure consists of planes, registration is carried out by seeking plane-to-plane correspondences.

We rely on the fact that, under metric ambiguity, both point-to-plane and plane-to-plane assignments can be expressed as second degree polynomials in scaled-quaternion and translation parameters. Our approach aims at maximizing the set of inlier assignments with guaranteed optimality of the consensus set. The consensus set maximization methods (Li 2009; Bazin et al. 2013) discussed above are not applicable because of the nonlinearity of the problem at hand. In our approach, we use the BnB algorithmic paradigm to explore the scaled-quaternion and translation parameter space. As in Li (2009) and Bazin et al. (2013), we rely on establishing optimistic and pessimistic sets of inlier assignments for pruning branches whose most optimistic sets are worse than the best pessimistic one. Our contribution is threefold:

  1. (i)

    We propose a novel modeling of the point-to-plane (possibly point-to-patch) and plane-to-plane correspondence problems. Our modeling is based on a rigorous Sum-of-Squares polynomial optimization theory and is used to derive new conditions to identify, with certainty, mismatched correspondences within parameter bounds. Our registration approach relies upon such conditions to build optimistic inlier assignment sets for given parameter bounds.

  2. (ii)

    We introduce SfM-specific constraints in our modeling, namely, a plane visibility criterion and optional vague constraints on the positions of the camera.

  3. (iii)

    Based on our modeling and constraints, we propose a globally optimal registration algorithm that maximizes the inlier set of either point-to-plane or plane-to-plane assignments in presence of putative correspondences along with its non-combinatorial counterpart in the absence of such correspondences.

Our paper is organized as follows: Sect. 2 provides the main results from polynomial SoS optimization theory that we exploit in our registration approach. The BnB paradigm is presented in Sect. 3. In Sect. 4, we present our working hypotheses along with the assignment polynomials arising in the point-to-plane and plane-to-plane alignment problems. In Sect. 5, we derive polynomial SoS conditions for identifying mismatches within given registration parameter bounds. In sect. 6, we present our modeling of additional geometric constraints to handle the scale ambiguity, to exploit vague knowledge on camera locations, and to perform point-to-patch rather than point-to-plane registration. Plane visibility, vague camera locations, and (to a lesser extent) scaling are rather SfM-specific constraints. These, and the point-to-patch constraints, may be used to boost the search for mismatches. This leads us to the statement of our main result that we give in Sect. 7 along with the description of our BnB registration algorithm for point-to-plane (or point-to-patch) and plane-to-plane registration. The discussion about our method is presented in Sect. 8. The results of our experiments are summarized and discussed in Sect. 9. Section 10 concludes our work.

2 Polynomial Sum-of-Squares Theory

In this section, we present an overview of some important results in Polynomial Sum-of-Squares optimization theory. These results account for the main ingredients of our registration approach.

Definition 1

(SoS and PSD) Let \(\mathbb {R}[\mathsf {x}]\) be the ring of polynomials in n variables, \(\mathsf {x}=(x_1,x_2,\ldots ,x_n)\), with real-valued coefficients. A polynomial \(f(\mathsf {x})\in \mathbb {R}[\mathsf {x}]\) is

  • Positive Semi-Definite (PSD) (or nonnegative) if \(f(\mathsf {x})\ge 0\) for all \(\mathsf {x}\in \mathbb {R}^n\);

  • Sum-of-Squares (SoS) if there exist polynomials \(f_i(\mathsf {x})\in \mathbb {R}[\mathsf {x}]\) such that \(f(\mathsf {x})=\sum _{i}f_i(\mathsf {x})^2\).

A SoS is obviously always PSD and the converse is generally untrue. However, Hilbert (1888) proved that, for some classes of polynomials including quadratic ones, a polynomial is PSD if and only if it is SoS. Checking whether a polynomial is PSD is NP-hard (though decidable) while checking whether a polynomial is SoS is computationally tractable using Semi-definite Programming (SDP) and employing the so-called Gram matrix of the polynomial.

Definition 2

(Gram matrix Powers and Wörmann 1998) Consider a polynomial \(f(\mathsf {x})\in \mathbb {R}[\mathsf {x}]\) of degree 2d. Let \(\mathbf{Z}_d(\mathsf {x})\) be the vector of monomials of \(f(\mathsf {x})\) up to monomials of degree d. The matrix \(\mathsf {G}\) such that \(f(\mathsf {x})=\mathbf{Z}_d(\mathsf {x})^\intercal \mathsf {G}{} \mathbf{Z}_d(\mathsf {x})\) is a Gram matrix of \(f(\mathsf {x})\).

Theorem 1

(Choi et al. 1995; Powers and Wörmann 1998) A polynomial \(f(\mathsf {x})\in \mathbb {R}[\mathsf {x}]\) of degree 2d is SoS if and only if there exists a real symmetric positive semi-definite Gram matrix of \(f(\mathsf {x})\).

Note that since odd-degree polynomials cannot be SoS, only even-degree polynomials are concerned by such test. Checking for the existence of a positive semi-definite Gram matrix \(\mathsf {G}\) boils down to solving a Linear Matrix Inequality (LMI) feasibility problem. LMI feasibility can be efficiently solved using the interior-point algorithm (Boyd and Vandenberghe 2004). Theorem 1 allows us to check whether a polynomial \(f(\mathsf {x})\) is nonnegative for every \(\mathsf {x}\in \mathbb {R}^n\). One is often interested in checking whether \(f(\mathsf {x})\) is nonnegative in a semi-algebraic set \(\mathcal {K}\) defined by polynomials \(g_i(\mathsf {x})\in \mathbb {R}[x]\) such that

$$\begin{aligned} { \mathcal {K}=\{\mathsf {x}\in \mathbb {R}^n: g_{i}(\mathsf {x})\ge 0, i=1\ldots m\}. } \end{aligned}$$
(1)

This can be answered via the so-called Positivstellensatz (Psatz) (Parrilo 2000). The Psatz states that \(f(\mathsf {x})\) is nonnegative on \(\mathcal {K}\) if and only if there exist SoS polynomials \(\sigma _v(\mathsf {x})\) such that

$$\begin{aligned} {f(\mathsf {x})=\sum _{v\in \{0,1\}^m}\sigma _v(\mathsf {x})g_1(\mathsf {x})^{v_1}g_2(\mathsf {x})^{v_2}\ldots g_m(\mathsf {x})^{v_m}.} \end{aligned}$$
(2)

Exploiting Psatz is difficult and may turn numerically intractable in practice because (2) requires \(2^m\) SoS \(\sigma _v\) polynomials. Putinar (1993) provides a much simpler Psatz under Archimedean conditions on the so-called quadratic module.

A module is a well-studied algebraic structure, like a vector space, defined over a ring rather than a field. Let the set of SoS polynomials be \(\mathcal {S}[\mathsf {x}]=\{g_i(\mathsf {x})\}_{i=1}^m\), defining the semi-algebraic set \(\mathcal {K}\). Then, we are interested in the modules defined over the semi-ring of \(\mathcal {S}[\mathsf {x}]\). A semi-ring is an algebraic structure similar to a ring, but there is no requirement of additive inverse of each element. Then, a subset of ring \(\mathbb {R}[\mathsf {x}]\) is a module over \(\mathcal {S}[\mathsf {x}]\), if it is closed under addition and scalar multiplication. In fact, the quadratic module is a cone of sum-of-squares –which is also a module over \(\mathcal {S}[\mathsf {x}]\)– defined as follows:

Definition 3

(Quadratic module Wagner 2009) The quadratic module \(M(g)=M(g_1,\ldots , g_m)\subset \mathbb {R}[x]\) of polynomials \(g_1(\mathsf {x})\), \(g_2(\mathsf {x})\), ..., \(g_m(\mathsf {x})\) is the set

$$\begin{aligned} {M(g)=\left\{ \sigma _{0}(\mathsf {x})+\sum _{i=1}^{m}\sigma _{i}(\mathsf {x})g_{i}(\mathsf {x}):\;each\; \sigma _{i}\, is\, SoS\right\} .} \end{aligned}$$
(3)

Definition 4

(Archimedean Wagner 2009) The quadratic module M(g) of polynomials \(g_1(\mathsf {x})\), \(g_2(\mathsf {x})\), ..., \(g_m(\mathsf {x})\) is Archimedean if \(N-\vert \vert \mathsf {x}\vert \vert ^2\in M(g)\) for some \(N\in {\mathbb {N}}\).

The term Archimedean can also be related to the Archimedean property of ordered or normed groups, and fields. Intuitively, it is the property of having no infinitely large elements therefore ensuring an upper bound. This property appears as Axiom V of Archimedes’ “On the Sphere and Cylinder”. In our case, the norm of the variables defining module M(g) is bounded by the upper bound \(\sqrt{N}\) for some \(N\in \mathbb {N}\).

Theorem 2

(Putinar’s Positivstellensatz Putinar 1993) Assume the quadratic module M(g) is Archimedean. If \(f(\mathsf {x})>0\) on \(\mathcal {K}\) (defined by (1)), then \(f(\mathsf {x})\in M(g)\).

In the special case of degree 2 polynomials, we further relax the Putinar’s Positivstellensatz during positivity test. This is carried our by using only positive scalar coefficients instead of SoS polynomials, with the help of the following lemma.

Lemma 1

(Finsler’s 1936/37) Let \(\mathsf {y}\) be a vector, \(\mathsf {Q}\) a symmetric matrix, \(\mathsf {B}\) a rectangular matrix—all real-valued and of appropriate dimensions—and \(\gamma >0\) a scalar. The following statements are equivalent:

  1. (i)

    \({\textsf {y} ^\intercal \textsf {Q y} }>0, \forall \mathsf {y} \not = 0: \mathsf {By} =0\).

  2. (ii)

    There exists \(\gamma : \mathsf {Q} + \gamma {\mathsf {B} ^\intercal \mathsf {B} >0}\).

Finsler’s lemma allows to convert the problem of checking the sign of a quadratic form over a subspace into solving a LMI problem.

Fig. 1
figure 1

Left: hierarchical barnching in two dimensional space. The subspaces represented by nodes \(a_i\) and \( a_{ij}\) are the results of, respectively, the first and second levels of branching. Right: results of two levels of branching represented as by a tree structure (Color figure online)

3 Branch-and-Bound Search

We perform the BnB search to obtain the parameter \(\mathsf {x}\in \mathbb {R}^n\) that maximizes the consensus among a given set of polynomials parametrized by \(\mathsf {x}\). While doing so, our BnB algorithm performs a hierarchical discretization of the optimization variables via a dynamically-built search-tree. The process of hierarchical discretization is also know as branching. The branching process explores the parameter space with the help of the optimistic and pessimistic number of inliers for each branch, seeking an \(\epsilon \)-gap parameter interval containing the globally optimal solution.

3.1 Dynamic Tree Construction

Every node of the dynamic tree represents a closed convex subspace of the optimization variables. It is defined by the lower and upper bounds of the variables in the form of two vectors \(\underline{\mathsf {x}}\) and \(\overline{\mathsf {x}}\) in \(\mathbb {R}^n\), respectively. The lower and upper bound entries \(\underline{{x}}_i\) and \(\overline{{x}}_i\) are such that \(\underline{{x}}_i\le \overline{{x}}_i\) for all \(i=1,\ldots ,n\). Thus a node is defined by the variable intervals \([\underline{\mathsf {x}}, \overline{\mathsf {x}}]\). \(\mathsf {x}\in [\underline{\mathsf {x}}, \overline{\mathsf {x}}]\) means that each entry \({x}_i\) of \(\mathsf {x}\) is such that \(\underline{{x}}_i\le {{x}}_i\le \overline{{x}}_i\).

Starting from a known interval, say \(\mathcal {B}_0 = [\underline{\mathsf {x}}_{init}, \overline{\mathsf {x}}_{init}]\), the tree is constructed by recursively dividing the interval into two or more tighter intervals. If \(\mathcal {B}_0\) is divided into \(\mathcal {B}_k,\,k=1,\ldots ,p\) intervals, it must satisfy \(\mathcal {B}_k\subset \mathcal {B}_0\) for all \(k=1,\ldots ,p\) and \(\mathcal {B}_0 = \mathcal {B}_1\cup \mathcal {B}_2\ldots \cup \mathcal {B}_p\).

We illustrate the tree construction process with an example of two variables, say \(\mathsf {x} = (x_1,x_2)^T\), scenario. Let \(a_0\) be the first node representing the interval \(\mathcal {B}_0\). The interval \(\mathcal {B}_0\) is subdivided into smaller intervals by breaking each variable bounds into two. Two variables, each with two bounds produce four combinations of smaller intervals, which are represented by first level nodes \(a_1\), \(a_2\), \(a_3\), and \(a_4\). The complete tree construction process is then carried out through similar branching in a recursive manner. Fig. 1(left) provides the graphical illustration of two levels of branching. The nodes created in this fashion are dynamically stored in a tree structure as shown in Fig. 1(right). In case of higher dimensional search space, trees are constructed in a very similar fashion. However, the number of branching per node can vary as per convenience.

3.2 Search Method

The Branch-and-Bound algorithmic paradigm relies on finding the optimistic and pessimistic number of inliers within each interval of variables. Any local solution, obtained using as a starting point a randomly picked sample from the given interval, can serve to determine a pessimistic number of inliers. Therefore, the most challenging part is to efficiently determine a tight optimistic number of inliers. An efficient method to estimate such optimistic number of inliers for our registration problem is given in Sect. 7. Here, we discuss the BnB process with an example while assuming that the optimistic and pessimistic inliers can be estimated for any given interval.

Fig. 2
figure 2

Branch-and-Bound pruning based on bounds on the number of inliers. Node \(a_1\) is pruned because its optimistic number of inliers is smaller than the pessimistic number of inliers of node \(a_4\) (Color figure online)

Consider the inliers measure \(\eta (\mathsf {x})\in \mathbb {N}\) and \(\underline{\eta }\), \(\overline{\eta }\), respectively, the pessimistic and optimistic numbers of inliers estimated in \([\underline{\mathsf {x}},\overline{\mathsf {x}}]\). We represent the node-specific bounds using the subscript corresponding to that node. For the first level nodes, the mapping of both bounds to the axis with the number of inliers is shown in Fig. 2. One can observe from this diagram that the pessimistic number of inliers of node \(a_4\) is greater than the optimistic number of inliers of \(a_1\) (i.e. \(\overline{\eta _1}<\underline{\eta _4}\)). Therefore, the solution corresponding to the maximum number of inliers cannot lie in the subspace represented by node \(a_1\). As a result, \(a_1\) can be safely pruned. In fact, if the pessimistic number of inliers of any node is greater than the optimistic number of inliers of any other node, then nodes with smaller optimistic number of inliers can always be pruned. This process can be applied repeatedly in a recursive manner until the globally optimal number of inliers is reached. The algorithm terminates when the maximum number pessimistic inliers (obtained so far) becomes equal to the maximum optimistic inliers among all remaining nodes. The complete BnB process seeking the optimal number of inliers \(\eta ^*\) is summarized in Algorithm 1. In our case, we also return the parameter \(\mathsf {x}^*\) corresponding to \(\eta ^*\).

figure a

4 Assignment Polynomials

In this paper, we consider a set of two or more calibrated cameras observing a scene consisting of a set \(\mathcal {P}\) of at least four distinct planes in general positions. The scene has been scanned by a 3D sensor and segmented into these planes. A plane \(\varPi \in \mathcal {P}\) is given by its normal 3-vector \({\pi }\) and signed distance to the origin d. We also consider the set \(\mathcal {Y}\) of seven or more points (lying on at least four distinct scene planes) whose projections are matched across two or more cameras. We distinguish two working hypotheses depending on whether image correspondences are sparse or dense:

  • when the set \(\mathcal {Y}\) is sparse, the SfM-induced (Hartley and Zisserman 2004) 3D points are likewise sparse. Each reconstructed point is then represented by a \(\mathsf {y}\in \mathbb {R}^3\) of Cartesian coordinates;

  • should \(\mathcal {Y}\) be sufficiently dense, the SfM-induced point cloud may further be segmented into a set of planes \(\mathcal {P}_r\). Each such plane is represented by its normal 3-vector \({\pi }_r\) and distance to the origin \(d_r\).

The coordinates of the SfM-reconstructed points and/or planes and those of the scanned scene planes are represented in two distinct reference frames. The two representations of the scene also differ by a generally unknown scale factor. Consequently, the transformation aligning the SfM-reconstructed scene (whether represented by points or by planes) and the scanned scene is represented by a \(3\times 3\) scaled-rotation matrix \(\mathsf {Q}\) and a translation 3-vector \(\mathsf {t}\). A quaternion representation with no enforcement of unit quaternion \(\mathsf {q}=(\begin{array}{cccc} z&u&v&w\end{array})^{\intercal }\) is used to represent the scaled-rotation matrix \(\mathsf {Q}\):

$$\begin{aligned} \mathsf {Q} =\left[ \begin{array}{ccc} z^{2}+u^{2}-v^{2}-w^{2} &{} 2uv-2wz &{} 2uw+2vz\\ 2uv+2wz&{} z^{2}-u^{2}+v^{2}-w^{2}&{}2vw-2uz\\ 2uw-2vz &{} 2vw+2uz &{} z^{2}-u^{2}-v^{2}+w^{2} \end{array}\right] . \end{aligned}$$

Aligning the SfM and scanned representations consists in finding \(\mathsf {Q}\) and \(\mathsf {t}\) that together map the 3D SfM-reconstructed points, or alternatively the SfM-induced planes, to their corresponding planes in the scanned scene. The problem of finding the correct matches between the 3D SfM-reconstructed points and scanned planes is referred to as the “point-to-plane” assignment problem. That of finding the correct match between the SfM and scanned planes is referred to as the plane-to-plane assignment problem. In both cases, the assignments are described by polynomials in the unknown entries of \(\mathsf {Q}\) and \(\mathsf {t}\) and which we denote by a vector \(\mathsf {x}\in \mathbb {R}^7\) such that \(\mathsf {x}=(\mathsf {q}^\intercal , \mathsf {t}^\intercal )^\intercal \).

Point-to-Plane Assignment Polynomials When relying on point-to-plane assignments for registering the image and scanner modalities, we consider \(\mathcal {A}\subset \mathcal {Y}\times \mathcal {P}\) as a set of putative point-to-plane assignments (\(\times \) refers to the cartesian product) and \(a=(Y,\varPi )\in \mathcal {A}\) is one such assignment. The polynomial \(f^{a}(\mathsf {x})\) in \(\mathbb {R}[\mathsf {x}]\) induced by a is given by:

$$\begin{aligned} { f^{a}{(\mathsf {x})}:= {\pi }^\intercal (\mathsf {Q}\mathsf {y}+\mathsf {t})-d. } \end{aligned}$$
(4)

If \(\mathsf {x}\) is the true registration parameter vector, then for every correct assignment \(a\in \mathcal {A}\), \(f^a(\mathsf {x})=0\). Note that the residual of polynomial \(f^{a}(\mathsf {x})\) is the orthogonal distance from point Y to plane \(\varPi \) for the parameter \(\mathsf {x}\). Accounting for scanning, reconstuction and round-off errors, this residual must lie within a reasonable margin for an assignment to be considered an inlier. Given \(\xi \ge 0\), a sufficiently small predefined distance threshold, inlier assignments ought to satisfy

$$\begin{aligned} \vert f^{a}(\mathsf {x})\vert \le \xi \text{ where } \xi \ge 0. \end{aligned}$$
(5)

Note that \(\xi \) representing a geometric error, it is fairly easy to set. We are interested in finding \(\mathsf {x}\) that maximizes the number of assignments by assessing the exact point-to-plane distance measure.

Plane-to-Plane Assignment Polynomials When carrying out plane-to-plane registration, the set of putative assignments \(\mathcal {A}\) represents a subset of \(\mathcal {P}_r\times \mathcal {P}\). If the euclidean point Y relates to the SfM reconstructed point \(Y_r\) by \(\mathsf {y}_r = \mathsf {Qy+t}\), then the Euclidean plane \(\varPi \) and reconstructed plane \(\varPi _r\) are related by,

$$\begin{aligned} \left[ \begin{array}{c} \pi _r\\ -d_r \end{array} \right] =\delta \left[ \begin{array}{cc} \mathsf {Q}^\intercal &{}\mathsf {0}\\ \mathsf {t}^\intercal &{}1 \end{array} \right] \left[ \begin{array}{c} \pi \\ -d \end{array} \right] , \end{aligned}$$
(6)

such that \( {\pi }_r^\intercal \mathsf {y}_r -d_r = {\pi }_r^\intercal (\mathsf {Qy+t}) -d_r = 0\) is satisfied. Now, after introducing the quaternion terms, the assignment \(a=(\varPi _r,\varPi )\in \mathcal {A}\) can be described by a 4-vector of degree 2 polynomials

$$\begin{aligned} {f}^{a}(\mathsf {x}):=\mathsf {q}^\intercal \mathsf {q} \left[ \begin{array}{c} \pi _r\\ -d_r \end{array} \right] +\delta \left[ \begin{array}{cc} \mathsf {Q}^\intercal &{}\mathsf {0}\\ \mathsf {t}^\intercal &{}1 \end{array} \right] \left[ \begin{array}{c} \pi \\ -d \end{array} \right] \end{aligned}$$
(7)

Should \(\mathsf {x}\) be the true registration parameter vector, all four polynomials simultaneously vanish for some value of \(\delta =\pm 1\). Similar to (5), accounting for errors, an inlier assignment must satisfy

$$\begin{aligned} \vert {f}_i^{a}(\mathsf {x})\vert \le \xi \text{ for } i=1,2,3,4 \end{aligned}$$
(8)

where \({f}^{a}(\mathsf {x})=({f}_1^{a}(\mathsf {x}),{f}_2^{a}(\mathsf {x}),\ldots ,{f}_4^{a}(\mathsf {x}))\). However, this time, the residuals do not correspond to a geometric distance but are rather of an algebraic nature. The threshold \(\xi \) in this case is hard to set. Nevertheless, a useful threshold can generally be obtained empirically such as trial-and-error. The work herein assumes such threshold is given.

5 Polynomial SoS Assignment Conditions

Our goal is to simultaneously estimate the registration parameters \(\mathsf {x}\) and associated set of correct assignments. The problem can be solved by considering either point-to-plane or plane-to-plane assignments. Note, however, that we voluntarily do not distinguish between the cases in which the initial set \(\mathcal {A}\) is a point-to-plane putative assignments set from that in which it is a plane-to-plane one. Unless stated otherwise, \(f^{a}(\mathsf {x})\) is always considered as a vector of polynomials of appropriate dimension p: a 1-vector with \(p=1\) to represent the single polynomial (4) induced by a point-to-plane putative assignment or, with \(p=4\), a 4-vector of polynomials (7) induced by a plane-to-plane assignment. We use \(f_i^{a}(\mathsf {x})\) to refer to the \(i^{th}\) entry of \(f^{a}(\mathsf {x})\), \(1\le i\le p\).

We may assume, for the sake of clarity of the exposition, that, in the case of plane-to-plane assignments, the value of \(\delta =\pm 1\) in (4) is known. It will be made clear, further in this section (see Result 3), how the two cases are handled in our method.

Our registration approach is based on the BnB algorithmic paradigm and branching is carried out on the space of registration parameters \(\mathsf {x}\), as discussed in Sect. 3. At each iteration, we are given parameter interval \([\underline{\mathsf {x}}, \overline{\mathsf {x}}]\). We probe this interval for potentially correct assignments by attempting to solve, for each assignment, the following problem:

Problem 1

For given \(a\in \mathcal {A}\) and threshod \(\xi \ge 0\), is there a vector \(\mathsf {x}\in [\underline{\mathsf {x}}, \overline{\mathsf {x}}]\) such that \(\displaystyle \vert f_i^{a}(\mathsf {x})\vert \le \xi \) for all \(i=1\ldots p\)?

In other words, through Problem 1, one would like to know whether all polynomials in \(f^{a}(\mathsf {x})\) are within the desired margin for some \(\mathsf {x}\) within the considered bounds. The assignment would then qualify as a potential inlier, i.e. a possibly correct assignment, within these bounds. This is however difficult to answer. One alternative is to consider the following problem:

Problem 2

For given \(a\in \mathcal {A}\) and threshod \(\xi \ge 0\), is there a polynomial \(f_i^{a}(\mathsf {x})\) in vector \(f^{a}(\mathsf {x})\) satisfying \(\displaystyle \vert f_i^{a}(\mathsf {x})\vert > \xi \) for every \(\mathsf {x}\in [\underline{\mathsf {x}}, \overline{\mathsf {x}}]\)?

If the question of Problem 2 is answered in the affirmative, the one of Problem 1 is answered in the negative: i.e. there exists no \(\mathsf {x}\in [\underline{\mathsf {x}}, \overline{\mathsf {x}}]\) with which \(\vert f_i^{a}(\mathsf {x})\vert \le \xi \) for all \(i=1,\ldots ,p\). In such case, the assignment a is definitely an outlier, i.e. incorrect assignment, within the bounds. Otherwise, it is a potential inlier.

One can rely on Putinar’s Theorem 2 to solve Problem 2. To do so, assume we are given a set of polynomials \(g_k({\mathsf {x}})\) whose quadratic module M(g) is Archimedean: if, for a scalar \(\lambda _i^a\) such that \(-1\le \lambda _i^a\le +1\), the polynomial \(\lambda _i^af_i^{a}(\mathsf {x})>\xi \) for all \(\mathsf {x}\in \mathcal {K}=\{\mathsf {x}\in \mathbb {R}^7: g_{k}(\mathsf {x})\ge 0, k=1\ldots m\}\), then \(\lambda _i^a f_i^{a}(\mathsf {x})\in M(g)\). Hence, SoS polynomials \(\sigma _{k}(\mathsf {x})\) and scalar \(-1\le \lambda _i^a\le +1\) exist such that:

$$\begin{aligned} { \lambda _i^a f_i^{a}(\mathsf {x})-\xi -\sum ^m_{k=1}\sigma _k(\mathsf {x})g_{k}(\mathsf {x}) \text{ is } \text{ SoS }.} \end{aligned}$$
(9)

Note that, in general, if (9) is satisfied, then \(\lambda _i^a f_i^{a}(\mathsf {x})\) may not necessarily be positive in \(\mathcal {K}\) since \(\mathcal {K}\) could possibly be empty. However, so long as \(\mathcal {K}\) is not empty and \(\sigma _{k}(\mathsf {x})\) SoS polynomials and scalar \(\lambda _i^a\) can be found, one is guaranteed that \(\lambda _i^a f^{a}_i(\mathsf {x})>\xi \) everywhere in \(\mathcal {K}\) since \(\sum ^m_{k=1}\sigma _k(\mathsf {x})g_{k}(\mathsf {x})>0\) in \(\mathcal {K}\). Moreover, if \(-1\le \lambda _i^a\le +1\), then this guarantees that either \(f_i^{a}(\mathsf {x})>\xi \) or \(-f_i^{a}(\mathsf {x})>\xi \) everywhere in \(\mathcal {K}\): assignment a would be an outlier.

There are two main pending issues before one is able to use (9). First, one needs to find a set of polynomials \(g_k({\mathsf {x}})\), representative of the parameter intervals, whose quadratic module M(g) is Archimedean. Second, it is so far unclear how the \(\sigma _{k}(\mathsf {x})\) SoS polynomials can be found. Let us explore now the first of these issues. Note that the Archimedean property is a matter of representation and the quadratic module of the set constructed from the linear interval constraints \({x}_k-\underline{{x}}_k\ge 0\) and \(\overline{{x}}_k-{x}_k\ge 0\) is not Archimedean. In the following, we show that quadratic polynomial inequalities derived from such bound constraints yield an Archimedean quadratic module.

Proposition 1

Consider the polynomials \(g_k(\mathsf {x})=({x}_k-\underline{{x}}_k)(\overline{{x}}_k-{x}_k)\), \(k=1\ldots 7\). The quadratic module M(g) of these polynomials is Archimedean.

Proof

As per Definition 4, for M(g) to qualify as Archimedean, one ought to demonstrate that \(N-\sum ^n_{k=1}{x}_k^2\in M(g)\) for some \(N\in {\mathbb {N}}\). Definition 3 indicates that, for \(N-\sum ^n_{k=1}{x}_k^2\) to be in M(g) with the \(g_k(\mathsf {x})\) polynomials of this proposition, there must exist SoS polynomials \(\sigma _0(\mathsf {x})\) and \(\sigma _k(\mathsf {x})\), \(k=1\ldots 7\), such that

$$\begin{aligned} N-\sum ^n_{k=1}{x}_k^2=\sigma _{0}(\mathsf {x})+\sum _{k=1}^{7}\sigma _{k}(\mathsf {x})({x}_k-\underline{{x}}_k)(\overline{{x}}_k-{x}_k). \end{aligned}$$
(10)

In particular, because \(\sigma _0(\mathsf {x})\) must be SoS, the proof boils down to establishing the existence of SoS \(\sigma _k(\mathsf {x})\), \(k=1\ldots 7\) that can turn \(\sigma _0(\mathsf {x})\), of the form

$$\begin{aligned} \sigma _0(\mathsf {x})=N-\sum ^7_{k=1}\mathsf {x}_k^2-\sum _{k=1}^{7}\sigma _{k}(\mathsf {x})({x}_k-\underline{{x}}_k)(\overline{{x}}_k-{x}_k), \end{aligned}$$
(11)

into a SoS polynomial. The polynomial \(\sigma _0(\mathsf {x})\) can be rewritten as

$$\begin{aligned} \sigma _0(\mathsf {x})=N-\sum ^7_{k=1}\left( {x}_k^2+\sigma _{k}(\mathsf {x})({x}_k-\underline{{x}}_k)(\overline{{x}}_k-{x}_k)\right) . \end{aligned}$$
(12)

Let us first show that a PSD, not necessarily SoS, \(\sigma _0(\mathsf {x})\) exists. To do so, we thus seek \(\sigma _k(\mathsf {x})\), \(k=1\ldots 7\), for which there exists N satisfying

$$\begin{aligned} N{\scriptstyle \ge }{\max }_{\mathsf {x}}\left( \sum ^7_{k=1}\left( {x}_k^2+\sigma _{k}(\mathsf {x})({x}_k-\underline{{x}}_k)(\overline{{x}}_k-{x}_k)\right) \right) . \end{aligned}$$
(13)

Such N exists if the polynomial argument of \(\max (.)\) is concave. Observe that using zero-degree SoS polynomials \(\sigma _k\) that are independent from \(\mathsf {x}\), i.e. nonnegative real scalars, this polynomial is quadratic. Its expansion into

$$\begin{aligned} \sum ^7_{k=1}\left( (1-\sigma _k){x}^2_k+\sigma _k(\underline{x}_k+\overline{x}_k){x}_k-\sigma _k\underline{x}_k\overline{x}_k\right) \end{aligned}$$
(14)

shows that its Hessian matrix, \(\mathsf {H}=diag((1-\sigma _1),(1-\sigma _2),\ldots ,(1-\sigma _7))\), is diagonal. This polynomial is then concave if \(\mathsf {H}\) is negative-definite which happens for \(\sigma _k>1\) for \(k=1\ldots 7\). This shows that N and nonnegative scalars (SoS) \(\sigma _k\), \(k=1\ldots 7\), do exist for \(\sigma _0(\mathsf {x})\) to be PSD. Furthermore, \(\sigma _k\) being scalars, \(\sigma _0(\mathsf {x})\) is a quadratic polynomial. As discussed in Sect. 2, Hilbert (1888) showed that, for quadratic polynomials, every PSD polynomial is SoS. \(\square \)

Let us now consider the problem of checking whether or not (9) is SoS when considering the polynomials \(g_k(\mathsf {x})\), \(k=1\ldots 7\) of Proposition 1. If so the assignment a is definitely an outlier within the bounds. If one knows beforehand that \(\lambda _i^a f_i^{a}(\mathsf {x})-\xi \) must be positive, a sequence of \(\sigma _k(\mathsf {x})\) of increasing degree can be used until a positivity certificate is obtained. However, for the problem at hand, when a set of \(\sigma _k(\mathsf {x})\) of some degree fails to deliver such certificate, it is either because \(\lambda _i^a f_i^{a}(\mathsf {x})-\xi \) is indeed not positive or the required degree for a positivity certificate has not been reached. The good news here is that, within a BnB search, the considered bound intervals \([\underline{\mathsf {x}},\overline{\mathsf {x}}]\) get smaller and we show in the following that using nonnegative scalars \(\sigma _k\) rather than SoS polynomials of higher degree suffices. To see this, consider the following proposition:

Proposition 2

Let \(\hat{\mathsf {x}}\in \mathbb {R}^7\) with known entries. The following statements are equivalent

  1. (i)

     \(\lambda _i^af_i^{a}(\hat{\mathsf {x}})-\xi >0\).

  2. (ii)

    There exist SoS polynomials \(\sigma _k(\mathsf {x})\), \(k=1\ldots 7\):

    $$\begin{aligned} \lambda _i^af_i^{a}(\mathsf {x})-\xi +\sum ^7_{k=1}({x}_k-\hat{{x}}_k)^2\sigma _k(\mathsf {x})>0. \end{aligned}$$
    (15)

    If \(f_a(\mathsf {x})\) is a polynomial of degree 2, nonnegative scalars \(\sigma _k\), instead of \(\sigma _k(\mathsf {x})\) for all k, is sufficient.

Proof

(ii)\(\implies \) (i) is straightforward.

For (i)\(\implies \)(ii), since \(\lambda _i^a f_i^{a}(\mathsf {x})\in M(g)\), the SoS polynomial of (9) can be expressed as,

$$\begin{aligned} \sigma _0&= \lambda _i^af_i^{a}(\mathsf {{x}})-\xi -\sum ^7_{k=1}\sigma _k(\mathsf {x})g_{k}(\mathsf {{x}}), \nonumber \\&= \lambda _i^af_i^{a}(\mathsf {x})-\xi +\sum ^7_{k=1}({x}_k-\hat{{x}}_k)^2\sigma _k(\mathsf {x}). \end{aligned}$$
(16)

When \(f_i^a(\mathsf {x})\) is a polynomial of degree 2, let \(\mathsf {G}_f\) be the Gram matrix of \(\lambda _i^af_i^{a}(\mathsf {x})-\xi \) and \(\mathsf {G}_\mathsf {x}\) be that of \(\sum ^7_{k=1}({x}_k-\hat{{x}}_k)^2\). These matrices are defined by: \(\lambda _i^af_i^{a}(\mathsf {x})-\xi =\mathbf{Z}_1(\mathsf {x})^\intercal \mathsf {G}_f\mathbf{Z}_1(\mathsf {x})\) and \(\sum ^7_{k=1}({x}_k-\hat{{x}}_k)^2=\mathbf{Z}_1(\mathsf {x})^\intercal \mathsf {G}_\mathsf {x}\mathbf{Z}_1(\mathsf {x})\). Note that \(\mathsf {G}_\mathsf {x}\) is PSD and can be written as \(\mathsf {G}_\mathsf {x}=\mathsf {U}^\intercal \mathsf {U}\) with \({\mathsf {U}{\hat{\mathsf {x}}}=\mathsf {0}}\). The Gram matrix of the polynomial in (15) is then written as \(\mathsf {G}_f+\mathsf {U}^\intercal \text{ diag }(\sigma _1,\sigma _2,\ldots ,\sigma _7)\mathsf {U}\). A direct application of Lemma 1 is that the latter matrix is positive-definite if and only if \(\mathbf{Z}_1({\hat{\mathsf {x}}})^\intercal \mathsf {G}_\mathsf {f}\mathbf{Z}_1({\hat{\mathsf {x}}})>0\). This, not only shows (i)\(\implies \)(ii), but also provides their equivalence. \(\square \)

Although, the equivalence can also be proven using one \(\sigma \) instead of multiple \(\sigma _k\) for the case of degree 2 polynomials, the Proposition 2 is only a proof that the outlier certificate can definitely be obtained when the bounding intervals get smaller during the BnB search. Since we are interested in detecting outliers as early as possible, we suggest using multiple \(\sigma _k\).

We now state the following preliminary result:

Result 3

(Preliminary) Let \(\xi \ge 0\) be a predefined threshod distinguishing inliers from outliers. Consider two vectors \(\underline{\mathsf {x}}\) and \(\overline{\mathsf {x}}\) in \(\mathbb {R}^7\) whose respective entries \(\underline{{x}}_k\) and \(\overline{{x}}_k\) satisfy \(\underline{{x}}_k\le \overline{{x}}_k\) for \(k=1\ldots 7\). Let \(\mathcal {K}_b\) be the set

$$\begin{aligned} { \mathcal {K}_b=\{\mathsf {x}\in \mathbb {R}^7: g_k(\mathsf {x}):=({x}_k-\underline{{x}}_k)(\overline{{x}}_k-{x}_k)\ge 0\}, } \end{aligned}$$
(17)

\(a\in {{\mathcal {A}}}\) be either a point-to-plane or plane-to-plane putative assignment, and \(f^a(\mathsf {x})\) the vector of polynomials \(f_i^a(\mathsf {x})\), \(i=1\ldots p\), induced by this assignment [(4) for point-to-plane and (7) for plane-to-plane].

If, for at least one of the polynomials \(f_i^a(\mathsf {x})\), a scalar \(-1\le \lambda _i^a\le +1\) and nonnegative scalars \(\sigma _k\) exist such that

$$\begin{aligned} {\lambda _i^af_i^{a}(\mathsf {x})-\xi -\sum ^7_{k=1}g_{k}(\mathsf {x})\sigma _k} \end{aligned}$$
(18)

is SoS, then \(\lambda _i^af_i^{a}(\mathsf {x})>\xi \), hence \(\vert f_i^{a}(\mathsf {x})\vert >\xi \), for every \(\underline{{x}}_k\le {x}_k\le \overline{{x}}_k\). In this case, assignment a is then guaranteed to be an outlier (a point-to-plane or plane-to-plane mismatch) within these bounds. Assignment a is a potential inlier only if (18) is not SoS for any \(i=1\ldots 4\). Note that, in the case of a plane-to-plane assignment a, the latter is deemed an outlier only if (18) is found to be SoS for both values \(\pm 1\) of \(\delta \) in (7).

Furthermore, a consequence of Proposition 2 is that when \(\overline{\mathsf {x}}_k-\underline{\mathsf {x}}_k\) tends towards zero, we are guaranteed that any outlier within the bound is detected. Indeed, this can be seen by noticing that when \(\overline{\mathsf {x}}_k=\underline{\mathsf {x}}_k=\hat{\mathsf {x}}_k\), polynomial (18) turns into (15).

Whether (18) is SoS can be tested by converting it into its corresponding Gram matrix LMI feasibility problem for the \(\lambda _i^a\) and \(\sigma _k\) indeterminates. Both \(\lambda _i^a\) and \(\sigma _k\) appear in their respective Gram matrices while searching for the SoS certificate of (18). Although the guarantee of identifying outliers (using scalar multipliers \(\sigma _k\))is demonstrated with a zero-gap bound, in practice, outliers are detected very early in the process. As demonstrated in our experiments, the ability to detect outliers is improved with every size reduction of the investigated bounds. It may be tempting to use higher degree \(\sigma _k(\mathsf {x})\) SoS polynomials to boost the process. However, this is unnecessary and yields slower performances compared to branching.

6 Semi-Algebraic Sets for Geometric Constraints

Recall that our goal is to register a SfM-induced reconstruction of points or planes and a plane-segmented scanned scene. Unlike when dealing with 3D-3D registration, additional geometric constraints emanating from the cameras can be exploited. Some may be implicit, such as plane visibility, others, such as vague camera locations, may be obtained from extra knowledge. In addition, when dealing with segmented scanned scenes, one is given planar patches rather than infinite planes. These patches can be used for point-to-patch in lieu of point-to-plane registration to restrict the location of the SfM-induced points patches. In all cases, these constraints are described by quadratic polynomial inequalities that may augment the semi-algebraic set \(\mathcal {K}_{b}\) derived from the bound constraints. Indeed, adding new polynomial inequalities to those in \(\mathcal {K}_b\) has no effect on the Archimedean property of its quadratic module and Proposition 2 still holds. Note though that some of the constraints presented herein, which we refer to as “generic constraints”, may be exploited in both point-to-plane and plane-to-plane registration. Others are applicable only to point-to-plane registration.

Fig. 3
figure 3

Illustrations of camera bounds delimited by six planes (left) defining a bounding box, plane as a patch delimited by four planes (middle), and plane visibility for two sets of cameras lying on two different sides (right) (Color figure online)

6.1 Generic Constraints

Whether engaging in point-to-plane or plane-to-plane, extra-knowledge about vague camera locations or scale information can be used as additional registration constraints.

Camera Bounds A camera center C may lie within a box delimited by six planes in the set \(\varPsi =\{\varPsi _k\}^6_{k=1}\) defined by their normal vectors \(\psi _k\) and signed distances \(d_k\), as illustrated in Fig. 3(left). Such information can be obtained from application-specific knowledge (GPS, moving vehicle, etc.). This knowledge can be used for further enforcing the search for point-to-plane or plane-to-plane outliers and turns very useful when no putative correspondences are initially known. Consider the cartesian coordinate vector \(\mathsf {c}\) of the camera center and let

$$\begin{aligned} \mathcal {K}_{c}= & {} \{\mathsf {x}\in \mathbb {R}^7: h_k(\mathsf {x}):=({\psi }_k^\intercal (\mathsf {Q}\mathsf {c}+\mathsf {t})-d_k)\delta _k\ge 0,\nonumber \\&\quad k=1\ldots 6\} \end{aligned}$$
(19)

where \(\delta _k\) is the known sign, with respect to \(\varPhi _k\), of any point within the considered box. If \(h_k(\mathsf {x})\) are positive, the camera center is within the box. One can now test if \(\lambda _i^af_i^a(\mathsf {x})-\xi >0\) whenever the camera center is in the box defined by \(\mathcal {K}_{c}\).

Quaternions and Scale In the absence of scale, quaternion parameters demand that \(\mathsf {q}^\intercal \mathsf {q}=1\). When dealing with a scaled scene, the rotation is represented by a scaled quaternion matrix and one can only enforce that \(\mathsf {q}^\intercal \mathsf {q}>0\). It is understood that, in order to keep the problem numerically tractable via the Archimedean property, all registration parameters need to be bounded. The scale of the scene is no exception. When a better lower bound \(\underline{s}>0\) on the scale s is available, it is advised to enforce that \(\mathsf {q}^\intercal \mathsf {q}\ge \underline{s}\). This condition does not appear in the set \(\mathcal {K}_b\) and hence must be accounted for. Assuming the entries \(\mathsf {x}_k\), \(k=1\ldots 4\) of \(\mathsf {x}\) correspond the quaternion parameters, we consider the set

$$\begin{aligned} { \mathcal {K}_q=\left\{ \mathsf {x}\in \mathbb {R}^7: q(\mathsf {x}):=-\underline{s}+\sum ^4_{k=1}\mathsf {x}^2_k\ge 0\right\} } \end{aligned}$$
(20)

Furthermore, since both \(\mathsf {q}\) and \(-\mathsf {q}\) yield the same rotation matrix, the initial lower bound of one of the quaternion parameters may arbitrarily be chosen nonnegative. The rest of the quaternion parameters may be initially bounded between \(-\sqrt{\overline{s}}\) and \(\sqrt{\overline{s}}\) where \(\overline{s}\) is the upper bound of the scale.

6.2 Additional Point-to-Plane Geometric Constraints

In addition to camera bounds and scale constraints, other constraints can be used when dealing specifically with point-to-plane registration.

Patches Consider a plane \(\varPi \), from the scanned scene, and three or more planes \(\varPhi _k\), not necessarily from the scene, orthogonal to it. The \(\varPhi _k\) planes must be chosen such that their intersection with \(\varPi \) defines a convex region on \(\varPi \). The set of points on \(\varPi \) within this convex region is a patch. In practice, four such planes are adequate to represent meaningful patches in man-made environments, as shown in Fig. 3(middle). Each \(\varPhi _k\) is described by its normal vector \(\phi _k\) and signed distance \(d_k\). Let us denote by \(\varPhi \) the set \(\{\varPhi _k\}^4_{k=1}\) and let \(\delta _k=\pm 1\) be the known sign, with respect to \(\varPhi _k\), of a scanned point lying within the considered region. In the point-to-patch case, we can then identify outliers by checking whether point-to-plane \(f_i^a(\mathsf {x})\) of (4) is positive everywhere within the bounds of \(\mathsf {x}\) and in the set

$$\begin{aligned} \mathcal {K}^\varPhi _{a}= & {} \{\mathsf {x}\in \mathbb {R}^7: p_k(\mathsf {x}):=({\phi }_k^\intercal (\mathsf {Q}\mathsf {y}+\mathsf {t})-d_k)\delta _k\ge 0,\nonumber \\&\quad k=1\ldots 4\} \end{aligned}$$
(21)

The polynomials in this set indicate the sign of the point Y, with coordinates \(\mathsf {y}\), with respect to each \(\varPhi _k\).

Plane Visibility Consider a point Y on a scene plane \(\varPi \). If this point is imaged by two cameras, then these can only observe the same side of the plane: the one on which the point lies, as shown in Fig. 3(right). In order for the cameras to observe the same side of the plane, their camera centers must lie on one side with respect to \(\varPi \). Camera centers can easily be obtained from the SfM-calculated camera matrices: they are their right null space. Let \(C_k\) be the camera centers of \(n\ge 2\) cameras with cartesian coordinates \(\mathsf {c}_k\). We define the set \(\mathcal {K}^\delta _{\varPi }\) such that

$$\begin{aligned} \mathcal {K}^\delta _{\varPi }= & {} \{\mathsf {x}\in \mathbb {R}^7: v_k(\mathsf {x}):=({\pi }^\intercal (\mathsf {Q}\mathsf {c}_k+\mathsf {t})-d)\delta \ge 0,\nonumber \\&\quad k=1\ldots n\} \end{aligned}$$
(22)

where \(\delta =\pm 1\). We denote \(\mathcal {K}^+_{\varPi }\) the set \(\mathcal {K}^\delta _{\varPi }\) obtained using \(\delta =+1\) and \(\mathcal {K}^-_{\varPi }\) otherwise. A given assignment a is a definite outlier if \(f_a(\mathsf {x})>0\) in \(\mathcal {K}^+_{\varPi }\) and in \(\mathcal {K}^-_{\varPi }\) (in addition to patch and bounds conditions). Furthermore, planes for which \(v_1(\mathsf {x})\) and \(v_2(\mathsf {x})\) (for two cameras 1 and 2) always have opposite signs within the bounds of \(\mathsf {x}\) cannot be assigned any points visible in those cameras. This would indicate that the plane always cuts the base-line of the two camera and cannot contain points visible in both cameras. Testing this can be carried out by checking, for \(\delta =\pm 1\), whether

$$\begin{aligned} {\left\{ \begin{array}{l} \exists \sigma _k: \,v_1(\mathsf {x})-\sum ^7_{k=1}g_{k}(\mathsf {x})\sigma _k\;\hbox {is SoS},\\ \exists \sigma _k: -v_2(\mathsf {x})-\sum ^7_{k=1}g_{k}(\mathsf {x})\sigma _k\;\hbox {is SoS}. \end{array}\right. } \end{aligned}$$
(23)

If for both values of \(\delta \), each polynomial in (23) is SoS, plane \(\varPi \) shall not be considered for assigning SfM points emanating from those cameras.

7 Registration

The preliminary results in Result 3 apply to both point-to-patch and plane-to-plane registration problems. Based on these results and the semi-algebraic sets presented in Sect. 6, we are now ready to state additional results for point-to-plane, or rather plane-to-patch, and plane-to-plane registration. These results along with those in Result 3 are used in our BnB registration algorithm which is also presented in this Section.

Result 4

(Point-to-patch) Let \(\xi \ge 0\) be a predefined threshod distinguishing inliers from outliers. Assume we are given a putative point-to-plane assignment \(a=(Y,\varPi )\in \mathcal {A}\), a patch on \(\varPi \) delimited by the planes in the set \(\varPhi =\{\varPhi _k\}^4_{k=1}\), lower \(\underline{\mathsf {x}}\) and upper \(\overline{\mathsf {x}}\) bounds on the registration parameter vector \(\mathsf {x}\), bounds \(\underline{s}\) and \(\overline{s}\) on the scale of the scene, and (optionally) bounds defined by planes \(\varPsi =\{\varPsi _k\}^6_{k=1}\) on the location of the camera centers of one (possibly more) camera. One would like to know whether or not the SfM-reconstructed point Y may lie on \(\varPi \), while \(\varPi \) is visible by the cameras observing Y, within the patch \(\varPhi \) with registration parameters in the bounds \(\underline{\mathsf {x}}\) and \(\overline{\mathsf {x}}\). In order to establish whether such assignment is possible, we consider the set

$$\begin{aligned} { \mathcal {K}=\{\mathsf {x}\in \mathbb {R}^7: \mathsf {x}\in \mathcal {K}_b\cap \mathcal {K}^\varPhi _{a}\cap \mathcal {K}^\delta _{\varPi }\cap \mathcal {K}_{c}\cap \mathcal {K}_q) } \end{aligned}$$
(24)

resulting from the intersection of all the sets defined by (17), (21), (22), (19) and (20). If there exist a scalar \(-1\le \lambda ^a\le +1\) and nonnegative scalars \(\sigma _k\), \(\sigma ^\prime _k\), \(\sigma _k^{\prime \prime }\), \(\sigma _k^{\prime \prime \prime }\) and \(\sigma \) such that [considering \(f^a(\mathsf {x})\) given by (4)]

$$\begin{aligned} {\begin{array}{l} \lambda ^af^{a}(\mathsf {x})-\xi -\sum ^7_{k=1}g_{k}(\mathsf {x})\sigma _k-\sum ^4_{k=1}p_k(\mathsf {x})\sigma ^\prime _k\\ sum^n_{k=1}v_k(\mathsf {x})\sigma ^{\prime \prime }_k-\sum ^6_{k=1}h_k(\mathsf {x})\sigma ^{\prime \prime \prime }_k-q(\mathsf {x})\sigma \end{array}} \end{aligned}$$
(25)

is SOS, then \(\vert f^a(\mathsf {x})\vert >\xi \) in \(\mathcal {K}\) and the assignment a is a definite outlier. It is a potential inlier otherwise.

Note that a point-to-plane version of this results can simply be obtained by not using \(\mathcal {K}^\varPhi _{a}\) to construct \(\mathcal {K}\).

Result 5

(Plane-to-plane) Let \(\xi \ge 0\) be a predefined threshod distinguishing inliers from outliers. Assume we are given a putative plane-to-plane assignment \(a=(\varPi _r,\varPi )\in \mathcal {A}\), lower \(\underline{\mathsf {x}}\) and upper \(\overline{\mathsf {x}}\) bounds on the registration parameter vector \(\mathsf {x}\), bounds \(\underline{s}\) and \(\overline{s}\) on the scale of the scene, and (optionally) bounds defined by planes \(\varPsi =\{\varPsi _k\}^6_{k=1}\) on the location of the camera centers of one (possibly more) camera. One would like to know whether or not the SfM-reconstructed plane \(\varPi _r\) may be aligned with \(\varPi \) for registration parameters in the bounds \(\underline{\mathsf {x}}\) and \(\overline{\mathsf {x}}\). In order to establish whether such assignment is possible, we consider the set

$$\begin{aligned} { \mathcal {K}=\{\mathsf {x}\in \mathbb {R}^7: \mathsf {x}\in \mathcal {K}_b\cap \mathcal {K}_{c}\cap \mathcal {K}_q) } \end{aligned}$$
(26)

resulting from the intersection of all the sets defined by (17), (19) and (20). If, for at least one polynomial \(f_i^a(\mathsf {x})\), \(i=1\ldots 4\), there exists a scalar \(-1\le \lambda _i^a\le +1\) and nonnegative scalars \(\sigma _k\), \(\sigma ^\prime _k\) and \(\sigma \) such that

$$\begin{aligned} {\lambda _i^af_i^{a}(\mathsf {x})-\xi -\sum ^7_{k=1}g_{k}(\mathsf {x})\sigma _k-\sum ^6_{k=1}h_k(\mathsf {x})\sigma ^{\prime }_k-q(\mathsf {x})\sigma } \end{aligned}$$
(27)

is SOS, then \(\vert f_i^a(\mathsf {x})\vert >\xi \) in \(\mathcal {K}\) and the assignment a is a definite outlier. It is a potential inlier only if (27) is not SoS for any \(i=1\ldots 4\).

Recall that SoS problems (25) and (27) in these results can be solved as a LMI feasibility problem.

Our registration approach is based on Results 4 and 5 . In the following, we use the term point-to-plane to refer to both point-to-plane and point-to-patch assignments. Since the goal of the BnB algorithm is to estimate the registration parameters yielding the largest number of inliers, our algorithm is provided either a set of putative point-to-plane or plane-to-plane correspondences. In the absence of such correspondences, we consider every SfM-induced point or plane to be putatively assigned to all the planes (or patches) in the scanned scene. A dynamically-built search tree, whose nodes are registration parameter bounds, allows to explore the space of parameters, as discussed in Sect. 3.

Optimistic Inliers Given a set of putative assignments and the semi-algebraic set for parameter bounds, our algorithm estimates the optimistic number of potential inliers \(\overline{\eta }\) (and the corresponding inlier set \(\overline{\mathcal {I}}\)) using Results 4 and 5 , with appropriate additional semi-algebraic sets for generic constraints. To qualify an assignment to be an inlier, we distinguish two cases:

  1. 1.

    Putative point-to-plane (resp. plane-to-plane) correspondences are provided: a point-to-plane (resp. plane-to-plane) assignment qualifies as a potential inlier if (25) (resp. 27) is not proven SoS for any polynomial in vector \(f^a(\mathsf {x})\) for the given assignment.

  2. 2.

    No putative correspondences are provided: all possible point-to-plane (resp. plane-to-plane) assignments are tested. An assignment is considered a potential inlier if (25) (resp. 27) is not proven SoS for any polynomial in vector \(f^a(\mathsf {x})\) of that assignment.

Now we summarize the process of estimating optimistic number inliers in Algorithm 2.

figure b

Pessimistic Inliers A local refinement method is used to obtain a pessimistic number of inliers \(\underline{\eta }\) (and the corresponding inlier set \(\underline{\mathcal {I}}\)) for each node. To maximize the pessimistic number of inliers, a local method iteratively refines the registration parameters. The refinement process starts from the mid-values of the registration parameter bounds. In order to be representative of the node, it searches the optimal solution within the investigated bounds. Given the set of optimistic inlier assignments \(\overline{\mathcal {I}}\) (obtained from Algorithm 2) and the semi-algebraic set \(\mathcal {K}_b\) representing the bounds under consideration, the algorithm iteratively updates the registration parameters to:

$$\begin{aligned} {\hat{\mathsf {x}}} = \mathop {\hbox {argmin}}\limits _{\mathsf {x}\in \mathcal {K}_{b}}\sum _{a\in \overline{\mathcal {I}}}{||f^a(\mathsf {x})||^2.} \end{aligned}$$
(28)

The process of estimating the pessimistic number inliers is given by Algorithm 3.

figure c

To summarize, the optimistic and pessimistic inliers are obtained usign Algorithm 2 and Algorithm 3, respectively. Then, the registration parameters yielding the largest number of inliers are obtained using Algorithm 1. In our implementation, the qualified node is branched along its longest edge resulting in two new nodes to be processed. The node corresponding to the maximum number of pessimistic inliers is processed first.

8 Discussion

In general, our method converges while the explored bounds are still quite large. This suggests that incorporating \(\xi \) in the SoS test plays a limited role and most of the time can by ignored. What matters most is that potential inliers, as per the local method’s measure, not to be considered outliers by the SoS test. Without the threshold \(\xi \), the SoS test is of the form (dropping \(\lambda \) for the sake of the argument) \(f(\mathsf {x})-\sum ^7_{k=1}\sigma _k({x}_k-\underline{{x}}_k)(\overline{{x}}_k-{x}_k)>0\) which, when true, means \(f(\mathsf {x})>\sum ^7_{k=1}\sigma _k({x}_k-\underline{{x}}_k)(\overline{{x}}_k-{x}_k)\) for all \(\mathsf {x}\) within bounds.

No assignment would be mistakenly considered as an outlier by the SoS test so long as \(\sum ^7_{k=1}\sigma _k({x}_k-\underline{{x}}_k)(\overline{{x}}_k-{x}_k)>\xi \) for all \(\mathsf {x}\) within the bounds. Note that \(\sum ^7_{k=1}\sigma _k({x}_k-\underline{{x}}_k)(\overline{{x}}_k-{x}_k)\) is concave and its maximum is \(\frac{1}{4}\sum ^7_{k=1}\sigma _k(\overline{{x}}_k-\underline{{x}}_k)^2\). Therefore, the SoS test not considering \(\xi \) would be safe as long as it produces no \(\sigma _k\) such that \(\frac{1}{4}\sum ^7_{k=1}\sigma _k(\overline{{x}}_k-\underline{{x}}_k)^2<\xi \).

Note that \(\sigma _k\) compensate the nonconvexities in \(f(\mathsf {x})\) and are generally relatively large. Hence for the SoS test to be unsafe when \(\xi \) is not included, the size of the bound-gap \((\overline{{x}}_k-\underline{{x}}_k)^2\) for \(k=1\ldots 7\) must be very small. The algorithm converges before reaching such small bound.

9 Experiments

We conducted experiments with seven different benchmark real datasets shown in Fig. 4 whose details can be found in Jensen et al. (2014) and Strecha et al. (2008). Both point-to-plane (more precisely, point-to-patch) and plane-to-plane registration methods have been tested. Our algorithm was implemented in MATLAB2014b and the SoS problems were solved using the LMI Control Toolbox.Footnote 1 All experiments were carried out on a 8GB RAM Pentium i7/3.40GHz. The SfM reconstructions and segmented scene planes were obtained using the openMVG Toolbox (Moulon et al. 2013) and Hough Transform based plane detector (Borrmann et al. 2011). All the planes were extracted directly from the point clouds using the source code provided in Hough Transform Plane Detector (2015) with an accumulator designed in Borrmann et al. (2011) to speed up the Hough Transform. For the experiments with unknown scale of the reconstruction, the initial bound on reconstruction scale was set to 0.2–5.0 (five times in scale in both directions). Four different error measurement metrics were defined to evaluate the registration quality: the RMS 3D error on normalized point sets, errors in rotation \(\mathsf {R}\), translation \(\mathsf {t}\), and scale s. For N experiments, these are defined as follows :

$$\begin{aligned} {\scriptstyle \varDelta R}= & {} \sqrt{\frac{1}{3N}{\sum \limits _{i=1}^N{\Vert \mathsf {r}_{i}^*-\mathsf {r}\Vert ^{2}}}}, {\scriptstyle \varDelta T}=\sqrt{\frac{1}{N(\Vert \mathsf {t}\Vert ^{2})}{\sum \limits _{i=1}^N{\Vert \mathsf {t}_{i}^*-\mathsf {t}\Vert ^{2}}}},\\ {\scriptstyle \varDelta S}= & {} \sqrt{\frac{1}{N(s^{2})}{\sum \limits _{i=1}^N{(s_{i}^*-s)^{2}}}}, \end{aligned}$$

where \(\mathsf {r}\) is a vector obtained by stacking three rotation angles in degrees. The estimated variables are represented with * and variables without it are their ground truth.

Note that all SoS inlier/outlier tests have been carried out with \(\xi =0\).

Fig. 4
figure 4

Sample image and the segmented scene (shown in different colors for each plane) of seven different datasets. a Scene23, b Scene24, c Scene27, d Scene29, e Scene73, f Fountain and g Herz-Jesu (Color figure online)

9.1 Point-to-Plane Registration

To perform point-to-plane registration, we selected only the prominent planes from the Euclidean scene. This allowed us to select more points while keeping the exhaustive combinations between points and planes within practical limits. It is important because, when a set of sparse points are selected randomly, we wish to ensure that at least 7 points lying on 4 different planes are included. In this case, we conducted two sets of experiments : one with and one without putative point-to-plane correspondences.

9.1.1 Inlier Set Maximization with Correspondences

Fig. 5
figure 5

Point-to-plane registration for Scene73 with correspondences and no camera bounds. Left: number of processed points and time taken for various levels of outliers. Right: number of detected inliers using RANSAC and our method (Color figure online)

Fig. 6
figure 6

Point-to-plane registration for Scene73 with increasing number of outliers, with correspondences and no camera bounds. Left: error in rotation. Middle: error in translation. Right: error in scale (Color figure online)

The method was first tested for known putative correspondences where the synthetic inliers/outliers were generated under real data setups. No bounds on cameras were used in these experiments. To test robustness to outliers, we varied the number of outliers up to 90% for Scene73 and compared the results against the linear 12-point RANSAC. In theory, only 7 point-to-plane assignments (at least of 4 different planes) are sufficient to estimate the similarity transformation. Therefore, 7-point RANSAC is preferred over 12-point RANSAC, if an efficient minimal solver is avilable for 7 point-to-plane assignments. Up to our knowledge, there exists no such minimal solver. Ideally, generic polynomial solvers may also serve the purpose. But unfortunately, our attempts with such solvers for minimal correspondences case (7 equations, 7 variables, and 2 degree, form 7 point-to-plane correspondences), computation time was significant enough to render this solution impractical within the framework of RANSAC.

Figure 5 shows that our method consistently detects 21 inliers for every experiment while RANSAC fails to detect the least number of required inliers starting from 45% of outliers. Note that the numbers of inliers reported here are true positive inliers. Furthermore, our method does not detect any false positive inliers. Figure 6 shows the errors in rotation, translation, and scale for the same scene with various levels of outliers. The convergence graph of our method with 50% outliers is shown in Fig. 7 for Scene23, Scene73, and Fountain whose quantitative results are shown in Table 1. Figure 8 shows the evolution of the volume and the number of nodes remaining to be processed for the first 50 iterations on Scene23 with 50% outliers. The qualitative results for scene73 are shown in Fig. 9.

Fig. 7
figure 7

Point-to-plane registration convergence graph for 50% outliers, with correspondences and no camera bounds. Left: Scene23. Middle: Scene73. Right: Fountain (Color figure online)

Table 1 Point-to-plane registration with correspondences and no camera bounds: quantitative results obtained with 50% outliers
Fig. 8
figure 8

Point-to-plane registration for Scene23 with correspondences, no camera bounds, and 50% outliers. Left: number of remaining nodes versus search iterations. Right: number of remaining volume versus search iterations (Color figure online)

Fig. 9
figure 9

Point-to-plane registration for Scene73 with correspondences. Left: recovered point-to-plane correspondences with all the processed points in red. Right: registration between reconstruction (in green) and the segmented scene (Color figure online)

Fig. 10
figure 10

Left to right: errors in rotation, errors in translation, and number of detected inliers obtained by our method compared against Minimal-Point Ransac with known scale (Ramalingam and Taguchi 2013). Experiments conducted with Scene73 (Color figure online)

Fig. 11
figure 11

Point-to-plane registration for Scene23. Left: number of iterations versus number of cameras. Right: Convergence graph for the case w/o correspondences with three 1m-box bounded cameras (Color figure online)

Furthermore, we conducted experimetns with the minimal solver of Ramalingam and Taguchi (2013) within RANSAC while allowing it to run for as long as the proposed method did. Note that the minimal solver Ramalingam and Taguchi (2013) is designed for rigid body transformations. Therefore, the reconstruction scale needs to be known beforehand. We favored the solver of Ramalingam and Taguchi (2013) by providing the exact ground truth scale. The experiments with Ramalingam and Taguchi (2013) for Scene73 are reported in Fig. 10. Observe that RANSAC with Ramalingam and Taguchi (2013) achieves a success rate of 100% for outliers upto 50%. The success rate then drops to 69% for 60% of outliers. All the experiments conducted with more than 70% of outliers failed.

Fig. 12
figure 12

Point-to-plane registration for Scene23 without correspondences and three 1m-box bounded cameras (100 independent trials). Our method versus randomly started scaled ICP (RS-ICP). Left: no. of inliers detected. Right: 3D RMS error (Color figure online)

Fig. 13
figure 13

Point-to-plane registration results for Scene24. b, c Inputs, d point-to-plane registration and e, f texture projection. a Sample image, b Segmented scene, c SfM reconstruction, d Registered pointsets, e Textured scene, top view and f Textured scene, side view (Color figure online)

Table 2 Point-to-plane registration without correspondences: quantitative results for seven different datasets
Table 3 Registration results using RISAG, Go-ICP and our method

9.1.2 Inlier Set Maximization w/o Correspondences

In the absence of initial correspondences, each point was assigned to all available planes. We conducted several experiments with bounded cameras by changing the number of bounded cameras and camera bounding box size. The number of iterations taken for these configurations are shown in Fig. 11(left) for Scene23. The average time per iteration is 1.15sec. In the same figure, we also provide the number of iterations taken for the “with correspondences” case with 50% outliers and 2m camera bounding boxes. The case of a single bounded camera is equivalent to unbounded cameras but bounded translation: plane visibility criterion cannot be used in this case. We recall that initial bounds on all the registration parameters are indispensable to ensure an Archimedean quadratic module of the constraints set and hence for employing Putinar’s Psatz. Figure 11(right) shows the convergence graph, using Scene23, obtained with 3 1m-box bounded cameras. It also shows how the residual error on the registration parameters varies with the increase in the number of pessimistic inliers. The reported box size is for a normalized scene size of about 10 meters. In Fig. 12, we report the results obtained on Scene23 (with 3 1m-box bounded cameras) using our method and a randomly started scaled ICP (RS-ICP) for 100 independent trials. In each trial, the scaled ICP was started at randomly picked registration parameter values satisfying bound and visibility constraints. The results show that, unlike RS-ICP which provides very large 3D errors, our method consistently detects the same number of inliers with the same 3D error. Observe that the success rate of RS-ICP is only about 48%.

The results of our method for all scenes (with their corresponding configurations) are summarized in Table 2. In the reported parameters, Points, Planes, Iter, and Inlier represent their numbers. “Recon.” is the quality of the SfM reconstruction measured as the median reprojection error in pixels while “Rep.” is the fraction of the scene points represented by fitted planes. Observe that the registration quality depends upon the reconstruction quality, representation factor, and the number and size of the camera boxes. For a qualitative evaluation, the results obtained for Scene24 are shown in Fig. 13 along with the registered point sets and textured scene [after further refinement using (Viola and Wells 1997)].

Table 4 Plane-to-plane registration results, unknown reconstruction scale, and no camera bounds
Fig. 14
figure 14

Plane-to-plane registration convergence graphs for unknown reconstruction scale and no camera bounds. a Scene23, b Scene24, c Scene27, d Scene29, e Scene73, f Fountain and g Herz-Jesu (Color figure online)

Fig. 15
figure 15

Plane-to-plane registration with search iterations. Left: remaining nodes. Right: remaining volume (Color figure online)

We also provide the results for two datasets obtained using RISAG (Corsini et al. 2013), Go-ICP (Yang et al. 2013), and our method in Table 3. Our method was used without correspondences in the setting given in Table 2. Note that Go-ICP requires an Euclidean reconstruction, which was obtained by upgrading the metric reconstruction using ground truth measurements. Comparison of these methods may be unfair because each requires different initial conditions. Note that the poor performance of RISAG could be due to its RANSAC-driven nature (we used \(10^4\) RANSAC iterations). Nevertheless, experiments with both RISAG and Go-ICP were conducted in their favorable conditions.

9.2 Plane-to-Plane Registration

In this section, we report the results that we have obtained using our plane-to-plane registration method. The method was tested without any putative correspondences provided, i.e. plane-to-plane registration was carried out on the correspondences obtained by assigning each reconstructed plane to all available scene planes. Experiments with unknown scale as well as without scale ambiguity were carried out. Furthermore, no camera was bounded for all the experiments. The number of planes used for all the datasets are shown in Table 4, where “num.\(\mathsf {\Pi }_r\)” and “num.\(\mathsf {\Pi }_e\)”, respectively, represent the number of reconstructed planes and scene planes.

Fig. 16
figure 16

Left to right: sample image, segmented scene, segmented reconstruction, and plane-to-plane-based registered point sets. (From top to bottom: Scene23, Scene24, Scene27, Scene29, Scene73, Fountain, and Herz-Jesu) (Color figure online)

Fig. 17
figure 17

Texture mapping using plane-to-plane registration results. Left: Scene73. Right: Herz-Jesu (Color figure online)

Table 5 Plane-to-plane registration results, known reconstruction scale, and no camera bounds

9.2.1 Inlier Set Maximization with Unknown Scale

The quantitative results obtained for all seven datasets using plane-to-plane registration with scale ambiguity are shown in Table 4. Note that, despite a small number of reconstructed and scene planes, our algorithm was still able to successfully register the data. As expected, the number of search iterations in this case are relatively high. This is due to the absence of constraints from camera bounds. Despite a high number of iterations, the processing time remains reasonable. This is mainly due to the fact that the scenes are represented by a small number of planes leading to a limited number of exhaustive correspondences and hence a reduced number of SoS tests in each iteration.

Convergence graphs for all seven datasets are shown in Fig. 14. For these experiments, the remaining nodes and volume over the search iterations are shown in Fig. 15. It can be observed that, when the nodes are not efficiently pruned (as in for Scene29 after 300 iterations), the remaining search volume varies only slightly. However, as soon as the maximum number of pessimistic inliers meets that of the optimistic one, the optimal solution is obtained. Thereafter, no remaining nodes/volume needs to be processed.

The qualitative results of plane-to-plane registration with unknown scale and no camera bounds are shown in Fig. 16. Among them, the texture mapping of two datasets, namely Scene73 and Herz-Jesu, obtained after refinement using mutual information maximization, are shown in Fig. 17.

9.2.2 Inlier Set Maximization with Known Scale

In many applications, two point sets are related by a rigid-body transformation. These cases involve no discrepancy of scale between the two point sets to be registered i.e. \(\Vert \mathsf {q}\Vert =1\). In fact, this registration problem is equivalent to that of 3D-3D registration of structured same-scale point clouds that can both be segmented into planes. The scale information provides two advantages: (1) it reduces the initial bounds on the entries of \(\mathsf {q}\); (2) it introduces an additional polynomial \(\Vert \mathsf {q}\Vert ^2-1=0\) that must always be satisfied for the solution bounds. In our experiments, the reconstruction scales were set to their ground truth values.

The results obtained for known scale plane-to-plane registration are reported in Table 5. As expected, the number of search iterations are significantly smaller than in the case of unknown scale (refer to Table 4). Therefore, the registration process is also faster. Although the same datasets were used for both the cases of unknown and known scale, the sets of inlier planes were found to be different in some cases. This is because of the scale factor relaxation, which was restricted in the latter.

10 Conclusion

We proposed a method for registering a 3D scan and a set of images of a structured scene. The proposed approach is based on the theory of polynomial SoS optimization. Our method uses SoS registration conditions for point-to-plane as well as plane-to-plane registration. The method presented in this paper can incorporate various constraints emanating from scene and camera knowledge (patch segmentation, camera locations, plane visibility, scaling, etc.). Using Branch-and-Bound and SoS theory, we devised a robust and optimal method for inlier set maximization of either point-to-plane or plane-to-plane correspondences. Although the problem is nonlinear and combinatorial, our method has provided outstanding results in terms of robustness and optimality. In particular, the employed optimization framework has the potential to be efficiently applied to other nonlinear Computer Vision problems.