Keywords

1 Introduction

Finding a planar region in 3D space is important for safely driving and operating a mobile robot or a moving system that can freely drive. A robot operating in an unknown 3D space must identify its surroundings before the system can conduct its assigned tasks. Biped or wheel-based robots should be able to recognize obstacles within their area of motion and avoid or drive over detected obstacles where possible. In the presence of a staircase, a robot should be able to recognize the step planes and traverse them. In order to operate without falling in an unknown 3D space, it must be able to use the available information about 3D depth to recognize planar regions without obstacles in the direction of movement.

Planar space detection has many fields of application. A robot moving around indoors should be able to recognize the planar floor region on which it will drive. Also, the inclination of this plane with respect to the direction of gravity should not be too great, to prevent the robot falling while it is moving. Planar space recognition can be used in recognizing obstacles, such as walls, a table on which objects are located, or the plane of a table. It can also be used to find the centerline of road on which a robot can drive. Recognizing the step planes is an important application for the movement of humanoid biped robots [1].

Recently, a method was proposed which detects regions in real-time for driving mobile robots. Based on the Hough Transform (HT), the method was applied to biped robots such as ASIMO of Honda, which walked on steps and avoided obstacles successfully. Using the HT, the X-Y-Z data in 3D space is transformed into another parameter space, with \(\rho\)- \(\theta\)- \(\varphi\), representing the vertical distance from the origin and the rotation angle on the plane for voting. The corresponding planes are detected by the peaks in the voting space.

Although the method produces segmentation results for range data with some noises and complex planes for an indoor environment, it requires many impractical procedures such as excessive memory for voting, decisions about the optimal voting size, difficulties in fixing the peak locations in the voting space, and many additional processes which need a high computer processing speed [2]. In order to solve these problems, many improved methods have been proposed such as Combinatorial HT (CHT) [3], Randomized HT (RHT) [4, 5], Probabilistic HT (PHT) [6], and Dynamic Generalized HT (DGHT) [7].

RHT is proposed to solve the computing time problem of HT. It doesn’t vote all points in an image into Hough space but votes on the geometrical parameters which are calculated by selecting several points randomly. However, sampling range data that exists on different planes or includes noisy samples on a nonplane surface result in many local maxima in the parameter voting distribution of the Hough space. Also, several distributions are mixed and thus the peaks can be very similar. In a mixed and overlapped distribution of peaks in Hough parameter space, it is difficult to segment a plane reliably.

Kang et al. proposed the plane detection cell (PDC) for reducing the effect of noises and outliers [8]. PDC is a cell of circle type used to test the planarity. The normal vector of a triangle is inscribed in a small circular region, the triangle is rotationally sampled and a series of inscribed triangles having different normal vectors is generated. The direction vectors of these generated triangles are used to test the planarity of the small circular region.

This method is effective in a local region detection but it is not adequate for global plane detection and segmentation and calculating the planarity at all the grid point is too time-consuming.

We test only the normal directions of two planes, which are determined by three vertices of a triangle on range data and arbitrary rotation of it. If the angle of two normal detections is lower than a given threshold it is voted into Hough parameter space. It takes less time than PDC and also detects the global planes, because it can provide the global voting through the local evaluation of data. First, we start with a scan window to vote locally, then, the scan window explores all regions as its size grows. This method improves the detection performance because the planes are locally consistent. We obtained 3D range data from a stereo imaging system and experimented with it in various environments.

The remainder of this chapter is organized as follows: Sect. 11.2 describes the RHT method for plane detection. Section 11.3 presents details of the proposed method. Section 11.3.1 describes a method using random sampling from accuracy evaluation of local planar region, Sect. 11.3.2 shows a search method using a scan window, and Sect. 11.3.3 provides a Look-Up-Table for fast processing. Section 11.3.4 explains the application of Iterative RHT to reduce the effect of noises and multiple planes. The experimental results are shown in Sects. 11.4 and 11.5 gives the conclusions.

2 Randomized Hough Transform

The Hough transform method has the advantage that it is robust to noise when applied to detect a geometric shape in inputted 2D/3D data. However, the computational cost and required memory size are very large. To cope with this problem, a method called randomized Hough transform has been introduced [6, 9].

In the original Hough transform method, each point on the input image is transformed into a Hough parameter space, which is why there are so many points for voting. However, Randomized HT votes for geometrical parameters which are calculated by selecting several points randomly. Figure 11.1 shows the difference between RHT and HT in case of detection of 3D spatial planes.

Fig. 11.1
figure 1

Comparison of HT and RHT. a For Hough transform, a point in xyz space is transformed to a curved surface in Hough parameter space; b For randomized HT, plane parameters in xyz space are transformed to a point in Hough parameter space. Three points in 3D space can represent a plane via Eq. (11.1). The 3D point (x, y, z) is on the plane, the parameters A, B, C represent the normal vector of the plane and D is the scale of the normal vector

$$Ax + By + Cz + D = 0$$
(11.1)

RHT selects three points randomly for detecting a plane, and the plane parameters (A, B, C, D), which represent a plane calculated from the three points, is voted into the Hough parameter space. If the three points denoted by (x1, y1, z1) (x2, y2, z2), and (x3, y3, z3) are set on a plane, parameters A, B, C, D can be represented by Eq. (11.2).

$$\begin{gathered} A = \left| {\begin{array}{*{20}l} 1 & {y1} & {z1} \\ 1 & {y2} & {z2} \\ 1 & {y3} & {z3} \\ \end{array} } \right|\quad B = \left| {\begin{array}{*{20}l} {x1} & 1 & {z1} \\ {x2} & 1 & {z2} \\ {x3} & 1 & {z3} \\ \end{array} } \right| \hfill \\ C = \left| {\begin{array}{*{20}l} {x1} & {y1} & 1 \\ {x2} & {y2} & 1 \\ {x3} & {y3} & 1 \\ \end{array} } \right|\quad D = - \left| {\begin{array}{*{20}l} {x1} & {y1} & {z1} \\ {x2} & {y2} & {z2} \\ {x3} & {y3} & {z3} \\ \end{array} } \right| \hfill \\ \end{gathered}$$
(11.2)

The parameters A, B, C, D are difficult to define in a limited range because they can have fractional values. We need to change this to a spherical coordinate system. Figure 11.2 shows the spherical coordinate system, where \(\rho\) is the distance between a plane and the origin, \(\varphi\) is the angle with respect to the x-axis, and \(\theta\) is the angle with respect to the z-axis. We can define the range of parameters in the changed coordinate system:

$$\begin{aligned} & \rho \ge 0 \\ & 0 \le \theta \le \pi \\ & 0 \le \varphi \le 2\pi \\ \end{aligned}$$
(11.3)
Fig. 11.2
figure 2

Spherical coordinate system

where \(\rho , \varphi , \theta\)can be obtained by Eq. (11.4), where x, y, z is a point in the Cartesian coordinate system and they are denoted by x = A/D, y = B/D, z = C/D.

$$\rho = \sqrt {x^{2} + y^{2} + z^{2}} ,\quad \theta = { \cos }^{ - 1} \left( {\frac{z}{\rho }} \right),\quad \varphi = { \tan }^{ - 1} \left( \frac{y}{x} \right)$$
(11.4)

Conversely, the Cartesian coordinates may be retrieved from the spherical coordinates by Eq. (11.5).

$$x = \rho \sin \theta \cos \varphi , \quad y = \rho \sin \theta \sin \varphi ,\quad z = \rho \cos \theta$$
(11.5)

3 Multiple Plane Detection

3.1 Planarity Evaluation from Sampling of Triangular Point Sets

The 3D range data is the coordinate information, measured from the camera origin, of various planar and nonplane surfaces in 3D, including many noises and outliers. The random sampling method of the conventional RHT votes for Hough parameter space based on selection of excessive samples in order to reduce the influence of outliers for accurate plane detection. Compared to the portion of the surface with outliers, the parameters of the plane are much more consistent. Many samplings improve the accuracy but the processing time is also much increased. And when the size of plane is small, it is difficult to find the plane peak in Hough parameter space.

The normal vector of a triangle in PDC is inscribed in a small circular region so that the normal vector passes through the circumcenter area of the triangle (see Fig. 11.3).

Fig. 11.3
figure 3

The plane normal vector by three vertices of a triangle

The triangle is rotationally sampled with respect to the center position of the circular region, and a series of inscribed triangles having different normal vectors are generated. The direction vectors of these generated triangles are normalized and the median direction of the normal vector is then used to test the planarity. If the planarity of the circular region is lower than a given threshold, the point won’t be voted into the Hough parameter space. Therefore, this method can filter noises and outliers instead of voting them into the Hough parameter space and it is effective for detection of local regions. But it cannot detect global plane because the method calculates the planarity at a single point. So we use the PDC as a filtering method to reduce the excessive sampling of RHT.

We compare the normal direction of a few planes which are determined by three vertices of a triangle and the others rotated with respect to this triangle (see Fig. 11.4). If the angle between two normal directions is lower than a given threshold it is voted into Hough parameter space. The local evaluation for the planarity acts to filter noises and outliers that are not voted into Hough parameter space. Therefore, there are only a few data samplings, resulting in less computing time than PDC.

Fig. 11.4
figure 4

Evaluating the planarity by two triangles

The vector v normal to the triangular plane obtained by the three spatial points is calculated by Eq. (11.6).

$$v = \left( {P_{2} - P_{1} } \right) \times (P_{3} - P_{1} )$$
(11.6)

The vector v is normalized by the expression in Eq. (11.7), and the angle between the two vectors can be obtained by using an inner product Eq. (11.8), if we use two triangular point sets.

$$n = \frac{v}{\|v\|}$$
(11.7)
$$\alpha = n \cdot n^{\prime }$$
(11.8)

Figure 11.5 shows the difference between the conventional RHT method and the proposed RHT method using local evaluation filter of triangular sampling method in the voting space of the Hough parameter. To show the filtering effect for nonplanar and noisy surfaces, construct a nonplane environment that is filled with crumpled chapter as shown in Fig. 11.5a. Figure 11.5b–e shows the Hough parameter space, where (b, c) is the result of the conventional RHT method and (d, e) is the result of the proposed RHT method, respectively. Although the range data have no plane, RHT has voted on many parameters and count of maximum peak is 158. In contrast, the proposed RHT’s voting count is much smaller and the maximum peak is 61. Obviously, the proposed method is less affected by the noise.

Fig. 11.5
figure 5

The Hough parameter space of the conventional RHT and the proposed RHT method with local plane filter for a nonplane surface

3.2 Speed Up via the Scan Window

The conventional RHT involves voting on many data points which are selected randomly from the whole image region. It requires a lot of sampling to prove that the peak from the interesting shape is more salient than those of the noises in Hough parameter space.

In case of plane detection from 3D range data, we use the scan window for accurate and rapid detection of the surface by using the local consistency for data of planar surface. Plane data points have the locality property, because the data points in local region have higher probability to be a plane than far distance data. The scan window is a similar method to that of searching face in AdaBoost [10].

Window that shifts fixed number of pixels scans the whole area. The size of the scan window is initially small and is increased to the image size (see Fig. 11.6). A certain number of data points in each window region are sampled and voted on.

Fig. 11.6
figure 6

The window scans all regions while changing the window size

The scan window of various sizes can reflect the locality and the overlapping region of each window provides a more significant peak of the plane parameter than the conventional searching method.

3.3 Look Up Table

We compare the normal vector from the initial three vertices with the vector from the rotated three vertices, for planarity evaluation. Equation (11.9) is needed for rotation of the three vertices.

Because many samplings are required for accurate plane detection, the rotation of three vertices must be repeatedly calculated. Also, we may fail to get the 3D coordinates from the measurement error of the sensor system after a rotation for the original three points. In this case, we try to get the 3D coordinates from another angle. Table 11.1 shows sequential attempts of a few rotation angles.

Table 11.1 Some angles for data sampling

Therefore, we can previously save the calculated result of trigonometric function into the LUT (Look up table) to avoid repeated calculation of the functions. Although additional memory is necessary for building the LUT, it is indispensable for real-time algorithmic processing.

3.4 Plane Segmentation

IRHT (Iterative Randomized Hough Transform) [11] is introduced for plane segmentation. Figure 11.7 presents the plane segmentation steps. IRHT is a segmentation method using sequential elimination of maximum peak data from whole image to reduce the influence of noises and other candidate shapes.

Fig. 11.7
figure 7

The method of plane segment using IRHT

In step ①, the 3D range data is obtained from the stereo camera. The ρ, θ, and φ values of the sampled data are voted into Hough parameter space, as in step ②. In step ③, we find the maximum peak in Hough parameter space. The peak represents the normal direction of the biggest plane in the 3D range data. In step ④, if the voting count of the peak is lower than TH1, we determine that there is no plane and the plane segmentation algorithm is ended. In ⑤, we find the plane data in the 3D range data by using Eq. (11.9), which is derived from the plane equation. If the value is lower than a given threshold value TH2, the data is saved into the ith plane data and it is removed for detection of the (i + 1)th plane. Again, in ②, the algorithm involves finding the (i + 1)th plane from the remaining 3D range data.

$$\left| {Ax + By + Cz + D| < {\text{TH}}2} \right.$$
(11.9)

A threshold value TH1 is affected by the number of planes detected. The lower the value of TH1, the more iterations of plane detection there are. Therefore, TH1 is determined by the number of sampling data points and the voting value that differentiates the plane and nonplane cases via the experiments. And, the threshold value TH2 is the residual of all the points and the plane we have detected. It is also experimentally determined by the rates of correct data and noises via the experiments.

4 Experiments

4.1 Range Sensor

We have implemented the proposed method on a Core™2 CPU 2.0 GHz computer and used a Bumblebee stereo imaging system [12] for obtaining the 3D range data.

The quality of the range data from the bumblebee stereo camera is lower than that from the active sensor such as laser scanner and it cannot obtain the range data in a nontexture region, because it fails to find the correlations between the left and right images. However, the range data from the stereo camera can be obtained from a low-cost computer practically in real-time.

Figure 11.8a, b shows the left and right images of the bumblebee stereo camera and Fig. 11.8c shows the depth information from the horizontal disparity between the two images. The region close to the camera is dark, and the bright region is distant from the camera.

Fig. 11.8
figure 8

The left and right image of the stereo camera and 3D range image a Left image; b Right image; c 3D range image

In the right-top region of Fig. 11.8c, the gray color region doesn’t have any disparity information because there is no texture.

4.2 Comparison Results

In the conventional RHT method the sampling times are 2,000,000 per iteration and planes are detected with the TH1 value of 30 for the IRHT algorithm in Fig. 11.7.

In the proposed method, the initial size of the window is 16 × 12, the shift size is 4 pixels, and the window size is increased by 10 pixels. The sampling times are 1,637,305 and the TH1 value is 100. The parameter is voted on only if the angle of two normal vectors is less than 10°. Also, the weight of the candidate normal vector is set to a higher value when it has better planarity. The following rule applies: voting count = current voting count + (angle threshold – angle of two normal vectors). For example, if the angle is 0°, which is the best planarity, it is voted as having the highest value (10) and if the angle is 9°, which is the worst planarity, it is voted as having the lowest value (1). The parameter value is voted as the mean of the two normal vectors. The TH2 value for removing the plane data is 0.4 in both the conventional RHT method and the proposed method. We can get these threshold values according to our experiments.

Figures 11.911.10 and Tables 11.211.3 show the comparison results of the conventional RHT method and the proposed method. The top images are obtained from the bumblebee stereo camera. These are the left, right, and 3D range images, respectively. The middle ones are the results of plane detection, which are represented by the segmentation image and Hough parameter space. The colors of the middle left images denote the result of the detected planes and the right graphs show the result of the voting count of the Hough parameter space during each iteration of IRHT. The graph of the Hough parameter space only represents the \(\theta\) and \(\varphi\) values and not the \(\rho\) value. The bottom table represents the \(\rho\), \(\theta\), \(\varphi\) values for plane detection.

Fig. 11.9
figure 9

Comparison experiments for an outdoor planar surface. a Original left, right image, and range image obtained from the stereo sensor; b Plane segment result of conventional RHT and peaks in Hough parameter space, in which each peak presents a plane; c Plane segment result of proposed RHT and peaks in parameter space

Fig. 11.10
figure 10

Comparison experiments for an obstacle on the indoor surface

Table 11.2 Planes parameter of experiments for an outdoor planar surface
Table 11.3 Planes parameter of experiments for an obstacle on the indoor surface

Figure 11.9 and Table 11.2 show an experiment for one plane detection on an uneven flat surface. The proposed method detected the two planes instead of one plane due to the effect of noises. While the conventional RHT method detects nine planes on an uneven flat and sidewalk flat surface. In the proposed method, the sampled data on an uneven flat surface is filtered by planarity evaluation while the conventional RHT method votes the parameters of sampled data on an uneven surface into the Hough parameter space, even though it is not a plane.

Figure 11.10 and Table 11.3 shows the experiment where there is an obstacle in the heading direction of the robot. The proposed method detected two planes of a flat surface and obstacle, while the conventional RHT method detected nine planes because of the influence of noises. Figure 11.10 and Table 11.3 don’t include the parameter values of the 7–9th planes, because these are not the meaningful planes. The range data of stereo camera is easily affected by many external factors such as lighting, shape, texture, and so on. In the proposed method, sampled data that includes errors is filtered by planarity evaluation, but in the conventional RHT method, the parameters of the error data are not filtered and they make the wrong peak in Hough parameter space.

Table 11.4 shows the detection speed for two comparison experiments. The experiments prove that the proposed method is less affected by the noises and uneven flat surfaces, and it can detect planes faster than the conventional RHT method. Because unnecessary data is not voted on by planarity evaluation and it only detects correct planes, the iteration count for plane detection is lower than that of the conventional RHT method.

Table 11.4 Computing time of the comparison experiments

5 Conclusions

We proposed a sampling method for accurate plane detection which is less affected by noises. The proposed method contains two techniques, planarity evaluation which is used to filter the outliers and the IRHT application for range data segmentation.

The planarity is tested by determining the angle of normal plane directions. These are determined by three randomly sampled vertices and the rotation points of the vertices. We can vote on accurate plane parameters, because most noises and nonplanar data are filtered from planarity evaluation. The scan window contributes to accurate plane detection and fast processing, because the planes have the local property, which means that the data of the same plane are located in a local region. The proposed method is verified by examples of a plane detection using real range data.