
1 Introduction

Vision sensors are the predominant sensors used in a robotic assembly system to align, manipulate and assemble the mating component held in the robot end-effector with the other mating part of the assembly. The feedback of the vision sensor is used to sense the dynamic mating component in the working environment and to improve the alignment of the mating components. An assembly has two components: (a) a protruded part on one component, often termed the peg, and (b) a hollow part on the mating component, termed the hole. Often the assembly task is the horizontal or vertical insertion of the peg component into the hole component to establish a permanent contact. The wheel and wheel-hub assembly of an automobile has multiple pegs and multiple holes. In these assemblies, the wheel-hub is fixed to the automobile and the hole component (wheel) is inserted over the multiple pegs. The success rate of the assembly without jamming or wedging is affected by the presence of lateral or angular misalignment of the components. The precision and speed of alignment are influenced by the estimation of the position and orientation (pose) of the multiple-peg component [4].

The process of controlling robotic manipulators using the feedback of vision sensors is termed visual servoing. The main shortcomings of a visual servoing system are poor accuracy and stability caused by even small pose estimation errors [2], and the need to retain the considered object in the field of view at all times [3]. The pose of the object cannot be accurately estimated when the object is not trackable, the camera is only roughly calibrated, or the 3D model of the target is erroneous [9].

An object tracking or recognition system depends on the efficiency of the feature extraction stage for fast and accurate position estimation in pixel coordinates. Scale Invariant Feature Transform (SIFT) [6, 13] and Speeded-Up Robust Features (SURF) [1] are commonly adopted to extract target features from video frames. Based on the geometrical features of the object, some researchers [10, 18] have used morphological image processing operations such as boundary extraction, and gradient-based operations such as edge detection, circle detection and ellipse detection, for feature tracking. The object of interest in this work is a wheel-hub, which has four pegs (cylindrical rods) and screws on its surface. These features make gradient-based feature tracking well suited to this work.

Camera calibration is the process of determining the relationship between real world (metric) information and 2D image information [19]. The properties of the camera used for acquisition (intrinsic parameters) and the parameter set describing the geometric relation between the Cartesian world coordinate system and the image coordinate system (extrinsic parameters) are determined using these calibration techniques. Lens distortion in the camera displaces the coordinates non-linearly. The distortion in the lenses is due to errors in lens assembly and the geometric features of the lens [15]. Various distortions such as radial, tangential and thin prism distortion are possible in images [16]. Radial distortion is common in machine vision cameras and causes pincushion and barrel effects in the images [15]. The Direct Linear Transformation (DLT) of camera parameters lacks the capability to incorporate non-linear distortion in the camera model. Therefore, direct non-linear transformation or two-stage camera calibration [5, 14, 17] is advantageous for including distortion while estimating the camera parameters. The performance of the direct non-linear transformation depends crucially on a precise initial guess. This makes the two-stage camera calibration technique suitable for the proposed work. Hence, a genetic algorithm based two-stage camera calibration is adopted in this work.

Considering the above issues, this work aims at developing an accurate pose estimation algorithm for determining the pose of the multiple pegs with minimum re-projection error. This paper is organized as follows: Sect. 2 presents the vision assisted multiple peg-in-hole robot assembly environment, Sect. 3 explains the proposed pose estimation algorithm, Sect. 4 describes the experimentation of the proposed calibration and pose estimation algorithm, Sect. 5 presents the results and discussion of the performance of the proposed work, and Sect. 6 provides the conclusion of this work.

2 Vision Assisted Multiple Peg-in-Hole Robot Assembly Environment

Vision sensors are used to perceive changes in the assembly environment and to control the manipulator in performing the assembly operation in accordance with those changes. Figure 1 shows the vision assisted robotic assembly environment considered in this work to perform a multiple peg-in-hole assembly. In the considered environment, the multiple-peg component is mounted on a shaft which rotates about its center axis. The camera (vision sensor) is mounted at a fixed position in the robotic assembly environment such that the multiple-peg component lies in the field of view of the camera at all instances. The camera and the peg component are mounted such that the optical axis of the camera and the z-axis of the component lie in the same plane. Hence, the distance between the peg component and the camera remains the same, because only rotation about the z-axis is allowed. Since the distance remains the same, a monocular camera is sufficient and is adopted in this work to estimate the coordinates of the peg centers. The robotic manipulator holds the mating multiple-hole component in its end-effector. In order to assemble the mating components, the coordinate frames of the hole and peg components have to be aligned and then the insertion task has to be executed. The pose of the multiple-peg component \( {}^{C}T_{P} \) is required to align the manipulator with the axes of the mating components. \( {}^{C}T_{E} \) represents the transformation matrix of the end-effector with respect to the camera. The current pose of the end-effector with respect to the robot base is given as:

$$ {}^{R}T_{E} = {}^{R}T_{C} {}^{C}T_{P} {}^{P}T_{E} $$
(1)

where,

  • \( {}^{R}T_{C} \) is the pose of the camera with respect to robot base (known).

  • \( {}^{C}T_{P} \) denotes the pose of the multiple-peg part in the table with respect to the camera in the current position.

  • \( {}^{P}T_{E} \) represents the pose of the end-effector with respect to the hole part in the current frame and is given by (2)

$$ {}^{P}T_{E} = \left[ {{}^{C}T_{P} } \right]^{ - 1} {}^{C}T_{E} $$
(2)
Fig. 1. Vision assisted multiple peg-in-hole robotic assembly environment

In visual servoing, \( {}^{C}T_{P} \) is computed through the pose estimation algorithm and compared with the target \( {}^{C}T_{P}^{*} \) at every instant to minimize the error and to generate \( {}^{R}T_{E} \) for robot manipulation. A precise camera calibration procedure is required to compute \( {}^{C}T_{P} \) accurately in Cartesian space, which enables accurate alignment of the mating components.
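For illustration, the transform chain of Eqs. (1) and (2) can be written as a short numerical sketch. The following Python/numpy fragment is a minimal sketch only; the function and matrix names are assumptions of this illustration and not part of the original implementation.

```python
import numpy as np

def end_effector_pose(T_RC, T_CP, T_CE):
    """Compose the end-effector pose in the robot base frame from Eqs. (1)-(2).

    T_RC : 4x4 pose of the camera w.r.t. the robot base (known)
    T_CP : 4x4 pose of the multiple-peg part w.r.t. the camera (estimated)
    T_CE : 4x4 pose of the end-effector w.r.t. the camera
    """
    T_PE = np.linalg.inv(T_CP) @ T_CE   # Eq. (2)
    return T_RC @ T_CP @ T_PE           # Eq. (1)
```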

3 Proposed Pose Estimation Algorithm

The proposed pose estimation algorithm is divided into two modules: a position estimation module and an orientation estimation module. In the position estimation module, the multiple pegs are tracked and the pixel coordinates of the wheel-hub and peg centers are estimated using the feedback from the monocular camera. The pixel coordinates are fed to the genetic algorithm based camera calibration to estimate the camera parameters and the positions of the centers in metric coordinates. Using the metric positions of the centers, the orientation estimation module estimates the pose of the wheel-hub with respect to the camera. Figure 2 shows the overview of the proposed pose estimation algorithm.

Fig. 2. Overview of the proposed pose estimation algorithm

3.1 Position Estimation Module

The wheel-hub used in this work has four pegs. Determining the center of each peg gives the position of the wheel-hub in pixels at the tracking stage, followed by the metric position at the calibration stage.

Multiple-Peg Tracking.

In this module, each peg on the wheel-hub is tracked using the monocular camera frames. Since the top surfaces of the pegs and the base area of the wheel-hub lie in different planes, there is a significant difference in their intensity values. This intensity difference aids in segmenting the pegs from the background by estimating a global threshold value using Otsu's thresholding method. The segmentation process replaces all pixels in the input image with intensity greater than the threshold value by 1 (white) and all other pixels by 0 (black). The pegs have higher intensity than the base of the wheel-hub and hence appear 'white' in Fig. 3. The noise present in the binary image after segmentation is then removed by area-based filtering. An 8-connectivity neighborhood is used to create the connected components in the binary image, and the area of each connected component is calculated. The regions of interest are the pegs, the wheel-hub center hole and the screws. The areas of the noise components present in the image are comparatively smaller than the area of a screw. Hence, the connected components with an area smaller than the screw area are removed by this filtering. Noise components with larger areas are removed in the circle detection stage, as they are non-circular objects.
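A minimal OpenCV sketch of the segmentation and area-based filtering step is given below. The function name and the `min_area` threshold are assumptions of this illustration; the implementation used in the paper may differ in detail.

```python
import cv2
import numpy as np

def segment_pegs(gray, min_area):
    """Otsu thresholding followed by 8-connected, area-based noise filtering.

    gray     : grayscale frame from the monocular camera
    min_area : approximate screw area in pixels; smaller components are noise
    """
    # Global Otsu threshold; pegs appear brighter than the wheel-hub base
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Connected components with 8-connectivity and their area statistics
    n_labels, labels, stats, _ = cv2.connectedComponentsWithStats(binary, connectivity=8)

    filtered = np.zeros_like(binary)
    for label in range(1, n_labels):                 # label 0 is the background
        if stats[label, cv2.CC_STAT_AREA] >= min_area:
            filtered[labels == label] = 255
    return filtered
```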

Fig. 3. Tracked peg and wheel-hub center

The circles in the segmented image are identified by a two-stage Circular Hough Transform (CHT) method [11]. In this CHT, a circle is drawn for each edge point with a radius 'r'. The method uses a 3D accumulator array in which the first two dimensions represent the coordinates of the circle center and the third represents the radius. The values in the accumulator array are incremented every time a circle with the desired radius is drawn over an edge point. The accumulator thus keeps count of the circles passing through the coordinates of each edge point, and a vote identifies the highest count. The coordinates of the circle centers in the image are the coordinates with the highest counts. Details of the MATLAB implementation of this method are given in reference [11]. There are four pegs on the wheel-hub and two screws, located between pegs 1 & 4 and between pegs 2 & 3. Taking the first screw as reference, the first peg is labelled 1 and the remaining pegs are labelled sequentially in the anticlockwise direction. Figure 3 shows the pegs detected in a frame using the multiple-peg tracking module.
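The circle detection step can be sketched with the OpenCV analogue of the cited MATLAB function. The Hough parameters below are placeholders and would need tuning to the actual peg radii in pixels; they are not values from the paper.

```python
import cv2
import numpy as np

def detect_peg_centers(image, r_min, r_max):
    """Circular Hough Transform; returns (x, y, r) for each detected circle.

    r_min, r_max : expected peg radius range in pixels (assumed, scene dependent)
    """
    circles = cv2.HoughCircles(image, cv2.HOUGH_GRADIENT, dp=1, minDist=50,
                               param1=100, param2=20,
                               minRadius=r_min, maxRadius=r_max)
    if circles is None:
        return []
    # Round to integer pixel coordinates for labelling relative to the screws
    return [tuple(c) for c in np.round(circles[0]).astype(int)]
```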

Genetic Algorithm Based Camera Calibration.

The estimated coordinates in pixel values have to be converted to metric coordinates to calculate the pose of the multiple-peg object in Cartesian space. The precise calculation of the 3D coordinates and pose of the object depends on the accuracy of the calibration procedure. In this regard, a genetic algorithm based two-stage camera calibration procedure is adopted in this work. In the first stage, a linear solution is computed by considering a distortion-free camera model; in the second stage, a genetic algorithm is used to compute the optimal camera parameters including distortion, using the closed-form solution as an initial guess [12].

A pin-hole model with lens distortion, as shown in Fig. 4, is adopted in this work. Let \( \left( X_{w}, Y_{w}, Z_{w} \right) \) denote the 3D world coordinate system, \( \left( X_{c}, Y_{c}, Z_{c} \right) \) the camera coordinate system, and \( \left( X_{i}, Y_{i} \right) \) the image coordinate system. Further, \( O_{i} \) and \( O_{c} \) represent the center of the image coordinate system and the optical center of the camera coordinate system, respectively; \( O_{i} \) and \( O_{c} \) are collinear and aligned with the \( Z_{c} \) axis of the camera coordinate system. Let \( \left( x_{i}, y_{i} \right) \) be the image coordinates measured through a point extraction algorithm. Let \( P \) be a test point with world coordinates \( \left( x_{w}, y_{w}, z_{w} \right) \), camera coordinates \( \left( x_{c}, y_{c}, z_{c} \right) \) and estimated undistorted image coordinates \( \left( x_{u}, y_{u} \right) \). The focal length, i.e. the distance between the image plane and the optical center, is denoted as \( f \).

Fig. 4. Complete camera model with radial distortion

When radial distortion is considered, the distortion factors \( D_{x} \) and \( D_{y} \) are subtracted from the image coordinates \( \left( x_{u}, y_{u} \right) \) of the distortion-free model to obtain the distorted image coordinates \( \left( x_{d}, y_{d} \right) \). The distortion is modelled as a second order polynomial [15].

$$ \begin{aligned} x_{d} + D_{x} & = x_{u} \\ y_{d} + D_{y} & = y_{u} \\ \end{aligned} $$
(3)
$$ \begin{aligned} D_{x} & = x_{d} \left( k_{1} r^{2} \right) \\ D_{y} & = y_{d} \left( k_{1} r^{2} \right) \\ r & = \sqrt{x_{d}^{2} + y_{d}^{2}} \\ \end{aligned} $$

\( k_{1} \) is the distortion coefficient.
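A minimal sketch of the radial model in Eq. (3): given a distorted image point and \( k_1 \), the undistorted coordinates follow directly. The function name is illustrative only.

```python
def undistort_point(x_d, y_d, k1):
    """Single-coefficient radial model of Eq. (3):
    x_u = x_d (1 + k1 r^2), y_u = y_d (1 + k1 r^2)."""
    r2 = x_d ** 2 + y_d ** 2
    return x_d * (1.0 + k1 * r2), y_d * (1.0 + k1 * r2)
```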

The distorted image coordinates \( \left( x_{d}, y_{d} \right) \) are transformed to computer image coordinates \( \left( x_{f}, y_{f} \right) \) by multiplying with the uncertainty scaling factor and adding the center of the frame. In the first stage, the linear approximation method described by Tsai [7, 15] is used to determine the camera parameters with a distortion-free camera model; these estimates form the bounds for the second, nonlinear genetic algorithm stage.

A genetic algorithm is adopted to identify the optimal intrinsic and extrinsic parameters of the camera, incorporating radial distortion in the camera model. The linear approximation stage estimates the rotation matrix and the translations along the \( x \) and \( y \) directions. Therefore, the proposed GA has the focal length and the translation along the z-axis as the genes in the chromosome, which enables faster convergence to optimal results. \( P_{1} \) represents the initial population of the 1st generation, \( f \) and \( T_{z} \) represent the genes (parameters) present in a chromosome, \( j \) indexes the chromosomes in the population of a generation, \( k \) indexes the generations, and \( n \) and \( m \) are the number of chromosomes in the population and the number of generations, respectively. The initial population for the genetic algorithm is

$$ \left( {f^{j} } \right)_{k} = f_{linear} ;\forall \,\,1 \le j \le n\,{\text{and}}\,k = 1 $$
$$ \left( {T_{z}^{j} } \right)_{k} = T_{zlinear} ;\forall \,\,1 \le j \le n\,{\text{and}}\,k = 1 $$
$$ \left( {q^{j} } \right)_{k} = \left\{ {\left( {f^{j} } \right)_{k} ,\left( {T_{z}^{j} } \right)_{k} } \right\}\forall \,\,1 \le j \le n\,{\text{and}}\,k = 1 $$
$$ P_{1} = \left( {q_{i}^{j} } \right)_{1} $$
(4)

where \( f_{linear} \) and \( T_{zlinear} \) are the focal length and the translation along the z-axis estimated in the first stage. The bounds on the parameters \( f \) and \( T_{z} \) are taken as \( \pm 25\% \) of the linearly estimated values (stage 1 results).

$$ \left( q_{1\,bound}^{j} \right)_{k} = f_{linear} \pm \left( 0.25 \, f_{linear} \right) $$
$$ \left( q_{2\,bound}^{j} \right)_{k} = T_{zlinear} \pm \left( 0.25 \, T_{zlinear} \right) $$
$$ \begin{aligned} \left( q_{\max}^{j} \right)_{k} & = \left\{ f_{\max}, T_{z\max} \right\} \quad \forall \; 1 \le j \le n \;\text{and}\; 1 \le k \le m \\ \left( q_{\min}^{j} \right)_{k} & = \left\{ f_{\min}, T_{z\min} \right\} \quad \forall \; 1 \le j \le n \;\text{and}\; 1 \le k \le m \\ \end{aligned} $$
(5)
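The seeding and bounding of the GA, Eqs. (4) and (5), can be sketched as follows. The array layout and function name are assumptions of this illustration, not the authors' code.

```python
import numpy as np

def initial_population(f_linear, tz_linear, n):
    """Seed every chromosome [f, Tz] with the stage-1 estimates (Eq. 4)
    and bound both genes to +/-25% of those values (Eq. 5)."""
    population = np.tile([f_linear, tz_linear], (n, 1))
    lower = np.array([0.75 * f_linear, 0.75 * tz_linear])
    upper = np.array([1.25 * f_linear, 1.25 * tz_linear])
    return population, lower, upper
```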

As lens distortion is considered in this camera model, the distortion coefficient \( k_{1} \) for each chromosome is calculated as:

$$ \left( {k_{1}^{j} } \right)_{k} = \frac{{C_{3} \left( {f^{j} } \right)_{k} }}{{C_{1} \left( {C_{2} + \left( {T_{z}^{j} } \right)_{k} } \right)}} - \frac{{C_{4} }}{{C_{1} }}\,\forall \,\,1 \le j \le n\,{\text{and}}\,1 \le k \le m $$
(6)
$$ C_{1} = d_{y} y_{i} r^{2} $$
$$ C_{2} = r_{7} x_{w} + r_{8} y_{w} $$
$$ C_{3} = r_{4} x_{w} + r_{5} y_{w} + T_{y} $$
$$ C_{4} = d_{y}^{\prime} y_{i} \quad \text{and} \quad r = \sqrt{x_{d}^{2} + y_{d}^{2}} $$

The variation between the actual image coordinates and the coordinates calculated using the estimated camera parameters is termed the re-projection error [8], which is a common performance measure for camera calibration techniques.

$$ E_{rms} = \frac{1}{n}\sum\limits_{l = 1}^{n} {\sqrt {(x_{f} - x_{i} )^{2} + (y_{f} - y_{i} )^{2} } } $$
(7)
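As a fitness function, Eq. (7) can be transcribed directly; the following Python sketch is illustrative, with assumed function and argument names.

```python
import numpy as np

def reprojection_error(reprojected, measured):
    """Eq. (7): mean Euclidean distance (pixels) between re-projected points
    (x_f, y_f) and the measured image points (x_i, y_i)."""
    reprojected = np.asarray(reprojected, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.mean(np.linalg.norm(reprojected - measured, axis=1)))
```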

Hence the objective of this proposed genetic algorithm based calibration technique is to determine the optimal values of \( f \) and \( T_{z} \) for minimum re-projection error (\( E_{rms} \)).

$$ \left( {q^{j} } \right)_{k}^{*} = \hbox{min} \left( {E_{rms} } \right) $$
(8)
$$ {\text{Subject}}\,{\text{to}}\,\,f_{\hbox{min} } \le f^{*} \le f_{\hbox{max} } \,{\text{and}}\,T_{z\hbox{min} } \le T_{z}^{*} \le T_{z\hbox{max} } $$

GA Operators.

The convergence of a genetic algorithm depends on the selection of the mutation and crossover operators employed in the algorithm. The genes are encoded as real numbers so that they can be used directly in the computation. The best chromosomes in the population are selected for reproduction using the proportionate reproduction method. The crossover operation creates new children in the population; the offspring of the crossover operation generally have better fitness than the parent chromosomes. Even if poor offspring are created in the crossover phase, they are eliminated by the subsequent reproduction phase in the next generation, so offspring with better fitness than their parents are retained in subsequent generations. A blend crossover operation is adopted in this work, since it prevents the algorithm from getting trapped in local optimal solutions.

Two parents \( \left( {q_{i}^{j} } \right)_{k} \) and \( \left( {q_{i}^{j + 1} } \right)_{k} \) are selected from the population, and the offspring \( \left( {o1_{i}^{j} } \right)_{k} \) and \( \left( {o2_{i}^{j} } \right)_{k} \) are generated by

$$ \left( {o1_{i}^{j} } \right)_{k} = \left( {\hbox{min} \left( {\left( {q_{i}^{j} } \right)_{k} ,\left( {q_{i}^{j + 1} } \right)_{k} } \right)} \right) - \alpha \left( {\left( {q_{i}^{j} } \right)_{k} - \left( {q_{i}^{j + 1} } \right)_{k} } \right) $$
$$ \left( {o2_{i}^{j} } \right)_{k} = \left( {\hbox{max} \left( {\left( {q_{i}^{j} } \right)_{k} ,\left( {q_{i}^{j + 1} } \right)_{k} } \right)} \right) + \alpha \left( {\left( {q_{i}^{j} } \right)_{k} - \left( {q_{i}^{j + 1} } \right)_{k} } \right) $$
$$ \left( {q^{\prime j} } \right)_{k + 1} = \left( {\left( {o2_{i}^{j} } \right)_{k} - \left( {o1_{i}^{j} } \right)_{k} } \right) * rand + \left( {o1_{i}^{j} } \right)_{k} $$
(9)

where \( i = 1,2 \), \( j = 1,2,\ldots,n \), \( k = 1,2,\ldots,m \), and the value of \( \alpha \) is taken as 0.75.
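As a sketch, the blend crossover of Eq. (9) applied gene-wise to two parent chromosomes could look like the fragment below. The parent difference is taken in absolute value so that the blend interval always brackets both parents, which is the standard BLX-α form; the function signature is assumed.

```python
import numpy as np

def blend_crossover(parent1, parent2, alpha=0.75, rng=None):
    """BLX-alpha crossover in the spirit of Eq. (9), applied gene-wise.

    The blend interval [o1, o2] extends alpha times the parent spread
    beyond each parent, and the child is drawn uniformly inside it.
    """
    if rng is None:
        rng = np.random.default_rng()
    p1, p2 = np.asarray(parent1, float), np.asarray(parent2, float)
    span = np.abs(p1 - p2)
    o1 = np.minimum(p1, p2) - alpha * span
    o2 = np.maximum(p1, p2) + alpha * span
    return (o2 - o1) * rng.random(p1.shape) + o1
```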

The mutation operation increases the diversity in the population, which improves convergence to the global optimal solution. A power mutation operator is adopted in this work, using the following expressions.

$$ \left( q_{i}^{j} \right)_{k + 1} = \begin{cases} \left( q_{i}^{j} \right)_{k} - s\left( \left( q_{i}^{j} \right)_{k} - \left( q_{i\,\min}^{j} \right)_{k} \right) & \text{if } u < r \\ \left( q_{i}^{j} \right)_{k} + s\left( \left( q_{i\,\max}^{j} \right)_{k} - \left( q_{i}^{j} \right)_{k} \right) & \text{if } u \ge r \end{cases} $$
(10)

where \( u = \dfrac{\left( q_{i}^{j} \right)_{k} - \left( q_{i\,\min}^{j} \right)_{k}}{\left( q_{i\,\max}^{j} \right)_{k} - \left( q_{i\,\min}^{j} \right)_{k}} \) and \( r \) is a uniformly distributed random value between 0 and 1.

$$ s = p \, s_{r}^{p - 1}, \quad 0 \le s_{r} \le 1 $$

where \( p \) is the index of distribution and \( s_{r} \) is a uniformly distributed random number between 0 and 1.
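A minimal sketch of the power mutation of Eq. (10) for one gene is given below. The value of p, the function signature, and the clamping of the result to the stage-1 bounds are assumptions of this illustration.

```python
import numpy as np

def power_mutation(q, q_min, q_max, p=4.0, rng=None):
    """Power mutation of Eq. (10) for a single gene.

    s is computed from the text's expression s = p * s_r**(p - 1) with
    s_r ~ U(0, 1); u measures the gene's relative position in its bounds,
    and the mutated value is clamped to [q_min, q_max] to stay feasible.
    """
    if rng is None:
        rng = np.random.default_rng()
    s_r = rng.random()
    s = p * s_r ** (p - 1)
    u = (q - q_min) / (q_max - q_min)
    if u < rng.random():
        mutated = q - s * (q - q_min)     # perturb toward the lower bound
    else:
        mutated = q + s * (q_max - q)     # perturb toward the upper bound
    return float(np.clip(mutated, q_min, q_max))
```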

3.2 Orientation Estimation

After obtaining the image coordinates of the peg centers from the tracking stage and the camera parameters from the calibration stage, the metric coordinates of each peg are determined. The wheel-hub is allowed to rotate about the z-axis and/or translate along the x-axis only, by constraining the translation along the z-axis and the rotation about the x and y axes. The linear displacement along the x-axis (position of the peg) is estimated using the metric information of the peg centers. The orientation of the wheel-hub about the z-axis is estimated by calculating the angle between line1, connecting the peg1 center and the wheel-hub center, and the horizontal line (line2) passing through the wheel-hub center, as shown in Fig. 5 and given by (11).

$$ \tan \theta = \frac{m_{1} - m_{2}}{1 + m_{1} m_{2}} $$
(11)

\( m_{1} \) and \( m_{2} \) are the slopes of line1 and line2, respectively.
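Since line2 is horizontal (\( m_2 = 0 \)), Eq. (11) reduces to \( \theta = \arctan(m_1) \). A short illustrative sketch follows; the function and variable names are assumptions, and the expression assumes line1 is not vertical.

```python
import numpy as np

def hub_orientation_deg(peg1_center, hub_center):
    """Orientation of the wheel-hub about the z-axis from Eq. (11).

    line1 joins the peg1 center to the hub center; line2 is the horizontal
    line through the hub center, so m2 = 0 and tan(theta) reduces to m1.
    """
    dx = peg1_center[0] - hub_center[0]
    dy = peg1_center[1] - hub_center[1]
    m1, m2 = dy / dx, 0.0                           # slopes of line1 and line2
    # (np.arctan2(dy, dx) avoids the singularity when line1 is vertical)
    theta = np.arctan((m1 - m2) / (1.0 + m1 * m2))  # Eq. (11)
    return float(np.degrees(theta))
```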

Fig. 5. Orientation estimation from the peg1 center and wheel-hub center

4 Experimental Arrangement

This section describes the experiments performed with the wheel-hub to evaluate the performance of the proposed pose estimation algorithm in terms of its accuracy. In the experimental environment, the target wheel-hub is placed at a distance of 0.27 m from a fixed camera. A DNV 3001 CCD camera with an f50 lens is used to capture the wheel-hub images at 10 fps. As discussed, the accuracy of the pose estimation algorithm is influenced by the accuracy of the camera calibration procedure and the feature tracking procedure. To verify the performance of the proposed camera calibration procedure and to estimate the camera parameters, the algorithm is tested with checkerboard images placed at the same location as the wheel-hub. A chessboard pattern of 6 × 8 corner points with an interval of 29 mm is captured from five different orientations at an image resolution of 1600 × 1200 with the DNV camera. The camera parameters estimated by the proposed work are listed in Table 1. The image coordinates of the peg centers and the metric coordinates after camera calibration are listed in Table 2. The accuracy of the estimated pose can ultimately be verified only when the pose is fed to the manipulator to align the end-effector with respect to the object. Therefore, to assess the performance of the proposed pose estimation algorithm, the accuracy of the tracking module is tested by comparing the metric values of the peg radii at each instant with their corresponding true values.

Table 1. Camera parameters estimated by proposed work
Table 2. Image and metric coordinates of pegs at frame 1

5 Results and Discussion

The center of each peg in pixels is estimated using the multiple-peg tracking module, and the camera parameters are obtained from the genetic algorithm based camera calibration technique. The metric coordinates of the centers are then obtained from the estimated camera parameters. The results of the proposed calibration algorithm are compared with the MATLAB camera calibration toolbox, which is based on Zhang's (2001) camera calibration algorithm. The root mean square re-projection error obtained with the toolbox is 0.37 pixels, whereas that of the proposed algorithm is 0.0310 pixels. The results show that the proposed algorithm estimates the camera parameters more accurately than the MATLAB calibration toolbox, which in turn ensures minimum pose estimation error. It is evident from Fig. 6 that 57% of the re-projected points have a distance error of less than 0.03 pixels, which demonstrates the measurement accuracy of the proposed calibration technique. It is observed from Fig. 7 that the proposed method is able to estimate and re-project points within an error range of 0.5 pixels. Table 3 presents the mean and standard deviation of the error between the true value and the estimated metric value of the radius of peg1 for specific intervals of frames. The tracking and calibration stages of the proposed pose estimation technique are capable of estimating the pose within an error of less than 1 mm.

Fig. 6. Frequency plot of 2D measurement error for checkerboard calibration.

Fig. 7. 2D measurement error distribution for checkerboard calibration.

Table 3. Mean and standard deviation of the errors between true and estimated radius of peg1

6 Conclusion

Vision sensors offer flexibility in perceiving the multiple-hole component in a dynamic environment. The poses of the mating components in an assembly are determined using the feedback of the vision sensor. The success of the assembly action is influenced by the accuracy of the estimated pose of the hole component and the alignment between each peg and the corresponding hole in the wheel-hub. In this regard, a pose estimation algorithm has been proposed to determine the position and orientation of the multiple pegs of the wheel-hub. Each peg of the wheel-hub is tracked for its center using a circle detection algorithm. Using the pixel coordinates of the tracked centers, the metric positions of the pegs with respect to the camera are determined by a two-stage genetic algorithm based camera calibration technique. The change in orientation is obtained by calculating the angle between the line connecting the centers of peg1 and the wheel-hub, and the horizontal axis. The proposed pose estimation algorithm has been experimentally validated in terms of accuracy. Experimental results show that the calibration technique used in the proposed pose estimation algorithm is able to re-project the pegs with an accuracy of 0.0310 pixels. Moreover, the proposed algorithm is suitable for a vision assisted robot assembly system with a positioning accuracy of 1 mm.