1 Introduction

Vision-based alignment is a critical technique in vision inspection and assembly, with wide applications in electronic equipment, semiconductors, and robotics. For example, the precise alignment of optical fibers is performed by closed-loop control based on telecentric stereo microvision [1]. In [2], the assembly task of a slice micropart in 3-D space is completed by serial assembly with three microscopic cameras and a laser triangulation measurement instrument (LTMI). Although vision-based alignment offers high speed, non-contact operation, high accuracy, and flexibility, existing alignment methods still leave room for improvement in precision and robustness.

In general, a vision-based alignment system consists of a vision system and a motion system. Firstly, the vision system measures the position and pose of the workpiece by recognizing features in the captured images. Secondly, the alignment commands are obtained by the controller based on the alignment deviation. Finally, the motion system completes the alignment process based on the alignment commands from the controller. Therefore, the overall performance of the vision-based alignment system depends on three crucial parts: the measurement of the workpiece's position and pose, the calibration accuracy of the alignment system parameters, and the control strategy of the alignment process.

To measure the workpiece's pose and position accurately, a series of steps needs to be carried out. Firstly, the vision system must be calibrated to map the coordinates of a point from the image coordinate system to the world coordinate system. Secondly, image processing methods are used to recognize features in the captured images. Finally, the position and pose of the workpiece are obtained from the calibration result of the vision system and the workpiece's features. Depending on the number of cameras used in the vision system, the measurement methods can be divided into two main categories: monocular vision methods [3,4,5,6] and multi-camera vision methods [7,8,9,10]. Furthermore, owing to advantages such as high resolution and low distortion, telecentric lenses are widely used for non-contact measurement of the workpiece's position and pose.

To achieve high precision in the alignment task, control strategies have been widely adopted in the alignment process. Feedback control-based alignment methods use the alignment deviation as the feedback quantity and reach the desired accuracy after several control periods under a closed-loop strategy. As a hot research topic in feedback control-based alignment, visual servoing is utilized to complete the alignment task in a coarse-to-fine manner [11,12,13]. Y. Ma et al. [11] proposed a coordinated pose alignment strategy with two microscopic cameras to realize pose alignment. In [12], a vision-based system is proposed to automatically complete the precise alignment of watch hands. S. Kwon et al. [13] proposed an alignment system with visual servoing to accomplish the coarse-to-fine alignment task, in which the vision system recognizes the alignment marks and an observer-based controller is designed for the display visual alignment task. Owing to its robustness to environmental variation and its achievable high precision, visual servo-based alignment has been adopted in microassembly and micro-manipulation, where the precision requirement is high and speed is a secondary requirement.

However, visual servo-based alignment methods are completed only after several control cycles, which means that a long time is required before the desired alignment accuracy is achieved. For vision-based alignment and inspection tasks on manufacturing lines, the alignment task is only one part of the production [14]. To avoid affecting subsequent production, the alignment task must be accomplished within a few tens of milliseconds in a single-shot alignment operation. In this case, a visual servo-based alignment method requiring several control periods is no longer suitable. Therefore, in this paper, a high-precision and fast alignment method is proposed to accomplish the alignment task in a one-shot alignment operation, with a binocular vision system as the vision subsystem of the alignment system. We discuss how to improve the performance of the key parts of the alignment method.

Firstly, to establish the relationship between the image coordinate system and the local world coordinate system, an improved nonlinear damped least-squares calibration method for the telecentric lens camera is proposed to speed up the convergence of the camera calibration process. Secondly, to complete the alignment task in one shot, a world coordinate system with the rotation center of the rotation platform as its origin needs to be obtained in advance to unify the local world coordinate systems of the binocular vision system. Thus, an angle constraint-based rotation center calibration method is proposed, in which the motor is actively rotated three times. Thirdly, a two-stage feature point detection method based on shape matching is proposed to obtain the feature points of the workpiece robustly. Finally, a series of experiments is conducted on an alignment system to verify the effectiveness of the proposed alignment methods.

The rest of this paper is organized as follows. In Sect. 2, the binocular vision-based alignment system structure is introduced, and the coordinate systems are established. In Sect. 3, the proposed alignment methods are presented, including an improved nonlinear damped least-squares calibration method for the telecentric lens camera, an angle constraint-based rotation center calibration method, and the calculation method for the alignment commands. In Sect. 4, the experiment results and error analysis are shown. Finally, the conclusion and the suggestions for further work are given in Sect. 5.

2 Alignment System and Coordinate Systems

2.1 Binocular Vision-based Alignment System

As shown in Fig. 1, a binocular vision-based alignment system is designed to complete the alignment task, which consists of a motion system, a binocular vision system and a control system.

  (1) The motion system consists of a motion platform with a two-dimensional translation platform and a rotation platform. The alignment task is completed by the movement of the motion platform, on which the workpiece is fixed.

  (2) The binocular vision system, designed to measure the position and pose of the workpiece by capturing images of it, consists of two telecentric lens cameras mounted with their optical axes orthogonal to the motion platform.

  (3) The control system is designed to calculate the alignment commands from the images received from the vision system and to drive the motion platform, through a programmable logic controller (PLC), to complete the alignment task.

Fig. 1
figure 1

The schematic of alignment system

2.2 Establishment for the Coordinate Systems

As shown in Fig. 2, some relevant points need to be labeled before establishing the coordinate systems. The rotation center of the rotation platform is denoted as pWO. The left and the right corner of the workpiece are labeled as pWL and pWR, respectively. The coordinate systems mentioned in this paper are established when the motion platform is in the reset state. oWxWyWzW is a unified world coordinate system with coordinate axes parallel to the directions of the motion platform’s movement and pWO as its origin. oWLxWLyWLzWL and oWRxWRyWRzWR are local-world coordinate systems with their axes parallel to the unified coordinate system oWxWyWzW’s axes. pWL and pWR are set as the coordinate origin of the oWLxWLyWLzWL and oWRxWRyWRzWR, respectively. oCLxCLyCLzCL and oCRxCRyCRzCR are labeled as camera coordinate systems of the left and the right cameras, respectively, whose optic axes coincide with oCLzCL and oCRzCR correspondingly. oLuLvL and oRuRvR are image coordinate systems of the left and the right vision systems, respectively. oLuL, oLvL, oRuR and oRvR are parallel to oCLxCL, oCLyCL, oCRxCR and oCRyCR, respectively.

Fig. 2
figure 2

Coordinate systems of the binocular vision system

3 Fast Alignment Method Based on Binocular Vision

As shown in Fig. 3, the alignment method based on the binocular vision proposed in this paper is divided into two stages. The vision system is calibrated in Stage I, whose main tasks include the camera calibration based on an improved Levenberg–Marquardt algorithm and the calibration of the rotation center with an angle-based constraint. In Stage II, the alignment control command is calculated through the following three steps.

  1) With the images of the workpiece as input, the fast feature point detection method is utilized to get the image coordinates of the feature points on the workpiece, such as the corner points.

  2) Based on the calibration model established in Stage I, the image coordinates of the feature points are transformed to the unified world coordinates.

  3) Combined with the target position and pose, the alignment command is calculated to rectify the position and pose of the workpiece to the target in one operation.

Fig. 3
figure 3

Flowchart of the alignment method

We will detail the calibration method of the vision system in Sect. 3.1, the establishment of the alignment model and the calculation of the alignment command in Sect. 3.2, and a two-stage feature point detection method based on shape matching in Sect. 3.3, which is a critical factor for high accuracy vision-based alignment.

3.1 Calibration for the Vision System

The calibration of the vision system consists of two components: the calibration of the cameras and the calibration of the rotation platform's rotation center. The former establishes the relationship between the image coordinate systems oLuLvL, oRuRvR and the local world coordinate systems oWLxWLyWLzWL, oWRxWRyWRzWR, while the latter determines the rotation center needed to achieve the alignment task in one operation. Firstly, the telecentric lens camera model is established, and an improved Levenberg–Marquardt algorithm is used to obtain the calibration parameters of the camera model. Then an angle constraint-based calibration method is proposed to overcome the sensitivity problem of measuring the rotation center in industrial scenarios.

3.1.1 Telecentric Lens Camera Model

Take the left camera as an example. Since the coordinate systems have been established in Sect. 2.2, the relationship between (uL, vL) and (xCL, yCL, zCL) can be given as follows.

$$ \left[ {\begin{array}{*{20}c} {u_{{\text{L}}} } \\ {v_{{\text{L}}} } \\ 1 \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {k_{{\text{L}}} } & 0 & 0 \\ 0 & {k_{{\text{L}}} } & 0 \\ 0 & 0 & 1 \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {x_{{{\text{CL}}}} } \\ {y_{{{\text{CL}}}} } \\ 1 \\ \end{array} } \right]\left( {z_{CL} \in \left[ {z_{0} ,z_{0} + \Delta z} \right]} \right) $$
(1)

where kL is the magnification factor of the telecentric lens, z0 is the depth at which the telecentric image is sharpest, and Δz is the telecentric depth. Setting zWL = 0, the relationship between (xCL, yCL, zCL) and (xWL, yWL, zWL) is a combination of a rotation transformation and a translation transformation, as follows.

$$ \left[ {\begin{array}{*{20}c} {x_{{{\text{CL}}}} } \\ {y_{{{\text{CL}}}} } \\ 1 \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {r_{{11{\text{L}}}} } & {r_{{12{\text{L}}}} } & {r_{{13{\text{L}}}} } & {p_{{{\text{xL}}}} } \\ {r_{{21{\text{L}}}} } & {r_{{22{\text{L}}}} } & {r_{{23{\text{L}}}} } & {p_{{{\text{yL}}}} } \\ 0 & 0 & 0 & 1 \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {x_{{{\text{WL}}}} } \\ {y_{{{\text{WL}}}} } \\ 0 \\ 1 \\ \end{array} } \right] $$
(2)

Since the rotation between the left camera coordinate system and the world coordinate system can be expressed by the angles θzL, θyL, and θxL, the entries of the rotation transformation can be written as follows.

$$ \begin{gathered} r_{{11{\text{L}}}} = \cos \theta_{{{\text{zL}}}} \cos \theta_{{{\text{yL}}}} ,r_{{21{\text{L}}}} = \sin \theta_{{{\text{zL}}}} \cos \theta_{{{\text{yL}}}} , \hfill \\ r_{{12{\text{L}}}} = - \sin \theta_{{{\text{zL}}}} \cos \theta_{{{\text{xL}}}} + \cos \theta_{{{\text{zL}}}} \sin \theta_{{{\text{yL}}}} \sin \theta_{{{\text{xL}}}} , \hfill \\ r_{{22{\text{L}}}} = \cos \theta_{{{\text{zL}}}} \cos \theta_{{{\text{xL}}}} + \sin \theta_{{{\text{zL}}}} \sin \theta_{{{\text{yL}}}} \sin \theta_{{{\text{xL}}}} , \hfill \\ r_{{13{\text{L}}}} = \sin \theta_{{{\text{zL}}}} \sin \theta_{{{\text{xL}}}} + \cos \theta_{{{\text{zL}}}} \sin \theta_{{{\text{yL}}}} \cos \theta_{{{\text{xL}}}} , \hfill \\ r_{{23{\text{L}}}} = - \cos \theta_{{{\text{zL}}}} \sin \theta_{{{\text{xL}}}} + \sin \theta_{{{\text{zL}}}} \sin \theta_{{{\text{yL}}}} \cos \theta_{{{\text{xL}}}} \hfill \\ \end{gathered} $$
(3)

Then the left camera model can be given by

$$ \left[ {u_{{\text{L}}} \,v_{{\text{L}}} \,\,1} \right]^{T} = M_{{\text{L}}} \left[ {x_{{{\text{WL}}}} \,y_{{{\text{WL}}}} \,0\,\,1} \right]^{T} $$
(4)

where ML is the homography matrix of the left camera model.

$$ M_{{\text{L}}} = \left[ {\begin{array}{*{20}c} {k_{{\text{L}}} } & {0} & {0} \\ {0} & {k_{{\text{L}}} } & {0} \\ {0} & {0} & {1} \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {r_{{11{\text{L}}}} } & {r_{{12{\text{L}}}} } & {r_{{13{\text{L}}}} } & {p_{{x{\text{L}}}} } \\ {r_{{21{\text{L}}}} } & {r_{{22{\text{L}}}} } & {r_{{23{\text{L}}}} } & {p_{{y{\text{L}}}} } \\ 0 & 0 & 0 & 1 \\ \end{array} } \right] $$

Similarly, the right camera model can be given by

$$ [u_{{\text{R}}} \;v_{{\text{R}}} \;1]^{T} = M_{{\text{R}}} [x_{{{\text{WR}}}} \;y_{{{\text{WR}}}} \;\,0\,\;1]^{T} $$
(5)

where MR is the homography matrix of the right camera model.

$$ M_{{\text{R}}} = \left[ {\begin{array}{*{20}c} {k_{{\text{R}}} } & 0 & 0 \\ 0 & {k_{{\text{R}}} } & 0 \\ 0 & 0 & 1 \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {r_{{11{\text{R}}}} } & {r_{{12{\text{R}}}} } & {r_{{13{\text{R}}}} } & {p_{{x{\text{R}}}} } \\ {r_{{21{\text{R}}}} } & {r_{{22{\text{R}}}} } & {r_{{23{\text{R}}}} } & {p_{{y{\text{R}}}} } \\ 0 & 0 & 0 & 1 \\ \end{array} } \right] $$
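
To make the model concrete, the following is a minimal numpy sketch (not the implementation used in this work) that builds ML from the six parameters of Eqs. (3)–(4) and projects a point of the local world plane onto the image; the parameter values in the example are placeholders, not the calibrated results reported in Sect. 4.

```python
import numpy as np

def telecentric_homography(k, theta_z, theta_y, theta_x, px, py):
    """Build the 3x4 homography M of Eq. (4) from the magnification k,
    the rotation angles (rad) and the translation (px, py)."""
    cz, sz = np.cos(theta_z), np.sin(theta_z)
    cy, sy = np.cos(theta_y), np.sin(theta_y)
    cx, sx = np.cos(theta_x), np.sin(theta_x)
    # rotation entries as in Eq. (3)
    r11, r21 = cz * cy, sz * cy
    r12 = -sz * cx + cz * sy * sx
    r22 = cz * cx + sz * sy * sx
    r13 = sz * sx + cz * sy * cx
    r23 = -cz * sx + sz * sy * cx
    K = np.array([[k, 0, 0], [0, k, 0], [0, 0, 1.0]])
    RT = np.array([[r11, r12, r13, px],
                   [r21, r22, r23, py],
                   [0.0, 0.0, 0.0, 1.0]])
    return K @ RT

def project(M, x_w, y_w):
    """Map a point of the local world plane (z_W = 0) to pixel coordinates."""
    u, v, w = M @ np.array([x_w, y_w, 0.0, 1.0])
    return u / w, v / w

# Example with made-up parameters (not the calibrated values of Sect. 4)
M_L = telecentric_homography(k=92.0, theta_z=np.deg2rad(-1.2),
                             theta_y=np.deg2rad(-10.0), theta_x=np.deg2rad(181.6),
                             px=16.0, py=11.0)
print(project(M_L, 1.5, -0.8))
```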

3.1.2 Camera Calibration Based on an Improved Levenberg–Marquardt Algorithm

For the calibration of the left camera, the parameters in Eq. (4) need to be calculated. Considering these parameters as the unknown variables, Eq. (4) can be rewritten as

$$ \left\{ \begin{aligned} & u_{{{\text{Li}}}} - k_{{\text{L}}} (r_{{11{\text{L}}}} x_{{{\text{WLi}}}} + r_{{12{\text{L}}}} y_{{{\text{WLi}}}} + p_{{{\text{xL}}}} ) = 0 \\ & v_{{{\text{Li}}}} - k_{{\text{L}}} (r_{{21{\text{L}}}} x_{{{\text{WLi}}}} + r_{{22{\text{L}}}} y_{{{\text{WLi}}}} + p_{{{\text{yL}}}} ) = 0,i = 1,2,...,m \\ \end{aligned} \right. $$
(6)

Then the 2m equations with the unknown vector xL = [θzL, θyL, θxL, pxL, pyL, kL] can be formed as

$$ {\varvec{f}}_{{\text{L}}} ({\varvec{x}}_{{\text{L}}} ) = \left[ {\begin{array}{*{20}c} {f_{{{\text{L}}1}} ({\varvec{x}}_{{\text{L}}} )} \\ {f_{{{\text{L}}2}} ({\varvec{x}}_{{\text{L}}} )} \\ \cdots \\ {f_{{{\text{L}}2{\text{m}}}} ({\varvec{x}}_{{\text{L}}} )} \\ \end{array} } \right] = 0\quad (2m \ge 6) $$
(7)

where fL2i-1 (xL) = uLi − kL(r11LxWLi + r12LyWLi + pxL), fL2i (xL) = vLi − kL(r21LxWLi + r22L yWLi + pyL), i = 1,2,…,m. The nonlinear least-squares method is used to solve the aforementioned equations with the cost function PL(xL) as

$$ P_{{\text{L}}} ({\varvec{x}}_{{\text{L}}} ) = \frac{1}{2}\sum\limits_{i = 1}^{2m} {\left\| {f_{{{\text{L}}i}} \left( {{\varvec{x}}_{{\text{L}}} } \right)} \right\|_{2}^{2} } $$
(8)

Thus, the minimizer of PL(xL) is taken as the solution of fL(xL) = 0, denoted xL*.

$$ {\varvec{x}}_{{\text{L}}}^{*} = \mathop {\min }\limits_{{{\varvec{x}}_{{\text{L}}} \in R^{6} }} P_{{\text{L}}} ({\varvec{x}}_{{\text{L}}} ) = \mathop {\min }\limits_{{{\varvec{x}}_{{\text{L}}} \in R^{6} }} \frac{1}{2}{\varvec{f}}_{{\text{L}}} ({\varvec{x}}_{{\text{L}}} )^{T} {\varvec{f}}_{{\text{L}}} ({\varvec{x}}_{{\text{L}}} ) $$
(9)

With a descent step hLk, the first-order Taylor expansion of fL(xL) around xLk is substituted into PL(xL).

$$ P_{{\text{L}}} ({\varvec{x}}_{{\text{L}}}^{{\text{k}}} + {\varvec{h}}_{{\text{L}}}^{{\text{k}}} ) \approx \frac{1}{2}{\varvec{f}}_{{\text{L}}}^{{\text{T}}} ({\varvec{x}}_{{\text{L}}}^{{\text{k}}} ){\varvec{f}}_{{\text{L}}} ({\varvec{x}}_{{\text{L}}}^{{\text{k}}} ) + {\varvec{h}}_{{\text{L}}}^{{{\text{kT}}}} {\varvec{J}}_{{\text{L}}}^{{\text{T}}} {\varvec{f}}_{{\text{L}}} ({\varvec{x}}_{{\text{L}}}^{{\text{k}}} ) + \frac{1}{2}{\varvec{h}}_{{\text{L}}}^{{{\text{kT}}}} {\varvec{J}}_{{\text{L}}}^{{\text{T}}} {\varvec{J}}_{{\text{L}}} {\varvec{h}}_{{\text{L}}}^{{\text{k}}} $$
(10)

where JL is the Jacobian matrix with entries JL,ij = ∂fLi/∂xLj. Based on the Levenberg–Marquardt (L–M) algorithm, the iteration formula is given as follows.

$$ {\varvec{h}}_{{\text{L}}}^{{\text{k}}} = - ({\varvec{J}}_{{\text{L}}}^{{\text{T}}} {\varvec{J}}_{{\text{L}}} + \mu_{{\text{L}}} I)^{ - 1} {\varvec{J}}_{{\text{L}}}^{{\text{T}}} {\varvec{f}}_{{\text{L}}} ({\varvec{x}}_{{\text{L}}}^{{\text{k}}} ) $$
(11)
$$ {\varvec{x}}_{{\text{L}}}^{{{\text{k}} + 1}} = {\varvec{x}}_{{\text{L}}}^{{\text{k}}} + {\varvec{h}}_{{\text{L}}}^{{\text{k}}} $$
(12)

where μL is a damping parameter. Since the L–M algorithm is a trust-region method, the standard L–M updates μL by a gain-ratio factor according to how well the cost function decreases.

Its relative success notwithstanding, the standard L–M algorithm may be sluggish, especially when it moves into a canyon with a large aspect ratio in the parameter space. Therefore, instead of requiring a cost reduction in each descent step, [15] uses the cosine similarity between adjacent steps as an acceptance criterion and finds a faster converging path with an increasing cost tolerance. However, this algorithm does not solve the parameter evaporation problem: when the algorithm gets lost on a plateau of the parameter space, some parameters are pushed to infinity, which also means that the ratio of adjacent step lengths increases significantly.

Therefore, an improved updating strategy for the damping factor μL is proposed to alleviate the parameter evaporation problem and speed up the convergence of the L–M algorithm. Two quantities are calculated from adjacent descent steps: pa = cos(hLk−1, hLk), the cosine of the angle between them, and pb = min{‖hLk−1‖, ‖hLk‖} / max{‖hLk−1‖, ‖hLk‖}, the ratio of their lengths. Together they form the damping acceptance criterion ζ.

$$ \zeta = p_{a} p_{b} $$
(13)

The damping factor μL is updated with different strategies according to the value of ζ. Specifically, when ζ > ζth, the current descent step is accepted by the L–M algorithm; in this case, the damping factor μL is decreased by the factor m, which enlarges the trust region and allows larger, more Gauss–Newton-like steps. Conversely, when ζ ≤ ζth, the current descent step is rejected, so the damping factor μL is increased by the factor m to shrink the trust region. In this paper, m is set to 2 and ζth is set to 0.9. The above update strategy of the damping factor is given as

$$ \mu_{{\text{L}}}^{{\text{k}}} = \left\{ \begin{gathered} \mu_{{\text{L}}}^{{\text{k - 1}}} \;/\;m\quad \zeta > \zeta_{{{\text{th}}}} \hfill \\ \mu_{{\text{L}}}^{{\text{k - 1}}} \;*\;m\quad \zeta \le \zeta_{{{\text{th}}}} \hfill \\ \end{gathered} \right. $$
(14)

Note that the damping factor used to obtain the descent step hLk should be μLk−1, since μLk is updated based on hLk−1 and hLk. Thus, the iteration formula is given as follows.

$$ {\varvec{h}}_{{\text{L}}}^{{\text{k}}} = - ({\varvec{J}}_{{\text{L}}}^{{\text{T}}} {\varvec{J}}_{{\text{L}}} + \mu_{{\text{L}}}^{{{\text{k}} - 1}} I)^{ - 1} {\varvec{J}}_{{\text{L}}}^{{\text{T}}} {\varvec{f}}_{{\text{L}}} ({\varvec{x}}_{{\text{L}}}^{{\text{k}}} ) $$
(15)
$$ {\varvec{x}}_{{\text{L}}}^{{{\text{k}} + 1}} = {\varvec{x}}_{{\text{L}}}^{{\text{k}}} + {\varvec{h}}_{{\text{L}}}^{{\text{k}}} $$
(16)

To sum up, the calibration method of a camera can be described as follows.

  1) Obtain m points with local world coordinates (xWi, yWi, zWi) and image coordinates (ui, vi) by the active translational movement of the motion platform. Then 2m equations are formed according to Eq. (7).

  2) At the beginning of the iteration, the initial value xL0 can be preset as a non-zero vector. Then, the maximum diagonal element of JLTJL is chosen as the initial value of the damping factor, denoted μL0. Thus, hL0 can be calculated according to Eq. (11).

  3) Then, the unknown vector xLk is solved by applying the iteration Eqs. (14)–(16). Specifically, in each iteration (k ≥ 1), the descent step hLk is obtained according to Eq. (15), xLk is updated by the iteration formula (16), and finally the damping factor μLk is updated according to Eq. (14). The iteration stops when the 2-norm of the difference between two adjacent iterates becomes smaller than a threshold value. The homography matrix ML can then be calculated from the calibration parameters of the left camera; a minimal code sketch of this procedure is given below.
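
The sketch below illustrates the improved L–M calibration loop of Eqs. (13)–(16) in numpy. It is a simplified reconstruction, not the authors' code: the Jacobian is approximated by finite differences instead of analytic derivatives, and the initial guess, stopping threshold, and data would have to be supplied by the caller.

```python
import numpy as np

def residuals(x, world_pts, image_pts):
    """Residuals of Eq. (6); x = [theta_z, theta_y, theta_x, p_x, p_y, k]."""
    tz, ty, tx, px, py, k = x
    cz, sz, cy, sy, cx, sx = np.cos(tz), np.sin(tz), np.cos(ty), np.sin(ty), np.cos(tx), np.sin(tx)
    r11, r21 = cz * cy, sz * cy
    r12 = -sz * cx + cz * sy * sx
    r22 = cz * cx + sz * sy * sx
    res = []
    for (xw, yw), (u, v) in zip(world_pts, image_pts):
        res.append(u - k * (r11 * xw + r12 * yw + px))
        res.append(v - k * (r21 * xw + r22 * yw + py))
    return np.array(res)

def num_jacobian(fun, x, eps=1e-7):
    """Forward-difference Jacobian (a simplification; analytic derivatives could be used)."""
    f0 = fun(x)
    J = np.zeros((f0.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += eps
        J[:, j] = (fun(xp) - f0) / eps
    return J

def calibrate_improved_lm(world_pts, image_pts, x0, m=2.0, zeta_th=0.9,
                          tol=1e-8, max_iter=200):
    """Improved L-M of Eqs. (13)-(16): the damping factor is updated from the
    cosine similarity and the length ratio of adjacent descent steps."""
    fun = lambda x: residuals(x, world_pts, image_pts)
    x = np.asarray(x0, dtype=float)
    J = num_jacobian(fun, x)
    mu = np.max(np.diag(J.T @ J))          # initial damping factor (step 2)
    h_prev = None
    for _ in range(max_iter):
        J = num_jacobian(fun, x)
        f = fun(x)
        h = -np.linalg.solve(J.T @ J + mu * np.eye(x.size), J.T @ f)   # Eq. (15)
        x = x + h                                                       # Eq. (16)
        if h_prev is not None:
            pa = h_prev @ h / (np.linalg.norm(h_prev) * np.linalg.norm(h) + 1e-12)
            pb = min(np.linalg.norm(h_prev), np.linalg.norm(h)) / \
                 (max(np.linalg.norm(h_prev), np.linalg.norm(h)) + 1e-12)
            zeta = pa * pb                                              # Eq. (13)
            mu = mu / m if zeta > zeta_th else mu * m                   # Eq. (14)
        if np.linalg.norm(h) < tol:
            break
        h_prev = h
    return x
```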

3.1.3 Calibration for the Rotation Center

To complete the alignment task in one operation for a workpiece that is fixed on the motion platform at a random position, the rotation center of the rotation platform should be calibrated first. As shown in Fig. 4, the motion platform is rotated by an angle α three times. The left and the right corner points are denoted pWLi and pWRi, respectively. The image coordinates of pWLi and pWRi in the image coordinate systems oLuLvL and oRuRvR are given as (uLi, vLi) and (uRi, vRi) (i = 1,2,3), respectively. The local world coordinates of pWLi and pWRi in the local world coordinate systems oWLxWLyWLzWL, oWRxWRyWRzWR can be calculated by the calibration model illustrated in Sect. 3.1 and are expressed as (xWLi, yWLi) and (xWRi, yWRi) (i = 1,2,3). These local world coordinates of pWLi and pWRi are utilized to calibrate the rotation center of the rotation platform.

Fig. 4
figure 4

The angle constraint-based calibration of the rotation center in the left camera's view

Circle fitting is usually performed by the least-squares method, whose accuracy depends on the distribution of the fitted points along the circle. I. Kåsa [16] proved that a regular placement of the data points along the circle improves the performance of circle fitting. Zhu Jia et al. [17] demonstrated that the transmission factor of the measurement error of the circle center and radius increases rapidly as the central angle α decreases. Because the rotation platform of the alignment system cannot be rotated by a large angle, since doing so would move the workpiece out of the field of view, the least-squares method leads to non-negligible errors in the calculation of the rotation center. To solve this problem, we propose an angle constraint-based calibration method for the rotation center.

Firstly, the central angle α is used as prior knowledge to calculate the candidate coordinates of the rotation center, and then the best solution with the minimum standard deviation is picked out as the coordinate of the rotation center. As shown in Fig. 4, taking the calibration of the rotation center in the left camera’s view as an example, when the left corner pWL moves from (xWL1, yWL1) to (xWL2, yWL2) with a step angle α, the rotation center pWOL(xWOL, yWOL) can be calculated according to the following equation.

$$ \left\{ \begin{gathered} x_{{{\text{WL}}2}} = x_{{{\text{WOL}}}} + \cos \alpha (x_{{{\text{WL}}1}} - x_{{{\text{WOL}}}} ) - \sin \alpha (y_{{{\text{WL}}1}} - y_{{{\text{WOL}}}} ) \hfill \\ y_{{{\text{WL}}2}} = y_{{{\text{WOL}}}} + \sin \alpha (x_{{{\text{WL}}1}} - x_{{{\text{WOL}}}} ) + \cos \alpha (y_{{{\text{WL}}1}} - y_{{{\text{WOL}}}} ) \hfill \\ \end{gathered} \right. $$
(17)

Thus, the solution of the rotation center in the world coordinate system is given by

$$ \left\{ \begin{gathered} x_{{{\text{WOL}}}} = \frac{{x_{{{\text{WL}}1}} + x_{{{\text{WL}}2}} }}{2} + \frac{{(y_{{{\text{WL}}1}} - y_{{{\text{WL}}2}} )\sin \alpha }}{2 - 2\cos \alpha } \hfill \\ y_{{{\text{WOL}}}} = \frac{{y_{{{\text{WL}}1}} + y_{{{\text{WL}}2}} }}{2} + \frac{{(x_{{{\text{WL}}2}} - x_{{{\text{WL}}1}} )\sin \alpha }}{2 - 2\cos \alpha } \hfill \\ \end{gathered} \right. $$
(18)

Since the rotation center can be obtained from any two points whose corresponding rotation angle is known, Cn2 candidate solutions can be obtained using the n points pWLj = (xWLj, yWLj), pWRj = (xWRj, yWRj) (j = 1,2,…,n). After calculating the standard deviation of the Euclidean distances between each candidate rotation center and all n points, the candidates with the minimum standard deviation are taken as the best solutions, which are given as pWOL(xWOL, yWOL) and pWOR(xWOR, yWOR).

$$ \left\{ \begin{gathered} p_{{{\text{WOL}}}} = \mathop {{\text{argmin}}}\limits_{{p_{{{\text{WOL}}i}} }} \mathop {{\text{std}}}\limits_{{j = 1,...,n}} \left\| {p_{{{\text{WOL}}i}} - p_{{{\text{WL}}j}} } \right\|_{2} \hfill \\ p_{{{\text{WOR}}}} = \mathop {{\text{argmin}}}\limits_{{p_{{{\text{WOR}}i}} }} \mathop {{\text{std}}}\limits_{{j = 1,...,n}} \left\| {p_{{{\text{WOR}}i}} - p_{{{\text{WR}}j}} } \right\|_{2} \hfill \\ \end{gathered} \right.\quad i = 1,2,...,C_{n}^{2} $$
(19)
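
The following is a minimal sketch of the angle constraint-based search of Eqs. (17)–(19). It assumes the n rotated corner positions are already expressed in local world coordinates, and the synthetic example data are illustrative only.

```python
import numpy as np
from itertools import combinations

def center_from_pair(p1, p2, alpha):
    """Rotation center implied by two points separated by the known angle alpha (Eq. (18))."""
    s, c = np.sin(alpha), 1.0 - np.cos(alpha)
    x = 0.5 * (p1[0] + p2[0]) + (p1[1] - p2[1]) * s / (2.0 * c)
    y = 0.5 * (p1[1] + p2[1]) + (p2[0] - p1[0]) * s / (2.0 * c)
    return np.array([x, y])

def calibrate_rotation_center(points, alpha):
    """Angle constraint-based calibration (Eq. (19)): among all C(n,2) candidate
    centers, keep the one whose distances to all points have minimum std."""
    points = np.asarray(points, dtype=float)
    best_center, best_std = None, np.inf
    for i, j in combinations(range(len(points)), 2):
        # consecutive positions i..j are separated by the angle (j - i) * alpha
        cand = center_from_pair(points[i], points[j], (j - i) * alpha)
        dists = np.linalg.norm(points - cand, axis=1)
        if dists.std() < best_std:
            best_center, best_std = cand, dists.std()
    return best_center

# Synthetic example: three positions of a corner rotated by 3 degrees each time
alpha = np.deg2rad(3.0)
true_center = np.array([34.8, -85.2])
p0 = np.array([10.0, 5.0])
pts = [true_center + np.array([[np.cos(k * alpha), -np.sin(k * alpha)],
                               [np.sin(k * alpha),  np.cos(k * alpha)]]) @ (p0 - true_center)
       for k in range(3)]
print(calibrate_rotation_center(pts, alpha))
```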

As shown in Fig. 2, oWxWyWzW is the unified world coordinate system with its axes parallel to the motion platform's moving directions and the rotation center pWO of the rotation platform as its origin. Thus, based on Eqs. (4), (5) and (19), the coordinates of pWL and pWR in oWxWyWzW can be given as follows.

$$ \left\{ \begin{gathered} x_{{{\text{WL}}}} = \frac{{r_{{{\text{22L}}}} (u_{{\text{L}}} - k_{{\text{L}}} p_{{x{\text{L}}}} ) - r_{{12{\text{L}}}} (v_{{\text{L}}} - k_{{\text{L}}} p_{{y{\text{L}}}} )}}{{k_{{\text{L}}} (r_{{11{\text{L}}}} r_{{22{\text{L}}}} - r_{{12{\text{L}}}} r_{{21{\text{L}}}} )}} - x_{{{\text{WOL}}}} \hfill \\ y_{{{\text{WL}}}} = \frac{{r_{{11{\text{L}}}} (v_{{\text{L}}} - k_{{\text{L}}} p_{{y{\text{L}}}} ) - r_{{21{\text{L}}}} (u_{{\text{L}}} - k_{{\text{L}}} p_{{x{\text{L}}}} )}}{{k_{{\text{L}}} (r_{{11{\text{L}}}} r_{{22{\text{L}}}} - r_{{12{\text{L}}}} r_{{21{\text{L}}}} )}} - y{}_{{{\text{WOL}}}} \hfill \\ x_{{{\text{WR}}}} = \frac{{r_{{22{\text{R}}}} (u_{{\text{R}}} - k_{{\text{R}}} p_{{x{\text{R}}}} ) - r_{{12{\text{R}}}} (v_{{\text{R}}} - k_{{\text{R}}} p_{{y{\text{R}}}} )}}{{k_{{\text{R}}} (r_{{11{\text{R}}}} r_{{22{\text{R}}}} - r_{{12{\text{R}}}} r_{{21{\text{R}}}} )}} - x_{{{\text{WOR}}}} \hfill \\ y_{{{\text{WR}}}} = \frac{{r_{{11{\text{R}}}} (v_{{\text{R}}} - k_{{\text{R}}} p_{{y{\text{R}}}} ) - r_{{21{\text{R}}}} (u_{{\text{R}}} - k_{{\text{R}}} p_{{x{\text{R}}}} )}}{{k_{{\text{R}}} (r_{{11{\text{R}}}} r_{{22{\text{R}}}} - r_{{12{\text{R}}}} r_{{21{\text{R}}}} )}} - y{}_{{{\text{WOR}}}} \hfill \\ \end{gathered} \right. $$
(20)

So far, we have shown how to calculate the unified world coordinates of a point from its image coordinates; a minimal sketch of this mapping is given below. Next, we will introduce how to calculate the alignment command from these coordinates.
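
The sketch implements Eq. (20) for the left view. Its inputs are the rotation entries r11L…r22L, the offsets pxL, pyL, the magnification kL, and the calibrated center pWOL; the numerical values used here are placeholders rather than the calibrated results of Sect. 4.

```python
import numpy as np

def image_to_unified_world(u, v, k, r11, r12, r21, r22, px, py, x_wo, y_wo):
    """Invert the telecentric camera model (Eq. (20)): pixel (u, v) -> unified
    world coordinates, with the calibrated rotation center as the origin."""
    det = k * (r11 * r22 - r12 * r21)
    x_w = (r22 * (u - k * px) - r12 * (v - k * py)) / det - x_wo
    y_w = (r11 * (v - k * py) - r21 * (u - k * px)) / det - y_wo
    return x_w, y_w

# Placeholder parameters for the left camera (not the values reported in Sect. 4)
k_L = 92.0
r11L, r12L, r21L, r22L = 0.98, -0.17, 0.02, -0.99
p_xL, p_yL = 16.0, 11.0
x_WOL, y_WOL = 34.8, -85.2

print(image_to_unified_world(1850.0, 1210.0, k_L,
                             r11L, r12L, r21L, r22L, p_xL, p_yL, x_WOL, y_WOL))
```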

3.2 Calculation for the Alignment Command

The position and pose of a workpiece can be represented by a reference point pW(xW, yW) and a tilt angle θ. In our alignment system, the midpoint of the corner points is taken as the reference point of the workpiece, and the angle of the line formed by the corner points is used as the tilt angle θ. Thus, the position and pose {xW, yW, θ} of the workpiece to be measured is given by

$$ \left\{ \begin{gathered} x_{{\text{W}}} = \frac{1}{2}\left( {x_{{{\text{WL}}}} + x_{{{\text{WR}}}} } \right) \hfill \\ y_{{\text{W}}} = \frac{1}{2}\left( {y_{{{\text{WL}}}} + y_{{{\text{WR}}}} } \right) \hfill \\ \theta = \arctan \left( {\frac{{y_{{{\text{WR}}}} - y_{{{\text{WL}}}} }}{{x_{{{\text{WR}}}} - x_{{{\text{WL}}}} }}} \right) \hfill \\ \end{gathered} \right. $$
(21)

Similarly, the target position and pose can be calculated as {xW*, yW*, θ*}. For the calculation of the alignment command, as shown in Fig. 5, the workpiece is first rotated by an angle Δθ to a transitional position parallel to the target, and then the translation amount is obtained. Thus, the alignment command {ΔxW, ΔyW, Δθ} is given by

$$ \Delta \theta = \theta^{*} - \theta $$
(22)
$$ \left[ {\begin{array}{*{20}c} {\Delta x_{{\text{W}}} } \\ {\Delta y_{{\text{W}}} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {x_{{\text{W}}}^{*} } \\ {y_{{\text{W}}}^{*} } \\ \end{array} } \right] - \left[ {\begin{array}{*{20}c} {\cos \Delta \theta } & { - \sin \Delta \theta } \\ {\sin \Delta \theta } & {\cos \Delta \theta } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {x_{{\text{W}}} } \\ {y_{{\text{W}}} } \\ \end{array} } \right] $$
(23)
Fig. 5
figure 5

The calculation of the alignment command
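
A minimal sketch of Eqs. (21)–(23), computing the workpiece pose from the two corner coordinates and then the one-shot alignment command, is given below; arctan2 is used instead of arctan for numerical robustness, and the corner coordinates and target pose in the example are arbitrary.

```python
import numpy as np

def pose_from_corners(p_left, p_right):
    """Eq. (21): reference point = midpoint of the corners, tilt angle = line angle."""
    x_w = 0.5 * (p_left[0] + p_right[0])
    y_w = 0.5 * (p_left[1] + p_right[1])
    theta = np.arctan2(p_right[1] - p_left[1], p_right[0] - p_left[0])
    return x_w, y_w, theta

def alignment_command(pose, target_pose):
    """Eqs. (22)-(23): rotate by d_theta first, then translate to the target."""
    x_w, y_w, theta = pose
    x_t, y_t, theta_t = target_pose
    d_theta = theta_t - theta
    c, s = np.cos(d_theta), np.sin(d_theta)
    dx = x_t - (c * x_w - s * y_w)
    dy = y_t - (s * x_w + c * y_w)
    return dx, dy, d_theta

pose = pose_from_corners((-35.2, 1.4), (36.1, 0.9))   # unified world coordinates (mm)
print(alignment_command(pose, (0.0, 0.0, 0.0)))
```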

3.3 A Fast Feature Point Detection Method

The detection accuracy of the feature points is a crucial factor affecting the calibration and alignment accuracy. In addition, the acquisition of feature points is often the most time-consuming step in the alignment operation. However, accurate feature point detection is difficult to accomplish because gray transition bands make accurate edge detection difficult, as shown in Fig. 6. These undesirable factors are common in alignment tasks and are mainly caused by the position of the light source and the limited range of the platform movement.

Fig. 6
figure 6

Gray transition bands in the workpiece image (marked with red bounding boxes) are caused by inhomogeneous illumination and adversely affect the accuracy of edge detection. The corner detection method proposed in this paper first locates well-lit regions without gray transition bands (marked with blue bounding boxes) and then calculates the edges of the workpiece in these regions

Thus, a fast feature point detection method is proposed in this paper. As shown in Fig. 7, the method consists of three steps: object detection, edge extraction, and feature point extraction. Well-lit regions without gray transition bands are located in the object detection step in a fast and robust manner to reduce the effects of inhomogeneous illumination and image noise. Then, accurate edge detection is performed in these regions to ensure the detection accuracy of the corner points.

Fig. 7
figure 7

Flowchart of the feature point detection method

The alignment task of a mobile phone's cover glass is taken as an example to illustrate the method. Since the corner points of the cover glass are taken as the feature points and are used to recognize the reference point pW(xW, yW), the region of interest (ROI) containing straight edges in the input image is located through a coarse-to-fine shape matching method in Step 1. Then, edge extraction is accomplished by line fitting in Step 2. In Step 3, the left and the right corner points are obtained by calculating the intersections of the fitted lines in the left and the right views, respectively, as sketched below. The midpoint of these corner points is used as the reference point.
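
A minimal sketch of Steps 2–3 for one view is given below. It assumes the edge pixels of the two straight edges inside the ROI have already been extracted; the total least-squares line fit used here is a generic choice and not necessarily the exact fitting routine used in the paper.

```python
import numpy as np

def fit_line(points):
    """Total least-squares line fit: returns a point on the line and its unit direction."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # principal direction of the centered edge pixels
    _, _, vt = np.linalg.svd(pts - centroid)
    return centroid, vt[0]

def line_intersection(p1, d1, p2, d2):
    """Intersection of two parametric lines p + t*d (assumed non-parallel)."""
    A = np.column_stack([d1, -d2])
    t = np.linalg.solve(A, p2 - p1)
    return p1 + t[0] * d1

# Edge pixels of the two edges meeting at the corner (toy data, in pixels)
edge_a = [(100 + i, 200 + 0.02 * i) for i in range(0, 300, 10)]
edge_b = [(120 + 0.03 * j, 210 + j) for j in range(0, 300, 10)]

p1, d1 = fit_line(edge_a)
p2, d2 = fit_line(edge_b)
corner = line_intersection(p1, d1, p2, d2)   # image coordinates of the corner point
print(corner)
```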

The premise of accurately obtaining the image coordinates of the feature points is to complete the object detection quickly and robustly, so as to adapt to the random placement of the workpiece. In this paper, a coarse-to-fine shape matching method is proposed to accomplish the object detection. The main contribution of this method lies in the adoption of a coarse-to-fine strategy to balance speed and accuracy. Specifically, after the template image and the input image are both downsampled, the branch-and-bound scheme and the gradient spread method are used for coarse matching to speed up the shape matching process. Furthermore, to obtain a more accurate result, images with a higher resolution than those used in Stage I are adopted as inputs to carry out the fine matching based on the coarse matching results.

Stage I Coarse Shape Matching Based on the Branch-and-bound Scheme and the Gradient Spread Method.

The image matching task can be interpreted as the process of finding the best affine transformation among a large number of possible affine transformations. This process is time-consuming, so an appropriate strategy is needed to speed up the search. The Fast-Match method [18] utilized the branch-and-bound scheme for acceleration, with the sum of absolute differences (SAD) as the similarity measurement. However, the SAD is calculated from the gray values of image blocks, which may be influenced by illumination variation and cause the matching to fail in some situations. In contrast, edge features exist widely in images and are insensitive to illumination variation. Therefore, in this paper, edge features are used to accomplish the image matching task for adaptability to illumination variation. To realize this idea, the gradient direction is extracted as the descriptor of the edge features, with the absolute cosine of the angle between the gradient directions of the template and the input image as the similarity measurement. Furthermore, the gradient spread process is adopted, and the branch-and-bound scheme is utilized to speed up the search over affine transformations. Since edge features are used to accomplish the image matching, this matching method can be categorized as a shape matching method.

The gradient spread process was proposed in the LINE-MOD method [19] to keep the matching invariant to small translations and deformations. It smooths the gradient representation by diffusing the gradient direction of an edge point to the points within its local neighborhood. Specifically, as shown in Fig. 8, a binarized image J storing the gradient direction codes is used to represent the input image I; it is obtained by extracting, quantizing, and encoding the gradient directions of the edge points, with the gradient spread applied in this process, as shown in Fig. 8c.

Fig. 8
figure 8

a The gradient direction is quantized to 16 directions and encoded to a 16-bit code accordingly. b Gradient directions of the input image are extracted and quantized. c The gradient spread direction of one point is defined as the recording of all the gradient directions of its neighborhoods in a radius of T/2 (T = 3). d The gradient spread directions are encoded to 16-bit codes, and a binarized image J is constructed
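
A minimal numpy sketch of how such a binarized image J could be built is given below (quantization into 16 directions, a 16-bit one-hot code per direction, and spreading over a T × T neighborhood); the gradient operator and the magnitude threshold used to select edge points are illustrative assumptions.

```python
import numpy as np

def spread_gradient_codes(image, t=3, mag_threshold=30.0):
    """Build a binarized image J of 16-bit gradient-direction codes (Fig. 8)."""
    img = image.astype(float)
    # simple central-difference gradients (a Sobel filter could be used instead)
    gx = np.zeros_like(img); gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    mag = np.hypot(gx, gy)
    # quantize gradient directions of edge points into 16 bins -> one-hot 16-bit codes
    direction = np.arctan2(gy, gx)                        # in (-pi, pi]
    bins = np.floor((direction + np.pi) / (2 * np.pi / 16)).astype(int) % 16
    codes = np.where(mag > mag_threshold, 1 << bins, 0).astype(np.uint16)
    # spread: OR the codes over a t x t neighborhood (radius t // 2)
    J = np.zeros_like(codes)
    r = t // 2
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            shifted = np.roll(np.roll(codes, dy, axis=0), dx, axis=1)
            J |= shifted
    return J

# Toy usage: a synthetic 64x64 image with a bright square
img = np.zeros((64, 64)); img[20:40, 20:40] = 255.0
J = spread_gradient_codes(img, t=3)
print(J.dtype, int((J != 0).sum()))
```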

After the gradient spread process is applied and the binarized image J of the input image I is established, the coarse shape matching is performed according to Algorithm 1. Firstly, the template image Θ and the input image I are downsampled to reduce the size of the search space and the time consumption. Secondly, in the preparation stage, a series of variables is established so that the branch-and-bound scheme and the gradient spread process can be used to accelerate the shape matching. In detail, to calculate the similarity measurement quickly, the binarized image J of the input image is established within a neighborhood [− T/2, T/2] × [− T/2, T/2]. A response table τ is also precomputed to store the maximum cosine similarity between any possible gradient direction in the template image Θ and any possible direction code in the binarized image J. Then an affine transformation net N0 containing all possible affine transformations is constructed for searching for the best affine transformation. Thirdly, the candidate affine transformation net NC containing n candidate affine transformations is searched by the branch-and-bound scheme. Using a parameter δ to control the search precision, the size of the candidate affine transformation net NC is gradually reduced until n expected candidate affine transformations remain. In addition, as the size of the candidate transformation net NC decreases, the length T of the neighborhood decreases by a factor of δ accordingly. Algorithm 1 is depicted as follows.

Algorithm 1 (figure)
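
The core operation inside Algorithm 1 is scoring a candidate affine transformation against the spread codes in J via the response table τ. The simplified sketch below shows only that scoring step; the branch-and-bound search over the transformation net N0 and the progressive reduction of T are omitted, and all names are illustrative.

```python
import numpy as np

def build_response_table():
    """tau[d, c]: max |cos| between template direction bin d (16 bins) and any
    direction contained in the 16-bit spread code c."""
    angles = np.arange(16) * (2 * np.pi / 16)
    tau = np.zeros((16, 65536))
    for c in range(1, 65536):
        present = [b for b in range(16) if c & (1 << b)]
        diffs = np.abs(np.cos(angles[:, None] - angles[present][None, :]))
        tau[:, c] = diffs.max(axis=1)
    return tau

def score_candidate(template_pts, template_dirs, affine, J, tau):
    """Mean response of the template edge points warped by a 2x3 affine transform."""
    pts = np.asarray(template_pts, dtype=float)
    warped = pts @ affine[:, :2].T + affine[:, 2]
    cols = np.clip(np.round(warped[:, 0]).astype(int), 0, J.shape[1] - 1)
    rows = np.clip(np.round(warped[:, 1]).astype(int), 0, J.shape[0] - 1)
    return tau[template_dirs, J[rows, cols]].mean()

# Toy usage with a tiny J and one template edge point whose direction is bin 8
tau = build_response_table()
J = np.zeros((32, 32), dtype=np.uint16); J[10, 12] = np.uint16(1 << 8)
affine = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]])   # pure translation by (2, 0)
print(score_candidate([(10.0, 10.0)], np.array([8]), affine, J, tau))
```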

Stage II Fine Shape Matching

Since the gradient spread direction of a point is defined as the record of all gradient directions of its neighbors within a radius of T/2, the error of the shape matching is proportional to the neighborhood radius. Therefore, based on the coarse matching results, a step-by-step search is performed to find the best affine transformation.

As shown in Algorithm 2, the template image Θ and the input image I are both downsampled, but to a higher resolution than the images used in Stage I. Then the candidate affine transformation net NC is expanded into a refined net QC, from which the best transformation, named TBest, is selected.

Algorithm 2 (figure)
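
A simplified sketch of this refinement is given below: each coarse candidate is expanded into a small local net of nearby affine transformations, and every expanded candidate is re-scored on the higher-resolution images. The perturbation step sizes and the score-function interface are illustrative assumptions rather than the exact scheme of Algorithm 2.

```python
import numpy as np
from itertools import product

def expand_candidates(coarse_candidates, d_translation=1.0, d_angle=np.deg2rad(0.5)):
    """Build a refined net Q_C: perturb each 2x3 affine candidate by small
    rotations and translations around its current parameters."""
    refined = []
    for affine in coarse_candidates:
        for da, dtx, dty in product((-d_angle, 0.0, d_angle),
                                    (-d_translation, 0.0, d_translation),
                                    (-d_translation, 0.0, d_translation)):
            c, s = np.cos(da), np.sin(da)
            rot = np.array([[c, -s], [s, c]])
            cand = np.empty((2, 3))
            cand[:, :2] = rot @ affine[:, :2]
            cand[:, 2] = affine[:, 2] + np.array([dtx, dty])
            refined.append(cand)
    return refined

def fine_match(coarse_candidates, score_fn):
    """Return T_Best: the refined candidate with the highest similarity score."""
    refined = expand_candidates(coarse_candidates)
    return max(refined, key=score_fn)

# Toy usage with a dummy score that prefers candidates close to the identity
identity = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
dummy_score = lambda a: -np.linalg.norm(a - identity)
print(fine_match([identity + 0.01], dummy_score))
```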

4 Experiments and Analysis

A series of experiments is conducted to verify the alignment method proposed in this paper. The alignment system is described in Sect. 4.1, and the calibration results, covering the calibration of the cameras and of the rotation center of the rotation platform, are presented in Sect. 4.2. Alignment experiments are carried out to verify the calibration and alignment accuracy in Sect. 4.3. Finally, to compare the binocular vision system with the monocular vision system, contrast experiments are carried out in Sect. 4.4.

4.1 Alignment System

The alignment system is composed of a motion platform and a vision system. As shown in Fig. 9, the motion platform consists of a two-dimensional translation platform and a rotation platform. A binocular vision system with two telecentric lens cameras is established to measure the position and pose of the workpiece by capturing images containing the feature points; it is mounted with its optical axes orthogonal to the motion platform. Taking the alignment task of the mobile phone's cover glass as an example, the left and the right corners of the workpiece are taken as the feature points. The images containing the left and the right corners of the workpiece are captured by camera 1 and camera 2, respectively.

Fig. 9
figure 9

Workstation

In the binocular based vision system, two Basler alA3800-8gm GigE cameras (image size: 3840 × 2748 pixel; pixel size: 1.67 µm × 1.67 µm) are mounted orthogonal to the motion platform. The motion platform uses a two-dimensional translation motor LMP-20C20 and a rotation motor FOI170-Z10-A00-N01 from the LinkHou Corporation to compose the translation platform and the rotation platform. The repeatability and resolution of the translation motor LMP-20C20 are ± 2 µm/ ± 2 µm and 0.5 µm/0.5 µm, respectively. The repeatability and absolute accuracy of the rotation motor FOI170-Z10-A00-N01 are ± 0.0398° and ± 0.398°, respectively.

4.2 Calibration Result for the Vision System

4.2.1 Calibration Result for the Telecentric Lens Camera

According to the proposed calibration method given in Sect. 3.1, several pairs of points need to be obtained by the active movement of the motion platform to calibrate the left camera, the right camera, and the rotation center of the rotation platform. The movement amounts are taken as the local world coordinates, since the accuracy of the motion platform is high enough and the local world coordinate systems are established along the axes of the motion platform. Thus, the motion platform is controlled to move according to the commands (Δx, Δy) = {(− 3,3), (0,3), (3,3), (3,0), (0,0), (− 3,0), (− 3,− 3), (0,− 3), (3,− 3)} (mm), and the images containing the left and the right corners of the workpiece are captured.

Substituting the coordinates of the corners into Eqs. (4)–(5) in Sect. 3.1, the calibration parameters of the left camera are obtained as [θzL, θyL, θxL, pxL, pyL, kL] = [− 1.20°, − 10.23°, 181.63°, 16.07, 10.95, 92.36] with a pixel equivalent of 10.82 μm/pixel. The calibration parameters of the right camera are obtained as [θzR, θyR, θxR, pxR, pyR, kR] = [− 1.20°, − 9.46°, 180.87°, 21.33, 12.52, 92.02] with a pixel equivalent of 10.86 μm/pixel. Therefore, the homography matrices between the image coordinate systems oLuLvL, oRuRvR and the local world coordinate systems oWLxWLyWLzWL, oWRxWRyWRzWR, denoted ML and MR, respectively, are given by

$$ M_{{\text{L}}} = \left[ {\begin{array}{*{20}c} {0.0909} & { - 0.0159} & {0.0165} & {1.4842} \\ { - 0.0019} & { - 0.0923} & {0.0026} & {1.0113} \\ 0 & 0 & 0 & {0.0924} \\ \end{array} } \right] $$
$$ M_{{\text{R}}} = \left[ {\begin{array}{*{20}c} {0.0907} & { - 0.0149} & {0.0151} & {1.9628} \\ { - 0.0019} & { - 0.0920} & {0.0014} & {1.1521} \\ 0 & 0 & 0 & {0.0920} \\ \end{array} } \right] $$

4.2.2 Calibration Result for the Rotation Center

According to the angle constraint-based calibration method of the rotation center, the motion platform is rotated clockwise by the angle α = 3° three times to obtain the local world coordinates of the left and the right corners, denoted pWLi(xWLi, yWLi), pWRi(xWRi, yWRi) (i = 1,2,3). According to Eqs. (17)–(19) proposed in Sect. 3.1.3, the coordinates of the rotation center in the views of the left and the right camera are obtained as pWOL(xWOL, yWOL) = (34.80, − 85.21) and pWOR(xWOR, yWOR) = (− 37.16, − 84.49).

In addition, to compare the calibration and alignment errors of the angle constraint-based calibration method proposed in this paper with those of the least-squares-based calibration method, contrast experiments are implemented in Sect. 4.4.

4.3 Verified Experiments for the Calibration Accuracy and the Alignment Accuracy

To verify the calibration accuracy and the alignment accuracy, experiments were conducted with the workpiece adsorbed on the motion platform in a random pose. While keeping the relative attitude of the workpiece and the motion platform unchanged, a large package of data was obtained through the active motion of the motion platform, as shown in Table 1. As shown in Fig. 10, each slice of data is composed of the images containing the left and the right corners of the workpiece. The left and the right corners of the workpiece are denoted as pli(uli, vli) and pri(uri, vri) in the image coordinate systems oLuLvL, oRuRvR, and as pli(xwli, ywli) and pri(xwri, ywri) in the local world coordinate systems oWLxWLyWLzWL, oWRxWRyWRzWR. Owing to the high precision of the motion platform, these movements can be used as the ground truth of the theoretical position deviation of the workpiece. Furthermore, to verify the method's robustness, the above experimental process is repeated after changing the relative attitude between the workpiece and the motion platform.

Table 1 Movement amounts of the motion platform
Fig. 10
figure 10

Schematic of the data with the position and pose{△xw*(mm), △yw*(mm), △θ*(°)}. a (0,0,0) b (0,0,2) c (2,3,0) d (− 1,1,2)

(The data contain 10 translations, 10 rotations, and 20 combined translation-and-rotations.)

4.3.1 Verified Experiments for the Telecentric Lens Camera’s Calibration Accuracy

ML and MR are the calibration results of the telecentric lens cameras, which can be utilized to map the image coordinates of the corners to world coordinates. As shown in Table 1, if any two slices of data have the same value of Δθ*, they are considered to be parallel. Each pair of data satisfying the parallel relationship is picked out, and the world coordinates of the corners are calculated using ML and MR. In these data pairs, the world coordinates of the left corner are denoted as (xwli, ywli), (xwlj, ywlj), i, j ∈ 1,2,…,Cn2, i ≠ j. Then the measured difference for the data pair (i, j) is written as dlk(dxlk, dylk) = (xwli − xwlj, ywli − ywlj). The theoretical difference of the left corner between the data pair (i, j) is given by dlk*(dxlk*, dylk*) = (Δxwi − Δxwj, Δywi − Δywj). Thus, the calibration error {dxlerror, dylerror} of the left camera can be obtained by

$$ \left\{ \begin{gathered} {\text{d}}_{{xl_{error} }} = {\text{d}}_{xlk}^{*} - {\text{d}}_{xlk} \hfill \\ {\text{d}}_{{yl_{error} }} = {\text{d}}_{ylk}^{*} - {\text{d}}_{ylk} \hfill \\ \end{gathered} \right. $$
(24)

The calibration errors dxrerror, dyrerror of the right camera are obtained in the same way. The calibration errors of seventy data pairs are given in Fig. 11; the calibration errors of the left and the right camera are within ± 0.020 mm. Furthermore, another data package was obtained after the relative attitude between the workpiece and the motion platform was changed. As shown in Fig. 12, the calibration errors of the left and the right camera are again within ± 0.020 mm, which indicates that the calibration method is robust to variation of the relative attitude between the workpiece and the motion platform.

Fig. 11
figure 11

The calibration error when the workpiece is adsorbed on the motion platform in a random attitude. a the calibration error {dxlerror, dylerror} of the left camera. b the calibration error {dxrerror, dyrerror} of the right camera

Fig. 12
figure 12

The calibration error when the relative attitude between the workpiece and the motion platform is changed. a the calibration error {dxlerror, dylerror} of the left camera. b the calibration error {dxrerror, dyrerror} of the right camera

4.3.2 Verified Experiments for the Rotation center’s Calibration Accuracy

According to Eqs. (20)–(23), the rotation center calibration accuracy plays a crucial role in the alignment error. Thus, the rotation center calibration accuracy is evaluated by calculating the alignment error. Based on Table 1, the alignment error can be calculated as the difference between the alignment command and the movement amount. Specifically, the unified world coordinates of the left and the right corners are obtained according to Eq. (20) from their image coordinates. The target position is set as {xw*, yw*, θ*} = {0, 0, 0}. Then, the alignment commands {Δxw, Δyw, Δθ} can be calculated according to Eqs. (21)–(23). Taking the movement amount of the motion platform as the reference value, the alignment error {xerror, yerror, θerror} in the x, y, and θ directions can be obtained as follows.

$$ \left\{ \begin{gathered} x_{error} = \Delta x_{{\text{W}}}^{*} - \Delta x_{{\text{W}}} \hfill \\ y_{error} = \Delta y_{{\text{W}}}^{*} - \Delta y_{{\text{W}}} \hfill \\ \theta_{error} = \Delta \theta_{{\text{W}}}^{*} - \Delta \theta \hfill \\ \end{gathered} \right. $$
(25)

where {Δxw*, Δyw*, Δθ*} is the movement amount of the motion platform.

The alignment errors of forty data slices are given in Fig. 13. The alignment errors xerror and yerror are within ± 0.020 mm, and θerror is within ± 0.25°. Meanwhile, the alignment errors of another data package, obtained after the relative attitude between the workpiece and the motion platform was changed, are shown in Fig. 14. The alignment experiment results indicate that the achievable calibration accuracy is reasonable. In addition, the average computing time for an alignment command is only 20 ms, which fully meets the real-time requirements of industrial applications.

Fig. 13
figure 13

The alignment error {xerror, yerror, θerror} when the workpiece is adsorbed on the motion platform in a random attitude

Fig. 14
figure 14

The alignment error {xerror, yerror, θerror} when the relative attitude between the workpiece and the motion platform is changed

4.4 Contrast Experiments

A series of contrast experiments is carried out to demonstrate the advantages of the method proposed in this paper. Firstly, the performance of the two calibration methods of the rotation center, the least-squares-based method and the angle constraint-based method, is compared. Secondly, to verify whether a multi-camera vision system improves the alignment accuracy, we implement a contrast experiment in which the alignment task is accomplished with a binocular vision system and with a monocular vision system, respectively.

4.4.1 Contrast Experiment for the Calibration Method of the Rotation Center

According to Eqs. (20)–(23), the calibration accuracy of the rotation center directly influences the alignment error. Therefore, comparing the alignment accuracy makes it possible to evaluate which algorithm has higher accuracy. Specifically, to compare the least-squares-based calibration method and the angle constraint-based calibration method, their calibration results are taken as the rotation centers pWOL and pWOR, respectively. Similar to the verification experiments described in Sect. 4.3.2, the alignment errors of forty data slices for the two calibration methods are given in Fig. 15. Furthermore, the mean absolute error (MAE) and the range of the alignment errors are used to characterize the accuracy of the two calibration methods, as shown in Table 2. The experimental results show that the angle constraint-based calibration method proposed in this paper outperforms the least-squares-based calibration method in alignment accuracy.

Fig. 15
figure 15

The alignment errors of the two calibration methods of the rotation center

Table 2 Alignment errors with calibration method of the rotation center

4.4.2 Contrast Experiment for the Vision System

To explore the influence of the vision system on the alignment accuracy, different types of vision systems are utilized to perform the alignment task. A binocular vision system composed of the left and the right cameras is utilized to perform the alignment task. Then, a monocular vision system formed by the left camera or the right camera, respectively, is used to perform the same alignment task.

Figure 16 shows that the binocular vision-based system performs better than the monocular vision-based system. Furthermore, the MAE and the range of the alignment error in Table 3 also indicate that the binocular vision-based system achieves higher accuracy than the monocular vision-based system. The main reason is that the monocular vision-based system calculates the angle of the workpiece using the region of interest from a single image, whereas the binocular vision-based system utilizes both the left and the right images. In other words, the binocular vision-based alignment method can utilize more image information and achieve higher alignment accuracy, especially for alignment tasks with a larger workpiece size.

Fig. 16
figure 16

The alignment error in the x, y, θ direction for the different types of vision system

Table 3 Alignment errors with different types of vision system

5 Conclusions

A high-precision and fast alignment method based on binocular vision is proposed to accomplish the alignment task in one operation. A calibration method for the telecentric lens camera based on an improved nonlinear damped least-squares method is proposed to speed up the convergence of the calibration process. Meanwhile, an angle constraint-based calibration method for the platform's rotation center is proposed to pursue higher precision. Furthermore, to detect the feature points more robustly, a two-stage feature point detection method based on shape matching is presented. Experiments conducted on an alignment system demonstrate the effectiveness of the proposed methods. The alignment error is within ± 0.020 mm, and the time taken to calculate the alignment command is less than 20 ms. Future work will focus on extending the proposed methods to four cameras for high-precision alignment tasks of larger workpieces.