1 Introduction

With the rapid growth of the automotive, shipbuilding, and aerospace industries, welding plays an indispensable role in joining metal parts. Because of the harsh working environment, increasing cost, and lower quality of manual welding, welding robots are used more and more widely in industry. Compared to human welders, welding robots offer advantages such as accuracy, stability, consistency, and adaptability. However, it is still difficult for welding robots to perform welding tasks automatically without human assistance. Robotic welding tasks are implemented mainly through online programming and offline programming. Online programming is typically realized by teach-in programming, whereas offline programming generates the robot program automatically from the design files of the workpiece [1]. Compared with online programming, offline programming decreases the downtime required for system programming, resulting in considerable savings in labor costs. In offline programming, the trajectory is planned manually on a virtual model of the workpiece; in practice, the relative positions of the workpiece and the robot must therefore be obtained, which is generally achieved by high-precision clamping and positioning or by dedicated position-detection programs. As a result, the entire welding cycle becomes longer and the efficiency advantages of the welding robot cannot be realized. Therefore, detecting and locating the weld seam with sensor technology is conducive to improving the autonomy of the welding robot.

To identify and extract weld seams, a wide range of sensors is used to make welding robots more autonomous and intelligent, such as visual, acoustic, ultrasonic, and arc sensing [2,3,4]. Compared with other sensors, visual sensors offer the advantages of non-contact measurement, high precision, fast detection, and strong adaptability [5]. Chen et al. [6] proposed a single-camera stereo vision system to acquire the depth information of spatially curved weld seams; however, the results showed that the welding accuracy was not adequate when a low-resolution camera was used. Dinham and Fang [7] proposed an improved ROI-based method for narrow welds and obtained the 3D Cartesian coordinates of the weld based on stereo vision, but the method is applicable only to welds lying in a single plane. Yang [8] proposed a seam extraction and identification algorithm that extracts and identifies weld seams of various shapes and sizes, without any prior knowledge of the workpiece geometry, using passive vision sensors; however, this method cannot provide the 3D position of the weld seam to the welding robot.

Compared with passive vision sensors, active vision sensors can acquire high-precision depth information. The laser sensor is a typical active vision sensor with good detection accuracy and reliability, and it is mostly used in the weld seam tracking stage. In some studies, laser sensors have also been used to detect the starting point of narrow weld seams with high accuracy. Liu et al. [9] proposed a method to precisely identify the initial weld position by employing an automatic dynamic-programming-based algorithm that extracts the inflection point of the laser stripe. However, the trajectory information sensed by a laser sensor is very limited, so it provides little sensing information to the welding robot and is more suitable for precise, local welding applications. In recent years, a new coded structured-light sensor with better performance in 3D reconstruction has attracted the attention of many scholars [10, 11]. Xiao et al. [12] proposed a new 3D sensing and point cloud modeling approach to realize the accurate extraction of weld seams. Patil et al. [13] proposed a novel algorithm to classify and extract weld seams from 3D point clouds that is independent of the shape of the workpiece. Yang et al. [14] proposed a novel weld seam extraction system with a structured-light sensor, which can be applied to many types of weld seam detection and is robust to complex welding environments. However, this type of sensor is difficult to apply to the detection of narrow butt welds.

Although current laser sensors can identify the initial weld position with high accuracy, the small amount of sensed information makes it difficult to identify the entire weld trajectory. Even when the weld trajectories of multiple workpieces are known in advance, laser sensors can hardly enable a welding robot to weld autonomously through trajectory recognition. Traditional passive vision is mostly used for 2D identification of flat welds, while the new coded structured-light cameras have difficulty identifying narrow butt joints. Therefore, this paper proposes a narrow weld seam detection and localization method based on binocular stereo vision to improve the perception and autonomy of the welding robot system. The method identifies narrow weld seams and reconstructs the weld seam point cloud using stereo vision techniques. The point cloud is then matched with the known weld trajectory to recognize the 3D trajectory of the weld. In addition, the trajectory matching process localizes the entire weld seam, which in turn guides the welding robot to complete the welding task. Experimental results demonstrate the effectiveness of the proposed method.

This paper is organized as follows: In Sect. 2, the system description and architecture are introduced. The details of the weld seam extraction and trajectory planning of the proposed method are presented in Sect. 3. The corresponding experimental results and discussion are given in Sect. 4. The conclusions are given in Sect. 5.

2 System description and architecture

To achieve the goal of narrow weld detection and localization, an experimental system is built whose schematic diagram is shown in Fig. 1; it consists of a binocular camera, a welding robot, and an industrial computer. The binocular camera is mounted on the robot, and the workpiece is placed on the workbench. The computer-aided design (CAD) file containing the 3D model of the workpiece is used to generate the robot’s welding trajectory, and stereo vision technology is applied to obtain 3D information about the weld seam.

Fig. 1
figure 1

The schematic diagram of the experimental system

The block diagram of the proposed method is shown in Fig. 2, and the whole process can be divided into five steps: image capture and preprocessing, edge chain detection, 3D reconstruction of the weld seam, curve fitting, and trajectory matching. For image capture, based on an optical analysis of weld features, we propose adjusting the image grayscale expectation value to enhance narrow weld features. At this stage, the workpiece is captured twice at a fixed pose: a normal capture followed by a capture with a high grayscale expectation value. Image preprocessing consists of two parts: the weld seam region is first extracted from the high-grayscale image, and then the stereo rectification of the binocular images is completed. Edge chain detection detects edges in the image and extracts the weld seam by combining them with the seam region obtained in preprocessing. The third step is the sparse reconstruction of the extracted weld edge chains to obtain a 3D point cloud of the weld area; the initial point cloud is also filtered at this stage. The fourth step is to fit the weld trajectory based on the weld point cloud. Finally, the fitted weld trajectory is matched with the weld trajectory obtained from the 3D model of the workpiece, and the weld trajectory is updated according to the transformation matrix acquired during the matching process. By transforming from the camera coordinate system to the robot coordinate system, the weld trajectory planning of the welding robot can be realized.

Fig. 2
figure 2

The block diagram of the proposed method

3 The proposed method

3.1 Image acquisition and preprocessing

Fig. 3
figure 3

The schematic diagram of diffuse reflection in the seam area

In the field of image processing, reflections from smooth surfaces have long troubled researchers. Metal surfaces are particularly reflective and usually lack texture, which makes them a difficult case for image processing. When the photographed object is a weld seam, its typical feature is a sharp change in grayscale, which is also the cue that researchers have commonly relied on for detection. However, illumination in the welding environment varies greatly, and it is generally difficult to extract the weld seam completely.

Fig. 4
figure 4

The workpiece image at different grayscale expectation values

3.1.1 Image acquisition

To enhance the weld seam features, we analyze the imaging behavior of the camera sensor. When the camera exposure time is increased, the diffusely reflected light from the object surface is fully received by the camera and tends to saturate, an effect that is especially obvious on smooth surfaces. As shown in the diffuse-reflection diagram of the weld area in Fig. 3, when the camera is located above the seam, the seam area reflects the least light. As the exposure time increases, the grayscale value of the object surface in the image gradually saturates, while the grayscale value of the seam area remains steady and low. As shown in Fig. 4, which presents the workpiece image at various grayscale expectation values, the features of the weld seam area remain obvious even when the grayscale expectation value is adjusted to its maximum. Therefore, the binocular vision system takes two shots in the same pose to obtain stable weld seam features. The grayscale expectation value is set to 120–180 for the first acquisition and to 250 for the second acquisition, where only the left image is retained for the second acquisition.

Fig. 5
figure 5

Binarization result of high grayscale expected value image

3.1.2 Region of interest

To extract the weld region better, we set a grayscale threshold \(\tau\). Under different grayscale expectation values, the grayscale value of the seam region remains relatively stable and below the threshold \(\tau\). As shown in Fig. 5, the image with a grayscale expectation value of 250 is binarized with the threshold \(\tau\) set to 80. To suppress noise and reduce the amount of data to be processed, the region of interest (ROI) method [15, 16] is applied. Since a binocular vision system is used, the ROI is set within the best field of view of the left view; here, we set the right half of the left image as the ROI. The connected component spanning the longest distance within the ROI is considered to be the region where the weld seam is located.
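As a concrete illustration of this step (a minimal sketch using OpenCV and NumPy, not the authors' implementation), the routine below binarizes the high-grayscale-expectation image with \(\tau\), restricts the search to the right half of the left view, and keeps the connected component with the largest spatial span; the threshold and ROI split follow the illustrative values given above.

```python
import cv2
import numpy as np

def extract_seam_roi(high_gray_img, tau=80):
    """Locate the weld seam region in the high-grayscale-expectation image.

    high_gray_img: 8-bit grayscale left image captured with the grayscale
    expectation value set to 250. Pixels darker than tau are seam candidates.
    """
    # Binarize: the narrow seam stays dark even at high exposure.
    _, dark = cv2.threshold(high_gray_img, tau, 255, cv2.THRESH_BINARY_INV)

    # Restrict the search to the right half of the left image (the ROI).
    mask = np.zeros_like(dark)
    mask[:, dark.shape[1] // 2:] = dark[:, dark.shape[1] // 2:]

    # Keep the connected component spanning the longest distance (largest
    # bounding-box diagonal); label 0 is the background.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)
    best_label, best_span = 0, 0.0
    for i in range(1, n):
        span = np.hypot(stats[i, cv2.CC_STAT_WIDTH], stats[i, cv2.CC_STAT_HEIGHT])
        if span > best_span:
            best_label, best_span = i, span
    return (labels == best_label).astype(np.uint8) * 255
```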

Fig. 6
figure 6

The ideal configuration of the stereo vision system

3.1.3 Stereo rectification

The ideal configuration of the stereo vision system is shown in Fig. 6. \(C_l\) and \(C_r\) are the projection centers of the left and right cameras, respectively. A point P in the 3D world is projected onto the left and right images; \(P_l\) and \(P_r\) are the projections of P in the left and right images, respectively. The plane formed by the projection centers of the stereo cameras and the point P is known as the epipolar plane, and the epipolar line is the intersection of the epipolar plane with an image plane. However, it is almost impossible in practice to guarantee that the optical axes of the left and right cameras are parallel and that the imaging planes are coplanar. The small rotation and translation between the two image planes must be corrected so that the epipolar lines satisfy the ideal imaging characteristics of parallel binocular vision. The stereo vision system was calibrated using a checkerboard pattern with a grid spacing of 20 mm × 20 mm [17]. To reduce the reprojection error, the Bouguet algorithm was applied to rectify the stereo images in this study [18].
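OpenCV exposes the Bouguet rectification directly, so this step can be reproduced as in the sketch below; the intrinsic matrices K_l and K_r, distortion vectors D_l and D_r, and the extrinsics R and T are assumed to come from the checkerboard calibration described above.

```python
import cv2

def rectify_pair(img_l, img_r, K_l, D_l, K_r, D_r, R, T):
    """Epipolar rectification of a stereo pair using OpenCV's Bouguet routine."""
    size = (img_l.shape[1], img_l.shape[0])  # (width, height)
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K_l, D_l, K_r, D_r, size, R, T,
                                                alpha=0)
    # Per-camera undistortion + rectification maps, then resampling.
    map_lx, map_ly = cv2.initUndistortRectifyMap(K_l, D_l, R1, P1, size, cv2.CV_32FC1)
    map_rx, map_ry = cv2.initUndistortRectifyMap(K_r, D_r, R2, P2, size, cv2.CV_32FC1)
    rect_l = cv2.remap(img_l, map_lx, map_ly, cv2.INTER_LINEAR)
    rect_r = cv2.remap(img_r, map_rx, map_ry, cv2.INTER_LINEAR)
    return rect_l, rect_r, Q
```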

3.2 Edge chain detection

The weld seams can be detected from edge or contour features in the image. In image processing, features based on gray-value differences can be classified as edge features or contour features. The most commonly employed edge detection method in weld seam extraction is the differential operator method, a classical technique that detects edges from the gray-level changes around each pixel using first-order or second-order directional derivatives [19]. Although the differential operator method can determine whether each pixel of an image belongs to an edge, it cannot identify whether an edge pixel belongs to the weld. Hence, a grouping method that chains edge points together is proposed to distinguish edge pixels. To reduce noise and extract the edge representing the weld seam, the edge chain is a suitable representation.

As in traditional edge detection, the first step of edge chain detection is to calculate the gradient. The gradient magnitude can be expressed as

$$\begin{aligned} M=\sqrt{{G_x}^2+{G_y}^2} \end{aligned}$$
(1)

where \(G_x\) and \(G_y\) are the derivatives in the X and Y directions at the point under consideration, respectively. The direction of the gradient can be expressed as

$$\begin{aligned} \theta =arctan(\dfrac{G_y}{G_x}) \end{aligned}$$
(2)

Similarly, the upper threshold \(\tau _u\) and the lower threshold \(\tau _l\) of the gradient magnitude are used as the critical parameters of the algorithm. The hysteresis thresholding method is applied to remove weak edges and connect split edges. To detect more potential edge pixels, the search starts at a pixel whose gradient magnitude is greater than \(\tau _u\) and examines all 8 surrounding neighbors; if a neighbor's magnitude is greater than \(\tau _l\), it also becomes an edge pixel. In addition, the non-maximum suppression method is applied in the proposed algorithm.

For an edge pixel A with gradient direction \(\theta _a\) and a neighboring edge pixel B with gradient direction \(\theta _b\), the two pixels can be connected into the same chain if

$$\begin{aligned} \left\{ \begin{matrix} |u_a-u_b |<\Delta _x \\ |v_a-v_b |<\Delta _y \\ |\theta _a-\theta _b |<\theta _0 \end{matrix}\right. \end{aligned}$$
(3)

where \((u_a,v_a)\) and \((u_b,v_b)\) are the pixel coordinates of A and B, respectively; \(\Delta _x\) and \(\Delta _y\) are the allowed distances between A and B in the X-axis and Y-axis directions, respectively; and \(\theta _0\) is the allowed orientation difference between A and B. In order to reduce the computational cost of the weld seam extraction algorithm and to suppress noise from rust and scratches, a threshold \(\lambda\) on the edge chain length is applied, and only sufficiently long edge chains are retained as meaningful. In this paper, \(\Delta _x\) is set to 1, \(\Delta _y\) is set to 2, \(\theta _0\) is fixed empirically to 30°, and the chain length threshold is set to \(\lambda =100\).
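A simplified sketch of this chaining rule (illustrative only, not the authors' code) is given below: starting from an unvisited edge pixel, neighbors within \(\Delta _x\) and \(\Delta _y\) whose gradient orientation differs by less than \(\theta _0\) are appended to the same chain, and chains shorter than \(\lambda\) are discarded.

```python
import numpy as np

def link_edge_chains(edge_pts, theta, dx=1, dy=2, theta0=np.deg2rad(30), lam=100):
    """Group edge pixels into chains according to the adjacency rule of Eq. (3).

    edge_pts: iterable of (u, v) pixel coordinates of detected edge points.
    theta:    dict mapping (u, v) -> gradient orientation in radians.
    Chains shorter than lam are discarded as noise (rust, scratches).
    Angle wrap-around is ignored for brevity.
    """
    remaining = set(edge_pts)
    chains = []
    while remaining:
        seed = remaining.pop()
        chain, stack = [seed], [seed]
        while stack:
            ua, va = stack.pop()
            # Examine candidate neighbours within the allowed pixel distances.
            for du in range(-dx, dx + 1):
                for dv in range(-dy, dy + 1):
                    nb = (ua + du, va + dv)
                    if nb in remaining and abs(theta[(ua, va)] - theta[nb]) < theta0:
                        remaining.remove(nb)
                        chain.append(nb)
                        stack.append(nb)
        if len(chain) >= lam:
            chains.append(chain)
    return chains
```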

Fig. 7
figure 7

Edge detection comparison results. a Original image. b Canny edge detection. c The refined edge detection

Fig. 8
figure 8

The weld seam extraction results of the left image. a Edge chain detection results. b The weld seam extraction results

To refine the edge detection further, each edge pixel (u, v) is classified as a horizontal or vertical edge point according to the local gradient. For a horizontal edge point, the pixel is a horizontal local maximum of the gradient modulus (\(M(u-1,v) < M(u,v)\) and \(M(u, v) > M(u+1, v)\)) and the image gradient is more horizontal than vertical at that point (\(|G_x(u, v) |> |G_y(u, v) |\)). Analogously, a pixel is a vertical edge point when \(M(u, v-1) < M(u, v)\), \(M(u, v) > M(u, v+1)\), and the image gradient is more vertical than horizontal (\(|G_x(u, v) |< |G_y(u, v)|\)). If two edge points are horizontally (or vertically) adjacent and one of them is classified as a horizontal (or vertical) edge point, the other edge point is deleted. The comparison result of edge refinement is shown in Fig. 7. Compared with Canny edge detection, the proposed edge chain detection obtains more accurate edge pixels.
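A simplified sketch of this refinement criterion is shown below; it keeps only pixels that are local maxima of the gradient modulus along their dominant gradient direction, which approximates the deletion rule described above (illustrative, not the authors' implementation).

```python
import numpy as np

def refine_edges(edge_mask, M, Gx, Gy):
    """Keep only horizontal/vertical local maxima of the gradient modulus.

    edge_mask, M, Gx, Gy are image-sized arrays indexed as [row v, column u].
    """
    h, w = edge_mask.shape
    refined = np.zeros_like(edge_mask)
    for v in range(1, h - 1):
        for u in range(1, w - 1):
            if not edge_mask[v, u]:
                continue
            # Horizontal edge point: local maximum along u, gradient mostly horizontal.
            horiz = (M[v, u - 1] < M[v, u] > M[v, u + 1]) and abs(Gx[v, u]) > abs(Gy[v, u])
            # Vertical edge point: local maximum along v, gradient mostly vertical.
            vert = (M[v - 1, u] < M[v, u] > M[v + 1, u]) and abs(Gx[v, u]) < abs(Gy[v, u])
            if horiz or vert:
                refined[v, u] = 1
    return refined
```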

Combining the results of image preprocessing, only edge chains that pass through, or are at least partly contained in, the weld seam region of the image are retained, and only sufficiently long edge chains are kept. The weld seam extraction result for the left image is shown in Fig. 8.

3.3 3D reconstruction of weld seams

Shown in Fig. 6 is the ideal configuration of the stereo vision system. For a point P with world coordinates (X, Y, Z), the depth information can be calculated by the following equation:

$$\begin{aligned} Z=f*\frac{\displaystyle b}{d} \end{aligned}$$
(4)

where f is the focal length of the camera, \(b=\left\| T \right\|\) is the baseline length, and d is the disparity between the mapped point in the left and right images. Then, the other coordinates can be calculated by

$$\begin{aligned} X=Z*\frac{\displaystyle x}{f} \end{aligned}$$
(5)
$$\begin{aligned} Y=Z*\frac{\displaystyle y}{f} \end{aligned}$$
(6)

where x and y are the coordinates of the point P in the rectified image, measured relative to the principal point.
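Equations (4)–(6) translate directly into code. In the sketch below, the coordinates x and y are taken as the rectified pixel coordinates shifted by the principal point (c_x, c_y) of the rectified left camera, with f expressed in pixels and b in millimetres; these unit and offset conventions are assumptions for illustration.

```python
import numpy as np

def triangulate(u, v, disparity, f, b, cx, cy):
    """Recover camera-frame coordinates (X, Y, Z) from a rectified pixel (u, v)
    and its disparity, following Eqs. (4)-(6)."""
    Z = f * b / disparity      # Eq. (4): depth from disparity
    x, y = u - cx, v - cy      # image coordinates relative to the principal point
    X = Z * x / f              # Eq. (5)
    Y = Z * y / f              # Eq. (6)
    return np.array([X, Y, Z])
```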

3.3.1 Sparse matching

It would certainly be desirable to accomplish a dense 3D reconstruction of the workpiece. However, in welding applications, and especially for narrow butt welds, full 3D information about the workpiece surface is less important than one might expect. In addition, dense 3D reconstruction of the workpiece surface is computationally expensive because of the peculiarities of metal surfaces. A better approach is therefore to obtain the most accurate sparse reconstruction possible. The edge is a typical feature that can be used for sparse 3D reconstruction [20, 21]. Since the extracted edge chains represent the weld seam, their disparity can be computed by feature matching.

According to [22], the first step in calculating the disparity is the matching cost computation, which finds the lowest cost for each candidate. Considering the influence of lighting, the cost computation is based on normalized cross-correlation (NCC), which is robust to slight deformations and lighting changes. If a pixel p of the left image \(I_l\) corresponds to the pixel \(p+d\) of the right image \(I_r\), the NCC similarity between the two pixels and their neighborhoods can be calculated by

$$\begin{aligned} NCC(d)=\frac{ \displaystyle \mathop {\displaystyle \mathop {\sum }_{t_{l}\in W_{l}(p),}}_{t_{r}\in W_{r}(p+d)} [I_{l}(t_{l})-\bar{I} _{l}(p)]\times [I_{r}(t_{r})-\bar{I} _{r}(p+d)] }{ \sqrt{\displaystyle \mathop {\sum }_{t_{l}\in W_{l}(p)}\vert I_{l}(t_{l})-\bar{I} _{l}(p)\vert ^{2}} {\times }\sqrt{\displaystyle \mathop {\sum }_{t_{r}\in W_{r}(p+d)} \vert I_{r}(t_{r})-\bar{I} _{r}(p+d)\vert ^{2}}} \end{aligned}$$
(7)

where \(\bar{I} _{l}(p)\) is the mean value of the pixels in the window \(W_l(p)\) centered at p, and \(\bar{I} _{r}(p+d)\) is the mean value of the pixels in the window \(W_r(p+d)\) centered at \(p+d\).
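A direct implementation of Eq. (7) for a square correlation window of side win is sketched below; p is given as (row, column) in the rectified left image and d is the signed disparity offset along the scanline, following the p + d convention above. The window size is an illustrative choice.

```python
import numpy as np

def ncc(left, right, p, d, win=5):
    """Normalized cross-correlation (Eq. 7) between the window centred at p in
    the rectified left image and the window shifted by d in the right image."""
    v, u = p                     # p = (row, column) of the left edge pixel
    r = win // 2
    wl = left[v - r:v + r + 1, u - r:u + r + 1].astype(np.float64)
    wr = right[v - r:v + r + 1, u + d - r:u + d + r + 1].astype(np.float64)
    wl -= wl.mean()
    wr -= wr.mean()
    denom = np.sqrt((wl ** 2).sum()) * np.sqrt((wr ** 2).sum())
    return float((wl * wr).sum() / denom) if denom > 0 else 0.0
```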

In the sparse matching, the cost aggregation step is fused with the disparity optimization step. In general, taking the disparity with the minimum matching cost as the final disparity of a pixel is called winner-take-all. To eliminate false matches, disparity optimization is normally run twice, with the left and right images in turn chosen as the reference image, and a disparity is kept only if the left and right results are consistent. However, this disparity optimization method is difficult to apply to weakly textured ferrous metal images. Besides, the corresponding edge pixels in the left and right images may not both be detected. Hence, we propose the following basic principles for sparse matching; a sketch of the resulting scanline search is given after the list.

1. On the same scanline, if edge pixels belong to the same edge chain and are adjacent, only the pixel with the lower matching cost is retained for disparity calculation.

2. On the same scanline, the priority of disparity calculation in sparse matching is determined by the matching cost: the pair of pixels with the lower matching cost is handled first, and erroneous points are then eliminated according to the epipolar constraint.
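Building on the ncc() sketch above, the winner-take-all selection over edge-pixel candidates lying on the same rectified scanline can be written as below; the minimum acceptable NCC score is an assumed parameter rather than a value from the paper.

```python
def best_disparity(left, right, p, right_edge_cols, win=5, ncc_min=0.8):
    """Among right-image edge pixels on the same row as p, keep the candidate
    with the highest NCC score (winner-take-all along the scanline)."""
    v, u = p
    best_d, best_score = None, ncc_min
    for ur in right_edge_cols:       # columns of edge pixels on row v of the right image
        score = ncc(left, right, p, ur - u, win)
        if score > best_score:
            best_d, best_score = ur - u, score
    return best_d, best_score
```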

Fig. 9
figure 9

The point cloud of S-shaped weld seam

Neighboring pixels belonging to the same edge chain are assigned disparities according to their matching costs, which minimizes noise and improves reconstruction accuracy. Prioritizing pixel pairs with lower matching costs within the disparity range reduces mismatches to a certain extent. After the disparity has been computed, the physical coordinates of the weld point cloud in the camera coordinate system can be calculated by Eqs. 4, 5, and 6. The resulting point cloud of the S-shaped weld seam is shown in Fig. 9.

3.3.2 Point cloud filtering

Although the accuracy of the weld features is refined in the image processing stage, noise points inevitably appear in the generated weld point cloud because of illumination and the effect of the dark metal surface on stereo matching. To obtain a more accurate weld seam trajectory, it is necessary to filter the point cloud and remove discrete points far from the weld seam. The point cloud data of the narrow weld seam can be defined as a point set \(P=\left\{ p_1,p_2,...,p_N\right\}\). The radius outlier removal filter is applied to remove the noise: for a point \(p_i\) in the point set P, if there are fewer than M points within a spherical radius R of \(p_i\), then \(p_i\) is regarded as a noise point. In addition, the statistical outlier removal filter is applied to remove sparse outliers.
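Both filters can be reproduced with a k-d tree; the sketch below uses SciPy rather than a dedicated point cloud library, and the radius R, minimum neighbor count M, and statistical parameters are illustrative values, not those used in the experiments.

```python
import numpy as np
from scipy.spatial import cKDTree

def filter_weld_cloud(P, radius=2.0, min_pts=5, k=10, std_ratio=1.0):
    """Radius outlier removal followed by statistical outlier removal.

    P: (N, 3) array of weld seam points in the camera frame (e.g. millimetres).
    """
    # Radius filter: keep a point only if at least min_pts other points lie
    # within the spherical radius.
    tree = cKDTree(P)
    counts = np.array([len(tree.query_ball_point(p, radius)) - 1 for p in P])
    P = P[counts >= min_pts]

    # Statistical filter: drop points whose mean distance to their k nearest
    # neighbours exceeds the global mean by std_ratio standard deviations.
    tree = cKDTree(P)
    d, _ = tree.query(P, k=k + 1)          # first column is the point itself
    mean_d = d[:, 1:].mean(axis=1)
    keep = mean_d < mean_d.mean() + std_ratio * mean_d.std()
    return P[keep]
```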

3.4 Curve fitting

Narrow welds can be fitted with a curve, which further improves the accuracy of the point cloud and reduces the number of points in subsequent calculations. The methods commonly used for welding robot trajectory fitting are polynomial fitting and spline function fitting. Compared with polynomial fitting, spline function fitting has higher accuracy and can fit irregular curves, so it better achieves the desired fitting effect. A parametric B-spline curve C(t) of order k is defined by

$$\begin{aligned} C(t)={\textstyle \sum _{i=0}^{m}}Q_iB_{i,k}(t) \end{aligned}$$
(8)

where \(Q_i(i=0,\dots ,m)\) are control points and \(B_{i,k}(t)\) are the normalized B-spline basis functions of order k defined on a knot vector. We now have a set of point cloud data that represent the shape of an unknown curve and contain considerable noise. To order the point data along the fitting curve, we minimize the error function

$$\begin{aligned} f=\frac{1}{2}{\textstyle \sum _{i=0}^{n}}d^2(C(t),X_i) \end{aligned}$$
(9)

where \(X_i(i=0,1,\dots ,n)\) are the sample points, and \(d(C(t),X_i)=min_t\left\| C(t)-X_i \right\|\) is the Euclidean distance from the data point \(X_i\) to the curve C(t). To solve this problem, the control points \(Q_i\) are considered to be column vectors and can be expressed as

$$\begin{aligned} \hat{Q} ={\begin{bmatrix} Q_0&Q_1&\cdots&Q_m \end{bmatrix}}^T \end{aligned}$$
(10)

Similarly, the sample points \(X_i\) are considered to be column vectors, and can be written as

$$\begin{aligned} \hat{X} ={\begin{bmatrix} X_0&X_1&\cdots&X_n \end{bmatrix}}^T \end{aligned}$$
(11)

Then, the error function can be expressed as

$$\begin{aligned} E(\hat{Q}) =\frac{1}{2} \sum _{p=0}^{n} \left\| \sum _{j=0}^{m}B_{j,k}(t_p)Q_j-X_p \right\| ^2 \end{aligned}$$
(12)

where \(t_p\in [0, 1]\) is the scaled sample time. The error function measures the total accumulated squared distance. To minimize the error, we find the point at which all of its first-order partial derivatives are zero. The first-order partial derivative with respect to the control point \(Q_i\) can be expressed as:

$$\begin{aligned} \frac{{\mathrm {d}E}}{{\mathrm {d}Q_i}}= {\sum _{p=0}^{n}{ \sum _{j=0}^{m}B_{i,k}(t_p) B_{j,k}(t_p)Q_j}} - { \sum _{p=0}^{n}B_{i,k}(t_p)X_p} \end{aligned}$$
(13)

Setting the first-order partial derivatives equal to the zero vector, we obtain the equations

$$\begin{aligned} \begin{aligned} 0&={ \sum _{p=0}^{n}{ \sum _{j=0}^{m}B_{i,k}(t_p) B_{j,k}(t_p)Q_j}} - { \sum _{p=0}^{n}B_{i,k}(t_p)X_p}\\&=A^TA \hat{Q}-A^T\hat{X} \end{aligned} \end{aligned}$$
(14)

where \(A=[B_{j,k}(t_p)]\) is a matrix with \(n+1\) rows and \(m+1\) columns. This equation can be regarded as a least-squares problem, and its solution can be represented as

$$\begin{aligned} \hat{Q}={(A^TA)}^{-1}A^T\hat{X}=[{(A^TA)}^{-1}A^T]\hat{X} \end{aligned}$$
(15)

The control points \(\hat{Q}\) can be obtained by solving Eq. 15. The filtered point cloud and the fitted curve are shown in Fig. 10.
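In practice, the least-squares B-spline fit of Eq. (15) can be delegated to an existing spline routine. The sketch below uses SciPy's splprep and orders the filtered points by their coordinate along the dominant axis of the seam before fitting; this ordering step and the number of resampled points are assumptions for illustration, not part of the method as described.

```python
import numpy as np
from scipy.interpolate import splprep, splev

def fit_weld_curve(P, smooth=None, n_samples=200):
    """Cubic B-spline least-squares fit of the filtered weld point cloud P (N x 3)."""
    # Order the points so the curve parameter increases monotonically along the
    # seam (here: assume the seam advances roughly along the X axis).
    P = P[np.argsort(P[:, 0])]
    # splprep solves the same normal equations as Eq. (15) for the control points.
    tck, _ = splprep([P[:, 0], P[:, 1], P[:, 2]], s=smooth, k=3)
    t = np.linspace(0.0, 1.0, n_samples)
    X, Y, Z = splev(t, tck)
    return np.column_stack([X, Y, Z])
```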

Fig. 10
figure 10

The curve fitting results of S-shaped weld point cloud

3.5 Trajectory matching

The points representing the weld seam are calculated from pixel coordinates and cannot be used directly by the welding robot. Moreover, although interpolation points can be sampled from the fitted curve at any desired spacing, it is difficult to obtain the complete weld trajectory from a single capture. Therefore, the fitted weld trajectory is matched with the weld trajectory in the database. This identifies the 3D trajectory, provides the complete planning information of the trajectory, and completes the true weld trajectory through the matching. The matched weld trajectory information can be used directly by the robot to perform the welding task.

From the 3D model of the workpiece, the model trajectory of the weld seam and the planning information can be generated. The model trajectory of the weld can be defined as \(W=\{w_1,w_2,... ,w_M \}\). The fitted trajectory computed from the stereo vision system can be defined as \(C=\left\{ c_1,c_2,... ,c_N\right\}\). A transformation T can adjust the position, pose, and scale between C and W. An objective function f can be defined. Then, the trajectory alignment problem for C and W can be represented by the following mathematical function:

$$\begin{aligned} T=argminf[T(C),W] \end{aligned}$$
(16)

where the transformation T consists of a rotation matrix R and a translation vector t. The problem can be expressed more specifically as

$$\begin{aligned} E(R,t) = \sum _{i=1}^{N}\sum _{j=1}^{M}s_{ij}\left\| c_i-(Rw_j+t)\right\| ^2 \end{aligned}$$
(17)

where \(s_{ij}\) are the weights of the corresponding point pairs. After the corresponding points are found, the rotation matrix and translation vector can be obtained by minimizing the squared error E(R, t).

Since \(N\ne M\), the first step toward the best correspondence between the given point sets is to calculate, for every point in the set C, the nearest point in the set W. The nearest distance can be defined as

$$\begin{aligned} d_i=min{|c_i-w_j |} \qquad j=1,..., M \end{aligned}$$
(18)

Since the ordering of the two point sets is inconsistent, a threshold \(d_t\) is defined and a pair of points is removed if \(d_i>d_t\). If the point \(w_j\) is the nearest point to \(c_i\), we set the weight \(s_{ij}=1\); otherwise \(s_{ij}=0\).

The acquired point clouds are unordered. To find the corresponding points more efficiently, a k-dimensional tree of W is built to quickly find the nearest point in W for each point in C [23]. The transformation (R, t) can be obtained based on singular value decomposition (SVD). First, the center points of both sets are computed as:

$$\begin{aligned} Center_c=\frac{1}{N} \sum _{i=1}^{N}c_i \end{aligned}$$
(19)
$$\begin{aligned} Center_w=\frac{1}{M} \sum _{i=1}^{M}w_i \end{aligned}$$
(20)

The new point set after alignment by the corresponding center point can be expressed as:

$$\begin{aligned} C'=\left\{ {c_i}'=c_i-Center_c\right\} \qquad i=1,..., N \end{aligned}$$
(21)
$$\begin{aligned} W'=\left\{ {w_i}'=w_i-Center_w \right\} \qquad i=1,..., M \end{aligned}$$
(22)

The covariance matrix can be computed by \(H=C'{W'}^T\), and the SVD can be written as:

$$\begin{aligned} H=U\Lambda V^T \end{aligned}$$
(23)

Then, the optimal solution of (R, t) can be computed by:

$$\begin{aligned} R=UV^T \end{aligned}$$
(24)
$$\begin{aligned} t=Center_c-RCenter_w \end{aligned}$$
(25)

In every iteration, E(R, t) is computed after transforming the point cloud W using the rotation matrix R and translation vector t. The iteration continues until the number of iterations reaches a threshold or E(R, t) falls below a given threshold. Finally, the weld trajectory is replaced by the transformed trajectory W.
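The whole matching loop can be sketched as a small NumPy/SciPy routine (a minimal illustration of Sect. 3.5 with assumed thresholds, not the authors' implementation): in every iteration, the nearest transformed model point is found for each fitted point, pairs farther than d_t are rejected, and (R, t) is re-estimated from Eqs. (19)–(25).

```python
import numpy as np
from scipy.spatial import cKDTree

def match_trajectory(C, W, d_t=5.0, max_iter=50, tol=1e-6):
    """Align the model trajectory W (M x 3) to the fitted trajectory C (N x 3).

    Returns R, t such that R @ w + t lies on the measured weld seam, so the
    complete, matched trajectory handed to the robot is W @ R.T + t.
    """
    R, t = np.eye(3), np.zeros(3)
    prev_err = np.inf
    for _ in range(max_iter):
        Wt = W @ R.T + t                       # model trajectory under the current transform
        d, j = cKDTree(Wt).query(C)            # nearest transformed model point for each c_i
        keep = d < d_t                         # Eq. (18): reject pairs beyond the threshold
        c, w = C[keep], W[j[keep]]
        cc, cw = c.mean(axis=0), w.mean(axis=0)        # Eqs. (19)-(20): centroids
        H = (c - cc).T @ (w - cw)                      # Eq. (23): H = C'W'^T
        U, _, Vt = np.linalg.svd(H)
        if np.linalg.det(U @ Vt) < 0:                  # guard against a reflection
            U[:, -1] *= -1
        R = U @ Vt                                     # Eq. (24)
        t = cc - R @ cw                                # Eq. (25)
        err = np.mean(np.sum((c - (w @ R.T + t)) ** 2, axis=1))   # Eq. (17) on the kept pairs
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R, t
```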

After trajectory matching, we establish a criterion for the similarity between trajectories, which is calculated from the distance between the fitted trajectory and the model trajectory and is defined as

$$\begin{aligned} Dist=\frac{1}{N} {\textstyle \sum _{i=1}^{N}d^2(W,c_i)} \end{aligned}$$
(26)

The trajectory pair with the smallest distance is taken as the matching trajectory. The weld trajectory planning for the welding robot can then be obtained by transforming the matched trajectory to the robot coordinate system.

4 Experiments and discussion

4.1 Experiment setup

Fig. 11
figure 11

Experimental equipment

To verify the effectiveness of the proposed method, the experiments were conducted using an AUBO collaborative robot with six degrees of freedom. The stereo vision system consists of two grayscale cameras with a 1/1.8-in. sensor and a resolution of \(3088(H)\times 2064(V)\) pixels. The cameras were mounted on the robot arm as shown in Fig. 11. The 3D models of the workpieces used in the experiment are shown in Fig. 12. An S-shaped narrow butt weld and a saddle-shaped weld without a groove were designed to verify the validity of the proposed method.

Fig. 12
figure 12

The 3D model of weld workpiece

4.2 Experiment results

Fig. 13
figure 13

The S-shaped weld seam trajectory detection result

In the experiment, the robot with the stereo vision system moves above the workpiece, with the left camera oriented as vertically downward as possible to capture the workpiece. The experiments were performed indoors with ambient light provided by fluorescent tubes in the ceiling. The results of aligning the S-shaped weld fitting trajectory extracted by the stereo vision system with the model trajectory are shown in Fig. 13. After trajectory matching, the updated S-shaped weld trajectory is complete and further reduces the influence of single-point errors.

Fig. 14
figure 14

The image processing results of the saddle-shaped weld seam. a The left image with high grayscale expectation value. b The left image after stereo rectification. c The right image after stereo rectification. d The weld seam extraction results of the left image. e The edge chain detection results

To verify the effectiveness of the proposed method for spatially narrow welds, experiments were also performed on saddle-shaped weld seams. One reason for using saddle-shaped weld seams is to verify the effectiveness of the method for narrow welds that follow complex spatial curves. Owing to the properties of the optical sensor, complex curved welds may be occluded by the workpiece itself, so only a local section of the weld can be captured. Secondly, the saddle-shaped seam is used by many researchers as a typical complex weld seam, which makes it convenient to compare the experimental results with previous similar studies. The image processing results for the saddle-shaped weld seam workpiece are shown in Fig. 14, and the results of the point cloud processing of the saddle-shaped weld seam are shown in Fig. 15. The accuracy of the initial point cloud is relatively poor, and the weld seam cannot be extracted completely because of the camera pose. However, the complete weld seam trajectory can still be obtained by fitting the point cloud and matching the trajectory.

Fig. 15
figure 15

The saddle-shaped weld seam trajectory detection result

4.3 Error analysis

Fig. 16
figure 16

The root mean square error between matching trajectories of the S-shaped weld seam

Fig. 17
figure 17

The root mean square error between matching trajectories of the saddle-shaped weld seam

The method proposed in this paper refines the 3D position of the weld trajectory twice: first by curve-fitting the weld point cloud to eliminate part of the point cloud error, and second by further eliminating error through trajectory registration. The error analysis is accordingly divided into two parts: the alignment error between the fitted trajectory and the model trajectory, and the error between the aligned trajectory and the real trajectory in the robot coordinate system. The matching error between the fitted trajectory and the model trajectory is used to verify the effectiveness of the matching method. Figures 16 and 17 show the root mean square errors between the fitted sampling points and the matched model trajectory for the S-shaped and saddle-shaped weld seams, respectively. Compared with the S-shaped weld, the matching accuracy of the saddle-shaped weld seam is relatively low. The main reasons are that the section of the saddle-shaped weld seam trajectory obtained from the vision system is short relative to the overall trajectory, and that the rougher surface of the saddle-shaped seam causes disparity deviations. The matching results show that both weld shapes can be matched well and that the matching accuracy is within an acceptable range.

To evaluate the error between the updated trajectory obtained after matching and the real trajectory, we calibrate the robot tool center point, the camera coordinate system, and the robot coordinate system. By means of manual robot demonstrations, experiments are performed on real weld seams to verify the error between the acquired trajectory and the actual trajectory. The maximum and root mean square errors for the S-shaped and saddle-shaped weld seams are shown in Tables 1 and 2, respectively. Considering the calibration error between coordinate systems and the error of the teaching process, the proposed method can guide the welding robot to complete the welding task.

Table 1 The S-shaped weld seam
Table 2 The saddle-shaped weld seam

4.4 Comparison with the existing method

To better validate the proposed algorithm, it is compared with the algorithms proposed in [6] and [7]. All of these studies conducted experiments on S-shaped narrow butt welds, and Chen also conducted experiments on intersecting-curve welds. Since neither Chen nor Dinham reports the root mean square error, we compare the maximum detection error for different types of welds. The comparison results are shown in Table 3. Our method has accuracy similar to Dinham's method for planar curved welds; however, for spatially curved welds, the proposed method achieves higher accuracy. Compared with the previous methods, the proposed method is also capable of matching and completing the weld trajectory.

Table 3 The maximum error of different methods

5 Conclusion

This paper presents a method based on binocular vision to identify narrow weld seams and obtain their 3D locations. The proposed method not only realizes the 2D detection of narrow welds in images but also acquires the 3D coordinates of the welds, improving the weld seam recognition capability of the welding robot system. With known weld trajectory planning information, the autonomy of the welding robot is improved, and the assembly accuracy requirements for the workpiece are reduced. The main contributions of this paper are as follows:

  1. 1.

    A method to enhance the weld seam features by adjusting the grayscale expectation value is proposed. Combined with edge chain detection, the weld seams in the image are extracted.

  2. 2.

    The sparse reconstruction of spatially curved welds is achieved on the basis of stereo vision techniques to obtain 3D information about the welds.

  3. 3.

    The trajectory matching method is proposed to realize the completion of the weld trajectory for robot welding trajectory planning. In future research, we will further utilize the point cloud information acquired by sensors to improve the autonomy of the welding robot.