1 Introduction

Augmented reality (AR) technology has been applied in various fields, such as gaming and medical training [1]. In engineering, assembly instruction is a promising application of AR. Through a head-mounted display (HMD) or a tablet, an AR system can display a computer-aided design (CAD) model of a component at its installation location in an assembly, guiding an operator through the assembly operation step by step. For complex assemblies, it is beneficial to provide the operator with concise and effective instructions for efficient operation and training, instead of relying on the operator to read a manual during assembly.

An AR-based assembly assistance system must be able to show CAD models (virtual objects) at their correct positions to give reliable instructions. That is, the system should determine precise transformation relationships between coordinate systems, such as those of the real world, of the virtual world in which the CAD models are defined, and of the user’s position. This is commonly realized by using AR markers [2] to evaluate the transformation matrices between the coordinate systems. In these arrangements, in addition to the HMD, several external devices, such as AR markers and an RGB-D camera, are typically prepared and set up before the operation. This limits the working area and increases the time taken to set up the instruments. In addition to assembly instruction, it is important to detect in real time whether misassembly occurs during the operation [3]. Preventing assembly errors can avoid unexpected increases in assembly time or serious damage to the assembled product. Thus, it is critical to evaluate whether the real object is placed in the target position before moving on to the next assembly stage.

In this research, we aimed to use an AR HMD, Microsoft HoloLens, as the main device to establish a basic AR-based assembly assistance system that can display CAD models to the user for assembly instruction and can simultaneously evaluate any misalignment between the virtual and the real objects. Two methods are proposed to provide these functions: coordinate calibration and efficient evaluation of misalignment. Experiments aligning multiple primitive blocks were also conducted to preliminarily assess the system performance.

In Sect. 2, we review the related work on AR-based assembly assistance systems. The core design of our system is described in Sect. 3. Preliminary experiments and a system assessment are described in Sect. 4, and Sect. 5 concludes the paper.

2 Related Work

Several studies have developed AR-based systems that assist assembly operations using various methods. Microsoft HoloLens is a commonly used commercial HMD. Evans et al. [4] used HoloLens to construct a system that performed assembly operations in a room-scale environment. However, although HoloLens can scan the real world by using an embedded depth camera and generate meshes of a scene, the meshes are not detailed enough to track real objects and support an assembly application. Instead, AR markers were placed in the environment and used to precisely define the position of a real object. Radkowski et al. [5] used an RGB-D camera to track a real object inside the sensing zone of the camera and made a CAD model follow the real object.

In addition to essential functions for assembly instruction, Mura et al. [2] and Alves et al. [3] developed methods to detect assembly errors during the operation. External devices such as an RGB camera and a computing system were needed to evaluate the errors in real time. This verification provided better training to the operators and offered the potential to prevent serious damage to products.

Thus, AR technology can make assembly operations more effective and efficient. However, external devices limit the scalability of the working area, and the initial device setup may take time. These issues motivated us to develop a system that uses only an HMD to achieve the functions of assembly instruction and verification. Relying on a single device makes the system more adaptable to environments and products of various scales.

3 Core Design of the AR-Based Assembly Assistance System

Our target system allowed a user to assemble parts in their correct locations in an assembly. These correct locations were specified in a CAD model that was supplied for the assembly. Thus, the system told the user how closely a real part was aligned to its CAD model while the user was trying to place the real part in the assembly. Figure 1 shows the proposed system’s process. First, we performed coordinate calibration and determined the transformation relationship between the virtual and the real worlds to display the CAD model in the working area. Next, the system started to show the CAD model at its installation location to let the user understand which real object should be chosen and where it should be placed at each assembly stage. During the assembly operation, the system evaluated whether misassembly had occurred and gave a warning to the user if an instance was found. The process continued until the assembly operation ended. Coordinate calibration and evaluation of misalignment are the cores of the system and are described in Sects. 3.2 and 3.3, respectively.

Fig. 1. Process of the AR-based assembly assistance system

3.1 Microsoft HoloLens

HoloLens served as the HMD and was used for the entire computation. It has several embedded sensors to capture the physical world, including four grayscale environment-sensing cameras to determine where the user is in the real world, an RGB camera to record the user’s view, and a depth camera using a time-of-flight technique to scan surfaces of real objects. HoloLens defined a real-world coordinate system, and its origin was at the initial location where the system started up. This real-world coordinate system defined the user’s position and the CAD model positions in the real world to demonstrate the correct placement of the CAD models to the user.

3.2 Coordinate Calibration

Our assembly application required high measurement accuracy because the target workspace was a small desktop area, much smaller than the working area of typical HoloLens applications. Such accuracy cannot be achieved without calibration. Moreover, we needed to define the working area so that the system displayed CAD models within an expected region. When the system started up, the origin of the virtual world was predefined somewhere in the real world. Therefore, the first step was to transform the origin of the virtual world to the working area. As shown in Fig. 2, the CAD model located at the origin of the virtual world and its corresponding real object were used as references for this calibration. The real object was first placed in the working area. Then, the transformation relationship was determined by aligning the CAD model to the real object position and computing the transformation matrix \( \varvec{T}_{V \to R} \). Afterward, the assembly’s CAD models were shown relative to the reference object in the real world. The supplied CAD data included both the 3D models of the parts and their positions in the assembly.
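As an illustration of how a calibration result of this kind can be used, the following minimal sketch applies a 4x4 homogeneous matrix, standing in for \( \varvec{T}_{V \to R} \), to part positions defined in the virtual world. The helper names `make_transform` and `to_real_world`, the restriction to a rotation about the vertical axis, and all numeric values are assumptions made for this example; they are not taken from the system described here.

```python
import numpy as np

def make_transform(rotation_z_deg, translation):
    """Build a 4x4 homogeneous transform: rotation about z, then translation."""
    theta = np.radians(rotation_z_deg)
    c, s = np.cos(theta), np.sin(theta)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T

def to_real_world(T_v_to_r, points_virtual):
    """Map Nx3 points from the virtual frame into the real-world frame."""
    pts = np.asarray(points_virtual, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    return (T_v_to_r @ homogeneous.T).T[:, :3]

# Hypothetical calibration result (values invented for illustration):
# the reference object is rotated 90 degrees and offset from the virtual origin.
T_v_to_r = make_transform(90.0, [0.5, 0.0, 0.2])

# A part defined 10 cm along the virtual x axis from the reference object.
part_position_virtual = np.array([[0.1, 0.0, 0.0]])
part_position_real = to_real_world(T_v_to_r, part_position_virtual)
```

Because every part position is expressed relative to the reference object, a single matrix is enough to place the whole assembly in the working area.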

Fig. 2. Schematic plot of coordinate calibration

The alignment consisted of two steps: rough alignment and precise alignment. For rough alignment, with a function in the Mixed Reality Toolkit [6] developed by Microsoft, the embedded depth camera tracked the user’s hand manipulations and let the user move and rotate the CAD model manually, as shown in Fig. 3. Thus, the user could roughly place the CAD model at the corresponding real object’s position.

Fig. 3. Rough alignment through the user’s hand manipulations

Next, precise alignment was applied to make the CAD model almost overlap the real object. We used the point-to-plane iterative closest point (ICP) algorithm [7, 8] to minimize the difference in positions of the CAD model and the real object. To apply the point-to-plane ICP, surface information about the real object was necessary. Thus, we obtained and used raw point cloud data of the real object scanned by the depth camera. This eventually led to precise alignment between the CAD model and the real object and gave us \( \varvec{T}_{V \to R} \) for coordinate calibration.
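The idea behind point-to-plane ICP can be sketched as follows. This is a generic textbook formulation under stated assumptions, not the implementation used on HoloLens: correspondences are found by brute-force nearest-neighbour search, the rotation is linearised at each iteration, and surface normals are assumed to be available for the target points. Which point set acts as source or target, and the function names `rotation_from_vector` and `point_to_plane_icp`, are choices made for this example only.

```python
import numpy as np

def rotation_from_vector(w):
    """Rodrigues formula: rotation matrix for the axis-angle vector w."""
    angle = np.linalg.norm(w)
    if angle < 1e-12:
        return np.eye(3)
    k = w / angle
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)

def point_to_plane_icp(source, target, target_normals, iterations=10):
    """Estimate the 4x4 rigid transform that moves `source` onto `target`."""
    T = np.eye(4)
    src = source.copy()
    for _ in range(iterations):
        # Brute-force nearest-neighbour correspondences (fine for small clouds).
        d2 = ((src[:, None, :] - target[None, :, :]) ** 2).sum(axis=2)
        idx = d2.argmin(axis=1)
        q, n = target[idx], target_normals[idx]
        # Linearised residual per point: (p - q).n + w.(p x n) + t.n
        A = np.hstack([np.cross(src, n), n])
        b = -np.einsum("ij,ij->i", src - q, n)
        x, *_ = np.linalg.lstsq(A, b, rcond=None)
        step = np.eye(4)
        step[:3, :3] = rotation_from_vector(x[:3])
        step[:3, 3] = x[3:]
        src = src @ step[:3, :3].T + step[:3, 3]
        T = step @ T  # accumulate the incremental updates
    return T

# Synthetic check: three orthogonal faces near a cube corner act as the target.
g = np.linspace(0.1, 1.0, 10)
a, b = (m.ravel() for m in np.meshgrid(g, g))
z = np.zeros_like(a)
target = np.vstack([np.c_[z, a, b], np.c_[a, z, b], np.c_[a, b, z]])
normals = np.repeat(np.eye(3), len(a), axis=0)

# Hypothetical small error left over after rough alignment.
T_offset = np.eye(4)
T_offset[:3, :3] = rotation_from_vector(np.radians([0.0, 0.0, 3.0]))
T_offset[:3, 3] = [0.01, -0.02, 0.015]
source = target @ T_offset[:3, :3].T + T_offset[:3, 3]

T_est = point_to_plane_icp(source, target, normals)
```

In this synthetic case the estimated transform should undo the offset, so `T_est @ T_offset` is close to the identity. The sketch also indicates why rough alignment comes first: the nearest-neighbour correspondences are only meaningful when the two point sets already lie close to each other.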

3.3 Efficient Evaluation of Misalignment Between Virtual and Real Objects

Misalignment was estimated based on the distance between the CAD model and the point cloud of the real object. For efficiency, instead of computing in 3D, which would involve finding corresponding points between the CAD model and the real object and calculating their distances, we compared the depth maps of the CAD model and the real object. Figure 4 shows the process of obtaining depth maps for a cube on a table. We set up two virtual depth cameras at the same location as the physical one and read the Z buffers that the graphics processing unit in HoloLens generated for the CAD model and for the point cloud of the real object. Thus, we obtained the depth maps of the CAD model and the real object from the same viewpoint.
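To make the notion of a depth map from a given viewpoint concrete, the sketch below reproduces the effect of a Z buffer on the CPU: points are projected with a pinhole camera model and, for each pixel, only the nearest depth is kept. It is a stand-in for illustration and does not use the GPU pipeline described above; the function name `render_depth_map` and the intrinsic parameters (`fx`, `fy`, `cx`, `cy`) are assumed values, not HoloLens specifications.

```python
import numpy as np

def render_depth_map(points_cam, fx, fy, cx, cy, width, height):
    """Project Nx3 camera-frame points to a depth map, keeping the nearest depth per pixel."""
    depth = np.full((height, width), np.inf)  # inf marks pixels with no geometry
    pts = points_cam[points_cam[:, 2] > 0]    # keep points in front of the camera
    u = np.round(fx * pts[:, 0] / pts[:, 2] + cx).astype(int)
    v = np.round(fy * pts[:, 1] / pts[:, 2] + cy).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    # Unbuffered per-pixel minimum: the equivalent of the Z-buffer depth test.
    np.minimum.at(depth, (v[inside], u[inside]), pts[inside, 2])
    return depth

# Hypothetical camera and three sample points (two fall on the same pixel).
points = np.array([[0.0, 0.0, 1.0],
                   [0.0, 0.0, 2.0],
                   [0.2, 0.0, 2.0]])
depth = render_depth_map(points, fx=100.0, fy=100.0, cx=16.0, cy=16.0,
                         width=32, height=32)
```

Rendering the CAD model and the scanned point cloud with the same camera parameters yields two images whose pixels can be compared directly, which avoids an explicit 3D correspondence search.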

Fig. 4. Process of obtaining depth maps of the CAD model and the real object

From the two depth maps, we evaluated the misalignment by calculating the average and the standard deviation of the depth value difference. In Fig. 4, for each pixel \( \left( {u_{i} , \,v_{i} } \right) \) that belonged to the CAD model in its depth map, we computed the difference \( d_{i} \) between its depth value \( p\left( {u_{i} , \,v_{i} } \right) \) and the one \( q\left( {u_{i} , \,v_{i} } \right) \) in the same pixel position in the depth map of the real object; that is, \( d_{i} = p\left( {u_{i} ,\, v_{i} } \right) - q\left( {u_{i} , \,v_{i} } \right) \). We then calculated the average \( D_{avg} \) and the standard deviation \( D_{std} \) of \( \left\{ {d_{i} } \right\} \) as indicators for evaluation. If the real object aligned well to the CAD model, both \( D_{avg} \) and \( D_{std} \) would become close to zero; otherwise, one or both of them would be far from zero.
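The computation of the two indicators can be sketched as follows. The sketch assumes that pixels not covered by the CAD model are marked as infinite in its depth map, that pixels without a valid real-object depth are skipped, and that the population standard deviation is intended; these conventions and the function name `misalignment_indicators` are assumptions of this example.

```python
import numpy as np

def misalignment_indicators(depth_cad, depth_real):
    """Return (D_avg, D_std) of d_i = p(u_i, v_i) - q(u_i, v_i) over CAD-model pixels."""
    # Only pixels that belong to the CAD model (finite depth) are evaluated.
    mask = np.isfinite(depth_cad) & np.isfinite(depth_real)
    if not mask.any():
        raise ValueError("the CAD model covers no valid pixels")
    d = depth_cad[mask] - depth_real[mask]
    return float(d.mean()), float(d.std())

# Tiny 2x2 example in metres; the top-left pixel is outside the CAD model.
depth_cad = np.array([[np.inf, 1.00],
                      [1.00,   1.00]])
depth_real = np.array([[2.00, 1.00],
                       [1.02, 0.98]])
d_avg, d_std = misalignment_indicators(depth_cad, depth_real)
```

In this example the differences are 0, -0.02, and 0.02 m, so \( D_{avg} \) is zero while \( D_{std} \) is about 0.016 m. The case shows why both indicators are useful: a tilted object can leave the average near zero while the standard deviation grows.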

4 Experiment

To preliminarily evaluate system performance, we conducted an experiment in which wooden blocks were aligned on a table, to verify whether the system worked when no occlusion was present. In Fig. 5(a), a wooden cube with an edge of 10 cm was used as a reference object for coordinate calibration. Four primitive blocks (two small cubes, one cuboid, and one cylinder) were used for the assembly operation and were to be installed in the order designed and numbered in Fig. 5(b). For the indicators \( D_{avg} \) and \( D_{std} \), based on preliminary tests, we determined that misalignment did not occur if \( \left| {D_{avg} } \right|\; \le \;1.8 \) cm and \( D_{std} \; \le \;1.5 \) cm; otherwise, misalignment occurred.
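The decision rule above can be written as a short check. The thresholds of 1.8 cm and 1.5 cm are those stated in the text; the function names and the mapping to a feedback color are illustrative assumptions.

```python
# Thresholds reported from the preliminary tests, in centimetres.
D_AVG_LIMIT_CM = 1.8
D_STD_LIMIT_CM = 1.5

def is_aligned(d_avg_cm, d_std_cm,
               avg_limit=D_AVG_LIMIT_CM, std_limit=D_STD_LIMIT_CM):
    """Aligned only if |D_avg| and D_std are both within their limits."""
    return abs(d_avg_cm) <= avg_limit and d_std_cm <= std_limit

def feedback_color(d_avg_cm, d_std_cm):
    """Hypothetical helper: color used to tint the CAD model for the user."""
    return "green" if is_aligned(d_avg_cm, d_std_cm) else "red"

print(feedback_color(0.6, 0.4))   # small offset, small spread
print(feedback_color(-2.5, 0.4))  # object too far from the target depth
```

The absolute value is applied only to \( D_{avg} \) because it is a signed quantity, whereas \( D_{std} \) is non-negative by definition.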

Fig. 5. Experimental setup: (a) real objects and (b) designed alignment of parts in the CAD model

4.1 Assembly Operation and Evaluation of Misalignment

When the system started, the user first aligned the reference CAD model to the corresponding cube to perform coordinate calibration. This was achieved first through rough alignment, as shown in Fig. 6(a), and then through precise alignment, as shown in Fig. 6(b). During the assembly operation, the system evaluated whether the user had set each real object correctly and indicated the result by coloring the CAD model. In Fig. 6(c), which shows placement of the second cube, the cube was initially not set in the desired location; thus, the CAD model was colored red. After the placement was corrected, as shown in Fig. 6(d), the color changed to green, indicating that the user could continue to the next step. Figures 6(e) and 6(f) show the completion of the entire assembly operation.

Fig. 6. Assembly process: coordinate calibration after rough alignment (a) and precise alignment (b), (c) and (d) detection of misalignment during the operation, and (e) and (f) completion of the assembly process

4.2 Discussion

During the process, the system could display clear CAD models as instructions. To visualize the CAD models at the desired positions, the most important factor is a good result from coordinate calibration. However, because of errors in the measured depth values, the point cloud from the depth camera could not always represent the reference real object well. When the point cloud deviates from the real object in this way, the results of precise alignment, and hence of coordinate calibration, degrade. This affects the subsequent assembly process, for example through unexpected shifting and tilting of the CAD models.

For evaluation of misalignment, the system takes about 0.08 s per complete computation; thus, it provides real-time evaluation. The designed indicators \( D_{avg} \) and \( D_{std} \) usually give a correct assessment. This helps the user set the real object at the correct distance from the user’s position, which is otherwise difficult because users have low sensitivity to differences in depth.

5 Conclusion

AR technology is a promising way to improve the experience of assembling a product. Besides assembly instruction from an AR-based system, it is important to determine whether assembly errors occur. In this research, we proposed two methods to construct a basic AR-based assembly assistance system using only an HMD, in this case Microsoft HoloLens. With coordinate calibration, we can define the working area and display the CAD models at the proper positions. Moreover, with efficient evaluation of any misalignment between the virtual and the real objects, the user can rapidly understand the installation status. If an assembly error occurs, the system warns the user immediately. A preliminary experiment conducted to evaluate the performance shows that the system can give clear instructions and correct evaluation of misalignment to help the user complete the assembly task. Future work will continue to improve system reliability and performance and add functions to better instruct the assembly operation. With these features, this compact system is highly portable and is expected to be usable in a wide range of situations.