
1 Introduction

Virtual reality (VR) is also known as a "temporary environment" or "spiritual environment" [1]. The virtual world is generated by computer and electronic technology. Users can interact with the virtual world through hearing, touch, smell and a series of actions, and these interactions change the objects in the virtual reality environment. Through this closed-loop system, users get an immersive, immediate experience [2]. Virtual reality is a natural interaction technology [3]. It reflects the most advanced achievements in computer technology, computer simulation and parallel processing [4]. A standard virtual reality system consists of a computer, input/output devices, applications and databases [5].

A virtual assembly training system is a typical application of virtual reality, and it places high demands on direct manipulation (DM) methods. Through various input devices, trainees can perform prototyping, verification, simulation, assembly path planning and training [6, 7]. Domestically, virtual reality technology has mainly been used in science education and exhibition, while abroad it has been applied to training, simulation and testing.

2 Background of Virtual Reality

2.1 Status Quo of Virtual Reality System

Virtual assembly environments can be divided into the following categories: desktop systems, helmet-mounted systems, CAVE systems and large-screen projection systems [8]. These system types are shown in Table 1.

Table 1. The virtual assembly environment

The system's software was developed with Unity 3D, an integrated game development tool that supports the creation of three-dimensional video games, architectural visualization, real-time three-dimensional animation and other interactive content published on multiple platforms; its functionality is largely implemented through scripting.

Three-dimensional interactive displays can be divided into the following categories according to demand: (1) naked-eye (autostereoscopic) three-dimensional displays, with brands such as Magnetic, SuperD and Alioscopy; users do not need to wear 3D glasses to view three-dimensional images, but resolution, viewing angle and viewing distance still have many deficiencies. (2) Polarized stereoscopic displays, with brands such as Zalman, TransviDeo and iZ3D; users can enjoy 3D images with polarized stereoscopic glasses, but the resolution is halved and the technology is difficult. (3) Stereoscopic projection systems, with brands such as Christie 3D, Perception and DepthQ DP; these can project onto any screen, but the display space is limited and the operator needs professional skills.

There are usually two main criteria for judging a virtual assembly training system: whether the system correctly understands the user's intention, and whether it correctly handles interaction semantics. Understanding the user's intention makes human-computer interaction more natural and harmonious, and a semantic structure facilitates well-structured information and knowledge. The former requires research on user behavior, while the latter needs DM methods and multi-channel information input [9-12]. To some degree, users' experience and professional knowledge limit their behavior and their interaction with the virtual environment, so the limitations of human perception must be taken into account and combined with interactive scenarios and actions to reach the target [13, 14].

2.2 Status Quo of Virtual Training System

The design and production of a model is complicated, systematic work in industrial production, involving many specialties, complex systems and special working spaces. These technical difficulties require workers to have a high level of skill and experience. The original assembly training method was for experienced operators to teach new operators using blueprints, but project files are not vivid enough, and operators are reluctant to read text.

The system is mainly used for training and display. On the one hand, it improves teaching efficiency and helps operators understand every aspect of the device intuitively; on the other hand, it makes the key technical points and features of the product more vivid and easier for visitors to understand. Because of their lack of experience, users are curious about the system when they first use it. Designers take advantage of this to guide users to experience the characteristics of the infrared stylus and polarized glasses, and gradually introduce the tasks of the scene and the functions of the menu.

2.3 Interactive Technology of Virtual Training Platform

zSpace was chosen as the display platform because of its user-friendly operation. It facilitates exhibition and can be used in training, testing and simulation analysis. Developed by Infinite Z, the system is an immersive three-dimensional display platform that provides both hardware and software solutions. It allows users and developers to observe objects in virtual space in a holographic-projection-like way.

By tracking the markers on the polarized glasses to determine their position and rotation, the system adjusts the rendered view to the user's perspective in real time and alternates left- and right-eye perspective images at 60 or 120 fps. Each polarizing lens of the glasses has the same polarization direction as the corresponding eye's image, which keeps the two images separate [15, 16]. The observer then perceives depth as if viewing the real thing. Polarized glasses are a non-holographic 3D display technology; compared with other types of glasses, they offer high resolution, high contrast, high comfort, a large range of viewing angles and low manufacturing cost.
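The following is a minimal Unity C# sketch of such head-tracked stereo rendering. It is illustrative only: the class, field names and the source of the tracked pose are our assumptions, since the actual zSpace SDK supplies this functionality itself.

```csharp
using UnityEngine;

// Hypothetical head-tracked stereo rig: positions two eye cameras from a
// tracked glasses pose each frame. Illustrative; not the zSpace SDK.
public class TrackedStereoRig : MonoBehaviour
{
    public Camera leftEye;
    public Camera rightEye;
    public Transform trackedGlasses;      // pose supplied by the tracking system
    public float interpupillary = 0.064f; // eye separation in meters (assumed)

    void LateUpdate()
    {
        // Offset each camera half the eye separation along the glasses' right axis.
        Vector3 half = trackedGlasses.right * (interpupillary * 0.5f);
        leftEye.transform.SetPositionAndRotation(
            trackedGlasses.position - half, trackedGlasses.rotation);
        rightEye.transform.SetPositionAndRotation(
            trackedGlasses.position + half, trackedGlasses.rotation);
    }
}
```

The display then alternates the two cameras' images at the panel's refresh rate, and the matched polarization of lenses and images keeps each eye's view separate.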

From a hardware perspective, it combines the advantages of desktop, helmet-mounted and CAVE-style systems: the cost is low, the sense of immersion is strong, and vertigo is reduced. It also supports viewing by multiple people. Compared with data gloves or an operating handle, the stylus can manipulate 3D objects more accurately and quickly. From a software perspective, it provides a software development kit for Unity3D, which reduces development difficulty. The interaction builds on everyday human experience, reducing the burden of learning and memory: one can use the infrared stylus to pick up a 3D object and examine its details with proprioception, just as one would observe an object in real life. The platform thus provides a natural, unconscious mode of operation (Fig. 1).

Fig. 1. zSpace

3 Design of Virtual Training System

3.1 The Interactive Tasks of the System

The system uses the infrared stylus and polarized glasses to obtain operation information, and outputs results via the monitor. The main interaction is as follows: the system determines whether the target object can be picked up via collision detection; the user then moves the object to the target position and assembles it in accordance with the logic and safety requirements. Figure 2 shows the configuration of the system.

Fig. 2. Overall system description

The requirements are: (1) the system can simulate assembly scenarios; (2) it analyzes whether the assembly sequence logic is correct; for instance, operations must follow a certain sequence, and the battery will interfere with other components if the user tries to remove or install it out of order; (3) it gives feedback about erroneous operations. Based on the principle of operational safety, the system should simulate the various operating results and the interference between components, as sketched below.
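To illustrate requirement (2), a minimal C# sketch of assembly-sequence checking follows. The step names and the prerequisite table are hypothetical, not taken from the actual system; the point is only that a step is legal once all of its prerequisites are complete.

```csharp
using System.Collections.Generic;

// Hypothetical prerequisite-based sequence check for assembly training.
public class AssemblySequenceChecker
{
    private readonly Dictionary<string, string[]> prerequisites =
        new Dictionary<string, string[]>
        {
            { "RemoveCover",    new string[0] },
            { "InstallBattery", new[] { "RemoveCover" } },  // battery interferes with the cover
            { "ReattachCover",  new[] { "InstallBattery" } },
        };

    private readonly HashSet<string> completed = new HashSet<string>();

    // Returns true and records the step if it is allowed now; otherwise the
    // system should give the trainee feedback about the erroneous operation.
    public bool TryPerform(string step)
    {
        string[] pres;
        if (!prerequisites.TryGetValue(step, out pres)) return false; // unknown step
        foreach (var pre in pres)
            if (!completed.Contains(pre)) return false; // out-of-order operation
        completed.Add(step);
        return true;
    }
}
```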

Interactive behavior includes: (1) operations between the input devices and the target objects, UI and scene; (2) interface interaction, i.e. presenting the appropriate menu depending on the selected model; (3) the states of the 3D object models: move, grab and release, with a different color for each state change. A minimal sketch of such a state handler follows.
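The sketch below shows one way the per-object states and color feedback in (3) could be wired up in Unity; the state names match the text, but the color palette and class name are our assumptions.

```csharp
using UnityEngine;

// Hypothetical object-state handler: an assembly part is Idle, Grabbed or
// Moving, and its tint changes with the state so trainees see feedback.
public class PartState : MonoBehaviour
{
    public enum State { Idle, Grabbed, Moving }

    private State state = State.Idle;
    private Renderer rend;

    void Awake() { rend = GetComponent<Renderer>(); }

    public void SetState(State next)
    {
        state = next;
        // Illustrative color coding; the real palette is a design choice.
        switch (state)
        {
            case State.Grabbed: rend.material.color = Color.yellow; break;
            case State.Moving:  rend.material.color = Color.cyan;   break;
            default:            rend.material.color = Color.white;  break;
        }
    }
}
```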

The task flow of installing the battery onto the tooling plate is extracted, as shown in Table 2.

Table 2. The task flow of installing the battery to the tooling plate

The character models are extracted based on the task segments above, as shown in Fig. 3. Users can smoothly understand the menu functions while completing the task.

Fig. 3. Assembly tasks

3.2 The Design Flow of the System

Through expert interviews, the designers initially identified the needs of the assembly training system and learned the specific craft process. They focused on the main functions, interfaces, interaction and task flows of existing software, then summarized their advantages and disadvantages. Figure 4 shows the low-fidelity prototype of one scene.

Fig. 4. Low-fidelity prototype

In the second expert interview, the designers verified the rationality of the prototype's framework and confirmed the details of the software functions, and a typical assembly scene to be applied in the display system was discussed and selected. In the third phase, the user roles, functional framework and interaction were confirmed. The requirements of each module are shown in Table 3.

Table 3. Requirements of each module

The system consists of two modules: the three-dimensional display part and the interactive part. This article focuses on the 3D interactive function module, i.e. the training and assessment system. Trainees are asked to operate the system in the correct flow sequence and are reminded when errors occur, so that they can master the techniques through repeated practice. The assessment system aims to test the training effect.

Roles in the scene: visitors can pick up and magnify the structure and observe it in a precise, interactive way; different modules are opened to different users, so that designers, operators and workers are all able to test and operate. Figure 5 shows the information architecture of the typical assembly module.

Fig. 5. The information architecture of the typical assembly module

Figure 6 shows the final interface of the system.

Fig. 6. (a, b) Final interface

4 The Definition and Implementation of Interaction in the Virtual Training System

4.1 Interactive Features of the Virtual Training System in Terms of Software

In the real world, convergence and focus (accommodation) are coupled. When looking at a screen, however, the human eye must focus on the screen plane while converging at the varying depths of the 3D image. When the depth of the 3D image matches the depth of the screen, convergence and focus are in the coupled state; this is the main space for viewing 3D scenes. To see a 3D object in front of or behind the screen, i.e. outside the coupled state, the user must separate the convergence point from the focal point to adapt to changes in depth, which increases eye fatigue. The 3D effect is, to some extent, proportional to the decoupling between convergence point and focal point, but when the separation exceeds a certain range, the three-dimensional illusion breaks down completely.
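As a worked illustration of this relationship (standard stereoscopic geometry, not a formula from the paper): let e be the eye separation, D the viewing distance to the screen, and z the distance from the viewer to a displayed point. The on-screen parallax between the two eye images is then

```latex
P = e\left(1 - \frac{D}{z}\right)
```

so P = 0 for a point on the screen plane (z = D, the fully coupled state), P approaches e for distant points behind the screen, and P turns negative (crossed) for points in front of the screen. The larger |P| is, the stronger the decoupling between convergence and focus.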

The three-dimensional image space is composed of coupled and decoupled zones. It can be divided into three regions: within the screen, on the screen and off the screen. Understanding the advantages and disadvantages of each region helps improve the experience of applications.

On-screen coupling region: when an object is near the coupling region, the human eye is more comfortable, so most of the scene should be placed in the coupling region to maximize the stereoscopic effect. The screen surface in the virtual world is fixed and does not move with the glasses, so the UI should also lie in this region. The decoupled area within the screen can be used to strengthen the depth of the background scene: even when the audience's attention remains in the coupling region, depth in the background can make it look broader, especially when the background is outer space; however, ghosting should be avoided in high-contrast areas. As shown in Fig. 7, the background of the display module is the universe; the satellite is placed in this space and in the operating state, which makes it look more realistic.

Fig. 7. Display module

The decoupled off-screen area is the most striking. An object appears to float in the air, breaking the shackles of the screen and jumping out of the virtual world. Applications should be designed to encourage users to take objects out of the screen to experience this 3D effect. As shown in Fig. 8, the user "takes out" the battery from the screen to observe its structure; the red line is the ray emitted by the infrared stylus.

Fig. 8. "Take out" the battery from the screen

The application can create a virtual world larger than the physical space, which allows users to explore the virtual space by moving the position of the glasses [17]. As shown in Fig. 9, users use the glasses or the stylus to view the virtual factory space.

Fig. 9. (a, b) The factory in the typical assembly scene

4.2 Interactive Features of the Virtual Training System in Terms of Hardware

Existing virtual assembly training systems are mostly PC-based. Input comes from a mouse, keyboard or handle; output includes the stereoscopic world within the screen, voice and other multimedia. Interacting with the stereoscopic world via mouse or keyboard means emitting a ray from a point on the screen and controlling the object the ray hits, so most operations are only valid in two-dimensional space. When users need to move an object vertically or rotate it, the interaction can become complex. Few systems use the middle mouse button as a third source of information to control the object. An operating handle can be used for scene roaming, but it has limitations when operating on a single object.
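A minimal Unity C# sketch of this conventional mouse-ray picking approach (illustrative, not the system's actual code) makes the limitation concrete:

```csharp
using UnityEngine;

// Classic screen-point ray picking: cast a ray from the mouse position
// into the scene and select whatever collider it hits first.
public class MousePicker : MonoBehaviour
{
    void Update()
    {
        if (Input.GetMouseButtonDown(0))
        {
            Ray ray = Camera.main.ScreenPointToRay(Input.mousePosition);
            if (Physics.Raycast(ray, out RaycastHit hit))
            {
                // The hit point is tied to the screen ray, so depth along the
                // ray cannot be changed by the mouse alone: the two-dimensional
                // limitation described above.
                Debug.Log("Picked: " + hit.collider.name);
            }
        }
    }
}
```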

The platform consists of a computer, an infrared stylus, polarized glasses and a monitor that tracks the position and rotation of the glasses and the stylus. The stylus has six degrees of freedom, meaning a virtual object can be translated along three axes and rotated about three axes; users can also use the stylus to "pick up" and "flip" objects. The scene automatically adjusts its viewpoint based on the position of the glasses. Compared with other assembly systems, the glasses serve not only as an output but also as part of the input.

The monitor tracks the position of the glasses in order to control the positions and focal lengths of the left and right cameras in the virtual scene. The infrared stylus supports click, move and release interactions, corresponding to the pen button being pressed, held and released. The stylus has three buttons, but assigning a different function to each button would increase the memory burden, so the system uses only one. The operation targets are the target objects, UI components and the static scene, as shown in Table 4.

Table 4. Interactive action and object state change

The operation task flow is extracted according to the interaction, as shown in Fig. 10: clicking the button selects an object or UI component, and a long press moves the object. A sketch of handling this one-button gesture follows Fig. 10.

Fig. 10. Assembly process
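The sketch below shows one way to distinguish a click from a long press on the single stylus button; StylusButtonDown/StylusButtonUp are hypothetical stand-ins for the platform SDK's input calls (mapped to the mouse here so the sketch runs), and the threshold value is an assumption.

```csharp
using UnityEngine;

// One-button gesture handling: a press shorter than the threshold selects;
// holding past the threshold enters move mode until the button is released.
public class OneButtonGesture : MonoBehaviour
{
    public float holdThreshold = 0.3f; // seconds before a press counts as "move"
    private float pressedAt = -1f;
    private bool moving;

    void Update()
    {
        if (StylusButtonDown()) { pressedAt = Time.time; moving = false; }

        if (pressedAt >= 0f && !moving && Time.time - pressedAt > holdThreshold)
        {
            moving = true; // long press: start moving the grabbed object
        }

        if (StylusButtonUp())
        {
            if (moving) { /* release the moved object */ }
            else        { /* short click: select object or UI component */ }
            pressedAt = -1f; moving = false;
        }
    }

    // Placeholder input queries standing in for the stylus SDK.
    bool StylusButtonDown() { return Input.GetMouseButtonDown(0); }
    bool StylusButtonUp()   { return Input.GetMouseButtonUp(0); }
}
```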

4.3 Realization of the Function of Picking Up a Virtual Object with the Stylus

The PC version was first completed in Unity3D, controlled by the mouse and displayed on a normal screen. The SDK package was then installed and the input part of the program was modified to adapt it to the new display platform.
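One way to keep such a port confined to "the input part" is to hide the device behind a small interface; the following is a hypothetical sketch of that design (the real code and the zSpace SDK's API may differ):

```csharp
using UnityEngine;

// Hypothetical input abstraction: scene logic depends only on this interface,
// so swapping the mouse for the tracked stylus touches no other code.
public interface IPointerInput
{
    Ray PointerRay { get; }   // ray used for picking
    bool PrimaryDown { get; } // button pressed this frame
    bool PrimaryUp { get; }   // button released this frame
}

// Mouse-based implementation used by the original PC version.
public class MouseInput : IPointerInput
{
    public Ray PointerRay { get { return Camera.main.ScreenPointToRay(Input.mousePosition); } }
    public bool PrimaryDown { get { return Input.GetMouseButtonDown(0); } }
    public bool PrimaryUp { get { return Input.GetMouseButtonUp(0); } }
}

// A stylus implementation would instead return a ray built from the tracked
// pen's pose and the button state reported by the platform SDK.
```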

The system changes the depth of the 3D image in real time according to the distance between the plate and the human eye. The input devices have four important variables: the moving range and midpoint of the glasses, and the moving range and midpoint of the stylus. The midpoint is the intermediate position in the mapping from the position of the glasses or stylus in the real world to the position of the object in the virtual world; the moving range is the effective operating range of the glasses or stylus in the real world.
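Under these definitions, a natural mapping (our assumption, not a formula stated in the paper) scales displacement about the real-world midpoint into displacement about the virtual midpoint:

```csharp
using UnityEngine;

// Maps a tracked real-world position into the virtual scene: displacement
// from the real midpoint is scaled by the ratio of the two moving ranges.
public static class RangeMapping
{
    public static Vector3 RealToVirtual(
        Vector3 realPos, Vector3 realMid, float realRange,
        Vector3 virtualMid, float virtualRange)
    {
        float scale = virtualRange / realRange;
        return virtualMid + (realPos - realMid) * scale;
    }
}
```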

Controlling an object involves three parts: controlling the stylus, rendering the ray emitted by the stylus, and changing the camera parameters. The stylus emits a ray that collides with objects of a specified class; pressing the stylus button while the ray touches an object picks it up, after which the object and the ray remain relatively fixed; releasing the button releases the object. The position and rotation of the stylus are ps and rs; from the contact point C between the ray and the object, the ray length and the target position and orientation T of the object are obtained, as shown in Fig. 11.

Fig. 11. Position of P, S, T

The key values are:

pc: the position of the collision point (where the ray meets the object); pt: the target position of the object (output); ps: the position of the stylus (input); rs: the rotation of the stylus; rt: the rotation of the target object (output); rc: the rotation of the collision point; l: the distance from the pen tip to the collision point; l0: the initial ray length when no target object is hit (Fig. 12).
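A minimal sketch of the pick-up step under these definitions follows. It is our reconstruction rather than the paper's verbatim code: on grab, the object's pose relative to the stylus is stored; while the button is held, that relative pose is reapplied every frame, so the object and the ray stay fixed to each other (pt = ps + rs * relPos, rt = rs * relRot).

```csharp
using UnityEngine;

// Reconstruction of the grab logic described above.
public class StylusGrab : MonoBehaviour
{
    public Transform stylus;    // pose ps, rs from the tracker
    public float l0 = 2f;       // initial ray length when nothing is hit
    private Transform target;   // grabbed object (pose pt, rt)
    private Vector3 relPos;     // object position in stylus space
    private Quaternion relRot;  // object rotation relative to the stylus

    // Call when the stylus button is pressed.
    public void TryGrab()
    {
        // Ray from the pen tip along the stylus forward axis.
        Ray ray = new Ray(stylus.position, stylus.forward);
        if (Physics.Raycast(ray, out RaycastHit hit, l0))
        {
            // Contact point C = hit.point, ray length l = hit.distance.
            target = hit.transform;
            relPos = Quaternion.Inverse(stylus.rotation) * (target.position - stylus.position);
            relRot = Quaternion.Inverse(stylus.rotation) * target.rotation;
        }
    }

    void Update()
    {
        if (target == null) return;
        // pt = ps + rs * relPos ; rt = rs * relRot
        target.SetPositionAndRotation(
            stylus.position + stylus.rotation * relPos,
            stylus.rotation * relRot);
    }

    // Call when the stylus button is released.
    public void Release() { target = null; }
}
```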

Fig. 12. Operation

5 Conclusion

The conclusions based on the virtual training system are as follows: compared with other 3D stereoscopic display devices, the system offers a better solution for displaying three-dimensional objects under certain demands, and users can interact with the virtual world in a breakthrough way. The polarized glasses are used both as input and output, and virtual objects can be controlled with six degrees of freedom. This paper has described the design process of the project, summarized the interactions of the glasses and stylus in different scenes, and explained the principles behind the visual effects and interactive features of the platform.