Introduction

Assembly involves many manual operations on the shop floor and is a critical and time-consuming stage of product development. Research indicates that assembly accounts for 50% of the product development time and nearly 20% of the total manufacturing time (Pan 2005). In manual assembly in particular, many types of work are performed repetitively by operators. If the assembly planning process is not well designed, it causes not only a loss of productivity but also occupational diseases among workers. To ensure the correctness of manual operations, assembly planning is therefore an essential procedure before the actual assembly task is executed. Moreover, market demand for individualization leads to a continuously increasing number of product variants (Lasi et al. 2014), and different assembly tasks are expected to fulfill personalized requirements, confronting assembly planning with various challenges.

Therefore, a reasonable verification of process planning for manual assembly activities is of great significance, guiding the worker toward an optimal cost without negative consequences. Traditionally, assembly planning verification relies on experienced workers who carry out the predefined activities in simulation environments, after which experts evaluate their actions against various planning criteria (Lassalle et al. 2007). During this verification period, the manual assembly process is first planned empirically, and engineers then evaluate the planned process with the assistance of digital planning tools against a set of metrics (such as cost and time). Workers must perform the same planned process repeatedly while planning parameters are changed until a good solution is obtained. For manual assembly, these approaches either require prohibitive effort for broad adoption or restrict process verification to a few common variants, which is regarded as suboptimal because of the resulting uncertainty about what the common variant should be (Meulen and Seidl 2007). In addition, since assembly planning is optimized by means of iterative improvements, it is costly and should be shortened as much as possible (Manns et al. 2018). Thus, a more efficient verification of manual assembly planning is needed to improve the intelligence of manual assembly operations.

Process planning verification for manual assembly with classical digital human modelling (DHM) is an attractive approach, since it achieves full process coverage economically in a virtual environment. Nevertheless, inconsistencies between the virtual and real assembly processes are inevitable (Agethen et al. 2016). In addition, making the DHM act realistically in a virtual environment requires realistic motion data from the shop floor. A motion capture system can record workers' operations on the shop floor (Qiu et al. 2013) instead of relying on data from virtual models, enabling data-driven verification of manual assembly planning in smart manufacturing (Tao et al. 2018). However, owing to the complex and dynamic scenes in manual assembly, current data collection technologies are not ready to collect reliable and detailed data about the assembly process on the shop floor. Existing work measurement techniques still rely on stopwatch measurements or manual video analysis, making them time-consuming and poorly adaptable to different manual assembly scenarios (Bauters et al. 2018). Moreover, large amounts of shop-floor data about the manual assembly process are discrete and messy, and there is an urgent demand to understand and extract valuable information from the collected data, attaching semantic knowledge to the manual assembly operations.

In this paper, a self-contained tracking method is proposed to collect workers' shop-floor data during the manual assembly process, providing the essential material for subsequent assembly planning verification. Because of the uncertain movements of the operator when performing a manual assembly task on the shop floor, the collected shop-floor data is disordered and confusing. A machine-learning-based automated segmentation method is therefore applied, empowering the shop-floor data with corresponding semantic knowledge. Then, a spatial–temporal verification of the manual assembly planning is carried out, which reveals the deviation between the ideal assembly planning and the actual one. Potential improvements to the current manual assembly planning can thus be deduced, providing indicators for the optimization of the existing plan. The contributions of this paper are as follows:

  1. Instead of relying on large numbers of sensors arranged on the shop floor, a portable self-contained tracking system is proposed to collect shop-floor data about workers' manual assembly activities.

  2. Based on the retrieved shop-floor data, an unsupervised machine learning method is applied to understand worker-walking assembly actions by empowering the walking trajectory with semantic knowledge.

  3. A data-driven spatial–temporal verification of the existing manual assembly planning is carried out, providing quantitative feedback to confirm and improve the predefined plan.

This paper is structured as follows: the "Related work" section briefly reviews the related work. The "Data collection and understanding" section describes the proposed shop-floor data collection and the corresponding semantic segmentation method. In the "Data-driven spatial–temporal verification" section, the spatial and temporal verification of the current manual assembly planning is described. In the "Experiments" section, experiments on the spatial–temporal verification of manual assembly planning are discussed. Finally, conclusions and future work are presented.

Related work

To ensure rationality and efficiency in manual assembly planning, the spatial and temporal parameters of the assembly process have to be planned before the actual assembly actions are performed. The verification of the planned assembly process with shop-floor data is therefore significant, as it allows workers to evaluate assembly processes and address assembly issues. Shop-floor data collection and verification of the manual assembly task are the two essential components of this problem, and we review the literature on both aspects.

Shop-floor data acquisition

Performing manual assembly planning in virtual environments is an economical method, in which users at different locations can carry out manual assembly operations collaboratively, conducting component verification and assembly process evaluation at the same time (Jayaram et al. 1997; Dan et al. 2009; Wu et al. 2012; Gao et al. 2016). Nevertheless, data collected in simulation environments does not come from in-production facilities, and assumptions about on-site assembly conditions based on virtual planning are not reliable. Previous research also indicates that the spatial (Agethen et al. 2016) and temporal (Baines et al. 2003) parameters of real assembly operators deviate significantly from the ideal planning, and the inconsistencies between the virtual and real assembly processes may cause the assembly execution for product development to fail.

Therefore, to improve the reliability of assembly planning verification and enhance the human-centered assembly workplace, data about the assembly process must be collected on the shop floor. RFID-based methods (Huang et al. 2008) have been applied to collect and synchronize manufacturing data, so that assembly parts can be tracked and traced on the shop floor. Based on advanced techniques such as sensor networks and radio frequency identification, Liu et al. (2017) proposed an internet-of-things-enabled intelligent assembly system for mechanical products. However, these methods pay more attention to the assembled products than to the workers during the manual assembly process. To access field data about workers' operations in the workplace, INTERACT (2016) is a representative project: by arranging many low-cost and non-intrusive sensors in the workspace, faster ramp-ups and a first-time-right assembly process can be retrieved from the shop floor. The project indicated that shop-floor sensing architectures incorporating low-cost sensor systems for retrieving real-time data about human-based assembly activities are the foundation for assembly planning verification.

Currently, optical motion capture technology is widely applied to retrieve shop-floor data about the worker within the actual workplace. Usually, special markers need to be attached to the operators (Du and Duffy 2007; Yang et al. 2013; Puthenveetil et al. 2015; Wang et al. 2016), and these methods can usually provide good measurement accuracy. Nevertheless, the drawbacks are obvious: sticking markers on the tracked operator is time-consuming, and the attached markers may become occluded as the worker moves (Ming et al. 2013). With the rapid fall in the cost of computer vision, marker-less motion capture has become a more attractive alternative for shop-floor data acquisition. In particular, since Microsoft released the Kinect, low-cost and user-friendly RGB-D sensors (acquiring RGB and depth images simultaneously) have been widely used to track assembly activities for assembly planning verification. To collect and recognize worker activities in the assembly process, many RGB-D sensors are arranged on the shop floor to acquire live data from an operational manufacturing cell without any guided or scripted work (Rude et al. 2015). To take possible occlusions in the workplace into consideration, arranging multiple RGB-D sensors is expedient. Agethen et al. (2018) investigated the walk paths of a worker on the shop floor with multiple RGB-D sensors and then implemented spatial parameter verification by comparing the ideal trajectory to the real one. To ensure overall measurement accuracy on the shop floor, Prabhu et al. (2015) deployed multiple RGB-D sensors by dividing the motion capture process into far and near motion-sensing zones. Nevertheless, these methods for shop-floor data collection in the assembly process are classified as outside-in capture: as the name suggests, many optical sensors must be deployed within the shop floor in advance, and they must be rearranged to avoid occlusions for different manual assembly tasks. These methods therefore adapt poorly to varying assembly tasks, and all the arranged sensors have to be redeployed when the manual assembly tasks change.

Therefore, a self-contained motion capture method is more suitable and promising for worker-based field data retrieval in an actual workplace (Fang et al. 2017). As an inside-out motion capture method, it can determine the position and orientation of the operator in an unprepared industrial environment. Given its portable and lightweight traits, a combination of an optical and an inertial sensor is used in this paper for data retrieval on the shop floor. As an industrial wearable system that empowers human-centric ability (Kong et al. 2018), the portable device can perceive workers' movements during assembly activities on the shop floor. Moreover, to improve wearing comfort for workers, the light weight and low power consumption of the monocular visual-inertial system are given priority in this paper.

Assembly planning verification

Generally speaking, the shop-floor data collected about worker operations in manual assembly is discrete, noisy and disordered, and it is confusing to the decision-maker. Reasonable data processing and usage are therefore indispensable for smart applications in manufacturing (Kusiak 2017). Besides, as opposed to conventional manufacturing systems, the cognitive capabilities of highly skilled assembly workers are still a keystone of flexibility and reliability in a modern production environment (Stoessel et al. 2008). To cope with the varying worker assembly tasks that occur in real-world settings, data-driven verification of manual assembly planning can support decision-makers in making more reasonable improvements (Claeys et al. 2015).

Therefore, automatic recognition and classification of assembly operations from shop-floor data is worth studying for assembly planning verification. Manns et al. (2016) summarized and compared different motion capture methods and proposed a data-driven motion synthesis approach with high-quality input data; principal component analysis (PCA) and a Shannon-entropy-based quality measure were also tested. Given that the walk paths of assembly operators in manual assembly vary widely from each other, Huikari et al. (2011) compared feature selection on data collected on an industrial assembly line, applying PCA to the shop-floor data. Agethen et al. (2018) introduced a probabilistic two-dimensional motion planner incorporating fine-grained information on human gait; the data were drawn from a multivariate Gaussian mixture model based on real captured data, contributing to better prediction quality of planning models for production planning departments. Bauters et al. (2018) proposed a video-based system to analyze manual line work, in which a work cycle classification method detects problematic situations in the workflow and generates performance indicators for the operator during manual assembly. In summary, these methods mainly focus on analyzing the spatial parameters of the operator on the shop floor, providing effective verification of assembly planning according to real shop-floor data. Besides, time analysis of the collected shop-floor data in manual operations is also important for the verification of manual assembly (Sahin and Kellegoz 2019). Hedman and Almstrom (2017) addressed the importance of updated and valid time data in planning and controlling production, and considered how they relate to manufacturing system performance and improvement. For a walking-worker assembly system, Cevikcan (2014) developed a mathematical programming approach for assembly processes that can add value to industrial assembly systems in terms of raising engineering control for allocation activities. Overall, these studies mainly focus on either spatial or temporal analysis of shop-floor data in manual assembly, and lack quantitative and semantic feedback for spatial–temporal verification of the planned assembly process.

Given the shop-floor data derived from the actual workplace, an unsupervised learning method is applied to empower manual assembly operations with semantic knowledge. Then, a spatial–temporal verification of the manual assembly planning is proposed, revealing the deviation between the ideal assembly planning and the actual one. This assembly planning verification can be used to evaluate manual assembly planning and results in a more reasonable human-centered manual assembly environment.

Data collection and understanding

To perform the verification of manual assembly planning, the schematic of the proposed method is depicted in Fig. 1. Traditionally, before the execution of a manual assembly task, planning items such as the predefined assembly time and assembly path are worked out in a simulation environment [as "(1) Assembly planning" shown in Fig. 1]. Then, according to these planning items [as "(2) Perform actual manual assembly" shown in Fig. 1], the actual assembly task is carried out on the shop floor, and workers are expected to perform the same planned process repeatedly while the planning parameters are changed until a convergent solution is obtained. Without objective shop-floor feedback from the actual assembly task to adjust the predefined planning reasonably, unreasonable planning items may be overlooked regardless of the actual assembly operations.

Fig. 1

The schematic of the proposed spatial–temporal verification for manual assembly planning

Thus, a spatial–temporal verification of the assembly planning based on shop-floor data is proposed, establishing a closed-loop, data-driven feedback mechanism to improve the current manual assembly planning [as "(3) Verification for manual assembly planning" shown in Fig. 1]; this also constitutes the contribution of the research. Firstly, a self-contained tracking system is proposed to retrieve shop-floor data during the actual assembly operations, laying the foundation for the subsequent assembly planning analyses. Then, an unsupervised clustering method is applied to the worker's walking operations, empowering the collected shop-floor data with semantic knowledge. Finally, given the semantic knowledge about the shop-floor trajectory, automatic spatial and temporal analyses of the actual assembly are carried out, enabling a spatial–temporal verification of the predefined manual assembly planning. The verified results provide reasonable indicators to improve the current assembly planning, resulting in more reasonable manual assembly operations.

Data collection in manual assembly

In the actual workplace, workers perform many manual operations to fulfill the assembly task. Instead of arranging a series of cameras within the shop floor in advance, the proposed motion capture system is portable and self-contained, combining a monocular camera and an IMU. The prototype experimental platform can be used to perceive workers' walking actions when performing manual assembly operations. As shown in Fig. 2, the data collection includes the following steps:

Fig. 2

The flowchart for the shop-floor data collection

  (1) Self-contained data acquiring system.

    (a) Sensor data: The data streams come from a sensor module, in which the optical sensor acquires 640 × 480 pixel images at about 30 Hz and the inertial sensor outputs linear acceleration and angular velocity at 250 Hz. Generally speaking, a larger field of view of the monocular camera results in more robust and accurate optical tracking. The data from the optical and inertial sensor module are ported to a mobile device [Samsung S6, CPU: Exynos 7420 (1.5 GHz)] for post-processing.

    (b) Preprocessing: Given sequential IMU measurements and synchronized images from the sensor module, real-time pre-processing is performed on these data. On the one hand, based on the linear acceleration and angular velocity from the IMU, relative motion constraints between adjacent frames are obtained by pre-integrating the inertial measurements, which can be seamlessly integrated into the visual-inertial fusion pipeline (Forster et al. 2017); the time integration of the inertial measurements yields the corresponding translation and orientation (a simplified integration sketch is given at the end of this section). On the other hand, feature extraction is executed on each incoming image, establishing a specific mathematical description. After feature matching across sequential images, triangulation is performed to recover the visual pose estimation.

    (c) Optimization: To bound the motion capture drift, visibility constraints between nodes and edges are established according to the tracking and mapping module. Nonlinear optimization is then applied to bound the drift of the motion tracking, which is also called the back-end of the proposed portable visual-inertial system. Furthermore, a relocalization strategy runs in a parallel thread, ensuring perception accuracy when the camera revisits a previously seen scene. This is especially helpful on the actual assembly site, where repetitive operations in the workplace make such revisits common.

  (2) Data accuracy evaluation.

To verify the reliability of the shop-floor data derived from the portable tracking system, an optical tracking system (MotionAnalysis Co. Ltd) with a measurement accuracy of about 0.1 mm in room-scale space is used as a benchmark. A quantitative accuracy evaluation is thus conducted to ensure the reliability of the shop-floor data; a detailed description of the proposed self-contained tracking system can be found in our previous work (Fang et al. 2017). Given the quantitative error evaluations of these contrast experiments, the average translational error is about 3 cm, while the mean rotational error is below 1 degree. More detailed statistics of the quantitative comparisons are given in Table 1; the results indicate that the shop-floor data derived from the assembly workplace is credible for the subsequent verification of assembly planning.

Table 1 Tracking accuracy evaluation of the proposed tracking system
  (3) Shop floor data collection.

With the portable system mounted on the worker during the manual assembly operations, the worker's walking path and a sparse 3D map of the field scene can be recovered, which can be applied to verify the predefined assembly planning logically. To address the human-centered activities in manual assembly, only the recovered trajectory of the worker walking on the shop floor is considered here, and it provides the basis for the subsequent spatial–temporal verification of the manual assembly planning.
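As referenced in step (b) above, the relative motion between adjacent camera frames is obtained by integrating the inertial measurements. The snippet below is a deliberately simplified, gravity-compensated dead-reckoning sketch of that idea, not the on-manifold pre-integration of Forster et al. (2017) used in the actual system; the function name, variable names and gravity convention are illustrative assumptions.

```python
import numpy as np

def integrate_imu_segment(gyro, accel, dt, R0=np.eye(3), v0=np.zeros(3)):
    """Naively integrate gyroscope/accelerometer samples recorded between two
    camera frames to estimate the relative rotation R and translation p.

    gyro  : (N, 3) angular velocities in the body frame [rad/s]
    accel : (N, 3) linear accelerations in the body frame [m/s^2]
    dt    : IMU sample period [s] (e.g. 1/250 s for a 250 Hz IMU)
    """
    R, v, p = R0.copy(), v0.copy(), np.zeros(3)
    g = np.array([0.0, 0.0, -9.81])          # gravity in the reference frame
    for w, a in zip(gyro, accel):
        # rotation increment via the Rodrigues formula
        theta = w * dt
        angle = np.linalg.norm(theta)
        if angle > 1e-12:
            k = theta / angle
            K = np.array([[0, -k[2], k[1]],
                          [k[2], 0, -k[0]],
                          [-k[1], k[0], 0]])
            dR = np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * K @ K
        else:
            dR = np.eye(3)
        a_world = R @ a + g                   # gravity-compensated acceleration
        p = p + v * dt + 0.5 * a_world * dt ** 2
        v = v + a_world * dt
        R = R @ dR
    return R, p                               # relative orientation and translation
```

In the real pipeline this inertial estimate is fused with the visual pose obtained from feature matching and triangulation, and the accumulated drift is bounded by the back-end optimization described in step (c).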

Semantic segmentation

Workstation detection

In the walking-worker manual assembly process, the shop-floor data about operator activity is collected by the tracking system proposed in the "Data collection in manual assembly" section (as shown in Fig. 3a). Nevertheless, the collected shop-floor data is messy and discrete because of the worker's irregular movement in the workplace (as shown in Fig. 3b), and it is hard to perform automatic verification of the predefined assembly planning from such confusing information. To enable a detailed spatial–temporal analysis of the manual assembly, the shop-floor trajectory is empowered with semantic information in our study. Generally speaking, the recovered spatial trajectory is concentrated around the workstations, and this prior knowledge about walking-worker manual assembly allows the shop-floor trajectory to be divided into "workstation points" and "walking points". The "workstation points" denote locations where the worker performs manual assembly operations, as the points within the circles shown in Fig. 3b. The "walking points" represent the places where the operator walks around the shop floor to fulfill the manual assembly task.

Fig. 3

The schematic diagram for the walking trajectory derived from the shop floor

With such a segmentation, the time-stamped shop-floor points \( \mathbf{P} \) can be divided into a sequence of meaningful locations \( \mathbf{S} \):

$$ \mathbf{P} = \left\{ p_{1} \to p_{2} \to \cdots \to p_{n} \right\} \Rightarrow \mathbf{S} = \left\{ s_{1} \to s_{2} \to \cdots \to s_{m} \right\} $$
(1)

Trajectory segmentation

To distinguish the assembly operations from the auxiliary walking within the workspace, the trajectory needs to be divided into segments for further processing. The segmentation not only reduces the computational complexity but also yields richer knowledge (Zheng 2015). During the manual assembly process, operators perform certain assembly tasks at different workstations. The discrete spatial–temporal data \( \left\{ t_{i}, p_{i}, q_{i} \right\} \) obtained from the shop floor are represented by a series of chronologically ordered points, e.g. \( p_{1} \to p_{2} \to \cdots \to p_{n} \), where each point \( p_{i} = \left( t_{i}, x_{i}, y_{i} \right) \) consists of a geospatial coordinate pair and the corresponding timestamp.

To empower the recovered trajectory with semantic knowledge, the collected shop-floor data must be subdivided logically with respect to the real manual assembly task. In this research, a semantic segmentation based on the workstation layout in the actual workplace is proposed. A new cycle starts when the operator leaves an assembly zone and moves to the next workstation. To attach semantic meaning to the shop-floor data, a clustering approach is usually applied to represent a trajectory with a feature vector. However, it is difficult to generate a feature vector with a uniform representation for these shop-floor trajectories because of their complex properties, such as sampling rate, different shapes and numbers of points. Fortunately, some prior knowledge about the manual assembly planning is available, such as the number of workstations within the shop floor. Thus, a partitional clustering method such as K-means (Kanungo et al. 2002; Hu et al. 2006) can be used for the classification and clustering of the time-stamped shop-floor data. K-means aims to group \( N \) observations into \( K \) clusters by minimizing the squared-error criterion:

$$ J = \sum\limits_{i=1}^{N} \sum\limits_{j=1}^{K} \left\| x_{i}^{\left( j \right)} - c_{j} \right\|_{2}^{2} $$
(2)

where \( \left\| x_{i}^{\left( j \right)} - c_{j} \right\|_{2}^{2} \) represents the squared distance between an observation \( x_{i}^{\left( j \right)} \) and its cluster center \( c_{j} \). Given that the number of clusters \( K \) is specified beforehand by the number of workstations, the segmentation of the shop-floor walking trajectory is carried out automatically. Table 2 illustrates the procedure of the K-means-based semantic segmentation of the shop-floor data derived from the manual assembly workplace.

Table 2 Segmentation procedure for the shop-floor trajectory
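As a concrete illustration of the procedure in Table 2, the following sketch clusters the (x, y) coordinates of the time-stamped trajectory with scikit-learn's KMeans and then labels each point as a workstation point or a walking point using a distance threshold around the cluster centers; the 0.5 m radius matches the value used later in the temporal analysis, while the array layout, function name and thresholding are illustrative assumptions rather than the authors' exact implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def segment_trajectory(points, n_workstations, radius=0.5):
    """Split a time-stamped trajectory into workstation and walking points.

    points         : (n, 3) array of rows (t_i, x_i, y_i)
    n_workstations : number of workstations K, known from the assembly plan
    radius         : distance [m] around a cluster center that counts as
                     belonging to a workstation
    """
    xy = points[:, 1:3]
    kmeans = KMeans(n_clusters=n_workstations, n_init=10, random_state=0).fit(xy)
    centers = kmeans.cluster_centers_

    # distance of every point to its assigned cluster center
    dist = np.linalg.norm(xy - centers[kmeans.labels_], axis=1)
    at_workstation = dist <= radius

    traj_w = points[at_workstation]     # workstation points (Traj_W)
    traj_a = points[~at_workstation]    # walking points (Traj_A)
    return traj_w, traj_a, centers
```

Because K equals the known number of workstations, no model selection is needed, and the cluster centers directly approximate the workstation locations on the shop floor.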

According to the K-means clustering method, the worker's shop-floor walking trajectory can be divided into workstation points and walking points, such that:

$$ Traj_{T} = Traj_{W} + Traj_{A} $$
(3)

where \( Traj_{T} \) represents the total trajectory, and \( Traj_{W} \) and \( Traj_{A} \) denote the workstation points and the walking points, respectively. A further study of the predetermined manual assembly planning is thus available, and the spatial–temporal verification of the shop-floor trajectory can be performed. As shown in Fig. 4, given the total trajectory recorded while performing the manual assembly in the actual workplace, the corresponding walking segments \( \left\{ Traj_{A}1, Traj_{A}2, Traj_{A}3 \right\} \) and workstation points \( \left\{ Traj_{W}1, Traj_{W}2 \right\} \) are segmented automatically with semantic knowledge. Detailed spatial and temporal analyses of the actual assembly are then performed for the subsequent assembly planning verification.

Fig. 4

Shop-floor trajectory segmentation with the clustering method

Data-driven spatial–temporal verification

For every assembly cycle obtained by the segmentation procedure, the corresponding event list is generated based on the predefined layout of the workstations. In walking-worker manual assembly operations, two different events are defined: workstation operation (WOROP) and walking operation (WALOP) on the shop floor. WOROP takes place when the worker performs assembly actions at a certain workstation, while WALOP indicates that the worker moves from one workstation to another on the shop floor. According to the segmented shop-floor trajectory, the spatial–temporal analysis is performed to verify the predefined manual assembly planning, and the detailed schematic of the verification process is shown in Fig. 5.

Fig. 5

The schematic of the spatial–temporal verification for assembly planning

Data-driven spatial analysis

In walking-worker manual assembly within the workplace, excessive motion would not only increase the risk of injuries but also lead to a loss of productivity. Thus, reducing unnecessary movement on the shop floor has an immediate and positive effect on the manual assembly. When the worker performs manual assembly operations, the shop-floor data is perceived automatically through the proposed portable tracking system. The spatial–temporal data describing the shop-floor performance is represented as:

$$ \text{Point}_{i} = \left\{ t_{i}, x_{i}, y_{i}, z_{i} \right\} $$
(4)

where \( t_{i} \) is the timestamp of the trajectory point, and \( \left( x_{i}, y_{i}, z_{i} \right) \) is the location of the worker within the actual workplace. Thus, according to the worker's actual movements on the shop floor, the real-time trajectory of the actual manual assembly operations is acquired. Under the assumption that the walking trajectory lies in a planar scene, the length of the shop-floor trajectory \( S_{\mathrm{RealTraj}} \) is obtained as follows:

$$ S_{\mathrm{RealTraj}} = \sum\limits_{i=1}^{n} \sqrt{\left( x_{i} - x_{i-1} \right)^{2} + \left( y_{i} - y_{i-1} \right)^{2}} $$
(5)

The walking distance through the different workstations needed to perform the manual assembly task depends on the actual workplace. Thus, assembly efficiency can be quantified to unveil potential improvements in the workstation layout. According to the predefined manual assembly planning in a simulation environment, such as the assembly simulation in Delmia, the length of the predefined ideal trajectory \( S_{\mathrm{IdealTraj}} \) is easily available. Thus, the shop-floor trajectory can be compared with the plan spatially, and the relative deviation \( S_{\mathrm{errorRatio}} \) is defined as:

$$ S_{\mathrm{errorRatio}} = \frac{S_{\mathrm{RealTraj}} - S_{\mathrm{IdealTraj}}}{S_{\mathrm{IdealTraj}}} $$
(6)

In addition, much other motion information on the shop floor can be learned as the worker moves between workstations. When a worker is somehow hindered in his movement, this shows up in the observed traveling speed. A decrease in pace may have many causes: other workers or objects crossing the worker's path, the worker carrying a heavy load while walking, or rough or slippery floor conditions. The instantaneous and average speeds are given as follows:

$$ \left\{ \begin{array}{l} v_{i} = \frac{\sqrt{\left( x_{i} - x_{i-1} \right)^{2} + \left( y_{i} - y_{i-1} \right)^{2}}}{t_{i} - t_{i-1}} \\ \bar{v}_{i} = \frac{S_{\mathrm{RealTraj},i}}{t_{i}} \end{array} \right. $$
(7)
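A small sketch of how Eqs. (5)–(7) can be evaluated on the recorded trajectory is given below; the planar (x, y) treatment follows the text, the average speed is computed over the elapsed time since the first sample, and the ideal trajectory length would come from the simulation environment (e.g. Delmia), so its value here is an illustrative input rather than part of the method.

```python
import numpy as np

def spatial_analysis(points, s_ideal):
    """Spatial verification quantities from a time-stamped trajectory.

    points  : (n, 3) array of rows (t_i, x_i, y_i), chronologically ordered
    s_ideal : length [m] of the ideal trajectory planned in simulation
    """
    t, xy = points[:, 0], points[:, 1:3]
    steps = np.linalg.norm(np.diff(xy, axis=0), axis=1)    # per-step distances

    s_real = steps.sum()                                   # Eq. (5)
    error_ratio = (s_real - s_ideal) / s_ideal             # Eq. (6)

    dt = np.diff(t)
    v_inst = steps / dt                                    # Eq. (7), instantaneous speed
    v_avg = np.cumsum(steps) / (t[1:] - t[0])              # Eq. (7), average speed
    return s_real, error_ratio, v_inst, v_avg
```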

Data-driven temporal analysis

The preceding "Data-driven spatial analysis" section focused on the spatial analysis of the manual assembly with the shop-floor data. The temporal analysis of the shop-floor trajectory is also important for verifying and improving the current assembly planning. Based on a detailed study of the different workstations, the operating time for the assembly task at each workstation is acquired, and the productivity of the current task can be improved by shortening the bottleneck workstation.

Given the clustering method described in the "Semantic segmentation" section, the time-stamped trajectory points belonging to each workstation are distinguished. Thus, the operating time is determined by accumulating the discrete timestamps belonging to the ith workstation:

$$ T_{i} = \sum\limits_{j = 1}^{n} {t_{ij} } $$
(8)

where \( T_{i} \) is the total assembly time at the ith workstation, and \( t_{ij} \) is the time of the jth assembly unit within the ith workstation.

In this research, besides the time spent at each workstation, the worker's walking time between workstations is also acquired. Based on the cluster centroids derived from the K-means segmentation, a circle with a radius of 0.5 m is used to distinguish the workstation area in the shop-floor data. Thus, the total operating time \( T \) of the walking-worker manual assembly task is obtained by accumulating the workstation-based operating time and the auxiliary walking time:

$$ T = \sum\limits_{i = 1}^{m} {T_{i} } + \sum\limits_{i = 1}^{m} {T_{iA} } $$
(9)

where \( T_{iA} \) is the auxiliary walking time from the \( \left( {i - 1} \right) \)th workstation to the ith workstation, and \( m \) is the number of workstations. In addition, to realize a continuous evaluation of the existing assembly process planning, the balance ratio of the manual assembly process can be computed:

$$ p = \frac{\sum\nolimits_{i=1}^{m} T_{i}}{T_{\max} \times m} $$
(10)

where \( p \) denotes the balance ratio of the manual assembly task, and \( T_{\max} \) represents the maximum operating time over all workstations. The variable \( p \) thus reasonably reflects the balance performance of the manual assembly: a larger \( p \) means a better balance of the current manual assembly planning.

Moreover, the smoothness index \( SI \) of the predefined manual assembly reflects the time deviations among the workstations. A smaller \( SI \) indicates that the time distribution across workstations is relatively uniform and hence a better balance of the current manual assembly task. \( SI \) is defined as follows:

$$ SI = \sqrt{\frac{\sum\nolimits_{i=1}^{m} \left( T_{\max} - T_{i} \right)^{2}}{m}} $$
(11)
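The following sketch evaluates Eqs. (8)–(11) from the segmented trajectory; it assumes that each trajectory point has already been labeled with the workstation it belongs to (or with a walking label) by the segmentation above, so the labeling convention and function name are illustrative assumptions.

```python
import numpy as np

def temporal_analysis(t, labels, n_workstations):
    """Temporal verification quantities from a labeled, time-stamped trajectory.

    t              : (n,) array of timestamps, chronologically ordered
    labels         : (n,) array, workstation index 0..m-1 or -1 for walking
    n_workstations : number of workstations m
    """
    dt = np.diff(t)                       # duration attached to each interval
    seg = labels[:-1]                     # label of the interval's starting point

    # Eq. (8): operating time per workstation
    T = np.array([dt[seg == i].sum() for i in range(n_workstations)])
    # auxiliary walking time between workstations
    T_walk = dt[seg == -1].sum()

    total_time = T.sum() + T_walk                                       # Eq. (9)
    balance_ratio = T.sum() / (T.max() * n_workstations)                # Eq. (10)
    smoothness = np.sqrt(((T.max() - T) ** 2).sum() / n_workstations)   # Eq. (11)
    return T, T_walk, total_time, balance_ratio, smoothness
```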

Experiments

Workers’ activity data perception

To demonstrate the feasibility of the proposed method, a walking-worker assembly experiment for a model car is carried out. As shown in Fig. 6, the target of the assembly task is to complete the model car, whose three modules are located at different workstations: (1) attachments, (2) pedestal and (3) main body. To fulfill the assembly task, the worker needs to carry the ready-to-assemble parts from different positions to the main assembly table and then complete the final assembly. During the manual assembly process, the walking trajectory of the worker is perceived by the proposed tracking system, shown as the red dotted line in Fig. 6, while the predefined ideal trajectory is shown as the purple line.

Fig. 6

The schematic diagram for the verification of manual assembly operation

Before the actual assembly task, manual assembly planning is performed in a simulation environment within the software Delmia, as shown in Fig. 7. To accomplish the assembly task for the model car, three workstations are designed within the simulated workspace, and the ideal walking paths between adjacent workstations are generated for the assembly process (the red line in Fig. 7a). The manual assembly task requires the worker to perform many assembly activities at each workstation (Fig. 7b), and Fig. 7c shows the operator walking on the shop floor according to the predefined assembly planning.

Fig. 7

Manual assembly planning in a simulation environment

Based on the simulation-based planning of the manual assembly task, the actual assembly operations are performed at the different workstations, as shown in Fig. 8a. To fulfill the total manual assembly task, the worker must walk between the workstations on the shop floor. With the proposed portable tracking system mounted on the head, real-time data of the actual assembly process in the actual workplace is collected.

Fig. 8

The comparisons between the assembly planning and the actual performance

Given the shop-floor data derived from the actual manual assembly operations, quantitative comparisons are available to verify the current assembly planning. As shown in Fig. 8b, the blue trajectory illustrates the actual trajectory of a worker performing the manual assembly task. The red dotted line is the ideal path predefined in the simulation environment, and the total length of the ideal trajectory is about 26.2 m. It can be inferred from Fig. 8b that the actual assembly operations deviate obviously from the ideal planning, which also demonstrates that verification is essential to improve the reliability and efficiency of the manual assembly planning.

Spatial–temporal verification for assembly planning

To address individual variation in the same manual assembly task, the task is carried out by different workers, which yields a more credible conclusion for verifying the manual assembly planning. Figure 9 shows four different operators carrying out the same assembly task on the shop floor (Fig. 9a–d); the experimental results show that all the actual walking paths (solid blue lines) deviate from the ideal one (dashed red line).

Fig. 9

The recovered trajectories derived from different workers (Color figure online)

Thus, to empower the shop-floor trajectory with semantic knowledge, the clustering method in the "Semantic segmentation" section is applied to segment the assembly trajectory. Based on prior knowledge of the workstation layout on the shop floor, the number of cluster centroids is determined. The cluster centroids estimated from the shop-floor trajectory are shown as red triangles beside each workstation. All the estimated centroids are close to the corresponding workstations, which indicates that the clustering method can estimate the correct locations of the workstations.

Based on the semantic segmentation of the shop-floor data, the total trajectory is divided into walking-based and workstation-based parts. Then, the length of the walking trajectory, the assembly time and the movement velocity of the actual manual assembly operations are acquired and applied to the verification of the current manual assembly planning. The results of the spatial analysis of the actual assembly operations are given in Table 3: the actual length of the shop-floor trajectory deviates significantly from the ideal planning trajectory (by about 50.69%), which demonstrates that spatial verification is essential to quantify the current assembly planning. Moreover, although there are deviations between different workers, the actual results tend to be uniform and are therefore more convincing for verifying the manual assembly planning.

Table 3 Automatic spatial verification for the manual assembly planning
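As a rough cross-check of Eq. (6), assuming the 50.69% figure in Table 3 is the mean relative deviation with respect to the ideal length of about 26.2 m reported above, the mean actual walking length follows directly:

$$ S_{\mathrm{RealTraj}} \approx S_{\mathrm{IdealTraj}} \left( 1 + S_{\mathrm{errorRatio}} \right) = 26.2\,\mathrm{m} \times \left( 1 + 0.5069 \right) \approx 39.5\,\mathrm{m} $$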

Besides the spatial verification of the manual assembly task, the operating time at each workstation can also be verified by means of the time-stamped shop-floor data, which provides an indicator of the balance performance of the current assembly task. Based on the temporal and balance analysis in the "Data-driven temporal analysis" section, the temporal verification of the above four manual assembly runs is given in Table 4. The mean balance ratio of the current manual assembly task is 84.4%. The mean operating times at workstations 1 and 2 are 43.22 s and 61.19 s, respectively, which correspond to the minimum and maximum operating times in the current manual assembly. Thus, apart from the temporal verification of the current assembly planning, the statistical result also provides evidence for improving its balance performance: for example, part of the assembly work could be shifted from workstation 2 to workstation 1, and the assembly balance and productivity would improve accordingly.

Table 4 Automatic temporal verification of the manual assembly planning

In addition to the spatial and temporal verification of the manual assembly task, the walking time in the workplace can be distinguished from the total operating time. As shown in Fig. 10a, the manual assembly operations at the workstations occupy the majority of the time, about 87%, while the auxiliary walking time on the shop floor accounts for about 13% of the total assembly time. Thus, more attention should be paid to workstation-based time analysis to improve assembly efficiency. The operating time at workstation 2 is significantly longer than at the other workstations (as shown in Fig. 10b), which also means that the predefined manual assembly planning should be rearranged toward a more uniform time distribution.

Fig. 10

Temporal verification for manual assembly planning

Discussion

Spatial–temporal verification of manual assembly planning establishes a convergence between the cyber and physical worlds and converts data derived from assembly operations into manufacturing intelligence through systematic analysis. To address this issue, a data-driven spatial–temporal verification of manual assembly planning is carried out. At the outset, data collection within the workplace is the basis for the subsequent improvement. A self-contained motion capture method is proposed in this research, which is adaptable and low-cost for retrieving shop-floor data compared with traditional motion capture methods.

The study shows that the portable self-contained tracking system can be applied to collect the worker's manual assembly data in the workplace. However, this shop-floor data is confusing and difficult to use unless it is translated into concrete meaning and context. To attach semantic knowledge to the time-stamped shop-floor data, the K-means method is used in our research; it classifies the shop-floor data into walking-based and workstation-based trajectories, enabling spatial and temporal analysis of the manual assembly planning predefined in the simulation environment. Based on the shop-floor data about human-based assembly activities, faster ramp-ups and first-time-right, data-driven assembly process verification are achieved. In addition, according to the shop-floor data derived from the portable tracking system, the experimental results also show that the real assembly operations deviate significantly from the ideal planning, illustrating the necessity of assembly planning verification.

However, the above contrast experiments only conducted spatial and temporal verification with the shop-floor data of the manual assembly task. Owing to the limited amount of experimental equipment, only un-paced manual assembly experiments were carried out; paced manual assembly tasks are left for future work. Moreover, given the prior knowledge about the number of workstations in the workplace, the K-means clustering method is applied directly to classify the total trajectory, and a more adaptive semantic segmentation method is worth investigating further. Besides, with the proposed motion capture system, every minimal change in the trajectory and each tiny step of the worker can be acquired. Although this may seem overly fine-grained for human-based assembly operations, more convincing spatial–temporal verification is achieved with such detailed shop-floor data. In addition, the tracking method is appropriate for work such as human-robot collaboration, ergonomic analysis and augmented reality, which will become important components of the human-centered digital factory and intelligent manufacturing; we will conduct further research in these directions.

Conclusions

In this study, based on the combination of an optical and an inertial sensor, a self-contained portable tracking system is proposed for shop-floor data collection during actual manual assembly. The motion capture system performs accurate and robust motion capture of workers' walking activities on the shop floor. Then, with the time-stamped trajectory from the actual assembly operations, an automatic, unsupervised segmentation method is applied to process the shop-floor data. It divides the trajectory logically, endowing each sub-trajectory with the corresponding semantic information. Thus, systematic statistics for the workstation-based and walking-based trajectories are available. Based on this data-driven strategy, the spatial and temporal verification of the predefined manual assembly can be performed, providing critical feedback to improve human-centered assembly planning. Experimental results on an actual manual assembly demonstrate the feasibility and applicability of the proposed method.