1 Introduction

In recent decades, mass production combined with a high degree of automation seemed to be the obvious means of achieving economical production. With the shift from seller markets to buyer markets and increasing dynamics, such as rising customer demands, a growing number and variety of products, and changing market demands, flexibility and changeability became main enablers of efficient production [1]. These demands have been addressed by several concepts proposed for the physical system, the control system and the organization of production systems. However, those concepts mostly ignore the immense cognitive capabilities that humans possess, which enable them to react to unpredictable situations, to plan their further actions, to learn and gain experience and to communicate with others [2]. Hence, the most flexible and changeable production system remains the skilled and experienced human worker [3, 4].

Traditional systems for digital assistance that support these workers in manual assembly, e.g. optical displays at the workplace, are inherently suboptimal for providing efficient and ergonomically feasible guidance. Displaying sequential instructions does not increase productivity beyond a certain degree [5]. Such systems offer little situational support and distract from relevant information in the environment, which reduces acceptance of the guidance by the worker [6, 7]. This is in part caused by the purely deterministic nature of assembly planning. Moreover, instructions are generated without knowledge of the actual production environment or of the physiological or psychological state of the worker [8].

A key to solving these discrepancies is seen in the application of situation-oriented methodologies for the generation of assembly instructions [9, 10]. Information regarding the mental as well as the physical state of the worker is integrated into the task retrieval process. Within this process, the most feasible assembly sequence is determined in regard to the current state of the product, the availability of parts and relevant information about the worker (e.g. receptiveness, mental workload and confidence).

In task generation, a crucial parameter restricts the maximum step length within the assembly sequence. As such, it represents the degree of detail and complexity of the task displayed to the worker. In this context, this paper presents a complexity measure for assembly primitives that can be matched to the current degree of receptiveness of the individual human worker. It extends the concept and application of common systems of predetermined times (e.g. MTM) by including dimensions of actual human performance, mental workload and attention allocation, learning effects based on the product and its reference levels, and learning effects resulting from external influences such as rules and guidelines [11].

2 Critical perspective on task complexity measures

2.1 On common measures of task complexity

Although manual object assembly is a widespread task in high-wage countries, information processing and cognition in assembly have not been researched sufficiently. The same holds for integrated views of the assembly process and its context. Consequently, the variables that affect the performance of procedural assembly tasks are not fully known. Existing measures of assembly task complexity have largely investigated the physical attributes of objects that influence the difficulty of their assembly. A relationship between these task variables and assembly difficulty could be shown [12]. The results were embedded into regression models as tools to evaluate the difficulty of assemblies or of assembly steps pre-defined by instructions. These task variables could also be applied to produce guidelines that ensure assemblies are manageable. However, the studies focused on individual and isolated assembly processes.

An assembly complexity factor by Hinckley [13] relates the number of assembly operations and the assembly times to defects. It has been extended to develop new methodologies for predicting assembly defect levels based on the complexity of individual process steps [14, 15]. Several metrics to measure sources of system complexity based on relationships between system components (number of flow paths, travel distance, etc.) and system elements (number of components, setup time, cycle time, reliability, etc.) were introduced by Kim [16]. These methods assess elements of product and process complexity in a systematic manner, but they cannot be readily extended to other manufacturing domains. ElMaraghy and Urbanic [3] presented a general methodology to assess manufacturing complexity that can be adapted to suit any enterprise and extended it to encompass complexity at the operational level. However, the focus of this entropy-based approach is the selection of the least complex manufacturing configuration that meets the preset requirements. It is not suited for operational use within a run-time system.

Systems of predetermined times (SPT) are globally accepted methods in assembly planning. The individual systems differ in the set of included parameters and the respective level of detail. Generally, the methods are used for the temporal evaluation of assembly sequences. To this end, manual tasks and simple intellectual decisions are broken down into motion elements (e.g. reach, grasp) and mental functions (e.g. identify). Every motion element and function is assigned a standardized time, whose value depends on the respective factors, e.g. motion length, required force or placement accuracy. As such, SPT can only be applied if the tasks to be evaluated can be fully controlled by the human worker. All SPT methodologies are rooted in Segur’s motion time analysis of 1948. An overview of the historic development and the interrelationships of the different methodologies can be found in [17]. The MTM system (Methods-Time Measurement) and the WF methodology (Work-Factor) are the systems mainly used in German industry, with the former being by far the more common worldwide [18]. Consequently, this paper uses the MTM system as the factor for base execution time (see Sect. 3.2). However, adapting the approach to another system is merely a technical effort. Information about the WF methodology is available in [19].

Boothroyd and Dewhurst’s [20] method of “Design for Assembly” (DFA) puts emphasis on simplifying a product by critically challenging the necessity of each and every subassembly and part. The method presumes that reducing the number of parts and subassemblies decreases the assembly effort. This decrease is said to be independent of the complexity of the respective subassemblies. Following a temporal evaluation of the tasks, the derived solution is contrasted with a theoretically best solution. Although the main objective of the DFA method is the evaluation and enhancement of the product’s design, it also offers a means to derive assembly times. These are based on evaluating the handling and joining tasks per part with a DFA-specific system of predetermined times or on transforming time studies into target times.

The Assembly Evaluation Method (AEM), developed at Hitachi, Ltd., is similar to DFA. Here, the analysis of suitability for assembly in a team-oriented process constitutes the basis for evaluating the complexity of an assembly. An extension by General Electric Company (GE) offers the means for determining assembly execution times [21]. In this method as well, the evaluation is based on a system of predetermined times for joining tasks. In a draft, each task is assigned a characteristic number, which can be transformed into a temporal measure by means of comparative data. When operational comparative data are used, company-specific limitations and parameters can be accounted for. However, this method does not consider handling effects, the orientation of parts or commissioning tasks.

2.2 Cognitive processes in manual assembly

Understanding the cognitive processes involved in manual assembly is essential for predicting the worker’s task performance. Mental resources of humans are limited and have to be distributed and allocated to relevant task aspects. Accordingly, interactions between cognitive processes and task properties have to be taken into account for the estimation of task complexity. Nevertheless, standard systems of predetermined times only allow predictions of mental processes that consist of binary decisions [22]. Because the cognitive processes and decisions involved are more complex, especially in the assembly of high-variant and high value-added products, a detailed analysis of the relevant human cognitive processes and the respective bottlenecks is necessary.

Traditional theories of human information processing postulate that processing stages are passed through in sequential order. Figure 1 shows relevant processing stages and resource dimensions for the assembly tasks of commissioning and joining. Perceptual processing covers stimulus pre-processing, feature extraction and stimulus identification. For example, information presented in the instruction has to be localized before visual attention can be directed to the relevant details. In the commissioning phase, a relevant assembly part first has to be located, identified on the parts list and searched for in a specific storage box. In the joining phase, too, the assembly instruction has to be localized, and part location as well as orientation have to be identified. Following this, relevant actions or responses have to be selected internally and then executed. Regarding action execution, objective task parameters such as the size and distance of the object to be grasped can help to estimate movement and grasp times. In common systems of predetermined times, each task unit is computed largely independently of the preceding and following subtasks and on the basis of physical task properties.

Fig. 1 Relevant processing stages and resources in manual assembly tasks

Nevertheless, some task demands cannot be measured and evaluated easily, as they result from the interplay of several subtasks. Although humans are quite successful in performing multiple actions at the same time, there exist limitations to respective control processes. In the multiple-resource theory of Wickens [23] it is proposed that mental resources can be described along the four dimensions:

  • Perceptual processing (visual–auditory),

  • Processing codes (spatial–verbal),

  • Processing stages (perception–central processing–responding) and

  • Response modalities (manual–verbal).

These resources have specific capacity limitations leading to a decreased task performance in case two tasks need the same resource. Accordingly, analyzing task aspects that might interfere regarding a certain resource enables the prediction of multiple-task performance and mental workload. In conclusion, it seems fruitful to integrate the investigation and analysis of human cognitive processes into the prediction of assembly complexity.
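The idea that two subtasks interfere when they draw on the same resource can be illustrated with a small sketch. It is not Wickens' computational model: the `ResourceProfile` type, the level labels and the equal weighting of the four dimensions are assumptions made only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# The four dimensions named in the text; the attribute names are shorthand
# chosen for this sketch.
DIMENSIONS = ("modality", "code", "stage", "response")


@dataclass(frozen=True)
class ResourceProfile:
    """Level a subtask occupies on each dimension (None = dimension unused)."""
    modality: Optional[str] = None   # "visual" or "auditory"
    code: Optional[str] = None       # "spatial" or "verbal"
    stage: Optional[str] = None      # "perception", "central" or "responding"
    response: Optional[str] = None   # "manual" or "verbal"


def shared_dimensions(a: ResourceProfile, b: ResourceProfile) -> List[str]:
    """Return the dimensions on which both subtasks demand the same level."""
    return [
        dim for dim in DIMENSIONS
        if getattr(a, dim) is not None and getattr(a, dim) == getattr(b, dim)
    ]


def interference_score(a: ResourceProfile, b: ResourceProfile) -> float:
    """Share of dimensions in conflict, from 0.0 (none) to 1.0 (all four).

    Equal weighting of the dimensions is an assumption of this sketch.
    """
    return len(shared_dimensions(a, b)) / len(DIMENSIONS)


# Hypothetical subtasks from an assembly workplace.
read_instruction = ResourceProfile("visual", "spatial", "perception")
search_part_box = ResourceProfile("visual", "spatial", "perception", "manual")
hear_spoken_hint = ResourceProfile("auditory", "verbal", "perception")

print(interference_score(read_instruction, search_part_box))
print(interference_score(read_instruction, hear_spoken_hint))
```

In this toy example, reading the instruction competes with searching the part box on three dimensions, but with a spoken hint only on the processing stage, which matches the qualitative prediction that separating tasks across resources reduces interference.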

3 Concept of a multi-dimensional measure of task complexity

3.1 Overview and structure

The necessity for a multi-dimensional measure was deduced in Sect. 2. It is based upon a critical analysis and evaluation of existing one-dimensional measures of assembly task complexity and the preceding neuro-psychological considerations. Consequently, it is suggested that the exposure of the human worker resulting from a certain task ($\mathrm{task}_i$) be based upon the following factors of influence:

  • Temporal factor $d_t(\mathrm{task}_i)$: As a function of base task execution time, this factor corresponds to a choice of system of predetermined times.

  • Cognitive factor $d_c(\mathrm{task}_i)$: Dependent on the mental processing efforts during the assembly task, this factor results from the cognitive elements of perception, selection and action.

  • Knowledge-based factor $d_k(\mathrm{task}_i)$: Indicating a certain mid- and long-term knowledge of the workforce or individual worker, this factor adheres to commonness in product and production programs encompassing the task in question.

The above-mentioned dimensions are interrelated as depicted in Fig. 2. The resulting measure of task complexity in manual assembly processes can be expressed as a three-dimensional vector. A known system of predetermined times is applied within the temporal factor $d_t(\mathrm{task}_i)$. The cognitive factor $d_c(\mathrm{task}_i)$ integrates the temporal factor, the characteristics of the current task and the sequence of tasks in which the task is embedded. As shown in Sect. 2.1, common systems of predetermined times merely provide a measure for the average worker. Hence, the cognitive factor, being individual to each worker, can be seen as a factor that lowers or raises the effect of the temporal measure. The knowledge-based factor $d_k(\mathrm{task}_i)$ is the sum of the individual similarity measures between the task in focus ($\mathrm{task}_i$) and the product environment it stems from. The individual dimensions are elaborated on in Sects. 3.2, 3.3 and 3.4.

Fig. 2 Factors of influence of the task-induced complexity measure
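A minimal sketch of how the three factors could be held together as a vector is given below. The text does not state a formula for combining them, so the multiplicative modulation of the temporal factor by the cognitive factor, as well as all names and example values, are assumptions made for illustration only.

```python
from typing import NamedTuple, Sequence


class TaskComplexity(NamedTuple):
    """Three-dimensional complexity vector of one task."""
    temporal: float   # d_t: base execution time in seconds
    cognitive: float  # d_c: worker-specific factor, 1.0 = average worker
    knowledge: float  # d_k: summed similarity to the product environment


def task_complexity(base_time_s: float,
                    cognitive_factor: float,
                    similarities: Sequence[float]) -> TaskComplexity:
    """Assemble the vector from its three factors of influence."""
    if base_time_s < 0 or cognitive_factor <= 0:
        raise ValueError("time must be >= 0 and cognitive factor > 0")
    # d_k is described in the text as a sum of individual similarity measures.
    return TaskComplexity(base_time_s, cognitive_factor, sum(similarities))


def adjusted_time(c: TaskComplexity) -> float:
    """Temporal measure raised or lowered by the cognitive factor.

    A plain product is assumed here; the source only says the cognitive
    factor 'lowers or raises' the effect of the temporal measure.
    """
    return c.temporal * c.cognitive


# Hypothetical task: 12 s base time, worker 25 % slower than average,
# two similarity values against reference products.
c = task_complexity(12.0, 1.25, [0.5, 0.25])
print(c, adjusted_time(c))
```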

3.2 Base task execution time

The MTM system is integrated as a standard system of predetermined times for the computation of base task execution times. The system is widespread in industrial applications, as it delivers explicit values for the prediction of sensorimotor execution times. A variety of physical task parameters (e.g. distances, weight) as well as hand movements (e.g. models for grasping and joining), arm, foot and body movements and eye movements (gaze functions) are integrated. The basic action cycle describes a typical sequence of pick-and-place movements. The resulting predictions are average execution times of an exemplary worker. Moreover, cognitive processes are included in the task “visual checking” as a sequence of binary decisions. The respective base execution times delivered by the MTM system can be used if no further information is available.
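The way a base execution time results from a sequence of motion elements can be sketched as follows. MTM expresses times in time measurement units (1 TMU = 0.036 s). The per-element values in the table are placeholders chosen for this example and are not taken from the MTM data cards, which depend on distance, case and part properties.

```python
from typing import Mapping, Sequence

TMU_SECONDS = 0.036  # 1 TMU = 1/100000 h = 0.036 s

# Placeholder values for illustration only; real values must be looked up
# in the MTM data cards for the concrete distance, case and part.
ILLUSTRATIVE_TMU: Mapping[str, float] = {
    "reach": 10.0,
    "grasp": 2.0,
    "move": 12.0,
    "position": 6.0,
    "release": 2.0,
}


def base_execution_time(elements: Sequence[str],
                        table: Mapping[str, float] = ILLUSTRATIVE_TMU) -> float:
    """Sum the TMU values of a motion sequence and convert them to seconds."""
    total_tmu = sum(table[element] for element in elements)
    return total_tmu * TMU_SECONDS


# Basic pick-and-place cycle as a sequence of motion elements.
cycle = ["reach", "grasp", "move", "position", "release"]
print(f"{base_execution_time(cycle):.3f} s")
```

An unknown element name raises a `KeyError`, which keeps a missing table entry from silently contributing zero time.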

3.3 Task processing measure

Manual assembly of high variant products is not solely determined by manual complexity and physical constraints but also to a large extent by the mental processing of task information. Accordingly, an analysis of task requirements on processing stages like attention allocation and response selection can deliver important information on mental workload and resulting task performance (e.g. search times, completion times and movement parameters).

The multiple-resource theory of Wickens (see Sect. 2.2) can be applied to the assembly situation with commissioning and joining tasks. Here, different spatial areas have to be monitored (e.g. instruction area, parts area, work piece) and different manual actions have to be performed with both hands at the same time [24]. Performance declines may be due to interference stemming from the simultaneous execution of several subtasks or from cross-over effects of previous activities [25]. For example, interference might occur when workers have to switch between different assembly parts or part properties (small/large objects, varying dimensions). Accordingly, task sequences during the assembly process have to be taken into account. In dual-task situations, the parameters similarity, practice and task difficulty determine task performance. With respect to task similarity, the multiple-resource theory predicts dual-task performance by estimating the mental workload depending on the availability of certain resource dimensions. In more detail, the resource allocation theory distinguishes between resource-limited and data-limited processes within a task [26]. Whereas performance in data-limited tasks cannot be improved by further resource allocation, performance in resource-limited tasks is determined by the amount of allocated resources. This relationship is described by the performance-resource function for different levels of task difficulty or practice (for an overview see [27]).
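The distinction between resource-limited and data-limited processing can be made concrete with a simple performance-resource function. The piecewise-linear shape and the parameter names `gain` and `data_limit` are assumptions of this sketch; empirical functions are generally curved and have to be estimated per task.

```python
def performance(resources: float, gain: float, data_limit: float) -> float:
    """Performance for a given share of allocated resources (0.0 to 1.0).

    Performance grows with the allocated resources (resource-limited region)
    until it reaches the plateau set by the quality of the available data
    (data-limited region).
    """
    if not 0.0 <= resources <= 1.0:
        raise ValueError("resources must lie between 0.0 and 1.0")
    return min(gain * resources, data_limit)


def is_resource_limited(resources: float, gain: float, data_limit: float) -> bool:
    """True while allocating more resources would still raise performance."""
    return gain * resources < data_limit


# Hypothetical task: an easier or better-practised task would have a higher gain.
for share in (0.25, 0.5, 0.75):
    print(share, performance(share, gain=2.0, data_limit=0.8),
          is_resource_limited(share, gain=2.0, data_limit=0.8))
```

In this example, raising the resource share from 0.25 to 0.5 improves performance, whereas the step from 0.5 to 0.75 does not, because the task has become data-limited.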

On the basis of such an analysis, the worker can be optimally supported during the assembly process: task-relevant information has to be presented at the right time, at the right place and with content adapted to task complexity. Optimal guidance of attention allocation can help to support resource-limited processes. By directing selective visual attention to task-relevant aspects and reducing the number of necessary attention shifts, perceptual processes and action execution can be optimized as well (see Sect. 4.2). Therefore, integrating adaptive display techniques for instruction presentation can facilitate and accelerate the assembly process.

One objective of the investigation and analysis of mental processes during manual assembly is to refine and modulate predictions for execution times based on standard methods of predetermined times. The predicted task execution times will be compared online with the task performance of the worker. Results can be used in order to adapt assembly steps and instruction content.
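The online comparison of predicted and observed execution times could drive the adaptation of assembly steps roughly as sketched below. The thresholds, the step-length bounds and the rule of changing the step length by one are hypothetical choices for this illustration and are not taken from the described system.

```python
def performance_ratio(observed_s: float, predicted_s: float) -> float:
    """Observed divided by predicted execution time (1.0 = as predicted)."""
    if predicted_s <= 0:
        raise ValueError("predicted time must be positive")
    return observed_s / predicted_s


def adapt_step_length(current: int, ratio: float,
                      slow: float = 1.2, fast: float = 0.9,
                      min_len: int = 1, max_len: int = 7) -> int:
    """Return the number of primitives to show in the next instruction.

    A worker who is clearly slower than predicted gets shorter, more detailed
    steps; a worker who is clearly faster gets longer steps. All threshold
    and bound values are assumptions of this sketch.
    """
    if ratio > slow:
        return max(min_len, current - 1)
    if ratio < fast:
        return min(max_len, current + 1)
    return current


# Hypothetical observation: 15 s needed for a step predicted at 10 s.
print(adapt_step_length(3, performance_ratio(15.0, 10.0)))
```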

3.4 Product reference measure

Similarity and commonness within the product and production program result in a certain mid- and long-term knowledge of the workforce regarding assembly. Here, the comparison of a task (or set of tasks) of the product in focus with the tasks of the overall product range of the production environment (i.e. limited to a single manufacturing company) forms the basis of a further dimension of task complexity. This commonness can be measured at varying levels of detail (see Fig. 3).

Fig. 3 Factors of influence of the task-induced complexity measure

The levels of similarity are based upon the available representational forms of the product state. Different representations for product states have been evaluated. They mainly differ in the level of detail they provide.

The 3D CAD model representation captures the geometry and nature of an assembly in a three-dimensional view and is used in recent approaches to assembly planning based on virtual reality techniques. Examples may be found in [28–30]. 3D CAD representations provide a detailed and demonstrative view of the assembly.

In the algorithmic generation of assembly sequences, relational model representations and derivations thereof are common. Their structure builds on entities, which are put in relation to each other. They may also be attributed in order to provide information about the assembly and its context. Relations can be modeled using relational graphs. These provide basic information about the composition of assemblies depending on the attributes and further features such as shape, coordinates etc. However, this representation does not provide a graphical view of the assembly; therefore it is difficult for the user to intuitively understand the degree of similarity [31–33].

The binary vector representation provides information about the established connections in a defined state of the assembly. Only the state of the surface contacts is known. It is possible to derive information about the allocation of parts and subassemblies, but no statement can be made about their exact alignment and orientation. Consequently, this basic representation is only used for the representation of assembly sequences (e.g. for the directed graph representation). If detailed information about coordinates, shape, type of attachments etc. is needed, this kind of representation cannot be applied, and it offers little insight into similarities [34, 35].
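How the allocation of parts to subassemblies can be derived from a binary vector of established connections is sketched below. The encoding (one bit per possible connection, in a fixed order) and the example product are assumptions made for illustration; the grouping uses a standard union-find structure.

```python
from typing import Dict, List, Sequence, Tuple


def subassemblies(parts: Sequence[str],
                  connections: Sequence[Tuple[str, str]],
                  state: Sequence[int]) -> List[List[str]]:
    """Group parts that are joined by established connections.

    `connections` lists every possible connection of the product and
    `state` holds one bit per connection (1 = established).
    """
    if len(connections) != len(state):
        raise ValueError("state needs exactly one bit per connection")
    parent: Dict[str, str] = {part: part for part in parts}

    def find(part: str) -> str:
        # Follow parent links to the representative, halving the path.
        while parent[part] != part:
            parent[part] = parent[parent[part]]
            part = parent[part]
        return part

    for (a, b), established in zip(connections, state):
        if established:
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for part in parts:
        groups.setdefault(find(part), []).append(part)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical product with four parts and three possible connections.
parts = ["base", "axle", "wheel", "cap"]
connections = [("base", "axle"), ("axle", "wheel"), ("wheel", "cap")]
print(subassemblies(parts, connections, [1, 1, 0]))
```

The result tells which parts belong together, but, as noted above, nothing about their alignment or orientation.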

Similar to the binary vector representation, the set-of-parts representation can be used to tell whether two or more parts are in contact. However, it cannot state which surfaces are in contact or provide the coordinates of contacts etc. Like the previous one, this kind of representation is mainly used for the representation of assembly sequences where detailed information is not essential [31].

If available, the aforementioned 3D CAD model representation is to be preferred. Comparison of three-dimensional data has been an area of research in disciplines such as computer vision, mechanical engineering, artifact searching, molecular biology and chemistry. These provide the elements for 3D search algorithms in databases in product design, enabling an efficient search for similar, already existing parts in the production program [36]. There are two different techniques for comparing such 3D objects:

  • Feature-based techniques: engineering features (e.g. machining features, form features) are extracted from a solid model of a mechanical part.

  • Shape-based techniques: transformation invariant attributes can be extracted from the polygon mesh in order to find similarity among 3D models [37].
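One example of a transformation-invariant attribute is a histogram of pairwise point distances, in the spirit of shape distributions. The sketch below uses the mesh vertices directly instead of points sampled on the surface, and the bin count and the L1 comparison are arbitrary choices; it is meant as an illustration of the principle and not as the technique used in [37].

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def distance_histogram(vertices: Sequence[Point], bins: int = 8) -> List[float]:
    """Normalised histogram of pairwise vertex distances.

    Distances do not change under rotation and translation; dividing by the
    largest distance additionally removes the influence of uniform scaling.
    """
    dists = [math.dist(p, q) for p, q in combinations(vertices, 2)]
    if not dists or max(dists) == 0:
        raise ValueError("need at least two distinct vertices")
    longest = max(dists)
    hist = [0] * bins
    for d in dists:
        # The longest distance falls into the last bin.
        hist[min(int(d / longest * bins), bins - 1)] += 1
    return [count / len(dists) for count in hist]


def histogram_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """L1 distance between two histograms (0.0 = identical)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


cube = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
# Same cube rotated by 90 degrees about the z axis and then translated.
moved = [(-y + 5, x - 2, z + 3) for x, y, z in cube]
# Box stretched along z: a different shape.
box = [(x, y, 3 * z) for x, y, z in cube]

print(histogram_distance(distance_histogram(cube), distance_histogram(moved)))
print(histogram_distance(distance_histogram(cube), distance_histogram(box)))
```

Such a global descriptor is cheap to compare, but, like the other non-graph-based attributes, it cannot tell whether a particular local detail is present.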

The approaches can be classified into non-graph-based techniques and graph-based techniques. The non-graph-based techniques can further be distinguished into:

  • Global feature-based

  • Manufacturing feature recognition-based

  • Histogram-based

  • Product information-based and

  • 3D object recognition-based

None of the above provides local support, i.e. the matching of local geometry needed to determine the presence of single details. Graph-based techniques, in contrast, provide local support (except for Reeb graphs). A disadvantage is that the comparison tends to be computationally expensive for complex parts (large graphs) and large databases [36].

Based upon the varying state representations, a distance between two given products was developed. It is measured by an indication of the amount of work to be performed in order to transform one state into the other (knowledge-based factor $d_k(\mathrm{task}_i)$).
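For the binary vector representation, one simple stand-in for such a distance is the number of connections that have to be established or released, i.e. the Hamming distance between two states. The text does not specify the actual distance or its weighting, so the unweighted count, the normalisation into a similarity and the function names below are assumptions of this sketch.

```python
from typing import Sequence


def transformation_distance(state_a: Sequence[int], state_b: Sequence[int]) -> int:
    """Number of connections to establish or release to turn one state into the other."""
    if len(state_a) != len(state_b):
        raise ValueError("states must describe the same set of connections")
    return sum(a != b for a, b in zip(state_a, state_b))


def similarity(state_a: Sequence[int], state_b: Sequence[int]) -> float:
    """Similarity between 0.0 (every connection differs) and 1.0 (identical)."""
    if not state_a:
        raise ValueError("states must not be empty")
    return 1.0 - transformation_distance(state_a, state_b) / len(state_a)


def knowledge_factor(task_state: Sequence[int],
                     reference_states: Sequence[Sequence[int]]) -> float:
    """Sum of the similarities between a task state and the reference states."""
    return sum(similarity(task_state, ref) for ref in reference_states)


# Hypothetical task state and three reference states from the product range.
task = [1, 1, 0, 0]
references = [[1, 1, 0, 0], [1, 0, 0, 1], [0, 0, 1, 1]]
print(knowledge_factor(task, references))
```

A weighted variant, in which each connection contributes its own assembly effort instead of a count of one, would follow the same structure.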

4 Validation

4.1 Experimental setup

The experimental setup shown in Fig. 4 consists of a workbench complemented by a motion capture system and an eye tracker for the registration of hand, arm and eye movements. Moreover, two FireWire cameras for online analysis and a DV camera for offline analysis of task performance are integrated. A projector is placed on top of the workbench, which, in combination with a front-surface mirror, enables the display of assembly instructions directly on relevant spatial locations of the work area.

Fig. 4 Display of assembly instructions at the experimental workplace (© CoTeSys/Kurt Fuchs)

The assembly task consisted of the commissioning and joining of a varying number of parts. Between two and seven parts per assembly step had to be found, selected and grasped during the commissioning phase. In the joining phase, the previously selected parts had to be assembled according to detailed picture instructions. Joining included assembly primitives of different complexities such as positioning and orienting, i.e. two- as well as three-dimensional operations.

The experimental comparison of task performance with different communication modes allows possibilities for optimizing mental resource allocation to be evaluated. So far, the selection and grasping phase of the commissioning task has been supported by highlighting relevant part boxes (contact analog condition), by projecting schematic views of the boxes onto the workspace close to the boxes (projection condition; dots at the relevant boxes) or by displaying schematic views on the more distant monitor (monitor condition). The setup allows real-time adaptation of instruction content according to the worker’s skills and mental workload [24]. Furthermore, task difficulty, task similarity and task sequences are varied in order to fill a database for refining the prediction of execution times based on standard methods of predetermined times.

4.2 Results

Experimental results demonstrated an influence of task difficulty and communication mode on commissioning as well as on joining tasks.

In the contact analog and projection conditions, movement onset times for the first hand movement during commissioning were shorter, and peak velocity as well as acceleration were higher, than in the monitor condition. Therefore, it seems that highlighting the relevant boxes facilitates attention shifts to the relevant box. The projection of schematic box views close to the boxes also seems to improve attentional selection, because the part box positions can be compared more easily. To conclude, guidance of visual attention modulates performance times in the commissioning phase [38].

In the joining phase, different classes of assembly primitives showed substantial differences in assembly times in accordance with common methods of predetermined times: complex operations (e.g. orienting in two versus all rotational axes) took more time per part. The projection of instructions close to the work piece led to shorter completion times if difficult spatial relationships had to be considered during the joining operation. With simple operations like sticking, no differences between communication modes could be observed. In summary, complex assembly primitives benefited more from support by AR-based communication modes than easier ones. These results demonstrate a statistical interaction between task difficulty and attentional guidance on task performance [24, 39].

To summarize, an interplay between mental processes and action execution has been demonstrated. The observed differences in task performance can be interpreted as an indicator of task complexity and constitute an important factor for the prediction of performance times in the commissioning and joining phases.

5 Conclusion

A three-dimensional measure of assembly task complexity was presented. It supports the provision of efficient and ergonomically feasible guidance through an accurate and detailed technique for adjusting the instructional content given to the human worker. In this context, the factors for measuring the degree of detail and complexity of a displayed task were outlined. The measure extends the concept and application of common systems of predetermined times (e.g. MTM) by including dimensions of human performance, mental workload and attention allocation. Moreover, effects based on the product and its reference levels are incorporated. The dimensions can be classified as temporal, cognitive and knowledge-based.

The presented measure is put to use in a system providing guidance in complex manual assembly scenarios. Because the system uses not only temporal (e.g. MTM) but also cognitive and knowledge-based elements, it allows for a well-adjusted display of instructions at any point in the assembly process. Hence, the fundamental axiom of occupational physiology, not to overburden workers with an abundance of information, is satisfied. At the same time, time-consuming search for and localization of instructional elements can be avoided.