1 Introduction

Modern industry tends to automate industrial processes to a wide extent in order to optimise the mental workload imposed on operators. However, industry still includes many processes where automation does not apply. This is especially notable in assembly tasks and in processes where the costs of automation are generally not justifiable [1]. Manual assembly work is often repetitive and monotonous and, as such, it carries low mental workload (MWL). Importantly, mental underload can be as dangerous as overload [2, 3], because the probability of error occurrence and MWL exposure are related according to a U-shaped curve [4], i.e. extremely high or low MWL increases the probability of error occurrence, while the optimal MWL leads to the smallest probability of error occurrence. Therefore, there is an increasing need for a methodology for objective assessment of the influence of MWL on human operators in both automated and manual processes. Human factors and ergonomics (HF/E) is the scientific discipline that investigates the interaction between systems and human operators [5,6,7]. The classical ergonomics approach to studying human cognitive state and the interaction between humans and operating systems mainly utilises qualitative and subjective methods, such as questionnaires and measurements of overt performance [8]. However, these methods are often unreliable and unable to investigate the covert cognitive processes of workers during their everyday routine in industrial environments [8]. For that reason, neuroergonomics emerged as a novel path in ergonomics research [8, 9]. Neuroergonomics merges knowledge from ergonomics and neuroscience, and it is defined as the scientific discipline that studies the human brain in relation to work [10].

One of the most powerful neuroimaging methods in neuroergonomics research is electroencephalography (EEG), since wireless EEG is capable of directly recording electrical brain activity in real-world settings [11]. A commonly employed EEG method for evaluating cognitive state is the extraction and investigation of event related potentials (ERPs). ERPs are defined as voltage fluctuations in the continuous EEG signal that are associated in time with certain physical or mental occurrences [12]. ERP components are usually defined in terms of polarity and latency with respect to a discrete stimulus, and have been found to reflect a number of distinct perceptual, cognitive and motor processes. In that sense, the so-called P300 component is a positive voltage deflection appearing around 300 ms after the stimulus presentation [13]. The P300 component is often used to identify the depth of cognitive information processing and is not influenced by the physical attributes of the stimuli [13]. For these reasons, the P300 ERP component is assumed to reflect the attention level of the person [14] and its amplitude is modulated by mental workload [15].

It is important to stress that the goal of neuroergonomics is not only to investigate the brain’s functions, but also to put them in the context of human behavior in everyday environments [9]. As such, it is important to investigate the neural basis of physical performance, e.g. body movements and reaction times (RTs). Traditionally, RTs were used to estimate the cognitive state of the person. The main reason behind the wide use of RT measurements is that they are easy to measure and simple to interpret [16]. RT represents the time interval from the indicated start of a work process or operation (stimulation) until the moment of action initiation. However, as pointed out in [17], reaction-time experiments usually consist of a stimulus followed by a response, without any direct possibility of observing the mental processing that occurs in between.

Physical performance measurements are ubiquitous in ergonomics studies, mainly in the domain of physical ergonomics. These became even more prominent with the rapid development of motion capture (MoCap) sensors that are nowadays affordable and unobtrusive. The majority of research on operators’ motion concerns posture estimation or action recognition [18], while significantly fewer studies are oriented towards linking cognitive processes to motor actions. One study that investigated the relationship between gestures and the cognitive state of the person showed that during a task that carries less mental workload, the quantity of task unrelated movement increases [19]. That study investigated behavioral activity offline and indirectly, since the participants were recorded with a video camera and manual analysis was subsequently performed by replaying the video [19]. Advances in computer vision technology (namely structured light technology) nowadays allow for automated analysis. This enabled us to develop and use a simple behavioral model based on movement energy (ME; [20]). Ultimately, the combination of brain dynamics and behavioral modalities can open a deeper understanding of the influence of mental workload on human mental states during complex work activities [21].

The aim of the present study is to investigate how changes in mental workload during a simulated industrial manual assembly task influence the P300 ERP component’s amplitude, as well as the behavioral modalities of RTs and ME. We investigated the influence of the task duration on these modalities, where the expectation was that the ME and RTs should show an increasing trend, while the P300 component’s amplitude should decrease as the task progresses. Additionally, we investigated whether the changes in mental workload modulate the P300 component’s amplitude.

2 Methods

2.1 Participants

Ten subjects, aged between 19 and 21 years, volunteered as participants in the study. Participants were instructed not to drink any alcohol on the day before and the day of their participation in the study, and not to drink coffee for at least three hours prior to their participation. All participants had normal or corrected-to-normal vision. They agreed to participate and signed the informed consent after reading the experiment summary. The study was approved by the Ethical Committee of the University of Kragujevac.

2.2 Replicated Workplace

Reliable EEG recording still relies on wet electrodes, so on-site industrial EEG recording remains a big challenge, since it may cause discomfort to the workers on the industrial floor. For that reason, we replicated a workplace (Fig. 1) in the building of the Faculty of Engineering (University of Kragujevac) and simulated the production process of the rubber hoses used in hydraulic brake systems in the automotive industry. Once the replicated workplace was created, the participants in the study were equipped with the wearable EEG. Participants’ movements were recorded using a Kinect sensor, which was placed in front of and above the participants. A foot switch was used to record the RTs, as will be explained in Sect. 2.7. The sensor placement is presented in Fig. 1.

Fig. 1. Replicated workplace and sensor placement

2.3 Sensors Used in the Study and Multimodal Synchronization

EEG data were recorded with the SMARTING (mBrainTrain, Serbia) wireless EEG system. The small and lightweight EEG amplifier (85 × 51 × 12 mm, 60 g) was tightly connected to a 24-channel electrode cap (Easycap, Germany). The communication between the SMARTING and the recording computer was established through a Bluetooth connection. The electrode cap contained sintered Ag/AgCl electrodes that were placed based on the international 10–20 System. The experimental procedure imposed that the electrode impedances must be set below 5 kΩ, which was confirmed by the device acquisition software.

To investigate the body movements, we used the Microsoft Kinect sensor. Kinect has a sampling frequency of 30 frames per second (fps) and it is capable of representing the human body as a stick figure, where the most prominent human body joints (e.g. shoulder, elbow) are represented by key-points. For this study, we used a 10 key-point seated model, since in the experimental setup the replicated machine occluded the lower part of the participants’ body.

To synchronize the data coming from the different sensors described above, we used the Lab Streaming Layer (LSL) framework (https://github.com/sccn/labstreaminglayer). As explained in [21], LSL is a real-time data collection and distribution system that allows multiple continuous data streams as well as discrete marker timestamps to be acquired simultaneously by Lab Recorder, in the eXtensible Data Format (XDF). This data collection method provides synchronous, precise recording of multi-channel, multi-stream data that are heterogeneous in both type and sampling rate [21], and all of the sensors communicate with each other over a local area network (LAN).
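Once all streams share a common clock, data from streams with different sampling rates can be cut around the same marker by timestamp rather than by sample index. The following is a minimal sketch of that idea and not the LSL API itself; the function name `window_indices`, the 500 Hz rate of the EEG-like stream, and the marker value are assumptions made for illustration only.

```python
from bisect import bisect_left


def window_indices(timestamps, start, end):
    """Return (lo, hi) such that timestamps[lo:hi] lie in [start, end).

    `timestamps` must be sorted in ascending order (seconds on a shared clock).
    """
    return bisect_left(timestamps, start), bisect_left(timestamps, end)


# Hypothetical streams on a shared clock (seconds).
eeg_t = [i / 500 for i in range(1500)]    # EEG-like stream, assumed 500 Hz
kinect_t = [i / 30 for i in range(90)]    # Kinect-like stream, 30 fps
marker = 1.0                              # hypothetical stimulus marker timestamp

# Cut both streams around the same marker, from -200 ms to +800 ms.
eeg_lo, eeg_hi = window_indices(eeg_t, marker - 0.2, marker + 0.8)
kin_lo, kin_hi = window_indices(kinect_t, marker - 0.2, marker + 0.8)

eeg_segment = eeg_t[eeg_lo:eeg_hi]
kinect_segment = kinect_t[kin_lo:kin_hi]
```

Selecting by timestamp keeps the two segments aligned in time even though they contain very different numbers of samples.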

For running the experimental tasks (explained in detail in Sect. 2.4) we used the SNAP environment. SNAP allows relatively simple, script-level development of complex, interactive experimental paradigms and it can retrieve signals from various input devices. This feature was used to attach the foot switch through a USB port to the recording computer, with the aim of extracting the behavioral modality of RTs.

The overall system architecture for synchronous recording of all described streams is graphically depicted in Fig. 2.

Fig. 2. Overall system architecture

2.4 Experimental Task

Simulated Assembly Task

In the production process, an operator carries out the crimping operation in order to join a metal extension to a rubber hose. This single operation is carried out in a sitting position. The simulated operation consists of eight simple steps (actions) that can be summarised as follows (Fig. 3): first, the information to initiate the simulated assembly operation is presented to the participant in the form of a visual stimulus (step 1), upon which the participant is instructed to immediately initiate the operation by taking the metal part (step 2) and the rubber hose (step 3). Following this, the participant places the metal part on the hose (step 4) and places both inside the crimping machine (step 5). The participant then promptly presses the pedal, which starts the improvised machine and replicates the real machine’s crimping sound (step 6). Upon completion of the simulated crimping process, the participant removes the component and places it in the box with completed parts (step 7). Finally, the participant sits still and waits for the subsequent stimulus (step 8).

Fig. 3. Graphical presentation of the step-by-step simulated crimping operation (Color figure online)

Experimental Procedure

The experimental procedure was similar for all the experiments and is described in detail in [11]. The participants were subjected to the modified sustained attention to response task (SART) and the Arrows task, simultaneously with the simulated assembly task. The order of the tasks was balanced across the participants. Each task lasted around one and a half hours, after which the participants had a 15 min break before starting the second task. Both tasks were presented on a 24” screen from a distance of approximately 100 cm. Upon presentation of the stimuli on the screen, the participants were instructed to complete the previously explained assembly operation.

As explained in [11], the original SART paradigm consists of consecutively presenting digits from ‘1’ to ‘9’, and participants are required to give a speeded response to all stimuli, with the exception of the digit ‘3’. The main difference between the original and the modified SART paradigm is that the digit sequence is randomized, with the conditions that two consecutive digits ‘3’ (‘no-go’ stimulus) may not appear and that at least two ‘go’ conditions must appear between two ‘no-go’ conditions. The participants in the study were instructed to initiate the assembly operation as soon as the digit appeared on the screen, with whichever hand they felt more comfortable (the choice of hand for the previously explained steps 2 and 3, presented in Fig. 3, was free).

The Arrows task was presented and explained in [11]. The Arrows task is a choice reaction ‘go/no-go’ task, where arrows pointing to the left and right appear on the screen; the white arrows represent the ‘go’ (target) condition, whereas the red arrows represent the ‘no-go’ stimulus. Similarly to the SART task, the stimulus sequence in Arrows was randomised with the condition that forbade two consecutive appearances of the ‘no-go’ stimuli. In contrast to SART, the participants were required to initiate the action with the hand corresponding to the direction in which the white arrow on the screen was pointing, i.e. in the Arrows task the participants should initiate the action with the right hand (step 2) if the white arrow is pointing to the right, or with the left hand (step 3) if it is pointing left. Regardless of the task, all the stimuli were presented for 1000 ms on a black screen background. Each task consisted of 500 stimuli, where the probability of the appearance of the ‘no-go’ stimuli was set to 10% (50 in total), while the ‘go’ stimuli were presented 450 times.
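The sequence constraints described above (500 stimuli, 50 ‘no-go’, no two consecutive ‘no-go’ stimuli, and for the modified SART at least two ‘go’ stimuli between ‘no-go’ stimuli) can be illustrated with a short sketch. This is not the script used in the study; the function name `make_sequence`, the labels, and the way the free ‘go’ trials are distributed over the gaps are assumptions chosen for simplicity.

```python
import random


def make_sequence(n_trials=500, n_nogo=50, min_go_between=1, rng=None):
    """Build a random go/no-go sequence.

    At least `min_go_between` go trials separate any two no-go trials
    (1 forbids consecutive no-go trials; 2 matches the modified SART rule).
    """
    rng = rng or random.Random()
    n_go = n_trials - n_nogo
    # Go trials that must sit between consecutive no-go trials.
    required = max(n_nogo - 1, 0) * min_go_between
    free = n_go - required
    if free < 0:
        raise ValueError("constraints cannot be satisfied")
    # Scatter the remaining go trials over the n_nogo + 1 gaps
    # (before the first no-go, between no-go trials, after the last one).
    extra = [0] * (n_nogo + 1)
    for _ in range(free):
        extra[rng.randrange(n_nogo + 1)] += 1
    seq = ["go"] * extra[0]
    for i in range(1, n_nogo + 1):
        seq.append("nogo")
        gap = extra[i] + (min_go_between if i < n_nogo else 0)
        seq.extend(["go"] * gap)
    return seq


sart_sequence = make_sequence(min_go_between=2, rng=random.Random(0))
arrows_sequence = make_sequence(min_go_between=1, rng=random.Random(1))
```

Building the sequence from gaps guarantees the constraints by construction, so no rejection and retry loop is needed.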

2.5 EEG Processing

EEG signal processing was performed offline using EEGLAB [20] and MATLAB (Mathworks Inc., Natick, MA, USA). EEG data were first bandpass filtered in the 1–35 Hz range, followed by re-referencing to the average of the TP9 and TP10 channels. Further, an extended Infomax Independent Component Analysis (ICA) was used to semi-automatically attenuate contributions from eye blinks [21]. After the data preprocessing, ERP epochs were extracted from −200 to 800 ms with respect to the timestamps of the “go” and “no-go” stimuli. The baseline was corrected by subtracting the mean value of the period from −200 to 0 ms relative to stimulus onset. The electrode sites of interest for the ERP analysis in this study were Fz, Cz, CPz and Pz, as the P300 component is most prominent over the central and parieto-central scalp locations [14]. The P300 amplitude was calculated for both the “go” and “no-go” conditions and for each experimental condition, using the mean amplitude measure [12] in the time window from 350 to 450 ms after stimulus onset.

2.6 Movement Energy (ME) Calculation

During the simulated assembly operation, the upper-body movement of participants was recorded with the Kinect. The Kinect was placed above and in front of the participants (as shown in Fig. 1). The motion data were acquired in the form of a stick figure, using the 10 key-point seated model whose key-points represent the joints of the upper body.

Automatic quantification of the task unrelated ME was based on the kinetic energy of the key-points. The motion data were extracted and analyzed in the period between the operator’s completion of each operation and the subsequent stimulus presented to the participant (step 8, Fig. 3). In that period, the participants had no prescribed activity and the expectation was that they would remain relatively still. Further, the kinetic energy of movement was calculated for each simulated operation and for each of the key-points in all three axes (as explained in [20]). Finally, the ME for each trial was calculated as the summation of kinetic energies in all three axes.
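A minimal sketch of such a kinetic-energy measure is given below. The exact formulation is the one in [20]; here it is assumed that velocity is approximated by finite differences between consecutive Kinect frames and that every key-point has unit mass, so `mass`, the function name, and the data layout are illustrative assumptions.

```python
def movement_energy(frames, fps=30.0, mass=1.0):
    """Sum of kinetic energies over key-points, axes and frame intervals.

    frames : list of frames; each frame is a list of (x, y, z) key-point
             positions, with the same key-point order in every frame
    fps    : frame rate of the sensor (30 fps for the Kinect)
    mass   : assumed unit mass per key-point (illustrative assumption)
    """
    dt = 1.0 / fps
    total = 0.0
    for prev, curr in zip(frames, frames[1:]):
        for p, c in zip(prev, curr):
            for axis in range(3):
                v = (c[axis] - p[axis]) / dt   # finite-difference velocity
                total += 0.5 * mass * v * v    # kinetic energy on this axis
    return total
```

For a participant who sits perfectly still during step 8, the positions do not change between frames and the function returns zero; any fidgeting increases the value.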

2.7 Reaction Times

The experimental design did not allow subjects to respond with a button press upon seeing the visual ‘go’ stimulus, thus the reaction time (RT) could not be measured in the traditional way (as the time elapsed between the stimulus presentation and the speeded response by the participant). For that reason, RTs in our study were defined as the time elapsed between the stimulus presentation (step 1) and the foot switch press (step 6 in Fig. 3). RTs were therefore calculated as the difference between the timestamp of the initiation of the simulated operation and the timestamp of the start of the simulated crimping process.
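With synchronized marker timestamps, this definition of RT reduces to pairing each ‘go’ stimulus with the next foot switch press. The sketch below assumes a time-sorted list of markers with the hypothetical labels `"go"`, `"nogo"` and `"pedal"`; the actual marker names used in the recordings are not given in the text.

```python
def reaction_times(markers):
    """Compute RTs as pedal-press time minus the preceding go-stimulus time.

    markers : time-sorted list of (timestamp_in_seconds, label) tuples,
              with hypothetical labels "go", "nogo" and "pedal"
    """
    rts = []
    pending = None                     # onset of the go stimulus awaiting a press
    for t, label in markers:
        if label == "go":
            pending = t
        elif label == "nogo":
            pending = None             # no response is expected after a no-go
        elif label == "pedal" and pending is not None:
            rts.append(t - pending)
            pending = None             # each stimulus is paired with one press
    return rts
```

Pedal presses that follow a ‘no-go’ stimulus are ignored here; whether such presses should instead be counted as commission errors is a separate design choice.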

2.8 Statistical Analysis

Prior to statistical analysis, we averaged our data using a 15-point and one-step moving average window, as explained in [22]. The statistical analysis was performed using IBM SPSS software. We performed Spearman correlation in order to investigate the changes in behavioral and neural features as the time of the task progressed, i.e. the general trend of P300 amplitude, RTs and ME.
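The smoothing and trend analysis can be sketched as follows. The study used IBM SPSS; this pure-Python version is only illustrative and assumes that the trend is obtained by correlating the smoothed values with their order in the task. Function names are hypothetical, and tied values receive average ranks.

```python
def moving_average(values, window=15):
    """Moving average with a one-sample step; returns len(values) - window + 1 points."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def ranks(values):
    """Ranks starting at 1, with ties given the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def trend(values, window=15):
    """Correlate the smoothed series with its position in the task."""
    smoothed = moving_average(values, window)
    return spearman(list(range(len(smoothed))), smoothed)
```

A negative value of `trend` indicates that the measure decreases as the task progresses, and a positive value that it increases.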

Additionally, in order to investigate whether the mental workload modulates the P300 amplitude, we performed a paired t-test between P300 amplitude in SART and the Arrows task, on four electrode sites of interest (Fz, Cz, CPz, Pz). It is noteworthy that we compared the values of P300 amplitudes only in the “go” condition.
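For reference, the paired t statistic used in such a comparison can be computed from the per-participant differences, as in the sketch below. The input lists are placeholders for per-participant P300 amplitudes at one electrode site in the two tasks; the function name is hypothetical, and obtaining the p-value from the t distribution is left to a statistics package.

```python
from math import sqrt


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two matched samples.

    a, b : equal-length lists, one value per participant
           (e.g. amplitude in task A and in task B at one electrode)
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two matched samples with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Unbiased sample variance of the differences.
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / sqrt(var / n), n - 1
```

With ten participants, as in this study, the statistic has nine degrees of freedom.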

3 Results

The Spearman correlation results are presented in Table 1. These results revealed a general negative trend of RTs, regardless of the task and the order of task presentation. Regarding the other modalities, the ME data showed a positive trend, whereas the P300 amplitude showed a negative trend in most of the task conditions, with the exception of the case in which the Arrows task was presented as the second task, i.e. when a more demanding task followed the monotonous task.

Table 1. Spearman correlation results

The general trends of P300 amplitude at the Pz electrode location and of ME are graphically shown in Fig. 4. Regarding the P300 amplitude analysis, we found that the “go” condition elicited a higher P300 amplitude in the Arrows task than in the SART task (p < .05). The t-test results for all four electrode sites are provided in Table 2.

Fig. 4. Graphical representation of general trends of P300 amplitude from the Pz electrode site (red color) and ME (black color). (Color figure online)

Table 2. T-test results for P300 amplitude comparison between the task conditions (Arrows/SART task) and for all electrode sites under study.

4 Discussion

In this study, we investigated the influence of mental workload on the cognitive state of workers during manual assembly operations. We imposed two different levels of mental workload on the workers during the simulated manual assembly operations and observed their effect on the behavioral modalities of ME and RTs, as well as on the modulation of the P300 component’s amplitude. The P300 component’s amplitude and ME showed comparable results. From Table 1 and Fig. 4, it can be seen that the P300 amplitude decreases during the task, indicating a negative trend in the participants’ attention, while the amount of task unrelated movement (ME) increases in almost all experimental conditions (also shown in Fig. 4). These results are in line with our hypothesis that the amount of task unrelated movement should increase during the monotonous task [19], while the P300 amplitude is expected to decrease. The exception to the general trend is the experimental condition in which the task that carries higher MWL (Arrows) is performed after completion of the more monotonous (SART) task. The difference in MWL is the consequence of the choice action in the Arrows task, which does not exist in the SART task. Our results suggest that if the more monotonous task is followed by the more demanding task, the amount of task unrelated movement decreases, while at the same time there is a positive influence on the participant’s attention level, as the P300 amplitude shows an increasing trend as the task progresses. Additionally, the t-test revealed that the P300 amplitude elicited during the Arrows task was of higher magnitude than during the SART task. This can be expected, since in the Arrows task the participants are exposed to slightly higher demands in evaluating the arriving stimuli, as they do not know the direction of the arrow in advance. In contrast, the digit stimuli from the SART task carry significantly less information, which can cause the participants to stop evaluating the content of the stimuli after some time [11], i.e. in the SART task participants only need to pay attention to whether it is a “go” or “no-go” condition, while in the Arrows task they also need to pay attention to which hand they will use to initiate the operation. All these results support our hypothesis that mental workload modulates the P300 amplitude, as well as ME. On the other hand, we found that the RT results did not depend on the level of the imposed mental workload. Although we hypothesised that RTs would increase over the time course of the task, they showed a negative trend in all cases under study, i.e. the participants became faster in executing the task as the experiment progressed. This may not be surprising, since the participants in the study were students without any prior working experience in similar tasks. Therefore, the decrease of the RTs can be attributed to the effect of rehearsing [23], as the students seemed to become increasingly familiar with the simulated assembly operation.

The results from this study suggest that overt performance monitoring, as observed through RTs, may not be reliable enough, since we did not observe any difference in reaction times between the different experimental conditions. Notably, this finding is in line with one of the main premises of neuroergonomics [8]. Additionally, this study suggests that a slight increase in mental workload in a manual assembly operation, compared to an entirely repetitive and monotonous task, has a positive influence on the cognitive state of the operators. Finally, findings from this study may also be applied to job rotation strategies in factories. Job rotations in assembly lines are often proposed as a method of reducing the monotony of the task, thus keeping the workers more focused [24]. We propose that job rotations on assembly tasks should be organised in such a way as to avoid cases in which a more demanding task is followed by a task that is more monotonous in nature. However, this observation should be investigated thoroughly in future studies.

5 Conclusion

This study demonstrated how neuroergonomics methods can be successfully applied in investigating the influence of changes in mental workload on the cognitive state of workers. The monotonous task showed a decrease in the P300 component’s amplitude and an increase in ME, both indicating a decrease in the attention level of a worker as the task progresses. It is noteworthy that in the more demanding task, this result was not consistent. Furthermore, we also showed that the P300 amplitude was more prominent in the task that carried a slightly higher cognitive demand, in comparison to a highly monotonous and repetitive task. All these results suggest that wireless EEG, as well as Kinect, can be successfully utilised in measuring the influence of mental workload modulation on the cognitive state of workers.