1 Introduction

Aging and disabilities are primary factors that reduce motor function, and motor control ability declines with age [1, 2]. Parkinson’s disease [3] is a brain disorder that impairs body movement and is common among the elderly. In addition, disorders such as developmental coordination disorder [4] and Down’s syndrome [5] appear in childhood. These disorders can impede the development of the fine motor skills required for daily life and social interaction, such as using tools and grasping and manipulating objects. Such disabilities make it difficult for those affected to participate in society because they often require more time and effort to perform tasks than able-bodied individuals. For example, in recent years, worker-retention rates in manufacturing and agricultural workplaces have been declining owing to an aging workforce. In addition, many people with disabilities wish to work but cannot find regular employment.

Power-assisted devices that improve work efficiency, reduce fatigue, and compensate for differences in physical strength have been developed to address these social issues [6, 7]. However, these devices focus solely on assisting motor function. Although they can increase work capacity and reduce workload, they are poorly suited to supporting intellectual work involving complex processes or tasks that require skill and detailed movements. Supporting intellectual work in this context refers to reducing the user’s cognitive load. Power-assisted devices may even increase cognitive load instead of reducing it.

Human augmentation technologies have been proposed to expand the concept of power-assisted devices. These technologies enhance and extend the various sensory and motor functions that people possess. They are expected not only to compensate for functional decline due to aging and disability but also to expand and enhance the natural abilities of humans [8]. For example, Saraiji et al. developed a system that achieved physical augmentation using wearable robotic arms [9]. This system improved the capabilities of the operator by providing additional robotic arms that collaborate with and assist the operator in completing tasks. However, the system is primarily controlled in a master–slave manner, and the device lacks functionalities such as environmental perception, intelligent decision-making, and autonomous operation. Other technologies augment perceptual and cognitive abilities by presenting information obtained through sensors and information technologies on augmented- and mixed-reality displays. For example, a practical smart-glasses system has been developed in which workers at a job site transmit visual and audio information to supervisors located elsewhere, who can then convey instructions in real time [10]. This system allows skilled individuals to guide workers using real-time information. However, it is limited to aiding cognitive and perceptual functions and does not provide physical augmentation. Thus, current human augmentation technologies are limited to extending individual human functions. Technology for extending cognitive and motor functions in a combined and simultaneous way has not yet been developed.

This study aims to develop a device that can cover not only motor functions but also individual differences in cognition and judgment in a combined and simultaneous way. We previously developed a prosthetic hand equipped with vision sensors that enable it to recognize environmental conditions and the user’s intended use [11]. In this study, we apply this system to intellectual tasks that involve complex processes and require cognition, judgment, and memory. Solving a dissection puzzle with tetromino blocks is an example of such a task. We verified that the developed technology could simultaneously expand cognitive and motor functions and realize fast and accurate work movements.

2 Human augmentation hand

The human augmentation hand developed in this study enhances the operator’s ability to perform cognitive tasks. The device is attached to the operator’s arm and becomes a part of the operator’s body to support cognitive, decision-making, and behavioral functions. Although much research has been conducted on power assistance, which enhances physical strength, and on technologies such as augmented reality, which expand perceptual abilities, devices that recognize, think, and act together with humans on tasks requiring intelligence do not yet exist. In this study, we developed a human augmentation hand to support tasks that require higher brain function. A dissection puzzle was used as an example of such a task, and the effectiveness of the proposed technology was verified on it. The puzzle blocks were tetrominoes, which can form various shapes on a plane (Fig. 1). Each tetromino-shaped block comprised four squares with sides measuring 3.5 [cm]. The front side of each block was equipped with a metal plate to enable gripping with a solenoid. The O block was 7 [cm] high, 7 [cm] wide, and 2 [mm] thick, and weighed 15 [g]. The T block was 7 [cm] high, 10.5 [cm] wide, and 2 [mm] thick, and weighed 15 [g]. The puzzle blocks used in this experiment were made of acrylic plates, and a steel washer was attached to the center of each block to enable the solenoid to ‘grasp’ it. Seven different puzzle blocks were used, and each could be rotated by 0\(^{\circ }\), 90\(^{\circ }\), 180\(^{\circ }\), or 270\(^{\circ }\). The task was performed on a flat plane, and the manipulation of a block was limited to rotation and horizontal movement in the plane, making it a two-dimensional (2D) task. The dissection puzzle has exactly one solution.

Fig. 1 Tetromino blocks

2.1 Device

The developed human augmentation hand is shown in Fig. 2. The system is a tabletop device that assists in grasping and placing puzzle blocks on a flat surface; it consists of a vision sensor, a solenoid, and a servomotor.

The vision sensor was a web camera (CMS-V43BK, Sanwa Supply, Japan) with a wide-angle lens, chosen because the object was located at a short distance from the device. The end effector consisted of a solenoid (WF-P25/20) and a servomotor (HS-322HD). The solenoid picks up the tetromino blocks. Once a block is picked up, the servomotor rotates it to the correct orientation. The servomotor has an actuation range from −90\(^\circ\) to 90\(^\circ\). An Arduino was used to control the solenoid and servomotor.

For object detection, the real-time object-detection algorithm You Only Look Once (YOLO) [12] was used. YOLO is implemented in Darknet, a neural-network framework written in C. The network has 26 layers, comprising 24 convolutional layers and two fully connected layers.

The appearance of the human augmentation hand is shown in Fig. 3. The hand weighed 860 [g] and was somewhat large. Because this could affect usability, downsizing and weight reduction remain future challenges.

Fig. 2 Overview of the system

Fig. 3 Human augmentation hand

2.2 Block recognition

To distinguish the seven types of puzzle blocks (Fig. 1) and their rotated states of 0\(^{\circ }\), 90\(^{\circ }\), 180\(^{\circ }\), and 270\(^{\circ }\), we prepared a custom dataset of puzzle blocks in different orientations to train the neural network. The training dataset used in the experiments consisted of 1534 images captured by the camera mounted on the human augmentation hand. The images were labeled with 19 classes, one for each distinct combination of block type and orientation. Orientations that produce the same shape were treated as a single class. For example, block ‘S’ has the same shape at 0\(^{\circ }\) as at 180\(^{\circ }\).
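The figure of 19 classes can be checked by counting the visually distinct orientations of each block under quarter turns. The sketch below is only an illustration: the cell coordinates assigned to each block are our assumption of the standard tetromino shapes named in Fig. 1, not data taken from the training set.

```python
# Assumed cell layouts (x, y) for the seven tetrominoes; the names follow
# Fig. 1, but the coordinates are illustrative and not from the paper.
TETROMINOES = {
    "I": [(0, 0), (1, 0), (2, 0), (3, 0)],
    "O": [(0, 0), (1, 0), (0, 1), (1, 1)],
    "T": [(0, 0), (1, 0), (2, 0), (1, 1)],
    "S": [(1, 0), (2, 0), (0, 1), (1, 1)],
    "Z": [(0, 0), (1, 0), (1, 1), (2, 1)],
    "J": [(0, 0), (0, 1), (1, 1), (2, 1)],
    "L": [(2, 0), (0, 1), (1, 1), (2, 1)],
}


def normalize(cells):
    """Translate a shape so that its bounding box starts at the origin."""
    min_x = min(x for x, _ in cells)
    min_y = min(y for _, y in cells)
    return frozenset((x - min_x, y - min_y) for x, y in cells)


def rotate90(cells):
    """Rotate every cell by a quarter turn about the origin."""
    return [(-y, x) for x, y in cells]


def distinct_orientations(cells):
    """Collect the distinct shapes seen at 0, 90, 180 and 270 degrees."""
    seen = set()
    current = list(cells)
    for _ in range(4):
        seen.add(normalize(current))
        current = rotate90(current)
    return seen


counts = {name: len(distinct_orientations(c)) for name, c in TETROMINOES.items()}
total_classes = sum(counts.values())
print(counts, total_classes)
```

Under these assumed shapes, ‘O’ contributes one class, ‘I’, ‘S’, and ‘Z’ two each, and ‘T’, ‘J’, and ‘L’ four each, which is consistent with the 19 classes stated above.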

2.3 Solution search algorithm

The dissection puzzle solution-search program described below was introduced into the system to identify the puzzle blocks required to solve a given dissection puzzle. When a puzzle is presented, the first step is to determine whether an error pattern exists. An error pattern is an area whose shape cannot be created by combining the puzzle blocks. If an error pattern exists, the search fails. If no error pattern exists, the reference regions of the problem and a candidate block are aligned. The reference region is the uppermost-left edge of each block and of the puzzle problem, and the block is moved so that the two are aligned. After the reference regions are aligned, the program determines whether the block can be placed. If it can, the block is placed, its area is excluded from the puzzle problem, and the remainder is processed recursively as a new puzzle problem. If no silhouette area remains, the search is considered successful.
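The search described above can be sketched as a recursive backtracking procedure. This is a minimal illustration under stated assumptions, not the authors' program: the block layouts in `BLOCKS`, the function names, and in particular the error-pattern test (a connected empty area whose cell count is not a multiple of four) are our own choices, since the text does not specify how error patterns are detected.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Cell = Tuple[int, int]  # (row, col); rows grow downward

# Assumed layouts of the seven blocks (not taken from the paper).
BLOCKS: Dict[str, List[Cell]] = {
    "I": [(0, 0), (0, 1), (0, 2), (0, 3)],
    "O": [(0, 0), (0, 1), (1, 0), (1, 1)],
    "T": [(0, 0), (0, 1), (0, 2), (1, 1)],
    "S": [(0, 1), (0, 2), (1, 0), (1, 1)],
    "Z": [(0, 0), (0, 1), (1, 1), (1, 2)],
    "J": [(0, 0), (1, 0), (1, 1), (1, 2)],
    "L": [(0, 2), (1, 0), (1, 1), (1, 2)],
}


def anchor(cells) -> FrozenSet[Cell]:
    """Shift a shape so that its uppermost-left cell sits at (0, 0)."""
    r0, c0 = min(cells)
    return frozenset((r - r0, c - c0) for r, c in cells)


def rotations(cells) -> List[Tuple[int, FrozenSet[Cell]]]:
    """Return (angle, shape) pairs for the distinct quarter-turn orientations."""
    result: List[Tuple[int, FrozenSet[Cell]]] = []
    current = list(cells)
    for k in range(4):
        shape = anchor(current)
        if shape not in [s for _, s in result]:
            result.append((90 * k, shape))
        current = [(c, -r) for r, c in current]  # quarter turn
    return result


def has_error_pattern(region: FrozenSet[Cell]) -> bool:
    """Assumed test: a connected area whose size is not a multiple of 4."""
    remaining = set(region)
    while remaining:
        stack = [remaining.pop()]
        size = 1
        while stack:
            r, c = stack.pop()
            for n in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if n in remaining:
                    remaining.remove(n)
                    stack.append(n)
                    size += 1
        if size % 4:
            return True
    return False


def solve(region: FrozenSet[Cell],
          available: Tuple[str, ...]) -> Optional[List[Tuple[str, int, Cell]]]:
    """Return a list of (block, angle, reference cell), or None on failure."""
    if not region:
        return []  # no silhouette left: success
    if has_error_pattern(region):
        return None
    ref = min(region)  # uppermost-left cell of the remaining silhouette
    for i, name in enumerate(available):
        for angle, shape in rotations(BLOCKS[name]):
            placed = frozenset((ref[0] + r, ref[1] + c) for r, c in shape)
            if placed <= region:  # the block fits inside the silhouette
                rest = solve(region - placed, available[:i] + available[i + 1:])
                if rest is not None:
                    return [(name, angle, ref)] + rest
    return None
```

Aligning on the uppermost-left cell keeps the search complete: that cell must be covered by some block, and the covering block's own uppermost-left cell must coincide with it. Each block name in `available` is used at most once, matching the use of seven different blocks.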


2.4 System operation

First, the dissection problem to be solved was passed to the system. The solution-search algorithm was then used to solve the puzzle, determine the required blocks, and determine their correct orientation.

Next, the user could issue the grasping and rotating commands by showing a puzzle block to the augmentation hand for more than 20 frames (approximately 1–2 [s]). The human augmentation hand then attempted to recognize the type of puzzle block using its camera. The system compared the recognized block with the result of the solution-search algorithm to determine whether the block should be used.

If the recognized puzzle block was part of the solution, the system allowed it to be selected. An electronic buzzer provided auditory feedback on whether a puzzle block was required for the solution. When the block was not required, one low-pitch tone (330 [Hz]) was produced; when it was required, two higher-pitch tones (440 and 880 [Hz]) were produced in sequence.

The solenoid used to pick up the puzzle block was then activated, either by a push button or automatically. The user then moved the puzzle block to its location in the solution. During this time, the system rotated the puzzle block to the orientation determined by the solution-search algorithm. The solenoid was then turned off, either by the same push button or after a fixed period (5 [s]).

This process was repeated until all the puzzle blocks derived from the solution search were placed, thereby supporting the puzzle-solving operation.
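The confirmation step in this procedure can be illustrated with a short sketch. The 20-frame threshold and the buzzer frequencies come from the description above; the function name, the per-frame label stream, and the use of plain block names as labels are assumptions made for illustration (the actual system distinguishes orientation classes and drives hardware).

```python
from typing import Iterable, List, Optional, Set, Tuple

HOLD_FRAMES = 20          # a block must be shown for more than 20 frames
TONE_NOT_USED = (330,)    # one low-pitch tone [Hz]
TONE_USED = (440, 880)    # two higher-pitch tones [Hz], played in sequence


def confirm_blocks(frames: Iterable[Optional[str]],
                   solution: Set[str]) -> List[Tuple[str, Tuple[int, ...]]]:
    """Return (label, tones) once for each label held longer than HOLD_FRAMES.

    `frames` is a hypothetical per-frame detector output: a block label,
    or None when no block is detected in that frame.
    """
    events: List[Tuple[str, Tuple[int, ...]]] = []
    last: Optional[str] = None
    run = 0
    for label in frames:
        if label is not None and label == last:
            run += 1
        else:
            last = label
            run = 1 if label is not None else 0
        if run == HOLD_FRAMES + 1:  # fires once per uninterrupted run
            tones = TONE_USED if label in solution else TONE_NOT_USED
            events.append((label, tones))
    return events


# Illustrative detector stream (not experimental data).
stream = ["Z"] * 21 + [None] * 3 + ["O"] * 25 + ["I"] * 20
print(confirm_blocks(stream, {"O", "J", "T"}))
```

Requiring an uninterrupted run of identical labels is one simple way to suppress spurious single-frame detections; a block held for exactly 20 frames or fewer, such as ‘I’ in the illustrative stream, produces no event.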

3 Experiment

Fig. 4 Dissection puzzle task

Two experiments were conducted to verify the functionality and usefulness of the proposed system. The first experiment verified the operation of the human augmentation hand and is described in Sect. 3.1. The second experiment examined the usefulness of the system for physically solving a dissection puzzle and is described in Sect. 3.2.

3.1 Basic system validation

The operation of the system was validated by performing a puzzle-solving task, as shown in Fig. 4. The solution to the dissection puzzle consisted of three puzzle blocks. These included blocks that were not rotated and those that had to be rotated by 90\(^{\circ }\) or 180\(^{\circ }\).

Fig. 5 Operation example (camera view)

Fig. 6 Operation example (third-person view)

The procedure for solving the dissection puzzle of Fig. 4 with the human augmentation hand is illustrated in Figs. 5 and 6. The graph shows the results of the work, with time in seconds on the horizontal axis, under the labels “Block detection”, “Hand rotation”, and “Up/Down”. “Block detection” shows the human augmentation hand recognizing each puzzle block, and the color indicates whether the recognized block is required to complete the puzzle. Blue marks indicate that a block is not required, whereas red marks indicate that it is required. “Hand rotation” indicates the angle of the motor of the human augmentation hand, and “Up/Down” indicates when the solenoid was switched on and off to pick up and set down the puzzle block.

In the experiment, the blocks were recognized in the order ‘Z’, ‘O’, ‘I’, ‘J’, and ‘T’. From the dissection puzzle problem, the system derived the three blocks ‘O’, ‘J’, and ‘T’ as the solution. These three blocks were recognized as “use” and the other blocks as “disuse.” When the “use” status continued for more than 20 frames, the solenoid was turned on, and the hand picked up the block, rotated it as required, and set it down.

The images in Figs. 5 and 6 show the human augmentation hand assisting in solving the dissection puzzle in this experiment. Figure 5 (i)–(v) shows the viewpoint of the camera attached to the human augmentation hand. In (i) and (iii), the puzzle blocks were recognized as blocks not used to solve the puzzle. In (ii), (iv), and (v), the ‘O’, ‘J’, and ‘T’ blocks were recognized as blocks used for the solution.

Figure 6 (i)–(vi) shows the experiment from the side and from the viewpoint of the user operating the system. In (i) and (ii), the ‘O’ block was placed directly on top of the puzzle question (shown in black) because no rotation was required. In (iii) and (iv), the ‘J’ block was placed on the lower left side of the puzzle question with a 90\(^{\circ }\) clockwise rotation. Finally, in (v) and (vi), the ‘T’ block was set on the lower right side of the puzzle question with a rotation of 180\(^{\circ }\).

Through these experiments, we verified that the proposed system operates correctly.

3.2 Comparison experiment with the human operator

To verify the effectiveness and usefulness of the system, we compared the time taken to solve dissection puzzles, together with the operator’s subjective workload, with and without the system.

3.2.1 Condition

Fig. 7 Scene of the comparison experiment

Fig. 8 Puzzle problem

Two experimental patterns were conducted with six young subjects (four males [one 24-year-old and three 22-year-olds] and two females [both 22 years old]) to compare trials with and without the human augmentation hand. To prevent erroneous operation of the human augmentation hand in this experiment, grasping was triggered by a button pressed by the operator. (1) Manual: Dissection puzzle problems were solved by moving the blocks by hand without the human augmentation hand. (2) Robotic hand: The human augmentation hand was used to solve the dissection puzzle problems. In the robotic hand condition, instructions for solving the dissection puzzle problems were shown on a display in front of the subject. The instructions indicated the blocks needed for the problem and their locations. Both conditions are shown in Fig. 7. Each experimental pattern was conducted with six questions of varying difficulty.

Figure 8 shows the six problems used in the experiment. The difficulty of a problem was determined by the number of blocks used. Three problems used three blocks, and the other three used five blocks. To ensure that each problem had a unique solution, only five types of blocks were used: ‘J’, ‘L’, ‘S’, ‘T’, and ‘Z’. The experiments were conducted at least 1 day apart, and the order of the questions was randomized. An evaluation using the National Aeronautics and Space Administration task load index (NASA-TLX) [13], a subjective mental workload-assessment method, was conducted at the end of each problem, and an interview with the users of the human augmentation hand was conducted after all experiments. The questionnaire consisted of six items [mental demand (MD), physical demand (PD), temporal demand (TD), own performance (OP), effort (EF), and frustration (FR)]. Each item is rated on a scale with “Low” and “High” at its two ends, and the final rating is given on a scale of 0–100. In this experiment, only the simple average of the six item ratings was used to evaluate the overall mental workload. In addition, a timer was set up during the experiment to apply time pressure. Experiments and evaluations were conducted under these conditions.
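The simple average used here can be written out explicitly. The sketch below assumes each item has already been converted to the 0–100 scale; the function name and the example ratings are hypothetical and are not experimental data.

```python
from typing import Mapping

# The six NASA-TLX items named in the text.
ITEMS = ("MD", "PD", "TD", "OP", "EF", "FR")


def raw_tlx(ratings: Mapping[str, float]) -> float:
    """Return the unweighted mean of the six item ratings (0-100 each)."""
    missing = [item for item in ITEMS if item not in ratings]
    if missing:
        raise ValueError(f"missing items: {missing}")
    for item in ITEMS:
        if not 0 <= ratings[item] <= 100:
            raise ValueError(f"{item} rating must lie in [0, 100]")
    return sum(ratings[item] for item in ITEMS) / len(ITEMS)


# Hypothetical ratings for one subject on one problem.
example = {"MD": 60, "PD": 30, "TD": 50, "OP": 40, "EF": 70, "FR": 20}
print(raw_tlx(example))  # 45.0
```

Because no pairwise weighting of the items is applied, every item contributes equally to the overall score.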

3.2.2 Results

Fig. 9 Time taken to complete the task (mean ± s.d.). The upper graph shows the trial time for the simple task (3 blocks) and the lower graph shows the trial time for the complex task (5 blocks) for each problem. The blue line represents manual operation, and the gray line represents operation with the robotic hand

Figure 9 shows the mean and standard deviation of the time required by the subjects for each question in the two experimental patterns. The left graph shows the simple task (3 blocks), and the right graph shows the complex task (5 blocks). Blue represents (1) manual and gray represents (2) the robotic hand. In the simple task, Condition (1) resulted in shorter trial times, although there were differences between the subjects. In Q3, the subjects’ trial times varied widely. The maximum and minimum trial times in Condition (1) were 97.5 and 17.2 [s], respectively, a considerably larger spread than in the other problems. In Condition (2), the results were similar to those in Condition (1) but stable, with little difference among the subjects. The average trial time in Condition (2) ranged from 30 to 38 [s] across all problems, so the results were stable regardless of the problem. Furthermore, the difference between the maximum and minimum trial times in Condition (2) was smaller than that in Condition (1) for all problems. In the complex task, Condition (2) again showed stable results with little difference between subjects. In Condition (1), the differences in trial times between subjects were more pronounced than in the simple task. The average times for Q4 and Q5 in Condition (1) were 198.2 and 176.2 [s], respectively, more than twice those in Condition (2), and were long even in comparison with the simple task. The maximum and minimum trial times differed by more than 400 [s] for both problems, indicating that the subjects’ abilities to solve the problems differed considerably. Note that two subjects in Q6 and one subject each in Q4 and Q5 had shorter trial times under Condition (1) than under Condition (2). Overall, these results indicate the stability and usefulness of the robotic hand in difficult tasks.

In addition, the number of retries (the number of times a subject had to start over after grasping or placing the wrong block) for each problem differed between the experiments. The number of retries with the robotic hand was small, and most subjects made no movement to reposition the blocks. In Condition (1), because the human hand could grasp and place a block instantly, the simple task resulted in short trial times, whereas the number of retries increased for the complex task, resulting in a considerable increase in the trial time. In the interview after Condition (2), some participants commented that because the robotic hand performed the rotation automatically, they could take the rotation as a hint, which made placing the block easy. This may have been the reason for the stability observed with the robotic hand.

Fig. 10 NASA-TLX scores in the experiment. The upper graph shows the NASA-TLX scores for the simple task (3 blocks) and the lower graph shows the NASA-TLX scores for the complex task (5 blocks) for each subject. The purple line represents manual operation, and the black line represents operation with the robotic hand

Fig. 11 NASA-TLX scores in the experiment (average score for each item). The top graph shows the average NASA-TLX scores for each item in the simple task (3 blocks) and the bottom graph shows the average NASA-TLX scores for the complex task (5 blocks). The purple line represents manual operation, and the black line represents operation with the robotic hand

Figure 10 shows the NASA-TLX scores for each subject at each difficulty level. The upper graph shows the simple task and the lower graph shows the complex task. Purple represents (1) manual and black represents (2) the robotic hand. A comparison across task difficulty showed that the mental load was lower for the simple task. The mental load was also lower when the robotic hand was used. Subject 1 experienced the highest mental load in Condition (2), whereas the other subjects experienced less than half of the mental load of Condition (1) at both difficulty levels. In Condition (2), the mental load did not change with the problem; in Condition (1), however, the mental load increased from the simple task to the complex task.

Figure 11 shows the mean value of each NASA-TLX item at each difficulty level. The upper graph shows the simple task and the lower graph shows the complex task. Purple represents (1) manual and black represents (2) the robotic hand. In Condition (1), the scores on all items increased as the difficulty level increased; in Condition (2), however, the scores did not increase, indicating that the task was performed stably without added mental load. In Condition (2), the PD score was higher than those of the other items. We believe that this resulted from the physical load imposed by the longer operation of the robotic hand compared with Condition (1). In the post-experiment interview, the participants commented on the weight of the robotic hand and the difficulty of holding it. We believe that this is one of the reasons the PD scores were higher. However, in the complex task, the scores for all items in Condition (1) were higher than those in Condition (2).

These results suggest that the robotic hand is useful for reducing mental load during work and for keeping the load stable across problems of differing difficulty.

4 Conclusion

This study proposed and developed a prototype human augmentation hand that can support intellectual tasks requiring higher brain functions. The study attempted to simultaneously expand cognitive and motor functions while solving a dissection puzzle. Basic verification experiments showed that the human augmentation hand was capable of high-speed and accurate work operations. In addition, an experiment compared task performance with the robotic hand against manual performance by a human, and the resulting evaluation of the human augmentation hand was presented. The results demonstrated the usefulness of a system that integrates these functions. Some participants commented negatively on the weight and operability of the human augmentation hand; therefore, future studies will focus on improving the design of the device with respect to those factors.