1 Introduction

In recent years, gestural interfaces have become more prevalent both in commercial industry and in research. This type of interface emerged in the video game industry, where users move their bodies to play video games [9]. In this case, the user's body acts as the video game controller. As gestural interfaces gained popularity and hardware costs decreased, several applications were developed outside the gaming context [4, 18, 26].

Gestural interfaces are considered easy to learn and use, as people express themselves and interact in everyday social life through gestures [6]. Since body gesture interfaces are easier to learn, they may prove helpful in captivating user groups that, until now, have shown some resistance to adopting technology. This is the case for older people who, in general, find traditional computer interfaces overly technical and difficult to use [5, 22]. This often deprives them of the benefits that technology has to offer, such as increased social activity and information access [28]. Moreover, this problem is of ever-increasing importance: as life expectancy rises, the number of older people keeps growing [8]. If the interest of senior users in technology could be captured, this would help fight isolation and exclusion and allow older people to be more productive and independent and to lead a more social and fulfilling life [28].

Although gestures are naturally part of how people interact with each other, this does not guarantee that gestural interfaces will be practical and well accepted in every context or for every user group [21]. The commercial success of gestural interfaces in the video game industry proves their value for gaming. In research, the current literature has also explored the suitability of in-air gestures for interacting with technological interfaces [4, 18, 26]. These studies found that mainstream users, typically young adults, were successful with and enjoyed gestural interactions.

However, regarding older adults, almost all studies focus on the gaming context, with the aim of performing physical therapy and rehabilitation. Indeed, one problem that may affect the use of gestural interfaces by older people is their physical aptitude. Research shows that as people age, a significant decline in cognitive, perceptual and motor abilities occurs [30]. Motor issues of older adults include slower motion, less strength, less fine motor control and decreased range of motion and grip force [23]. Therefore, gestural interactions for older users should be carefully designed to avoid fatigue and exhaustion and to not demand fine motor control. On the other hand, since in-air gestural interactions require some degree of physical activity to operate, they are likely to positively impact the health of senior users, even if the intensity of the physical activity is low [29].

Currently, seniors' performance and acceptance towards body gesture interfaces are not well understood, particularly considering their specific needs and abilities outside the gaming context. At the moment, it is not clear whether gesture-based interactions can be useful for older adults. This study aims to understand how older adults perceive gesture-based interactions, in terms of suitability and acceptance, when interacting with technological interfaces in general. Specifically, a scenario is considered where the older user can interact with any computer application while sitting in the comfort of their living room. With a computer connected to the TV and equipped with gesture recognition hardware, users can interact with and control virtually any technological application.

This study was performed to evaluate the suitability of in-air gesture interaction for the PaeLife (Personal Assistant to Enhance the Social Life of Seniors) project [25, 28]. PaeLife focuses on recently retired individuals who are used to some degree of technology usage and who want to keep themselves active, productive and socially engaged through the use of a personal virtual life assistant: a virtual presence that supports social communication, learning and entertainment. PaeLife aims to fight the isolation of older people by enhancing seniors' connectivity to family, friends and society.

AALFred is the application developed in PaeLife and designed to run both on tablets and PCs [25, 27, 33]. When running on a tablet device, users interact with the application using touch and voice commands. When running on a PC connected to a TV display, interaction is performed by voice and gestures. The interface is the same on both devices and displays.

Indeed, the interface of touch applications is better suited for in-air gestural interaction than that of traditional desktop applications. The former usually features bigger icons and buttons, as touching with a finger is less precise than interacting with a pointing device such as a mouse [1]. Moreover, since AALFred follows design guidelines for older users [32], it features bigger icons than most touch applications. This layout makes it particularly well suited for the in-air hand gesture scenario presented here, where the user is sitting in the living room some metres away from the television.

Instead of evaluating the gestural interaction within AALFred, a stand-alone prototype was developed to evaluate the in-air hand gestures without the constraints posed by the more complex application. This way, we aim to draw conclusions for general technological interfaces, instead of validating the gestural interaction just for AALFred. The prototype focused on the two types of tasks most commonly required to interact with technological interfaces: navigation and selection. For each task, two alternative gestures were developed and evaluated, as shown in the diagram of Fig. 1.

The focus was on gestural interactions using only the upper part of the body, mainly the arms and hands, which is the most practical kind of movement for the use scenario: an older adult sitting in their living room. The in-air hand gesture recognition was performed using Microsoft's Kinect sensor, as it is considered to have several advantages over its competition for our target user group, as explained in Sect. 2. If multiple users are present in the living room, gestural control can be given to the user who is closest to the Kinect sensor.
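To illustrate this last point, a minimal sketch of assigning control to the nearest tracked person is shown below. The data structure and function names are hypothetical, since the Kinect SDK exposes tracked skeletons through its own API.

```python
from dataclasses import dataclass

@dataclass
class TrackedUser:
    user_id: int
    depth_m: float  # distance from the Kinect sensor, in metres

def select_controlling_user(users: list[TrackedUser]) -> TrackedUser | None:
    """Give gestural control to the person closest to the sensor, if anyone is tracked."""
    return min(users, key=lambda u: u.depth_m, default=None)
```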

This user study aims to answer three main research questions:

Q1: Are in-air hand gestures adequate for older adults to interact with general technological interfaces?

Q2: Which type of gesture allows for the fastest navigation and selection with the lowest error rate?

Q3: Do older users enjoy using gestural interfaces and find them easy to use? Which gestures do older users prefer?

In order to answer these research questions, an experimental evaluation with 40 older people was performed. During the user tests, conducted in a laboratory environment, performance was measured and user feedback was gathered about the gestures participants performed. Most participants enjoyed using the gestural interface and completed all the proposed tasks with varying degrees of ease. The results were positive, indicating that this type of interface can be successfully used by seniors.

Moreover, the evaluated gestures were systematically compared, identifying the better-suited alternative for seniors. From these results and observations, a set of design guidelines was derived to help future developers working on gestural interactions. The results of this study carry over to many applications, since most technological interfaces require navigating through information and selecting a particular target from a set.

Fig. 1 Diagram showing the evaluated gestures to interact with a general technological interface

2 Gestural interaction studies

In the related literature, several terms are used to describe the type of gestural interaction under study. The terminology is varied: full-body gestures, free-hand gestures, in-air hand gestures, etc. Most of the surveyed studies mainly focus on tracking the movements of the hands, with the exception of studies regarding games or physical fitness. Since the present study also focuses on hand movement, the authors opted for the term in-air hand gestures. Although users are also required to move their arms in order to move their hands, making the term slightly misleading, the main motion tracked to effect change is performed by the hands. Thus, despite this small imprecision, in-air hand gestures has the benefit of giving readers a clearer picture of the main focus of the gestural interactions considered, i.e. the users' hands.

Gesture recognition can be seen as a way for computers to understand human body language, using algorithms to interpret signals (e.g. movement, video, audio). There are two main ways of achieving gesture recognition: with devices that have motion sensing capabilities (e.g. accelerometer, gyroscope, magnetometer) or through video capture and processing. The latter is also called computer vision or remote sensing.

The use of gestures to control technological devices has been investigated for some years now [16]. However, in recent years, body gesture interfaces have become more popular due to the video game industry. Most of the gestural interactions reviewed are based on Microsoft's Kinect device, since it is considered to have certain advantages over the competition.

The Kinect sensor has the benefit of not requiring the user to wear or hold any physical accessory, making it more practical. Kinect captures users' movements with colour and depth cameras, sensing and tracking the whole body. Its direct competitors, the Nintendo Wii Remote and PlayStation Move, require a hand-held controller to perform the gesture interaction, and only the forces applied to that controller are tracked. PlayStation Move is additionally able to track the position of the controller.

For older adults, remote sensing should be better suited than handheld devices, as it avoids potential holding issues related to fine motor control of the wrist. Such problems can be further aggravated by dementia, which is common in old age, or by arthritis [23]. Moreover, remote sensing better fits the scenario of older people using applications displayed on the TV in their living room, where a TV remote control is already necessary.

Therefore, the literature was reviewed with a focus on remote sensing when available, resorting to other types of gestural interaction only as a complementary alternative. Since the use scenario involves the older user sitting in the living room, the focus was primarily on in-air hand gestures rather than the whole-body interactions common in video games.

Some studies evaluate static and dynamic gestures. Static gestures refer to a pose or spatial configuration of the user that should be held for some time in order to issue a command. A dynamic gesture is performed by moving the body in a certain way within a defined time interval.

In Sect. 2.1, gestural interaction interfaces for the young adult to interact with general technological interfaces are analysed. Section 2.2 presents the studies performed with older users, mostly in gaming contexts, as there are very few studies out of the gaming scope.

2.1 Studies with young adults

As already stated, although the Kinect sensor was developed for video game interaction, several applications have emerged in a diverse range of scenarios. Maidi et al. [20] developed a gestural interaction for controlling interactive media. They developed a photo viewer application that receives instructions via recognized gestures. The authors defined four gestures: a click gesture performed by pushing the hand towards the Kinect sensor; a pull movement to return to the parent level of the hierarchy; and moving the hand to the left or right to browse through the photos. Although this study has the novelty of controlling a media interface with hand motion, no experimental evaluation was performed.

Henze et al. [15] analysed static and dynamic gestures to interact with a music application. The authors performed an experimental evaluation with 12 participants and found that dynamic gestures were easier to remember, more intuitive and simpler for controlling their application. Panger [26] focused on using Kinect in real-life kitchens, which allows interacting even when users' hands are messy. He implemented a recipe navigator, a timer and a music player. One of the main challenges was preventing accidental commands, since intentional commands are interspersed with cooking movements. The possible interactions consisted of left, right, front and back movements. A five-subject experiment in each user's home revealed that installing the Kinect was simple and that subjects felt successful interacting in this context.

Kim et al. [18] propose Ambient Wall, a prototype of a smart home system that displays the current status of the house through a projection on a wall. Their scenarios include changing the TV channel, controlling the room temperature, checking for messages and turning off all devices. Their gestural interaction consisted only of pointing at elements on the wall to select them. Similar to this is Hands Up [24], which uses the Kinect device and a projector to display an interactive interface on the ceiling of a room. The application interface consists of a circular main menu through which users can control various devices in the house.

Chen et al. [4] developed a gestural interaction to control the channel and volume settings of a TV. Users would move the left or the right arm upwards or downwards to increase or decrease the channel or volume, depending on the arm used. Bailly et al. [3] adapted three menu techniques for freehand gestural interaction: the linear menu, the marking menu and the finger-count menu. Results showed that the finger-count menu is more mentally demanding than the other two more direct techniques.

Gallo et al. [10] presented a Kinect-based system that allows controller-free manipulation of medical images. This solution is intended for operating rooms, where non-sterile computer interface devices such as mice and keyboards cannot be used. They implemented gestures for pointing and clicking, zooming, translation and rotation, among others. Although the system was implemented and integrated into a medical image viewer, no evaluation of the gestural interaction was performed. Stannus et al. [31] performed a comparison study in which they asked participants to perform different navigation tasks within a virtual globe using either a mouse, a 3D mouse or gestures. The study was of a qualitative nature, having only 10 participants. They found that most users preferred the standard mouse for completing the tasks. However, the gestural interaction was rated and commented on positively in terms of naturalness. Moreover, the authors reported that technical problems occurred in the prototype for the gestural interaction, which may have biased the results.

2.2 Studies with older people

Most studies evaluating body gesture interfaces with older people fall within the scope of games, probably because many older people would benefit from frequent physical activity and this type of interface can provide the necessary motivation. The main goal of these studies is to keep seniors engaged while performing physical exercises, and thus help them remain healthier both physically and emotionally.

Gerling et al. [13] compared sedentary (such as the computer mouse or gamepad) and motion-based (gestural) interactions for playing a game they developed. They performed tests with both younger and older adults. Their results show that older adults can use motion-based game controls efficiently and that they enjoy motion-based interaction. They also found that older adults do not perceive motion-based game controls as more exhausting than younger adults do. Additionally, and as expected, when comparing controller comfort, both user groups reported some degree of fatigue when using motion-based controllers. However, participant comments showed that older adults considered the increased physical effort of motion-based controls a welcome challenge, whereas some young participants mentioned physical fatigue as a negative aspect of motion-based game controls.

Jung et al. [17] examined the impact of playing Nintendo Wii games on the psychological and physical wellbeing of seniors in a long-term care facility. Although the game was not specifically adapted to older users, the seniors enjoyed playing it. Results showed that seniors found this kind of game stimulating and showed interest in participating in such activities again. Moreover, a substantial amount of physical activity is required to play these games, which is likely to benefit the health of older users. Identical findings were observed in a similar study by Saposnik et al. [29].

Other approaches focus on developing gesture-controlled games for seniors that take their physical limitations into account. Gerling et al. [12] conceived a game in collaboration with a physical therapist, since commercially available games may put older adults at risk of injury by failing to accommodate their physical abilities. The authors developed and tested four static and four dynamic gestures. They found that the gestural game was successful among older adults and even had a positive effect on the participants' mood. However, they also found that recalling gestures was too challenging for most participants. The authors further observed that institutionalized older adults represent an extremely heterogeneous group, and argue that gestural games should be individually adjusted to each user. In a similar vein, Ganesan et al. [11] aimed to find the factors that play an important role in motivating older adults to maintain a physical exercise routine, a habit recommended by doctors but difficult to sustain. They developed an early game prototype for Microsoft's Kinect. The authors tested the prototype for basic usability, and preliminary results were promising; the next steps include play tests with older adults.

Hassani et al. [14] developed an assistive robot that helps older people perform physical exercises. Seniors were required to perform an exercise, detected by a Kinect, and then manually skip to the next one, either by tapping a touch device or by performing a "next" gesture. Two interaction modalities were thus compared: gestural and touch interfaces. Both modalities scored very positively. In a final interview, participants reacted more positively to the use of in-air gestures. Although the tested interaction was very simple, this study showed that seniors find gesture interfaces an easy way to interact with technology.

Bobeth et al. [2] evaluated several TV menu control methods for older users. They compared four different types of freehand gestures to control a TV menu: hand movement tracking; static hand positions to control a cursor; hand strokes for a marking menu; and a dial plate for a radial menu. Results showed that directly transferring tracked hand movements to control a cursor on a TV achieved the best performance and was preferred by the users. In general, the older participants showed a very positive attitude towards gesture-based interactions.

2.3 Critical analysis

A systematic search of HCI and accessibility-related conference proceedings and journals (TACCESS, ASSETS, UAIS, CHI and INTERACT, among others) was performed for work assessing the use of gestural interaction by older adults. No published work was found that evaluated the performance and acceptance of older people interacting through in-air gestures. To the authors' knowledge, only the study of Bobeth et al. [2] evaluated their suitability, though it focused on TV menu control, which is a very particular case. How gestures fare for older people interacting with technological interfaces in general remains largely unknown.

Results from other studies regarding gestural interaction show that seniors seem to adapt well to gestural interfaces, finding them easy and enjoyable to use. However, as already stated, the current literature regarding gestural interfaces for older people is almost exclusively focused on gaming interactions. Seniors' performance and acceptance towards body gesture interfaces are currently not well understood, particularly considering their specific needs and abilities outside the gaming context.

It has also been seen that most of the surveyed studies, even those targeting young adults, did not evaluate alternative gestural interactions. Typically, authors define a set of complementary gestural interactions and test how these interactions fare. Having and testing alternative ways of interacting is a particularly important concern when considering older users, as they constitute a very heterogeneous user group [12].

3 Design and implementation

As discussed in the previous section, how in-air gestural interfaces fare for older users interacting with general technological interfaces remains largely unexplored. In order to evaluate whether older people enjoy using gestural interfaces and whether such interfaces are suitable for this purpose, a prototype application was developed. By performing the user test with a general prototype instead of a particular application, it is possible to evaluate the defined in-air hand gesture interaction in isolation and therefore extend the conclusions to similar technological interfaces and use scenarios. The results will offer insight into which are the best gestures for interacting with AALFred, the PaeLife application [25, 27, 28, 33], as well as other similar graphical user interfaces.

3.1 Prototype

As happens with most touch interfaces, and indeed most interfaces in general, interacting with AALFred requires navigating through information and selecting a desired item from a set (Fig. 2). Thus, the focus was on evaluating the two types of tasks traditionally required to interact with graphical user interfaces: navigation and selection tasks (Fig. 1).

Fig. 2 Screenshot of AALFred, the PaeLife application

The test prototype (Fig. 3) was implemented as a Windows Presentation Foundation application, and the gesture tracking was developed using the Kinect for Windows SDK. Microsoft's Kinect sensor was chosen for the implementation, as it has advantages over the competition for the older user group, as discussed in Sect. 2. In order to keep tasks general, a list of horizontally scrollable numbers was used to test the navigation gestures, as shown in Fig. 3a. In navigation tasks, participants were asked to scroll to a particular number in the list. For the selection tasks, a varying number of targets was displayed on the screen (Fig. 3b), and users were asked to select a particular target from the set.

Fig. 3 Navigation (a) and selection (b) screens

3.2 Gestures

The in-air hand gestures were designed incorporating the lessons, guidelines and best practices learnt from other studies. Considering the potentially reduced motor skills and physical limitations of older adults, effort was put into avoiding fast motions, complex poses and movements that are difficult to perform. Also, since extensive use of this interface may cause fatigue, the gestures were designed with that carefully taken into account. Simple one-handed gestures were used, thus avoiding problems that may arise with bi-manual interactions [23]. Moreover, dynamic gestures were preferred, as they are easier to remember and considered more intuitive [15]. Focus was also placed on directly transferring tracked hand movements to control the interface, as this achieved the best performance in comparison with other alternatives [2]. All the defined gestures only require users to move either hand above the hip and in front of the body for a short period of time. Therefore, all the gestures are relatively simple and physically easy to perform.

Regarding navigation, the Swipe and the Grab and Drag gestures were evaluated. To perform a Swipe, users drag either hand in the air in a horizontal motion towards the desired direction, as illustrated in Fig. 4a. A Swipe gesture is only registered when users horizontally move their hand for at least 30 cm. The vertical motion of the hand must not exceed 10 cm; otherwise, the gesture is not considered a horizontal Swipe. The duration of the gesture should be between 0.25 and 1.5 s.

These thresholds were determined experimentally, in two phases. A user was asked to sit in front of the Kinect device and perform a swiping motion as if he wanted to swipe in a particular direction. The motion data were captured by the Kinect and stored. These data were then analysed, and the thresholds were calculated so that all the intended swipes were recognized with a minimum of errors and false positives. Several iterations of this procedure were performed to fine-tune the thresholds, until the system positively recognized the user's gestures 95 % of the time.
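To make the detection rules above concrete, the following is a minimal sketch of a threshold-based Swipe classifier. The constants mirror the thresholds described above; the data structure, names and buffering strategy are assumptions for illustration, not the actual WPF/Kinect SDK implementation.

```python
from dataclasses import dataclass

# Thresholds as described above
MIN_HORIZONTAL_M = 0.30  # at least 30 cm of horizontal travel
MAX_VERTICAL_M = 0.10    # at most 10 cm of vertical motion
MIN_DURATION_S = 0.25
MAX_DURATION_S = 1.5

@dataclass
class HandSample:
    t: float  # timestamp, seconds
    x: float  # horizontal hand position, metres (positive to the user's right)
    y: float  # vertical hand position, metres

def detect_swipe(samples: list[HandSample]) -> str | None:
    """Classify a buffered hand trajectory as a 'left' or 'right' Swipe, or neither."""
    if len(samples) < 2:
        return None
    duration = samples[-1].t - samples[0].t
    if not (MIN_DURATION_S <= duration <= MAX_DURATION_S):
        return None
    vertical_span = max(s.y for s in samples) - min(s.y for s in samples)
    if vertical_span > MAX_VERTICAL_M:  # too much vertical motion: not horizontal
        return None
    dx = samples[-1].x - samples[0].x
    if abs(dx) < MIN_HORIZONTAL_M:      # not wide enough to count as a Swipe
        return None
    return 'right' if dx > 0 else 'left'
```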

For the Grab and Drag gesture, the implementation available in Microsoft's Kinect SDK was used. To perform the Grab and Drag, users raise either hand so that a hand cursor appears on screen. The hand should be open, with the palm facing the Kinect sensor. Users then close their hand to "grab" the content and drag the hand in the desired direction to scroll. To scroll further, users have to open their hand to "release", so they can Grab and Drag again. This motion is illustrated in Fig. 4b. This alternative may require more movements and coordination than the Swipe gesture, though it was hypothesized that users would have more control over the navigation process; the strong suit of the Swipe gesture was its simplicity.
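Conceptually, the Grab and Drag behaviour can be summarized as a small state machine that scrolls only while the hand is closed. The sketch below is a simplified illustration of that logic; the real implementation relies on the Kinect SDK's hand-state detection, and the class and per-frame interface here are hypothetical.

```python
from enum import Enum

class HandState(Enum):
    OPEN = 0
    CLOSED = 1

class GrabDragTracker:
    """Scroll content only while the hand is closed, i.e. while content is grabbed."""
    def __init__(self) -> None:
        self.grabbed = False
        self.last_x: float | None = None  # previous horizontal hand position, metres

    def update(self, hand_state: HandState, x: float) -> float:
        """Return the horizontal scroll delta for this frame."""
        delta = 0.0
        if hand_state == HandState.CLOSED:
            if self.grabbed and self.last_x is not None:
                delta = x - self.last_x  # direct mapping: hand motion drives scrolling
            self.grabbed = True
        else:
            self.grabbed = False         # opening the hand "releases" the content
        self.last_x = x
        return delta
```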

Fig. 4 Navigation gestures. a The Swipe gesture. b The Grab and Drag gesture

For the selection tasks, the Point and Push and the Point and Hold gestures were developed. For both gestures, users raise either hand towards the screen so that a hand cursor appears. Then, to perform a selection through the Push gesture, users move their hand forward towards the screen, as if reaching for the target (Fig. 5a). For this gesture, the implementation available in Microsoft's Kinect SDK was also used. For the Point and Hold gesture, users keep the hand cursor over a target for 1.5 s to select it, as shown in Fig. 5b. The interface gives feedback about the selection state of the target by progressively filling its background with a lighter colour, like an hourglass; when the target is completely filled, it is selected. The hypothesis was that the Point and Push gesture would be more precise, since it does not restrict the time users have to aim, while the Point and Hold was expected to be simpler, as users only have to keep pointing for a while to perform a selection.
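The Point and Hold behaviour amounts to a dwell timer that restarts whenever the cursor leaves the target. A minimal sketch of this logic, with the fill fraction driving the progressive background feedback, could look as follows; the names and per-frame interface are assumptions.

```python
DWELL_TIME_S = 1.5  # hold duration required for a selection

class DwellSelector:
    """Point and Hold: select a target after the cursor dwells on it for 1.5 s."""
    def __init__(self) -> None:
        self.current_target = None
        self.elapsed = 0.0

    def update(self, target_under_cursor, dt: float):
        """Call once per frame with the hovered target (or None) and the frame time.
        Returns the selected target once the dwell completes, otherwise None."""
        if target_under_cursor != self.current_target:
            # Cursor moved to a different target (or off all targets): restart timer
            self.current_target = target_under_cursor
            self.elapsed = 0.0
            return None
        if self.current_target is None:
            return None
        self.elapsed += dt
        fill = min(self.elapsed / DWELL_TIME_S, 1.0)  # drives the background fill
        if fill >= 1.0:
            selected = self.current_target
            self.current_target, self.elapsed = None, 0.0
            return selected
        return None
```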

Fig. 5 Selection gestures. a The Point and Push gesture. b The Point and Hold gesture

Regarding the gesture recognition implementation, a major concern is recognizing a few meaningful gestures within a continuous sequence of movements, as studied in [19]. This is a complex process because gestures are ambiguous to recognize: it is difficult to determine when a gesture starts and ends in a continuous sequence of movements. Gestures are also variable: multiple instances of the same gesture differ in shape, length and trajectory, even for the same person. To address these issues, Kim et al. [19] proposed a sequential identification scheme that performs gesture segmentation and recognition simultaneously. In the cases where the Microsoft Kinect SDK defaults were not used, this approach was followed in the implementation of the gesture detector.

Prior to the user tests, the developed gestures were evaluated by a physical therapist in order to assess their suitability given seniors' physical limitations. The physical therapist concluded that the defined gestures posed no danger of overexertion or injury to older people.

4 User study

The aim of this user study is to understand how older adults can benefit from gesture-based interactions, in terms of suitability and acceptance, when interacting with technological interfaces in general. This section presents the user tests and their results: which gestures allow for better performance and higher user satisfaction in both navigation and selection tasks.

A similar user study with young adults, which would act as a control group, was also considered. However, as seen in the related work, in-air gestures have already been shown to be effectively usable by young adults to interact with technological interfaces [4, 18, 26]: young adults enjoyed and successfully used gestural interactions. Since the goal of this user study is to understand whether the same conclusions hold for older adults, we decided that performing the same tests with young adults would not be necessary.

Tests with older adults were performed in two countries, France and Hungary, based on the same protocol of predefined tasks. The objective was to have a representative panel of users. In both countries, users have similar profiles (see Sect. 4.1). By having participants with different cultural backgrounds, the aim was to obtain a more diverse sample, allowing more generalizable results.

When the details or results of the two user groups differ, the differences are detailed in the respective subsections; when the studies are similar, a shared description is provided.

4.1 Participants

Twenty older users, 5 men and 15 women, took part in the French user study. Their ages ranged from 60 to 80, with an average of 67 years (sd = 4.8). All participants had some experience with computers, as only two of them rated their computer experience as very low. None of them had prior experience with in-air gestural interfaces or any other kind of gestural interface. Eleven users had some sort of physical movement limitation, such as slight rheumatism, tendinitis, osteoarthritis or ankylosing spondylitis, though nothing particularly severe.

Regarding the Hungarian user study, 20 older people participated, 8 men and 12 women. Their ages ranged from 56 to 77, with an average of 64 years (sd = 6.3). Most participants had previous experience with computers, though four had none. None of them had prior experience with any gestural interface. Eleven participants had some degree of physical movement limitation, such as rheumatism, Parkinson's disease, or leg and back pain.

In both studies, the physical conditions of participants did not prevent them from using the in-air gestural interface. All necessary precautions were taken to let them rest if they felt tired or had aching joints.

4.2 Apparatus and setup

As already mentioned, the use case considered is that of in-air gestural interfaces used by older people sitting in the comfort of their living room. With a computer connected to the TV and equipped with gesture recognition hardware, they can interact with and control almost any technological application.

Accordingly, a living room scenario was replicated in a laboratory setting, so that the situation would be as close as possible to real life. In the French study, the output device was a 55" Samsung LED TV with 1080p resolution, connected to a Dell laptop with a quad-core 3.2 GHz processor and 4 GB of RAM, running Windows 8.1. A Kinect for Xbox was used, connected to the laptop through an adapter cable. Participants sat at a distance of 2.5 metres from the TV and the Kinect sensor. Figure 6 shows a picture of a user in the test environment.

Fig. 6 A participant using in-air gestures to select a target on a TV screen

The Hungarian setup involved a 42" LG Cinema 3D TV with 1080p resolution. The computer used was a Samsung ATIV Smart PC with an Intel Core i5 processor (1.70 GHz) and 4 GB of RAM, running the Windows 8.1 operating system. The gesture recognition hardware was a Kinect for Windows device. Participants sat at a distance of 2.5 metres from the TV and the Kinect sensor.

4.3 Procedure

The user test had a training phase and an evaluation phase. At the beginning of the training phase, it was clearly explained to each participant that the aim of the study was to evaluate the gestures and not the user's performance. The authors are aware of the potential anxiety that test situations can cause, especially among senior users, most of whom had never participated in user tests, and precautions were taken to limit this anxiety. The way gestures are performed was then explained. Since participants were not familiar with in-air gesture interactions, they were allowed to try each gesture for a maximum of 2 min. However, if the monitor found that the senior had understood and was already comfortable performing the gesture before this time was up, they would skip to the next gesture.

In the evaluation phase, users were asked to perform specific navigation and selection tasks. To test the navigation gestures (Swipe and Grab and Drag), participants were successively asked to scroll to a predetermined number displayed on the screen (Fig. 3a). After the user scrolls to the required number and that target remains visible for 2 s, the application automatically shows a new target. This two-second visibility threshold was imposed to exclude cases where the user did not have enough precision to scroll to a particular number, and thus to avoid activation when participants just passed quickly by the target. The order of the required numbers was chosen to cover three conditions: large, medium and small scroll distances. A total of eight navigations was required for each navigation gesture, with the same order of numbers used for both.
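The task-advance logic just described can be summarized as follows; this sketch is illustrative only, with hypothetical names and a per-frame update interface.

```python
TARGET_VISIBLE_S = 2.0  # the target must stay visible for 2 s to count as reached

class NavigationTask:
    """Advance to the next target number once the current one stays in view for 2 s."""
    def __init__(self, targets: list[int]):
        self.targets = targets  # predefined order covering large, medium, small scrolls
        self.index = 0
        self.visible_for = 0.0

    def update(self, visible_numbers: list[int], dt: float) -> bool:
        """Call once per frame with the numbers currently on screen.
        Returns True when all navigation subtasks are done."""
        if self.index >= len(self.targets):
            return True
        if self.targets[self.index] in visible_numbers:
            self.visible_for += dt
            if self.visible_for >= TARGET_VISIBLE_S:
                self.index += 1       # held long enough: show the next target
                self.visible_for = 0.0
        else:
            self.visible_for = 0.0    # scrolled away or past it: reset the timer
        return self.index >= len(self.targets)
```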

Regarding the selection tasks, in which participants were asked to perform the Point and Hold and Point and Push gestures, the application asks the user to select a random target in a grid of 2 targets, then in a grid of 4, then in a grid of 8 (Fig. 3b) and finally in a grid of 16 targets. The varying number of selectable targets allows the performance and precision of the developed gestures to be assessed relative to the number and size of targets. This procedure is repeated three times, so a total of 12 selections was performed per participant. When users select the desired target, the application automatically moves to the next target selection task. In case of a wrong selection, the application logs a missed hit and the user is asked to select the same target again. If a participant makes three wrong selections, the application assumes the user failed that task and automatically switches to the next one. However, this never occurred during the user tests.
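A sketch of this retry-and-give-up rule is shown below; the callback-based interface is hypothetical and stands in for the prototype's event handling.

```python
MAX_WRONG_SELECTIONS = 3

def run_selection_subtask(required_target, get_selection, log) -> bool:
    """One selection subtask: retry on a miss, give up after three wrong selections.
    `get_selection` blocks until the user selects some target (hypothetical callback);
    `log` records events for the later quantitative analysis."""
    wrong = 0
    while wrong < MAX_WRONG_SELECTIONS:
        selected = get_selection()
        if selected == required_target:
            log("hit", required_target)
            return True                 # move on to the next subtask
        wrong += 1
        log("missed hit", selected)     # ask for the same target again
    log("failed", required_target)      # three misses: skip this subtask
    return False
```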

Between successfully completed navigation and selection tasks, a five-second period was imposed during which users could not interact. This allowed the older users' arms to relax frequently and also simulated more realistic interactions: in real scenarios, users typically have to process newly displayed information after interacting with a technological system. In order to avoid any bias related to the sequence of the performed gestures, the application randomized the order of the gestures tested for the navigation and selection tasks.

The evaluation was both quantitative and qualitative. Participants' performance was automatically measured by logging the task completion time as well as the number of errors. After performing all the required gestures, the participants answered a simple questionnaire with three questions for each gesture, regarding the ease of performing that gesture, whether it was tiring, and the accuracy of the gesture detection. A simple usability questionnaire was preferred since, from previous experience, older users seem to find exhaustive questionnaires such as TAM3 [34] too complex and have difficulty discerning between questions. Satisfaction was not measured solely from the questionnaire grades: the answers were contextualized against the actions users had performed, and users were asked to explain their grades and to make comments. A whole user test took, on average, between 25 and 40 min to complete.

4.4 Dependent measures and analysis

A within-subjects design was used where each participant tested all conditions. For the navigation tasks, each participant performed eight scrolling subtasks for each navigation gesture. Regarding the selection tasks, each participant performed 12 selection subtasks for each selection gesture.

Shapiro-Wilk tests were performed on the observed task completion times and numbers of errors to assess whether the dependent variables were normally distributed. If they were, parametric statistical tests such as the paired and unpaired t test and Pearson correlations were used; if not, nonparametric tests (Mann-Whitney and Wilcoxon) and Spearman correlations were used.
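For illustration, this decision rule could be implemented along the following lines with SciPy; the wrapper function and its interface are ours, not part of the study's tooling.

```python
from scipy import stats

def compare_conditions(a, b, paired: bool, alpha: float = 0.05):
    """Choose a parametric or nonparametric two-sample test via Shapiro-Wilk checks.
    `a` and `b` are per-participant measures (e.g. completion times) for two conditions."""
    normal = stats.shapiro(a).pvalue > alpha and stats.shapiro(b).pvalue > alpha
    if normal:
        test = stats.ttest_rel(a, b) if paired else stats.ttest_ind(a, b)
    elif paired:
        test = stats.wilcoxon(a, b)       # within-subjects, non-normal data
    else:
        test = stats.mannwhitneyu(a, b)   # between-groups, non-normal data
    return test.statistic, test.pvalue
```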

5 Results

The results of the user tests are based on the quantitative analysis of the time required to complete the proposed tasks, as well as the number of errors participants made while performing them. These were complemented by the qualitative analysis of the questionnaire answers and the comments users made just after performing the proposed tasks.

In Sects. 5.1 and 5.2, the analysis and the results shown in the graphs refer to 33 participants only: 7 of the 40 users did not complete the test, for several reasons. In two cases, the reasons were external (one participant ran out of time, and in the other case technical problems occurred). In the remaining five cases, users did not finish the proposed tasks due to difficulties they met while interacting. A detailed discussion of the participants who did not complete the user test is presented in Sect. 6. For the time being, the results from the 33 complete user tests are discussed.

5.1 Task completion time

To understand whether the data from the French and the Hungarian user studies could be merged and analysed together or should be analysed separately, a Mann-Whitney test between the two groups was performed. For all the tested gestures, no statistical differences were found regarding the task completion time (Swipe: U(32) = 199, p = 0.41; Grab and Drag: U(32) = 76.5, p = 0.12; Point and Hold: U(32) = 95.5, p = 0.17; and Point and Push: U(32) = 111.5, p = 0.43). Since there are no statistical differences between the French and Hungarian user groups, the task completion times of the two groups were merged for analysis. This indicates that cultural differences did not play a major role in the performance of executing the defined gestures, and the results can therefore be generalized across both countries.

The boxplot in Fig. 7 illustrates the time required to perform the proposed tasks, in seconds, grouped by gesture. Regarding navigation tasks, users completed them faster when using the Swipe gesture; a Wilcoxon signed-rank test revealed that the difference is statistically significant (Z(32) = −2.76, p = 0.006). This occurred mainly because Swipes are very simple and easy to learn, and most participants had no difficulty performing the gesture. Swipes also allow scrolling longer distances faster, thus letting users complete the proposed tasks more quickly. However, some participants had problems using the Swipe gesture: they interacted in a way that led the system to recognize Swipes in the opposite direction to the one intended. This made control very difficult, and since it was the very first time they were experiencing gesture interaction, these users could not make sense of what was happening, which created a frustrating experience. Otherwise, most users found the Swipe more natural and easier to perform.

Most seniors found the Grab and Drag gesture more complex and harder to perform than the Swipe gesture. Some participants reported that they needed to be very focused in order to coordinate the motions required to perform the Grab and Drag gesture (Fig. 4b). This feedback was corroborated by the time they took to perform the tasks using this gesture: although the median is much lower (283 s), some users required around 10 min to complete the Grab and Drag tasks. Such times are too long for the proposed tasks, revealing a difficult and inefficient interaction; these participants struggled to coordinate the different movements required to execute the gesture. On the other hand, some of the seniors who were able to perform the Grab and Drag gesture declared that they preferred it because it allowed finer control, particularly for scrolling short distances.

Fig. 7 Time required to complete the proposed tasks, in seconds (without outliers)

Regarding selection tasks, the Point and Hold gesture allowed for better performance than the Point and Push alternative. A Wilcoxon signed-rank test showed that this difference is statistically significant (Z(32) = −4.66, p < 0.01). Both gestures require pointing at the screen, though selecting a target using Point and Hold requires only one movement (pointing the hand for 1.5 s), while Point and Push requires two: pointing the hand over the target and then hitting the target by pushing the hand forward. It can therefore be concluded that users took more than 1.5 s to perform the Push part of the gesture (the time Point and Hold requires for a selection). Indeed, some users had trouble performing the "push" part of the gesture, taking around 5 min to complete the selection tasks using Point and Push. Nevertheless, almost all users performed the selection tasks without difficulty, finding the gestures simple and easy to perform.

5.2 Error rate

As with the task completion time, a Mann-Whitney test was performed between the French and the Hungarian groups to check whether the analysis could be performed jointly. For all the gestures, no statistical differences were found regarding the number of errors (Swipe: U(32) = 98, p = 0.20; Grab and Drag: U(32) = 72.5, p = 0.08; Point and Hold: U(32) = 129, p = 0.84; and Point and Push: U(32) = 104, p = 0.25). Since there are no statistical differences between the French and Hungarian user groups, the error data of the two groups were combined for analysis.

Regarding navigation tasks, errors were classified into three categories: direction, no output and precision errors. Direction errors occur when users are asked to navigate in one direction but end up scrolling in the opposite direction. This happened when participants did not fully understand or did not perform the gesture correctly, or when the system wrongly recognized the users' movements. No output errors occur when users move their hand with the intention of navigating but no actual scrolling takes place. This may happen when the gesture is not wide or fast enough, or when the Kinect fails to precisely recognize the user's motions. Finally, precision errors happen when users scroll in one direction to reach a particular number but, due to lack of precision, pass it by. In this case, users have to perform an additional gesture in the opposite direction to acquire the desired number.

Fig. 8 Number of errors that users made while performing the proposed tasks (without outliers)

The boxplot in Fig. 8 summarizes the number of errors that users made when performing the proposed tasks. As shown, the total number of errors is similar for the two navigation gestures; indeed, a Wilcoxon signed-rank test showed no statistically significant difference (Z(32) = −0.18, p = 0.86). However, the types of errors users committed with each navigation gesture are different (Fig. 9). It must be noted that not all errors can be attributed to users' actions, as the technology still lacks accuracy in some cases. Also, the segmentation and recognition of gestures is a complex process in which it is difficult to algorithmically detect the users' intentions. Nevertheless, since this type of technology is still being improved, it is expected to become more accurate in the near future.

As shown in the graph of Fig. 9, the most common type of error was the direction error, and users made more of these errors using the Swipe gesture. When performing this gesture, direction errors occurred mainly because it is sometimes difficult to algorithmically interpret the intention of the user. For example, if users wanted to perform a Swipe to the left, they would raise their hand and, in order to have more amplitude of movement for the gesture, first move their hand to the right. If this movement to the right was wide enough, the system would erroneously recognize it as a Swipe to the right. Segmenting and recognizing the intentions of the user in a continuous space of gestures is a complex challenge, magnified by the fact that each user has their own way of interacting.

Considering the Grab and Drag gesture, direction errors occurred mostly because of lack of coordination. In order to perform consecutive scrolls in the same direction, users have to close their hand to grab, drag the hand in the desired direction, then open the hand and bring it back to repeat the motion. However, some users would forget to open the hand between these steps, which made them scroll back in the wrong direction, to the point where they had started. This error occurred more frequently when users had to scroll several consecutive times in the same direction, and also at the beginning of the test, when users had less experience.

Fig. 9 Percentage of errors in navigation tasks, grouped by type of error

Regarding no output errors, the Grab and Drag gesture produced more errors of this type (Fig. 9). This mainly occurred because users would forget to close the hand at the right moment and would initiate the horizontal movement without closing the hand, thus performing no scroll. At other times, they would close and immediately move the hand, giving the system no time in between to correctly detect that the hand was closed. There were also cases where the system simply failed to recognize that the user's hand was closed. The Swipe gesture produced no output when users' movements were so slight that the system did not recognize them as intentions to scroll, or when users also moved the hand vertically so that the motion was not considered a Swipe.

Finally, regarding precision errors, their number was slightly higher with the Swipe gesture than with the Grab and Drag (Fig. 9). This is mainly because, when using the Grab and Drag gesture, users get instant feedback and a direct mapping of hand movements to scrolling. For the Swipe gesture, in contrast, users have to perform the whole gesture before the system detects it, and only see the result after the gesture has been fully executed. Users get no instant feedback while performing the Swipe gesture, which does not allow precision as good as with the Grab and Drag gesture. These data corroborate users' comments about their perception of control with the Grab and Drag gesture: the users who were able to perform this gesture reported that they had more precision than with the Swipe. However, the difference between the two gestures is not very pronounced, because participants who struggled to perform the Grab and Drag gesture accurately also lacked precision, and therefore made errors of this kind.

Considering the selection tasks, a selection was counted as an error when users selected a target other than the one they were required to select. As shown in the boxplot of Fig. 8, the Point and Hold gesture yielded fewer errors than Point and Push. A Wilcoxon signed-rank test revealed that this difference is statistically significant, although with lower confidence (Z(32) = −1.92, p = 0.055). It can therefore be concluded that the Point and Hold gesture allows for better precision than the Point and Push gesture and lets users make fewer errors when selecting targets on screen.

Moreover, very different results were observed for each gesture when comparing the number of errors with the number of targets displayed on screen. Figure 10 shows the error rate for each gesture, grouped by number of targets on screen. As illustrated, when users were performing the Point and Hold gesture, most of the errors occurred when there were only two or four targets on screen. This happened because, usually at the beginning of the test, users would start by pointing at an unintended target and did not yet have enough practice to adjust the hand position to the desired target within the 1.5 s dwell time, so an erroneous target selection would occur.

Fig. 10 Percentage of errors in selection tasks, grouped by number of targets on screen

On the other hand, the number of errors with the Point and Push gesture increases with the number of targets on screen. In this case, the reason was the lack of precision users had when performing the "push" part of the gesture. Indeed, users had no trouble preselecting the desired target by placing the hand cursor over it; however, when performing the "push" motion, they would slightly move their hand and accidentally select another target. Also, some users would start the Point and Push gesture with the arm already stretched, leaving no room for the arm to perform the "push" part of the gesture. In this case, they had to stretch even more, resulting in a loss of precision.

5.3 User satisfaction

At the end of the user tests, participants were asked to answer a satisfaction questionnaire regarding the ease of performing the gestures, whether they were tiring, and the accuracy of the gesture detection. A 5-point Likert scale was used, with higher scores being better.

A Mann-Whitney test was performed between the French and the Hungarian groups to see whether there were differences in satisfaction between the two user groups. For all questions, no statistical differences were found (p > 0.05 in all cases; the results of these tests are not presented individually because 12 tests were performed).

Figure 11 shows a boxplot of the results of the satisfaction questionnaire. As illustrated, participants frequently gave high scores to the gestural interaction on all measured metrics for every gesture. The elderly participants were enthusiastic and captivated by the novelty of this type of interaction, which may have inflated the scores. Moreover, independently of the quantitative results, participants felt that performing the in-air hand gestures to interact with the prototype application was enjoyable and fun. In fact, during the debriefing interviews, users attributed their difficulties to the time needed to get used to performing the gestures; they were confident that with more practice they would be able to use this kind of interaction more proficiently.

Regarding navigation tasks, both gestures achieved similar satisfaction results. Indeed, a Wilcoxon signed-rank test showed no statistically significant differences between the Swipe and the Grab and Drag gestures for any measured metric (ease: Z(32) = −1.61, p = 0.11; tiring: Z(32) = −1.06, p = 0.29; and accuracy: Z(32) = −0.34, p = 0.74). Participants were divided between these two gestures: some preferred the Swipe and others the Grab and Drag.

For selection tasks, a Wilcoxon signed-rank test showed statistically significant differences, with the Point and Hold gesture being easier to perform (Z(32) = −3.24, p < 0.01). Participants also found the Point and Hold less tiring than the Point and Push, and this difference was statistically significant (Z(32) = −2.59, p = 0.01). For the accuracy measure, no statistically significant difference was found between the alternative selection gestures (Z(32) = −1.62, p = 0.11).

Fig. 11 Results of the satisfaction questionnaire (without outliers)

Besides the satisfaction questionnaire, participants’ comments while interacting were gathered. Regarding the navigation gestures, it has already been seen that both alternatives achieved similar results in the satisfaction questionnaire. However, this occurred because some participants preferred the Swipe, while others preferred the Grab and Drag, which tied the satisfaction scores. Thus, it is relevant to analyse the most frequent comments made by the participants.

They reported that the Swipe gesture is easier to learn and execute than the Grab and Drag gesture. The Swipe was considered a more natural gesture, although some older people declared that it was not always easy for them to work out in which direction they should Swipe, especially at the beginning of the user test. Some participants found the Grab and Drag gesture too complex and demanding in terms of coordination, considering it a gesture that is not usually performed in everyday life and thus harder to master. They also reported that this gesture was difficult to initiate, because they did not know where to initially place the hand: if users place the hand cursor at the left of the screen, there is not much room to perform a scroll to the left.

Regarding the precision of the navigation gestures, users reported that the Swipe did not allow very precise scrolling, particularly when they wanted to scroll very short distances. They stated that fine movements were not well detected and that the scrolling moved too quickly, which did not allow for much precision. Another recurrent comment, one which also shows up in the log analysis as mentioned earlier, was that seniors sometimes wanted to perform the Swipe in one direction, but the system recognized the gesture in the opposite direction.

The participants who were able to perform the Grab and Drag gesture commented that it allowed more control and precision. Other seniors reported discomfort while performing the Grab and Drag gesture, stating that since the palm of the hand needs to face the television screen, this right-angled twist of the wrist is an uncomfortable position. Indeed, one user who had a broken arm could easily do the Swipe gesture but had more difficulty performing the Grab and Drag. Some seniors also complained that the system was not very precise in detecting whether their hand was open or closed.

Participants also linked the difficulty of performing navigation gestures with the lack of practice. However, they were optimistic that if they had more time to practice, they would get used to it and would be able to use the in-air hand gestures more proficiently.

The selection tasks were performed more easily than the navigation tasks. Besides the selection gestures being simpler, the authors are aware that this may also have occurred because these tasks were performed after the navigation tasks, giving users more time to get used to gesture-based interaction.

For selection tasks, most users preferred the Point and Hold gesture. They reported this gesture to be very simple and easy to perform. Even when the number of targets on screen increased, users were able to select the correct target using the Point and Hold gesture, although they reported it was uncomfortable when there were many targets on screen.

The participants also enjoyed the Point and Push gesture, though they found it a bit more tiring for the arm than its alternative. Some users would start the Point and Push gesture with the arm already stretched out; in this case, there was no room for the arm to stretch further to perform the "push" part of the gesture, resulting in users painfully stretching the whole upper body. Indeed, most users had no problem preselecting, i.e. getting the hand cursor over the desired target, but when performing the "push" part they would lose precision and press on another target or even outside the screen.

The user tests in France were performed prior to the ones in Hungary. After gathering the insights from the French user tests, the satisfaction questionnaire was improved to include a question about the participants’ preferred gesture for navigation and selection tasks. The Hungarian participants showed a clear preference for the Swipe gesture for navigation tasks: 65 % of users voted for this gesture, against 35 % who preferred the Grab and Drag gesture. For selection tasks, a majority of 65 % of users preferred the Point and Hold gesture against 30 % who preferred the Point and Push alternative (one participant could not decide between the two alternatives and voted for both).

5.4 Impact of physical aptitude on performance

In order to evaluate whether the physical aptitude of participants played a major role when interacting with in-air gestural interfaces, participants were divided into two groups based on the personal health information they declared when taking the tests.

The first group is composed of older users who reported not having any physical movement issues (referred to as the physically fit group from now on). The second group comprised those who reported some kind of physical impairment (referred to as the conditioned group from now on). This second group included older people with conditions such as rheumatism, tendinitis, osteoarthritis, ankylosing spondylitis, Parkinson’s disease, and leg and back pain. However, these conditions were not severe to the point of preventing them from completing the proposed tasks.

Although the conditioned group was composed of participants with heterogeneous physical impairments, and despite the importance of individual differences, all physically conditioned participants had to be merged into a single group. Otherwise, it would have been impossible to run statistical tests, since there would be many groups with very few elements.

The physically fit group had 18 members, 6 males and 12 females. Their ages ranged from 53 to 71, with an average of 62 years (sd = 4.5). The conditioned group had 15 members, 6 males and 9 females. Their ages ranged from 56 to 80, with an average of 67 years (sd = 6.3). Although there is a difference in the ages of the two groups, this is an unavoidable consequence of the fact that, as people grow older, the likelihood of having some kind of impairment increases [30].

Figure 12 provides a boxplot of the time participants required to complete the navigation tasks, grouped by physical aptitude. As expected, the data reveal that the physically fit participants were able to complete the proposed tasks faster than the conditioned ones. Indeed, a Mann-Whitney test revealed statistically significant differences between the two groups for the Swipe gesture (U(32) = 82, p = 0.033). For the Grab and Drag gesture, the difference only approached statistical significance (U(32) = 89, p = 0.058).
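For readers who wish to reproduce this type of comparison, the following minimal Python sketch applies a two-sided Mann-Whitney U test using scipy.stats.mannwhitneyu. The completion times are hypothetical placeholders, not the study data.

from scipy.stats import mannwhitneyu

# Hypothetical task completion times in seconds; the actual study data
# are not reproduced here.
fit_times = [41.2, 38.5, 45.0, 39.9, 50.3, 42.8, 37.1, 44.6]
conditioned_times = [55.4, 61.2, 49.8, 58.0, 63.5, 52.7, 60.1]

# Two-sided Mann-Whitney U test for two independent samples.
u_stat, p_value = mannwhitneyu(fit_times, conditioned_times, alternative="two-sided")
print(f"U = {u_stat}, p = {p_value:.3f}")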

This result is somewhat expected: since the navigation gestures involve moving the arm back and forth several times, it is not surprising that the more physically fit users completed them in a shorter period of time.

Fig. 12 Participants’ speed for completing navigation tasks, in seconds, grouped by physical aptitude (without outliers)

Figure 13 shows a boxplot of the time participants required to complete the selection tasks, grouped by physical aptitude. As shown, the median of the physically fit participants was lower than that of the conditioned ones, and physically fit participants were able to perform the tasks slightly faster. However, a Mann-Whitney test revealed that the differences are not statistically significant, neither for Point and Hold (U(32) = 105, p = 0.28) nor for Point and Push (U(32) = 124, p = 0.69).

Fig. 13 Participants’ speed for completing selection tasks, in seconds, grouped by physical aptitude (without outliers)

It was also expected that the physically fit participants would perform the selection tasks faster, though in this case it did not happen. This result can be interpreted in light of the inherent simplicity of the selection gestures. Compared to the amplitude of the navigation gestures, the selection gestures are simpler and easier to perform: they only involve pointing at the screen and, in the case of the Point and Push gesture, performing the “push” motion. Being relatively simple gestures, greater physical fitness does not make a difference. Moreover, in their comments, participants complained less about articulation problems when performing selection gestures than when performing navigation gestures.

A similar analysis was also performed for the error rate. Figure 14 shows a boxplot of the number of errors participants made in navigation tasks, grouped by physical aptitude. As shown, the results of both groups are very similar. For the Swipe gesture, conditioned users even had a lower median error rate than the physically fit users. However, a Mann-Whitney test revealed no statistically significant differences between the two groups for either gesture (Swipe: U(32) = 133.5, p = 0.95; Grab and Drag: U(32) = 105, p = 0.26).

Fig. 14 Participants’ number of errors on navigation tasks, grouped by physical aptitude (without outliers)

Regarding the number of errors for selection tasks, the results are shown in Fig. 15. As with the navigation gestures, the error rates of the two groups are very similar. Indeed, a Mann-Whitney test revealed no statistically significant differences between the physically fit and conditioned groups for either gesture (Point and Hold: U(32) = 102, p = 0.10; Point and Push: U(32) = 131, p = 0.88).

Fig. 15 Participants’ number of errors on selection tasks, grouped by physical aptitude (without outliers)

These data show that physical aptitude does not correlate with the number of errors participants make while interacting. Therefore, the ability of a conditioned user to interact with in-air hand gestural interfaces does not appear to be compromised: the accuracy of the conditioned users is comparable to that of the physically fit users, as the error rate is similar for both groups.

As already seen, the only difference between physically fit and conditioned users is that the former group was able to perform the navigation gestures faster. These gestures involve actively moving a hand in the air, in which case the physical fitness of users proved to be an advantage. For selection tasks, there was no correlation between physical aptitude and task completion time, nor was there any correlation between physical aptitude and the number of errors participants made. This indicates that the physical limitations typically experienced by older adults do not compromise their ability to use in-air gestural interactions, as both groups had similar results on most metrics.

A Mann-Whitney test was performed between the physically fit and conditioned groups to check whether there were differences in satisfaction between the two user groups. For all questions answered, no statistically significant differences were found (p > 0.05 in all cases; the results of these tests are not individually presented because 12 tests were performed).

6 Discussion

After analysing all the quantitative and qualitative data, the research questions presented at the beginning of this user study can now be answered.

Q1:

Are in-air hand gestures adequate for older adults to interact with general technological interfaces?

Despite the fact that no participant had any previous experience with gestural interfaces, most senior users performed all the proposed tasks without major problems. However, 7 of the 40 participants did not complete the full test. Two of these cases, participants #10 and #19 of the Hungarian study, occurred for external reasons. Participant #10 did not complete the test because he spent so much time giving his opinion about gestural interfaces that the time he had available ran out. While participant #19 was performing the test, the computer froze and the test had to be ended prematurely. Therefore, only 5 out of 38 participants could not complete the full test (13 %).

Participant #8 of the French study, who had recently had a stroke, was having many difficulties performing the Grab and Drag and the Point and Push gestures, so it was decided to end these tasks before she completed them. She did, however, complete all the tasks for the Swipe and Point and Hold gestures. This confirms previous results that these two gestures are easier to execute, even for people with greater difficulties interacting.

The remaining four participants who did not fully complete the user test were Hungarian seniors. Participant #8 was only able to complete all the proposed tasks for the Point and Hold gesture; for the other gestures, she only completed the tasks halfway, as she got confused with the situation and the exercises. Participant #9 also could not use the gestures easily. She had never used a computer before and was far from understanding the technology itself. She managed to complete the Swipe and Point and Hold tasks, although not the Grab and Drag and Point and Push ones. Participant #11 was in a situation similar to that of participant #9, and she likewise completed only the Swipe and Point and Hold tasks. Participant #13 had both hands broken and could not keep them in one particular position for a long time. She managed to complete the navigation tasks, since they involve moving the hands more actively, but she could not complete the selection tasks, which require holding the hand steady.

All other participants (87 %), even those suffering from other minor physical health problems such as arthritis and mobility and balance issues, had no major hindrances in performing the whole user test. On the contrary, interacting through gestures could even be beneficial, since previous studies found that even low-intensity exercise positively impacts the health of the elderly [29]. In short, the in-air hand gestures tested can be used by the majority of older people to interact with a general technological interface.

Q2:

Which type of gesture allows for the fastest navigation and selection with the lowest error rate?

For navigation tasks, the Swipe gesture outperformed the Grab and Drag gesture in terms of speed. This occurred because the Swipe gesture allowed users to scroll greater distances faster and also because this simpler gesture is easier to learn and perform. In terms of number of errors, both alternatives achieved similar results. However, participants committed more direction and precision errors with the Swipe gesture and more no-output errors with the Grab and Drag gesture.

Indeed, the main problem of the Swipe gesture was recognizing the direction in which the user intended to scroll, as some users interacted in a way that led the system to recognize a Swipe in the opposite direction. A possible solution to this problem is to allow each hand to Swipe only in one particular direction. However, despite certainly reducing the number of errors, this solution limits the number of possible interaction scenarios, as it requires the user to have both hands free in order to navigate in both directions. Moreover, most users declared that they rarely use their non-dominant hand, and people with an impaired arm may not feel comfortable being compelled to use both hands. It is a trade-off that should be weighed for each scenario.

Regarding the time and distance thresholds imposed for the Swipe gesture, they proved not to be ideal for all users. For some users, the Swipe motion was so wide that the gesture recognizer would detect two consecutive Swipes in the same direction; for others, the gestures were so slight that the system did not recognize them. Therefore, despite reasonable thresholds having been defined that allowed all users to adapt and perform all the proposed tasks, it can be concluded that each user has his or her own particular way of interacting. When developing gestural interactions, designers should take this into account and provide an adaptive model or several user profiles to accommodate users’ different ways of interacting.
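To make the role of these thresholds concrete, the sketch below shows one plausible shape for a threshold-based Swipe detector. The study does not publish its recognizer, so the structure, names and threshold values here are illustrative assumptions only.

from dataclasses import dataclass

@dataclass
class SwipeDetector:
    # Illustrative thresholds; the values actually used in the study are not published.
    min_distance: float = 0.20  # metres the hand must travel horizontally
    max_duration: float = 0.50  # seconds within which that distance must be covered

    def detect(self, samples):
        """samples: list of (timestamp_s, hand_x_m) pairs, oldest first.
        Returns 'left', 'right', or None."""
        if len(samples) < 2:
            return None
        t0, x0 = samples[0]
        t1, x1 = samples[-1]
        if t1 - t0 > self.max_duration:
            return None  # too slow: treated as ordinary hand movement
        dx = x1 - x0
        if abs(dx) < self.min_distance:
            return None  # too slight: below the distance threshold
        return "right" if dx > 0 else "left"

A recognizer of this kind must also clear its sample buffer after each detection; otherwise a single, very wide motion can be reported as two consecutive Swipes, which is precisely the behaviour some participants experienced.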

The Grab and Drag gesture, although slower, allowed for more control and precision while navigating. Therefore, for technological interfaces that demand high precision, the Grab and Drag gesture may be the better alternative. Nevertheless, it must be stressed that some of the senior participants had many difficulties coordinating and performing this gesture, so it may not be the best choice for this particular user group.

In conclusion, for navigation tasks the Swipe gesture is preferred for older users: it allows for better performance and is simpler and easier to learn. Even among the five participants who could not complete the whole user test, four managed to complete the Swipe gesture tasks, whereas only one managed to complete all the Grab and Drag gesture tasks. Hence, 97 % of participants were able to perform all the tasks for the Swipe gesture, in contrast to the Grab and Drag gesture, where only 89 % of users were able to complete all the proposed tasks.

For selection tasks, the Point and Hold gesture allows for faster selections and fewer errors than the Point and Push. Moreover, the Point and Hold allows for greater precision even when there are more targets on the screen. One improvement that can be made to the original implementation is to increase the time users must hold over a target before it is selected. When performing selection tasks, participants erroneously selected some targets because they did not have time to readjust to the desired target; this inflated the error rate, which must therefore be interpreted with these parameters in mind.
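To illustrate this suggestion, the following minimal dwell-based selection sketch restarts its timer whenever the cursor moves to a different target; the 1.5 s dwell time is an assumption, as the study does not report the value it used.

import time

class DwellSelector:
    """Minimal sketch of Point and Hold (dwell) selection."""

    def __init__(self, dwell_time_s=1.5):
        self.dwell_time_s = dwell_time_s  # assumed value, not from the study
        self._current_target = None
        self._enter_time = None

    def update(self, target, now=None):
        """Call on every frame with the target under the hand cursor (or None).
        Returns the target once the cursor has dwelt on it long enough."""
        now = time.monotonic() if now is None else now
        if target != self._current_target:
            # A new target restarts the dwell timer, giving the user time
            # to readjust before any selection is triggered.
            self._current_target = target
            self._enter_time = now
            return None
        if target is not None and now - self._enter_time >= self.dwell_time_s:
            self._enter_time = now  # avoid immediately re-selecting the same target
            return target
        return None

Lengthening dwell_time_s trades selection speed for fewer accidental selections, which is the parameter trade-off referred to above.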

The selection gestures were easier to perform than the navigation gestures, probably because they are simpler. Of the five users who did not manage to complete the whole user test, four were able to complete the Point and Hold tasks, while only one was able to complete the Point and Push tasks. Thus, 97 % of participants were able to perform all the tasks for the Point and Hold gesture, though only 89 % of seniors were able to complete all the proposed tasks using the Point and Push gesture. Therefore, for selection tasks, Point and Hold is the preferred gesture.

Q3:

Do older users enjoy using gestural interfaces and find them easy to use? Which gestures do older users prefer?

Most seniors adapted well to gestural interfaces. They found them easy to use and enjoyed using them. All the developed gestures achieved good ratings in the satisfaction questionnaire in terms of ease of use, tiredness, and accuracy of the gesture recognition. The median rating of every gesture was four or better on all measured metrics.

In terms of preference for navigation gestures, some older adults preferred the Swipe while others preferred the Grab and Drag, which resulted in similar overall satisfaction scores. Although the difference is not statistically significant, the Swipe gesture was rated higher on the ease-of-use and tiredness metrics, with a tie in terms of accuracy. This shows a slight preference of users towards the Swipe gesture. In the Hungarian study, where users were asked which of the two alternatives they preferred, the Swipe was favoured by a majority of 65 %.

Regarding the selection gestures, there was a clear preference for the Point and Hold gesture. It was considered easier to perform and less tiring than the Point and Push, with statistically significant differences. Regarding accuracy, the alternatives were tied. In the Hungarian study, when asked about their preferred gesture, a majority of 65 % of seniors chose the Point and Hold.

6.1 Limitations

Although participants reported that selection gestures were easier to perform than navigation gestures, another factor that must be taken into account is that the selection tasks were performed after the navigation tasks. When designing the experimental evaluation, although the order of the gestures within each category was randomized, the order in which the categories were tested was fixed. Thus, participants had slightly more experience with gestural interfaces by the time the selection tasks were performed, which may have biased the results slightly in favour of the selection tasks.

As already stated, the detection performed by the Kinect sensor is not perfectly accurate: sometimes the hardware cannot accurately detect users’ positions and movements. It was particularly noted that detection loses precision when the user is sitting, as the body is constantly in contact with the supporting object. This makes user identification less precise, a problem that does not arise when users are playing video games, which are usually played standing. Therefore, some of the errors that occurred were not due to the users’ actions, but to the fact that the gesture detection mechanisms are still not perfect.

This detection accuracy problem could be mitigated by using a remote controller with motion-sensing capabilities, such as the Wii Remote or PlayStation Move. However, the ageing process usually brings motor issues such as reduced strength and decreased grip force [23], which in turn may affect the use of such devices. On the other hand, Kinect is a technology that is still maturing, and its detection accuracy is expected to improve in the coming years, making it more usable and less error prone.

6.2 Other remarks

Having answered the research questions that motivated this study, some other important details were also observed. For all gestures, users had a tendency to perform slightly better with the second gesture tested. This probably occurred due to increased experience with gestural interfaces and also to the reduction in the stress typically associated with a test environment, which is usually higher at the beginning of the test procedure. This was also corroborated by the participants’ comments, as they stated that they would certainly perform better if they had more time to practice.

It was also found that the gestures Microsoft distributes as defaults in its SDK, the Grab and Drag and the Point and Push, were not well suited for older adults. Microsoft presumably performs extensive user testing before any software release, and these gestures likely proved better suited for the mainstream user. However, what may be best for the mainstream user is not necessarily the best alternative for another target group. In this case, the elderly users did not adapt as well to the defaults Microsoft provides; instead, simpler gestures allowed seniors to interact better with in-air gestural interfaces.

Regarding the five participants who did not complete the user test because of difficulties while interacting, four of them had no prior experience with computers, and the fifth reported having very little experience and not owning a computer at home. In fact, apart from one other user with very low computer proficiency, these were the participants with the lowest levels of computer experience.

Since almost all users with low computer experience had problems using the gestural interface, it can be argued that these participants did not have a specific problem with the in-air hand gestures; rather, their difficulties stem from a lack of prior experience with technology in general. It was difficult for them to perform actions such as navigating and selecting on a technological interface. It can be concluded that, if users are not familiar with technology, gestural interfaces alone will not provide enough simplification to lead older people to accept and use technology. Gestural interfaces seem better suited for older people who already have some familiarity with technology and computer interfaces.

7 Design implications

From the results, observations and user comments, a number of design implications for gestural interfaces were derived. The authors believe that these implications are also valid for the average user, although they are more significant for the older user, on whom the user tests focused.

7.1 Avoid complex coordination

Gestures that may look simple to a young adult, such as the Grab and Drag, may prove to be complex coordination challenges for older adults. The gestures composed of two distinct steps (Grab and Drag, Point and Push) demanded more concentration from seniors, which led to a reduction in performance. The simpler the gesture, the easier it is to learn, which also increases motivation to keep using it. Therefore, gestures should be as simple as possible, avoiding complex coordination.

7.2 Develop gestures that can be performed by either hand

In this study, care was taken to simulate a real-life scenario where gestural interfaces bring value, such as a living room. In this scenario, users may not have their dominant hand free. Therefore, and related to the previous design implication, a gesture must be simple enough to be performed with the non-dominant hand. All participants used only their dominant hand to perform the Grab and Drag gesture, whereas some seniors used both hands to perform the Swipe gesture in both directions. The Swipe, being simple enough to be performed with either hand, allows for greater freedom in interaction.

7.3 Give visual feedback of the state of the gesture recognition

Users felt more confident using the system when they could see the status of the gesture recognition. Indeed, gestures that do not directly and instantly map users’ movements, such as the Swipe gesture, did not allow for as much control and precision as gestures that do. Another recurrent request from users was to show the position of their hand on the screen, even when they were not pointing at it. With visual feedback, users can more easily understand what the system is recognizing and correct their movement if necessary.
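One straightforward way to provide such feedback is to expose the recognizer’s internal state to the user interface. The states and messages below are illustrative assumptions rather than the system described in this study.

from enum import Enum, auto

class GestureState(Enum):
    IDLE = auto()         # no hand tracked
    TRACKING = auto()     # hand visible, no gesture in progress
    IN_PROGRESS = auto()  # gesture partially recognized (e.g. mid-Swipe)
    RECOGNIZED = auto()   # gesture completed and accepted

def feedback_message(state: GestureState) -> str:
    """Map the recognizer state to a short on-screen hint, so users can
    see what the system is doing and correct their movement if needed."""
    return {
        GestureState.IDLE: "Raise your hand to begin",
        GestureState.TRACKING: "Hand detected",
        GestureState.IN_PROGRESS: "Gesture in progress...",
        GestureState.RECOGNIZED: "Gesture recognized",
    }[state]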

7.4 Allow personalization and adaptation

Each user has his or her own particularities in the way he or she moves, in both speed and amplitude. This makes static thresholds suboptimal for the population as a whole. Gesture recognition is a great challenge per se; nevertheless, the optimal solution involves adapting these thresholds to each user, preferably automatically. Otherwise, manual personalization should be available, such as letting users define the speed and length of the gestures.
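As a sketch of what automatic adaptation might look like, the illustrative class below drifts a distance threshold towards half of each user’s typical gesture amplitude using an exponential moving average. The study proposes adaptation but prescribes no specific method, so the approach, names and constants here are assumptions.

class AdaptiveThreshold:
    """Adapts a gesture distance threshold to the amplitude of a user's
    accepted gestures via an exponential moving average (illustrative)."""

    def __init__(self, initial_threshold=0.20, alpha=0.1, lo=0.10, hi=0.40):
        self.threshold = initial_threshold
        self.alpha = alpha         # adaptation rate (assumed constant)
        self.lo, self.hi = lo, hi  # clamp to keep the threshold sensible

    def observe(self, gesture_amplitude):
        """Call with the measured amplitude of each accepted gesture; the
        threshold drifts towards half the user's typical amplitude."""
        target = 0.5 * gesture_amplitude
        self.threshold += self.alpha * (target - self.threshold)
        self.threshold = min(self.hi, max(self.lo, self.threshold))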

8 Conclusion

This study has shown that in-air hand gestural interactions are an appropriate way for older adults to control a general technological interface. Most participants enjoyed using this type of interface and found it fun to use. The participants found most of the evaluated gestures easy to learn and perform. It was also confirmed that, for older adults, the simpler the gesture, the better participants performed in completing the proposed tasks.

For navigation tasks, the Swipe gesture was simpler and therefore allowed users to complete the proposed tasks faster than the Grab and Drag. The latter, on the other hand, provided more precision and control. Regarding selection tasks, the most successful gesture was the Point and Hold, as it allowed for accurate and fast selections. Participants also considered the alternative, the Point and Push gesture, easy to perform, although it did not allow for as much precision as the Point and Hold. In terms of satisfaction, all gestures were highly rated in the satisfaction questionnaire, which means that this type of interaction was widely accepted. In general, the senior participants showed a positive attitude towards gesture-based interactions, and their remarks indicated that they enjoyed using this interface. The only downside is that the technology still cannot correctly interpret the user’s intentions in all cases, which may frustrate the user; fortunately, the cases in which it works are far more frequent than those in which it does not.

It was also seen that, comparing physically fit and conditioned participants, their performance results were similar except for the time required to complete the navigation tasks. These gestures involve more movement, and here physical fitness proved to ease task completion. Regarding selection tasks and error rates, there were no differences between the physically fit and conditioned participants’ results. This indicates that the physical conditions typically experienced by older adults do not compromise in-air gesture interaction.

8.1 Future work

Although this experimental evaluation tried to simulate an environment as close as possible to the real-life use scenario considered—the older user comfortably sitting in the living room and interacting through in-air hand gestures—the results should be validated in the wild [7], at the users’ homes. This test was made in a controlled laboratory setting, and it would be interesting to evaluate how these results compare to use in an everyday context over a period of time. As future work, an evaluation of how older users would react to longer and more repetitive use, such as using the system regularly, is considered. Field trials have been conducted as part of the PaeLife project [25, 27, 28, 33], and it is hoped that the data gathered will produce new complementary insights.

Finally, it was shown that the two countries where the user studies were performed, France and Hungary, had comparable results in both performance and user satisfaction. Although the gestures tested were relatively simple, it can be concluded that the different cultural backgrounds of these two countries did not play a major role in the interaction with in-air hand gestures. However, in order to assess whether cultural background plays an important role in the performance and acceptance of the gestures defined here, further tests should be performed in more countries with larger sample sizes.