
1 Introduction

Parkinson Disease (PD) is a progressive degenerative disorder of the central nervous system characterised by a large number of motor and non-motor features that can affect function to a variable degree. There are four main motor features of PD: tremor at rest, rigidity, akinesia (loss of control of voluntary muscle movements) and postural instability [19]. Gait is one of the most affected motor characteristics. Gait abnormalities can cause loss of balance and a tendency to fall, which often results in serious injuries [1]. In addition, the non-motor symptoms associated with PD include autonomic dysfunction, cognitive/neurobehavioral disorders, as well as sensory and sleep abnormalities [19]. As the percentage of the elderly in the population grows, the prevalence of PD in North America is expected to double over the next 20 years, and the disease already imposes a substantial economic burden [10]. In the United States alone, the annual economic impact of PD is estimated at $10.8 billion, 58 % of which is related to direct medical costs [16, 29]. Given this economic impact and the decrease in quality of life caused by PD, along with the predicted rise in prevalence, there is a substantial need for novel methods of treatment and rehabilitation.

Unfortunately, there is currently no cure for PD, but medication and various forms of therapy and rehabilitation are available to help manage symptoms and improve quality of life. However, several issues with current approaches to rehabilitation of patients with PD have been reported [23], with the lack of task- and context-specific rehabilitation programs being a main issue. Benefits from rehabilitation have often been linked to context, and the in-clinic context is typically contrived or artificial: it does not adequately capture the real-life scenarios, situations or challenges that patients face on a daily basis. Limitations of the in-clinic environment restrict the types of activities that can be included in rehabilitation programs [23]. In particular, scenarios that are potentially hazardous or dangerous, yet are part of daily life, cannot be supported in current rehabilitation programs.

Recently, interest in Virtual Environments (VEs) has grown in the PD research community because of the potential they offer. Different scenarios can be simulated, providing whatever “context” is needed, while bypassing the inherent limitations of the clinic environment and ensuring safety regardless of the scenarios presented. Many different VEs can be created through virtual or augmented reality technologies.

In this work, we created three different virtual environments using augmented reality. These environments allow us to assess patients with PD while they perform dual-task activities. How well the patients perform in those activities will help us evaluate the feasibility and limitations of using augmented reality as a support tool in PD rehabilitation programs.

1.1 Augmented Reality in This Context

Augmented Reality (AR) is the visual combination of real-time video streaming and computer-generated 2D and 3D imagery. As opposed to the classic Virtual Reality (VR) paradigm, in which users are immersed in an entirely simulated world, augmented reality allows users to stay connected with the real world while creating the illusion of being in a different physical location. Furthermore, AR provides users with the ability to see and interact with objects that are not present in their surroundings. According to Azuma et al. [5], augmented reality applications should meet three requirements: AR should mix video sequences with computer-generated imagery, AR applications have to run in real time, and virtual objects have to be properly aligned (registered) with real-world structures.

In AR, computer-generated graphics are overlaid onto the user’s field of view. For example, graphics can be used to (a) add supplementary information or instructions about the environment, (b) insert virtual objects, (c) enhance real objects, or (d) provide step-by-step visual aids needed for the execution of a task. In its most basic form, augmented reality overlays simple heads-up displays, images or text onto the user’s field of view. More complex AR applications display sophisticated 3D models rendered in such a way that lighting conditions, shadow casting and simulated occlusions appear indistinguishable from the surrounding natural scene. Figure 22.1 shows an example of a common AR system in which the video image is acquired, registered and augmented. In order to register the virtual cereal box in the image, the AR system derives tracking information from the video input. Once the registered 3D transformation has been rendered, the real object can take on any other appearance or even be transformed into a completely different object. This type of visualization is a powerful tool for exploring the real world along with added contextual information.

Fig. 22.1

A simple augmented reality example. a Original video feed. b Augmented scene
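To make this basic overlay form concrete, here is a minimal sketch (not taken from the chapter’s system) of blending a simple heads-up panel and a text label onto a live video feed, assuming OpenCV is available; the camera index and label text are illustrative:

```python
import cv2

cap = cv2.VideoCapture(0)  # default webcam; device index is an assumption
while True:
    ok, frame = cap.read()
    if not ok:
        break
    overlay = frame.copy()
    # Draw a filled panel, then alpha-blend it back for a simple HUD effect
    cv2.rectangle(overlay, (10, 10), (260, 60), (0, 0, 0), thickness=-1)
    frame = cv2.addWeighted(overlay, 0.4, frame, 0.6, 0)
    cv2.putText(frame, "Aisle 3: cereal", (20, 45),
                cv2.FONT_HERSHEY_SIMPLEX, 0.8, (255, 255, 255), 2)
    cv2.imshow("AR overlay", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
```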

1.1.1 Registration and Tracking

In order to appropriately integrate real and virtual information, the real image and the 3D augmentation have to be carefully combined rather than simply attached together. If computer graphics are generated without correctly registering the visible real environment, the visual composition of the two types of data will be unconvincing. Providing robust and accurate registration is the main technical difficulty that AR systems have to overcome. In AR systems like ours, where head mounted displays are used, registration is equivalent to computing the pose (rotation and translation) of the user’s viewpoint.

In AR, image registration uses video tracking algorithms that usually consist of two stages: tracking and reconstruction. In the first stage, fiducial markers or image features are detected; this step usually employs feature detection, edge detection, or other image processing methods. The reconstruction stage uses the data obtained from the first stage to reconstruct a real-world coordinate system based on a camera model and object transformations [39]. Figure 22.2 shows a diagram that illustrates a simple AR system and its components.

Fig. 22.2

Diagram of a simple augmented reality system
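As an illustration of the tracking and reconstruction stages, the sketch below detects a square fiducial marker and recovers the camera pose with a perspective-n-point solve. It uses OpenCV’s ArUco module as a stand-in for the marker system described in this chapter; the marker size, camera intrinsics and file name are assumptions, and on OpenCV 4.7+ the newer ArucoDetector API would replace cv2.aruco.detectMarkers:

```python
import cv2
import numpy as np

MARKER_SIZE = 0.10  # 10 cm marker edge, in metres (assumed)
# Camera intrinsics would normally come from calibration; these are placeholders
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
dist = np.zeros(5)

# 3D corner coordinates of the marker in its own coordinate frame
s = MARKER_SIZE / 2
obj_pts = np.array([[-s, s, 0], [s, s, 0], [s, -s, 0], [-s, -s, 0]],
                   dtype=np.float32)

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
frame = cv2.imread("frame.png")  # one captured video frame (assumed filename)

# Tracking stage: detect marker corners in the image
corners, ids, _ = cv2.aruco.detectMarkers(frame, dictionary)
if ids is not None:
    # Reconstruction stage: solve for the camera pose relative to the marker
    img_pts = corners[0].reshape(4, 2).astype(np.float32)
    ok, rvec, tvec = cv2.solvePnP(obj_pts, img_pts, K, dist)
    R, _ = cv2.Rodrigues(rvec)  # rotation matrix of the 6DOF pose
    print("camera-to-marker translation:", tvec.ravel())
```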

1.1.2 The Occlusion Problem and Depth Perception

One of the inherent drawbacks of overlaying virtual environments onto video is that objects of interest are frequently occluded by 3D augmented objects, creating an unrealistic effect in which foreground items that should appear in front of the augmented information are hidden behind it (see Fig. 22.3). Realistic image composition requires the correct combination of virtual and real objects, in which background/distant augmented objects must be correctly occluded by foreground real objects. Solving the occlusion problem in augmented reality is challenging when there is not enough information about the real world being augmented.

Fig. 22.3

Unrealistic effects are created in cases in which augmented objects occlude real objects. In this picture the drawer should be rendered behind the chair

If we do not take into consideration the real-world information covered by the overlaid virtual objects, the resulting visualization may cause problems in depth perception. The human visual system relies on a set of monocular and binocular cues to interpret the depth and spatial organization of 3D objects in the environment, so we must be careful to simulate these cues accordingly.

1.1.3 Skin Segmentation Using Color Pixel Classification

As mentioned in Sect. 22.1.1.2, in the classic augmented reality approach, what a user sees is a combination of two layers: video as background and 3D as foreground. One of the main challenges of augmented reality is the occlusion problem. In simple terms, resolving occlusion is the process of determining which objects should be visible in front of which others. Occlusion provides a very important visual cue to the human perceptual system when rendering data in three dimensions [36].

For example, when we interact with the real world, it is clear that if we place our hand in front of some other object, for example a table, part of that object will be hidden by our hand. In augmented reality systems, occlusion is not always resolved successfully, leading to an unnatural and confusing experience for the user. Skin detection can help tackle this problem by identifying the set of pixels that correspond to skin in an image, so that hands can be placed in a separate layer. Thus, instead of having two layers (video and 3D models), we propose the addition of a third layer that corresponds to hands and other objects of interest. With the third layer, the occlusion problem can be corrected by placing skin pixels in front of both the 3D and video layers (Fig. 22.4 shows a representation of the multilayer approach we propose to solve the occlusion problem). Machine learning algorithms can be of great aid for computer vision applications such as the implementation of a skin classifier. For this work, we implemented a two-class skin color classifier using an Artificial Neural Network.

Fig. 22.4

To avoid the occlusion problem (4), we overlay the 3D information (2) over the original image layer (1), and on top of the first two layers we add a third layer composed of the user’s hands and other objects of interest (3). The result is a properly composed image (5)
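A minimal sketch of this three-layer composition, assuming the video frame, a rendered 3D layer with an alpha channel, and a binary skin mask are already available as NumPy arrays (this is illustrative, not the system’s actual renderer):

```python
import numpy as np

def compose(video_bgr, render_bgra, skin_mask):
    """Layer 1: video. Layer 2: rendered 3D. Layer 3: skin pixels on top."""
    rgb = render_bgra[..., :3]
    alpha = render_bgra[..., 3:4] / 255.0
    out = (1 - alpha) * video_bgr + alpha * rgb   # 3D composited over video
    mask = skin_mask[..., None].astype(bool)
    out = np.where(mask, video_bgr, out)          # hands placed back on top
    return out.astype(np.uint8)
```

Because the skin pixels are copied from the original video in the final step, the user’s hands always occlude the virtual objects, which is exactly the ordering the multilayer approach requires.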

1.1.4 Presence

In virtual environments, presence can be defined as a state of consciousness, the psychological state of “being there” [13, 30]. Witmer and Singer [35] defined presence as the subjective experience of being in one place or environment, even when one is physically situated in another. Involvement and immersion are two concepts of interest related to presence [35].

One of the objectives of this work is to evaluate whether the proposed AR system provides the sense of presence required to virtually transport and immerse users inside the synthetic environment. Based on the work of Witmer and Singer [35], we asked patients with PD to answer a subjective presence questionnaire. The questionnaire was used to evaluate relationships between reported presence and other research variables. The results of these evaluations are described in Sect. 22.5.

1.2 Goals of this Work

The main goal of this work is to design, develop and evaluate a wearable augmented reality system to assess patients with PD in dual-tasking activities (performing simultaneous motor and cognitive tasks) and to assist in rehabilitation programs. This system allows patients with PD to interact with both augmented and real objects using nothing but their bare hands. This approach is novel because the system provides mechanisms for free and natural navigation inside a virtual environment. Through a head mounted display, patients with PD are immersed in 3D virtual environments, so multiple context- and task-specific scenarios can be represented. For instance, patients could be immersed in a virtual environment representing a grocery store, in which they can perform tasks that are commonly difficult for patients with PD, such as bending over to pick something from a bottom shelf or walking through narrow aisles while avoiding virtual obstacles. This would allow physicians to observe their patients as if they were present while their patients did their grocery shopping.

The AR system has been used in a series of trials and its performance was evaluated. Those trials followed a strict protocol approved by the University of Western Ontario Human Subjects Research and Ethics Board. Following the protocol, every patient was asked to perform several tasks, both in the real world and using the AR system. The patients repeated the same set of activities in three appointments over three weeks. At the end of each appointment, they were asked to answer a questionnaire based on Witmer and Singer’s work [35]. The results of those questionnaires were used to evaluate whether the system provides the sense of presence required for an intuitive and immersive experience.

1.3 Chapter Outline

In Sect. 22.2 of this chapter, we give an overview of related work. In Sect. 22.3, we describe the design and architecture of the proposed augmented reality system. The protocol and method of the study are defined in Sect. 22.4. In Sect. 22.5, we outline the results of trials we conducted to evaluate the performance of the system, as well as the results we obtained from the presence questionnaire. In Sect. 22.6, we summarize this chapter, presenting future work and concluding remarks.

2 Related Work

As the main objective of our augmented reality system is to enable novel methods of assessment and rehabilitation for Parkinson Disease, we believe that there is an inherent need to provide intuitive and natural forms of interaction and navigation. In this section, we explore the literature in this area, and examine other applications of virtual environments to the study and treatment of Parkinson Disease.

2.1 Registration and Tracking

This section summarizes different tracking strategies used in augmented reality. Non-visual tracking technologies have long been used in virtual environments, including active technologies based on magnetic fields or ultrasound. Popular examples of magnetic trackers are the products produced by companies like Polhemus, while InterSense produces inertial-ultrasonic hybrid tracking systems such as the IS-900. Even though these commercial products are robust and provide low latency, they are not widely used in augmented reality due to their high cost. Moreover, they remain prone to errors caused by external factors such as interference. The low cost of video cameras and the increasing processing capacity of computers and handheld devices have inspired a significant increase in research into the use of video cameras as visual tracking sensors. The literature review in this section therefore focuses on vision-based tracking methods that have been used in augmented reality applications.

In augmented reality, image registration uses different computer vision methods. Fiducial markers or interest points are detected in camera images, and tracking uses feature detection, edge detection, or other image processing algorithms to analyze live video. Tracking techniques can be divided into two classes: feature-based and model-based [39]. Feature-based algorithms find the relationship between 2D image features and their 3D world coordinates [24]. Model-based methods use real-world object heuristics; for example, a virtual model of a tracked object’s features, or 2D templates based on distinguishable features of an object, can be used. Once the relationship between the 2D image and 3D world frame coordinates is found, the camera pose can be obtained by projecting the 3D coordinates of features into the observed 2D image. The reconstruction stage uses the data obtained from the first stage to reconstruct a real-world coordinate system.

Some methods assume the existence of fiducial markers in the surroundings. Other methods, like the one proposed by Huang et al. [15], use pre-calculated 3D structures for what they call the AR-View. There are two important characteristics of the AR-View approach: the camera has to remain stationary, and its position must be known beforehand. In their approach, when the scene is not known, they first use fiducial markers and Simultaneous Localization And Mapping (SLAM) to compute the relative position of the device with respect to the scene. In cases where the AR device is static, an approach like the one adopted by Huang et al. can be used. On the other hand, if the AR device is mobile, tracking becomes much more difficult: movable systems have to be able to model and deduce both the camera motion and the structure of the scene.

There are several open-source AR libraries available, the most popular of which is ARToolKit and its many derivatives. ARToolKit was developed based on the research of Hirokazu Kato from the Nara Institute of Science and Technology [22]. It is a vision-based tracking library that uses real-time video to calculate the camera position and orientation relative to fiducial markers. Once the real camera position is known, this information can be used to correctly overlay 3D computer graphics on the markers.

2.2 Natural Selection and Manipulation

AR prototypes that support interaction are often based on classic desktop metaphors (for example, some require a mouse to operate on-screen menus, while others require users to type on keyboards). Others make use of video game devices and controls such as joysticks, the Wii Remote, the PlayStation Move, etc. Techniques popularized by handheld devices, such as gesture recognition, are also common in AR. The two main trends in AR interaction research are (a) using heterogeneous devices to exploit the characteristics of multi-touch displays and (b) integrating the physical world through tangible interfaces [5]. Different devices suit different interaction techniques; for example, a handheld tablet is very useful for playing games, surfing the web or reading eBooks. In augmented reality, users usually manipulate data through a variety of real and virtual mechanisms and can interact with data through projective and handheld displays. Tangible interfaces allow direct interaction with the physical and virtual worlds using real, physical objects and tools. Tangible Augmented Reality (TAR) [8] combines the intuitiveness of Tangible User Interfaces (TUI) [17] with the abstractness of virtual objects. In a TAR environment, the user normally has an egocentric view and is able to interact with virtual objects by using a TUI-based direct-manipulation artifact. 3D object selection and manipulation is achieved by collision or proximity between the prop and a marker representing the 3D object, and a 3D object’s position and orientation are modified using tilting, dropping, or hiding gestures with the prop.

Natural interaction in virtual environments is a key requirement for the virtual validation of functional aspects in the design of PD rehabilitation programs. For example, in rehabilitation programs, patients are often asked to pick up objects and perform tasks with them. Natural interaction is the metaphor people encounter in reality: the direct manipulation of objects with their hands. As mentioned earlier, our system uses color-based skin classification to segment users’ hands from the video signal to allow natural interaction. In our approach, the segmented images are rendered directly on top of the virtual information.

2.3 Navigation

The most intuitive way of navigating is natural walking. However, virtual environments still impose various restrictions on walking. One of the big unfulfilled goals of VEs has been enabling a person to move freely through cyberspace without using metaphors that translate gestures to motion [34]. Most current setups do not offer the possibility of walking through VEs, or if they do, it is only in a very restrictive manner. In desktop-based metaphors, users simply navigate through the VE using a keyboard, mouse, joystick, or similar input device. This creates a sensory conflict: the user is physically not moving, but receives visual input congruous with self-motion [31].

Innovative approaches to solving the navigation issue have emerged. Such approaches allow unencumbered movement within the virtual space through user self-motion. One example is the so-called Gaiter system [32], which evaluates the movements of users to simulate motion without using a special floor or treadmill; however, the real movement is limited by the room dimensions. The omnidirectional treadmill (ODT) [31] uses orthogonal belts made up of rollers. This machine facilitates omnidirectional, unrestricted walking in an infinite virtual environment within a finite real-world footprint. A different approach, the Torus Treadmill [6], uses several belts which form a complete torus [18]. These advanced walking devices have usually been combined with Cave Automatic Virtual Environments (CAVE) [11] to maximize the immersive experience.

2.4 Virtual Environments in Parkinson Disease Research

Navigation can be seen as an interaction between mobility and an environment that requires the rapid integration of information from visuospatial input, kinematic input and memory. Navigation deficits involving visual processing have been reported in PD [12, 37], and may contribute to gait impairment, increased risk of falls and inefficiency in completing tasks. Virtual Reality is a technology that has been used for assessing and rehabilitating such complex deficits. VR uses computer graphics software to create virtual environments that visually immerse users, resulting in the perception that those environments are “real”. Virtual reality has been used in rehabilitation of gait and cognition in a variety of neurological conditions [9, 25]. This technology has demonstrated efficacy for both assessment and treatment [38].

The field of virtual reality research in PD has grown rapidly in recent years. Many studies have utilized non-immersive systems that do not allow ambulation, focusing on aspects of reaching, problem-solving and navigation using non-ambulatory, desktop-based systems [2, 26, 27]. Kaminsky et al. [20] evaluated the effect of visual and auditory cues along with VR to simulate the real-world experience during ambulation. Mirelman et al. [28] used immersive virtual environments to provide visual context and cognitive/motor challenges in a VR gait-training program; however, the trajectory of ambulation was restricted to treadmill walking. Hollman et al. [14] used a curved display and a treadmill to study whether gait instability is prevalent when people walk in immersive virtual environments. Their results suggest that the use of treadmills combined with VEs can cause instability in stride length and step width, as well as variability in stride velocity.

In a previous in-home VR-based project [21, 33], we developed a fully simulated house that delivered visual information in the form of static contextual cues typical of a home environment, such as furniture, doorways and walls. In that study, the goal was to observe patients with PD ambulating freely, without the inherent veering restrictions of a treadmill, in a more “familiar” virtual environment. A head mounted display was worn by patients with PD to visually immerse them inside the virtual environment. Based on patients’ orientation and ambulation in the real world, a third person handled navigation inside the virtual home using an experimenter-driven “Wizard of Oz” control scheme. A study was conducted with patients with PD and controls in a variety of navigation tasks, such as line-following tasks and free-form room-to-room navigation tasks. Results from that study were both interesting and valuable, indicating potential for the use of virtual worlds in creating ecologically valid research and rehabilitation environments for PD [21, 33].

Six Degrees of Freedom (6DOF) tracking devices have been used together with VR systems to monitor the position and orientation of selected body parts. When mounted on a head mounted display, the position and orientation of the head can be measured. This information defines the user’s viewpoint in the virtual world and determines which part of the VE should be rendered to the visual display. The information delivered by tracking devices can be used to simulate navigation [6]. However, despite their high cost, tracking devices are still prone to failure due to interference, out-of-range distances, or sensitivity to environmental factors. Depending on the technology, 6DOF position trackers can be sensitive to large metal objects, various sounds, and objects coming between the source and the sensor.

2.5 Discussion

Previously published Parkinson Disease studies have focused on aspects of reaching, problem-solving and navigation. However, patients had to use joysticks or the keyboard to both interact with the environment and navigate through it. In this work, we instead segment the user’s hands and make them visible inside the VE to allow natural interaction both with real and augmented objects.

Regarding navigation, researchers have evaluated the effect of visual and auditory cues delivered via virtual environments to enhance the real-world experience and cognitively challenge patients during ambulation on treadmills [28]. Many other non-Parkinson Disease studies have utilized omnidirectional treadmills combined with CAVE systems to immerse people inside virtual environments. Such advanced configurations allow people to navigate freely in any direction inside the VE. Unfortunately, the size, complexity and, above all, the price of such systems make them infeasible for rehabilitation use. The AR system we developed as part of this work takes advantage of the vision-based tracking characteristics of augmented reality to obtain the 6DOF transformation of the camera. We use this transformation to emulate a head motion tracking system, which allows patients with PD to navigate freely inside virtual environments without using any kind of treadmill or inertial/hybrid tracking device.

3 System Design and Development

As discussed earlier, the main objective of this work is to develop an augmented reality system for Parkinson Disease assessment and rehabilitation that provides the user with a sense of presence and immersion. We sought to do so without requiring the expensive equipment discussed in the previous section. In addition, some technologies are impractical for this particular application: they might not be suitable for in-clinic use, and the equipment might be bulky, heavy, or awkward for patients to use (especially seniors or people with mobility issues, for whom treadmill-based systems are unsuitable). This is important because our system is intended to be portable and transferable, so that it can be used in any clinic without requiring a huge investment. Our system allows physicians to observe patients with PD as they perform daily life activities in the virtual environment.

Our wearable augmented reality system is innovative because it provides natural interaction and free navigation. In terms of navigation, the only limitation of our approach is the physical space available. This avoids issues found in other work in this area that uses desktop-based metaphors, employing simple devices such as off-the-shelf game controllers for interaction and implementing navigation through common treadmills. That approach has not given satisfactory results and has been unable to reproduce real-life activities in context, which is very important for successful rehabilitation. Our system does not suffer these deficiencies because of its design and implementation. In this section, we describe the three main components of our wearable AR framework: the hardware, the physical space, and the software system.

3.1 Hardware

Our approach employs three devices: a camera to sense the environment, providing a source video stream for augmentation and for positioning and orienting the user; a computer to run our software and do all the processing involved in constructing, composing, and rendering the virtual environment as the user should see it; and a head mounted display for presenting the environments to the user. The main aspects we considered for the hardware in our framework were weight, computing power and connectivity.

The laptop computer. We chose a laptop light enough to fit into a small backpack, as this is what makes our system wearable. Our objective was to minimize the patients’ awareness of the fact that they are carrying or “wearing” a laptop. We consider this crucial to providing a better sense of presence, since the patient can concentrate on the task at hand without worrying about the laptop. Another important factor in our decision was computing power, since our system renders 3D graphics and processes video at the same time. In addition, we needed a laptop with wireless Internet connectivity, video output and USB ports. We chose the ASUS UX31 because it was the lightest Windows-based computer that met our requirements. See Fig. 22.5a.

Fig. 22.5

The three devices used in our framework: (a) ASUS UX31, (b) VUZIX iWear 920VR, (c) VUZIX iWear CamAR

The head mounted display. This device is vital in our system because the user sees the virtual environment through it in first person (i.e. as if through their own eyes). Figure 22.5b shows the VUZIX iWear 920VR HMD we are using. This model is light and supports a resolution of up to 1024 × 768 pixels.

The camera. This device captures video at 30 Hz. The video is processed by computer vision algorithms in order to compute the position of the camera relative to the real world. We decided to use the VUZIX iWear CamAR webcam, shown in Fig. 22.5c. This model is designed so that it can be easily attached to the VUZIX iWear 920VR HMD.

3.2 The Physical Space

To use our system, a physical space is required in which to install the fiducial markers needed to represent the virtual world. In this section, we describe the physical setup of the space we used for our experiments.

To set up our system, the London Health Sciences Centre provided us with a room that measures 6.68 m × 4.92 m. The room is enclosed by four vinyl walls on which we mounted fiducial markers; we also installed fiducial markers on the floor. Fiducial markers are points of reference that a computer vision system uses to measure the position of the camera with respect to each marker. The fiducial markers in our system are unique black-and-white patterns printed on a material known as coroplast; black-and-white fiducial markers are easier to detect because they provide high contrast. Figure 22.6 shows a photograph of the physical space with the fiducial markers. We installed fiducial markers of different sizes: bigger markers are used to track the position of the camera from long distances, while smaller markers allow the user to interact with virtual objects from shorter distances. We used five different marker sizes: 45 × 45 cm, 30 × 30 cm, 20 × 20 cm, 15 × 15 cm and 10 × 10 cm. In total, we installed and configured 110 fiducial markers.

Fig. 22.6

Picture showing the physical space with fiducial markers on the walls and on the floor

As can be observed in Fig. 22.6, the biggest markers, which measure 45 × 45 cm, were installed at the bottom and top of the walls, since those regions are farthest from the user’s point of view. The markers become smaller as they approach the level of a person’s eyes when looking straight ahead (approximately 1.70 m). In order to compute the position of the camera with respect to the markers in the room, the system must know the 3D position of each marker with respect to a specific point of reference in the real world. Therefore, we measured the 3D position of each marker with respect to the point of reference, one of the corners of the room.
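To illustrate why each marker’s measured room position matters, the following sketch chains homogeneous transforms to express a camera pose, recovered relative to a single marker, in the room’s reference frame; the numeric poses are invented for illustration:

```python
import numpy as np

def to_homogeneous(R, t):
    """Pack a rotation matrix and translation into a 4x4 transform."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

# Pose of one marker in the room frame, measured from the reference corner
# (illustrative values, not actual measurements)
T_room_marker = to_homogeneous(np.eye(3), np.array([2.40, 1.35, 0.0]))

# Pose of the camera relative to that marker, as recovered by the tracker
T_marker_camera = to_homogeneous(np.eye(3), np.array([0.0, 0.0, 1.8]))

# Chaining the transforms yields the user's viewpoint in room coordinates
T_room_camera = T_room_marker @ T_marker_camera
print("camera position in room frame:", T_room_camera[:3, 3])
```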

3.3 Software

In this section we describe the software component of our AR framework. This software is novel because no other Parkinson Disease research has used augmented reality to create immersive virtual environments. Even though our approach to allow natural interaction is simple, users can select and manipulate real and augmented objects without the need for external devices. The system was developed to be easy-to-use and intuitive as it is intended to be used by patients with Parkinson Disease.

Our system is composed of four main modules: CoreSystem, VideoSource, ARDriver and ScenarioManager. Figure 22.7 shows the architecture of our system and illustrates how these four modules interact. As shown in this figure, the VideoSource module captures and processes the video signal. The ARDriver module computes the transformation matrices of the 3D objects. These matrices are fed to the ScenarioManager, which renders the final scene. Each of these four modules will be discussed in the following sections.

Fig. 22.7

System architecture

3.3.1 Core System

The CoreSystem module manages the data structures that are used by the other modules. In addition, the VideoSource, ARDriver and ScenarioManager modules are instantiated from CoreSystem. The CoreSystem module also manages the GUI and user actions in general. One of the main advantages of our system is that it allows creating and administering multiple scenarios without modifying or recompiling the source code: CoreSystem uses XML configuration files to manage the structure and behavior of the GUI and the 3D scenarios. Therefore, to create a new scenario, the operator of the system only needs to edit or add to these files, with no programming required.
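As an illustration of this data-driven design, the sketch below parses a hypothetical scenario file; the element and attribute names are our own invention for illustration, not the system’s actual schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical scenario file: all names and attributes are assumptions
SCENARIO_XML = """
<scenario name="Supermarket">
  <model file="shelf.obj" multimarker="shelf_markers.xml" scale="1.0"/>
  <model file="cereal_box.obj" multimarker="box_markers.xml" scale="0.5"/>
  <gui show_timer="false"/>
</scenario>
"""

root = ET.fromstring(SCENARIO_XML)
print("Loading scenario:", root.get("name"))
for model in root.findall("model"):
    # Each model entry names its mesh and the multi-marker set that tracks it
    print("  model:", model.get("file"), "tracked via", model.get("multimarker"))
```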

3.3.2 Video Source

This module is one of the most important components of our system because it captures the video signal and detects/segments objects of interest, such as the hands of the patient. It provides three main functionalities: video capture, color thresholding, and color-based skin classification, as described below.

Video capture. The captured video signal is sent to the CoreSystem module so that the ScenarioManager module can incorporate the original video signal as background over which the 3D objects are rendered.

Color thresholding. The VideoSource module segments objects of interest using a simple thresholding technique that classifies green objects. This classifier generates a monochrome image that is combined with the results of the skin classifier to generate a mask used by the ScenarioManager module.
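A sketch of such a thresholding step, assuming OpenCV; the HSV bounds for “green” are placeholders that would have to be tuned to the actual props and lighting:

```python
import cv2
import numpy as np

def green_mask(frame_bgr):
    """Return a monochrome mask of green pixels in a BGR frame."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    lower = np.array([40, 60, 60])     # assumed lower HSV bound for green
    upper = np.array([85, 255, 255])   # assumed upper HSV bound for green
    mask = cv2.inRange(hsv, lower, upper)
    return cv2.medianBlur(mask, 5)     # suppress speckle noise in the mask
```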

Color-based skin classification. This is the feature that enables users to interact naturally with objects in the environment using their hands. We used an Artificial Neural Network (ANN) classifier because, among the options we tested (Support Vector Machines and simple thresholding), the ANN gave the best experimental results, with an accuracy of 85 % over training and testing data.
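The chapter does not specify the network architecture, so the following is only a plausible sketch of a two-class pixel classifier built with scikit-learn’s MLPClassifier; the training data here is synthetic, whereas the real classifier would be trained on hand-labelled skin and non-skin pixels:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# X: N x 3 array of normalized pixel colours, y: 1 = skin, 0 = not skin.
# Synthetic stand-in for hand-labelled training frames.
rng = np.random.default_rng(0)
skin = rng.normal([0.75, 0.55, 0.45], 0.08, size=(500, 3))
other = rng.uniform(0, 1, size=(500, 3))
X = np.vstack([skin, other]).clip(0, 1)
y = np.array([1] * 500 + [0] * 500)

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500)
clf.fit(X, y)

# Classify every pixel of a frame at once by flattening it to an N x 3 matrix
frame = rng.uniform(0, 1, size=(120, 160, 3))
skin_mask = clf.predict(frame.reshape(-1, 3)).reshape(120, 160)
```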

3.3.3 ARDriver Module

This module detects the fiducial markers and extracts their positions with respect to the camera. ARDriver receives an instance of the video signal from the CoreSystem module and detects all of the fiducial markers in the current frame. Each 3D object is associated with a series of different markers, a setup known as a multi-marker configuration. Multi-markers are detected according to the hierarchy defined by the CoreSystem module, which groups fiducial markers according to size. For example, a given multi-marker might be formed exclusively by four 10 × 10 cm fiducial markers.

In our system, the multi-marker configurations are defined in an XML file, which follows the format defined by the augmented reality library ALVAR. ALVAR uses this configuration to compute the 3D transformation of each 3D object and the result is translated into the format required by the ScenarioManager module.

3.3.4 ScenarioManager Module

This module integrates information from VideoSource and ARDriver in order to render the final scenario. Essentially, it combines the video signal, the 3D models and the segmented objects of interest to create the augmented reality environment that the user perceives. The ScenarioManager box in Fig. 22.7 illustrates how these elements are merged. Additionally, ScenarioManager renders the GUI when necessary. The ScenarioManager renders the scene by performing the following actions (a condensed code sketch follows the list):

  1. ScenarioManager receives the original video feed from VideoSource and composes an initial layer over which the 3D models will be rendered.

  2. ScenarioManager receives from CoreSystem the list of all the 3D models that need to be rendered.

  3. ScenarioManager transforms the 3D models with matrices received from the ARDriver module, to correctly project the models into a second layer.

  4. ScenarioManager receives an image that contains the segmented user hands (through skin classification) and objects of interest (through color thresholding) from the VideoSource module. With this information, it creates a third layer.

  5. Finally, ScenarioManager merges the three layers mentioned above to generate the final scene for display to the user.
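Pulling these steps together, here is a condensed sketch of the render pass; the module and method names mirror the description above but are hypothetical, not taken from the actual source:

```python
def render_frame(core, video_source, ar_driver, scenario_manager):
    """One render pass through the five steps listed above (hypothetical API)."""
    frame = video_source.capture()                       # step 1: video layer
    models = core.models_to_render()                     # step 2: model list
    matrices = ar_driver.transforms(frame)               # step 3: 3D transforms
    layer3d = scenario_manager.project(models, matrices)
    mask = video_source.skin_and_threshold_mask(frame)   # step 4: hands layer
    return scenario_manager.merge(frame, layer3d, mask)  # step 5: final scene
```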

4 Experiment Protocol

One of the objectives of our work was to evaluate whether augmented reality can be used as a support tool in the development of rehabilitation programs for patients with Parkinson Disease. To that end, we performed a series of experiments designed to challenge patients in a way similar to regular rehabilitation programs. We used our system to observe how patients respond to cognitive, motor and executive-function challenges.

For our initial experiments, eleven participants between the ages of 50 and 80 were recruited, using a convenience sampling technique, from the Movement Disorders Centre at London Health Sciences Centre. Nine of these individuals had Parkinson Disease (patients with PD), while two did not (controls). The criteria for inclusion and exclusion of participants in the trials were determined by the Movement Disorders Program. For example, patients with high-level dementia were excluded from the study because they are unable to follow instructions, and patients presenting any type of freezing of gait were excluded because they are prone to falling. The experiments were performed at the London Health Sciences Centre South Street Hospital. The procedures described below were completed by both patients with PD and controls. The experiments were conducted in 3 sessions over 3 weeks.

In this section, we describe how our experiments were conducted. We developed three different virtual environments. We refer to them as scenarios. The first scenario, called “Watering the Plants”, represents a living room and a kitchen. “Supermarket”, the second scenario, represents an aisle in a supermarket. The third scenario, “Street Walk”, represents a pedestrian crossing in a street. We explain these scenarios below.

4.1 Watering the Plants Scenario

This scenario represents a room filled with various combinations of flower pots. The flower pots were given different colours and placed throughout the room. In this environment, subjects were asked to move toward a table holding two rows of flower pots, one on the left and one on the right. They were then given a real watering can. The watering can, as well as the participants’ hands, were segmented out so that they appear in the virtual world; the segmentation and overlaying give the illusion of immersion and allow natural interaction. The patients were asked, while standing in one spot, to reach and water the furthest plant on the table in front of them, with both their right and left hands, 3 times for each hand. This procedure was performed in the virtual environment first and then in the real world, where participants performed the same activity without the visual cues that the virtual environment provides. Figure 22.8 shows a participant performing this task in the virtual environment.

Fig. 22.8

Watering the plants scenario: a participant performing this task in the virtual environment

4.2 Supermarket Scenario

In this scenario, the participants were immersed in a grocery store in which they could interact with augmented cereal boxes. One at a time, participants would remove a box of cereal from a shelf in the virtual store and place it in a numbered augmented basket. Participants were given a series of numbers to remember, representing the order in which baskets were to receive cereal boxes, challenging both the cognitive and motor skills of participants: the cognitive challenge is memorization, while the motor challenge is the bending over required, which is particularly difficult for patients with Parkinson Disease. Additionally, this task helped us observe how naturally the participants selected and manipulated the augmented objects. The procedure was repeated 3 times in the virtual environment first and then again in the real world, where the participants interacted with real cereal boxes and the same baskets that were visible in the virtual counterpart, labeled in the same way as in the virtual world. Figure 22.9 shows a participant interacting with the augmented environment.

Fig. 22.9

Supermarket Scenario: The motor and cognitive skills of participants are challenged within this task to observe gait impairment issues

4.3 Street Walk Scenario

This scenario represents an outdoor scene in which participants must cross the street at a crosswalk to reach a mailbox on the opposite side. Participants must adjust their walking speed based on instructions; they are asked to walk at 3 different speeds: their normal speed, twice as fast as their normal speed, and half as fast as their normal speed. Participants have to adapt their walking speed based on internal or external cues. In this context, an internal cue is a spoken instruction, while an external cue is a visual element that indicates how fast the participant must walk. In our case, the external cue is a timer which displays a countdown: the participant must reach the other side of the street before the timer expires.

We measured the time participants took to cross the street using a calibrated stopwatch. Timing began with the first step taken by participants and ended when participants reached the other side of the street. We averaged the results from the normal walking speed trials to obtain what we refer to as the baseline measurement. This baseline is our point of reference for defining the external cues (i.e. the duration of the countdown). The countdown was defined as half the average baseline (0.5 × baseline) for the “twice as fast” trial, and as double the baseline (2 × baseline) for the “half as fast” trial. Figure 22.10 shows the outdoor scene with the external visual cue presented to participants. These same procedures were repeated in the real world; this time, a mat represented the crosswalk, and instead of asking participants to walk towards the mailbox, we asked them to walk towards a red cross marked on the floor. As mentioned before, our main variable of interest is the time to complete the different tasks. In Sect. 22.5, we present the results of our experiments and analyze whether the participants’ performance changed over the 3 weeks of trials.
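The cue timing reduces to simple arithmetic over the baseline; a worked sketch with invented crossing times:

```python
normal_trials = [9.8, 10.4, 10.1]         # seconds to cross at normal speed (invented)
baseline = sum(normal_trials) / len(normal_trials)

countdown_twice_as_fast = 0.5 * baseline  # participant must finish in half the time
countdown_half_as_fast = 2.0 * baseline   # participant is given double the time
print(f"baseline={baseline:.1f}s, "
      f"fast={countdown_twice_as_fast:.1f}s, slow={countdown_half_as_fast:.1f}s")
```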

Fig. 22.10

Street walk scenario: participants have to adjust their walking speed to cross the street in the amount of time that appears in the pedestrian light

5 Experiment Results

In this section, we present the results of the experiments described in Sect. 22.4. Our objective is to measure and compare the time it takes a participant to perform a series of tasks in virtual environments and in the corresponding real-life scenarios. If patients take a similar amount of time to perform tasks in a virtual environment as in a real environment, augmented reality is not significantly interfering with the patients’ perception, and the patients’ experience in the augmented world can be deemed similar to the real world. This is an indication that skills learned in an augmented reality environment can be transferred to the real world, and that augmented reality is adequate for the development of tools that doctors can use to assess or even rehabilitate patients.

Another way of evaluating our system is to determine how participants feel about using our system. For that reason, we asked them to complete a presence questionnaire. The objective of this questionnaire is to determine if our participants perceived our system as realistic. Therefore, this questionnaire is valuable in assessing the suitability of our system as perceived by participants.

To evaluate participants’ performance, we focus on the two timed scenarios: Supermarket (see Sect. 22.4.2) and Street Walk (see Sect. 22.4.3). We do not include results from the Watering the Plants scenario (see Sect. 22.4.1) because it was not used to measure time, which is our metric of interest here; it was, however, included in the presence questionnaire. (Further discussion of the experimental results of this scenario can be found in [7].)

5.1 Results of the Supermarket Scenario Experiments

As mentioned in Sect. 22.4.2, we asked the participants to take cereal boxes and place them into baskets in an arbitrary sequence. This experiment involved patients visiting the hospital 3 times (once a week over 3 weeks). In each visit, participants repeated the task 3 times in order to minimize measurement errors.

As trials continued, participant performance improved steadily in the augmented reality environment as the participants adjusted to it. During the first visit, the times to complete the task ranged between 25 and 140 s; by the end of the third visit, however, they had narrowed to between 22 and 60 s. In real-world testing, on the other hand, performance was much more consistent, at between 15 and 50 s throughout all visits, as no period of adjustment was necessary. While it took on average 10 s longer to complete the task in the AR environment, accuracy was comparable between AR and real-world testing, with an 81.1 % success rate in the AR environment and an 83 % success rate in the real world, which indicates that the AR environment did not introduce significant interference.

5.2 Results of the Street Walk Scenario Experiments

As we described in Sect. 22.4, the task in the Street Walk scenario consisted of asking participants to walk and adapt their walking speed according to internal and external cues. Using a baseline measurement, we asked participants to walk at two different paces: twice as fast and half as fast (see Sect. 22.4.3).

We observed that participants encountered moderate difficulty in adapting their walking speed using internal cues, but were able to do so more accurately with external cues. While some improvement was seen with repetition, it was not significant. Results were reasonably consistent between the augmented reality environment and real-world testing, with a maximum difference in performance of 5 %. This suggests that participants’ difficulty in adapting their walking speed was unrelated to whether they were in the virtual environment or the real world; rather, it stemmed from the complexity of the task itself. Consequently, we found that for this specific activity, augmented reality does not substantially interfere with the task, and the participants’ experience was similar in both cases.

5.3 Presence Questionnaire Evaluation

The effectiveness of a virtual environment has been linked to the sense of presence reported by the user. Presence can be defined as a normal awareness phenomenon that requires attention and is based on the interaction between sensory stimulation, environmental factors and internal tendencies to become involved [35]. To evaluate whether our augmented reality system provided a level of presence adequate to immerse patients in the different scenarios, we asked our participants to complete a presence questionnaire after they finished the tasks in every visit. The presence questionnaire we employed is based on the work of Witmer and Singer [35].

The presence questionnaire consists of 34 questions on a 7-point Likert scale, which evaluate different factors that affect the involvement of participants in a virtual environment and thus the level of immersion. These factors can be classified into four categories: control factors, distraction factors, sensory factors and realism factors. Control factors refer to the degree of control a user has when interacting with a virtual environment, including whether the user gets satisfactory feedback to an action. Sensory factors refer to how many senses are involved when using a system. Distraction factors evaluate whether the hardware interferes with the degree of focus that users achieve. Realism factors measure how well the virtual environment simulates real-world places.

Overall, we found that participants rated their experience between moderately real and excellent, with an average rating of 5.25 on a scale where 1 indicates the worst possible experience and 7 a perfect one. From this, we conclude that participants had an overall favorable perception of the system.

5.4 Discussion

Our experiments helped us understand the benefits and limitations of immersive augmented reality. Overall, participants had a positive opinion of our system, as reflected by the presence questionnaire. Although participants took longer to complete the Supermarket tasks in the augmented reality environment than in the real world, they were able to complete the tasks successfully in both cases. Regarding the Street Walk scenario, our results show that the performance of participants was very similar in the augmented reality environment and the real world. Thus, we can conclude that augmented reality was not a factor in the performance of our participants.

We were able to successfully develop an augmented reality system that allows people to freely navigate virtual environments. Moreover, it allows natural interaction with both real and augmented objects. Therefore, this system can be used not only as a support tool in rehabilitation programs, but in other areas as well.

Our system is not without limitations, however. The main limitation is the head mounted display, which restricted the participants’ field of view and affected their perception of the virtual environment; in essence, the HMD eliminates peripheral vision. The HMD we used for our experiments provides only 32° of vertical field of view, compared to the roughly 120° vertical field of view of the normal human eye [4]. Because of this, the current HMD is not suitable for people with slouched posture, as they cannot see important aspects of the virtual world; in some cases they are limited to seeing only the floor of the virtual environment. Another limitation of the system is the lack of physics and collision detection, which allows participants to walk through objects or move physical objects through virtual objects, for example. This lets participants employ movement strategies that would not be effective in the real world, which could limit transferability. Both of these issues, however, are being addressed in a new version of the system that is under development [3].

6 Conclusions

The main objective of this work was to develop and evaluate an augmented reality system for Parkinson Disease rehabilitation. We successfully developed a flexible augmented reality system that can be used by doctors as a tool to assess their patients. We consider one of the most important contributions of our system to be that it provides users with the ability to interact naturally with objects without the need for external devices. For example, instead of interacting with the system through a mouse or glove, users are able to grab objects with their own hands. This was made possible by our implementation of a skin classification algorithm that allows our system to display the users’ hands on top of the virtual environment.

Another key feature of our system is free navigation: users can walk and move freely within the virtual environment, as they would in real life. Regarding navigation, the only limitation of our system is the physical space where it is deployed. Free navigation was implemented using vision-based tracking. This feature allows users to feel as if they were actually present in the virtual environment by providing a first-person view.

Another objective of this work was to perform experiments to determine whether augmented reality can be used in future rehabilitation applications. Our experiments compared user performance in a series of tasks in a virtual environment to the same tasks in the real world, measured by the time it took users to complete a set of predefined tasks. Our results show that the time it took participants to complete tasks in the augmented world is similar to the time it took to complete the same tasks in the real world. From this, we conclude that augmented reality provides a realistic environment in which users can perform tasks much as they would in real life. We also found a relationship between the sense of presence that participants experienced and how well they performed tasks in the augmented environment. This suggests that when people perceive the virtual environment as “natural”, attention and learning are more likely to transfer to real-world activities.

Overall, our research and development experiences lead us to believe that augmented reality can, in fact, be successfully applied in healthcare applications. There is still much work to be done, however. Setting up the physical space currently requires manually configuring all of the fiducial markers. It would be desirable to use fewer markers and complement tracking with natural features, reducing the time needed to set up the system; such work is already in progress [3]. We also propose to combine our segmentation algorithms with depth perception and object segmentation to allow multiple levels of occlusion between the real world and the virtual environment. Finally, to confirm that augmented reality can be used in rehabilitation programs, further experimentation is needed to gather feedback and to improve and refine the system. Such experiments are currently under way.