
1 Introduction

A virtual reality (VR) environment is a computer-generated environment that aims to give the user a feeling of immersion, in other words, to make users feel that they are actually there. To achieve this, a three-dimensional space is usually created digitally and projected into the user's field of vision. One common means of projection is a head-mounted display (HMD).

Recently, the commercialization of VR technology has achieved a valuable milestone: the successful development of the VR headset. A VR headset is a ski-mask-shaped goggle device that works together with a smartphone, computer, controllers or other processing devices. According to Desai et al. [1], the new technology used by the Oculus Rift has successfully overcome the comfort problems of older-generation VR HMDs, such as motion sickness and dizziness during or after use. This breakthrough is believed to be the catalyst for the commercialization and popularization of VR.

The race in the VR industry has begun in recent years. Several major technology companies are competing to enter the VR market with their own headsets. Oculus has released the Oculus Rift and the Samsung Gear VR, HTC has released the HTC Vive, and Sony has released the PlayStation VR. In addition, following the cheapest headset, Google Cardboard, Google has announced its more advanced successor, Google Daydream.

Looking at this trend, the VR industry has started to move towards a new milestone, and we can expect that, in the next few years, it might become increasingly popular and prevalent in our society, especially among the younger generation.

However, VR technology is actually not yet mature enough to provide a good sense of immersion. With a VR headset covering our eyes, we can deceive our vision; with a speaker or earphone, we can deceive our hearing; but what about senses other than sight and hearing, such as touch and body movement? How are we going to interact with the virtual world?

Most current VR games and applications require the user to hold a controller in each hand in order to control actions inside the game. Most hand actions, such as picking something up, choosing something or triggering something, can be mimicked with the controllers; because the user's hands are actually holding the controllers, they have the feeling of moving their hands and fingers to perform the action inside the virtual world. However, one of the main concerns in current VR games and applications is how to travel inside the virtual world.

The next section discusses the virtual travel methods currently used by commercial VR games and applications, and the inadequacy of these methods in terms of the sense of moving.

The rest of the paper reviews research on techniques and methods used to enable “walking” inside a VR environment. The “walking” mentioned here refers to body movement similar or identical to walking in the real world. This paper aims to provide a big picture of current virtual travel methods, what has been done to enable “walking” inside virtual environments and the challenges of integrating “walking” techniques with the current VR industry trend.

2 Overview of Current Virtual Travel Methods

Reviewing current VR games and applications, one main issue remains: the means of travelling inside the VR environment. Controllers handle hand movement; however, hand movement by itself does not give the user a good way to move freely inside the virtual world.

We can see that game companies try to work around this issue through game design. For example, they design games that do not require movement, such as spaceship games or drawing applications; games that move the user's view automatically, following the user's head position as time passes or after the user clears a stage, such as first-person shooting games; or games that move the user with a “teleport” method. The “teleport” method requires users to point at the next location in the virtual world where they want to go, and the game immediately moves the user's view to that point.

By constraining users through the game logic, these games are able to proceed smoothly. However, this greatly reduces the feeling of presence, even though virtual reality games claim to give an immersive feeling of being there. As the video game vlogger WCK [2] commented, “standing VR with motion controllers is a Wii game, it is the same thing” and “I can't really walk around this entire space”.

It is undeniable that we like to have control over ourselves, and this does not change when we enter a VR environment. Users prefer to move freely of their own volition rather than be controlled by the system. The “teleport” method used in some games might provide some level of control to the user, but the sense and feeling are different: teleporting does not feel like walking inside the world.

The best way to deliver an immersive feeling is to let the user perform the same motion in the real world as what they want to do inside the virtual world, just as hand controllers enable users to move their hands and fingers to control the player inside the game. However, the user cannot simply walk in the real world while playing a VR game, because the real-world space differs from the virtual-world space; the two worlds are not of equal size.

To cope with this space problem and allow the user to “walk”, researchers have studied different techniques for enabling walking inside VR environments, which are discussed in the following sections.

3 General Virtual Travel Framework

Generally, most VR games and applications use one of three common perspectives to let the user move or travel inside the virtual world: first-person perspective, third-person perspective and bird's-eye view perspective. The first-person perspective displays the same viewpoint as the character inside the game; the third-person perspective displays the view from behind the character, so the user can normally see the character; and the bird's-eye view perspective displays the scene from afar, with the user watching as an observer. All these perspectives are created by placing the camera at different positions inside the virtual world, so that different points of view are displayed to the user.

The perspective that provides the greatest immersive feeling in a VR environment is the first-person perspective, as shown in Fig. 1, because the user has the same view as the character inside the virtual world.

Fig. 1

Example of perspectives used in VR games: first-person perspective, third-person perspective, bird's-eye view perspective (from left to right)

The basic framework for virtual travel in a VR environment is shown in Fig. 2. Generally, virtual travel starts when an event occurs that signals the user's intention to move. This event can differ completely depending on the type of input method used. For example, in a system that uses a controller, the event can be as simple as pushing a forward button; in a system that allows the user to use gestures, the event can be a pre-defined gesture performed by the user, such as a walking gesture.

Fig. 2

Virtual travel framework

When an event fires, the system needs to know the direction in which the user intends to move. Commonly, two types of direction may be captured: the head direction and the torso direction. Depending on the system requirements, movement may follow the head direction or the torso direction. Some systems use both, in which case movement follows the torso direction while the viewing angle depends on the head direction. Systems that use both head and torso directions give a better sense of reality, because this better simulates the real world, where users may walk and look in different directions at the same time.
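The decoupling of movement and view directions can be sketched as a small position update; the function name, yaw convention (0° pointing along +y, clockwise) and all parameter values below are illustrative assumptions, not taken from any cited system:

```python
import math

def travel_update(position, torso_yaw_deg, head_yaw_deg, speed, dt):
    """Advance the user's virtual position along the torso direction
    while the view direction follows the head independently."""
    yaw = math.radians(torso_yaw_deg)
    dx = math.sin(yaw) * speed * dt   # movement follows the torso
    dy = math.cos(yaw) * speed * dt
    new_position = (position[0] + dx, position[1] + dy)
    view_yaw = head_yaw_deg           # view follows the head
    return new_position, view_yaw

# Walking "forward" relative to the torso while looking 45 deg aside:
pos, view = travel_update((0.0, 0.0), torso_yaw_deg=0.0,
                          head_yaw_deg=45.0, speed=1.2, dt=0.5)
```

Running both updates from the same pose makes clear why the combined scheme feels more natural: the avatar keeps walking straight even while the user glances sideways.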

Besides direction, the movement speed may or may not be captured, depending on the system requirements. Certainly, virtual travel that involves speed can provide a better feeling of immersion, but it requires more complicated data capture, processing and calculation. Different systems define and calculate speed differently. For example, Yan et al. [3] used the speed of leg lifting to calculate the virtual travel speed, while Wendt et al. [4] used step frequency as the criterion.
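A frequency-based speed estimate can be sketched very simply from step timestamps; this is only a crude illustration of the idea of mapping step frequency to speed, and the fixed `stride_length` constant is an assumption of ours, not a parameter from the cited work:

```python
def speed_from_step_frequency(step_times, stride_length=0.7):
    """Estimate locomotion speed (m/s) from timestamps (s) of
    detected steps, using mean step frequency times an assumed
    stride length."""
    if len(step_times) < 2:
        return 0.0
    intervals = [t2 - t1 for t1, t2 in zip(step_times, step_times[1:])]
    mean_interval = sum(intervals) / len(intervals)
    step_frequency = 1.0 / mean_interval      # steps per second
    return step_frequency * stride_length     # metres per second

# Four steps half a second apart -> 2 steps/s -> 1.4 m/s
print(speed_from_step_frequency([0.0, 0.5, 1.0, 1.5]))
```

Real systems replace the fixed stride length with a gait model, which is precisely the refinement pursued in the later WIP work discussed below.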

Once a walking event is detected, the collected inputs, such as direction and speed, are used to calculate a new position for the camera inside the virtual world; the camera is moved to the new position and the new frame to be displayed to the user is processed.

Finally, the frame showing the view after the move is rendered and displayed on the screen of the output device. These processes run continuously to create a dynamic visual effect. Normally, a frame rate of 17 frames per second (fps) provides a good experience for movement in a virtual environment [5].

4 Solution for “Walking” in Virtual Environment

There are various methods involving walking motion for immersive VR travel. In this paper, we categorize them into three groups: walking with a repositioning device, real walking and walking-in-place (WIP).

4.1 Repositioning Device

A repositioning device uses machinery to enable natural walking in a spatially constrained environment: it repositions the user so that their real-world position does not change.

In the early years, Iwata [6] proposed a torus treadmill that allows walking in the x and y directions using a combination of 12 treadmill belts connected together. Later, Medina et al. [7] proposed a device called Virtusphere, based on the concept of a “hamster ball”, that allows a person to walk inside a large sphere-shaped device. From this research, we can see that the mechanical repositioning devices developed at the early stage tended to be large, in order to provide enough space for a person to walk on them. This may have been caused by technical limitations, since the technology at that time was not yet mature.

Looking at research from recent years, the proposed devices have become more compact and lightweight. Walther-Franks et al. [8] devised a method called “suspended walking”, which slightly suspends a person in a harness to allow body motion without changing the actual position. Later, a project named Kat Walk with a similar concept was launched on Kickstarter [9].

Next, Cakmak and Hager [10] introduced a prototype device named Cyberith Virtualizer. The device fixes the user at its centre with a ring-shaped belt, so that the user is able to perform body motions such as walking, running, squatting and jumping without changing body position. The device works together with special shoe soles to enable effortless “walking”, and has sensors on the device itself to detect the user's movement. Virtuix Omni [11] is another repositioning device based on the same concept as the Cyberith Virtualizer.

Since the VR industry is growing fast, researchers are competing to introduce mechanical repositioning devices, and we believe this kind of device may appear on the market in the next few years. The concerns with mechanical repositioning for VR are likely to be comfort during use and cost. Since these devices are still at the prototyping stage, their actual cost has not yet been announced, but it can be estimated to be no less than the cost of a normal running treadmill.

4.2 Real Walking

Certainly, real walking synchronized with virtual travel provides the greatest sense of presence and naturalness to the user. However, the real walking technique has a critical weakness: the space in which the user is allowed to move. Real walking usually requires an area set up with trackers to track the user's movement. This means the user can only move within the working area of the trackers, so the movable space of the virtual environment is limited.

To enable real walking in a virtual space larger than the tracking area, Razzaque et al. [12] first introduced a method called redirected walking. This method allows the user to walk within the tracking area but uses tricks, such as manipulating the rotation or the view's movement speed, to reorient the user in a desired direction; briefly, the method tries to fool the user's senses without the user noticing. For example, users may feel that they are walking in a straight line when they are actually walking in a circle. The limitation of this redirected walking technique is that users cannot move freely according to their will; they have to follow a pre-defined path to reach and look around pre-defined waypoints in the virtual environment (VE).
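The core trick, scaling real rotation so the virtual view turns by a slightly different amount, can be sketched in one line; the gain value below is purely illustrative (perceptual detection thresholds for such gains are studied empirically in the redirected-walking literature):

```python
def redirected_yaw(real_yaw_delta_deg, rotation_gain=1.3):
    """Apply a rotation gain so the virtual view turns more than the
    user's real head turn. Because the user aligns with the *virtual*
    world, they physically under-rotate, and repeated small
    discrepancies steer them back inside the tracked area."""
    return real_yaw_delta_deg * rotation_gain

# With a gain of 1.3, a real 90-degree turn yields a 117-degree
# virtual turn, so facing the same virtual landmark again requires
# a smaller physical turn.
print(redirected_yaw(90.0))
```

Translation gains and curvature gains work analogously on forward movement, which is how a straight virtual path can be mapped onto a physical circle.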

Later, Peck et al. [13] improved the redirected walking technique so that the user can walk freely inside a virtual space larger than the tracked area without a pre-defined path. Their method, named Improved Redirection with Distractors (IRD), generates distractors in the VE to stop users when they need to be redirected, instead of using pre-defined waypoints.

Vasylevska et al. [14, 15] introduced a novel approach, flexible space, as a different solution to real walking in a larger VE. Instead of mapping real-world space onto a fixed VE layout, the flexible space method dynamically changes the VE layout based on the user's position in the real world, so that the VE stays inside the tracked area. The limitation of flexible space is that the method is only feasible under certain conditions, such as moving from one room to another in the VE.

The main disadvantage of real walking methods is the requirement for tracking space. Although redirected walking can map a smaller tracking area to a bigger virtual space, it still requires quite a large area for the tracking set-up; this is not suitable for home users, because most people do not have the space for it.

4.3 Walking-in-Place (WIP)

WIP is a way of simulating the walking action. Unlike real walking, in which the user's body moves from one place to another, WIP requires the user to perform the walking action while the body remains in the same place; we can say that WIP is a pretend walking motion. WIP has the advantage of giving a sense of walking within a spatially constrained environment.

The main concern of WIP techniques is to capture the user's motion as a sign of intent to move and link that motion to the movement of the avatar or user viewpoint in the virtual environment. WIP techniques for locomotion in VEs can be categorized into three groups: physical interface supported, body motion sensing and capture, and commercial device utilization.

Physical Interface Supported. WIP techniques supported by a physical interface require the user to stand on top of a physical interface integrated with sensors that detect the steps the user performs, without machinery to fix the user's position. The difference between a physical interface and a repositioning device is that the user has to control their own position while performing the walking gesture; therefore, one main issue of WIP is the position drift problem.

One common technique in the physical interface-supported approach is to use pressure or force sensors integrated into underfoot tools such as shoes or mats to detect footsteps. Couvillion et al. [16] introduced the Pressure Mat, a mat with pressure sensors that detect the force produced by users while they walk in place on top of it. The Pressure Mat system is able to detect different walking directions, such as forward, backward, left, right and standing, by analysing the amount of force the user applies to different areas of the mat. For example, in order to walk backward, the user walks in place while concentrating their body weight on their heels; to move forward, the weight is concentrated on the toes.
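The toe/heel rule described above can be sketched as a simple classifier; the function, its normalised threshold and the sensor readings are illustrative assumptions, not the actual Pressure Mat algorithm:

```python
def classify_direction(toe_force, heel_force, threshold=0.2):
    """Classify intended movement direction from the front/back
    weight distribution measured under one foot on a pressure mat."""
    total = toe_force + heel_force
    if total == 0:
        return "standing"
    balance = (toe_force - heel_force) / total   # -1 (heel) .. +1 (toe)
    if balance > threshold:
        return "forward"    # weight concentrated on the toes
    if balance < -threshold:
        return "backward"   # weight concentrated on the heels
    return "standing"

print(classify_direction(80.0, 20.0))  # weight on the toes
```

Left/right directions would be handled the same way with the lateral weight distribution across the mat.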

Rather than using pressure sensors, Zielinski et al. [17] proposed a novel method that detects walking-in-place motion based on the shadows of the feet falling on the floor screen of an under-floor projection system, for example a cave automatic virtual environment (CAVE). They capture the user's foot shadows with a camera placed under the floor screen and analyse the captured shadows with several image-processing methods. This method can only be applied to under-floor projection systems.

Swapp et al. [18] introduced a novel walking-in-place interface, Wizdish, which uses a disc-shaped apparatus with a pair of low-friction shoes for locomotion in a virtual environment. The user wears the low-friction shoes and stands on top of the disc-shaped apparatus to perform a walking-like gesture that they call the “skating” gesture: the user slides one leg to the front and the other to the back alternately, without lifting the legs from the apparatus. The Wizdish itself does not detect steps; a motion capture system tracks the feet positions to identify the walking-in-place gesture. In other words, the Wizdish is just a physical interface on which the user performs the walking-like gesture.

The limitation of physical interface-supported techniques is that the system depends on the physical interface itself. However, such physical interfaces may be cheaper and more portable than repositioning devices.

Body Motion Sensing and Capture. These WIP techniques use sensors placed on the user's body, or cameras, to identify the user's body motion. In the early years, Slater et al. [19] implemented a WIP technique that senses body motion using an electromagnetic tracking device on the head-mounted display to track user movement, with a neural network to analyse the patterns and recognize whether the user is walking in place.

Later, Yan et al. [3] proposed a method that detects WIP motion by tracking leg movement instead of head movement, using a motion tracking system. In this method, two sensors are placed on the legs, one sensor on the user's back and one sensor on the head to track the head direction. Their algorithm uses the speed of leg lifting to determine the locomotion speed. They claim that their system has low latency but do not provide an analysis; this issue was raised in later research [20].

Feasel et al. [20] pointed out the inadequacy of Slater et al.'s research [19] in terms of naturalness, because their algorithm required four steps of walking-in-place before the neural network could recognize and register the movement, and required two full cycles of “no step” to be detected before stopping the movement. To improve the latency, Feasel et al. [20] proposed a WIP method called Low-Latency, Continuous-Motion (LLCM)-WIP, which achieves low-latency, smooth and continuous control of locomotion speed using magnetic foot trackers attached to the user's shins to track heel position above a metal-floored lab, and a chest tracker to track the user's orientation.

Wendt et al. [4] introduced another approach, called Gait-Understanding-Driven (GUD) WIP, which applies a biomechanical gait model to further enhance the analysis of the stepping motion. Their system uses an optical motion capture system to detect the WIP gesture. They aimed to beat the LLCM-WIP method [20] in terms of the consistency with which step frequencies map to output locomotion speeds. Their results show that GUD WIP is more consistent than LLCM-WIP; however, their method suffers from a stopping latency issue.

Bruno et al. [21] highlighted a problem with GUD WIP: controlling locomotion speed by step frequency alone is not accurate enough, since footsteps of different vertical and horizontal extents take different amounts of time to perform. They therefore proposed Speed-Amplitude-Supported (SAS) WIP, which uses footstep amplitude and speed metrics instead of a frequency metric to determine locomotion speed. They used an optical tracker system to identify heel positions for walking-in-place motion tracking.

In addition to sensing movement, Templeman et al. [22] integrated their system, Gaiter, with a physical interface to support motion sensing and capture. Gaiter uses pressure sensors placed inside the user's shoes together with movement sensors on the legs, waist and head to provide more flexibility in motion control. The Gaiter system enables the user to walk forward, backward and sideways, and to turn, using different walking-in-place gestures.

The research in [4, 20, 21] focuses on improving the accuracy of speed control and latency, and [22] focuses on flexibility of motion control, but their experiments were conducted in dedicated laboratories, which makes them hard to implement and use for home users who do not possess the required hardware.

Commercial Device Utilization. In recent years, considering cost and ease of implementation, researchers have proposed using the built-in sensors of smartphones, or recent commercial gaming devices such as the Wiimote [23], the Nintendo Wii Balance Board [24, 25] and the Microsoft Kinect [26,27,28], for tracking WIP motion. Using ready-made components of these gaming devices, such as the accelerometer in the Wiimote, the pressure sensors in the Wii Balance Board or the infrared depth camera in the Kinect, is a good way to reduce cost, because it removes the need for an additional device to support the method.

The Nintendo Wii Balance Board has built-in pressure sensors, so it can directly detect the pressure applied while the user walks on top of it. Williams et al. [25] used the four pressure sensors located at the corners of the Wii Balance Board to identify walking-in-place, detected as the weight applied to the different corners shifting rapidly. Harris et al. [24] introduced a Wii-Leaning method that uses weight measurements from the four pressure sensors to detect the user leaning and translates this into a movement command. Their method requires the user to gaze in the direction of travel and lean in that direction in order to move inside the virtual world.
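The weight-shift idea can be sketched as counting side-to-side swaps of the dominant load; the function, the sampled readings and the threshold in kilograms are illustrative assumptions, not the published detection algorithm:

```python
def detect_wip(left_weights, right_weights, shift_threshold=10.0):
    """Count walking-in-place steps from paired left/right weight
    readings (kg) sampled from a balance board. A step is counted
    whenever the dominant weight swaps sides by more than the
    threshold; roughly balanced samples are ignored."""
    steps = 0
    prev_side = None
    for left, right in zip(left_weights, right_weights):
        if abs(left - right) < shift_threshold:
            continue                      # weight roughly balanced
        side = "left" if left > right else "right"
        if prev_side is not None and side != prev_side:
            steps += 1                    # weight shifted to the other foot
        prev_side = side
    return steps
```

A real implementation would also look at how quickly the shifts occur, since slow weight transfers (leaning) should not count as steps.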

The Microsoft Kinect can track real-time human body movement with a depth camera, so it can be used to detect the walking-in-place gesture performed by the user. Williams et al. [29] and Wilson et al. [26] proposed using two Kinects instead of one, because data collected from a single Kinect can be too noisy and inadequate to detect the walking-in-place gesture accurately. Zheng et al. [27] also proposed using the Microsoft Kinect to detect the WIP gesture; since the skeletal data from one Kinect is noisy, they studied the pattern of the WIP gesture and devised an algorithm that detects it based on knee position.
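A knee-position step detector can be sketched as a simple two-state machine over knee heights; the function, the threshold and the sample values are illustrative assumptions rather than the cited algorithm, and real skeletal data would first need filtering for noise:

```python
def knee_steps(knee_heights, lift_threshold=0.05):
    """Count walking-in-place steps from a sequence of knee heights
    (metres) relative to the standing rest height. A step is one
    lift above the threshold followed by a return below it."""
    steps = 0
    lifted = False
    for h in knee_heights:
        if not lifted and h > lift_threshold:
            lifted = True          # knee raised: step begins
        elif lifted and h < lift_threshold:
            lifted = False         # knee lowered: step completes
            steps += 1
    return steps

print(knee_steps([0.0, 0.10, 0.12, 0.02, 0.09, 0.0]))
```

The two-state structure is what gives such detectors robustness to jitter near the threshold compared with counting raw threshold crossings.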

The Wiimote is a gaming device with a built-in accelerometer for detecting the hand movement of the user holding it; it has therefore been used in research to collect acceleration data. Shiratori and Hodgins [23] used accelerometer data from Wiimotes attached to the users' legs, wrists or arms to control character movements such as walking, running and jumping in a simulated virtual environment. Vela Nunez et al. [30] presented a similar approach that uses two Wiimotes attached to the user's thighs to identify human gait, differentiating walking in place from running in place, and adopted the method in a real-time virtual reality application.

Besides the Wiimote, the smartphone is another device with many built-in sensors. Kim et al. [31] introduced a new WIP algorithm using the magnetic and acceleration sensors inside smartphones. They aimed for a wireless, easy-to-implement WIP algorithm that can detect dynamic changes in walking speed. Two smartphones and a magnet are attached to the user's legs for motion and speed tracking.

The methods proposed in [23, 30, 31] rely on two or more devices attached to corresponding parts of the body, such as the legs or hands. The number of devices needed, and how to attach them to the user's body, are concerns.

Some research requires fewer devices attached to the user's body by using a head-mounted approach. Williamson et al. [28] used a single Wiimote fixed on the user's head to detect upward movement during a running-in-place gesture for locomotion in an American football video game. However, this method has a starting latency problem, because the algorithm looks for steady upward movement before it recognizes the running-in-place motion.

Considering the current trend of the virtual reality industry, namely the popularization of VR headsets, the head-mounted approach is a good direction to pursue: while using a VR headset, the user's smartphone is placed inside the headset at the head position. Therefore, using smartphone sensors to detect the user's motion with a head-mounted approach fully exploits ready-made resources, and is thus a good way to simplify the set-up requirements.
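Head-mounted detection exploits the fact that each in-place step makes the head bounce, producing an upward acceleration spike followed by a downward rebound. A minimal sketch of such a detector follows; the function, thresholds and trace values are illustrative assumptions, not the algorithm of any cited work:

```python
def head_bounce_steps(vertical_accel, high=1.5, low=-1.5):
    """Count in-place steps from the vertical acceleration trace
    (m/s^2, gravity removed) of a head-mounted sensor: a step is a
    positive spike above `high` followed by a rebound below `low`."""
    steps = 0
    phase = "idle"
    for a in vertical_accel:
        if phase == "idle" and a > high:
            phase = "up"            # upward spike detected
        elif phase == "up" and a < low:
            phase = "idle"          # downward rebound completes the step
            steps += 1
    return steps

print(head_bounce_steps([0.0, 2.0, 0.5, -2.0, 0.0, 2.1, -1.8, 0.0]))
```

Because the detector waits for the full up-then-down pattern, it trades some starting latency for robustness, which mirrors the latency trade-off noted for head-mounted methods above.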

5 Conclusion

Virtual travel, or locomotion in a virtual environment, has recently become an important issue for enhancing the virtual reality industry, because consumers need a way to “walk” in order to immerse themselves in the virtual environment. The best way to simulate the real feeling is to perform the same motion at the same time. However, synchronizing the user's real-world walking motion with the action in the virtual world is still a problem, because of the spatial differences between the virtual and real worlds. In this paper, we have reviewed different research efforts and current mechanisms that aim to enable “walking” in the virtual environment.

Repositioning devices can be seen as the most promising method for the coming years; in fact, prototypes such as KAT WALK, Virtuix Omni and Cyberith Virtualizer, together with their promotional videos, have already been announced. However, the cost, ergonomics and practicality of the devices can be critical factors in their success or failure.

Real walking, although it gives the most immersive feeling to the user, still has the set-up and space requirements as a main issue, and it is not suitable for home users.

Walking-in-place is an alternative method that enables walking in a small space. The use of smartphones or recent gaming devices to detect body motion is its advantage over the other methods, because reusing devices the user already possesses reduces the set-up cost. Nevertheless, the safety issue related to position drift has to be considered.

In conclusion, the different solutions have their own strengths and weaknesses. Yet, considering the feasibility of wide adoption by the general public, some walking-in-place techniques have the strengths of low cost and low set-up requirements, provided the WIP technique can function with acceptable results and deliver a stable immersive feeling.