Background & Summary

The scientific assessment and modeling of human locomotion have been a central topic in various domains such as medicine, ergonomics, robotics, and sports1,2,3,4,5,6,7,8. Traditionally, human gait data have been recorded in controlled laboratory settings9,10,11, e.g., on catwalks or treadmills. However, such data do not suffice to model human gait in more natural and challenging environments, where people exhibit a richer gait behavior. Models of this kind are necessary, e.g., for a satisfactory user experience with assist devices intended for use both in clinical settings and in daily life.

Gait data are often available only as part of human activity data sets12,13,14,15, which rarely contain ground truth information for the segmentation of single steps. Only a few dedicated human walk data sets capture walking in natural outdoor environments16,17,18. These data sets provide lower-body IMU data and sometimes FSR data as ground truth for step segmentation, but they focus on walking speed variation and on measuring different walk patterns in isolation.

We aimed to create a richly annotated gait data set of natural, everyday walk scenarios requiring continuous walking of 5 to 15 minutes that naturally contain diverse gait patterns such as level walking, walking up/down ramps and stairs, as well as the corresponding transitions in between. In particular, the recordings include the natural interaction with other pedestrians and cyclists that affect the subjects’ gait behavior. We provide whole-body data from 17 IMU sensors to enable a wide variety of motion modeling. Additionally, we include plantar foot pressure data that yield accurate foot contact information and may be used independently from the IMU data.

The data set consists of 9 hours of gait data recorded from 20 healthy subjects. They walked across three different courses in a public area around a suburban train station. A single repetition of each course required several minutes of walking and captured many common elements such as straight and curvy passages, slopes, stairs, and pavements. We annotated the walking mode, e.g., regular walk, climb/descend stairs, ascend/descend slopes, interactions with other pedestrians/cyclists, curves and turnarounds, as well as terrain segments. The timings of heel strike and toe-off events are provided as well.

Another unique feature of our data set is the use of a mobile eye tracker to record the gaze behavior of our participants during walking. Humans extensively use visual information about the environment for strategic control planning19. For instance, they adapt their gait speed and gaze angle to the complexity of the environment20. Spatio-temporal visual information is essential for proper foot positioning on complex surfaces21. Thus, gaze may serve as an indicator of human intention during walking22 as well as an estimator of fall risk23,24,25,26. To our knowledge, this is the first publicly available data set that provides gait motion data together with the corresponding visual behavior. In particular, the data allow the estimation of the gaze trajectory by combining the gaze position with the head orientation. As gaze is known to be an early predictor of human intention27,28, we think that the analysis of gaze patterns as a predictive signal for anticipating walk mode transitions provides an exciting research opportunity.

In summary, we anticipate that this data set will provide a foundation for future research exploring machine learning for real-time motion recognition and prediction, potentially incorporating visual behavior and analyzing its benefits.

Methods

Participants

Twenty-five healthy adults with normal or corrected-to-normal vision volunteered to take part in the study. The data from five participants were incomplete due to sensor failures and are not included in the data set. The anthropometry of the remaining 20 participants, 5 females and 15 males, is given in Table 1. The participants’ average height of 178.55 ± 7.6 cm corresponds to the average height in central Europe, while their average weight of 72.95 ± 8.7 kg was slightly below the central European average, i.e., the participants were comparatively slim, cf. Figure 1.

Table 1 Anthropometry information of the participants.
Fig. 1
figure 1

Statistics of participants’ (a) age, (b) height, and (c) weight.

All participants provided written informed consent, including written permission to publish the data of this study. The study was approved by the Bioethics Committee in Honda’s R&D (97HM-036H, Dec. 14, 2020).

Experimental tasks

The participants were asked to complete different walking courses in the area of a suburban train station that included walking on level ground, ascending and descending stairs, walking up and down ramps, and stepping up and down a curb. Figure 2 shows maps of the three walking courses.

Fig. 2
figure 2

Maps of walking courses A, B, and C.

Courses A and B include level walking, walking up and down ramps, and walking up and down stairs. Figure 3(a–d) illustrates courses A and B: Fig. 3(a) shows a level area. Figure 3(b) shows typical stairs, which consist of one or two flights of 8 to 13 steps separated by landings. Typical ramps, as shown in Fig. 3(c), have a slope of 6% and a length of 50 m to 70 m. Course A additionally contains one short, steep ramp, shown in Fig. 3(d), with a slope of 15% and a length of approx. 3 m. The walking distance for each course is roughly 500 m.

Fig. 3
figure 3

Photos of the experiment location.

Course C includes straight level walking, walking a 90-degree curve, stepping up and down a curb in a lay-by, and turning by 180 degrees. The lay-by is shown in Fig. 3(e). The curb height here is 10 cm. The walking distance in course C is roughly 200 m.

Sensors

The participants were equipped with the following sensors, cf. Figure 4:

  • Inertial measurement units (IMUs). For tracking motion and posture, we used a full-body inertial kinematic measurement system, the Xsens motion capture suit29, consisting of the MVN-Link BIOMECH full-body system and the MVN Link lycra suit. The system comprises 17 IMU sensors with 3D rate gyroscopes for measuring angular velocity, 3D linear accelerometers for measuring accelerations including gravitational acceleration, 3D magnetometers for measuring the Earth’s magnetic field, and a barometer for measuring atmospheric pressure. The IMUs are placed on the head, sternum, sacrum, shoulders, upper arms, forearms, hands, upper legs, lower legs, and feet.

  • Force sensitive resistors (FSRs). Foot pressure data were recorded using the IEE ActiSense Smart Footwear Sensor insole30 (IEE S.A., Luxembourg). The measurement system consists of thin, foil-based, removable pressure insoles with eight high-dynamic pressure sensing cells that are inserted into the shoes below the shoes’ insoles. The FSR sensor cells are located below the hallux, the toes, the heads of the first, third, and fifth metatarsals, the arch, and the left and right side of the heel. The pressure insoles are controlled by ECUs that are clipped to the participants’ shoes and contain IMUs consisting of a 3D accelerometer, a 3D gyroscope, and a magnetometer. Note that the accelerometer and gyroscope axes coincide, while the magnetometer orientation is rotated by 180 degrees around the accelerometer/gyroscope x-axis (see the sketch after this list).

  • Eye tracker. Eye-tracking data were recorded with a mobile eye tracker, the Pupil Invisible Glasses31. The eye tracker is worn like a regular pair of glasses. Two small cameras on the bottom rim of the glasses capture the wearer’s eye movements using infrared (IR) LEDs for pupil tracking and map the wearer’s gaze point into a scene video captured by a scene camera attached to the spectacle frame.
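To illustrate the axis relation noted for the insole ECUs, the magnetometer readings can be expressed in the accelerometer/gyroscope frame by a single fixed rotation of 180 degrees about the x-axis. The following snippet is a minimal sketch under this assumption; the function name is ours and not part of any manufacturer tool.

```python
import numpy as np

# A 180-degree rotation about the x-axis flips the sign of the y- and z-axes.
R_X_180 = np.diag([1.0, -1.0, -1.0])

def align_magnetometer(mag_xyz):
    """Rotate a single magnetometer sample (3-vector) from the magnetometer
    frame into the accelerometer/gyroscope frame of the insole ECU."""
    return R_X_180 @ np.asarray(mag_xyz, dtype=float)
```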

Fig. 4
figure 4

The sensory equipment of the participants (left). They wore a mobile eye tracker, the Pupil Invisible Glasses31 (top right), an Xsens full-body motion suit with 17 IMU sensors (middle right), and the IEE ActiSense Smart Footwear Sensor insoles (IEE S.A., Luxembourg) to record foot pressure data (bottom right). Note that the ECUs of the pressure insoles, the small black boxes attached to the shoes, each contain an additional IMU. This means that two IMUs from different measurement systems are attached to each foot: the IMU of the motion capture suit is located below the shoe tongue, in the middle of the top of the instep, while the pressure insole IMU sits on the side of the top of the instep. The participant shown in this figure provided permission for their likeness to be used.

Data collection

Hardware set-up

The participants were asked to bring tightly fitting clothes and comfortable, flat, lace-up shoes with removable insoles. They wore the Xsens suit over their clothes. The FSR insoles were inserted into their shoes below the shoe insoles. The Xsens suit requires a separate calibration recording before the actual data recording. This calibration consisted of the participant standing in a neutral pose for 5 seconds, then walking forward for 5 to 10 meters, making a u-turn, walking back to the starting position, turning, and again standing in a neutral pose for 5 seconds. The other sensors did not need a calibration procedure; their proper functioning was checked using their associated smartphone applications. To facilitate the synchronization of the different sensors, we asked the participants to look at their feet for the first few and last few steps of each recording.

Experimental tasks

The participants were asked to complete three repetitions each of walking courses A and B, and five repetitions of walking course C. They were instructed to walk at their preferred, normal speed and to take a break whenever necessary. All participants completed each walking task without taking a break. One experimenter followed them at a distance to give directions and to support the participant if necessary.

The experiments took place in dry weather conditions either in the late morning or the early afternoon to avoid busy commuting times at the train station. However, all participants encountered commuters and passers-by during the experiments so that the data contains side-stepping maneuvers. Note also that some participants chose to take two steps at a time when climbing stairs.

The average time for completing one repetition of courses A, B, and C was 235 ± 22 s, 198 ± 19 s, and 77 ± 7 s, respectively. Figure 5 illustrates the time it took each participant to complete one repetition of each task. Note that the participant with ID 7 completed six instead of five repetitions of course C. The total recording time of the complete data set amounts to 9:22 hrs, with 3:55 hrs for course A, 3:17 hrs for course B, and 2:10 hrs for course C.

Fig. 5
figure 5

Experiment duration in seconds for each participant and each task.

Data processing

The data were recorded on-device and transferred to a desktop computer for post-processing. For each sensor, we used the post-processing software provided by the respective manufacturer.

  • IMU Data. The Xsens IMU data were recorded with a sampling frequency of 240 Hz. The raw sensor data were post-processed with the Xsens MVN software29 (MVN Studio 4.97.1 rev 62391), which computes full-body kinematic data based on a biomechanical model of the participant and sensor fusion algorithms. We provide the full data as post-processed by MVN Studio. Magnetometer data are subject to magnetic distortion from the environment and should be used with care. The resulting data were saved in MVNX file format for further processing.

  • FSR Data. FSR data were recorded with a sampling frequency of 200 Hz. The IEE ActiSense Smart Footwear Sensor insoles30 come with a tool to convert the raw digital values to voltages and to convert the raw accelerometer, gyroscope, and magnetometer data to accelerations, angular rates, and magnetic flux density. Additionally, the tool synchronizes the data from both feet. The resulting data are saved in CSV file format.

  • Eye Tracking Data. The gaze data were recorded with a sampling rate of 66 Hz. We used the open-source software Pupil Player32 (v3.4) to export the gaze position data to CSV file format and to create a scene video with a gaze position overlay. In a second step, we blurred passers-by and license plates in the resulting scene video for data protection reasons.

The data of all three sensors have been down-sampled to 60 Hz. In the case of the IMU data, we kept every fourth data point, whereas we linearly interpolated the FSR and eye-tracking data. All three modalities have been synchronized manually by one experimenter and validated by the other, using the visualization and labeling tool shown in Fig. 6, which is provided with the software related to this data set. The participants were asked to look at their feet during the first and last few steps of each recording to facilitate this post-synchronization, which is required in particular to synchronize the eye tracker recordings with the other two modalities. Some example videos showing all sensor modalities after the synchronization procedure are available in the code repository related to the data set.
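The resampling step can be reproduced with a few lines of NumPy/pandas. The sketch below is illustrative only; it assumes purely numeric data frames with a shared, monotonically increasing time column (here called time_s, a placeholder name; see Tables 2–5 for the actual column names). The 240 Hz IMU stream is decimated by keeping every fourth sample, and the FSR and gaze streams are linearly interpolated onto the resulting 60 Hz time stamps.

```python
import numpy as np
import pandas as pd

def resample_to_60hz(imu_240hz: pd.DataFrame,
                     fsr_200hz: pd.DataFrame,
                     gaze_66hz: pd.DataFrame,
                     time_col: str = "time_s") -> pd.DataFrame:
    """Down-sample all modalities to a common 60 Hz timeline."""
    # Keep every fourth IMU sample: 240 Hz / 4 = 60 Hz.
    imu_60 = imu_240hz.iloc[::4].reset_index(drop=True)
    t_60 = imu_60[time_col].to_numpy()

    def interpolate(df: pd.DataFrame) -> pd.DataFrame:
        # Linearly interpolate every column onto the 60 Hz time stamps.
        t_src = df[time_col].to_numpy()
        cols = {time_col: t_60}
        for col in df.columns.drop(time_col):
            cols[col] = np.interp(t_60, t_src, df[col].to_numpy())
        return pd.DataFrame(cols)

    fsr_60 = interpolate(fsr_200hz)
    gaze_60 = interpolate(gaze_66hz)

    # Stack all modalities side by side on the shared 60 Hz timeline.
    return pd.concat(
        [imu_60,
         fsr_60.drop(columns=time_col),
         gaze_60.drop(columns=time_col)],
        axis=1,
    )
```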

Fig. 6
figure 6

Visualization tool that jointly displays all three sensor modalities. The body posture is based on the Xsens segment positions. For the insoles, the eight pressure segments of each foot are shown, as well as a binary state that indicates whether the foot is on the ground. From the eye tracker recordings, the scene video is displayed, including the current fixation and the recent gaze trajectory.

Using the same tool, the data were labeled by walk mode and walk orientation. The walk modes are ‘walk’, ‘stairs_down’, ‘stairs_up’, ‘slope_down’, ‘slope_up’, as well as ‘pavement_up’ and ‘pavement_down’ to indicate stepping up or down a curb. The walk orientations are ‘straight’, ‘curve_right’, ‘curve_left’, ‘turn_around_clockwise’, and ‘turn_around_counterclockwise’.

Additionally, we label whether or not the participant interacts with passers-by, i.e., whether or not the participant’s motion trajectory is affected by the motion of other persons in their surroundings. It has been shown that gaze is the main source of information used by pedestrians to control their motion trajectory33. Therefore, we annotated encounters with other persons as interactions based on the gaze behavior overlaid on the eye tracker’s scene video. We defined an interaction to start as soon as the other persons were visually fixated by the participant and to end when they left the participant’s field of view.

To easily identify identical course segments across participants and repetitions, the walk courses were segmented by walk mode and numbered consecutively, cf. Figure 8. Since each task was recorded in one continuous take, we included a counter to indicate the repetition of the walking task.

Step detection

To simplify gait analysis, we determine the heel strike and toe-off events from the pressure outputs of the insole sensors. Similar to the approach of Hassan et al.34, we use a heuristic based on two thresholds. For each recording, we normalize the measured values of each sensor cell between the first and 99th percentile to remove outliers and to achieve invariance against different body weights, shoe characteristics, etc. A foot is assumed to be on the ground if its maximum sensor output surpasses the threshold αstep and lifted if it falls below the threshold αlift. We consider all sensor cells because the most relevant ones can vary, particularly on stairs and slopes or when subjects perform evasive motions due to other pedestrians or cyclists. A heel strike event is registered at the first moment the foot switches from being lifted to being on the ground, and a toe-off event at the opposite transition. An illustrative result of the detected events is depicted in Fig. 7.
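The following sketch illustrates the two-threshold heuristic for one foot. It is a minimal reimplementation for illustration only; it assumes a (T × 8) array of pressure values and uses placeholder threshold values rather than the optimized ones described below.

```python
import numpy as np

def detect_contact_events(pressure: np.ndarray,
                          alpha_step: float = 0.2,
                          alpha_lift: float = 0.1):
    """Detect heel-strike and toe-off sample indices from insole pressure data.

    `pressure` is a (T, 8) array holding the eight sensor cells of one foot
    over T samples. The threshold values are placeholders.
    """
    # Normalize each cell between its 1st and 99th percentile to remove
    # outliers and effects of body weight and shoe characteristics.
    lo = np.percentile(pressure, 1, axis=0)
    hi = np.percentile(pressure, 99, axis=0)
    norm = np.clip((pressure - lo) / (hi - lo + 1e-9), 0.0, 1.0)

    # The foot state is driven by the maximum over all cells.
    max_cell = norm.max(axis=1)

    heel_strikes, toe_offs = [], []
    on_ground = max_cell[0] > alpha_step
    for t in range(1, len(max_cell)):
        if not on_ground and max_cell[t] > alpha_step:
            heel_strikes.append(t)   # foot switches to ground contact
            on_ground = True
        elif on_ground and max_cell[t] < alpha_lift:
            toe_offs.append(t)       # foot switches to swing phase
            on_ground = False
    return heel_strikes, toe_offs
```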

Fig. 7
figure 7

Illustrative example of the heel strike and toe-off detection based on two thresholds.

Fig. 8
figure 8

Segments according to walk mode (upper row) and walk orientation (lower row) in walking courses A, B, and C.

Both thresholds αstep and αlift are optimized using a random search on a small set of annotated recordings, minimizing the mean absolute error between the ground truth events and the detected ones. The set of annotated recordings covers five different subjects; for each of them, the heel-strike and toe-off events of 60 steps were annotated using only the insole visualization shown in Fig. 6. The average temporal difference between detected and annotated events of a test subject is approx. 3.5 ms for heel strikes and approx. 5.5 ms for toe-offs, which is accurate considering the signal frequency of 60 Hz.
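The threshold optimization itself is a plain random search. The sketch below assumes a list of annotated recordings with ground-truth heel-strike and toe-off sample indices and reuses the detect_contact_events helper from the sketch above; the search ranges and the number of trials are placeholders, not the values used for the data set.

```python
import numpy as np

def mean_abs_event_error(detected, annotated, dt=1.0 / 60.0):
    """Mean absolute time difference (in seconds) between each annotated
    event and its nearest detected event."""
    if not detected:
        return np.inf
    detected = np.asarray(detected)
    return float(np.mean([np.min(np.abs(detected - a)) * dt
                          for a in annotated]))

def random_search(recordings, n_trials=1000, seed=0):
    """`recordings` is a list of (pressure, hs_truth, to_truth) tuples with
    annotated heel-strike and toe-off sample indices."""
    rng = np.random.default_rng(seed)
    best_err, best_params = np.inf, None
    for _ in range(n_trials):
        a_step = rng.uniform(0.05, 0.6)
        a_lift = rng.uniform(0.01, a_step)   # lift threshold below step threshold
        err = 0.0
        for pressure, hs_truth, to_truth in recordings:
            hs, to = detect_contact_events(pressure, a_step, a_lift)
            err += mean_abs_event_error(hs, hs_truth)
            err += mean_abs_event_error(to, to_truth)
        if err < best_err:
            best_err, best_params = err, (a_step, a_lift)
    return best_params
```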

We also provide the estimated foot contact data calculated by the Xsens software; however, in our experience, these estimates are often inaccurate, particularly on slopes and stairs, and we suggest treating them with caution.

Data Records

We provide the data on the figshare data-sharing platform35. The repository contains a folder with detailed documentation on the walk courses, including photos to illustrate the area, length, and slope of the ramps and the number, height, and width of the steps in the stairs. The processed data is provided in a file structure that is organized hierarchically by experimental task and participant. Each participant folder contains synchronized CSV data files with (i) eye tracker data (8 columns), (ii) pressure insoles data (91 columns), (iii) full-body Xsens data (757 columns), (iv) labels (22 columns), and (v) the eye tracker scene video in MP4 format. Detailed lists and explanations of all data columns in each file are given in Tables 2–5 and are also provided in the documentation folder on the data-sharing platform. Note that all files contain columns with the experiment time, the participant ID, and the experimental task. The experiment time is synchronized over all sensors and may be used to join data from several files.
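As an illustration of such a join, the sketch below loads the per-modality CSV files of one participant and task with pandas and merges them on the shared key columns. All file and column names used here are placeholders; the actual names are listed in Tables 2–5 and in the documentation folder.

```python
import pandas as pd

# Placeholder file names; see the documentation folder for the actual naming.
imu = pd.read_csv("courseA/participant_01/xsens.csv")
fsr = pd.read_csv("courseA/participant_01/insoles.csv")
gaze = pd.read_csv("courseA/participant_01/eyetracker.csv")
labels = pd.read_csv("courseA/participant_01/labels.csv")

# All files share the synchronized experiment time, participant ID, and
# experimental task, so they can be joined on these columns.
keys = ["experiment_time", "participant_id", "task"]  # assumed column names
merged = imu.merge(fsr, on=keys).merge(gaze, on=keys).merge(labels, on=keys)
```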

Table 2 Explanation of data columns in the eye tracker data files.
Table 3 Explanation of data columns in the pressure insole data files.
Table 4 Explanation of data columns in label data files.
Table 5 Explanation of data columns in processed Xsens data files.
Table 6 List of Xsens motion sensor locations, segment labels, and joint labels.

Related Data Sets

An overview of related data sets is given in Table 7. Our database differs from the available ones in multiple aspects. The main difference is the extensive sensory setup that combines full-body IMU data with foot pressure and gaze data. In particular, this is the first data set providing natural gait data that includes the visual behavior of the subjects. Another significant difference lies in the trial design. In most data sets, trials aim to capture specific effects, such as the influence of terrain complexity on gait, in an isolated manner. In contrast, our scenarios were designed to capture traits of natural everyday walks, including transitions between various gait patterns. Each trial consists of several minutes of walking in a public space covering common elements such as straight and curvy passages, slopes, stairs, and pavements. All these elements are annotated.

Table 7 Overview of publicly available human gait databases.

Technical Validation

The sensors were validated before each recording session in the following way: The Xsens suit was calibrated in the lab before going to the experiment location. The validity of the calibration was checked by visualizing the resulting modeled skeleton in the lab using the Xsens software. The calibration was repeated at the experiment location directly before the recording to account for potential shifts of the IMU sensors. The insoles were validated for each participant by inspecting the pressure signals in the manufacturer’s live-streaming app during a short practice walk of approximately 30 seconds. The eye tracker does not require manual calibration; however, we ensured a reasonable accuracy of the estimated gaze point by having the participants fixate four objects in the vicinity and inspecting the estimated gaze point in the scene video.

Usage Notes

Each sensor modality is stored in a separate CSV file for each walking task and participant and can be imported into any software framework for further analysis. The labels are also available in separate CSV files. We provide a Python script that generates a single pandas data frame from the CSV files, which can be used directly with standard machine-learning libraries such as Scikit-learn36, Pandas37, PyTorch38, or TensorFlow39.

The data provide natural walking behavior annotated with different walking modes and heel strike and toe-off timings. One concrete application could be to train real-time machine learning models to classify and/or predict the walk modes and/or the heel strike and toe-off timings in order to enhance the control of walk assist systems such as exoskeletons40 or prostheses41.
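As a minimal starting point for such an application, the sketch below trains a scikit-learn classifier to predict the walk mode from the synchronized sensor columns. It assumes the merged data frame from the sketch in the Data Records section and a label column named walk_mode (an assumed name); it is an illustration only, not a proposed evaluation protocol.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# `merged` is the combined data frame from the earlier sketch; all numeric
# columns except the bookkeeping ones are used as features.
X = merged.select_dtypes(include=[np.number]).drop(
    columns=["experiment_time", "participant_id"], errors="ignore")
y = merged["walk_mode"]  # assumed label column name

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```

For a realistic assessment, the train/test split should be performed by participant or by recording rather than by individual samples, since a random sample-wise split leaks temporally adjacent frames between the training and test sets.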