
1 Introduction

This work presents a novel method for augmenting accelerometer and gyroscope motion sensor signals. It aims to improve the performance of gait-based biometric person identification systems built on deep convolutional neural networks.

Deep Learning (DL) solutions achieve very good classification results in areas such as natural language processing, speech, and time series analysis. The success of such methods strongly depends on a large and diverse dataset, which gives neural networks good generalization properties [1]. In many cases, acquiring a large number of learning samples is time-consuming, expensive, or even impossible. Data augmentation is a set of techniques and tools for artificially generating additional data from an existing training set. It serves as a remedy in situations where new data cannot be acquired. In the field of image processing, augmentation by affine transformations is a widely used standard. In the field of motion classification, however, the subject is much more challenging [2], which motivates the development of new algorithms.

The paper is organized as follows. Sect. 2 reviews the literature on IMU data augmentation. Sect. 3 describes the methodology of this work in detail, including the database characteristics, the proposed algorithm, and the applied classifier. Sect. 4 presents identification results for several selected augmentation methods. Sect. 5 contains conclusions and a description of the planned future work.

2 Literature Review

A review of the literature on motion sensor data augmentation shows that, in contrast to the field of image processing (where affine transformations are widespread), there is no unified standard. Nevertheless, two main groups of augmentation solutions can be distinguished. In the first group, the IMU signals are transformed as standard time series, while in the second, they are generated synthetically from orientation and displacement time series.

The authors of [3], who investigated gait analysis using a smartphone, proposed generating new signals by rotating the accelerometer measurement data in three-dimensional space. Rotations from 0 to 45° were performed, with a step of 15°, for all three rotation axes. This approach simulates the measurements that would be acquired if the sensor were rotated. However, it has a drawback: rotating a signal in three-dimensional space cannot change its norm (magnitude). If, at time t, the norm of the accelerometer readings was, for example, 1.1 g, then changing the sensor orientation changes the ratio of values on the individual sensor axes. A kind of “shift of values between axes” occurs, but the magnitude remains constant. It is not possible to generate a signal that is amplified in any way.
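The norm invariance described above follows from the orthogonality of rotation matrices. As a short worked step, with the rotation written as a matrix \(R\) acting on the accelerometer reading \(a_t\) (both symbols are introduced here only for illustration):

$$\left\| R\,a_{t} \right\|^{2} = (R\,a_{t})^{T}(R\,a_{t}) = a_{t}^{T}R^{T}R\,a_{t} = a_{t}^{T}a_{t} = \left\| a_{t} \right\|^{2}$$

A reading with a norm of 1.1 g therefore keeps that norm under any rotation; only the distribution of values across the axes changes.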

On the other hand, in [4], which focuses on Parkinson's disease monitoring, the authors proposed a whole series of transformations (Jittering, Scaling, Rotation, Permutation, MagWarp, TimeWarp and Cropping). This approach allows both signal rotation and the application of gain. In the conducted experiments, augmentation based on rotation and on rotation with permutation achieved the best classification results [4]. The rotation was based on randomly selecting the axis and the angle of rotation within a range of 180°, so the generated rotation could even model an upside-down sensor. In gait analysis, it is unlikely that the sensor would be rotated so significantly. It should also be noted that both methods [3, 4] were applied only to accelerometer signals, but they could also be used, with some success, for sensors such as gyroscopes.

In [5], a three-step augmentation mechanism for both accelerometer and gyroscope signals was proposed. In the first step, noise was added; in the second, the signal was scaled with a factor between 0.7 and 1.1; and in the last, the sampling irregularities of the signals were modeled. In the presented approach [5], there is no connection between the gyroscope and the accelerometer signals, which may lead to the generation of samples that are not observable in real conditions. Any quantity measured by the gyroscope is closely related to the rotation of the sensor and results from a change in orientation. Because the accelerometer also measures gravitational acceleration (which depends on the orientation), a change in the orientation of the sensor should also affect its readings.

The methods described in [3,4,5] do not require additional information about the sensor orientation, which is their undoubted advantage. On the other hand, they do not enable simulation of orientation drift or sensor vibrations and their influence on the measured values of accelerometer and gyroscope.

The second group of solutions includes approaches based on the synthetic generation of accelerometer and gyroscope signals. A representative example is the solution proposed in [6]. The authors developed an algorithm for the artificial generation of IMU measurement values from videos on the YouTube platform. The gyroscope measurement values were generated using only the sensor orientation data. In the case of accelerometer signals, the measurement values consist of the gravitational acceleration (sufficiently well modeled using orientation) and the acceleration resulting from the motion of the object. To fully model the signal, information about both the orientation and the trajectory of the motion is necessary. In [6], the motion trajectory information was taken directly from a video recording. In [7], which addresses a similar problem, the professional Vicon motion capture system was used to capture human movement.

While the methods of [6, 7] generate angular velocity signals very well, the synthetic generation of accelerometer signals requires an additional source of information about body motion. When only a single accelerometer and gyroscope set is available, this knowledge is missing, which limits potential applications.

In this paper, we develop an augmentation mechanism that combines the previously presented techniques. First of all, the presented approach focuses on modifying the original orientation signals and subsequently modeling the IMU measurements. The additional rotations have small values and model the limited rotations of the IMU sensors (similar to [3] and in contrast to [4]). The augmentation of accelerometer signals is a twin solution to [3] and shares its disadvantage of being unable to amplify the signal. The novelty of the presented solution is that the output angular velocity signal is generated analytically (in accordance with [6, 7]). Therefore, the measurement data of the accelerometer and the gyroscope are closely connected. In contrast to [5], where each modality is augmented independently, the generated data is always observable in real-world conditions.

3 Methodology

The aim of this work was to investigate the effect of selected augmentation methods on the performance of gait-based biometric systems. The research examined the influence of augmentation techniques from the literature [3, 4] and of the proposed solution on identification metrics. The comparison was performed on a publicly available dataset containing gait acquired on substrates such as pavement, grass, and cobblestone.

3.1 Dataset

In this research, the publicly available gait corpus “A database of human gait performance on irregular and uneven surfaces collected by wearable sensors” [8] was used. The database contains IMU signals collected from 30 subjects on several substrate types. Samples from surfaces such as stairs and ramps were not included in the presented study. The research focused on flat surfaces: pavement I, pavement II, flat even, grass, and cobblestone.

Data acquisition was conducted using the MTw Awinda inertial motion capture system with 6 IMU sensors, located on the right and left shin, the right and left thigh, the wrist, and the back of the torso. In the current work, as in our previous work [9], only the single sensor located on the right thigh was used. The motivation is that the collected signals may, to some extent, resemble data collected with a smartphone carried in a trouser pocket, which gives hope for a potential implementation of the system on mobile devices. Despite the use of a single IMU, ten signals were available for further processing (3 accelerometer channels, 3 gyroscope channels, and 4 orientation quaternion channels).

3.2 Segmentation and Data Preprocessing

The gait cycles contained in the corpus were represented in the form of block recordings that included both gait and stillness periods. To eliminate pause periods, wavelet decomposition was used according to the methodology described in [10].

Figure 1 shows an example of a block recording in which a stationary period was observed at the beginning and end of the data. The X-axis presents time and the Y-axis shows the magnitude of the accelerometer signals. A magnitude value close to “1” (related to the effect of gravitational acceleration) is observed during the pause periods. Interruptions of motion were marginal; nevertheless, they did occur, and the algorithm processed them correctly. After segmentation, the block recordings were divided into fixed-length frames of 128 samples for use in classification.
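The framing step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_into_frames` is hypothetical, and the use of non-overlapping frames with the trailing remainder dropped is an assumption, since the text only states the frame length of 128 samples.

```python
FRAME_LEN = 128  # frame length stated in the text


def split_into_frames(samples, frame_len=FRAME_LEN, step=None):
    """Cut a segmented recording into fixed-length frames.

    samples: list of per-sample rows, e.g. [ax, ay, az, gx, gy, gz].
    step: hop between frame starts; defaults to frame_len (no overlap),
          which is an assumption and not a detail given in the text.
    A trailing remainder shorter than frame_len is dropped.
    """
    if step is None:
        step = frame_len
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, step)]


# Dummy 300-sample, 6-channel recording used only for illustration.
recording = [[float(i)] * 6 for i in range(300)]
frames = split_into_frames(recording)
print(len(frames), len(frames[0]))
```

Passing a smaller `step` would produce overlapping frames, which is another common choice for gait windows.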

Fig. 1. Segmentation; colored background indicates periods detected as movement

Four experiments involving different types of substrate were conducted in the present study. In each of them, the classifier was trained on gait collected on the pavement I substrate. Validation was carried out with samples recorded on pavement II, flat even, grass, and cobblestone. This approach is closest to a real-life scenario: in an actual implementation, the gait samples would most likely be taken on a hard surface and used to create the reference set. This seems reasonable, since during the day city inhabitants typically spend more time walking on concrete or hard sidewalks than on grass or cobblestone. It would not be efficient to collect samples (and train the system) for gait on grass, which may occur relatively infrequently.

Figure 2 shows the full characteristics of the dataset by participant and surface type. The dataset was relatively unbalanced. For example, approximately 40 gait samples were recorded on the cobblestone surface for Participant 08, while for Participant 12 the number was approximately 85. Because the data distribution was unbalanced, the F1-score metric was used to evaluate identification performance.
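To illustrate why the F1-score is preferred over accuracy for unbalanced classes, the sketch below computes a macro-averaged F1-score in plain Python. The macro averaging and the function name `macro_f1` are assumptions made for illustration; the text does not state which averaging variant was used.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging, assumed)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy unbalanced case: nine samples of participant 0, one of participant 1,
# and a classifier that always answers "0".
y_true = [0] * 9 + [1]
y_pred = [0] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy, macro_f1(y_true, y_pred))
```

In this toy case the accuracy looks high although one participant is never recognized, while the macro F1-score is pulled down by the missed class.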

Fig. 2. Number of gait samples by participant and surface type

Before the augmentation process begins, the accelerometer measurement values are subjected to removal of the gravitational acceleration components.

Accelerometer measurement values consist of gravitational acceleration (modeled by the sensor orientation) and readings resulting from actual sensor motion. Using the sensor orientation information, the gravitational acceleration can be estimated and subtracted from the recorded signals. Figure 3 presents the IMU signal preprocessing diagram. The output signal is subsequently used in the classification process.
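The gravity removal step can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: the orientation quaternion is assumed to rotate sensor-frame vectors into a global frame with the Z axis pointing up, a static sensor is assumed to read +g on the upward axis, and readings are assumed to be in m/s². The helper names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed units)


def q_conj(q):
    """Conjugate of a quaternion given as (w, x, y, z)."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def q_mul(a, b):
    """Hamilton product of two quaternions (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def rotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q: q * (0, v) * q^*."""
    return q_mul(q_mul(q, (0.0, *v)), q_conj(q))[1:]


def remove_gravity(acc, q_sensor_to_global, g=G):
    """Subtract the orientation-dependent gravity component from one reading."""
    # Express the global gravity reaction (0, 0, g) in the sensor frame.
    gravity_sensor = rotate(q_conj(q_sensor_to_global), (0.0, 0.0, g))
    return tuple(a - b for a, b in zip(acc, gravity_sensor))


# A static sensor tilted by 90 degrees about its X axis: gravity shows up on Y.
q_tilt = (math.cos(math.pi / 4), math.sin(math.pi / 4), 0.0, 0.0)
print(remove_gravity((0.0, 9.81, 0.0), q_tilt))
```

If the sensor used the opposite quaternion convention (global to sensor), the conjugation in `remove_gravity` would have to be dropped.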

Fig. 3. Signal preprocessing block diagram

3.3 Data Augmentation

The data augmentation process was performed in two stages. In the first one, the orientation signal (quat) was passed through a perturbation pipeline that models constant sensor displacement, vibrations, and sensor orientation drift during the walking motion.

Modified orientation signal (augmented quat) was used in both the gyroscope and accelerometer signal augmentation process. First of all, together with the original orientation signal (quat) and accelerometer signal (acceleration w/o gravity), it was used for the transformation of signals between two coordinate systems. This type of transformation can be understood as obtaining information on how the accelerometer signal would look if the sensor had an artificially created orientation (augmented quat). On the other hand, the augmented quat signal was used to reconstruct the angular velocity signals. In the proposed method, 30 additional learning samples were generated for each gait sample.

Fig. 4. Data flow diagram for the proposed augmentation technique

Figure 4 presents a block diagram of the presented data augmentation mechanism. It can be noticed that the output of the augmentation module is a data block of dimension 6 × 128, which is compatible with that shown in Fig. 3.

In this study, two settings of the augmentation module were investigated. The parameters were selected by trial and error. The examined settings were Offset: 7.5°, Noise: 0.1°, Drift: 3° and Offset: 3.5°, Noise: 0.2°, Drift: 4°.

Orientation Signal Augmentation

Several typical scenarios affecting the values measured by the IMU sensors can be observed. These are the presence of rotation of the sensor with respect to the initial position (Offset), vibrations (Noise), or the slow rotation of the sensor over time (Drift). The proposed augmentation technique can account for all these scenarios and model them by modifying the orientation signals.

Modeling the presence of rotation of the sensor with respect to the initial position (Offset) was realized in the following steps: randomly select angles of rotation about three axes from the Offset range; create a Q_offset quaternion representing the orientation change; multiply the original orientation signals at each time t by the artificially created quaternion (equivalent to giving an additional constant rotation for the entire gait sample).

Modeling of sensor vibration (Noise) was implemented in the following steps: at each time t, randomly select angles of rotation about three axes from the Noise range; create a quaternion Q_noise representing noise; multiply the original orientation signals by the artificially created quaternion (equivalent to giving an additional random rotation at each frame of gait sample).

Modeling of slow sensor rotation during motion (Drift) was implemented as follows. For each of the three axes, randomly select two values from the Drift range. For each pair create a Bezier curve of length 128 which can model slow angle change. For each time t, create a quaternion Q_drift modeling rotation about the three axes. Multiply the original orientation signals by an artificially created quaternion.
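The three perturbations can be sketched together as follows. This is a minimal sketch under several assumptions that the text does not specify: angles are drawn uniformly from [-range, +range] in degrees, per-axis rotations are composed in X, Y, Z order, perturbations are applied by right-multiplication (consistent with Eq. (1)), and the drift curve is a Bezier curve defined only by its two end values, which reduces to a straight line. All function names are hypothetical.

```python
import math
import random


def q_mul(a, b):
    """Hamilton product of two quaternions (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def q_from_angles(rx, ry, rz):
    """Compose rotations about X, Y, Z (degrees) into one unit quaternion."""
    q = (1.0, 0.0, 0.0, 0.0)
    for angle, axis in ((rx, (1, 0, 0)), (ry, (0, 1, 0)), (rz, (0, 0, 1))):
        half = math.radians(angle) / 2.0
        s = math.sin(half)
        q = q_mul(q, (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s))
    return q


def sample_angles(limit, rng):
    """Draw three angles uniformly from [-limit, +limit] (assumed distribution)."""
    return tuple(rng.uniform(-limit, limit) for _ in range(3))


def augment_orientation(quats, offset, noise, drift, rng=random):
    """Apply Offset, Drift and Noise perturbations to one window of quaternions."""
    n = len(quats)
    q_offset = q_from_angles(*sample_angles(offset, rng))  # constant per window
    start, end = sample_angles(drift, rng), sample_angles(drift, rng)
    out = []
    for t, q in enumerate(quats):
        s = t / (n - 1) if n > 1 else 0.0
        # Two-point Bezier curve, i.e. linear interpolation between end values.
        drift_angles = tuple((1 - s) * a + s * b for a, b in zip(start, end))
        q_drift = q_from_angles(*drift_angles)
        q_noise = q_from_angles(*sample_angles(noise, rng))  # new draw per frame
        q_aug = q_mul(q_mul(q_mul(q, q_offset), q_drift), q_noise)
        norm = math.sqrt(sum(c * c for c in q_aug))
        out.append(tuple(c / norm for c in q_aug))
    return out


window = [(1.0, 0.0, 0.0, 0.0)] * 128  # dummy window: identity orientation
augmented = augment_orientation(window, 7.5, 0.1, 3.0, random.Random(0))
print(augmented[0], augmented[-1])
```

Calling the function 30 times with different random states would yield the 30 additional samples per gait sample mentioned in the text; a higher-order Bezier curve with extra control points could replace the linear drift if a smoother profile is wanted.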

Figure 5 a) shows the original orientation signal, whereas Fig. 5 b) presents the augmented one. Manual inspection reveals a significant change in the w component of the quaternion, keeping in mind that the quaternion is normalized for each data frame. The signal patterns also show that, despite the addition of noise, the augmented signal in Fig. 5 b) does not exhibit significant jitter.

Fig. 5. Original orientation signal (a) and augmented orientation signal (b)

IMU Signals Augmentation

Augmentation of the IMU signals was performed by changing the accelerometer measurements coordinate system and reconstructing the angular velocity signal with the use of modified orientation signals.

Augmentation of the accelerometer signals involved a two-step calculation. In the first step, the quaternion describing the rotation between the two reference systems (original and augmented) was determined (1):

$$\begin{array}{c}{}_{aug}^{G}\boldsymbol{q}={}_{S}^{G}\boldsymbol{q}\cdot {}_{aug}^{S}\boldsymbol{q}\\ {}_{aug}^{S}\boldsymbol{q}={}_{S}^{G}\boldsymbol{q}^{*}\cdot {}_{aug}^{G}\boldsymbol{q}\end{array}$$
(1)

where:

\({}_{aug}{}^{G}q\)–quaternion describing a rotation from global (world) coordinates to an augmented orientation;

\({}_{S}{}^{G}q\)– quaternion describing a rotation from global (world) coordinates to sensor orientation;

\({}_{aug}{}^{S}q\)– quaternion describing the rotation that must be performed to switch from the sensor orientation to the augmented orientation;

\({{}_{S}{}^{G}q}^{*}\) – A conjugate quaternion representing a rotation from sensor to global coordinates;

· – Hamilton product (quaternion multiplication operator).

In the next step, the vector (acceleration at time t) was rotated using formula (2):

$$\boldsymbol{v}^{\prime}=\boldsymbol{q}\cdot \boldsymbol{v}\cdot \boldsymbol{q}^{*}$$
(2)

where:

\(v\) – The original vector (in quaternion form where w = 0);

\({v}^{^{\prime}}\) – vector in the new reference system (in quaternion form where w = 0);

\(q\)- quaternion representing the specified rotation.

Formula (2) could be presented in more detailed form (3):

$$\begin{bmatrix}0\\ a_{augx}\\ a_{augy}\\ a_{augz}\end{bmatrix}={}_{aug}^{S}\boldsymbol{q}\cdot \begin{bmatrix}0\\ a_{sx}\\ a_{sy}\\ a_{sz}\end{bmatrix}\cdot {}_{aug}^{S}\boldsymbol{q}^{*}$$
(3)

where:

\({}_{aug}{}^{S}q\) – quaternion describing the rotation that is required to switch from the sensor orientation to the augmented orientation;

\({a}_{sx}\), \({a}_{sy}\), \({a}_{sz}\) – accelerometer measurement values at time t;

\({a}_{augx}\), \({a}_{augy}\), \({a}_{augz}\) – augmented accelerometer measurement values, i.e., the readings that would be measured if the sensor had the augmented quat orientation.
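Equations (1)–(3) can be sketched in a few lines of Python. The sketch follows the formulas literally as written above; the function names `relative_quat` and `rotate_acc` are hypothetical, and the quaternion layout (w, x, y, z) is an assumption.

```python
import math


def q_conj(q):
    """Conjugate of a quaternion given as (w, x, y, z)."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def q_mul(a, b):
    """Hamilton product of two quaternions (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def relative_quat(q_sensor, q_aug):
    """Eq. (1): rotation from the sensor orientation to the augmented one."""
    return q_mul(q_conj(q_sensor), q_aug)


def rotate_acc(acc, q_rel):
    """Eq. (3): q_rel * (0, acc) * q_rel^*, returning the vector part."""
    return q_mul(q_mul(q_rel, (0.0, *acc)), q_conj(q_rel))[1:]


# Example: the augmented orientation differs by 90 degrees about the Z axis.
q_sensor = (1.0, 0.0, 0.0, 0.0)
q_aug = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
q_rel = relative_quat(q_sensor, q_aug)
acc_aug = rotate_acc((1.0, 0.0, 0.0), q_rel)
print(acc_aug)
```

In the example the value on the X axis moves to the Y axis while the vector norm stays the same, which is the "shift of values between axes" discussed in Sect. 2.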

Augmentation of the gyroscope readings involved reconstructing the angular velocity signals from the augmented orientation time series. The process starts by determining the quaternion differential (4):

$$\dot{\boldsymbol{q}}_{t}=\frac{\boldsymbol{q}_{t+1}-\boldsymbol{q}_{t}}{\Delta T},$$
(4)

where:

\({q}_{t+1}\), \({q}_{t}\) – orientation in quaternion form at time t + 1 and t;

\(\dot{{q}_{t}}\) – quaternion differential;

∆T– sampling period.

In the next augmentation step, the angular velocity was reconstructed according to Eq. (5):

$$\boldsymbol{\omega}_{t}(\boldsymbol{q}_{t},\dot{\boldsymbol{q}}_{t})=2\cdot \boldsymbol{W}(\boldsymbol{q}_{t})\cdot \dot{\boldsymbol{q}}_{t},$$
(5)

where:

ωt – vector of angular velocities (ωx, ωy, ωz) at time t;

W – matrix mapping the quaternion qt and its differential to angular velocities.

The value of the matrix W depends on the quaternion q at time t. The coefficients of the W matrix are specified in Eq. (6):

$$\boldsymbol{W}(\boldsymbol{q}_{t})=\begin{bmatrix}-q_{x} & q_{w} & -q_{z} & q_{y}\\ -q_{y} & q_{z} & q_{w} & -q_{x}\\ -q_{z} & -q_{y} & q_{x} & q_{w}\end{bmatrix},$$
(6)

where:

qw, qx, qy, qz – values of the quaternion w, x, y, z components at time t.
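Equations (4)–(6) can be sketched as follows. The function names and the sampling period of 0.01 s used in the example are assumptions made only for illustration; the finite difference yields one sample fewer than the input, and how the authors handle the last frame is not stated.

```python
import math


def w_matrix(q):
    """Eq. (6): 3 x 4 matrix W built from the quaternion (w, x, y, z)."""
    qw, qx, qy, qz = q
    return ((-qx, qw, -qz, qy),
            (-qy, qz, qw, -qx),
            (-qz, -qy, qx, qw))


def reconstruct_gyro(quats, dt):
    """Angular velocity from consecutive orientation quaternions.

    Returns len(quats) - 1 tuples (wx, wy, wz) in rad/s.
    """
    omegas = []
    for q_t, q_next in zip(quats, quats[1:]):
        # Eq. (4): forward finite difference of the quaternion
        q_dot = [(b - a) / dt for a, b in zip(q_t, q_next)]
        # Eq. (5): omega = 2 * W(q_t) * q_dot
        omegas.append(tuple(2.0 * sum(w * d for w, d in zip(row, q_dot))
                            for row in w_matrix(q_t)))
    return omegas


# Check on a known motion: constant spin of 1 rad/s about the Z axis.
DT = 0.01  # assumed sampling period in seconds
spin = [(math.cos(0.5 * k * DT), 0.0, 0.0, math.sin(0.5 * k * DT))
        for k in range(128)]
gyro = reconstruct_gyro(spin, DT)
print(gyro[0])
```

Because q and -q describe the same orientation, a sign flip between consecutive quaternions would produce a spurious spike in the finite difference, so the input series should be sign-continuous.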

Figure 6 shows the accelerometer and gyroscope measurements and their perturbed (augmented) forms.

Fig. 6. Comparison of real accelerometer and gyroscope signals with their augmented forms

Several important observations can be made from the signals presented in Fig. 6. First of all, although the augmented orientation signal (Fig. 5) is slowly varying, the reconstructed angular velocity signal contains jitter noise (while still retaining the characteristics of the original signal). The Noise parameter has a crucial influence on the perturbation process.

On the other hand, the augmented acceleration signal is remarkably similar to the original, with some distinctive differences. The maximum value of the augmented acceleration signal decreased from about 20 to about 15 m/s². An increase in the difference between OX and OZ can be observed over a duration of 30–60 frames. In the process of augmentation, the signal did not gain an additional offset, drift, or noise, but rather numerous local distortions.

The effect of augmentation on the orientation signal is shown in Fig. 5. However, a detailed analysis of the orientation signals is omitted, because they are difficult to interpret directly: the x, y, z components of a quaternion are its imaginary parts, the quaternion itself is normalized to a norm of one, and the same orientation can be represented by two quaternions with negated components.

3.4 Classification

Data classification was carried out with a Deep Learning CNN classifier whose architecture is compatible with [11]. The dimensions of the last dense layer were modified to match the number of participants in the dataset. The network was designed to process accelerometer and gyroscope measurement values in the form of 6 × 128 data blocks. The CNN was trained for 200 epochs using the cross-entropy cost function and the Adam optimization algorithm. The structure of the particular network layers is presented in Table 1.

Table 1. Neural network architecture used in the biometric system (ks: kernel size, p: padding; dimensions consistent with the TensorFlow module documentation)

This network has the general characteristics of an AlexNet-type architecture, such as alternating convolution and max pooling layers. However, there are several significant differences. First of all, the filter in the first layer of the network has a dimension of 1 × 9, so the first layer does not combine data from several sensor axes (each filter reacts to a single sensor axis). Moreover, the consecutive conv2D layers are rather unusual.

4 Experiment Results

The experiments were conducted for 4 types of substrate: Pavement, Grass, Flat even, and Cobblestone. For each, the identification performance was examined in the following cases:

  • absence of augmentation (Baseline),

  • proposed augmentation algorithm: Offset: 3.5°, Noise: 0.2°, Drift: 4° (Proposed I),

  • proposed augmentation algorithm: Offset: 7.5°, Noise: 0.1°, Drift: 3° (Proposed II),

  • augmentation proposed by Iso et al. (Iso et al. [3]),

  • augmentation proposed by Um et al., permutation and rotation (Um et al. I [4]),

  • augmentation proposed by Um et al., rotation (Um et al. II [4]).

Considering the non-deterministic nature of the neural network classifiers, each experiment was repeated 50 times.

Fig. 7. Identification results for different substrate types and data augmentation algorithms based on validations repeated 50 times

Several interesting observations can be drawn from the results presented in Fig. 7. First of all, in many cases augmentation techniques have no real impact on the identification results, and in some they even degrade the achieved metrics. Regardless of substrate type, the Um et al. methods provided worse identification results than the baseline approach.

Application of the algorithms “Proposed I”, “Proposed II”, or “Iso et al.” did not result in significant differences for Pavement and Flat Even surfaces identification metrics. In these cases, the use of augmentation is not recommended. A positive augmentation effect was observed for surfaces such as Grass and Pavement and generally only for the Proposed II method.

5 Conclusions and Future Work

This paper presents an analysis of the impact of motion sensor augmentation techniques on the performance of biometric identification systems. The proposed approach concerns the classification process based on triaxial accelerometer and gyroscope signals. A new augmentation technique, based on the modification of orientation signals and providing simultaneous modeling of accelerometer and gyroscope signals, was proposed (Fig. 4). The research examined the effect of the proposed algorithm as well as of techniques from the literature [3, 4] on biometric identification metrics (Fig. 7). Validation of the augmentation algorithms was performed on a publicly available dataset containing gait recorded on substrates such as pavement, flat even, grass, and cobblestone.

The results of the experiments (Fig. 7) indicate that applying an augmentation mechanism is not always beneficial. The methods of Um et al. [4] based on rotation (Um et al. I) and rotation and permutation (Um et al. II) significantly worsened the identification results for all examined substrates. It is speculated that, due to the large rotation ranges (±180°, Sect. 2), these methods generated samples that are not observable under real conditions. Although these techniques achieved very good identification rates in the field of Parkinson's disease monitoring, their use in the field of gait analysis is not recommended.

The approach of Iso et al. [3] produced significantly higher identification rates than the methods of Um et al. (Fig. 7). Due to the modeling of limited rotations, it ensures the generation of potentially observable samples. However, this approach does not allow modeling additional disturbances such as vibration or drift. It can be speculated that these factors contribute to the advantage of our proposed solution (Proposed II).

Regarding the proposed approach, which allows modeling rotation, vibration, and drift, significant differences in results can be observed depending on the chosen parameters. The Proposed I method (Offset: 7.5°, Noise: 0.1°, Drift: 3°) produced lower results than Proposed II (Offset: 3.5°, Noise: 0.2°, Drift: 4°) for each of the analyzed substrates. The selected parameters are certainly not optimal for the highest identification performance. However, the purpose of the study was to show the possibility of obtaining identification scores higher than the baseline, rather than to search for a local maximum.

A major advantage of the proposed solution is that the generated accelerometer and gyroscope signals are closely associated with each other. In the presented approach, the gyroscope and accelerometer signals are not perturbed independently, as is the case in [5]. Consequently, the generation of samples that are not observable in the real world is prevented. However, the proposed approach has two major drawbacks: computational complexity and the inability to amplify the accelerometer signal (similar to [3]). Further work is planned to create a new pre-processing block for the data augmentation module. It is expected to provide additional control by decomposing the accelerometer readings into components resulting from gravitational acceleration (orientation-dependent) and from actual motion. This would make it possible to simulate, e.g., changes in walking speed or accelerometer measurement noise. Such a solution is expected to increase the scores of gait-based biometric identification systems.