1 Introduction

Around ten million people worldwide are reported to be amputees, of whom three million are arm amputees. Approximately half a million amputees are reported in India, with more than 23,500 cases reported every year [22]. The majority of these cases belong to the low-income working-age group, and these accidents affect their lives tremendously. Significant advances in prosthetic limbs have been reported by the medical community and in the robotics area using electroencephalogram (EEG), electromyography (EMG), and surface electromyography (sEMG) signals. Exoskeleton prosthetic limbs with myoelectric control systems can help people with amputated limbs to perform daily life activities such as basic hand gestures.

Biomedical signals capture vital information regarding the functioning of the human body and are used extensively to diagnose various pathological conditions. Some of these signals carry similar information, and the choice of biomedical signal depends on the application at hand. EMG and EEG data acquisition is not as convenient and user-friendly as sEMG acquisition; sEMG effectively captures the required muscular information and can therefore be used in hand gesture detection. The sEMG signals are collected in a non-invasive manner and capture the neuromuscular activity in the form of an electrical signal. Research and technological advances in biomedical sensors and devices such as the Myo armband have created an opportunity for researchers to explore these signals for a variety of applications related to brain-computer interface devices. sEMG signals can be used to develop healthcare devices that assist people with amputated limbs and patients with neurodegenerative diseases in their daily activities. Also, depending on the extent of damage to the amputated limb, the sEMG-based assistive device can be manufactured with different degrees of freedom.

The authors in [4] used autoregression coefficients, Hjorth features, integral absolute value, mean absolute value, root mean square, and cepstral features to classify ten hand movements using myoelectric signals. An average accuracy of \(92.3\%\) was obtained with multiclass support vector machines (SVMs) with a radial basis function kernel. Vasanthi and Jayasree [35] computed various time domain features and compared the results obtained with machine learning algorithms, deep learning networks, ANN, and cascaded feed forward ANN. Here, the support vector machine classifier gave the best result of \(98.88\%\). The authors in [25] used a support vector machine to classify fifteen hand gestures using sEMG signals collected using eight sensors. The best accuracy of \(79.36\%\) was obtained using a radial basis function kernel.

Ahsan et al. [1] computed the root mean square value, standard deviation, variance, mean absolute value, waveform length, zero-crossings, and slope sign changes to train an artificial neural network (ANN) to detect four hand movements collected from three subjects. ANNs have been explored by various authors; for example, in [17] a neural network was trained with signals collected from a number of subjects to classify four hand gestures. An ANN has also been used by Zhang et al. [39] to develop a real-time hand gesture identification algorithm. The algorithm classifies five hand gestures collected from twelve subjects with an average classification accuracy of \(98.7\%\). In [28], the sEMG signal has been used for hand movement recognition for a bionic hand.

Geng et al. [15] showed that the instantaneous values of high-density sEMG can be effectively used for hand movement recognition. The sEMG images of eight hand movements were used with a deep convolutional network, and an accuracy of \(99\%\) was obtained using majority voting over 40 frames. In [26], a hand gesture recognition algorithm that is robust to different arm postures has been proposed. To achieve this, the authors collected EMG signals together with signals from an accelerometer. Features such as the average value and the waveform durations are used to classify eight hand gestures based on maximum likelihood estimation. The tunable Q-factor wavelet transform (TQWT) has been used in [23, 33] to decompose the sEMG signal. In [23], a TQWT-based filter bank was developed and Kraskov entropy was computed from each sub-band signal. Subasi and Qaisar [33] used the mean absolute value, average power, standard deviation, skewness, and kurtosis of the coefficients obtained from the sub-band signals, and the absolute mean value ratios of the neighbouring sub-band signals.

In [21], intrinsic mode functions (IMFs) are obtained using empirical mode decomposition (EMD) for four-channel sEMG signals collected for seven hand gestures of thirty subjects. A deep convolutional network based on ResNet is then used with the first three IMFs to obtain the required identification. EMD is a popular choice for non-stationary signals such as biomedical signals. The authors in [27, 38] have also used EMD to decompose sEMG signals for hand movement classification. Sapsanis et al. [27] used various time domain features such as the mean of the absolute values of the signal, number of slope sign changes, waveform length, and number of zero-crossings, and statistical features such as variance, kurtosis, and skewness. The algorithm was validated on a publicly available dataset, and an average accuracy of \(89.21\%\) was obtained for classifying six hand movements of five subjects. Yan et al. [38] used autoregressive (AR) model parameters obtained for each IMF and classified four hand gestures using least squares support vector machines. In [37], variational mode decomposition (VMD) has been used to represent the sEMG signals as variational mode functions (VMFs). A composite permutation entropy index is computed from each of these VMFs, and machine learning algorithms are then employed to classify the hand gestures. The Fourier decomposition method (FDM) has shown its efficacy in many applications such as detection of sleep apnoea events [12], modelling, audio signal processing [11], and ECG and EEG signal analysis [10, 13, 14, 31].

In this work, we compare the performance of popular signal decomposition techniques, namely VMD, EMD, and the discrete wavelet transform (DWT). Each sEMG signal is decomposed into multi-scale components, and time-domain and statistical features including mean, variance, skewness, kurtosis, and Renyi entropy are computed for each sub-band signal. Different machine learning algorithms are then used to classify the feature space. A freely available dataset from the UCI machine learning repository has been used in this work to test the hand gesture classification algorithms based on each decomposition scheme.

The paper is presented in five sections. Section 2 provides a detailed discussion on the dataset, and Sect. 3 presents the proposed algorithm. Simulation studies and conclusions are presented in Sects. 4 and 5, respectively.

2 Dataset

The dataset used here is acquired from the UC Irvine machine learning repository, under the name “sEMG for Basic Hand movements Data Set”. It includes two databases, where the first contains sEMG signals collected from two male and three female participants. The subjects considered in the study do not have an amputated limb and can thus be treated as a sample from the healthy population. Each subject performs six hand movements, namely tip (TI), spherical (SP), lateral (LA), palmar (PA), hook (HO), and cylindrical (CY). Each gesture is repeated 30 times. The sEMG signals in the dataset have been acquired using a two-channel programming kernel of National Instruments (NI) LabVIEW. The sEMG signals have been de-noised using frequency selective filters, and the signal obtained after processing lies between 15 and 500 Hz.

The second database includes the sEMG signals acquired over three days from one healthy male participant for six hand grasps. Each movement is conducted a hundred times over three consecutive days. Unlike the first, this database can be used to test the time invariance property of the hand movement recognition algorithm.

3 Methodology

The machine learning-based algorithm developed in this paper consists of decomposing the de-noised sEMG signals using multi-scale decomposition techniques and extracting features from the sub-band signals so obtained, as shown in Fig. 2. Different machine learning algorithms are then trained using the feature set. In the dataset considered in this work, the sEMG signal has been collected using two channels. We could either take the correlation of these channels as the single input to the proposed scheme, as done in [23], or consider the channels individually, which gives a feature vector in a higher-dimensional space; the latter approach is adopted in this work. The sEMG signals are represented as multi-scale components using three algorithms: VMD, EMD, and DWT.

EMD was proposed by Huang et al. [19] as an adaptive time-frequency analysis algorithm for non-stationary and nonlinear signals. EMD decomposes the signal into a finite number of multi-scale components termed intrinsic mode functions (IMFs). The set of IMFs forms a complete basis for the given signal, and each IMF should fulfil two conditions. First, the number of extrema and the number of zero-crossings should be equal or differ by at most one. Second, at any instant, the mean value of the envelope defined by the local maxima and the envelope defined by the local minima should be zero. EMD has been employed in numerous signal processing applications such as denoising, pattern recognition, neuroscience, financial time series prediction, and ocean and seismic data analysis [3, 7, 16, 18, 32]. However, EMD is not robust to noise and suffers from sifting issues; moreover, it is based on an empirical algorithm rather than on a mathematical formulation. In order to overcome these limitations, various authors have presented variants of EMD [5, 36].
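The first IMF condition can be checked numerically. The following is a minimal Python sketch of such a check, not the sifting procedure of [19]; the helper names `count_zero_crossings`, `count_extrema`, and `satisfies_first_imf_condition` are hypothetical and introduced only for illustration. The second condition (zero mean envelope) requires envelope interpolation and is not covered here.

```python
import math

def count_zero_crossings(x):
    """Count sign changes between consecutive non-zero samples."""
    signs = [1 if v > 0 else -1 for v in x if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def count_extrema(x):
    """Count strict local maxima and minima among the interior samples."""
    count = 0
    for prev, cur, nxt in zip(x, x[1:], x[2:]):
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            count += 1
    return count

def satisfies_first_imf_condition(x):
    """True if the extrema and zero-crossing counts differ by at most one."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1

# A pure tone oscillates symmetrically about zero (5 cycles, 200 samples).
n = 200
tone = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]

# Adding a constant offset removes every zero crossing but keeps the extrema.
offset_tone = [v + 2.0 for v in tone]

print(satisfies_first_imf_condition(tone))         # True
print(satisfies_first_imf_condition(offset_tone))  # False
```

The offset example illustrates why a component riding on a slow trend is not an IMF until the trend has been sifted out.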

VMD decomposes a signal into intrinsic modes known as variational mode functions (VMFs). It was proposed in 2015 in [9] to improve the noise and sampling properties of EMD. Unlike EMD, it is a non-recursive adaptive algorithm that obtains the VMFs concurrently, such that backward error can be taken into account. VMD decomposes the given signal into a finite number of narrowband signals such that the VMFs reconstruct the given signal exactly or in the least squares sense. The VMFs, \(v_i(t)\), of a continuous time finite energy signal x(t) are given as

$$\begin{aligned} x(t)=\sum _{i}{v_i(t)}=\sum _{i}{A_{i}(t)\cos (\phi _{i}(t))} \end{aligned}$$
(1)

where \(A_{i}(t)\) is the instantaneous amplitude and \(\phi _{i}(t)\) is the instantaneous phase of \(v_{i}(t)\). Here, each \(v_{i}(t)\) is sparse with specific properties. For more details, refer to [9].

DWT decomposes the given signal into dyadic sub-band signals. Unlike EMD and VMD, DWT is not a signal-adaptive algorithm. DWT has been used by various researchers in varied applications including denoising, feature extraction, and image processing [8, 24, 29]. Researchers have used DWT for multi-scale modelling of various stationary and cyclostationary signals, and it has also been explored for non-stationary and nonlinear signals. If \(\psi [n]\) is a wavelet with support in \(\left[ -K/2, K/2 \right] \), a discrete wavelet scaled by \(a^j\) is expressed as

$$\begin{aligned} \psi _{j}\left[ n\right] =\frac{1}{\sqrt{a^{j}}}\psi \left( \frac{n}{a^j} \right) , \qquad 1 \le a^j \le NK^{-1} \end{aligned}$$
(2)

The discrete scaling filter, \(\phi _{j}\left[ n \right] \) is defined as

$$\begin{aligned} \phi _{j}\left[ n\right] =\frac{1}{\sqrt{a^j}}\phi \left( \frac{n}{a^j} \right) \end{aligned}$$
(3)

DWT successively decomposes each approximation \(a_j\in V_j\) into a coarser approximation \(a_{j+1} \in V_{j+1}\) and the detail coefficients \(d_{j+1}{\in } W_{j+1}\). \(\{\phi _{j,n}\}_{n\in \mathbb {Z}}\) and \(\{\psi _{j,n}\}_{n\in \mathbb {Z}}\) are orthonormal bases of \(V_j\) and \(W_{j}\), respectively. The approximation coefficients and detail coefficients of level \(j+1\), represented as \(a_{j+1}\) and \(d_{j+1}\), respectively, are obtained using the following equations:

$$\begin{aligned} a_{j+1}\left[ n\right] =a_{j}*h[2n]\end{aligned}$$
(4)
$$\begin{aligned} d_{j+1}\left[ n\right] =a_{j}*f[2n] \end{aligned}$$
(5)

where “\(*\)” denotes convolution, h[n] is the impulse response of low-pass filter, H(z), and f[n] is the impulse response of the high-pass filter, F(z) as shown in Fig. 1.
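Equations (4) and (5) amount to filtering followed by downsampling by two. The sketch below illustrates one analysis level in plain Python. It assumes the two-tap Haar filters for brevity, whereas Sect. 4 reports that a Symlet wavelet was used; the function names `convolve` and `dwt_level` are hypothetical. The index at which downsampling starts is a convention that depends on the filter support.

```python
import math

def convolve(x, h):
    """Full linear convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for k, hk in enumerate(h):
            y[i + k] += xi * hk
    return y

def dwt_level(a_j, h, f):
    """One DWT analysis level: filter, then keep every second sample."""
    low = convolve(a_j, h)    # a_j * h, cf. Eq. (4)
    high = convolve(a_j, f)   # a_j * f, cf. Eq. (5)
    # Starting at index 1 pairs the samples as (x[0], x[1]), (x[2], x[3]), ...
    # for a causal two-tap filter; this offset is a convention choice.
    return low[1::2], high[1::2]

# Haar filters (an assumption made for this sketch only).
s = 1.0 / math.sqrt(2.0)
h = [s, s]     # low-pass impulse response h[n]
f = [s, -s]    # high-pass impulse response f[n]

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
a1, d1 = dwt_level(x, h, f)    # level-1 approximation and detail
a2, d2 = dwt_level(a1, h, f)   # level 2 is obtained from a1 only

# For an orthonormal filter pair the signal energy is preserved.
energy_in = sum(v * v for v in x)
energy_out = sum(v * v for v in a1) + sum(v * v for v in d1)
print(abs(energy_in - energy_out) < 1e-9)  # True
```

Repeating `dwt_level` on the approximation output yields the dyadic sub-band structure described in the text.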

Fig. 1 Block diagram of DWT

The narrowband components obtained using EMD, VMD, or DWT are then used to compute the features. Considering the performance of various time domain and frequency domain features, we have chosen the following time domain statistical features for the problem addressed in this work (Fig. 2).

  1. Mean value of the kth sub-band signal

    $$\begin{aligned} \mu _k=\frac{1}{L}\sum _{i=1}^{L}{s_k\left[ i \right] }, \end{aligned}$$
    (6)

    where \(s_k\left[ i \right] \) denotes the kth sub-band signal and L is the length of the signal.

  2. Variance of the kth sub-band signal

    $$\begin{aligned} \sigma _k^2=\frac{1}{L}\sum _{i=1}^{L}{(s_k[i]-\mu _{k})^2}, \end{aligned}$$
    (7)

  3. Skewness of the kth sub-band signal

    $$\begin{aligned} \text {Skewness}=\frac{1}{L}\sum _{i=1}^{L}{\left( \frac{{s_k\left[ i\right] -\mu _k}}{\sigma _k}\right) ^3}, \end{aligned}$$
    (8)

  4. Kurtosis of the kth sub-band signal

    $$\begin{aligned} \text {Kurtosis}=\frac{1}{L}\sum _{i=1}^{L}{\left( \frac{{s_k\left[ i\right] -\mu _k}}{\sigma _k}\right) ^4} \end{aligned}$$
    (9)

  5. Renyi entropy of the kth sub-band signal

    $$\begin{aligned} \text {Ent}=\frac{1}{1-\alpha }\log _{2}\left( \sum _{i=1}^{L}{p\left( {s_k\left[ i\right] }\right) ^\alpha }\right) , \end{aligned}$$
    (10)

    where \(p\left( s_k\left[ i\right] \right) \) is the discrete probability of \(s_k\left[ i\right] \), and \(\alpha \) is the order of the Renyi entropy, with \(\alpha \ge 0\) and \(\alpha \ne 1\).
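The five features can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the skewness and kurtosis are normalised by the signal length L, as in the usual definitions of the standardised moments, and the discrete probabilities in the Renyi entropy are estimated from a histogram, since the text does not specify the estimator. The function names, the order `alpha=2.0`, and the bin count `bins=8` are hypothetical choices.

```python
import math
from collections import Counter

def mean(s):
    return sum(s) / len(s)

def variance(s):
    """Population variance, normalised by the signal length L."""
    mu = mean(s)
    return sum((v - mu) ** 2 for v in s) / len(s)

def standardized_moment(s, order):
    """Skewness for order=3 and kurtosis for order=4."""
    mu = mean(s)
    sigma = math.sqrt(variance(s))
    return sum(((v - mu) / sigma) ** order for v in s) / len(s)

def renyi_entropy(s, alpha=2.0, bins=8):
    """Renyi entropy of order alpha from a histogram estimate of p."""
    lo, hi = min(s), max(s)
    width = (hi - lo) / bins or 1.0   # guard against a constant signal
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in s)
    probs = [c / len(s) for c in counts.values()]
    return math.log2(sum(p ** alpha for p in probs)) / (1.0 - alpha)

def sub_band_features(s):
    """Five features of one sub-band signal, in the order listed above."""
    return [mean(s), variance(s), standardized_moment(s, 3),
            standardized_moment(s, 4), renyi_entropy(s)]

# Toy sub-band signal, used only to exercise the functions.
s_k = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
features = sub_band_features(s_k)
```

For this toy signal each histogram bin holds exactly one sample, so the estimated distribution is uniform and the Renyi entropy equals \(\log _2 8 = 3\) bits for any admissible order.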

Fig. 2 Proposed methodology

The feature vectors thus obtained for both channels are used to train machine learning algorithms. The performance of machine learning-based recognition algorithms depends on the feature vector used and also on the machine learning algorithm selected. In the next section, to choose the best classifier, we compare various machine learning classifiers on the extracted feature set using standard classification performance metrics.
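One way to read this step is that the per-sub-band features of both channels are concatenated into a single vector for each trial. The sketch below illustrates that reading only; the function name `feature_vector` is hypothetical, and two stand-in features from the Python standard library replace the five features defined above to keep the example short.

```python
import statistics

# Stand-in feature functions; the feature set in the text has five entries.
FEATURES = [statistics.fmean, statistics.pvariance]

def feature_vector(channels):
    """Concatenate features over channels, then sub-bands, then features.

    channels: one entry per channel, each a list of sub-band signals.
    """
    return [feat(band)
            for bands in channels
            for band in bands
            for feat in FEATURES]

# Toy trial: two channels with three sub-band signals each.
trial = [
    [[1.0, 2.0, 3.0], [0.0, 0.0, 2.0], [4.0, 4.0, 4.0]],   # channel 1
    [[2.0, 4.0], [1.0, 3.0], [5.0, 7.0]],                  # channel 2
]
vec = feature_vector(trial)
print(len(vec))  # 12 = 2 channels x 3 sub-bands x 2 stand-in features
```

Under the same reading, five features on the first three sub-bands of two channels would give a 30-dimensional vector per trial.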

4 Numerical Results

We now discuss the simulation results obtained using the proposed algorithm. Table 1 presents the results obtained for classifying the six hand movements when the selected signal decomposition scheme is VMD. Here, the first three VMFs are used for feature extraction, as increasing the number of VMFs did not improve the recognition rate. A 10-fold cross-validation scheme has been used in this work with different machine learning algorithms: SVMs with linear, quadratic, cubic, and Gaussian kernels, ensemble bagged trees (EBT), k-nearest neighbour (kNN), ensemble subspace discriminant (ESD), and ensemble subspace kNN (ESkNN). The best accuracy obtained for Sub#1 is \(93.89\%\) using SVM cubic, \(96.11\%\) for Sub#2 with ESD, \(97.22\%\) for Sub#3 with EBT, \(94.44\%\) for Sub#4 with SVM cubic, and \(95.00\%\) for Sub#5 with EBT. The simulations have been carried out in MATLAB R2020b.
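The 10-fold protocol can be sketched as follows for one subject of the first database (six movements, 30 repetitions each, hence 180 trials). This is a minimal illustration in Python rather than the MATLAB workflow used in the paper; whether the original folds were stratified by class is an assumption, and the function name `stratified_folds` and the seed are hypothetical.

```python
import random

def stratified_folds(labels, k=10, seed=0):
    """Assign each trial index to one of k folds, balancing the classes."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)   # deal the trials out round-robin
    return folds

MOVEMENTS = ["TI", "SP", "LA", "PA", "HO", "CY"]
labels = [m for m in MOVEMENTS for _ in range(30)]   # 180 trials per subject
folds = stratified_folds(labels)

split_sizes = []
for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    # A classifier would be fitted on train_idx and scored on test_idx here.
    split_sizes.append((len(train_idx), len(test_idx)))

print(split_sizes[0])  # (162, 18)
```

Each fold holds 18 trials, three per movement. With 180 trials per subject, an accuracy of \(93.89\%\) is consistent with 169 correctly classified trials.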

Table 1 Performance comparison of several machine learning classifiers with 10-fold cross-validation for each subject using first three VMFs obtained with VMD

The results obtained using the EMD algorithm are presented in Table 2. The best results, as reported in the table, are obtained using the first two IMFs. The best accuracy obtained for Sub#1 is \(95.56\%\) using SVM linear and quadratic, \(97.78\%\) for Sub#2 with linear discriminant, \(97.78\%\) for Sub#3 with EBT, \(98.89\%\) for Sub#4 with SVM quadratic, and \(98.89\%\) for Sub#5 with SVM quadratic and linear.

Table 2 Performance comparison of several machine learning classifiers with 10-fold cross-validation for each subject using first two IMFs obtained using EMD

The results obtained using the DWT are shown in Table 3. The Symlet wavelet of order four has been used in the DWT. The best accuracy obtained for Sub#1 is \(95.56\%\) using EBT and ESD, \(98.33\%\) for Sub#2 with ESD, \(98.33\%\) for Sub#3 with EBT, \(98.33\%\) for Sub#4 with ESD, and \(98.89\%\) for Sub#5.

Table 3 Performance comparison of machine learning classifiers when the decomposition scheme used is DWT

Table 4 presents the confusion matrix obtained when the second database is considered and the EBT classifier is used. A classification accuracy of \(83.2\%\) is obtained in this case. The signals in the first database were acquired in a single session and therefore do not give an idea about the time variance property of the hand movement detection algorithm.

Table 4 Confusion matrix obtained for the second database using EBT classifier
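The overall accuracy follows from a confusion matrix as the sum of the diagonal entries divided by the total number of trials. The sketch below shows the computation on a hypothetical three-class matrix that is not taken from Table 4; rows are assumed to index the true class and columns the predicted class, and the function names are illustrative.

```python
def accuracy_from_confusion(cm):
    """Overall accuracy: correctly classified trials over all trials."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

def per_class_recall(cm):
    """Fraction of each true class (row) that was predicted correctly."""
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]

# Hypothetical toy matrix for illustration only, NOT the data of Table 4.
toy_cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]
overall = accuracy_from_confusion(toy_cm)   # 24 correct out of 30 trials
recalls = per_class_recall(toy_cm)
```

The per-class values show which movements are confused with each other, which the single overall figure hides.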

From Tables 1, 2 and 3, it is noted that for the chosen features and dataset, the performance of DWT is superior to that of VMD and EMD. Finally, we tabulate the results presented by various authors in the literature for the UCI dataset in Table 5. For the proposed framework, DWT performs better than VMD and EMD; however, the obtained accuracies are lower than those of the algorithms presented in [23, 30]. While [23] utilized a TQWT-based filter bank, [30] obtained better results using multichannel convolutional neural networks.
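The ranking of the three decomposition schemes can be reproduced from the best per-subject accuracies quoted in the text for Sub#1 to Sub#5. The short Python sketch below averages them; the variable names are illustrative.

```python
# Best per-subject accuracies (%) quoted in the text, Sub#1 to Sub#5.
best = {
    "VMD": [93.89, 96.11, 97.22, 94.44, 95.00],
    "EMD": [95.56, 97.78, 97.78, 98.89, 98.89],
    "DWT": [95.56, 98.33, 98.33, 98.33, 98.89],
}

averages = {name: round(sum(acc) / len(acc), 2) for name, acc in best.items()}
ranking = sorted(averages, key=averages.get, reverse=True)

print(averages)  # {'VMD': 95.33, 'EMD': 97.78, 'DWT': 97.89}
print(ranking)   # ['DWT', 'EMD', 'VMD']
```

These values agree with the average accuracies reported in Sect. 5. The margin between DWT and EMD is about 0.1 percentage points, so the ordering of those two schemes should be read with caution.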

Table 5 Performance comparison of the proposed algorithm with the existing hand movement recognition algorithms using common dataset

5 Conclusions

In this paper, the performance of the VMD, EMD, and DWT algorithms is compared for sEMG signal classification. The dataset used in the paper consists of sEMG signals collected from five healthy subjects for the six most commonly used hand gestures. Each sEMG signal is first decomposed into multiple sub-band signals using the VMD, EMD, or DWT algorithm. Time domain and statistical features are then computed for each narrowband constituent of the sEMG signal so obtained. The average of the best per-subject accuracies obtained with the machine learning algorithms is \(95.33\%\) with VMD, \(97.78\%\) with EMD, and \(97.89\%\) with DWT. The accuracy may be improved further with deep learning and ANN-based classifiers.