1 Introduction

Electromyography (EMG) is a well-known bioelectrical signal that provides muscle information through different features [1, 2]. The EMG signal can be analyzed to detect medical abnormalities and to identify human motions [3]. Presently, there are two types of EMG data acquisition methods: needle and surface electrodes [4, 5]. Compared with the needle electrode, the surface electrode can be applied without the assistance of medical doctors.

Previous studies revealed several important issues in the classification of EMG signals for practical applications, such as window length and sampling rate [6, 7]. Recent studies focused on the analysis of window length to obtain the optimal trade-off between classification accuracy and controller delay [7, 8]. Nevertheless, the selection of sampling rate is also important in EMG pattern recognition. In one study, Chen et al. suggested that a 400 Hz sampling rate could achieve comparable classification accuracy using time-domain (TD) features [4]. However, the relation between the sampling rate and the presence of noise remains uninvestigated for time–frequency distributions (TFD). A low sampling rate leads to fast computation, but the signal information is limited.

The surface EMG signal contains useful information, but it is often corrupted by various types of noise during recording. Current technology is immune to some noise sources, but various noises and artifacts remain unavoidable [9]. According to the literature, the presence of noise depends on several factors such as the subject's skin, body structure and blood flow velocity [10]. Hence, multiple types of noise can be found within the EMG signal. Various time–frequency signal processing methods have been applied to the EMG signal for noise reduction [3, 11, 12, 13]. However, noise and artifacts are still a challenging problem in the analysis of EMG signals. Noise is usually present in a real system, and it needs to be considered in the selection of sampling rate.

Recently, signal processing methods and machine learning techniques have attracted the attention of researchers [11, 14]. Among classification methods, the k-nearest neighbor (k-NN) algorithm has been commonly used [15, 16, 17]. The k-NN algorithm has recently shown promising classification results with low computation cost [17]. On the other hand, the support vector machine (SVM) with a Gaussian radial basis function kernel is widely used because it offers excellent results in EMG studies [18].

The aim of this study is to investigate the relation between EMG pattern recognition and sampling rate using the spectrogram. First, the optimal window size of the spectrogram is evaluated and selected. Next, sampling rates of 256, 512 and 1024 Hz are evaluated. It was reported that a sampling rate lower than 1024 Hz can provide promising results in EMG signal classification [6]. However, noise evaluation was not included in that study. Therefore, additive white Gaussian noise (AWGN) is then added to the signal at 30, 25 and 20 dB SNR to evaluate the robustness of the EMG signal at different sampling rates. In this analysis, k-NN and SVM are employed to classify ten different hand movement types. In addition, the performance of the 256, 512 and 1024 Hz sampling rates under the interference of noise is compared.

2 Material and Methods

2.1 EMG Data Collection

Two wearable Shimmer EMG devices (Shimmer3 Consensys EMG Development Kits) with standard settings were used for data collection. The resolution was set at 24 bits with a gain of 12. The EMG signal was gathered from four hand muscles, namely extensor digitorum (ch1), flexor carpi radialis (ch2), extensor carpi radialis longus (ch3) and flexor carpi ulnaris (ch4), with two reference electrodes at the elbow. The signal was sampled at 1024 Hz and band-pass filtered between 20 and 500 Hz. The skin was shaved and cleaned with an alcohol pad before electrode placement. Surface electrodes with a 30 mm diameter were used, and the inter-electrode distance was set at 20 mm to reduce crosstalk. The bipolar electrode configuration followed the SENIAM recommendation and is shown in Fig. 1 [19, 20].

Fig. 1 Electrodes configuration

In this study, the EMG signals were recorded from ten healthy subjects (eight males, two females) with ages ranging from 24 to 47 years (mean age ± standard deviation: 28.6 ± 9.7 years). The subjects were given a detailed explanation of the experimental procedure and provided informed consent before starting the study.

During data acquisition, the subjects sat comfortably on a chair with the hand in a neutral position. The EMG data were collected as the subjects performed upper limb movements including thumb flexion (TF), thumb extension (TE), thumb-index (TI), thumb-middle (TM), thumb-ring (TR), thumb-little (TL), make fist (MF), wrist extension (WE), wrist flexion (WF) and relax (R), as shown in Fig. 2. The hand movement tasks were recommended by previous works and by the FlintRehab exercise guideline [21]. The experiment was partitioned into ten trials, and each trial consisted of ten different movements. Within each trial, the subjects were asked to maintain each movement for 5 s, followed by a resting state of 4 s. In addition, a 1-min rest period was introduced at the end of each trial to prevent muscle fatigue. At the end of the experiment, 40 EMG signals (10 trials × 4 channels) were collected for each movement from each subject.

Fig. 2 Hand movement tasks

Previous studies reported that the optimal window length for an EMG myoelectric prosthetic system was between 150 and 250 ms, in order to balance classification accuracy and controller delay [7, 22, 23]. In this research, the EMG data were segmented into 250 ms windows (256 samples at 1024 Hz) using a non-overlapping windowing technique [24]. Hence, 20 segments were obtained from each 5 s EMG signal.
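The segmentation arithmetic can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function name `segment_signal` is hypothetical, random numbers stand in for a real 5 s EMG recording, and any samples left over after the last full window are simply dropped.

```python
import numpy as np

FS_HZ = 1024          # sampling rate stated in the text
WINDOW_SAMPLES = 256  # 250 ms at 1024 Hz

def segment_signal(signal, window_samples=WINDOW_SAMPLES):
    """Split a 1-D signal into non-overlapping windows, dropping any remainder."""
    signal = np.asarray(signal, dtype=float)
    n_segments = signal.size // window_samples
    return signal[: n_segments * window_samples].reshape(n_segments, window_samples)

# Placeholder 5 s recording (random values stand in for real EMG).
rng = np.random.default_rng(0)
recording = rng.standard_normal(5 * FS_HZ)
segments = segment_signal(recording)
print(segments.shape)  # (20, 256)
```

Each row of `segments` is one 250 ms window, so a 5 s movement gives the 20 segments mentioned above.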

2.2 Data Resampling and Window Size Selection

All EMG data were processed on a computer with an Intel Core i5-3340 3.1 GHz processor and 8 GB of random access memory (RAM). In the first part of the experiments, spectrograms with Hanning window sizes of 16, 32, 64 and 128 are investigated. The window size that gives the best classification performance in EMG pattern recognition is selected.

Next, the effect of sampling rate on EMG pattern recognition is examined. The original EMG signal is down-sampled to 256 Hz and 512 Hz. For example, Matlab down-samples the EMG signal from 1024 Hz to 512 Hz by reducing the sampling rate by a factor of 2.
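A minimal Python sketch of integer-factor down-sampling is given below for illustration. It assumes that down-sampling means keeping every n-th sample; the text does not say which Matlab routine was used or whether an anti-aliasing filter was applied, so none is applied here. The helper name `downsample` and the placeholder signal are assumptions made for this example.

```python
import numpy as np

def downsample(signal, factor):
    """Keep every `factor`-th sample, starting from the first one."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return np.asarray(signal)[::factor]

FS_ORIGINAL_HZ = 1024
x = np.arange(FS_ORIGINAL_HZ)      # one second of placeholder samples
x_512 = downsample(x, 2)           # 1024 Hz -> 512 Hz
x_256 = downsample(x, 4)           # 1024 Hz -> 256 Hz
print(len(x_512), len(x_256))      # 512 256
```

With this scheme a 250 ms segment holds 256, 128 and 64 samples at 1024, 512 and 256 Hz, respectively.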

2.3 Spectrogram

The spectrogram is applied to the EMG data, a 2000 × 4 matrix of segments (20 segments per movement × 10 movements × 10 trials, by 4 channels) for each subject. The application of time–frequency distributions such as the spectrogram is not new to EMG feature extraction [11]. However, the spectrogram is one of the most fundamental signal processing tools in noise reduction. In addition, the spectrogram is easy to implement [25]. The spectrogram is computed as the squared magnitude of the short-time Fourier transform (STFT) [26]. It represents the energy distribution of the signal. Mathematically, the spectrogram can be expressed as:

$$S(t,f) = \left| \int\limits_{-\infty}^{\infty} x(\tau)\, w(\tau - t)\, e^{-j2\pi f\tau}\, d\tau \right|^{2},$$
(1)

where x(τ) is the input signal and w(τ − t) is the Hanning window function.
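A discrete version of Eq. (1) can be sketched in Python with NumPy as follows. This is a hedged illustration rather than the implementation used in the study: the function name `spectrogram`, the 50% overlap between successive windows and the 128 Hz test tone are all assumptions, since the text specifies only the Hanning window and its size.

```python
import numpy as np

def spectrogram(x, window_size=32, hop=None):
    """Squared magnitude of a discrete STFT with a Hann window.

    Returns an array of shape (n_freq_bins, n_frames).
    """
    x = np.asarray(x, dtype=float)
    hop = window_size // 2 if hop is None else hop  # assumed 50% overlap
    window = np.hanning(window_size)
    n_frames = 1 + (x.size - window_size) // hop
    frames = np.stack([x[i * hop : i * hop + window_size] * window
                       for i in range(n_frames)])
    stft = np.fft.rfft(frames, axis=1)   # one spectrum per frame
    return (np.abs(stft) ** 2).T

FS_HZ = 1024
t = np.arange(256) / FS_HZ
tone = np.sin(2 * np.pi * 128 * t)       # 128 Hz test tone, not EMG
S = spectrogram(tone, window_size=32)
print(S.shape)                            # (17, 15)
peak_bin = int(np.argmax(S.mean(axis=1)))
print(peak_bin * FS_HZ / 32)              # 128.0
```

A window of 32 samples at 1024 Hz gives a frequency resolution of 32 Hz, so a shorter window improves time resolution at the expense of frequency resolution, which is the trade-off behind the window size selection.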

2.4 Noise Evaluation

Previous studies revealed that the surface EMG is often corrupted by white Gaussian noise, baseline noise and movement artifacts [9, 13]. Note that noise sources often contain spectral information in the low-frequency components of the EMG frequency spectrum [9]. In order to examine the robustness and performance of the EMG signal, additive white Gaussian noise (AWGN) is added to the signal (256, 512 and 1024 Hz) at 30, 25 and 20 dB SNR before data segmentation. The noisy surface signal can be defined as:

$$s(t) = x(t) + awgn(t),$$
(2)

where x(t) is the input signal and awgn(t) is the additive white Gaussian noise.
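Eq. (2) can be sketched in Python as below. The sketch assumes that the SNR is defined relative to the measured power of the input signal, which the text does not state explicitly; the function name `add_awgn` and the random placeholder signal are likewise assumptions for illustration.

```python
import numpy as np

def add_awgn(x, snr_db, rng=None):
    """Return x plus white Gaussian noise scaled to the requested SNR in dB."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    signal_power = np.mean(x ** 2)                      # measured signal power
    noise_power = signal_power / (10 ** (snr_db / 10))  # SNR = Ps / Pn
    noise = rng.normal(0.0, np.sqrt(noise_power), size=x.shape)
    return x + noise

rng = np.random.default_rng(0)
clean = rng.standard_normal(50_000)       # placeholder for an EMG recording
for snr_db in (30, 25, 20):
    noisy = add_awgn(clean, snr_db, rng)
    # Recover the SNR actually achieved, as a sanity check.
    measured = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
    print(snr_db, round(measured, 1))
```

Lowering the SNR from 30 to 20 dB multiplies the noise power by ten, which is why the degradation between these levels can be substantial.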

Figure 3 shows the spectrogram of the 1024 Hz signal at 30, 25 and 20 dB SNR. As can be seen, the time–frequency information becomes less clear as the SNR decreases, which indicates that the spectral information available in the spectrogram is reduced.

Fig. 3 Spectrogram of 1024 Hz sampling rate at 30, 25 and 20 dB SNR

2.5 Feature Extraction

The spectrogram is an m × n matrix of coefficients that represents the signal in two dimensions. In order to reduce the dimensionality, the average instantaneous energy is extracted from the spectrogram coefficients. According to the literature, the instantaneous energy showed excellent results in investigating the characteristics of biosignals [27]. The average instantaneous energy can be represented as:

$$E_{i} = \frac{1}{T}\int\limits_{0}^{T} \int\limits_{0}^{f_{\rm max}} S_{i}(t,f)\,df\,dt ,$$
(3)

where E_i is the average instantaneous energy of the i-th data segment, S_i is its spectrogram, T is the duration of the segment and f_max is the upper frequency limit of the integration.
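In discrete form, Eq. (3) reduces to summing the spectrogram over frequency and averaging over time frames. The Python sketch below illustrates this under the assumption that the spectrogram is stored with frequency bins as rows and time frames as columns; the function name and the `freq_step_hz` parameter are hypothetical and used only for this example.

```python
import numpy as np

def average_instantaneous_energy(S, freq_step_hz=1.0):
    """Discrete form of Eq. (3): integrate over frequency, then average over time.

    S has shape (n_freq_bins, n_frames).
    """
    S = np.asarray(S, dtype=float)
    energy_per_frame = S.sum(axis=0) * freq_step_hz   # integral over f
    return float(energy_per_frame.mean())             # (1/T) * integral over t

# Toy 3 x 2 spectrogram (3 frequency bins, 2 time frames).
S_toy = np.array([[1.0, 3.0],
                  [2.0, 2.0],
                  [3.0, 1.0]])
print(average_instantaneous_energy(S_toy))  # 6.0
```

Each segment and channel is thus reduced to a single scalar, which keeps the feature dimension low before the dimensionality reduction step.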

2.6 Dimensionality Reduction

Commonly, a TFD feature vector not only has a high feature dimension but also increases the computation time of a classifier [22]. In such a case, a dimensionality reduction technique is applied to map the feature vector into a lower dimensional space, which also reduces the computation time. In recent years, principal component analysis (PCA) has been widely used in feature extraction and dimensionality reduction [28]. PCA is a technique that transforms the feature matrix statistically and captures the correlation between variables in the data [17]. In addition, PCA is an unsupervised dimensionality reduction method that projects the data using the eigenvectors of the covariance matrix [29]. In this analysis, the first three principal components (PCs) are used as the input to the classifier. The transformation can be represented as:

$$T_{L} = XW_{L} ,$$
(4)

where X is the feature matrix, W_L is the matrix whose columns are the first L eigenvectors, T_L is the output matrix containing the first L components, and L is 3 in this study.
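The projection in Eq. (4) can be sketched with an eigendecomposition of the covariance matrix. The following Python code is an illustrative sketch only: mean-centering of the features before projection is assumed, the function name `pca_project` is hypothetical, and a random 200 × 4 matrix stands in for the real feature matrix.

```python
import numpy as np

def pca_project(X, n_components=3):
    """Project rows of X onto the leading eigenvectors of the covariance matrix."""
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)                 # assumed preprocessing step
    cov = np.cov(X_centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    W_L = eigvecs[:, order]                         # columns = first L eigenvectors
    return X_centered @ W_L, W_L

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4))    # placeholder: one feature per channel
T_L, W_L = pca_project(X, n_components=3)
print(T_L.shape)                     # (200, 3)
```

The columns of `T_L` are ordered by decreasing variance, so keeping the first three retains the directions that explain most of the spread in the features.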

2.7 Machine Learning Method

Presently, k-nearest neighbor (k-NN) is a popular machine learning method due to its simplicity and speed [15, 17]. According to the literature, the value of k in k-NN must be carefully selected according to the model specification [16, 17]. In this work, a distance weight is employed instead of tuning the k-value. The weight can be represented as:

$$weight = 1/(d_{st} )^{2} .$$
(5)

The Euclidean distance is used as the distance metric, and it can be defined as:

$$d_{st} = \sqrt {(x_{s} - y_{t} )(x_{s} - y_{t} )^{\prime}} .$$
(6)
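Eqs. (5) and (6) can be combined into a distance-weighted vote, sketched below in Python. This is a hedged illustration, not the classifier configuration used in the study: the function name, the default k = 5, the small constant `eps` that guards against division by zero, and the toy two-dimensional training points are all assumptions.

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x_query, k=5, eps=1e-12):
    """Classify x_query by squared-inverse-distance voting among k neighbours."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    d = np.sqrt(((X_train - x_query) ** 2).sum(axis=1))   # Eq. (6)
    nearest = np.argsort(d)[:k]
    weights = 1.0 / (d[nearest] ** 2 + eps)               # Eq. (5); eps avoids 1/0
    labels = np.unique(y_train[nearest])
    votes = [weights[y_train[nearest] == c].sum() for c in labels]
    return labels[int(np.argmax(votes))]

# Toy data: two "relax" (R) points and three "make fist" (MF) points.
X_train = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0], [0.9, 1.0]])
y_train = np.array(["R", "R", "MF", "MF", "MF"])
print(weighted_knn_predict(X_train, y_train, np.array([0.2, 0.0]), k=5))  # R
```

In this toy case an unweighted majority vote over all five neighbours would return MF (three votes against two), whereas the squared-inverse weighting lets the two nearby R points dominate, which illustrates why the weighting makes the result less sensitive to the choice of k.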

The support vector machine (SVM) is a supervised machine learning method used for classification [17]. SVM introduces a technique that extends the concept of hyperplane separation to data sets that cannot be separated linearly [5, 28]. In this study, SVM with the Gaussian kernel function is applied. The kernel function is used in the hyperplane as the inner product of the nonlinear mapping function. The Gaussian kernel can be expressed as:

$$K(x,x_{i} ) = \exp \left( { - \frac{{||x - x_{i} ||^{2} }}{{2\sigma^{2} }}} \right),$$
(7)

where ||x − xi|| is the Euclidean distance between feature vectors and σ is the kernel parameter.
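For illustration, Eq. (7) can be evaluated directly as in the Python sketch below. The function name `gaussian_kernel` and the example vectors and σ values are assumptions chosen for this example; the value of σ used in the study is not stated in this section.

```python
import numpy as np

def gaussian_kernel(x, x_i, sigma=1.0):
    """Eq. (7): exp(-||x - x_i||^2 / (2 sigma^2))."""
    diff = np.asarray(x, dtype=float) - np.asarray(x_i, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

print(gaussian_kernel([1.0, 2.0], [1.0, 2.0]))             # 1.0
print(gaussian_kernel([0.0, 0.0], [3.0, 4.0], sigma=5.0))  # exp(-0.5), about 0.6065
```

The kernel equals 1 for identical vectors and decays towards 0 as the distance grows, with σ controlling how quickly the similarity falls off.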

3 Result and Discussion

The EMG data are collected from ten subjects, and the feature extraction and dimensionality reduction techniques are applied after the spectrogram to observe the relation between sampling rate and classification error. In this study, tenfold cross-validation is used for performance evaluation. The data are randomly partitioned into 10 equal subsets. Each subset is used for testing in succession, while the remaining subsets are used for training [5, 17]. The results averaged over the ten folds are used for performance comparison.
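The partitioning used in tenfold cross-validation can be sketched in Python as follows. This is an assumed, simplified version: the generator name `k_fold_indices` and the fixed random seed are hypothetical, and the sketch does not stratify by movement class, which the text does not specify. The sample count of 2000 corresponds to the number of segments per subject.

```python
import numpy as np

def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs from a random equal partition."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for i, test_idx in enumerate(folds):
        # All folds except the i-th one form the training set.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, test_idx

splits = list(k_fold_indices(2000, n_folds=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 10 1800 200
```

Every segment is used for testing exactly once, and the accuracy reported for a subject is the mean over the ten test folds.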

In the first part of the experiments, four different Hanning window sizes are evaluated and compared. In Fig. 4, one can see that the window size of 32 achieves the highest mean classification accuracy. Thus, the window size of 32 is selected and applied in the rest of the study.

Fig. 4 Mean classification accuracy according to the change in window size (16, 32, 64 and 128) of ten subjects using SVM

For the second part of the experiments, the performance of three different sampling rates is evaluated. The mean classification accuracies of the 256, 512 and 1024 Hz sampling rates are shown in Fig. 5. From the results, the 1024 Hz sampling rate achieved the highest classification accuracy, followed by 512 Hz and 256 Hz. Whichever machine learning method is used, the classification accuracy decreased as the sampling rate decreased. With SVM, the classification accuracy decreased from 92.68% (1024 Hz) to 91.92% (512 Hz) and 87.79% (256 Hz). Fig. 5 thus indicates that a higher sampling rate can produce better classification results. However, Table 1 shows that the computation time increases at higher sampling rates.

Fig. 5 Mean classification accuracy of three different sampling rates across ten subjects

Table 1 Signal processing computation time (per signal) at different sampling rate

In the final part of the experiments, the robustness of the proposed sampling rates is tested by adding AWGN at 30, 25 and 20 dB SNR to the EMG signals. Noise and artifacts are often the main challenge due to the large number of recording electrodes [13]. Thus, the selection of sampling rate under noisy conditions is equally important, as it has a great impact on classification performance. As can be seen in Tables 2 and 3, the classification accuracy decreases as the SNR falls. As expected, the 1024 Hz sampling rate achieves the highest classification accuracy with both k-NN and SVM. From Table 2, in the absence of added noise, the 512 Hz sampling rate showed a decrement of 0.76% in classification accuracy compared to 1024 Hz using SVM. This result indicates that a sampling rate lower than 1024 Hz can still allow accurate classification of ten different hand movements. However, when AWGN is added to the signal, the 256 and 512 Hz sampling rates face difficulties in pattern recognition, especially at lower SNR.

Table 2 Classification accuracy (mean ± SD) of 256, 512, and 1024 Hz sampling rate at different SNR for SVM
Table 3 Classification accuracy (mean ± SD) of 256, 512, and 1024 Hz sampling rate at different SNR for k-NN

In a real system, noise and artifacts are usually present, so the analysis of EMG signals with noise is preferred. From the results, the reduction of classification accuracy at the different sampling rates is evident when AWGN is added. At 20 dB SNR, the mean classification accuracies of the 256 and 512 Hz sampling rates fall below 50% with k-NN. In contrast, the 1024 Hz sampling rate maintains a high classification accuracy of above 80% at 30 dB SNR with both SVM and k-NN. In comparison with the 1024 Hz sampling rate, 512 Hz shows reductions of 4.88% (30 dB), 6.86% (25 dB) and 8.98% (20 dB) in mean classification accuracy with SVM. This reduction implies that the 512 Hz sampling rate is less robust than 1024 Hz. More importantly, the classification accuracy at 20 dB SNR was relatively poor for both k-NN and SVM. This might be because the AWGN has greatly corrupted the EMG signals in the process of noise addition.

To compare the classification accuracy obtained by different classifiers, F-measure is used, and it can be calculated as:

$$F{ - }measure = \frac{2TP}{2TP + FP + FN},$$
(8)

where TP is the number of true positives, FP is the number of false positives and FN is the number of false negatives; the number of true negatives (TN) does not enter the F-measure.
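Eq. (8) is straightforward to compute from the confusion counts, as the Python sketch below shows. The function name `f_measure` is hypothetical and the counts in the example are invented for illustration; they are not taken from Table 4.

```python
def f_measure(tp, fp, fn):
    """Eq. (8): 2TP / (2TP + FP + FN)."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        raise ValueError("F-measure is undefined when TP, FP and FN are all zero")
    return 2 * tp / denominator

# Hypothetical counts for one movement class (not taken from the paper).
print(f_measure(tp=90, fp=10, fn=10))  # 0.9
```

For a ten-class problem such as this one, the measure would typically be computed per class and then averaged, although the averaging scheme is not stated in the text.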

In previous research, most researchers made use of different classifiers for the classification of EMG signals [5, 18, 22, 30]. The EMG data acquisition technique is important for achieving accurate EMG classification results. Therefore, the F-measure is calculated to measure the performance of SVM and k-NN. Table 4 outlines the F-measure results. When AWGN is added to the signal, the performance of both the SVM and k-NN classifiers degrades. From Table 4, it is observed that SVM obtains a higher F-measure value than k-NN. Hence, SVM is the better classifier in this performance evaluation owing to its higher F-measure value.

Table 4 Result of F-measure for SVM and k-NN

Previous studies suggested that a sampling rate between 400 and 500 Hz is optimal in EMG pattern recognition [4, 6]. However, noise evaluation was not included in those studies. Additionally, most experiments are done in a laboratory to minimize noise from the environment. Noise evaluation should therefore be part of sampling rate selection. The results presented in Tables 2 and 3 show that a sampling rate lower than 1024 Hz was not suitable for accurate classification of ten movement classes in a real system. The performance of the 256 and 512 Hz sampling rates becomes unstable at lower SNR. In terms of computation cost, the 1024 Hz sampling rate requires 0.0073 s more than 512 Hz. The major drawback of a higher sampling rate is the longer computation time. In sum, it can be inferred that the 1024 Hz sampling rate is more suitable for EMG pattern recognition. In a real system, down-sampling to a sampling rate lower than 1000 Hz is not recommended.

There are several limitations in the present study. First, only three sampling rates (256, 512 and 1024 Hz) are considered in this work. From the observed results, it may be inferred that sampling rates higher than 1024 Hz, such as 2048 and 4096 Hz, could provide better classification performance in EMG pattern recognition, although this was not tested here. However, a higher sampling rate will further increase the computational complexity, which might increase the delay duration. Second, the wireless Shimmer EMG sensor might not be shielded against power line interference, environmental noise and motion artifacts. These artifacts can badly degrade the quality of EMG signals. To improve the quality of the recorded signal, as well as the classification performance, a better wired EMG device should be used. Third, we found that the classification performance of 1024 Hz at 20 dB SNR is poor. A possible reason is that spectrogram-based feature extraction does not work very well at lower SNR. As for future work, other popular feature extraction methods such as the discrete wavelet transform (DWT) can be considered.

4 Conclusion

In this paper, the relation between EMG pattern recognition and sampling rate using the spectrogram is investigated. The optimal window size of the spectrogram is evaluated and selected before performance evaluation. The contribution of this study is the investigation of the classification accuracy of EMG signals at different sampling rates when noise is involved. The results revealed that a sampling rate lower than 1024 Hz can achieve high classification accuracy with the spectrogram. However, when noise is present, lower sampling rates showed a significant difference in classification error rate compared to 1024 Hz. Considering the computation time, performance, efficiency and noise evaluation, a sampling rate higher than 1000 Hz may be a better choice for EMG signal pattern recognition.