1 Introduction

The electromyography (EMG) signal is a bioelectrical signal generated by muscle activity during contraction. EMG is a non-invasive technique that carries information about the upper limb [1]. EMG data is collected by surface EMG (sEMG) sensors placed on the skin, and these sensors are used in both laboratory and field experiments. Muscular activity can be recorded, measured, and analyzed to support research areas such as biomechanics, rehabilitation, sports science, clinical diagnosis, and man–machine interfaces. EMG signals can also be used to develop interfaces for people with disabilities, for example electric wheelchairs, prosthesis control, robot control, and virtual worlds. In this work, prosthesis control based on non-invasive adaptive prosthetics is considered for classifying hand movements [2]. Recent research has proposed various approaches to hand prosthetics, such as hybrid, passive, and robotic hand prosthetics [3, 4]. The present work is based on an investigation of the robotic hand prosthetic involving 67 intact subjects and 11 transradial amputee subjects. Generally, upper limb hand prostheses are controlled using more than one EMG signal. The muscular activity of amputees involves many movements, which leads to various problems that need to be identified and isolated. To address these problems, classification performance can be enhanced by choosing suitable feature selection techniques [2, 3, 5].

Feature selection (FS) is an essential preprocessing step that removes redundant and irrelevant information and reduces both the number of features and the complexity of the classifier. There are three types of FS techniques: (i) filter techniques, (ii) wrapper techniques, and (iii) hybrid techniques. Filter techniques rely on statistical measures such as mutual information, distance, and correlation. Examples are correlation-based feature selection, information gain, ANOVA, principal component analysis (PCA), and so on [6]. Wrapper techniques, in contrast, search for the optimal feature subset of the original feature set. They include FS methods based on meta-heuristic algorithms such as the binary artificial bee colony algorithm (BABC), binary particle swarm optimization (BPSO), binary ant colony optimization (BACO), and the binary crow search algorithm (BCSA) [7,8,9,10]. Hybrid techniques combine filter and wrapper techniques to improve classification accuracy. Examples are hybrid serial GWO-WOA [11], TOPSIS-Jaya [12], and so on. Comparing the three, hybrid techniques are faster than wrapper techniques, while filter techniques have a simple structure and run fast. Because each type of technique can improve classification accuracy in different domains, feature selection remains of major interest to researchers.

In recent years, real-world problems have become increasingly challenging and therefore require more flexible, robust, and efficient feature selection algorithms. One of the most promising swarm algorithms for optimization-based feature selection is the artificial bee colony (ABC), which has great potential when properly modified. It was proposed by Karaboga [13], and interest in ABC has increased rapidly since then. ABC has been successfully applied to many fields, such as machine batch processing [14], the design of two-channel quadrature mirror filter banks [15], loudspeaker design problems [16], etc. In [17], a two-archive multi-objective ABC algorithm (TMABC-FS) is developed for the cost-sensitive FS problem in machine learning [16, 17]. ABC is capable of solving both local and global optimization problems, and several benchmark functions [18] have been used to validate its performance. According to the no free lunch theorem, no single meta-heuristic algorithm gives the best solution for all optimization problems. The standard ABC algorithm, however, needs to be altered for many practical applications because it balances the exploration and exploitation phases poorly during the search. The solution generated after each run by the solution search equation has an equal probability of being a positive or a negative result. To obtain effective results, the two phases need to be well balanced: excessive exploitation leads to premature convergence to local minima, and excessive exploration results in slow convergence. Gaussian artificial bee colony (GABC) [19] was therefore proposed to balance exploration and exploitation. It outperforms standard ABC by drawing random values from a Gaussian distribution rather than a uniform distribution.

Although GABC is better than the standard ABC, it still needs improvement in terms of convergence and exploration capability. We propose a beta artificial bee colony (BetaABC) algorithm that further increases performance by introducing the beta distribution into the search. The algorithm is intended to avoid premature or slow convergence, and in comparisons with ABC and GABC under different parameter settings it performs better in terms of global optimization. The key contributions of the proposed method are listed below:

  1. A novel variant of ABC, i.e., beta artificial bee colony (BetaABC), is proposed and validated on 10 benchmark functions.

  2. Binary beta artificial bee colony (BBABC)-based FS is proposed for EMG signal classification.

  3. DWT is employed to extract various features from the EMG signal.

  4. The proposed BBABC is applied for FS and classification of prosthetic hand movements using the EMG signal.

  5. Extensive experiments are performed to compare the proposed BBABC with BABC and BGABC.

2 Methods and Implementation

2.1 EMG Data

The raw EMG data is taken from the public non-invasive adaptive prosthetics (NinaPro) dataset for prosthetic hands [3]. In this study, the third version of the NinaPro dataset (DB3) is used, which comprises EMG signals acquired from 11 amputees performing 17 hand movements (Exercise A). Twelve EMG electrodes with a sampling frequency of 2000 Hz are used in the experiment. Each subject was asked to perform 6 repetitions of each movement, with 5 s of action and 3 s of rest.

2.2 Feature Extraction Using DWT Method

Discrete wavelet transform (DWT) is a widely used feature extraction method in signal processing that decomposes signals in the time–frequency domain. The signal is decomposed into multi-resolution coefficients by low-pass and high-pass filtering. The low-pass outputs form the approximation levels (A) and the high-pass outputs form the detail levels (D) [20]. Popular mother wavelets such as sym4, sym6, coif4, coif5, db4, db6, bior2.2, bior2.4, and bior4.4 have been used to extract features from raw EMG signals. Recent research shows that decomposition at the fourth level with db6 gives promising results in EMG signal classification. Therefore, the DWT method at the fourth decomposition level with db6 is used in the current work [21, 22]. In the literature, twenty-four popular statistical features have been used to extract valuable information from each wavelet coefficient, as given in Table 1 [22,23,24,25,26,27].

Table 1 Twenty-four statistical EMG features
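
The decomposition into approximation and detail coefficients at four levels can be sketched as follows. This is only an illustration under stated assumptions: the Haar wavelet is swapped in for db6 so that the example needs no external library, the two statistics (mean absolute value and root mean square) are illustrative stand-ins rather than the features of Table 1, and all function names are hypothetical.

```python
import math

def haar_step(signal):
    """One Haar analysis step: returns (approximation, detail) at half length."""
    s = 1.0 / math.sqrt(2.0)
    approx, detail = [], []
    # Samples are processed in pairs; an odd trailing sample is dropped.
    for x0, x1 in zip(signal[0::2], signal[1::2]):
        approx.append(s * (x0 + x1))   # low-pass output
        detail.append(s * (x0 - x1))   # high-pass output
    return approx, detail

def dwt_bands(signal, levels=4):
    """Collect A1..A4 and D1..D4 by repeatedly decomposing the approximation."""
    bands = {}
    current = list(signal)
    for level in range(1, levels + 1):
        current, detail = haar_step(current)
        bands[f"A{level}"] = current
        bands[f"D{level}"] = detail
    return bands

def mean_absolute_value(coeffs):
    return sum(abs(v) for v in coeffs) / len(coeffs)

def root_mean_square(coeffs):
    return math.sqrt(sum(v * v for v in coeffs) / len(coeffs))

def extract_features(signal, levels=4):
    """Compute the two illustrative statistics on every coefficient band."""
    feats = {}
    for name, coeffs in dwt_bands(signal, levels).items():
        feats[f"{name}_MAV"] = mean_absolute_value(coeffs)
        feats[f"{name}_RMS"] = root_mean_square(coeffs)
    return feats

# Synthetic 50 Hz tone sampled at 2000 Hz, standing in for one EMG window.
window = [math.sin(2 * math.pi * 50 * n / 2000) for n in range(256)]
features = extract_features(window)
```

With eight coefficient bands and two statistics, the sketch yields 16 values per channel; with the 24 statistics of Table 1 the same loop would give 8 × 24 values per channel.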

2.3 Background

2.3.1 Artificial Bee Colony

ABC consists of two components: bees (which decide on the position of food) and food sources (positions in the search space). It defines two types of behavior: nectar source identification and food source abandonment. The bees are categorized into three classes, namely employed, onlooker, and scout bees [13]. The employed bees visit identified food sources, while the scout bees search for new food sources around the hive. The onlooker bees stay at the hive and watch the employed bees in order to identify food sources. In the whole process, some bees are employed to search for food randomly around the hive. These bees collect nectar from identified food sources and deposit it in the hive. After depositing it, they share information about the quantity and quality of the nectar through various dances with the onlookers waiting in the hive [28]. The bee colony then enters a cycle of iterations with the following steps: (1) after the information is shared, an employed bee either becomes an onlooker once its food source is abandoned or continues to forage at the previously visited site; (2) onlookers in the hive follow the employed bees, based on the information received, to forage further on some memorized food sources; and (3) some scouts start a random search spontaneously [13, 28, 29]. The food sources are randomly initialized using the expression below:

$$a_{i} = l_{k} + \mathrm{rand}\left( {0,1} \right) \cdot \left( {u_{k} - l_{k} } \right)$$
(1)

where \(a_{i}\) is the ith solution in the population, k is a randomly selected parameter index, \(u_{k}\) and \(l_{k}\) are upper and lower bound constraints for the solution search space of objective function to be optimized. Onlooker bee selects the food sources influenced by employed bees based on the probability:

$$Pb_{i} = \frac{{\mathrm{fitness}_{i} }}{{\sum\limits_{n = 1}^{SN} \mathrm{fitness}_{n} }}$$
(2)

where SN is the number of food sources. To calculate the fitness values, the following equation is used:

$$\mathrm{fitness}_{i} = \left\{ {\begin{array}{*{20}l} {\frac{1}{1 + f}} \hfill & {\mathrm{if}\;f \ge 0} \hfill \\ {1 + \left| f \right|} \hfill & {\mathrm{if}\;f < 0} \hfill \\ \end{array} } \right.$$
(3)
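
Equations (1) to (3) can be illustrated with a short sketch. It assumes that Eq. (1) is applied independently to every coordinate of a food source and uses the sphere function as a stand-in objective; the function names are hypothetical and not taken from the original work.

```python
import random

def init_food_sources(n_sources, lower, upper, rng):
    """Eq. (1): draw every coordinate uniformly between its lower and upper bound."""
    return [[lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
            for _ in range(n_sources)]

def fitness(f):
    """Eq. (3): map an objective value f (to be minimized) to a positive fitness."""
    return 1.0 / (1.0 + f) if f >= 0 else 1.0 + abs(f)

def selection_probabilities(fitness_values):
    """Eq. (2): fitness-proportional probabilities used by the onlooker bees."""
    total = sum(fitness_values)
    return [v / total for v in fitness_values]

def sphere(x):
    """Stand-in objective (assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)

rng = random.Random(0)
sources = init_food_sources(5, [-5.0, -5.0], [5.0, 5.0], rng)
fits = [fitness(sphere(a)) for a in sources]
probs = selection_probabilities(fits)
```

Food sources with a lower objective value receive a higher fitness and are therefore more likely to be chosen by an onlooker.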

The following equation is used for greedy selection to update the solution:

$$\begin{aligned} & \left. {\begin{array}{*{20}l} {a = a_{\mathrm{new}} } \hfill \\ {f = f_{\mathrm{new}} } \hfill \\ \end{array} } \right\}\;\mathrm{if}\;\mathrm{fitness}_{\mathrm{new}} > \mathrm{fitness}_{i} (a_{i} ) \\ & a\;\text{and}\;f\;\text{remain the same if}\;\mathrm{fitness}_{\mathrm{new}} \le \mathrm{fitness}_{i} (a_{i} ) \\ \end{aligned}$$
(4)

where a is the food source and f is its objective function value. Onlookers explore the locations that seem promising and have a higher probability than other locations. The following equation is used to generate candidate food sources from the previously memorized ones:

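$$b_{i,j} = a_{i,j} + 2 \left( {r - 0.5} \right)\left( {a_{i,j} - a_{k,j} } \right)$$
(5)

where \(b_{i,j}\) is the candidate food source, \(r \in [0,1]\) is a uniformly distributed random number, \(k \ne i\) is the index of a randomly chosen food source, and j is the index of a dimension.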
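
The neighborhood search of Eq. (5) and the greedy selection of Eq. (4) can be sketched as below. The sketch assumes that only one randomly chosen dimension j is perturbed per candidate and that the partner index k differs from i; the helper names and the sphere objective are hypothetical.

```python
import random

def fitness(f):
    """Eq. (3): fitness of an objective value f."""
    return 1.0 / (1.0 + f) if f >= 0 else 1.0 + abs(f)

def neighbour(sources, i, rng):
    """Eq. (5): perturb one dimension j of source i relative to a partner k != i."""
    k = rng.choice([idx for idx in range(len(sources)) if idx != i])
    j = rng.randrange(len(sources[i]))
    r = rng.random()
    candidate = list(sources[i])
    candidate[j] = sources[i][j] + 2.0 * (r - 0.5) * (sources[i][j] - sources[k][j])
    return candidate

def greedy_select(current, candidate, objective):
    """Eq. (4): accept the candidate only if its fitness is strictly higher."""
    f_cur, f_new = objective(current), objective(candidate)
    if fitness(f_new) > fitness(f_cur):
        return candidate, f_new
    return current, f_cur

def sphere(x):
    """Stand-in objective (assumption)."""
    return sum(v * v for v in x)

rng = random.Random(1)
sources = [[1.0, 2.0], [-3.0, 0.5], [4.0, -1.0]]
cand = neighbour(sources, 0, rng)
kept, kept_f = greedy_select(sources[0], cand, sphere)
```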

2.3.2 Gaussian Artificial Bee Colony (GABC)

A common problem with meta-heuristic methods is a poor balance between exploitation and exploration capability, so they suffer either from very slow convergence due to excessive exploration or from premature convergence due to excessive exploitation [19]. In GABC, candidate food sources are instead generated as follows:

$$b_{i,j} = \left\{ {\begin{array}{*{20}l} {a_{i,j} + \phi_{i,j} \left( {a_{i,j} - a_{k,j} } \right) \cdot \theta \cdot \rho } & {\mathrm{if}\;r_{2} > z} \\ {a_{i,j} + \phi_{i,j} \left( {a_{i,j} - a_{k,j} } \right) \cdot 2 \cdot \rho } & {\mathrm{if}\;r_{2} \le z} \\ \end{array} } \right.$$
(6)

where \(\phi_{i,j} = 2\left( {r_{1} - 0.5} \right)\) is a random number in the range [−1, 1], i and k are indexes of solutions, j is the index of a dimension of a solution, \(r_{1}, r_{2} \in [0,1]\) are random numbers generated from a uniform distribution, and

$$\rho = 0.5 - 0.25\,\frac{{\mathrm{iter}}}{{\mathrm{maxiter}}}$$
(7)

Here \(\theta\) is a number generated from a Gaussian distribution, and iter and maxiter represent the current iteration and the total number of iterations, respectively. The parameter z balances the tradeoff between the Gaussian and uniform distributions.
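
A minimal sketch of the GABC candidate generation of Eqs. (6) and (7) is given below. The mean and standard deviation of the Gaussian are not stated in the text, so a standard normal distribution is assumed here; the function names are hypothetical.

```python
import random

def rho(iteration, max_iter):
    """Eq. (7): scaling factor that decays linearly from 0.5 to 0.25."""
    return 0.5 - 0.25 * iteration / max_iter

def gabc_candidate(sources, i, iteration, max_iter, z, rng):
    """Eq. (6): Gaussian-scaled step if r2 > z, otherwise a step scaled by 2."""
    k = rng.choice([idx for idx in range(len(sources)) if idx != i])
    j = rng.randrange(len(sources[i]))
    phi = 2.0 * (rng.random() - 0.5)   # uniform in [-1, 1]
    r2 = rng.random()
    theta = rng.gauss(0.0, 1.0)        # assumption: standard normal
    factor = theta if r2 > z else 2.0
    candidate = list(sources[i])
    step = phi * (sources[i][j] - sources[k][j]) * factor * rho(iteration, max_iter)
    candidate[j] = sources[i][j] + step
    return candidate

rng = random.Random(2)
sources = [[1.0, 2.0], [-3.0, 0.5], [4.0, -1.0]]
cand = gabc_candidate(sources, 0, iteration=10, max_iter=100, z=0.5, rng=rng)
```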

3 Proposed Methodology

3.1 Beta Artificial Bee Colony (BetaABC)

Although GABC is better than the standard ABC, it still needs improvement in terms of convergence. Therefore, we propose BetaABC to further avoid slow or premature convergence. The proposed method uses the beta distribution instead of the uniform or Gaussian distribution, which increases the search space and in turn gives a better balance between exploration and exploitation capability. In GABC, a list of random samples is drawn from a normal (Gaussian) distribution. The distribution depends on parameters such as loc, scale, and size, and the number of samples equals the value of the size argument. In each iteration the results differ slightly because of their random nature, but they follow the same distribution and general shape. Unlike GABC, the proposed method draws samples from the beta distribution, which has the probability density function

$$f\left( {x;\alpha ,\beta } \right) = \frac{1}{{B\left( {\alpha ,\beta } \right)}}x^{\alpha - 1} \left( {1 - x} \right)^{\beta - 1}$$
(8)

where B is the beta function

$$B\left( {\alpha ,\beta } \right) = \int_{0}^{1} t^{\alpha - 1} \left( {1 - t} \right)^{\beta - 1} \mathrm{d}t$$
(9)

Here α and β are positive shape parameters. When drawing samples, they are passed as the arguments a and b, which are floats or arrays of floats, and size is an int or a tuple of ints.
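
Equations (8) and (9) can be checked numerically with a few lines of standard-library Python. The beta function is evaluated through the identity B(α, β) = Γ(α)Γ(β)/Γ(α + β), and the shape parameters α = 2 and β = 5 are arbitrary illustrative values, not the settings used in the experiments.

```python
import math
import random

def beta_function(alpha, beta):
    """Eq. (9), evaluated through the gamma-function identity."""
    return math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)

def beta_pdf(x, alpha, beta):
    """Eq. (8): density of the beta distribution for 0 < x < 1."""
    return x ** (alpha - 1) * (1 - x) ** (beta - 1) / beta_function(alpha, beta)

# Draw samples with illustrative shape parameters (assumption).
rng = random.Random(42)
samples = [rng.betavariate(2.0, 5.0) for _ in range(10000)]
sample_mean = sum(samples) / len(samples)   # theoretical mean is alpha / (alpha + beta)
```

All samples lie in [0, 1], so the beta-distributed factor always shrinks the step of Eq. (6), whereas a Gaussian factor can also enlarge it or flip its sign.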

4 Proposed Binary BetaABC for EMG Feature Selection

Redundant and irrelevant information makes the classification of EMG signals excessively difficult and reduces the performance of the classification models. Therefore, the proposed BetaABC is converted into a binary version, called binary BetaABC (BBABC), to solve the feature selection problem in EMG signal classification. The dimension of the search space equals the number of features in the EMG dataset. The matrix G (E × F) below represents a population of candidate solutions from which the optimal feature subset is selected. Here, E represents the population size, and F represents the number of features. The main objective is to select S optimal features from the original EMG dataset, where S ≪ F. An example of G is given below:

$$G = \left[ {\begin{array}{*{20}c} 1 & 0 & 1 & 0 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\ \end{array} } \right]$$

where G has a population size of 5 with 8 dimensions. Bit 1 indicates that the feature is selected, whereas bit 0 indicates that it is not. The first solution therefore selects the 1st, 3rd, 7th, and 8th features, the second solution selects the 1st, 3rd, 5th, and 7th features, and so on.
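
The reading of G described above can be reproduced with a short sketch that converts each binary row into the 1-based indexes of the selected features. The function name is hypothetical.

```python
# Example population matrix G from the text: 5 solutions, 8 features each.
G = [
    [1, 0, 1, 0, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0, 1],
    [0, 1, 1, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 0, 1, 1],
]

def selected_features(row):
    """Return the 1-based indexes of the features whose bit is 1."""
    return [idx + 1 for idx, bit in enumerate(row) if bit == 1]

subsets = [selected_features(row) for row in G]
# subsets[0] is [1, 3, 7, 8] and subsets[1] is [1, 3, 5, 7], as stated in the text.
```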

Algorithm: Beta Artificial Bee Colony (BetaABC)

1. Initialize population of solutions \(a_{i,j}\) using Eq. (1)

2. Evaluate the population

3. Run = 1

4. Produce new food sources \(b_{i,j}\) in the neighborhood of \(a_{i,j}\) for employed bees using Eq. (6) with \(\theta\) generated from Eqs. (8) and (9)

5. Calculate probability values \(P_{bi}\) for \(a_{i,j}\) with the help of their fitness values using the Eq. (2)

6. Calculate the value of \(\rm{fitness}_{i}\) using Eq. (3)

7. Apply greedy selection process between \(a_{i,j}\) and \(b_{i,j}\) by using Eq. (4)

8. Produce new food sources \(b_{i,j}\) in the neighborhood of \(a_{i,j}\) for onlooker using Eq. (6) with \(\theta\) generated from Eqs. (8) and (9)

9. Produce new positions \(b_{i}\) for onlookers from the \(a_{i}\) selected depending on \(P_{bi}\) and evaluate by using Eq. (2)

10. Calculate the value of \(\rm{fitness}_{i}\) using Eq. (3)

11. Apply greedy selection process for the onlookers between \(a_{i,j}\) and \(b_{i,j}\) by using Eq. (4)

12. Find the abandoned sources, if any exist, and replace them with a new random solution \(a_{i}\) for the scout using the equation

\(a_{i,j} = \min_{j} + \mathrm{rand}\left( {0,1} \right) \cdot \left( {\max_{j} - \min_{j} } \right)\)

13. Memorize the position of best food source achieved so far

14. Run = Run + 1

15. Repeat steps 4 to 14 until Run = Maximum runs
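
The steps above can be sketched in continuous form as follows. This is a minimal sketch under several assumptions that the text does not fix: the shape parameters alpha and beta, the abandonment threshold limit, the clamping of candidates to the bounds, and the sphere test objective are all hypothetical choices made for illustration.

```python
import random

def fitness(f):
    """Eq. (3): fitness of an objective value f."""
    return 1.0 / (1.0 + f) if f >= 0 else 1.0 + abs(f)

def beta_abc(objective, lower, upper, n_sources=10, max_runs=100,
             z=0.5, alpha=2.0, beta=2.0, limit=20, seed=0):
    rng = random.Random(seed)
    dim = len(lower)

    def random_source():
        # Step 1 and step 12: uniform random position inside the bounds.
        return [lower[j] + rng.random() * (upper[j] - lower[j]) for j in range(dim)]

    def candidate(i, run):
        # Eq. (6) with theta drawn from a beta distribution instead of a Gaussian.
        k = rng.choice([s for s in range(n_sources) if s != i])
        j = rng.randrange(dim)
        phi = 2.0 * (rng.random() - 0.5)
        rho = 0.5 - 0.25 * run / max_runs
        theta = rng.betavariate(alpha, beta)
        factor = theta if rng.random() > z else 2.0
        b = list(sources[i])
        b[j] = sources[i][j] + phi * (sources[i][j] - sources[k][j]) * factor * rho
        b[j] = min(max(b[j], lower[j]), upper[j])   # clamp to bounds (assumption)
        return b

    def try_improve(i, run):
        # Greedy selection of Eq. (4); failed attempts are counted for the scouts.
        b = candidate(i, run)
        f_new = objective(b)
        if fitness(f_new) > fitness(values[i]):
            sources[i], values[i], trials[i] = b, f_new, 0
        else:
            trials[i] += 1

    sources = [random_source() for _ in range(n_sources)]
    values = [objective(a) for a in sources]
    trials = [0] * n_sources
    best = min(range(n_sources), key=lambda s: values[s])
    best_x, best_f = list(sources[best]), values[best]
    history = [best_f]

    for run in range(1, max_runs + 1):
        for i in range(n_sources):                      # employed bees
            try_improve(i, run)
        fits = [fitness(v) for v in values]             # weights proportional to Eq. (2)
        for _ in range(n_sources):                      # onlooker bees
            i = rng.choices(range(n_sources), weights=fits)[0]
            try_improve(i, run)
        for i in range(n_sources):                      # scout bees
            if trials[i] > limit:
                sources[i] = random_source()
                values[i] = objective(sources[i])
                trials[i] = 0
        best = min(range(n_sources), key=lambda s: values[s])
        if values[best] < best_f:                       # memorize the best source
            best_x, best_f = list(sources[best]), values[best]
        history.append(best_f)
    return best_x, best_f, history

def sphere(x):
    """Stand-in objective (assumption)."""
    return sum(v * v for v in x)

best_x, best_f, history = beta_abc(sphere, [-5.0, -5.0], [5.0, 5.0])
```

The returned history holds the best objective value after each run, which is the quantity a convergence plot would show.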

Figure 1 shows the flow diagram of the proposed BBABC for EMG feature selection and classification. Initially, the raw EMG signals are preprocessed, and features are extracted from the preprocessed signals using DWT to form the original feature set. The extracted feature set contains 24 statistical features for each of the 8 coefficients obtained at the 4th level of DWT decomposition using the db6 mother wavelet. The proposed BBABC is used to find the optimal feature subset of the extracted feature set. In the feature selection process, the fitness function evaluates the classification performance. If two solutions give the same fitness value, the solution with fewer features is selected. Finally, the optimal feature subset obtained by the proposed BBABC is fed into the KNN classifier for EMG signal classification.

Fig. 1 Flow diagram of proposed BBABC for EMG signal classification (steps: start, EMG signal, DWT decomposition, feature extraction, selection, classification, and 17 hand movements)
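
A wrapper-style fitness evaluation with the tie-breaking rule described above might look as follows. The exact fitness function, the value of k, and the validation scheme are not specified in the text, so this sketch assumes leave-one-out accuracy of a 1-nearest-neighbor classifier on a toy dataset; all names and data are hypothetical.

```python
def knn_accuracy(X, y, mask, k=1):
    """Leave-one-out k-NN accuracy using only the features where mask == 1."""
    cols = [j for j, bit in enumerate(mask) if bit == 1]
    if not cols:
        return 0.0   # an empty feature subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample, paired with its label.
        dists = sorted(
            (sum((xi[j] - xo[j]) ** 2 for j in cols), y[o])
            for o, xo in enumerate(X) if o != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)
        correct += predicted == y[i]
    return correct / len(X)

def better(mask_a, mask_b, X, y):
    """Prefer higher accuracy; on a tie, prefer the mask with fewer features."""
    acc_a, acc_b = knn_accuracy(X, y, mask_a), knn_accuracy(X, y, mask_b)
    if acc_a != acc_b:
        return mask_a if acc_a > acc_b else mask_b
    return mask_a if sum(mask_a) <= sum(mask_b) else mask_b

# Toy data (made up): feature 0 separates the classes, feature 1 is constant.
X = [[0.0, 5.0], [0.1, 5.0], [1.0, 5.0], [1.1, 5.0]]
y = [0, 0, 1, 1]
chosen = better([1, 1], [1, 0], X, y)
```

In this toy case both masks reach the same accuracy, so the mask with the single informative feature is preferred.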

5 Experimental Analysis

The experimental analysis is divided into two sections: Section-I compares the proposed BetaABC with ABC and GABC on 10 benchmark test functions to validate its effectiveness, and Section-II investigates the proposed BBABC for EMG feature selection on 11 amputee subjects.

Section-I: Experimental results of proposed BetaABC on benchmark functions

This section presents the comparative results of the proposed BetaABC, ABC, and GABC under different parameter settings. A set of 10 benchmark test functions is used to validate the performance of BetaABC. These benchmark functions are commonly used to verify stochastic optimization algorithms. The functions, their mathematical representations, search ranges, and desired optimal solutions are given in Table 2. All the test functions used for experimental verification are minimization functions.

Table 2 Benchmark functions used in experiment

The results of BetaABC are compared with those of ABC and GABC for each benchmark function with different values of the parameter. The efficiency of the BetaABC algorithm depends on the parameter z, which ranges from 0.1 to 0.9. In the experiment, the population size is 35, the number of iterations is 100, and the value of z is 0.5. The experiment was run 20 times independently for dimension D = 2, and three statistical values, i.e., the best, worst, and average values, were recorded; they are given in Table 3. The table shows that the proposed BetaABC achieves better or competitive results compared with ABC and GABC in all three statistical values. Convergence graphs based on fitness values are also plotted for ABC, GABC, and BetaABC on all benchmark functions, as shown in Fig. 2. With suitable parameters, BetaABC outperforms ABC and GABC.

Table 3 Fitness values of proposed BetaABC with ABC and GABC on 10 benchmark functions
Fig. 2 Convergence graph of proposed BetaABC on 10 benchmark functions (one panel per benchmark function, plotted against iteration for BetaABC, GABC, and ABC)

Section-II: Experimental results of proposed BBABC on EMG signal classification

This section compares the performance of the proposed BBABC with BABC and BGABC for EMG feature selection. The EMG signals of 11 amputee subjects were taken from Exercise A of NinaPro DB3. The DWT method with the db6 mother wavelet was then applied to decompose the EMG signal up to the fourth decomposition level. Twenty-four statistical features were extracted from each DWT coefficient (four details and four approximations). In total, 2304 features (12 EMG channels × 8 DWT coefficients × 24 statistical features) were extracted from each EMG segment of each subject. Feature selection based on the proposed BBABC is applied to select the optimal feature subset. The KNN classifier is used to classify 17 different hand movements from the 11 amputee subjects. Two metrics, namely maximum accuracy and feature selection ratio, are used to evaluate the performance of the algorithms.

Table 4 shows the maximum accuracy and standard deviation of the proposed BBABC, BABC, and BGABC, with the best results highlighted in bold. The proposed BBABC achieves higher accuracy than the other two methods. The three methods are also ranked using the win/tie/loss (W/T/L) statistical measure, which likewise suggests that the proposed BBABC-based feature selection method outperforms the other methods. Figure 3 compares BABC, BGABC, and BBABC through convergence graphs based on classification accuracy. The convergence graphs also show that BBABC-based feature selection gives better classification accuracy than BABC and BGABC on the 11 amputees.

Table 4 Maximum classification accuracy with different ABC variants on 11 amputee subjects
Fig. 3 Convergence graph of proposed BBABC with BABC and BGABC on 11 amputees (one panel per amputee, accuracy versus iteration for BBABC, BGABC, and BABC)
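
One possible way to compute a win/tie/loss count is sketched below. The text does not define how W/T/L is obtained, so the sketch assumes a per-subject comparison of one method against a rival; the accuracy values are made-up placeholders and are not taken from Table 4.

```python
def win_tie_loss(method_scores, rival_scores):
    """Count subjects where the method scores higher, equal, or lower than the rival."""
    pairs = list(zip(method_scores, rival_scores))
    wins = sum(m > r for m, r in pairs)
    ties = sum(m == r for m, r in pairs)
    losses = sum(m < r for m, r in pairs)
    return wins, ties, losses

# Placeholder accuracies for three hypothetical subjects (not experimental results).
method = [0.90, 0.80, 0.70]
rival = [0.85, 0.80, 0.75]
wtl = win_tie_loss(method, rival)   # (1, 1, 1)
```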

Table 5 compares the three feature selection methods BABC, BGABC, and BBABC based on the feature selection ratio, i.e., the number of selected features divided by the total number of features. The proposed BBABC method wins 7 times in selecting the minimum number of features.

Table 5 Average feature selection ratio of BBABC with BABC and BGABC on 11 amputees
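
The feature count and the feature selection ratio can be reproduced with a few lines. The channel, coefficient, and statistic counts are those stated in the text; the example mask is a made-up illustration, not a result.

```python
N_CHANNELS = 12      # EMG channels
N_COEFFICIENTS = 8   # four detail and four approximation coefficients
N_STATISTICS = 24    # statistical features per coefficient
TOTAL_FEATURES = N_CHANNELS * N_COEFFICIENTS * N_STATISTICS   # 2304

def feature_selection_ratio(mask):
    """Number of selected features divided by the total number of features."""
    return sum(mask) / len(mask)

# Hypothetical solution that selects one quarter of all features.
example_mask = [1] * 576 + [0] * (TOTAL_FEATURES - 576)
ratio = feature_selection_ratio(example_mask)   # 0.25
```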

6 Conclusion

This paper proposes a new variant of the ABC algorithm, called BetaABC, inspired by the GABC algorithm, which aims to improve performance by maintaining a proper balance between exploitation and exploration of the search space. The uniform distribution used to generate new candidate solutions is replaced with the beta distribution, which gives better results than the existing algorithms. The performance of the proposed BetaABC is evaluated on 10 benchmark functions, and the results suggest that it outperforms the standard ABC and GABC. Further, a binary version of BetaABC called BBABC is proposed to solve the FS problem in EMG signal classification. Accuracy and feature selection ratio are used to compare the proposed BBABC with BABC and BGABC on 11 amputee subjects. The experimental results and statistics indicate that BBABC with appropriate parameters gives results that are either better than or competitive with other variants of ABC. Further improvements can be expected in view of the increasing popularity of other ABC variants.