1 Introduction

Surface electromyography (sEMG) can be applied in a wide variety of human–robot interaction applications, including motion classification, joint angle prediction, and force estimation. Firstly, with just two pairs of active surface electrodes on each forearm, twenty-five robotic commands can be controlled in real time in a robotic arm [1]. Secondly, recent research on applications of sEMG for joint angle prediction includes the determination of ground reaction forces and lower body joint angles from sEMG sensors during a step-down task for people with osteoarthritis [2], continuous prediction of joint angle from sEMG via a multi-feature temporal convolutional attention network [3], and prediction of the finger angle position of the first joint based on the value of the forearm's sEMG and the finger's previous location [4]. Thirdly, muscle force estimation is essential for biomechanical modelling and natural human–machine interaction. Estimating the muscle forces of the human arm has also been explored [5, 6], with applications in prosthetic control and rehabilitation robots [7,8,9,10,11]. Muscle contractions produce forces and enable humans to move. The forces generated by muscles in response to brain control signals are dependent on an immense number of variables dispersed over several spatiotemporal scales [12], making muscle force prediction challenging [13].

Force sensors measure the force exerted on an object. In general, however, muscle forces are not measured directly with such sensors because of their cost and size. Instead, muscle activity can be monitored by recording the relevant electrical or mechanical phenomena [14]. Surface electromyography (sEMG) is the sum of the subcutaneous motor action potentials generated by muscle contraction [15] and can therefore represent neuromuscular activity. sEMG can be used to assess a patient's voluntary effort, and it reflects the brain's drive to the muscle [16,17,18]. Because muscle contraction produces joint forces, muscle force production can be estimated by modelling the association between brain control signals and muscle force using knowledge of the neurological system. Many rehabilitation devices are controlled by sEMG signals, which are generated by the electrophysiological and mechanical activations of muscles [19, 20]. Because sEMG is non-invasive and easy to collect, the relationship between sEMG and muscle force has been studied for various applications.

Research on the relationship between sEMG and force is varied and growing. Mathematical model-based approaches rely on complex parameters and specific muscle information [21,22,23,24,25]. Machine learning techniques have also been used to model the nonlinear relationship between sEMG and force. Related works from the literature are reviewed below.

A deep convolutional neural network (CNN) based on a regression model was introduced in [5, 13, 26]. In [5], models with feature-level fusion of the sEMG in both time and frequency domains were used to estimate the contraction force at the wrist for inter-subject force modelling during isometric elbow flexion contractions. Furthermore, in [14], five deep convolutional modules were proposed to map between sEMG signals and the interaction force when the hand directly contacted the robot’s arm, which operated at the desired Cartesian location for a physical human–robot interaction task. In [26], a constrained autoencoder network (CAEN) was proposed to improve the middle hidden layer so that features with reduced dimensions were extracted more effectively. The proposed model was also compared with four artificial neural network models for estimating simultaneous finger activity, and the impact of human engagement in actual prosthetic hand control was examined.

The authors of [15] compared several types of neural networks and selected the one that provided the best performance. Forces were estimated in real time in perpendicular degrees of freedom (DoFs) during tasks with restricted motion, such as tightening or loosening a screw with a screwdriver. A long short-term memory (LSTM) network was presented for the simultaneous estimation of motion and interaction force in natural human–robot interaction (HRI). In [21], the gripping force for object grabbing in the three-finger pinch mode was estimated from sEMG signals. To achieve rapid and precise prosthetic hand control, various types of neural networks were employed, including basic recurrent neural networks (RNN), LSTM, gated recurrent units (GRU), and multilayer perceptron networks (MLP). In [16], time and frequency domain features of high-density sEMG signals (336 features from 21 channels) were used to create nonlinear bagged tree ensemble (BTE) models for isometric force estimation. Finally, [27] presented an approach for myoelectric control systems based on multivariable system identification in state space (SS). A Kalman filter was also applied for proportional and continuous grasping force estimation. The main purpose was to predict gripping force in real time from sEMG measurements in order to control a hand prosthesis.

A summary of related works, with a special focus on force estimation tested in the muscle area of the human arm, hardware, experimental setups and force patterns, force estimation models, and objectives based on sEMG signals, is also provided in Table 1.

Table 1 A summary of the related works

Based on the literature survey above, we propose a muscular force estimation method based on sEMG signals. The aims and contributions of this work are fourfold.

  • First, a system for forearm muscle estimation based on sEMG signals measured from the Myo armband is presented. A force sensor is installed on a mobile robot, with a handle on the sensor so that force can be applied according to the movement pattern in the XY plane.

  • Second, to reflect practical use, we consider force patterns covering axial movements in the two-dimensional plane and study three elbow placement scenarios with fixed and free positioning.

  • Third, nineteen regression models are applied and evaluated: Gaussian process regression (GPR) with rational quadratic, exponential, squared exponential, and Matérn 5/2 kernels; neural networks (NN) with different layer configurations, namely narrow, medium, wide, bilayered, and trilayered; linear regression (LR) with linear, robust, stepwise, and interactions models; and support vector machines (SVM) with linear, quadratic, cubic, fine Gaussian, medium Gaussian, and coarse Gaussian kernels.

  • Finally, the performance of the regression models is validated in various test scenarios based on the designed force patterns. The evaluation is based on root mean square error (RMSE) values, prediction speeds, and training times. Our major findings indicate that the GPR model with the exponential kernel obtains the best results, with RMSE in the range of 1.18–1.77 N. The model can potentially be used to estimate the force applied to a mobile robot in the two-dimensional plane, and its estimates are close to the force recorded by the force sensor.

The structure of this paper is as follows. Sect. 2 describes the methodology, which consists of the data acquisition system and experimental protocol, test scenarios, force estimation, regression models, and performance evaluation. Sect. 3 presents the results, including signal characteristics and a performance comparison. Sect. 4 discusses the signal comparison, performance comparison, and computational complexity. Finally, Sect. 5 concludes the paper.

2 Methodology

2.1 Data acquisition system and experimental protocol

The data acquisition system was designed to simultaneously collect sEMG and force signals, as shown in Fig. 1. On the one hand, eight channels of sEMG signals were acquired from the Myo armband (Thalmic Labs, Kitchener-Waterloo, Canada), which is capable of transmitting 8-bit resolution data across wireless communication via Bluetooth protocol. On the other hand, the 6-axis force and torque sensor (ATI: Mini40), which was placed at the base of a mobile robot, was utilized to measure the force signals in the X and Y axes. The mobile robot is composed of four wheels. The handle was designed to push the mobile robot to move in the XY plane. However, in this experimental setup, the mobile robot was fixed (not moving). A sampling rate of 200 Hz was used for both EMG and force signal collections.

Fig. 1 A block diagram of the overall system

Two participants took part in the experiment (34 \(\pm\) 1 years, 1 male and 1 female). The experiments were conducted in accordance with the Declaration of Helsinki. Before data collection, all participants were informed of the experimental protocols so that they were familiar with the setup and the procedure, and they agreed to participate in the test. The signals were collected using ROS Kinetic under Ubuntu, with Python used for real-time visualization and data storage. The Myo armband was placed on the forearm, approximately 2 inches from the elbow. A reference channel was fixed at the yellow dot shown in Fig. 2, which matched the fourth channel of the Myo armband. The muscle corresponding to this sEMG channel was the pronator teres. The locations of the other seven channels and their corresponding muscles are shown in Table 2.

Fig. 2 A placement of the Myo armband on the forearm

Table 2 Distribution of the sEMG electrodes on the forearm

In the experimental setup, force signals from four directions of wrist movements, namely, the positive X-, negative X-, positive Y-, and negative Y-axes, were collected. The movement sequence is shown in Fig. 3. In each direction, there were ten sessions of movements. In each session, the participant alternately took a rest and exerted force for 10 s each, so one session lasted 20 s. In total, each direction was performed for 200 s (10 sessions × 20 s). The number of samples for each channel of the sEMG signal and the force signal was therefore 40,000 (200 s × 200 Hz). To avoid fatigue, there was a one-minute rest between directions of movement. The number of samples collected from all four directions was 160,000 (40,000 samples × 4 directions). Furthermore, the experiment was done twice, with identical protocols for each of the four directions of movement.
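The sample counts follow directly from the protocol. The short check below is only a sketch of that arithmetic; the constant names are illustrative and do not come from the acquisition software.

```python
# Protocol constants taken from the description above.
SAMPLING_RATE_HZ = 200
REST_S = 10
EXERTION_S = 10
SESSIONS_PER_DIRECTION = 10
DIRECTIONS = 4

session_s = REST_S + EXERTION_S                    # one rest plus one exertion period
direction_s = SESSIONS_PER_DIRECTION * session_s   # recording time per direction
samples_per_direction = direction_s * SAMPLING_RATE_HZ
samples_all_directions = samples_per_direction * DIRECTIONS

print(direction_s)             # 200
print(samples_per_direction)   # 40000
print(samples_all_directions)  # 160000
```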

Fig. 3 The movement sequence used for collecting sEMG and force signals from four directions

2.2 Test scenarios

The participant sat comfortably in a chair in front of a computer screen used for signal visualization and grasped the handle while the forearm was held in place to prevent unnecessary wrist bending. The participant then performed three different test scenarios in which the elbow was placed in different positions during isometric contractions, as shown in Fig. 4. The details of each scenario are as follows.

Fig. 4 The scenario cases in the experimental setup. (Left) The elbow position was fixed at an angle of 30 degrees to the floor. (Middle) The elbow position was fixed and parallel to the floor. (Right) The elbow position was free

2.2.1 Scenario 1

The elbow position was fixed on the floor at a distance of 6 inches from the device, and the lower arm was at an angle of 30 degrees to the floor, as shown in Fig. 4 (Left). This setup provided the muscle force under isometric conditions.

2.2.2 Scenario 2

The elbow position was fixed and parallel to the floor. In this scenario, the elbow was placed on the box at a height of 5 inches, as shown in Fig. 4 (Middle). Most experimental setups for force estimation using sEMG signals from previous publications used this scenario.

2.2.3 Scenario 3

The elbow position was free, and it was approximately parallel to the floor, as shown in Fig. 4 (Right). The sEMG and force signals collected from this scenario were close to the natural movement along each direction compared to scenarios 1 and 2.

2.3 Force estimation

There were three steps in force estimation: segmentation, feature calculation, and regression. The details of each step are as follows.

2.3.1 Step (1) segmentation

After sEMG and force signals were collected, they were divided into small segments using a window without overlapping. The window length was 50 samples (250 ms). As a result, for the data collected from the movement in each direction, the number of segments from each channel of the sEMG signal and a force signal was 800 (40,000 samples/50 samples per segment).
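The non-overlapping windowing can be sketched as follows. This is a minimal illustration rather than the code used in the study; the function name `segment` is hypothetical, and dropping an incomplete trailing window is an assumption because the text does not say how remainders are handled.

```python
def segment(signal, window=50):
    """Split a 1-D sequence into consecutive non-overlapping windows.

    Trailing samples that do not fill a whole window are dropped
    (an assumption; the source does not describe remainder handling).
    """
    n_segments = len(signal) // window
    return [signal[k * window:(k + 1) * window] for k in range(n_segments)]


# Placeholder data with the per-direction length reported in the text.
channel = [0] * 40_000
segments = segment(channel, window=50)
print(len(segments), len(segments[0]))  # 800 50
```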

2.3.2 Step (2) feature calculation

The segmented data from Step (1) was used for calculating features in this step. The root mean square (RMS) value is used as a feature for sEMG signals, which can be expressed as

$${\text{RMS}}_{k}^{m} = \sqrt {\frac{1}{N}\sum\limits_{i = 1}^{N} {x_{i}^{2} } }$$
(1)

where \(x_{i}\) is the amplitude of the sEMG sample in the segment, \(N\) is 50, \(k\) is the segment number, and \(m\) is the sEMG channel number (\(1 \le m \le 8\)). For the force signal, the mean (MEAN) value is used as a feature, which is given by

$${\text{MEAN}}_{k} = \frac{1}{N}\sum\limits_{i = 1}^{N} {f_{i} }$$
(2)

where \(f_{i}\) is the force sample in the segment.
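Equations (1) and (2) translate into a few lines of code. The sketch below uses toy numbers rather than measured data, and the helper `feature_row` is a hypothetical name introduced only to show how the eight RMS values of segment \(k\) pair with one MEAN force value.

```python
import math


def rms(x):
    """Eq. (1): root mean square of one sEMG segment."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def mean(f):
    """Eq. (2): arithmetic mean of one force segment."""
    return sum(f) / len(f)


def feature_row(emg_segments, force_segment):
    """Build one input/target pair for segment k.

    emg_segments holds one segment per channel (m = 1..8).
    Returns (list of 8 RMS values, MEAN force).
    """
    return [rms(seg) for seg in emg_segments], mean(force_segment)


# Toy values only: the same two-sample segment on all eight channels.
x_k, y_k = feature_row([[3, -4]] * 8, [1.0, 2.0])
print(len(x_k), y_k)  # 8 1.5
```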

2.3.3 Step (3) Regression

In this step, the \({\text{RMS}}\) and \({\text{MEAN}}\) features from Step (2) are used to train a regression model for estimating force from the sEMG signals. In model training, the RMS features from the 8-channel sEMG signals form the input vector of the regression model, and the MEAN features from the force signal form the output vector. For example, there are 800 feature values per channel for the movement in each direction and scenario. Of these, 640 feature values (80%) are used for model training and validation, and 160 feature values (20%) are used for model testing. In addition, fivefold cross-validation is used during model training.
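The 640/160 split and the fivefold partition can be sketched as index bookkeeping. The random shuffle, the seed, and the function name `split_indices` are assumptions made for illustration; the text does not state whether the split is random or chronological.

```python
import random


def split_indices(n=800, n_test=160, n_folds=5, seed=0):
    """Return (folds, test) lists of segment indices.

    Shuffling with a fixed seed is an assumption, not the paper's procedure.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    test, train = idx[:n_test], idx[n_test:]
    # Deal the 640 training indices into n_folds equally sized folds.
    folds = [train[i::n_folds] for i in range(n_folds)]
    return folds, test


folds, test = split_indices()
for i, val in enumerate(folds):
    # Fold i is held out for validation; the other four folds are used for fitting.
    fit = [j for k, f in enumerate(folds) if k != i for j in f]
    print(i, len(fit), len(val))  # 512 fitting and 128 validation indices per fold
```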

2.4 Regression models

This research compares and tests four families of regression models: Gaussian process regression (GPR), neural networks (NN), linear regression (LR), and support vector machines (SVM). The following provides a brief overview of each family and the associated parameters that were used:

GPR integrates latent variables and an explicit basis function for describing the target [29,30,31]. Four kernels of the GPR model were utilized in this study, namely rational quadratic, exponential, squared exponential, and Matérn 5/2.
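For reference, the four kernels can be written as functions of the distance \(r\) between two input vectors. The sketch below uses the standard textbook forms with a single length scale `ell`, signal standard deviation `sigma_f`, and shape parameter `alpha`; these parameter names and default values are assumptions and are not taken from the fitted models.

```python
import math


def squared_exponential(r, ell=1.0, sigma_f=1.0):
    return sigma_f ** 2 * math.exp(-0.5 * (r / ell) ** 2)


def exponential(r, ell=1.0, sigma_f=1.0):
    return sigma_f ** 2 * math.exp(-r / ell)


def matern52(r, ell=1.0, sigma_f=1.0):
    s = math.sqrt(5.0) * r / ell
    # s * s / 3 equals 5 r^2 / (3 ell^2)
    return sigma_f ** 2 * (1.0 + s + s * s / 3.0) * math.exp(-s)


def rational_quadratic(r, ell=1.0, sigma_f=1.0, alpha=1.0):
    return sigma_f ** 2 * (1.0 + r * r / (2.0 * alpha * ell * ell)) ** (-alpha)


def distance(x, z):
    """Euclidean distance between two feature vectors (e.g. 8 RMS values)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, z)))


r = distance([0.0] * 8, [0.5] * 8)
for kernel in (squared_exponential, exponential, matern52, rational_quadratic):
    print(kernel.__name__, kernel(r))
```

All four kernels equal `sigma_f ** 2` at zero distance and decay as the inputs move apart; the exponential kernel decays most sharply near zero, which corresponds to rougher sample functions than the squared exponential kernel.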

The NN used in this paper is a feedforward multilayer perceptron. Five NN regression models are studied, namely narrow, medium, wide, bilayered, and trilayered [32]. The narrow, medium, and wide NNs have a single hidden layer, whereas the bilayered and trilayered NNs have two and three layers, respectively. The ReLU activation function performs a threshold operation on each element of its input, in which any value less than zero is set to zero. The learning algorithm is the Levenberg–Marquardt algorithm. More details on the NN parameters are shown in Table 3.
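The forward pass of such a network can be sketched in a few lines. The layer sizes and weights below are toy values chosen so that the result can be checked by hand; they are not the architectures of Table 3, and the linear output layer is an assumption that is common for regression.

```python
def relu(v):
    """Threshold operation: values below zero are set to zero."""
    return [max(0.0, a) for a in v]


def dense(x, weights, bias):
    """Affine layer; weights holds one row per output unit."""
    return [sum(w * a for w, a in zip(row, x)) + b for row, b in zip(weights, bias)]


def forward(x, layers):
    """layers is a list of (weights, bias); ReLU on hidden layers, linear output."""
    for weights, bias in layers[:-1]:
        x = relu(dense(x, weights, bias))
    weights, bias = layers[-1]
    return dense(x, weights, bias)


# Toy network: 2 inputs, one hidden layer with 2 units, 1 output.
layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),  # hidden layer
    ([[1.0, 1.0]], [0.5]),                     # output layer
]
print(forward([3.0, 1.0], layers))  # [2.5]
```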

Table 3 NN models’ architecture and parameters

LR assumes a linear relationship between the output and the input of the regression model [30, 33]. Multiple linear regression is a generalization of simple linear regression with more than one independent variable and a subset of general linear models with only one dependent variable.

SVM works by translating the data from its original space into a higher-dimensional one, which is known as the feature space, by a mapping function [30, 34]. In this work, six forms of SVM regression are used, namely linear, quadratic, cubic, fine Gaussian, medium Gaussian, and coarse Gaussian. More details on the SVM parameters are shown in Table 4. We note that to obtain the optimal NN and SVM models, hyperparameter optimization through a grid search or a random search algorithm should be considered to find the best estimation result.
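The idea of mapping data into a feature space can be illustrated with a small example. The sketch below uses a homogeneous quadratic kernel on two-dimensional inputs, chosen only because its feature map can be written out explicitly; it is not the exact kernel parameterization listed in Table 4.

```python
import math


def quadratic_kernel(x, z):
    """Homogeneous quadratic kernel k(x, z) = (x . z)^2."""
    return sum(a * b for a, b in zip(x, z)) ** 2


def feature_map(x):
    """Explicit map for 2-D inputs whose inner product reproduces the kernel."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]


x, z = [1.0, 2.0], [3.0, 4.0]
lhs = quadratic_kernel(x, z)
rhs = sum(a * b for a, b in zip(feature_map(x), feature_map(z)))
print(lhs, round(rhs, 9))  # 121.0 121.0
```

The kernel evaluates the inner product in the three-dimensional feature space without constructing the mapped vectors, which is what makes nonlinear SVM regression tractable for richer kernels such as the Gaussian.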

Table 4 SVM models’ hyperparameters

2.5 Performance evaluation

In the training and testing steps, the major metric for evaluating the performance of each regression model was the root mean square error (\({\text{RMSE}}\)), which can be expressed as

$${\text{RMSE}} = \sqrt {\frac{1}{M}\sum\nolimits_{i = 1}^{M} {(\hat{F}_{i} - F_{i} )^{2} } }$$
(3)

where \(M\) is the number of force samples, \(\hat{F}_{i}\) is the estimated force, and \(F_{i}\) is the measured force. In order to compare the performance of the proposed method with that of previous publications, other metrics are also calculated, including the normalized mean square error (\({\text{NMSE}}\)), the normalized root mean square error (\({\text{NRMSE}}\)), and the mean absolute error (\({\text{MAE}}\)), which are given by

$${\text{NMSE}} = \frac{{\sum\nolimits_{i = 1}^{M} {(\hat{F}_{i} - F_{i} )^{2} } }}{{\sum\nolimits_{i = 1}^{M} {F_{i}^{2} } }}$$
(4)
$${\text{NRMSE}} = 1 - \frac{{{\text{RMSE}}}}{{\sqrt {\sum\nolimits_{i = 1}^{M} {(\hat{F}_{i} - F_{i} )^{2} } } }}$$
(5)
$${\text{MAE}} = \frac{1}{M}\sum\nolimits_{i = 1}^{M} {\left| {\hat{F}_{i} - F_{i} } \right|}$$
(6)
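Equations (3), (4), and (6) can be computed as in the following sketch. The force values are toy numbers chosen so that the results are easy to verify by hand; they are not measurements from this study.

```python
import math


def rmse(f_hat, f):
    """Eq. (3): root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f_hat, f)) / len(f))


def nmse(f_hat, f):
    """Eq. (4): squared error normalized by the energy of the measured force."""
    return sum((a - b) ** 2 for a, b in zip(f_hat, f)) / sum(b * b for b in f)


def mae(f_hat, f):
    """Eq. (6): mean absolute error."""
    return sum(abs(a - b) for a, b in zip(f_hat, f)) / len(f)


measured = [1.0, 2.0, 3.0]    # toy F_i
estimated = [1.0, 2.0, 5.0]   # toy estimates, with a single error of 2 N
print(rmse(estimated, measured))  # sqrt(4/3)
print(nmse(estimated, measured))  # 4/14
print(mae(estimated, measured))   # 2/3
```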

Force estimation for all scenarios in this paper was run on a computer with an Intel(R) Core™ i7-8565U @ 1.80 GHz processor and 8 GB of RAM.

3 Results

3.1 Signal characteristics

Figure 5 shows an example of five sessions of 8-channel sEMG and force signals for four directions of movement in scenario 1 from subject 1. The Myo armband collects these sEMG signals with an 8-bit resolution ranging from −128 to 127. The amplitude and direction of the force pattern on the X and Y axes are recorded by the force sensor. The results indicate that different movement directions produce different patterns across the 8-channel sEMG signals, which are caused by the contraction of different muscle groups.

Fig. 5 Examples of measured 8-channel sEMG and force signals for four directions of movement in scenario 1 from subject 1

From the collected 8-channel sEMG and force signals, the RMS and MEAN features are determined. Examples of RMS and MEAN features for the positive X movements in scenario 1 from subject 1 are shown in Fig. 6 with a pink line.

Fig. 6 Examples of RMS and MEAN features for positive X movements in scenario 1 from subject 1

3.2 Performance comparison

After the RMS and MEAN features are extracted, the force is estimated by the four regression models, namely GPR, NN, LR, and SVM. The RMSE values from the four regression models in scenario 1 are shown in Fig. 7, where the best algorithm for each model is shown. The performance indices in the training and testing phases are reported as the mean and standard deviation of the RMSE values across the five folds for each scenario and all movement patterns.

Fig. 7 RMSE values from the four regression models in scenario 1

GPR provides the best overall force estimation accuracy in both training and testing for all subjects. NN and SVM give reasonable accuracy but are less accurate than GPR, while LR is the least accurate. For subject 2, the errors of all models are considerably higher than for subject 1. With average RMSE values of 1.267 and 1.093 N for force estimation in scenario 1, GPR is the best estimation model, followed by NN and SVM, respectively.

Figure 8 shows the RMSE values from the four regression models in scenario 2. For the same exertion pattern, the overall error values from all four models in scenario 2 are higher than those in scenario 1. The GPR model yields RMSE values in the range of 1.363–1.520 N. Furthermore, while NN and SVM produce similar force estimation results for subject 1, the LR model produces considerably higher errors than the other three models. As a result, in scenario 2, the GPR model outperforms the NN and SVM models in terms of force estimation accuracy.

Fig. 8 RMSE values from the four regression models in scenario 2

The RMSE values from the four regression models in scenario 3 are shown in Fig. 9. For the same exertion pattern, the overall error values from all four models in scenario 3 are higher than those from scenarios 1 and 2 because the elbow is free in this scenario. The force estimation error of all four models is smaller for subject 1 than for subject 2. The NN and SVM models outperform the LR model in terms of force estimation. However, the GPR model still produces lower error values than the NN and SVM models in scenario 3.

Fig. 9 RMSE values from the four regression models in scenario 3

When the data from both subjects are used in both the training and testing steps, the RMSE values from all scenarios are as shown in Fig. 10. The RMSE values from scenario 1 are lower than those from scenarios 2 and 3 because they are estimated from sEMG and force signals collected when the elbow is fixed on the table. On the other hand, the RMSE values from scenario 3 are the highest. Moreover, the GPR model with the exponential kernel yields the lowest RMSE for all three scenarios, whereas the LR model with interactions gives the highest RMSE.

Fig. 10 RMSE values from the four regression models of two subjects

4 Discussion

4.1 Signal comparison

The estimated forces from the four regression models, namely GPR, NN, LR, and SVM, are shown in Figs. 11 and 12 with green, red, cyan, and pink lines, respectively, and are compared with the measured force shown as a blue line. For each of the four models, the algorithm with the best results is used to estimate the muscular force (details of the algorithms from the testing results are presented in Fig. 10). The graphs cover force exertion in four directions, namely the positive X and Y axes and the negative X and Y axes, for subject 1 (signal waveforms 1 to 5) and subject 2 (signal waveforms 6 to 10).

Fig. 11 Comparison of the estimated forces from four regression models with the measured force in scenario 1

Fig. 12 Comparison of the estimated forces from four regression models with the measured force in scenario 3

In scenario 1, the GPR model estimates force accurately across the whole range of force exerted in the negative direction (X− and Y− movements) and in the positive direction (X+ and Y+ movements) on both axes, as shown in Fig. 11. The force estimation from the GPR model is better than that of the other three models. The LR model estimates force with the lowest accuracy in all directions of movement. The NN and SVM models are more accurate than the LR model and closely follow the measured force.

The GPR model provides more accurate force estimation than the NN model in scenario 3, as shown in Fig. 12. However, both models have near-optimal force prediction accuracy in all pushing directions. The force estimation results for the SVM model are slightly better than those for the LR model, but they are still worse than those of the GPR and NN models. Considering both the positive and negative directions, the GPR model has comparatively good force estimation ability relative to the other three models. Force estimation from the four regression models in scenario 3 is less accurate than that in scenario 1.

4.2 Performance comparison

The RMSE values from the four regression models in scenarios 1 to 3 for subjects 1 and 2, as shown in Figs. 7, 8 and 9, demonstrate a consistent trend. Furthermore, when data from both subjects is used in both the training and testing steps, the RMSE values from all scenarios shown in Fig. 10 also exhibit a similar trend. Specifically, GPR achieves the lowest RMSE value, while LR has the highest RMSE value. These results indicate a promising direction and suggest the potential for applying the proposed method to individuals of varying ages. However, further data acquisition and analysis are necessary to validate this assumption, which will be a focus of future research.

To address muscle fatigue, we designed a data collection protocol that minimizes its impact. Each participant completed ten sessions of movements in each direction, with each session consisting of alternating periods of rest and exertion lasting 10 s each. Between movements in different directions, participants rested for one minute. We also experimentally determined the median frequency of sEMG signals from channel 1 when the subject exerted force in the X + direction. The mean and standard deviation of this median frequency were 69.4 Hz and 1.2 Hz, respectively. The results showed no significant decrease in the median frequency, indicating that muscle fatigue was not present.
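The median frequency is the frequency that splits the power spectrum into two halves of equal power. The sketch below shows one common way to compute it, using a direct discrete Fourier transform on a synthetic 50 Hz tone; the window length, the absence of detrending, and the function name `median_frequency` are assumptions, because the text does not describe how the value was computed.

```python
import cmath
import math


def median_frequency(x, fs):
    """Frequency (Hz) at which the cumulative one-sided power reaches half the total."""
    n = len(x)
    power = []
    for k in range(n // 2 + 1):
        # Direct DFT of bin k; adequate for short analysis windows.
        xk = sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(x))
        power.append(abs(xk) ** 2)
    half = sum(power) / 2.0
    cumulative = 0.0
    for k, p in enumerate(power):
        cumulative += p
        if cumulative >= half:
            return k * fs / n
    return fs / 2.0


fs = 200  # sampling rate used in this study
tone = [math.sin(2.0 * math.pi * 50.0 * i / fs) for i in range(200)]  # synthetic 1 s signal
print(median_frequency(tone, fs))  # 50.0
```

A downward drift of this value over successive exertion periods is the usual indicator of fatigue, so a stable value across sessions supports the conclusion above.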

4.3 Computational complexity

Figure 13 summarizes model performance in terms of prediction speed, training time, and model size. Only algorithms from the GPR and NN models give force estimation results with the lowest error in both the training and testing steps; LR and SVM have higher force estimation errors. In terms of prediction speed, the NN models, particularly the wide and trilayered networks, are faster than the GPR model. In terms of training time, the GPR and NN models require roughly the same amount of time. The GPR model has a model size of 415 kB for each algorithm, which is fairly large compared with the NN model. Given its lowest estimation error, it is nevertheless possible to conclude that the GPR model with the exponential kernel is appropriate for sEMG-based force estimation in the XY plane.

Fig. 13 Comparison of the best regression models in terms of computations

5 Conclusions

In this paper, we compare four regression models, specifically GPR, NN, LR, and SVM, for estimating muscle force based on sEMG signals. The force patterns cover X- and Y-axis movements in three scenarios with fixed and free elbow positions. The performances of these four models are compared based on the average RMSE values. The GPR model with the exponential kernel performs best, with RMSE values of 1.18 \(\pm\) 0.17, 1.37 \(\pm\) 0.19, and 1.77 \(\pm\) 0.51 N in scenarios 1 to 3, respectively. The RMSE from scenario 3 is the highest because the free elbow position causes more variation in sEMG signals. The model proposed in this paper can be utilized for estimating the force applied to a mobile robot in the two-dimensional plane and is capable of estimating forces that are close to the force recorded by the force sensor.

In future work, we will verify the prediction accuracy for dynamic force with more subjects. We expect that the best-performing model can serve as a general model that is easily applied to real-time force estimation in practical applications.