1 Introduction

Over the past few decades, there has been growing research interest in assistive exoskeletal robots. An exoskeleton is a wearable device used to enhance the physical function of an injured or disabled person during daily activities. Exoskeletons can be used in many applications, such as reducing the load borne by workers or soldiers while performing tasks and helping patients perform repetitive rehabilitation training [1,2,3,4,5,6]. However, fewer studies have addressed devices that assist the human body during uphill movement. Huang, R. et al., [7] proposed an adaptive gait planning method with dynamic motion primitives for a lower limb exoskeleton to assist the human body in uphill motion. The experimental results showed that the proposed gait planning method made the human–exoskeleton system more stable in uphill scenarios. Seo, K. et al., [8] developed a hip exoskeleton to enhance gait function in the elderly and support the rehabilitation of post-stroke patients. In practice, human–machine communication has been crucial to ensuring the performance and comfort of the exoskeleton system as a whole [9]. Various data-recording systems, such as accelerometers, gyroscopes, and barometers, have become available due to developments in wearable sensor technology [10]. However, recognizing the wearer's movement intent remains an obstacle for this technology, as standard exoskeleton sensors cannot forecast movement tendencies. Existing research [11] has shown that surface EMG signals can provide information about neuromuscular activity and be used to control exoskeletons. EMG signals are biological signals that measure the electrical activity of skeletal muscles. EMG signals can reflect muscular contraction force 30–100 ms earlier than other wearable sensors [12]. Intramuscular EMG (invasive) and surface EMG (non-invasive) are the two methods for recording EMG signals [13]. Comparatively, the non-invasive approach permits electrode placement without physician supervision, discomfort, or infection risk [14]. Currently, sEMG signals are widely used in applications such as upper [15] and lower extremity [16] exoskeletal control, neuromuscular disease examination [17], and health and exercise monitoring [18].

Numerous researchers have investigated machine learning and deep learning models for identifying limb activities to control exoskeletons or prostheses. Chen, Y. et al., [19] proposed a low-cost Soft Exoskeleton Glove (SExoG) system for bilateral training that is driven by sEMG signals from a non-paralyzed hand. The experiments demonstrated that the hybrid model could achieve an average accuracy of 98.7% across four hand motions. Cisnal, A. et al., [20] created a thresholded non-pattern-recognition EMG-driven controller that detects gestures from a healthy hand and repeats them on an exoskeleton worn by a paralyzed hand. The study's findings revealed a 97% overall accuracy for gesture detection and indicated that the system was adequately time-responsive.

The surface EMG signal of the lower extremities is more intricate than that of the upper extremities. Lower extremity muscles are deeply buried beneath the skin and overlap significantly, making motion prediction from lower extremity surface EMG data more complicated than from upper extremity data. Zhuang, Y. et al., [21] suggested an EMG-based Conductance Control Strategy (ECCS). The system incorporates an EMG-Driven Musculoskeletal Model (EDMM), a conductance filter, and an internal position controller. ECCS excels at enhancing motor stability and has the potential to be utilized in robot-assisted rehabilitation to treat foot drop. Lyu, M. X. et al., [22] designed an EMG-controlled knee exoskeleton to aid in the rehabilitation of stroke patients. The patient's EMG signal was captured via an easy-to-wear EMG sensor and then processed by a Kalman filter to drive the exoskeleton autonomously. The test results demonstrated that individuals could use their EMG signals to control the exoskeleton.

High-quality signals provide more of the information needed for intention prediction, thus improving prediction accuracy. However, various interventions and interferences are inevitable during the collection of sEMG signals [10]. Changes in patch position, sweat on the surface of the skin, and EMG sensor transmission issues can all affect data gathering during trials. Numerous signal processing applications, particularly in the communication and medical fields, require pre-processing of sensor data to reduce noise, and minimizing the effects of signal interference is challenging. Many researchers have investigated ways to reduce signal noise pollution. Hajian, G. et al., [23] proposed a channel selection method utilizing Fast Orthogonal Search (FOS) to increase estimation power. The method uses PCA in the frequency domain to identify the channel that contributes the most to the first principal component. The results demonstrate that the proposed method can reduce the dimensionality of the data (the number of channels is reduced from 21 to 9) while increasing the precision of the estimation. Combining nonlinear time series analysis and time–frequency domain approaches, Wang, G. et al., [24] proposed a wavelet-based correlation dimensionality method for extracting the effective features of sEMG signals. The results indicate four separate clusters corresponding to different forearm motions at the third resolution level, with a classification accuracy of 100% when using two channels of sEMG signals, showing that the proposed method is suitable for classifying different forearm motions. Sapsanis, C. et al., [25] proposed a pattern recognition method for identifying basic hand movements using sEMG data. Their experiments used Empirical Mode Decomposition (EMD) to decompose the EMG signal into intrinsic mode functions, followed by a feature extraction stage. The outcomes demonstrate that applying EMD can enhance recognition over traditional feature sets generated from the original EMG signal.

Deep learning techniques have developed rapidly in recent years. Compared with machine learning, deep learning focuses more on learning the intrinsic patterns and representation levels of sample data. The information obtained from these learning processes can considerably assist in interpreting text, images, and sounds. CNNs are feedforward neural networks that incorporate convolutional computation and have a deep structure; they are one of the representative algorithms of deep learning. By virtue of their hierarchical structure, CNNs are capable of representation learning and can classify incoming data in a translation-invariant manner. CNNs have also demonstrated effectiveness in identifying time-series data such as EEG [26], EMG [27], and ECG [28] signals. LSTM is a recurrent neural network that processes data by learning the dependencies in time-sequential data, making it suited for processing and forecasting events with time intervals and delays [29].

Inspired by the constraints of CNN and the benefits of LSTM, this study proposes a hybrid CNN-LSTM model. The model combines feature extraction and time-series regression to fully exploit the spatio-temporal correlation of surface EMG signals. By extracting deep features with the CNN and processing them with the LSTM, complex EMG signals can be predicted accurately, and the proposed prediction model is more precise and effective. In addition, an improved PCA based on the kernel approach is proposed for processing the experimentally acquired sEMG data, overcoming classic PCA's limitations on nonlinear data. The remaining sections are organized as follows. Section 2 discusses the experiments in detail, including experimental design, data collection, and signal pre-processing. Sections 3 and 4 describe the data pre-processing methods and prediction models used in this paper. Section 5 compares the experimental results in different cases (including different dimensions and different methods). Section 6 discusses the results of the calculations. Finally, Sect. 7 concludes the paper.

2 Materials and Methods

2.1 Data Acquisition

2.1.1 Acquisition of sEMG Signal

sEMG is a technique for recording and interpreting the electrical signals created when physiological changes occur in the state of muscle fiber membranes. This experiment uses the Trigno™ wireless EMG system (Delsys, Natick, MA, USA) to acquire the EMG signals; Delsys has specialized in wearable EMG sensor design since 1993, addressing challenges such as signal artifacts, crosstalk, reliability, and consistency. The device has a sensing delay of less than 500 µs and is equipped with 16 sensors with a maximum sampling rate of 4000 Hz. To prevent signal loss, a sampling rate of 2000 Hz was adopted for the trials. In this experiment, the Trigno™ system can simultaneously trigger the Codamotion (3D motion capture) system, ensuring that the EMG data correspond to the knee joint angle data during data analysis. Due to the symmetry of the human body during motion, data from only one leg (the right leg) were collected and analyzed. Following the recommendations in [30], 16 muscles on the right leg were selected as signal acquisition points. Figure 1 illustrates the sensor locations. Prior to the experiment, the selected locations were cleaned with medical alcohol to guarantee that EMG signals could be acquired effectively. It is essential that the sensors be firmly attached to the muscles to prevent the EMG signal collector from shifting during the activity; otherwise, data loss may occur. The distance between any two sensors is approximately 30 mm to reduce crosstalk.

Fig. 1 Location of sensor attachment

2.1.2 Acquisition of Joint Angle

In this experiment, Codamotion's 3D motion capture technology was used to collect human knee joint angle data. Codamotion provides motion capture devices for academic research, healthcare applications, and other life science markets. The setup consists of two cameras (to capture the trajectories of the marker points), a computer (to solve and store the data), and several marker points. The sampling rate of the device was set to 200 Hz. Each Codamotion marker point was uniquely coded according to the Rizzoli [31] protocol specifications. As depicted in Fig. 2, three markers (1, 2, and 3) were applied to the subject's knee joint. At the start of each trial, the EMG system triggers the Codamotion system, and the markers' spatial coordinates are collected in real time and transmitted to the host computer through the data collection box.

Fig. 2 Location of the marker attachment

2.2 Experimental Procedures

Ten volunteers (five males and five females) participated in this experiment. None of the subjects suffered from leg sprains or painful muscle discomfort. Their ages ranged from 22 to 26 years, with a mean age of 24.2 (± 1.13) years, a mean height of 171.8 (± 7.23) cm, and a mean weight of 66.6 (± 9.29) kg. All volunteers were informed about the experiment and signed a consent form before participating. Table 1 provides information about the volunteers.

Table 1 Basic information of the volunteers

The experiment was approved by the Review Committee of the First Hospital of Nanjing Medical University and conducted in accordance with the Declaration of Helsinki (ethical review number 2021-SR-109). We designed and built a device to simulate the uphill condition, as shown in Fig. 3. The incline angle of the device was set at 30°. Each set of the experiment lasted 20 s, during which participants were asked to walk uphill in time with a metronome (at a speed of 3 km/h). As suggested in [30], all volunteers conducted 3–5 min of low-intensity exercise prior to the trial. During the experiment, the volunteers' sEMG signals were captured simultaneously with the knee motion coordinate data. The experiment was repeated 80 times for each volunteer. To reduce data fluctuations due to muscle fatigue, subjects rested for 3 min after every five trials.

Fig. 3 Experimental site

2.3 Signal Pre-processing

The sEMG signal is a weak electrical signal resulting from the integrated superposition, on the skin surface, of action potential sequences emitted by many motor units. During recording, the sEMG signal is susceptible to interference from other electromagnetic signals [32]. Therefore, the sEMG signal must be preprocessed before it is used for model prediction. Butterworth filters [33] and Chebyshev filters [34] are the most commonly used digital filters. According to [14], most of the EMG signal's frequency content lies between 10 and 500 Hz. Consequently, this paper utilizes a fourth-order Butterworth bandpass filter to filter the sEMG signal.
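
As an illustration, this filtering step can be implemented with SciPy roughly as follows; this is a minimal sketch assuming the 2000 Hz sampling rate and 10–500 Hz band stated above, and the function name is ours.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass_emg(emg, fs=2000.0, low=10.0, high=500.0, order=4):
    """Fourth-order Butterworth band-pass filter for one raw sEMG channel."""
    nyq = 0.5 * fs
    b, a = butter(order, [low / nyq, high / nyq], btype="band")
    # filtfilt runs the filter forward and backward, giving zero phase shift
    return filtfilt(b, a, emg)
```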

It is worth mentioning that the coordinates acquired in the experiment cannot be employed directly. To facilitate model prediction, these coordinates were converted into a continuously varying angle using a link vector model. Two pairs of neighboring points (1 → 2, 2 → 3) are linked as vectors using the spatial coordinates of the three markers (as shown in Fig. 4). The angle between the two vectors is then computed using Codamotion's built-in solver. In this way, the coordinates recorded in the experiment are translated into a continuously changing angle.

Fig. 4 The diagram for solving angles of motion

The formula for the angle calculation is shown below.

$$ \theta \, = \,\arccos \frac{{\vec{m} \cdot \vec{n}}}{{\left| {\vec{m}} \right|\left| {\vec{n}} \right|}}. $$
(1)
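
A minimal NumPy sketch of Eq. 1 is given below; the example marker coordinates and the clamping of the cosine (a guard against floating-point round-off) are our assumptions.

```python
import numpy as np

def knee_angle(p1, p2, p3):
    """Angle (deg) between vectors p1->p2 and p2->p3, per Eq. 1."""
    m, n = p2 - p1, p3 - p2
    cos_t = np.dot(m, n) / (np.linalg.norm(m) * np.linalg.norm(n))
    return np.degrees(np.arccos(np.clip(cos_t, -1.0, 1.0)))

# Example with three hypothetical 3-D marker positions (meters)
print(knee_angle(np.array([0.0, 0.0, 0.0]),
                 np.array([0.0, 0.0, 0.4]),
                 np.array([0.1, 0.0, 0.8])))
```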

Since the sampling rates of the sEMG signal and the knee angle signal differ, nearest-neighbor interpolation is applied after filtering so that each knee angle sample corresponds to a processed sEMG sample. In this way, 10 × 80 = 800 independent sets of data were obtained. From the relevant data sets, we picked the EMG and knee angle data corresponding to the beginning of the exercise to generate a new set indexed by subject number. The new dataset has 17 dimensions: \(x_1\)–\(x_{16}\) are the EMG data, and y is the knee angle. The data were standardized in accordance with the recommendations in [35] to shorten the learning time of the model and increase the accuracy of the predictions. The processed sEMG signals and knee angles are illustrated in Fig. 5.
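
The alignment and standardization can be sketched as follows; the array shapes, the function name, and the use of z-scoring are our assumptions (the paper cites [35] for the standardization scheme).

```python
import numpy as np

def align_and_standardize(emg, angle, fs_emg=2000, fs_angle=200):
    """Nearest-neighbour alignment of a 200 Hz angle track to the 2000 Hz
    EMG grid, followed by per-channel z-scoring of the EMG matrix."""
    t_emg = np.arange(emg.shape[0]) / fs_emg        # emg: (n_samples, 16)
    idx = np.clip(np.round(t_emg * fs_angle).astype(int), 0, len(angle) - 1)
    angle_aligned = angle[idx]                      # nearest-neighbour lookup
    emg_std = (emg - emg.mean(axis=0)) / emg.std(axis=0)
    return emg_std, angle_aligned
```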

Fig. 5 Data visualization of sEMG and joint angle

3 Data Preprocessing Algorithms

3.1 Traditional PCA Algorithm

PCA (Principal Component Analysis) is one of the most widely used data dimensionality reduction algorithms. The main idea of PCA is to map n-dimensional features onto k dimensions; these k new orthogonal features, called principal components, are reconstructed from the original n-dimensional features. PCA sequentially finds a set of mutually orthogonal axes in the original space, and the choice of new axes is determined by the data itself. The first axis is the direction with the largest variance in the original data; the second axis is the direction, orthogonal to the first, with the largest remaining variance; the third axis is the direction, orthogonal to the first two, with the largest remaining variance; and so on, until n such axes are obtained. The PCA calculation proceeds as follows.

First, the projections of all (zero-mean) samples \(x_i\) onto a normalized direction v are \(v^{T}x_{1}, \ldots ,v^{T}x_{N}\).

The variance of the projections is

$$ \begin{aligned} \sigma^{2} \, & = \,\frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left( {v^{T} x_{i} - 0} \right)^{2} \, = \,\frac{1}{N}\mathop \sum \limits_{i = 1}^{N} \left( {v^{T} x_{i} } \right)\left( {x_{i}^{T} v} \right) \\ & = \,v^{T} \left( {\frac{1}{N}\mathop \sum \limits_{i = 1}^{N} x_{i} x_{i}^{T} } \right)v\, = \,v^{T} Cv, \\ \end{aligned} $$
(2)

where:

$$ C\, = \,\frac{1}{N}\mathop \sum \limits_{i = 1}^{N} x_{i} x_{i}^{T} . $$
(3)

Then the first principal vector can be found by the following equation:

$$ v\, = \,\mathop {\arg \max }\limits_{{v \in R^{d} ,\left\| v \right\|\, = \,1}} v^{T} Cv. $$
(4)

This is equivalent to finding the eigenvector corresponding to the largest eigenvalue in the following eigenvalue problem:

$$ \left\{ {\begin{array}{*{20}c} {Cv\, = \,\lambda v} \\ {\left\| v \right\|\, = \,1} \\ \end{array} } \right.. $$
(5)

Note that:

$$ C\, = \,\frac{1}{N}\sum\nolimits_{i = 1}^{N} {x_{i} x_{i}^{T} \, = \,\frac{1}{N}\left[ {x_{1} , \ldots ,x_{N} } \right]} \left[ \begin{gathered} x_{1}^{T} \hfill \\ \vdots \hfill \\ x_{N}^{T} \hfill \\ \end{gathered} \right]. $$
(6)

If:

$$ X^{T} \, = \,\left[ {x_{1} ,...x_{N} } \right]. $$
(7)

Then:

$$ C\, = \,\frac{1}{N}X^{T} X. $$
(8)

Note that when each sample \(x_{i}\) is mapped into a high-dimensional feature space by a nonlinear function \(\varphi\), C can be written as:

$$C=\frac{1}{N}{\sum }_{i=1}^{N}\varphi \left({x}_{1}\right)\varphi {\left({x}_{i}\right)}^{T}=\frac{1}{N}\left[\varphi \left({x}_{1}\right),...\varphi \left({x}_{N}\right)\right]\left[\begin{array}{c}\varphi {\left({x}_{1}\right)}^{T}\\ \vdots \\ \varphi {\left({x}_{N}\right)}^{T}\end{array}\right].$$
(9)

If:

$${X}^{T}=\left[\varphi \left({x}_{1}\right),...\varphi \left({x}_{N}\right)\right].$$
(10)

Then:

$$C=\frac{1}{N}{X}^{T}X.$$
(11)
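
For reference, a minimal NumPy sketch of the linear PCA above (Eqs. 2–8) might look as follows; the function name and the assumption of zero-mean input are ours.

```python
import numpy as np

def pca(X, k):
    """Project zero-mean data X (N x d) onto its top-k principal axes."""
    N = X.shape[0]
    C = X.T @ X / N                        # covariance matrix, Eq. 8
    eigvals, eigvecs = np.linalg.eigh(C)   # eigh returns ascending eigenvalues
    V = eigvecs[:, np.argsort(eigvals)[::-1][:k]]  # top-k eigenvectors, Eq. 5
    return X @ V                           # projections v^T x_i
```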

3.2 Improved PCA Algorithm

Note that the kernel matrix can be computed by kernel function K:

$$ \begin{aligned} K & = XX^{T} = \left[ {\begin{array}{*{20}c} {\varphi \left( {x_{1} } \right)^{T} } \\ \vdots \\ {\varphi \left( {x_{N} } \right)^{T} } \\ \end{array} } \right]\left[ {\varphi \left( {x_{1} } \right), \ldots \varphi \left( {x_{N} } \right)} \right] \\ & = \left[ {\begin{array}{*{20}c} {\varphi \left( {x_{1} } \right)^{T} \varphi \left( {x_{1} } \right)} & \cdots & {\varphi \left( {x_{1} } \right)^{T} \varphi \left( {x_{N} } \right)} \\ \vdots & \ddots & \vdots \\ {\varphi \left( {x_{N} } \right)^{T} \varphi \left( {x_{1} } \right)} & \cdots & {\varphi \left( {x_{N} } \right)^{T} \varphi \left( {x_{N} } \right)} \\ \end{array} } \right] \\ & = \left[ {\begin{array}{*{20}c} {\kappa \left( {x_{1} ,x_{1} } \right)} & \cdots & {\kappa \left( {x_{1} ,x_{N} } \right)} \\ \vdots & \ddots & \vdots \\ {\kappa \left( {x_{N} ,x_{1} } \right)} & \cdots & {\kappa \left( {x_{N} ,x_{N} } \right)} \\ \end{array} } \right]. \\ \end{aligned} $$
(12)

Then we can use K to find the eigenvectors of \(X^{T}X\).

The eigenvalue problem of \(K=X{X}^{T}\) is:

$$ \left( {XX^{T} } \right)u\, = \,\lambda u $$
(13)
$$ X^{T} \left( {XX^{T} } \right)u\, = \,\lambda X^{T} u \Rightarrow \left( {X^{T} X} \right)\left( {X^{T} u} \right)\, = \,\lambda \left( {X^{T} u} \right). $$
(14)

This means that \(X^{T}u\) is an eigenvector of \(X^{T}X\).

The eigenvector v of \(X^{T}X\) can be computed from the eigenvectors of \(K=X{X}^{T}\):

$$ v\, = \,\frac{1}{{\left\| {X^{T} u} \right\|}}X^{T} u\, = \,\frac{1}{{\sqrt {u^{T} XX^{T} u} }}X^{T} u\, = \,\frac{1}{{\sqrt {u^{T} \left( {\lambda u} \right)} }}X^{T} u\, = \,\frac{1}{\sqrt \lambda }X^{T} u, $$
(15)

where \(\lambda \) is the corresponding eigenvalue of u.

Finally, the projection of the testing sample \(\varphi \left({x}^{^{\prime}}\right)\) can be computed by

$$ V^{T} \varphi \left( {x^{\prime}} \right)\, = \,\left( {\frac{1}{\sqrt \lambda }X^{T} u} \right)^{T} \varphi \left( {x^{\prime}} \right)\, = \,\frac{1}{\sqrt \lambda }u^{T} X\varphi \left( {x^{\prime}} \right)\, = \,\frac{1}{\sqrt \lambda }u^{T} \left[ {\begin{array}{*{20}c} {\varphi \left( {x_{1} } \right)^{T} } \\ \vdots \\ {\varphi \left( {x_{N} } \right)^{T} } \\ \end{array} } \right]\varphi \left( {x^{\prime}} \right)\, = \,\frac{1}{\sqrt \lambda }u^{T} \left[ {\begin{array}{*{20}c} {\kappa \left( {x_{1} ,x^{\prime}} \right)} \\ \vdots \\ {\kappa \left( {x_{N} ,x^{\prime}} \right)} \\ \end{array} } \right]. $$
(16)

By applying the kernel function in the low-dimensional space, the calculation obtains nearly the same result as working directly in the high-dimensional space, as the projection above shows. PCA is a linear transformation of the coordinate axes, meaning that each new basis vector remains a straight line after the transformation. The kernel-based PCA, however, performs a nonlinear transformation of the coordinate axes, and the new basis onto which the data are projected is no longer a straight line but a curve or surface (as shown in Fig. 6).

Fig. 6 Comparison of PCA and Improved-PCA

Evidently, the kernel-based PCA can separate the different data classes, whereas standard PCA merely projects them. This demonstrates the advantage of PCA based on the kernel method.
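
A compact NumPy sketch of the kernelized procedure (Eqs. 12–16) is shown below, using the Gaussian kernel the paper states it employs; the kernel width gamma and the function name are illustrative assumptions, not values reported in the paper.

```python
import numpy as np

def kernel_pca(X, n_components, gamma=0.1):
    """Gaussian-kernel PCA: returns (N, n_components) projections of X."""
    # Gaussian kernel matrix K, Eq. 12
    sq = np.sum(X**2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))

    # Centre K in feature space (the centering step used with kernel PCA)
    N = K.shape[0]
    one = np.full((N, N), 1.0 / N)
    K = K - one @ K - K @ one + one @ K @ one

    # Eigen-decomposition of K, Eqs. 13-14 (eigh: ascending eigenvalues)
    eigvals, eigvecs = np.linalg.eigh(K)
    idx = np.argsort(eigvals)[::-1][:n_components]
    lam, U = eigvals[idx], eigvecs[:, idx]

    # Projections of the training samples, Eqs. 15-16
    return K @ U / np.sqrt(lam)
```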

3.3 Fast Independent Component Analysis (FICA) Algorithm

Independent Component Analysis (ICA) refers to an analysis process that separates or approximates the source signals when only the mixed signal is known, with no knowledge of the source signals, the noise, or the mixing mechanism. The algorithm treats the observed signal as a linear combination of several statistically independent components, and what ICA performs is a demixing process.

Suppose that \(m_t\) is the EMG signal acquired in the experiment, which actually consists of the source signal \(n_t\) from the muscle and the noise \(v_t\) from the other sensors during the acquisition. The signal can be approximated as a linear mixed system, expressed by the following equation.

$$ m_{t} \, = \,A\, \times \,n_{t} \, + \,v_{t} . $$
(17)

The purpose of the ICA algorithm is to separate the source signal \(n_t\) from the above equation and, by calculation, obtain a signal \(y_t\) that is similar to the original signal.

In recent years, a fast ICA algorithm (FICA) based on a fixed-point recursive scheme has emerged that works for any type of data. It was proposed by Hyvärinen et al. at the University of Helsinki, Finland. The FICA algorithm is obtained by optimizing the contrast function of the traditional ICA algorithm with a fixed-point iterative formula, which makes the convergence faster and more robust.
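
In practice the demixing can be performed with scikit-learn's FastICA, as in the sketch below; the number of components is an illustrative choice, not a value reported in the paper.

```python
from sklearn.decomposition import FastICA

# emg: (n_samples, 16) matrix of filtered sEMG channels
ica = FastICA(n_components=8, whiten="unit-variance", random_state=0)
sources = ica.fit_transform(emg)   # estimated independent sources (y_t)
mixing = ica.mixing_               # estimate of the mixing matrix A in Eq. 17
```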

4 Prediction Algorithms

4.1 Feature Extraction Based on Convolutional Neural Networks

Over recent years, numerous fields have adopted deep learning techniques. CNN is a feedforward neural network with convolutional operations and a deep structure, commonly employed in image processing and natural language processing. Like basic neural networks, convolutional neural networks [10] are biologically inspired feedforward artificial neural networks. Each hidden CNN stage consists of a convolutional layer and a pooling layer, and the last layer of a CNN is usually a fully connected layer used for classification.

Figure 7 shows the overall architecture of the CNN, which consists of three types of layers: convolutional, max-pooling, and classification. Even-numbered layers perform convolution, while odd-numbered layers perform max pooling. The output nodes of the convolution and max-pooling layers form feature maps, each of which is a 2D plane.

Fig. 7 The overall framework of CNN

4.1.1 Convolutional Layer

Convolutional layers exploit three important ideas that can help improve machine learning systems: sparse interaction, parameter sharing, and equivariant representation. The convolutional layer of a convolutional neural network operates by applying convolution to the input data. The two-dimensional discrete convolution operation is shown in Eq. 18; by the commutativity of convolution, it can equivalently be written as Eq. 19.

$$ S\left[ {n_{1} ,n_{2} } \right]\, = \,\sum\nolimits_{{m_{1} \, = \,1}}^{{M_{1} }} {\sum\nolimits_{{m_{2} \, = \,1}}^{{M_{2} }} {x\left[ {m_{1} ,m_{2} } \right]w\left[ {n_{1} - m_{1} ,n_{2} - m_{2} } \right]} } , $$
(18)
$$ S\left[ {n_{1} ,n_{2} } \right]\, = \,\sum\nolimits_{{m_{1} \, = \,1}}^{{M_{1} }} {\sum\nolimits_{{m_{2} \, = \,1}}^{{M_{2} }} {x\left[ {n_{1} - m_{1} ,n_{2} - m_{2} } \right]w\left[ {m_{1} ,m_{2} } \right]} } . $$
(19)

4.1.2 Pooling Layer

Pooling functions replace the output of a layer with summary statistics of nearby outputs from the previous layer. This layer in the architecture accelerates training and classification. \(x_n\) is a vector holding the pooled data of the dataset. The pooling function is represented in the following equations.

$$ x_{n} \, = \,\left\{ {x_{j} ,...,x_{N} } \right\}. $$
(20)
$$ \widehat{{x_{n} }}\, = \,f\left( {x_{n} ,x_{n + 1} ,x_{n + 2} } \right)\, = \,f\left( {x_{n} } \right). $$
(21)

The pooling function may return the greatest value in the set, the average value, a parametric norm, or a weighted mean of the pool. Max pooling is used here, as given in Eq. 22.

$$ f\left( {x_{n} } \right)\, = \,\max (x_{n} ). $$
(22)

4.2 Serial Regression Based on Long Short-Term Memory

In recent years, LSTM has been widely used in speech recognition, sentiment analysis, text analysis, and other fields. The LSTM forms the lower layer of the model proposed in this paper and stores the temporal information of the important attributes of the EMG signal. The memory channel and gate mechanism (the forget gate, input gate, update gate, and output gate) are shown in Fig. 8.

Fig. 8 LSTM cell

The cell state (\(C_{t-1} \to C_t\)) is the foundation of the LSTM design. It holds the hidden state information for the current time, comprising both the hidden state from the preceding time step and the temporary hidden state of the current time step. In addition, the LSTM includes a special “gate” structure for removing information from, or adding information to, the cell state.

4.3 Forget Gate

The first step in an LSTM is to decide which information to discard from the cell state. This decision is made by the forget gate layer. The forget gate reads \(h_{t-1}\) and \(x_t\) and outputs a number between 0 and 1 for each entry of the cell state, where 1 indicates “keep completely” and 0 indicates “discard completely.”

$$ f_{t} \, = \,\sigma \left( {W_{f} .\left[ {h_{t - 1} ,x_{t} } \right]\, + \,b_{f} } \right). $$
(23)

4.4 Input Gate

Next, the input gate determines which new information may be added to the cell state. A sigmoid layer decides which values to update (Eq. 24), and a tanh layer generates a vector of candidate values \(\tilde{C}_t\) (Eq. 25). These two components are then combined to update the cell state.

$$ i_{t} \, = \,\sigma \left( {W_{i} .\left[ {h_{t - 1} ,x_{t} } \right]\, + \,b_{i} } \right) .$$
(24)
$$ \tilde{C}_{t} \, = \,\tanh (W_{c} .\left[ {h_{t - 1} ,x_{t} } \right]\, + \,b_{c} ) .$$
(25)

4.5 Update Gate

The function of the update gate is to transform the old cell state (\(C_{t-1}\)) into the new cell state (\(C_t\)). The forget gate first erases a portion of the old cell information; the input gate then selects a portion of the candidate cell information \(\tilde{C}_t\) to merge in, producing the new cell state \(C_t\).

$$ C_{t} \, = \,f_{t} *C_{t - 1} \, + \,i_{t} *\tilde{C}_{t} . $$
(26)

4.6 Output Gate

After the cell state has been updated, the output is determined from the inputs \(h_{t-1}\) and \(x_t\). The cell state is passed through a tanh layer to obtain a vector of values in [−1, 1], which is multiplied by the output gate's activation to produce the cell's output.

$${o}_{t}=\sigma \left({W}_{o}\left[{h}_{t-1},{x}_{t}\right]+{b}_{o}\right),$$
(27)
$${h}_{t}={o}_{t}*\mathit{tan}h({C}_{t}),$$
(28)

where \(h_t\) is the output vector of the memory cell at time t (as Fig. 8 shows), \(W_{f,i,c,o}\) are the weight matrices, and \(b_{f,i,c,o}\) are the bias vectors.
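
A direct NumPy transcription of Eqs. 23–28 is sketched below to make the data flow concrete; the dict-based weight layout and the function names are our assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following Eqs. 23-28; W/b hold the four gates'
    weight matrices and bias vectors under keys 'f', 'i', 'c', 'o'."""
    z = np.concatenate([h_prev, x_t])      # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])     # forget gate, Eq. 23
    i_t = sigmoid(W["i"] @ z + b["i"])     # input gate, Eq. 24
    c_hat = np.tanh(W["c"] @ z + b["c"])   # candidate state, Eq. 25
    c_t = f_t * c_prev + i_t * c_hat       # cell state update, Eq. 26
    o_t = sigmoid(W["o"] @ z + b["o"])     # output gate, Eq. 27
    h_t = o_t * np.tanh(c_t)               # hidden output, Eq. 28
    return h_t, c_t
```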

4.7 CNN-LSTM Training Model

In this study, by integrating CNN and LSTM, we propose a new deep learning scheme in which the feature sequences from the CNN layers serve as the input to the LSTM. The CNN-LSTM structure proposed in this paper is shown in Fig. 9. It consists of a CNN block, an LSTM block, and fully connected (FC) layers. The CNN block receives and processes the sEMG signals from 16 different locations on the human lower limbs. The dataset is divided into two parts: 80% for training the model and 20% for validating the results. The network comprises an input layer (taking the sensor variables), an output layer (passing extracted features to the LSTM), and several hidden layers, which include convolutional, ReLU activation, and pooling layers. Based on this structure, the network learns a weight for each input to produce a specific output. In the hybrid model, the CNN block consists of five convolutional layers and five pooling layers, with 64, 128, 256, 128, and 64 convolutional kernels, respectively. The LSTM block contains five hidden layers with 64, 128, 256, 128, and 64 neurons, respectively. The final layers of the model are two fully connected layers with 4096 and 2048 neurons, respectively.

Fig. 9 The structure of CNN-LSTM
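
Under the layer sizes listed above, the network could be assembled in Keras roughly as follows; the input window length, kernel size, pooling width, and optimizer are assumptions not specified in the text.

```python
from tensorflow.keras import layers, models

def build_cnn_lstm(window=200, n_channels=16):
    """Sketch of the hybrid CNN-LSTM regressor described in the text."""
    model = models.Sequential([layers.Input(shape=(window, n_channels))])
    # Five Conv1D + max-pooling stages with 64/128/256/128/64 kernels
    for filters in (64, 128, 256, 128, 64):
        model.add(layers.Conv1D(filters, 3, padding="same", activation="relu"))
        model.add(layers.MaxPooling1D(2))
    # Five stacked LSTM layers with 64/128/256/128/64 units
    for i, units in enumerate((64, 128, 256, 128, 64)):
        model.add(layers.LSTM(units, return_sequences=(i < 4)))
    # Two fully connected layers, then one output: the predicted knee angle
    model.add(layers.Dense(4096, activation="relu"))
    model.add(layers.Dense(2048, activation="relu"))
    model.add(layers.Dense(1))
    model.compile(optimizer="adam", loss="mse")
    return model
```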

Sensitive deep learning parameters are tuned to maximize accuracy and optimize training time. The model is trained with 150 data points per batch, a maximum of 120 epochs, and 15 iterations per epoch. After all architectures have been trained, a confusion matrix is used to evaluate the network's performance, allowing accuracy and recall to be calculated for each category.

4.8 Experimental Evaluation

To evaluate the model, a set of evaluation measures was chosen. Following [36], the root mean square error (RMSE) and the Pearson correlation coefficient were selected to assess the accuracy of the predictions. The mean squared error (MSE) of an estimate is the expected value of the squared difference between the estimated and actual values, and the RMSE is the square root of the MSE. These metrics are defined as follows:

$$MSE=\frac{1}{n}\sum_{i=1}^{n}{\left({Y}_{t}-{\widehat{Y}}_{p}\right)}^{2},$$
(29)
$$RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left({Y}_{t}-{\widehat{Y}}_{p}\right)}^{2}}.$$
(30)

The Pearson correlation, also known as the product-moment correlation, is a measure of linear correlation developed by the British statistician Karl Pearson. The Pearson correlation coefficient is frequently used to analyze data that follow a linear relationship or a normal distribution. It is calculated as in Eq. 31.

$${\rho }_{{Y}_{t},{Y}_{p}}=\frac{\sum ({Y}_{t}-\overline{{Y}_{t}})({Y}_{p}-\overline{{Y}_{p}})}{\sqrt{\sum {({Y}_{t}-\overline{{Y}_{t}})}^{2}}\sqrt{\sum {({Y}_{p}-\overline{{Y}_{p}})}^{2}}},$$
(31)

where \(Y_t\) is the real knee angle value, \(\widehat{{Y}_{p}}\) is the predicted value, and n is the number of samples.
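
Both metrics reduce to a few lines of NumPy, as in this sketch (the function name is ours):

```python
import numpy as np

def evaluate(y_true, y_pred):
    """RMSE (Eq. 30) and Pearson correlation coefficient (Eq. 31)."""
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    rho = np.corrcoef(y_true, y_pred)[0, 1]
    return rmse, rho
```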

5 Experimental Results and Analysis

For this section, a new dataset was composed of 16 different EMG signals and the corresponding knee angles, collected from 10 different individuals (five males, five females). The data were then preprocessed with each of the three methods (FICA, PCA, and improved PCA) and imported separately into the previously built CNN-LSTM model for prediction. Each set of data was trained and predicted 200 times, and the model automatically recorded the training error for each run. Finally, the effects of the various data processing methods on prediction accuracy and elapsed time are compared, along with the model training times and Pearson correlation coefficients in the different scenarios. To demonstrate the superiority of the prediction model proposed in this paper, other prediction models are used for comparison. The findings are presented as mean ± standard deviation to highlight the algorithms' consistent performance.

5.1 Analysis 1: Principal Component Contribution Rates Under Different Dimensions

To select the best input dimensionality, we downscaled the sEMG signals of the 16 channels. All dimensions were tried, and their principal component contribution rates were recorded. Figure 10 shows the difference in principal component contribution rates between the improved PCA algorithm and the traditional PCA algorithm for the same dimensions. As shown in Fig. 11, the principal component contribution rate gradually increases from low to high dimensions. The increase in principal component contribution is noticeably greater over dimensions 2–8 than over dimensions 9–16, and as the dimensionality rises further, the increase tends to level off. In terms of principal component extraction, the improved PCA algorithm is clearly superior to the standard PCA method (most evident in dimensions 2–8).

Fig. 10 Contribution of principal components under different dimensions

Fig. 11 Difference of principal component contributions under the same dimension

The improved PCA approach presented in this study performs a nonlinear mapping of the sEMG signal using a Gaussian kernel function for nonlinear dimensionality reduction, followed by a centering step. Unlike in the conventional PCA method, the resulting feature vectors are not the projection axes but the projected coordinates. For linearly inseparable datasets, the improved PCA method maps them to higher dimensions, where they can be separated. The results demonstrate that this method extracts the principal components of sEMG signals in a nonlinear state more efficiently.

5.2 Analysis 2: Comparison of Prediction Accuracy Under Different Dimensions

The downscaled data were then classified and imported into the CNN-LSTM model for prediction, yielding the prediction errors and Pearson correlation coefficients for the 10 experimental subjects. The predicted data were then collated and compared. Figure 12 shows that the prediction accuracy gradually improves as the dimensionality increases (for FICA, PCA, and improved PCA alike). The improved PCA method consistently has the smallest prediction error, followed closely by the PCA method and finally the FICA method.

Fig. 12 Comparison of prediction errors in different dimensions

Figure 13 compares the Pearson correlation coefficients in different dimensions. As the dimensionality increases, the Pearson correlation coefficient also increases, and the improved PCA algorithm achieves the highest values. In practical applications of sEMG in the field of assisted exoskeletons, errors above 5° are unacceptable for achieving soft control; excessive prediction errors may render control ineffective and interfere with the task. From this perspective, the data processed by the improved PCA show the most stable performance among the various algorithms.

Fig. 13 Comparison of Pearson's correlation coefficients under different dimensions

5.3 Analysis 3: Comparison of Prediction Accuracy Under Different Data Processing Methods

Figure 14 visually compares the prediction results under each pre-processing method, using the data from two volunteers (one male, one female). Since the males' stride length was relatively larger than the females', the knee angles collected for the female subjects were smaller than those for the male subjects. The figure shows that the predictions from the data processed by the improved PCA algorithm are closest to the experimentally collected data, followed by the data processed by the PCA algorithm and finally by FICA.

Fig. 14 Prediction results under different methods

5.4 Analysis 4: Comparison of Training and Prediction Time Under Different Dimensions

Next, the elapsed time of each algorithm (both training and prediction time) was compared under different dimensions. All data were processed in Python on a personal server (3.30 GHz Intel Xeon CPU and NVIDIA RTX 2080 Ti GPU). Figure 15 illustrates each method's average training and test run times. As the dimensionality rises, the training time of the algorithm gradually increases. As with the principal component contributions above, the increase in training time is larger for dimensions 2–8 than for dimensions 9–16. Notably, once the training phase is complete, the prediction times are practically identical and very short (as shown in Fig. 16). Such latencies are tolerable for the majority of real-time control applications. With the rapid advancement of graphics processing unit (GPU) technology, the method is expected to become even more effective.

Fig. 15 Comparison of training time under different dimensions

Fig. 16 Comparison of prediction times under different dimensions

5.5 Analysis 5: Comparison with Other Models

Finally, the performance of the CNN-LSTM model is compared with that of other algorithms in terms of prediction results. All data were processed in Python on a personal server (3.30 GHz Intel Xeon CPU and NVIDIA RTX 2080 Ti GPU). Table 2 shows the comparison results. The CNN-LSTM model combined with the improved PCA algorithm clearly performs best, both in prediction/training time and in prediction accuracy. Moreover, traditional PCA still holds an advantage over predictions made directly from the filtered raw data.

Table 2 Average accuracy of the model for 20 runs

6 Discussion

By comparing the prediction accuracy, computational time consumption, and correlation coefficients under different combinations of algorithms, a new method (CNN-LSTM + improved PCA) for predicting knee angles from EMG signals is proposed in this paper. Given the EMG signals collected in the training experiments, the model can efficiently predict the corresponding knee angles from the input sEMG signals. The prediction accuracy under different conditions was analyzed and compared, and the new method proposed in this paper outperforms other machine learning-based methods, yielding an accuracy of about 98.5%. The excellent performance of the proposed method can be attributed to two factors: the combination of CNN and LSTM, which extracts more practical features, and the special interaction mechanism, which increases the diversity of features. Comparing the model parameters (Pearson correlation coefficient, RMSE, training time, and prediction time) on the same data shows that the new method (CNN + LSTM) described in this study has advantages over existing machine learning methods, which permits the creation of high-precision assistive devices. This study assesses the impact of signal pre-processing techniques on prediction accuracy and explores the relationship between the sEMG signal and the human joint angle, with the aim of designing a high-precision EMG controller for exoskeletons in future research. This facilitates the use of sEMG signals to control exoskeletons and provide intelligent support.

7 Conclusion

This work investigates the effect of pre-processing sEMG signals on prediction results and lays the foundation for constructing an accurate and responsive exoskeleton robot controller. A new method combining CNN-LSTM with an improved PCA algorithm is proposed to predict the knee joint angle, and experimental data from 10 individuals were collected to demonstrate the method's superiority. The results show that the improved PCA method extracts the principal components from the data more efficiently and helps the model converge faster than the traditional PCA method. Comparing the prediction results across various cases shows that improved PCA with CNN-LSTM produces the best results while maintaining computational efficiency, and that the combination of CNN and LSTM achieves better predictions than other existing models. This work will be extended in the future. To develop more flexible and efficient exoskeleton devices, studying the ankle and hip joint angles during the task is equally important. In addition, multi-source fusion of force signals, motion signals (IMU), and sEMG signals to predict human motion should be investigated. Moreover, owing to the complexity of human musculoskeletal models, selecting the optimal locations for signal acquisition remains a topic of research interest.