Introduction

Surface electromyogram (sEMG) is a myoelectric signal recorded from the surface of skeletal muscles, and it indicates the functional state of the muscle fibres [25, 53]. It is a complex, non-stationary signal with a low signal-to-noise ratio (SNR). Although the underlying mechanism of sEMG is complex and depends on a number of factors, the signal has been used in applications ranging from rehabilitation to sports medicine. Some of these applications include:

  • Rehabilitation, for example assist systems for the disabled [76, 77]

  • Human computer control, for example control of pointing devices [108]

  • Robotic and prosthetic hands [74, 90]

  • Clinical applications, for example assessment of muscle fatigue [73] and low back pain [96].

The ability to interpret sEMG signals accurately would enable the control of neuro-electrically interfaced systems. The design of such systems requires the following two important factors to be considered, as proposed by [28]:

  1. Features of sEMG that can be related to different muscles and muscle activity, and

  2. A classification paradigm for these features to identify these actions.

Rehabilitation, clinical diagnosis and basic investigations depend critically on the ability to record and analyze physiological signals such as electrocardiography (ECG), electroencephalography (EEG) and EMG. However, the traditional analyses of these signals have not kept pace with major advances in technology that allow the recording and storage of massive data sets of continuously fluctuating signals. Although these typically complex signals have recently been shown to represent processes that are non-linear, non-stationary and non-equilibrium in nature, the methods used to analyze them often assume linearity, stationarity and equilibrium-like conditions. Such conventional techniques include analysis of means, standard deviations and other features of histograms, along with classical power spectrum analysis [34].

Recent findings [17, 57, 68] show that sEMG signals may contain hidden information that cannot be extracted with conventional methods of analysis. Such hidden information promises to be of clinical value and to relate to the basic mechanisms of muscle properties and activity. Fractal theory based analysis is one of the most promising new approaches for extracting such hidden information from physiological time series such as sEMG; it can provide information about the characteristic temporal scales and the adaptability of the muscle activity response [13, 14, 32, 34]. A number of researchers have identified a strong relationship between the magnitude and spectral features of sEMG and the force of muscle contraction [25]. Various analogous measures such as

  • root mean square (RMS) [12],

  • windowed integration and zero-crossing count [12, 23],

  • auto-regression [9, 58], and

  • wavelet coefficients [61, 98]

have been used to classify the signal against the desired movement and/or posture.

These features are easy to implement and are a good measure of the strength of muscle activity when a single muscle is active at a high level. However, these measures are not reliable when the muscle activity is very subtle or when multiple muscles are simultaneously active. An alternative to global parameters such as RMS is to decompose sEMG and identify the action potentials [53, 57, 103]. The shortcomings of such techniques are that they require a high level of manual supervision and are highly sensitive to the location of the electrodes. A number of possible rehabilitation and defence applications of sEMG are currently infeasible because there are no reliable features of sEMG that can be related to low-level muscle contraction without manual supervision.

The classification of these features has been achieved using a range of parametric and non-parametric techniques, including Bayesian statistical classifiers, neural networks [18, 62] and a predictive approach [19]. Some recent research on the classification of hand movements is summarised below:

  • Nagata et al. presented a method for classifying hand movements using a 96-channel matrix-type (16 × 6) multi-channel surface EMG recording [76].

  • Crawford et al. proposed the classification of electromyographic signals for robotic control using the amplitude of five-channel EMG as features and support vector machines as classifiers [21].

  • Englehart et al. used pattern recognition to process four channels of myoelectric signals (MES), with the task of discriminating multiple classes of limb movement [28].

  • Momen et al. used the RMS of two-channel EMG as features, which were segmented using fuzzy C-means clustering [74].

  • Tenore et al. [107] have reported an accurate hand and finger gesture identification system using 16 bipolar surface-EMG electrodes placed on the forearm. This system overcomes the shortcoming of crosstalk by using an array of electrodes. However, a large array of electrodes and a large number of channels add to the complexity and cost of the system and thus limit its application. Further, mounting an array of electrodes may not be acceptable to users, may require the help of an expert and would not be easy for a lay person.

The features used in these techniques are a good indicator of high-level muscle activation. However, at low levels of muscle contraction these measures are not reliable in distinguishing muscle activation from the background activity, and a better classifier is required to separate the classes of movements. To obtain a reliable measure of low-level muscle activity, there is a need to extract a feature set from sEMG that captures the complex properties of the muscle during subtle activity.

Most methods used to model and analyse sEMG are linear. However, more complex activity, such as sEMG recorded during small and complex maintained hand actions, cannot be modelled by such linear techniques. With the need to identify complex and subtle actions and gestures, nonlinear methods are emerging to characterise sEMG. The following three new approaches have been proposed in [89] for the characterisation of sEMG:

  1. Methods that characterise the sEMG spectral distribution, i.e. logarithmic representation of the sEMG spectrum,

  2. Poisson representation of the sEMG spectrum, and

  3. Methods that examine the ‘complexity’ of raw sEMG, i.e. the fractal dimension (FD) of sEMG.

Of these approaches, the FD of sEMG has been found to be sensitive to the magnitude and change of force, because sEMG is self-similar over a range of scales and the statistical properties of a part (structure of MU) are proportional to those of the whole [33, 37].

Biosignals such as sEMG are a result of the summation of identical motor units that travel through tissues and undergo spectral and magnitude compression. The burst-within-burst behaviour of sEMG in time has the property that patterns observed at one sampling rate are statistically similar to patterns observed at lower sampling rates. These nested patterns suggest that sEMG has self-similarity [4]. Researchers have studied the fractal properties of sEMG to characterise normal and pathological signals [2], and fractal properties have been proposed to better represent the properties of the sEMG signal [4, 43, 115].

In sEMG recordings, multiple sensors are used to record a physiological phenomenon. Often these sensors are located close to each other, so that they simultaneously record signals that are highly correlated with each other. Therefore, the sensors record not only the muscle activity transmitted by volume conduction from a few dynamic muscles but also artefactual signals, such as noise independent of muscle activity, that overlap with the actual muscle activity and may be present in all sensors. Extracting the useful information from this kind of sEMG becomes more difficult at low levels of contraction, mainly because of the low signal-to-noise ratio. At low levels of contraction, sEMG activity is hardly discernible from the background activity. Therefore, to correctly identify the individual muscles (sources), sEMG needs to be decomposed. There is little or no prior information about the muscle activity, and the signals overlap temporally and spectrally, making the problem suitable for blind source separation (BSS) techniques.

This literature review is based on the study of ICA, fractal theory and the use of fractal analysis of sEMG to determine the complex properties of the muscle during subtle activity. This paper provides an overall review of the research that has been conducted on the analysis of sEMG for the measurement and characterisation of human hand muscle movements. The review in this section covers two major research areas:

  • Use of sEMG signals in identification of subtle human hand movement, and

  • Fractal theory based measurement analysis of sEMG for identification of subtle human hand movements.

Surface electromyography

Surface EMG is the recording of the muscle’s electrical activity from the surface of the skin. In clinical applications, sEMG is used for the diagnosis of neuromuscular disorders and for rehabilitation. It is also used for device control, where the signal controls devices such as prostheses, robots and human–machine interfaces. The advantage of sEMG is that its non-invasive recording technique provides a safe and easy recording method. The underlying mechanism of sEMG is very complex [35] because a number of factors, such as neuron discharge rates, motor unit recruitment and the anatomy of the muscles and surrounding tissues, contribute to the recording. Surface EMG is a quick and easy process that facilitates sampling of a large number of MUAPs [12, 27]. Surface EMG is also used as a diagnostic tool for identifying neuromuscular diseases and for assessing low back pain, kinesiology and disorders of motor control. Beyond medical applications, sEMG has been proposed for the control of computer interfaces. It can also be used to sense isometric muscular activity, where no movement is produced. This enables the definition of a class of subtle motionless gestures that control interfaces without being noticed and without disrupting the surrounding environment.

Factors that influence sEMG

The action potentials recorded in sEMG signals are generated by the electrical activity in the muscle. The signal contains information related to muscle contraction and condition. Therefore, it is useful to analyse the signal to reveal this information without invasive intervention in the muscle. The information contained in the sEMG signal is related to the following factors that influence the signal.

  • Level of contraction: The level of contraction affects the magnitude of the recorded sEMG [20]. The magnitude of sEMG increases with the level of contraction because more motor units are involved in the contraction.

  • Localised muscle fatigue: Localised muscle fatigue can be observed from the shift of the median frequency of the signal towards lower frequencies and the increase in the signal’s magnitude [12, 20, 61]. This is due to the synchronisation of the stimulation of different motor units and the variation in the electrical properties of the muscle fibres.

  • The thickness of body tissue: The body tissue tends to attenuate the high-frequency components of the signal. The thicker the body tissue, the lower the frequency and amplitude of the signal. The sEMG signals recorded from facial muscles have frequencies of up to 500 Hz, while sEMG recorded from deep muscles has a lower frequency range.

  • The inter-electrode distance: The electrode size and inter-electrode distance also have a known effect on the signal. If the distance between the electrodes increases, the recording covers a wider area. As a result, the recorded signal consists of a larger number of action potentials, which lowers the frequency and increases the amplitude of the signal.

  • The artefacts and noises: The properties of some of the noises and artefacts are predictable. Power-line interference appears sharply at 50 Hz, while ECG artefacts appear at frequencies of up to 60 Hz [20]. Although the frequency content of the power-line and ECG components is well predicted, they are not easily removed because their frequencies overlap with the sEMG spectrum.

  • Crosstalk muscle signals: Crosstalk is the signal detected over a muscle but generated by another muscle close to the first one. The phenomenon is present exclusively in surface recordings, when the distance of the detection points from the sources may be relevant and similar for the different sources [31].
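The median-frequency shift mentioned under localised muscle fatigue can be made concrete with a short sketch. The code below is only an illustration under stated assumptions: it uses a naive discrete Fourier transform in plain Python, a hypothetical sampling rate of 1000 Hz, and two synthetic sinusoids (120 Hz and 80 Hz) standing in for 'fresh' and 'fatigued' recordings. None of these values come from the studies cited above.

```python
import cmath
import math

def power_spectrum(signal, fs):
    """One-sided power spectrum via a naive DFT (fine for short windows)."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [v - mean for v in signal]  # remove the DC offset first
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        coeff = sum(centred[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power

def median_frequency(signal, fs):
    """Frequency that splits the total spectral power into two equal halves."""
    freqs, power = power_spectrum(signal, fs)
    half = sum(power) / 2.0
    running = 0.0
    for f, p in zip(freqs, power):
        running += p
        if running >= half:
            return f
    return freqs[-1]

fs = 1000  # assumed sampling rate in Hz
n = 200    # 0.2 s window, giving 5 Hz frequency resolution
# Synthetic stand-ins for real recordings (illustrative only)
fresh = [math.sin(2 * math.pi * 120 * t / fs) for t in range(n)]
fatigued = [math.sin(2 * math.pi * 80 * t / fs) for t in range(n)]

mdf_fresh = median_frequency(fresh, fs)
mdf_fatigued = median_frequency(fatigued, fs)
print(mdf_fresh, mdf_fatigued)
```

With real sEMG the spectrum is broadband, so the median frequency would normally be tracked over successive windows rather than compared between two single windows.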

sEMG signal analysis techniques

The sEMG signal is a time- and force-dependent (and possibly other-parameter-dependent) signal whose amplitude varies randomly above and below zero. Thus, simple averaging of the signal will not provide any useful information. Some of the measures of sEMG are explained below [22]:

  • Rectification: A simple method that is commonly used to overcome the above restriction is to rectify the signal before performing more pertinent analysis.

  • Averages or means of rectified signals: The equivalent operation to smoothing in a digital sense is averaging. By taking the average of randomly varying values of a signal, the larger fluctuations are removed, thus achieving the same result as the analog smoothing operation.

  • Integration: The most commonly used and abused data reduction procedure in electromyography (EMG) is integration. It refers to a calculation that obtains the area under a signal or a curve. The units of this parameter are volt seconds (Vs). An observed sEMG signal with an average value of zero will also have a total area (integrated value) of zero. Therefore, the concept of integration may be applied only to the rectified sEMG signal.

  • RMS value: Mathematical derivations of the time- and force-dependent parameters indicate that the RMS value provides a more rigorous measure of the information content of the signal because it measures the energy of the signal. The recent increase in its use is possibly due to the availability of analog chips that perform the RMS operation and to increased technical competence in sEMG.

  • Zero crossings and turns counting: This method consists of counting the number of times per unit time that the amplitude of the signal either contains a peak or crosses the zero value.

  • Frequency domain analysis: Analysis of the sEMG signal in the frequency domain involves measurements and parameters that describe specific aspects of the frequency spectrum of the signal. Fast Fourier transform techniques are commonly available and are convenient for obtaining the power density spectrum of the signal.
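The time-domain measures listed above can be sketched in a few lines. The following is a minimal illustration rather than a reference implementation: the function names, the rectangle-rule integration and the 50 Hz test tone are assumptions made for the example.

```python
import math

def rectify(signal):
    """Full-wave rectification: absolute value of every sample."""
    return [abs(v) for v in signal]

def mean_rectified(signal):
    """Average of the rectified signal (digital counterpart of smoothing)."""
    return sum(rectify(signal)) / len(signal)

def integrated_emg(signal, fs):
    """Area under the rectified signal in volt seconds (rectangle rule)."""
    return sum(rectify(signal)) / fs

def rms(signal):
    """Root mean square value, related to the energy of the signal."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))

def zero_crossings(signal):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(signal, signal[1:]) if a * b < 0)

def turns(signal):
    """Number of local peaks and troughs (changes in slope sign)."""
    return sum(1 for a, b, c in zip(signal, signal[1:], signal[2:])
               if (b - a) * (c - b) < 0)

fs = 1000  # assumed sampling rate in Hz
# Illustrative zero-mean test tone, not a real recording
tone = [0.5 * math.sin(2 * math.pi * 50 * t / fs + 0.3) for t in range(fs)]
raw_mean = sum(tone) / len(tone)  # close to zero, hence uninformative
print(raw_mean, mean_rectified(tone), integrated_emg(tone, fs), rms(tone))
print(zero_crossings(tone), turns(tone))
```

The raw mean is close to zero while the rectified mean, integral and RMS are not, which is the reason rectification precedes averaging and integration.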

These various measures are used to extract some meaningful information from sEMG for various applications. Currently, there are three common applications of sEMG [22]. They are:

  • To determine the activation timing of the muscle; that is, when the excitation to the muscle begins and ends

  • To estimate the force produced by the muscle.

  • To obtain an index of the rate at which a muscle fatigues through the analysis of the frequency spectrum of the signal.

This information from sEMG is used as a control input to activate or control various devices. To obtain this information from forearm sEMG, the anatomical and physiological properties of low-level muscle activation need to be studied. Muscle activation is at a low level when there is little movement in the corresponding muscle group. When the strength of muscle contraction is small, there is little overlap of the MUAPs, for example in simple wrist and finger flexion movements. This results in small changes in the recorded sEMG, which in turn require different measures to identify them.

The main factor that influences these small changes in sEMG is crosstalk between muscles. This is due to the volume conduction properties in combination with the source properties, and it is one of the most important sources of error in interpreting sEMG signals. The problem is particularly relevant in cases where the timing of activation of different muscles is of importance, such as in movement analysis [31]. The aim is to interpret these small changes in sEMG during finger and wrist movements, which has many applications in prostheses and human computer interfaces.

ICA

Independent component analysis (ICA) is a data analysis procedure that attempts to estimate unobserved signals or ‘sources’ from observed mixtures. ICA has been used quite successfully in a variety of areas; in biosignals such as EEG and sEMG, it has been used for signal artifact reduction and source separation. The concepts behind ICA are illustrated here.

Let \(x_{1},\,x_{2},\,x_{3},\,\ldots,\,x_{n}\) be a set of n observed random variables expressed as linear combinations of another n random variables \(s_{1},\,s_{2},\,s_{3},\,\ldots,\,s_{n}\), that is

$$ x_{i} = a_{i1}s_{1}+a_{i2}s_{2}+ \ldots + a_{in}s_{n} = \sum_{j=1}^{n} a_{ij}s_{j}, $$
(1)

where \(i=1,\,\ldots,\,n\) and the \(a_{ij}\) are the mixing coefficients. The \(s_{i}\) are assumed to be statistically mutually independent. Let x and s represent the random vectors containing the mixtures \(x_{1},\,x_{2},\,x_{3},\,\ldots,\,x_{n}\) and the sources \(s_{1},\,s_{2},\,s_{3},\,\ldots,\,s_{n}\), respectively, and let A denote the matrix with entries \(A_{ij} = a_{ij}\). The mixing representation above can then be expressed in simplified form as

$$ x = As $$
(2)

In terms of the equation presented above, the task of ICA consists of finding s from a given x by identifying an appropriate choice of the elements of the matrix A. In ICA the objective is to find an n × n invertible square matrix W such that

$$ \hat{s} = Wx $$
(3)

where the components of \(\hat{s}\) are as independent as possible. It is well known that the ICA solution is determined only up to the intrinsic indeterminacies of permutation and scaling. Despite the success of standard ICA in many applications, its basic assumptions may not hold in situations where there is dependency among the signal sources. The ICA source separation process is shown in Fig. 1.
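The mixing model of Eq. (2), the unmixing of Eq. (3) and the permutation and scaling ambiguity can be checked numerically. The sketch below assumes a hypothetical 2 × 2 mixing matrix and four made-up samples per source; it does not estimate W from data but simply shows that both the exact inverse of A and a permuted, rescaled version of it return the sources.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_2x2(m):
    """Closed-form inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical mixing matrix and sources (rows = sources, columns = samples)
A = [[1.0, 0.6], [0.4, 1.0]]
s = [[1.0, -1.0, 2.0, 0.5],
     [0.0, 3.0, -2.0, 1.0]]

x = matmul(A, s)                 # Eq. (2): x = A s

W_ideal = inverse_2x2(A)
s_hat = matmul(W_ideal, x)       # Eq. (3): recovers s up to rounding

# A second, equally valid unmixing matrix: swap the rows and rescale them
P = [[0.0, 1.0], [1.0, 0.0]]     # permutation
D = [[2.0, 0.0], [0.0, -0.5]]    # scaling, including a sign flip
W_alt = matmul(matmul(D, P), W_ideal)
s_alt = matmul(W_alt, x)         # row 0 = 2 * s[1], row 1 = -0.5 * s[0]

print(s_hat)
print(s_alt)
```

Both outputs have independent rows, so an ICA algorithm has no basis for preferring one over the other; this is why the order and amplitude of the estimated components carry no meaning by themselves.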

Fig. 1

Independent component analysis (ICA) block diagram. s(t) are the sources, x(t) are the recordings, \(\hat{s}(t)\) are the estimated sources, A is the mixing matrix and W is the un-mixing matrix

ICA is a method for finding underlying factors or components from multidimensional (multivariate) statistical data or signals [46, 47]. ICA builds a generative model for the measured multivariate data, in which the data are assumed to be linear or nonlinear mixtures of some unknown hidden variables (sources); the mixing system is also unknown. To overcome the underdetermination of the problem, it is assumed that the hidden sources are non-Gaussian and statistically independent. These sources are named Independent Components (ICs). ICA algorithms have been considered to be information theory based unsupervised learning rules. Given a set of multidimensional observations, which are assumed to be linear mixtures of unknown independent sources through an unknown mixing matrix, an ICA algorithm searches for the unmixing matrix by which the observations can be linearly transformed into independent output components. For most researchers, the basic framework for ICA has been to assume that the mixing is instantaneous and linear, as in Infomax. ICA is often described as an extension of Principal Component Analysis (PCA) that decorrelates the signals for higher-order moments and produces a non-orthogonal basis. More complex models assume, for example, noisy mixtures [39, 69], nontrivial source distributions [51, 102], convolutive mixtures [6, 63], time dependency, underdetermined sources [45, 67], and mixture and classification of independent components [60, 65]. A general introduction and overview can be found in [64].

Challenges of source separation in bio signal processing

In biomedical data processing, the aim is to extract clinically, biochemically or pharmaceutically relevant information (e.g. metabolite concentrations in the brain) in terms of parameters from low-quality measurements in order to enable an improved medical diagnosis [88, 97]. Typically, biomedical data are affected by large measurement errors, largely due to the noninvasive nature of the measurement process or the severe constraints to keep the input signal as low as possible for safety and bio-ethical reasons. Accurate and automated quantification of this information requires an ingenious combination of the following four issues:

  • An adequate pretreatment of the data,

  • The design of an appropriate model and model validation,

  • A fast and numerically robust model parameter quantification method and

  • An extensive evaluation and performance study, using in-vivo and patient data, up to the embedding of the advanced tools into user-friendly interfaces to be used by clinicians.

A great challenge in biomedical engineering is to non-invasively assess the physiological changes occurring in different internal organs of the human body. These variations can often be modelled and measured as biomedical source signals that indicate the function or malfunction of various physiological systems. To extract the relevant information for diagnosis and therapy, expert knowledge in medicine and engineering is also required.

Biomedical source signals are usually weak, non-stationary signals distorted by noise and interference. Moreover, they are usually mutually superimposed. Besides classical signal analysis tools (such as adaptive supervised filtering, parametric or non-parametric spectral estimation, time-frequency analysis, and higher order statistics), intelligent blind signal processing (IBSP) techniques can be used for preprocessing, noise and artifact reduction, enhancement, detection and estimation of biomedical signals by taking into account their spatio-temporal correlation and mutual statistical dependence.

Exemplary ICA applications in biomedical problems include the following:

  • Fetal electrocardiogram extraction, i.e. removing/filtering maternal electrocardiogram signals and noise from fetal electrocardiogram signals [88, 97].

  • Enhancement of low level electrocardiogram components [88, 97]

  • Separation of transplanted heart signals from residual original heart signals [113]

  • Separation of low level myoelectric muscle activities to identify various gestures [15, 54, 78–80, 83, 84]

  • Extraction of MUAP and time frequency representation of ICA of sEMG data [98, 105]

  • Muscle fatigue synchronisation using ICA of sEMG [85, 86, 104]

  • Identifying dependency and independency nature of the ICA separated sEMG sources [1, 81, 82]

One successful and promising application domain of blind signal processing includes those biomedical signals acquired using multi-electrode devices: ECG [88, 97, 100, 113], EEG [88, 97, 110, 113], Magnetoencephalography (MEG) [38, 75, 91, 94, 106, 110] and sEMG. Surface EMG is an indicator of muscle activity and is related to body movement and posture. It has major applications in biosignal processing; the next section explains sEMG and its applications.

Validity of the basic ICA model for sEMG applications

The application of ICA to the study of sEMG and other bio signals assumes that several conditions are verified, at least approximately: the existence of statistically independent source signals, their instantaneous linear mixing at the sensors, and the stationarity of the mixing and the ICs. The independence criterion considers solely the statistical relations between the amplitude distributions of the signals involved, and not the morphology or physiology of neural structures. Thus, its validity depends on the experimental situation, and cannot be considered in general. There are however, two other practical issues that must be considered:

  1. Firstly, to ensure that the mixing matrix is constant, the sources must be fixed in space (this was an implied assumption, as only the case of a constant mixing matrix was considered). This is satisfied by sEMG, as motor units are in fixed physical locations within a muscle, and in this sense applying ICA to sEMG is much simpler than in other biomedical signal processing applications such as EEG or fMRI, in which the sources can move [50].

  2. Secondly, in order to use ICA it is essential to assume that the signal propagation time is negligible. Signals from Gaussian sources cannot be separated from their mixtures using ICA [71], because non-Gaussianity is used as the measure of independence. Mathematical manipulation demonstrates that any matrix transforms a mixture of Gaussian sources into another Gaussian data set. However, a small deviation of the density function from Gaussian may make the data suitable, as it provides some possible maximization points on the ICA optimization landscape, making a non-Gaussianity based cost function suitable for iteration. If one of the sources has a density far from Gaussian, ICA will easily detect this source because it has a higher measure of non-Gaussianity and its maximum point on the optimization landscape is higher. If more than one of the independent sources has a non-Gaussian distribution, those with higher magnitude will have the highest maximum points in the optimization landscape.

Given a few signals with distinctive density and significant magnitude difference, the densities of their linear combinations will tend to follow the ones with higher amplitude. Since ICA uses density estimation of a signal, the components with dominant density will be found easily. The fundamental principle of ICA is to determine the unmixing matrix and use that to separate the mixture into the ICs. The ICs are computed from the linear combination of the recorded data. The success of ICA to separate the independent components from the mixture depends on the properties of the recordings.
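The role of non-Gaussianity described above can be illustrated with a deliberately simple two-channel separation. The sketch below is not one of the ICA algorithms cited in this review: it whitens two synthetic mixtures and then searches for the rotation that maximises the sum of absolute excess kurtosis, which is one common contrast for non-Gaussianity. The sources (one uniform, one Laplacian), the mixing coefficients, the sample count and the one-degree search grid are all assumptions chosen for the example.

```python
import math
import random

def excess_kurtosis(v):
    """Fourth standardised moment minus 3 (zero for a Gaussian)."""
    n = len(v)
    mean = sum(v) / n
    m2 = sum((a - mean) ** 2 for a in v) / n
    m4 = sum((a - mean) ** 4 for a in v) / n
    return m4 / (m2 * m2) - 3.0

def whiten(x1, x2):
    """Centre two channels, then rotate and scale them to be
    uncorrelated with unit variance."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    c1 = [a - m1 for a in x1]
    c2 = [b - m2 for b in x2]
    sxx = sum(a * a for a in c1) / n
    syy = sum(b * b for b in c2) / n
    sxy = sum(a * b for a, b in zip(c1, c2)) / n
    # Principal-axis angle of the 2 x 2 covariance matrix
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    p1 = [c * a + s * b for a, b in zip(c1, c2)]
    p2 = [-s * a + c * b for a, b in zip(c1, c2)]
    sd1 = math.sqrt(sum(a * a for a in p1) / n)
    sd2 = math.sqrt(sum(b * b for b in p2) / n)
    return [a / sd1 for a in p1], [b / sd2 for b in p2]

def unmix(z1, z2, steps=90):
    """Grid search over rotations in [0, 90) degrees; larger angles only
    permute the outputs or flip their signs."""
    best = None
    for i in range(steps):
        phi = (math.pi / 2.0) * i / steps
        c, s = math.cos(phi), math.sin(phi)
        y1 = [c * a + s * b for a, b in zip(z1, z2)]
        y2 = [-s * a + c * b for a, b in zip(z1, z2)]
        score = abs(excess_kurtosis(y1)) + abs(excess_kurtosis(y2))
        if best is None or score > best[0]:
            best = (score, y1, y2)
    return best[1], best[2]

def correlation(u, v):
    """Pearson correlation coefficient, used only to check the result."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mu) ** 2 for a in u) *
                    sum((b - mv) ** 2 for b in v))
    return num / den

rng = random.Random(0)
n = 3000
s1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]                       # sub-Gaussian
s2 = [rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(n)]  # super-Gaussian
x1 = [1.0 * a + 0.6 * b for a, b in zip(s1, s2)]  # hypothetical mixing
x2 = [0.4 * a + 1.0 * b for a, b in zip(s1, s2)]

y1, y2 = unmix(*whiten(x1, x2))
match_s1 = max(abs(correlation(s1, y1)), abs(correlation(s1, y2)))
match_s2 = max(abs(correlation(s2, y1)), abs(correlation(s2, y2)))
print(match_s1, match_s2)
```

If both sources were Gaussian, every rotation would give the same (near-zero) kurtosis score and the search would have no preferred angle, which is the numerical counterpart of the statement that Gaussian sources cannot be separated.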

Source separation of sEMG

MUAP separation is a new biomedical application of ICA. In previous applications of ICA to sEMG, researchers have treated the sEMG activity from entire muscles as ICs. Each muscle contains up to 100 individual motor units, and the sEMG activity from an entire muscle is the superposition of the activity from each motor unit within the muscle. It has been shown that it is possible to apply ICA to isolate sEMG signals from individual muscles [8, 72]. Treating sEMG activity from entire muscles as ICs is useful in some applications, especially when studying muscle activity during movements. For example, McKeown et al. [72] used ICA to determine the exact sequence of muscle contractions in swallowing in order to diagnose dysphagia (a disorder of swallowing). The focus on treating sEMG activity from entire muscles as ICs arises from a desire to analyse human movement. The most important application of sEMG is as a clinical tool for neuromuscular disease diagnosis. In clinical applications, physicians seek to analyse individual motor units. BSS techniques such as ICA are proposed as a novel approach for isolating individual MUAPs from sEMG interference patterns by treating individual motor units as independent sources. This is relevant to clinical sEMG, as motor unit crosstalk can make it difficult to study individual MUAPs [56].

During the sEMG recordings of the digitas muscles to identify hand gestures for human computer interfaces, the crosstalk from different muscles can result in unreliable recordings. The simplest and most commonly used method to improve the quality of the recording is rejection [10]. This is done by discarding a section of the recording that contains an artefact exceeding a threshold. This method is simple, but it causes a significant loss of data and its reliability is questionable, since it is predominantly based on visual examination. There is little safeguard against the removal of small but important features of the signal. It is also very dependent on the technician, which makes it less dependable and very expensive.
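The rejection approach and the data loss it causes can be sketched as follows. This is an automated, threshold-only simplification (the text notes that in practice rejection relies largely on visual examination); the window length, threshold and sample values are hypothetical.

```python
def reject_segments(signal, window, threshold):
    """Split the recording into fixed-length windows and discard every
    window whose peak absolute amplitude exceeds the threshold."""
    kept, rejected = [], 0
    for start in range(0, len(signal) - window + 1, window):
        segment = signal[start:start + window]
        if max(abs(v) for v in segment) > threshold:
            rejected += 1
        else:
            kept.append(segment)
    return kept, rejected

# Made-up recording: the middle window contains a large artefact
recording = [0.1, -0.2, 0.15, -0.1,
             0.2, 3.0, -2.5, 0.1,
             -0.1, 0.05, 0.2, -0.15]
window = 4
kept, rejected = reject_segments(recording, window, threshold=1.0)
fraction_lost = rejected * window / len(recording)
print(len(kept), rejected, fraction_lost)
```

Note that the rejected window also contained two clean samples (0.2 and 0.1), which illustrates how rejection discards usable signal along with the artefact.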

The other commonly used techniques to improve the quality of biosignal recordings include spectral filtering, gating and cross-correlation subtraction [11]. Spectral filtering is often not useful because the frequency spectra of the desired signals and the artefact components overlap. Gating and subtraction, on the other hand, may introduce discontinuities in the reconstructed signal. In the recent past, techniques such as time domain [42, 109] and frequency domain regression [112, 114] have been attempted. However, simple regression in the time domain can over-compensate for the artefacts [93, 111]. The regression techniques depend on the availability of a good regressing channel, that is, a separate channel that records the corresponding artefact as a reference. This is often not possible when recording sEMG. Therefore, better artefact removal techniques are necessary to overcome the disadvantages of the previous methods. One property of sEMG is that the signal originating from one muscle can generally be considered to be independent of other bioelectric signals such as ECG, EOG and signals from neighbouring muscles. This opens an opportunity for the use of ICA in this application.
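Time-domain regression against a reference channel can be sketched as an ordinary least-squares fit. The example below is an illustration only: the reference waveform, the EMG samples and the leakage gain of 0.8 are made up, and the EMG was chosen to be exactly uncorrelated with the reference so that the fit is exact.

```python
def regress_out(channel, reference):
    """Subtract the least-squares projection of the reference channel
    from the recorded channel; returns the cleaned signal and the gain."""
    n = len(channel)
    mean_c = sum(channel) / n
    mean_r = sum(reference) / n
    cov = sum((c - mean_c) * (r - mean_r)
              for c, r in zip(channel, reference))
    var = sum((r - mean_r) ** 2 for r in reference)
    gain = cov / var
    cleaned = [c - gain * (r - mean_r) for c, r in zip(channel, reference)]
    return cleaned, gain

# Hypothetical reference (artefact) channel and underlying EMG
reference = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
emg = [0.1, 0.0, -0.1, 0.0, 0.1, 0.0, -0.1, 0.0]
recorded = [e + 0.8 * r for e, r in zip(emg, reference)]

cleaned, gain = regress_out(recorded, reference)
print(gain)
print(cleaned)
```

When the true EMG happens to be correlated with the reference, the fitted gain absorbs part of the EMG as well, which is one way the over-compensation mentioned above can arise.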

A number of researchers have reported the use of ICA for separating the desired sEMG from artefacts and from the sEMG of other muscles. While details differ, the basic technique is that the different channels of the sEMG recording form the input to the ICA algorithm, and the outputs are the ICs and the estimated unmixing matrix W. He et al. [40] used ICA to remove ECG artefact from sEMG data. A variation of this approach was attempted by Djuwari et al. [24] for removing ECG artefact from sEMG of the lumbar muscles. They attempted to overcome the limitation that the number of signals must equal the number of recordings, and to remove the ambiguity of the order. Their work applied ICA in two sequential steps. In the first step, ICA with multichannel sEMG recordings corrupted by ECG artefact as the input gave one pure ECG signal in one of the rows of its output. In the next step, a vector z, formed by concatenating the row of the output matrix u = Wx that contained the ECG artefact with each single row of x in turn, was used as the input. The output of this step is a matrix y = Bz that contains the ECG artefact in one row and the ‘cleaned’ sEMG of the corresponding channel in its other row. In both cases, visual inspection suggested successful removal of the artefact and statistical analysis seemed to suggest an improvement compared with other techniques; however, because the properties of the signal were unknown, the quality of the signal before and after processing could not be compared more rigorously. Similar work is reported by Yong et al. [44], where ICA was employed to filter the sEMG of the lumbar muscles. Azzerboni et al. [7] demonstrated artefact removal in sEMG using ICA and the Discrete Wavelet Transform (DWT). ICA has also been used by Nakamura et al. [87] to decompose sEMG recordings in terms of MUAPs. In their paper, they acknowledged the drawbacks and the conditions necessary for the success of ICA, but did not demonstrate the suitability of their experimental data for ICA. The earlier work by these researchers mainly focussed on sEMG source separation and identification.

One difficulty associated with ICA is that it is an iterative process whose initialization is random. As a result, the outcome of the separation has an associated randomness and the overall performance is not optimal, which reduces the average accuracy and reliability [83, 84]. Hence extended ICA techniques such as multirun ICA (MICA) have been proposed [79]. MICA provided a substantial improvement (from 65 to 99%) in the accuracy of hand gesture identification based on sEMG. The problem caused by randomness in ICA algorithms is overcome by estimating the unmixing matrix multiple times and selecting the best unmixing matrix based on the highest signal to interference ratio (SIR). The selected matrix is then used to decompose and classify the sEMG features. The applications of extended ICA include identification of hand gestures, muscle fatigue analysis and other source separation problems.
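The selection step of the multirun idea, choosing the unmixing matrix with the highest SIR, can be sketched as below. The text does not give the SIR formula used in [79], so the definition here is an assumption: for each row of the global matrix G = WA, the ratio of the largest squared entry to the sum of the remaining squared entries, in dB, averaged over rows. It also assumes the mixing matrix A is known, which holds only in simulation; the candidate matrices stand in for the results of separate ICA runs and are made up.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mean_sir_db(w, a, eps=1e-12):
    """Assumed SIR definition: per row of G = W A, dominant power over
    residual power in dB, averaged over the rows."""
    g = matmul(w, a)
    total = 0.0
    for row in g:
        powers = [v * v for v in row]
        signal = max(powers)
        interference = sum(powers) - signal
        total += 10.0 * math.log10(signal / (interference + eps))
    return total / len(g)

def select_best_unmixing(candidates, a):
    """Return the candidate with the highest mean SIR and all scores."""
    scores = [mean_sir_db(w, a) for w in candidates]
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best], scores

A = [[1.0, 0.6], [0.4, 1.0]]          # hypothetical known mixing matrix
candidates = [                         # stand-ins for repeated ICA runs
    [[1.0, 0.0], [0.0, 1.0]],          # no unmixing at all
    [[1.3, -0.7], [-0.6, 1.3]],        # close to the inverse of A
    [[1.0, -0.3], [-0.1, 1.0]],        # partial unmixing
]
best_w, scores = select_best_unmixing(candidates, A)
print(scores)
print(best_w)
```

With real recordings the mixing matrix is unknown, so a practical implementation would need an SIR estimate that does not rely on A; that part is outside this sketch.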

Basic properties of fractal

Fractals model complex physical processes and dynamical systems. The underlying principle of fractals is that a simple process that goes through infinitely many iterations becomes a very complex process. Fractals attempt to model the complex process by searching for the simple process underneath [36]. FDs are used to measure the complexity of these objects. Two well-known examples are the Sierpinski triangle (Fig. 2) and the Koch curve (Fig. 3).

Fig. 2 Sierpinski triangle [14]

Let ‘F’ represent a Fractal. The basic properties of ‘F’ are [3, 30]:

  1. F has a fine structure, i.e. detail on arbitrarily small scales.

  2. F is too irregular to be described in traditional geometrical language, both locally and globally.

  3. F has some self-similarity, perhaps approximate or statistical.

  4. Usually the FD of F is greater than its topological dimension.

The concept of a fractal is most often associated with geometrical objects satisfying the two important properties:

  • self-similarity

  • fractional dimensions

Mathematically, the self-similarity property should hold on all scales, but in the real world there are necessarily lower and upper bounds over which such self-similar behaviour applies. The second criterion for a fractal object is that it has a fractional dimension. This requirement distinguishes fractals from Euclidean objects, which have integer dimensions. As a simple example, a solid cube is self-similar since it can be divided into 8 smaller solid cubes that resemble the large cube, and so on. However, the cube, despite its self-similarity, is not a fractal because its dimension is an integer (3) [14, 32].

The concept of a fractal structure, which lacks a characteristic length scale, can be extended to the analysis of complex temporal processes. Although a time series is usually plotted on a 2-dimensional surface, it actually involves two different physical variables. The important challenge is in detecting and quantifying self-similar scaling in complex time series [14, 34].

Self-similarity

An important defining property of a fractal is self-similarity, which refers to an infinite nesting of structure on all scales. In this section, the properties and definition of self-similarity are explained.

Self-similarity is a distinctive feature of most fractals. Self-similar processes are the ones in which a small portion of the process resembles a larger section when suitably magnified indicating scale invariance of the process. Self-similarity, in a strict sense, means that the statistical properties of a stochastic process do not change for all aggregation levels of the stochastic process. The stochastic process looks the same irrespective of any magnification of the process. The following will illustrate various types of self similarity as well as present some real world examples [13, 14, 32, 48].

- Exact self similarity

Exactly self-similar fractal objects are identical regardless of the scale or magnification at which they are viewed. Strict self-similarity refers to a characteristic of a form exhibited when a substructure resembles a superstructure in the same form. The well-known Koch snowflake curve, created by starting with a single line segment and on each iteration replacing each line segment by four shorter segments, as shown in Fig. 3, is a good example of this kind.

Fig. 3 Example of exactly self-similar object [36]

- Approximate self similarity

The more common type of self-similarity is approximate self-similarity. Approximately self-similar objects have recognisably similar structure at different scales, but the structures are not exactly the same.

- Statistical self similarity

The self-similar units of a time series signal sometimes cannot be observed visually, but there may be numerical or statistical measures that are preserved across scales and that identify the self-similar units. Such a signal is termed statistically self-similar. Most physiological signals fall into this category. An example of a statistically self-similar object is 1/f noise (Fig. 4), where the units statistically resemble one another across multiple zoom levels.

Fig. 4 Example of statistical self-similar object [36]

The self-similarity of a time series related process can be verified using the procedure [52] as follows:

  • Let y(k) be a time series representing the process. Then \(y^{(m)}(k)\) is the aggregated process over non-overlapping blocks of size m:

    $$ {y}^{(m)} (k) = \frac{1}{m} \sum\limits_{l=0 }^{m-1} y(km-l) $$

  • For the signal or process y(k) to be self-similar, the variance of the aggregated process must decay slowly with m. This self-similarity is measured by H, that is,

    $$ Var( {y}^{(m)}) \approx {m }^{- \beta} $$

    with 0 < β < 1 and

    $$ H = 1 - \beta/2 $$

    where H expresses the degree of self-similarity; large values indicate stronger self-similarity.

  • If \(H \in (0.5,1)\) then the time-aggregated series is long-range dependent (LRD).

This means that the repeated occurrence of a particular pattern, or a set of particular patterns, makes up both a part and the whole of the time series. FD can be applied to determine this statistical self-similarity, i.e. the similarity between a part and the whole time series [4, 99].
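The verification procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `aggregated_variance_hurst`, the choice of block sizes and the white-noise test signal are not from the source, and β is estimated as the negative slope of a least-squares line through log-variance against log m.

```python
import numpy as np

def aggregated_variance_hurst(y, scales):
    """Estimate beta and H = 1 - beta/2 from the variance of block means.

    y      : 1-D time series
    scales : block sizes m used for aggregation (assumed choice)
    """
    y = np.asarray(y, dtype=float)
    variances = []
    for m in scales:
        n_blocks = y.size // m
        # Non-overlapping blocks of size m, averaged to give y^(m)(k).
        blocks = y[:n_blocks * m].reshape(n_blocks, m)
        variances.append(blocks.mean(axis=1).var())
    # Var(y^(m)) ~ m^(-beta), so the log-log slope is -beta.
    slope, _ = np.polyfit(np.log(scales), np.log(variances), 1)
    beta = -slope
    return beta, 1.0 - beta / 2.0

# Uncorrelated noise has no long-range dependence, so beta should be
# close to 1 and H close to 0.5 (the lower edge of the LRD interval).
noise = np.random.default_rng(0).standard_normal(2 ** 14)
beta_noise, h_noise = aggregated_variance_hurst(noise, [1, 2, 4, 8, 16, 32, 64])
print(f"beta = {beta_noise:.3f}, H = {h_noise:.3f}")
```

A signal whose estimated H lies clearly above 0.5 would, by the criterion above, be treated as long-range dependent.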

Fractal dimension (FD)

FD of a process measures its complexity, spatial extent or its space filling capacity and is related to shape and dimensionality of the process. The concept of fractal can be applied to physiological processes that have self-similar fluctuations over multiple time scales and have a broadband frequency spectrum [37].

There are many FDs reported in the literature [30, 32, 66], including morphological (self-similarity, Hausdorff, mass) and entropy (gyration dimension, information, correlation, variance) dimensions. For a self-similar figure, the dimension is simply the exponent that relates the number of self-similar pieces into which the figure may be broken to the magnification factor.

Given a self-similar set S, the FD D of this set is defined as ln k/ln M, where k is the number of disjoint regions that the set can be divided into, and M is the magnification factor of the self-similarity transformation [13, 34, 70]. This definition of the FD of a self-similar object is expressed as

$$ \text{Fractal dimension} = \frac{\log(\text{number of self-similar pieces})}{\log(\text{magnification factor})} $$
(4)

A simple example is the computation of the FD of the Sierpinski triangle. The Sierpinski triangle shown in Fig. 2 consists of 3 self-similar pieces, each with magnification factor 2. So the FD of this triangle as per the above expression (Eqn. 4) is

$$ \begin{aligned} \text{Fractal dimension} &= \frac{\log 3}{\log 2}\\ &\approx 1.58 \end{aligned} $$

Hence the dimension of Sierpinski triangle is between 1 and 2. FD is a measure of complexity of a self-similar structure and it measures how many points lie in a given set. A plane is larger than a line, while the dimension of Sierpinski triangle lies in between these two sets [23].
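As a small numerical check of Eqn. 4, the sketch below evaluates the same ratio for the Sierpinski triangle, for the Koch curve (4 pieces, magnification factor 3) and for the solid cube discussed earlier (8 pieces, magnification factor 2). The function name `similarity_dimension` is chosen here for illustration only.

```python
import math

def similarity_dimension(pieces, magnification):
    """Eqn. 4: log(number of self-similar pieces) / log(magnification factor)."""
    return math.log(pieces) / math.log(magnification)

sierpinski_fd = similarity_dimension(3, 2)  # fractional, between 1 and 2
koch_fd = similarity_dimension(4, 3)        # fractional, between 1 and 2
cube_fd = similarity_dimension(8, 2)        # integer-valued: not a fractal

for name, fd in [("Sierpinski triangle", sierpinski_fd),
                 ("Koch curve", koch_fd),
                 ("Solid cube", cube_fd)]:
    print(f"{name}: {fd:.3f}")
```

The cube is self-similar, but its dimension comes out as an integer, which is why it does not qualify as a fractal.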

The fractal properties of a time series signal can also be characterised by computation of FD. As explained in Section 4.1, the irregularity seen on different scales of a time series is not visually distinguishable, an observation that can be confirmed by statistical analysis [59, 92]. The roughness of time series signals such as biosignals possesses a self-similar or scale-invariant property, and their complexity can be analysed using FD.

The nonlinearity of physiological systems may have relevance for modelling complicated sEMG, for example, low-level movements in which interactions and cross-talk occur over a wide range of temporal and spatial scales. A fundamental methodological principle underlying these interpretations is important for analyzing continuously sampled variations in physiological output, such as muscle activity. Dynamical analysis demonstrates that there is often hidden information in physiological time series and that certain fluctuations previously considered noise actually represent important information [34, 49, 92]. This research proposes the use of fractal theory in sEMG for identification of low-level muscle contraction.

Self-similarity of sEMG

In complex biosignals like sEMG, there exists a self-similarity phenomenon, in which a small structure (motor unit) statistically resembles the larger structure. The source of sEMG is a set of similar action potentials originating from different locations in the muscles. Because of the self-similarity, over a range of scales, of the action potentials that are the source of the sEMG recordings, sEMG has fractal properties.

Preliminary analysis was performed to establish the suitability of fractal analysis for sEMG recordings. sEMG was recorded during a simple contraction to test for the presence of self-similarity. To determine the self-similarity in the recorded muscle activity (sEMG), the procedure explained in Sect. 4.2 was followed:

  • A new time series \(y^{(m)}(k)\), the sEMG signal aggregated over scale m, was generated from the recorded sEMG signal:

    $$ {y}^{(m)} (k) = \frac{1 }{m} \sum\limits_{l=0 }^{m-1} y(km-l) $$

  • The natural log of the variance of the aggregated series was plotted against the natural log of m. This is shown in Fig. 5.

  • From Fig. 5, it is observed that the variance decays slowly with m, with

    $$ \beta = 0.9573 < 1. $$

  • From this β value and the plot in Fig. 5, the self-similarity index of the recorded sEMG signal was computed using \(H = 1 - \beta/2\) as

    $$ H= 0.5213 $$

Fig. 5 Logarithmic plot of the variance and the scale m for a sample sEMG recording to determine the self-similarity property

Because β is less than 1 (equivalently, H lies in (0.5, 1)), it is confirmed that the signal has self-similarity and is long-range dependent (LRD). This supports the use of FD to determine this self-similar property of sEMG when determining the muscle properties and muscle activation.

Method to determine Fractal dimension

There are many FDs reported in the literature [26, 37, 92], including morphological (self-similarity, Hausdorff, mass), entropy (gyration dimension, information, correlation, variance) and wavelet transform (Vrhel et al., 1995, Mallat 1989) based dimensions. The most common way to estimate the FD of a spatial dataset is the box-counting approach [101]. However, the box-counting dimension has the disadvantages that the initial and final sizes of the magnification factor must be chosen and that the computation takes more time [16].
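A minimal sketch of the box-counting approach is given below, assuming a binary image as the spatial dataset. The function name `box_counting_dimension`, the list of box sizes and the test image are illustrative choices; the test image is a discrete Sierpinski pattern (pixel (i, j) is set when the bitwise AND of i and j is zero), for which the expected dimension is log 3 / log 2.

```python
import numpy as np

def box_counting_dimension(mask, box_sizes):
    """Estimate the box-counting dimension of a 2-D boolean mask.

    For each box size s the image is tiled with s-by-s boxes and the number
    of boxes containing at least one set pixel is counted. The dimension is
    the slope of log(count) against log(1/s).
    """
    mask = np.asarray(mask, dtype=bool)
    counts = []
    for s in box_sizes:
        h = mask.shape[0] // s * s
        w = mask.shape[1] // s * s
        blocks = mask[:h, :w].reshape(h // s, s, w // s, s)
        counts.append(np.count_nonzero(blocks.any(axis=(1, 3))))
    inv_sizes = 1.0 / np.asarray(box_sizes, dtype=float)
    slope, _ = np.polyfit(np.log(inv_sizes), np.log(counts), 1)
    return slope

# Discrete Sierpinski pattern on a 256 x 256 grid (assumed test image).
rows, cols = np.indices((256, 256))
sierpinski_mask = (rows & cols) == 0
box_fd = box_counting_dimension(sierpinski_mask, [1, 2, 4, 8, 16, 32])
print(f"Box-counting FD of the Sierpinski pattern: {box_fd:.4f}")
```

The result depends on the range of box sizes supplied, which illustrates the sensitivity to the initial and final sizes noted above.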

FD analysis is frequently used in the processing of physiological signals such as sEMG, EEG and ECG [26, 37, 92]. Applications of FD to these physiological signals include two types of approaches [29]:

  • Signals in the time domain: these approaches estimate the FD directly in the time domain (the original waveform domain), where the waveform or original signal is considered a geometric figure, and

  • Signals in the phase space domain: these approaches estimate the FD of an attractor in the state space domain.

Calculating the FD of waveforms is useful for transient detection, with the additional advantage of fast computation. It consists of estimating the dimension of a time-varying signal directly in the time domain, which allows a significant reduction in program run-time [29]. The FD of sEMG is calculated to determine the transients in sEMG, which are related to the overall complexity of the muscle properties. The following three prominent methods for computing the FD of a waveform [41, 55, 95] have been applied to the analysis of signals and of a variety of engineering systems.

  • Higuchi’s Algorithm

  • Katz’s Algorithm

  • Petrosian’s Algorithm

A study by [29] has shown that Higuchi’s algorithm provides the most accurate estimates of the FD. Katz’s method was found to be less linear and its calculated FDs were exponentially related to the known FDs, whereas Petrosian’s algorithm was found to be relatively linear and demonstrated the least dynamic range for the estimated FD. Based on this, Higuchi’s algorithm was considered for the computation of FD of sEMG in this study.
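For orientation, a minimal sketch of Higuchi’s algorithm in its commonly described form is given below. It is not the implementation used in this study: the function name `higuchi_fd` and the default `kmax = 8` are assumptions made for illustration, and the choice of `kmax` influences the estimate. For each delay k, the curve length is averaged over the k possible starting offsets, and the FD is the slope of log L(k) against log(1/k).

```python
import numpy as np

def higuchi_fd(x, kmax=8):
    """Estimate the FD of a waveform with Higuchi's algorithm.

    x    : 1-D signal (for example one sEMG channel)
    kmax : largest delay considered (assumed value, application dependent)
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2 * kmax:
        raise ValueError("signal too short for the requested kmax")
    ks = np.arange(1, kmax + 1)
    mean_lengths = np.empty(kmax)
    for idx, k in enumerate(ks):
        lengths = []
        for m in range(k):                 # starting offset (zero-based)
            sub = x[m::k]                  # x(m), x(m+k), x(m+2k), ...
            n_int = sub.size - 1           # number of increments in this subseries
            norm = (n - 1) / (n_int * k)   # Higuchi's normalisation factor
            lengths.append(np.abs(np.diff(sub)).sum() * norm / k)
        mean_lengths[idx] = np.mean(lengths)
    # L(k) ~ k^(-FD), so FD is the slope of log L(k) against log(1/k).
    slope, _ = np.polyfit(np.log(1.0 / ks), np.log(mean_lengths), 1)
    return slope

# Sanity checks on synthetic signals (not sEMG data).
line_fd = higuchi_fd(np.arange(1000.0))
noise_fd = higuchi_fd(np.random.default_rng(0).standard_normal(5000))
print(f"straight line: {line_fd:.3f}, white noise: {noise_fd:.3f}")
```

A straight line should give an FD close to 1 and white noise an FD close to 2, which bracket the range expected for waveforms. A constant signal has zero curve length and is not handled by this sketch.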

Review—Fractal theory based analysis of sEMG

Fractals refer to objects or signal patterns that have fractional dimension. These objects exhibit self-similarity. This means that the objects or patterns, on any level of magnification, will yield a structure that resembles the larger structure in complexity [70]. The measured property of the fractal process is scale dependent and has self-similar variations in different time scales. FD of a process measures its complexity, spatial extent or its space filling capacity and is related to shape and dimensionality of the process [33]. The concept of fractal can be applied to physiological processes that are self-similar over multiple scales in time and have a broadband frequency spectrum. Fractals manifest a high degree of visual complexity [37].

Recent studies of fractal analysis of sEMG are summarised as follows:

  • Anmuth et al. [4] determined that the FD of the surface EMG signal changes by a small amount with, and is linearly related to, the activation of the muscle measured as a fraction of maximum voluntary contraction. They also observed a linear relationship between the FD and the flexion-extension speeds and load.

  • Gitter et al. [33] determined that FD can be used to quantify the complexity of motor unit recruitment patterns. They also demonstrated that the FD of EMG signal is correlated with muscle force.

  • Hu et al. [43] distinguished two different patterns of FD of sEMG signals.

  • Gupta et al. [37] reported that the FD could be used to characterize the EMG signal.

  • A recent study by Arjunan et al. [5] has also determined the fractal nature of other biosignals such as the electromyogram (EMG), identifying the relationship of FD with the size of the muscles. They also reported the use of the maximum fractal length (MFL) of the biosignal as a measure of signal strength.

The FD represents the scale invariant non-linear property of the signal and is an index for describing the irregularity of a time series. Based on the theoretical studies, FD is the property of the system or source of the signal and in the case of sEMG, it is the property of the muscle. It should be a measure of the muscle complexity and not a measure of the level of muscle activity.

Studies by [37, 43] have attributed the change in FD to the change in the level of muscle contraction during high-level muscle activity. At low-level muscle contraction, however, this research work attributes the small changes in the FD to changes in muscle properties such as size and length due to the contraction, and not to changes in muscle force. Studies by [12] have indicated that for low levels of isometric muscle contraction there is no change in the size of the muscle. Based on the above, it has been proposed that for low-level muscle contraction, FD would not change with the level of muscle contraction and that FD would be a measure of the size and complexity of the muscles [5].

Summary

This review has described the background of ICA methods and the motivation for using ICA for source separation and identification in sEMG signal processing. It has discussed the issues of ICA applications in biomedical and real-time data. It has presented an overview of recent work and the background of the utilization of sEMG in identification of human movement, and of the feature extraction methods for identification of low-level muscle activation. It has also presented recent studies on the use of fractal theory for analysis of sEMG, and has discussed the strengths and limitations of the features used for identification of subtle muscle movements. While this review has covered conclusive studies related to fractal analysis of sEMG, there is scope for improved understanding of multifractal analysis when there are signals of different properties. There is a need for an increased number of subjects and for experiments conducted over a longer period of time to determine the impact of inter-experimental variations.

Some of the limitations of using ICA and FD include:

  • The order and ambiguity problem associated with ICA may limit the source separation

  • Overcomplete ICA, where the number of sources exceeds the number of recordings, which may reduce the reliability of the system

  • The choice of scale and algorithm to compute FD for a particular application will be a limiting factor and

  • The computed FD will differ due to noise and crosstalk.

There still exist numerous unsolved problems of ICA and FDs in sEMG signal processing. Some of the important ones include:

  • To combine these fractal features with ICA to determine the source-dependent properties. To overcome the crosstalk from different muscles recorded from different sensors or channels, based on the ‘complex’ nature of the signal, further analysis of these fractal features with ICA has to be investigated

  • To investigate the feasibility of the MFL from single or two channel recordings of different biosignals like EEG and EOG. The study of these physiological signals has increased for applications in the field of HCI and in medical systems and control for disabled individuals.

  • Order of the separated sEMG signals using ICA and

  • Normalisation of the estimated independent components to measure MUAP conduction velocity