Introduction

Parkinson’s disease (PD) is the most common neurodegenerative movement disorder. This type of disease is caused by both despair of motor control and malfunction of rhythm generation in the basal ganglia which has a strong effect on voluntary movement control [1]. Typical motor symptoms of PD known as movement disorders are tremor, bradykinesia, rigidity, and postural instability [2]. By progression of this disease, not only postural instability but also gait disturbances can be commonly seen in many cases. Some gait disturbances such as festination, short gait step and freezing gait can make diagnosis of PD easier and so investigation of parameters of gait would be a very useful method for both understanding of the mechanism of motor control and recognizing of the neurological disease progression [3,4].

Many different methods have been proposed in recent years for the diagnosis of PD [57]. Furthermore, in order to measure the parameters of the gait in PD and also investigate its characteristics, the computer-based methods have been widely utilized in the previous studies [814]. In [15], the coefficient of variation as a criterion for stride-to-stride fluctuations was used for analyzing of the gait data of both healthy control subjects and PD patients. They showed that this coefficient is larger in the gait data of PD patients and also can be used as the degree for disease severity. In addition, in [16] gait data in terms of acceleration signals corresponding to PD and healthy subjects were analyzed and the results showed that fractal dimensions of the body in PD patients is higher than that of healthy subjects. The same study also recommended that the acceleration signal during locomotion in both old and PD subjects alter with a complex pattern [15 17]. Although in these methods the stride-to-stride variability has been seen and analyzed, representing a powerful model for characterizing the gait variability has been remained an open problem.

In [18], in order to evaluate the gait variability in PD patients, two features were extracted and used for classification of gait data derived from healthy and PD subjects. They showed that by using a nonlinear support vector machine, it is possible to distinguish these gait data with an appropriate rate of accuracy. However, in this method most of temporal information of gait data has been neglected and only two features used for classification task. In this paper, Hidden Morkov Models (HMM) is used for modeling of the raw gait data instead of the extracted features. HMMs have been widely investigated and applied for past several years in the automatic speech recognition applications [19]. Recently, this method have been successfully used for both medical diagnosis and system monitoring applications such as EEG classification, ECG classification, character recognition and fault diagnosis system [2023]. Especially, this method would be useful for the classification of signals in terms of time sequences. In this article, we used HMM to classify the gait data derived from healthy and PD subjects and compared its performance against a sophisticated nonlinear classifier.

This paper is organized based on the following sections. Section II describes gait data and preprocessing procedures and reviews HMM formulation. Section III presents our results on the gait data classification problem. Finally, section IV concludes this paper.

Methods

Data description

The gait data provided by Hausdorff et al. were used in this study [10,15], and also can be downloaded from web page of physioNet (http://www.physionet.org) [24]. These gait data were recorded from 16 healthy subjects (2 men and 14 women) with age of 20–74 years and 15 PD subjects (10 men and 5 women) with age of 44–80 years as they walked at their normal pace in a 77 m long hallway for 5 min. The mean age (standard deviation) of the healthy, PD subjects were 39 (18.5), 67 (10.9) years, respectively. In the healthy group, any mental problems or motion disorders was not reported. The ages of the two groups were not noticeably different. From the recorded force applied to the ground during walking, 5 min of recording, consisting of stride, swing, and stand times for each leg and double support signals for both groups were derived.

Preprocessing

In the current study, the gait data from the right foot of both PD and healthy subjects in terms of stride interval (time from the contact of a foot to the ground to the following contact of same foot) were used. In order to remove the effect of unwanted artifacts in the start of recording, the first 20s of each gait data samples were removed before analysis of the gait data [10,15]. Based on the gait experiment described in [10], the subjects were requested to walk through a hallway during recording of gait signals and turn around at the end of the hall way and then continue walking. Because of this reason, the extracted gait signals represent large values in these points and so should be removed before processing. For removing these outliers, according to “sigma rule” method, the samples in the gait data with amplitude greater or less than 2 SDs of median value of whole signal were replaced by median value [25]. In Fig. 1 the typical results of removing of outliers from stride interval time sequence of a PD patient and a healthy subject are shown.

Fig. 1
figure 1

Typical result of outlier removal from the raw gait data in terms of stride interval: (a) of a 74-year-old male healthy subject; (b) of a female subject with PD disease. The gait signal with outliers is shown with blue color. The gait data samples with 2 SDs larger or less than the median value were substituted with the median value of the related gait signals

Hidden marko model

A hidden Markov model (HMM) is a state machine with two layers including state and observation layers in which a Marcovian process controls the selection of the state in each time. In standard HMMs, by using a discrete hidden state at time (t) all of the needed information before this time would be known and so the observation at any time depends only on its current hidden state. In each time, the HMM system is located in one state and transition between these states are defined based on an associated probability. Furthermore, each state is related to the output observation with its associated probability. The graphic representation of the HMM is illustrated in Fig. 2. The states of the Markov chain are hidden, but the outputs from the Markov chain are observable.

Fig. 2
figure 2

Representing an HMM. The states in each time are hidden but only outputs can be observed

An HMM for continuous data processing is represented by three matrix, λ = (π, A, B), consisting of a vector of initial probabilities π, a matrix of transition probabilities A, and a vector of probabilistic output functions B. Each element of mentioned matrixes is represented based on the following equations:

$$ {\pi}_i= \Pr \left({s}_i\ at\ t=0\right) $$
(1)
$$ {a}_{ij}= \Pr \left({s}_i at\ t+1\left|{s}_i at\ t\right.\right) $$
(2)
$$ {b}_i(x)= \Pr \left({O}_t=x\left|{s}_i at\ t\right.\right) $$
(3)

where \( x \) is an observed signal value and \( s \) is the state of the HMM model. In the current study, the output function B is a Gaussian density function which is shown in the following equation:

$$ n\left(O,\mu, \sigma \right) = \frac{1}{\sqrt{2\pi \sigma }}{e}^{\frac{-1}{2}{\left[\frac{O-\mu }{\sigma}\right]}^2} $$
(4)

where O, \( \mu \) and \( \sigma \) represents sequence of observation, average and standard deviation, respectively. In order to utilize an HMM model in the real- world applications, three main problems should be solved:

  1. Problem one

    Given the observation sequence O = (o 1o 2, …, o T ) and the model λ = (π, A, B), what is the best way for computing P(O|λ), the probability of the observation sequence. This problem is known as Evaluating.

  2. Problem two

    Given the observation sequence O = (o 1o 2, …, o T ) and the model λ = (π, A, B), what is the best way for finding corresponding sequence of states S = (s 1s 2, …, s T ). This problem is known as Decoding.

  3. Problem three

    what is the best way for adjustment of HMM model parameter (λ) to maximize P(O|λ). This problem is known as Training.

The significance of solving the third problem is obvious because the model parameters must first be estimated before the models can be used for classification purposes. For solving this problem, Baum-Welch algorithm has been introduced in the literature [19]. Furthermore, Forward backward recursive and Viterbi algorithms can be used for solving first and second problems, respectively [19].

In the following, the summary of HMM method in both training and test phases has been presented. For further information about the Hidden Markov Model refer to [19].

  1. 1.

    The number of states (N) and also the structure of transition matrix are identified.

  2. 2.

    The sequence of observation containing all of the training samples which belong to a specific class are calculated and then these samples are divided into the same segments equal to the number of state using K-means algorithm.

  3. 3.

    The initial values of average and standard deviation corresponding to Gaussian model are calculated using K-means algorithm,

  4. 4.

    Two separated HMM model for classification of PD and Healthy subjects are trained using the Baum–Welch algorithm.

  5. 5.

    The parameters of each model corresponding to each class are saved and used for the test phase.

  6. 6.

    In the test phase, Forward-backward algorithm is used and by comparison the calculated log-likelihood probability of test data the class of each data is identified.

Results

As it was told in previous section, the structure of HMM model is described by three matrix λ = (π, A, B) and so determining the parameters of this model such as the number of states and the number of Gaussian mixture is very important for training of the HMM model. In addition to these parameters, iteration numbers can be regarded as an important factor in training phase that should be considered for training of the model. In order to determine these parameters, the effect of each one on the performance of classifier has been investigated separately when all of the subjects are used for both the training and test phase. As it was told previously, the Baum–Welch algorithm is used to optimize the parameters of HMM model. By using this algorithm through iteration a maximum of the likelihood would be obtained. In Fig. 3 the relationship between number of iterations and log-likelihood for both PD and healthy subjects is shown. As can be seen in this figure, the log-likelihood maximization begins during about 20 iterations for both PD and healthy gait data. Therefore, an HMM classifier with twenty iterations can be used for classification.

Fig. 3
figure 3

Log-likelihood increase during increasing of iterations of the training process for both PD and healthy gait data

In order to investigate the effect of the number of states on the performance of classifier, the relationship between performance of classifier and the parameter of number of states has been investigated. Fig. 4 shows the recognition rates for PD and healthy subjects for different number of states. These results show that the best classification performance is obtained when the number of state is 5.

Fig. 4
figure 4

The relationship between Recognition rates and number of Gaussian mixtures for PD and healthy subjects during training phase

The same procedures have been used to investigate the effect of number of Gaussian mixtures on the performance of HMM classifier. Figure. 5 shows the recognition rates for PD and healthy subjects for different number of Gaussian mixtures. These results show that, the best classification performance is obtained when the number of mixture is 4.

Fig. 5
figure 5

The relationship between Recognition rates and number of states for PD and healthy subject during training phase

By investigation the effect of aforementioned parameters on the performance of classifier, the number of states, the number of Gaussian mixture and maximum number of iterations are chosen 5, 4 and 20, respectively. The final structure of HMM model with selected parameters for the final evaluation of performance of HMM model in gait data classification is shown in Table 1.

Table. 1 HMM structure

In Table 2 the result of classification of PD and healthy subjects based on the stride-interval gait data using HMM method has been shown. The leave-one-out (LOO) cross-validation method was used for evaluation of performance of the proposed classifier [26]. In this method, one gait pattern is used for the test of validation and the remaining patterns are used for the training of the classifier. According to Table 2, one PD and two healthy subjects were classified incorrectly by HMM. Two statistical parameters including sensitivity and specificity are used for evaluation of the accuracy of classifier in detection of PD and healthy subjects separately. These parameters are defined as follows:

Table. 2 result of recognition of Parkinson and healthy subjects

Sensitivity: the percentage of PD patients who are correctly recognized as PD ones.

Specificity: the percentage of healthy subjects who are correctly recognized as healthy ones.

In Table 2 the performance of HMM classifier is compared to LS-SVM method based on the obtained result from [18]. The comparison of results show that the HMM method resulted in higher sensitivity rates. However, higher specificity rates are obtained by using LS-SVM algorithm. The overall accuracy rate of both methods is similar with accuracy of 90.3 %.

Conclusions

Signal processing methods can help engineers to achieve diagnostic information through analysis of walking patterns in the patients suffering from neurological diseases. This information can be used to separate persons with specific neurodegenerative diseases from healthy ones. In the current study, we have tried to introduce a method to distinguish the PD persons from healthy ones. The current study is consisted of three main steps; raw gait data obtaining, preprocessing and classification using HMM. The gait rhythms of both PD and healthy subjects in terms of stride interval are used as the input of HMM classifier. In the preprocessing step, the samples of gait data recognized as outliers are simply substituted with the median value of whole time series of gait data. In the final step, the gait data are classified using HMM classifier. The investigation of results shows that the proposed method is efficient for interpretation of PD. The proposed classifier based on the raw gait data can correctly identify more than 90 % of the 31 subjects. These results are comparable to the results corresponding to the nonlinear SVM classifier based on the two feature extracted from the same gait data [18]. The same gait dataset was also used in several studies. In [27] a linear model was proposed to investigate the stride interval time series in PD, but the results showed that such a linear model is only useful for interpretation of the gait signals during walking at a constant speed.

In this article because of lack of match between age and gender of the subjects in PD and healthy subjects, we could not investigate the effect of these two factors on the performance of proposed classifier. However, it was shown that the effect of gender on usual locomotion patterns is not considerable [28]. In another study, It was reported that the effect of age factor on walking is extremely complex [29]. However, studies show that the factor of neurological disease is more effective in changing the rhythm of gait than the factor of age [3]. In the current study, the size of gait dataset was low and so this limited us to test the generality of the proposed method on recognition of PD. In the future studies we hope to construct a larger gait dataset to evaluate the performance of current classifier on recognizing PD.