Gender and age classification using a new Poincare section-based feature set of ECG

Goshvarpour, Ateke; Goshvarpour, Atefeh

doi:10.1007/s11760-018-1379-5

Original Paper
Published: 31 October 2018

Volume 13, pages 531–539, (2019)

Signal, Image and Video Processing


Abstract

In medicine, the importance of individual differences has frequently been emphasized, and gender- and age-related differences are among the most important individual parameters. Electrocardiogram (ECG) signals are subject to these differences, yet limited information is available regarding such individual dissimilarities in ECG dynamics. This study aimed to evaluate gender and age differences by means of novel Poincare section indices. Our focus was to detect and classify the dynamical behavior of ECG trajectories using three binary classification strategies: (1) gender-based, (2) age-based, and (3) combined gender- and age-based classification. After constructing the 2D phase space of the ECG, linear Poincare sections at distinct angles were developed and several geometric indices were extracted. The effect of the phase-space delay on the ECG measures was also inspected. We tested our algorithm on 79 healthy subjects. Using a support vector machine, a maximum correct rate of 93.33% was achieved for both the gender-based and the age-based classification strategies. Considering the information of both age and gender, the highest rate was 94.66%. The best results were achieved with delays of 5 and 6. In conclusion, our results showed that the basin geometry of the ECG phase states is affected by individual differences.

1 Introduction

Electrocardiogram (ECG) signals contain rich information about cardiac electrical activity and are often used as a clinical and diagnostic tool [1, 2]. Women and men differ in their hormonal, anatomical, physiological, biochemical, and biomedical responses [3,4,5]. Therefore, gender information plays an important role in ECG signal interpretation in certain physiological, psychological, or pathological states. In addition, it is well established that other factors, such as the subject’s age, can influence the ECG signals [6]. To evaluate gender- or age-based differences, most previous investigations concentrated on the statistical and morphological features of ECG time series [6, 7]. Additionally, it has previously been shown that different frequency domain features can provide some information about individual differences in different states of disease and health. Despite the valuable and encouraging results, these techniques suffer from some shortcomings. One of the major shortcomings is that frequency domain-based methods do not offer any details about the position of the frequency components in time. To overcome this limitation, wavelet-based procedures have been presented to examine non-stationary signals. Despite the impressive reputation of wavelet-based methods in bio-signal interpretation, they are hampered by some main limitations [8].

Due to the non-stationary, complex, and chaotic nature of ECG, many recent investigations have emphasized the potential of nonlinear bio-signal processing methods. By means of dynamic and nonlinear features, an increasing number of methodologies have been presented to scrutinize gender-based ECG characteristics in various fields [9,10,11,12]. In addition, some investigations dealt with age-related ECG differences based on nonlinear dynamics [10, 13, 14]. Although some nonlinear algorithms, such as the Lyapunov exponent, can deliver global information about the reconstructed phase space structure of ECG and its trajectories, these methods are not able to describe the shape details of ECG trajectories. The geometric pattern of data points positioned on Poincare surfaces has served as beneficial information in the study of nonlinear bio-signals [15,16,17,18,19]. These techniques can also be utilized to scrutinize the detailed shape information included in the reconstructed phase space.

Following the idea of employing Poincare sections for capturing the chaotic cardiac behavior within ECG time series, we attempt to design a dynamic scheme to investigate gender- and age-based differences. Overall, the proposed framework covers data selection, preprocessing, feature extraction, and classification, which are described thoroughly in the next sections. An overview of the proposed procedure is provided in Fig. 1.

2 Materials and methods

2.1 Data

The data were selected from the ECG-ID database available at PhysioBank [20]. The database contains 310 lead I ECG segments of 20 s each, acquired from 90 participants at a sampling rate of 500 Hz. Data filtering included baseline drift removal, AC power line noise elimination (using a band-stop filter), exclusion of high-frequency distortions, and signal smoothing [20]. In this study, the ECGs of 79 subjects were used, including 37 males (age: 31.24 ± 13.92) and 42 females (age: 25.81 ± 10.8).

2.2 Preprocessing

The ECG segments (X) were normalized as follows:

$$ \text{norm}(X) = 2\,\frac{X - \min(X)}{\max(X) - \min(X)} - 1 $$

(1)

Further processing was performed on windows of 0.8 s, corresponding to the duration of a normal ECG cycle [21].
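The preprocessing step can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `normalize` and `split_windows` are hypothetical, and the use of consecutive, non-overlapping windows is an assumption, since the text does not state how the 0.8-s windows are positioned.

```python
def normalize(segment):
    """Scale the samples linearly to the range [-1, 1], following Eq. (1)."""
    lo, hi = min(segment), max(segment)
    if hi == lo:
        raise ValueError("a constant segment cannot be normalized")
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in segment]


def split_windows(segment, fs=500, window_s=0.8):
    """Cut a segment into consecutive windows of window_s seconds.

    Assumption: windows do not overlap and a trailing partial window is dropped.
    """
    size = int(round(fs * window_s))  # 400 samples at 500 Hz
    return [segment[i:i + size] for i in range(0, len(segment) - size + 1, size)]
```

Under these assumptions, one 20-s segment sampled at 500 Hz (10,000 samples) yields 25 windows of 400 samples each.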

2.3 Phase space

First, the trajectory is defined in a d-dimensional space by plotting the set of delay vectors:

$$ [x_{k} ,x_{k + \tau } ,x_{k + 2\tau } , \ldots ,x_{k + (d - 1)\tau } ] = X(k)\quad {\text{for}}\quad k = 1,2, \ldots ,N - (d - 1)\tau $$

(2)

for a scalar time series x_i (i = 1, 2, …, N). In this equation, τ and d are the lag and the embedding dimension, respectively, and X(k) denotes the delayed vector in the phase space. The phase space reconstruction is crucially affected by the lag: a very small τ yields strongly correlated coordinates, whereas a very large τ yields nearly uncorrelated coordinates. We examined τ = 2–7 in the reconstruction of the ECG phase space.
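As an illustration of Eq. (2), the following sketch builds the delay vectors for a given lag. The function name `delay_embed` is hypothetical, the index k starts at 0 rather than 1, and the default d = 2 reflects the 2D phase space used in this study.

```python
def delay_embed(x, tau, d=2):
    """Return the delay vectors X(k) of Eq. (2), with k counted from 0."""
    n_vectors = len(x) - (d - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this lag and dimension")
    return [tuple(x[k + j * tau] for j in range(d)) for k in range(n_vectors)]


# Example: a six-sample series embedded with lag 2 gives four 2D points.
points = delay_embed([0, 1, 2, 3, 4, 5], tau=2)
```

Each returned tuple is one point of the phase-space trajectory; with d = 2 the points are the pairs (x_k, x_{k+τ}).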

2.4 Poincare section

The trajectory configuration is described and the attractor type is specified using a Poincare section, which is defined first by selecting a Poincare hyperplane and then by specifying the crossing points (also called intersections) of the hyperplane and the trajectory. In a 2D space, the Poincare section is a line (Eq. 3):

$$ y = \tan (\theta )x + b $$

(3)

where tan(θ) is the slope and b is the y-intercept. In this study, b was set to zero. Figure 2 shows the trajectory of an ECG cycle in the phase plane (black curve). The Poincare sections are shown in gray, and a crossing point of the trajectory with a Poincare section is marked with a blue circle.

The selection of the section angle (θ) is very influential: an inappropriate θ can result in incorrect basin features. We examined θ in the range of 0°–360° with a step size of 15° (Fig. 2). A line equation was calculated for each pair of successive points of the ECG trajectory F(x, y), and the crossing point of this line with Eq. (3) was computed (Eq. 4):

$$ \left\{ \begin{array}{l} x_{\text{crossing point}} = \dfrac{y_{n} - m x_{n} - b}{\tan(\theta) - m} \\ y_{\text{crossing point}} = \tan(\theta)\, x_{\text{crossing point}} + b \end{array} \right. $$

(4)

in which $ x_{n} \le x_{\text{crossing point}} \le x_{n + 1} $ and $ m = (y_{n} - y_{n - 1})/(x_{n} - x_{n - 1}) $. Here, m denotes the slope of the line passing through two successive points of the trajectory F(x, y), and x_n and y_n are the trajectory coordinates.
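The crossing-point computation of Eqs. (3) and (4) can be sketched as follows. The function name `crossing_points` is hypothetical, and three details are assumptions added for robustness rather than steps described in the text: vertical trajectory segments (undefined m) are handled separately, segments parallel to the section are skipped, and the range check accepts segments traversed in either direction.

```python
import math


def crossing_points(trajectory, theta_deg, b=0.0):
    """Intersect each trajectory segment with the line y = tan(theta) * x + b."""
    slope = math.tan(math.radians(theta_deg))
    points = []
    for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:]):
        if x1 == x0:
            # Vertical segment: m is undefined, so evaluate the section at x0.
            yc = slope * x0 + b
            if min(y0, y1) <= yc <= max(y0, y1):
                points.append((x0, yc))
            continue
        m = (y1 - y0) / (x1 - x0)  # slope of the segment
        if math.isclose(m, slope):
            continue  # segment parallel to the section: no single crossing
        xc = (y0 - m * x0 - b) / (slope - m)  # Eq. (4)
        if min(x0, x1) <= xc <= max(x0, x1):
            points.append((xc, slope * xc + b))
    return points


# Sweep the section angle in 15-degree steps, as in the text.
trajectory = [(0.0, 1.0), (1.0, -1.0), (2.0, 1.0)]
counts = {theta: len(crossing_points(trajectory, theta)) for theta in range(0, 360, 15)}
```

In this sketch a crossing that falls exactly on a sample point can be reported by both adjacent segments, and angles of 90° and 270° rely on a very large but finite floating-point tangent; a production version would need to treat both cases explicitly.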

Finally, the following indices were extracted: the number of crossing points (F1) and the smallest (F2), largest (F3), and mean (F4) basin area over all ECG cycles. To calculate F2–F4, the area of the basin was first computed for every ECG cycle, and then the minimum, maximum, and mean of this measure were taken. In addition, the average standard deviation (SD) of the data distribution in the horizontal (F5) and vertical (F6) coordinates, the average third moment in the horizontal (F7) and vertical (F8) coordinates, and the average fourth moment in the horizontal (F9) and vertical (F10) coordinates were extracted.
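The distribution statistics behind F1 and F5–F10 can be illustrated for a single set of crossing points. This sketch makes assumptions that the text does not confirm: the moments are taken as non-normalized central moments, the SD is the population SD, and the names `central_moment` and `section_statistics` are hypothetical. The averaging over sections and cycles and the basin-area features (F2–F4) are not shown, because their exact definitions are not given here.

```python
import math


def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def section_statistics(points):
    """Count, SD, and third and fourth central moments of crossing-point coordinates."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return {
        "count": len(points),                    # contributes to F1
        "sd_x": math.sqrt(central_moment(xs, 2)),  # contributes to F5
        "sd_y": math.sqrt(central_moment(ys, 2)),  # contributes to F6
        "m3_x": central_moment(xs, 3),             # contributes to F7
        "m3_y": central_moment(ys, 3),             # contributes to F8
        "m4_x": central_moment(xs, 4),             # contributes to F9
        "m4_y": central_moment(ys, 4),             # contributes to F10
    }
```

Averaging these per-section values over all angles and cycles would then give one candidate reading of the "average" in the feature definitions.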

2.5 Classification

Three binary classification strategies were considered: (1) separating the two gender categories of male (M) and female (F); (2) classifying two age-groups, namely younger adults (A1, ≤ 23 years) and older adults (A2, > 23 years); and (3) considering both age and gender information concurrently. For the third strategy, four classes were defined: younger male adults (MA1), younger female adults (FA1), older male adults (MA2), and older female adults (FA2), and a one-versus-all scheme was adopted.
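The class definitions above can be expressed compactly. In this sketch the helper names `group_label` and `one_vs_all` are hypothetical; the 23-year threshold and the class names come from the text.

```python
def group_label(sex, age, age_threshold=23):
    """Combine sex ("M" or "F") with the age-group into MA1, FA1, MA2, or FA2."""
    age_group = "A1" if age <= age_threshold else "A2"
    return f"{sex}{age_group}"


def one_vs_all(labels, positive):
    """Binary targets for a one-versus-all problem: 1 for the chosen class, 0 otherwise."""
    return [1 if label == positive else 0 for label in labels]


# Example: four subjects mapped to the combined classes, then FA2 against the rest.
labels = [group_label(s, a) for s, a in [("F", 23), ("M", 40), ("F", 31), ("M", 19)]]
targets = one_vs_all(labels, "FA2")
```

Each of the four combined classes gets its own binary problem, which keeps the third strategy within the binary-classifier setting used for the first two.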

Before the measures were fed to the classifier, they were normalized. A fivefold cross-validation scheme was repeated 10 times, and accuracy, sensitivity, and specificity were calculated to evaluate the classifier performance.
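The evaluation protocol can be sketched without committing to a particular classifier. The code below generates the index splits for a repeated fivefold cross-validation and computes the three reported metrics from binary predictions. The function names, the random shuffling, and the fixed seed are assumptions; the text does not say whether the folds were stratified or split by subject.

```python
import random


def repeated_kfold(n_samples, k=5, repeats=10, seed=0):
    """Yield (train_indices, test_indices) for each fold of each repetition."""
    rng = random.Random(seed)
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        for fold in range(k):
            test_idx = order[fold::k]  # every k-th shuffled index forms one fold
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield train_idx, test_idx


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }
```

With k = 5 and 10 repetitions this produces 50 train/test splits, and averaging the metrics over them gives mean rates of the kind reported in the Results.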

The popular SVM algorithm was implemented for categorization. This technique is known to be valuable in bio-signal categorization [22]. It usually adopts a nonlinear kernel function to transform the input data into a high-dimensional space, in which the data are more easily separable than in the original input space. Depending on the input measures, the iterative learning procedure of SVM constructs an optimum hyperplane that has the largest margin between the categories. The maximum-margin hyperplanes then define the decision boundaries used to recognize the different classes. Therefore, the larger the distance between the hyperplanes and the data points, the higher the expected classification rates. In this study, radial basis function (RBF), polynomial, and quadratic kernels were tested.
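For reference, the three kernel families named above can be written as plain functions. The parameter values are assumptions, since the text does not report them: `gamma`, `degree`, and `coef0` are illustrative defaults, and the quadratic kernel is taken here to be a polynomial kernel of degree 2.

```python
import math


def rbf_kernel(u, v, gamma=1.0):
    """Radial basis function kernel: exp(-gamma * ||u - v||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(u, v, degree=3, coef0=1.0):
    """Polynomial kernel: (u . v + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(u, v))
    return (dot + coef0) ** degree


def quadratic_kernel(u, v, coef0=1.0):
    """Quadratic kernel, treated here as a degree-2 polynomial kernel."""
    return polynomial_kernel(u, v, degree=2, coef0=coef0)
```

An SVM uses such a function in place of the inner product between feature vectors, which is what allows a linear maximum-margin boundary in the transformed space to act as a nonlinear boundary on the original F1–F10 features.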

3 Results

After preprocessing the data, the 2D phase space of ECG segments was reconstructed for lags 2–7 (Fig. 3).

As shown in Fig. 3, the trajectory pattern differed across lags. As the lag increased, the area of the phase space became larger and its pattern changed from oval to circular. Then, Poincare sections at different angles were formed and 10 geometrical indices were extracted from the crossing points of the Poincare sections. The mean, maximum, and minimum values of the ECG features at different lags are shown in Fig. 4 for the male and female groups.

Not only did gender affect the values of the indices, but the effect of lag was also evident (Fig. 4). For example, the maximum number of Poincare crossing points (F1) was higher in females than in males. The mean and SD of the parameters in the two age-groups are reported in Table 1.

Table 1 Mean ± SD of 10 extracted features (F1–F10) with different delays in two age ranges


Both lag and age affected the values of the indices (Table 1). For example, lower F1 and F2 values were observed for A2 than for A1, whereas F3 values were higher in A2. The average area (F4) grew with increasing delay, particularly in the first age-group. For the other features, there were differences between the two age-groups and among the various delays, although these changes did not follow a specific pattern.

Performance evaluation of the features in terms of age and gender was done using SVM (Fig. 5).

The highest mean rates for age categorization were achieved at lag 6 using the quadratic kernel (Fig. 5a). In this case, the mean accuracy, sensitivity, and specificity rates were 83.33, 95, and 97.14%, respectively. The second best rates were obtained with the polynomial kernel, with corresponding rates of 81.33, 90, and 94.29%. For age classification, the highest accuracy, sensitivity, and specificity were 93.33, 87.5, and 100%, respectively, using the quadratic kernel. The highest mean classification rates for gender separation were achieved at lag 5 using the quadratic kernel (Fig. 5b), where the mean accuracy, sensitivity, and specificity rates were 83.33, 94.29, and 95%, respectively. The second best rates were obtained with the RBF kernel, with corresponding rates of 83.33, 88.57, and 100%. For gender classification, the highest accuracy, sensitivity, and specificity were 93.33, 100, and 100%, respectively, using the RBF and polynomial functions.

It can be concluded that the best results for the separation of the gender and age classes were obtained with delays of 5 and 6, respectively. Therefore, to take into account the effect of both gender and age, we used only these two delays. The mean ± SD of F1–F10 in the four different groups is provided in Table 2.

Table 2 Mean ± SD of 10 extracted features (F1–F10) with delays 5 and 6 in four different groups


At both lags, the ECG parameters differed between the two age-groups and between the two genders (Table 2). Considering both age and gender groups concurrently, the mean performances for lag 5 and lag 6 are reported in Table 3.

Table 3 Mean SVM accuracy, sensitivity, and specificity rates in 10 times run, using RBF, polynomial, and quadratic as a kernel function in two lags of 5 and 6


High performances were achieved using the proposed methodology (Table 3). FA2 obtained the highest mean rates with all kernel functions and at both lags. This class was separated with the highest mean rate of 94.66% using the polynomial kernel and lag 5. The second best results were obtained for MA1, which was recognized with the highest mean rate of 90% using RBF and lag 6. Considering all classes, the mean accuracy rates were in the range of 80–85%. The highest mean accuracies were obtained using RBF. The sensitivity and specificity rates were also promising: the mean sensitivity rates were in the range of 92–98% and the mean specificity rates were in the range of 87–96%. The second best accuracy was obtained with the polynomial kernel function, with mean rates of 83.33 and 82.33% for lag 5 and lag 6, respectively.

4 Discussion

Many factors affect ECG interpretation, including heart size, torso morphology, ECG lead placement, environmental artifacts, and the person’s height, weight, age, gender, race, and genetic background. Therefore, it is both important and challenging to have ECG algorithms that are robust across different clinical conditions. The main contribution of this study was to evaluate subject differences in terms of age and gender using ECG. We employed the ECGs of 79 subjects to scrutinize the effect of age and gender on the reconstructed ECG phase space. We defined 10 features to quantify the points at which the Poincare sections intersect the ECG phase-space trajectory. Our results revealed that ECG dynamics differed between the two age ranges and between the two genders (Table 2). These results are consistent with previous findings. A former analysis [6] showed that some global ECG indices are significantly different in females and males. Another study [23] emphasized that, as ECG characteristics vary with gender and age, diagnostic ECG criteria should be age and sex specific. To examine the impact of age and gender in paced breathing, spectral and sample entropy indices were employed [10]. It was shown that fluctuations in cardio-respiratory coupling were noticeable only in middle-aged male subjects. Beckers et al. [13] reported that nonlinear measures of heart rate variability (HRV) declined with age; however, there were no clear gender-based differences in these indices. ECG differences between females and males were investigated in response to sad stimuli [9], and the efficiency of nonlinear indices in revealing gender-wise ECG differences was reported. In another study, the same authors showed that, compared to females, sleep deprivation affects the ECG of males during affective stimuli [24].

Only a limited number of studies have applied nonlinear methods to investigate ECG dynamics in terms of the age and sex of subjects, and most of them provided only global information about the ECG trajectory. Compared with these investigations, our proposed framework attempts to track the local information embedded in the ECG basin.

We obtained the highest accuracy of 93.33% for both the gender-based and the age-based classification strategies. By combining age and gender information, the maximum rate was 94.66%. Few previous studies have addressed gender and age classification using physiological signals. For gender classification, some frequency- and time-based HRV features were used [25]; using SVM, a maximum accuracy of 84% was reported. A review article on gender classification [26] showed that SVM has been a common classifier in this field. Other classifiers have also been used in age or gender categorization [27]. For age classification, the highest area under the ROC curve (86.25%) was reported for a Bayesian network. Although these studies used some nonlinear features, detailed properties of the ECG phase space have not been evaluated.

Although in this study we focused on the ECG Poincare section indices of healthy subjects based on the gender and age differences, in future investigations, the impact of these two factors on ECG dynamics of patients with heart ailments should be carefully studied by means of the proposed algorithm.

5 Conclusions

This manuscript presented a novel age and gender classification approach using Poincare section indices. Promising results were achieved with these phase space-based nonlinear features. Further improvements were obtained by incorporating both age and gender information concurrently. Overall, our findings provide better insight into age- and gender-based discrimination using ECG characteristics delivered by the Poincare section. In addition, considering the simplicity and rich information of the suggested technique, which is based on the chaotic nature of ECG, the algorithm can be applied as an efficient method for ECG waveform analysis in different states of disease and health, as well as for prediction and diagnosis purposes.

References

1. Arafat, M.A., Chowdhury, A.W., Hasan, K.: A simple time domain algorithm for the detection of ventricular fibrillation in electrocardiogram. SIVP 5(1), 1–10 (2011)
2. Bassiouni, M.M., El-Dahshan, E.S.A., Khalefa, W., Salem, A.M.: Intelligent hybrid approaches for human ECG signals identification. SIVP 12(5), 941–949 (2018)
3. Fomin, A., Da Silva, C., Ahlstrand, M., Sahlén, A., Lund, L., Stahlberg, M., Gabrielsen, A., Manouras, A.: Gender differences in myocardial function and arterio-ventricular coupling in response to maximal exercise in adolescent floor-ball players. BMC Sports Sci. Med. Rehabil. 6, 24 (2014)
4. Mieszczanska, H., Pietrasik, G., Piotrowicz, K., McNitt, S., Moss, A.J., Zareba, W.: Gender-related differences in electrocardiographic parameters and their association with cardiac events in patients after myocardial infarction. Am. J. Cardiol. 101(1), 20–24 (2008)
5. Kolb, B., Whishaw, I.Q.: An Introduction to Brain and Behavior, 2nd edn. Worth Publisher, New York (2005)
6. Xue, J., Farrell, R.M.: How can computerized interpretation algorithms adapt to gender/age differences in ECG measurements? J. Electrocardiol. 47(6), 849–855 (2014)
7. Patro, K.K., Kumar, P.R.: Effective feature extraction of ECG for biometric application. Procedia Comput Sci 115, 296–306 (2017)
8. Kingsbury, N.: The dual-tree complex wavelet transform: a new technique for shift invariance and directional filters. In: Proceedings of the 8th IEEE DSP Workshop, Utah, August 9–12, 1998, Paper no. 86
9. Goshvarpour, A., Abbasi, A., Goshvarpour, A.: Do men and women have different ECG responses to sad pictures? Biomed. Signal Process. Control 38, 67–73 (2017)
10. Kapidzic, A., Platisa, M.M., Bojic, T., Kalauzi, A.: Nonlinear properties of cardiac rhythm and respiratory signal under paced breathing in young and middle-aged healthy subjects. Med. Eng. Phys. 36(12), 1577–1584 (2014)
11. Anishchenko, T., Igosheva, N., Yakusheva, T., Glushkovskaya-Semyachkina, O., Khokhlova, O.: Normalized entropy applied to the analysis of interindividual and gender-related differences in the cardiovascular effects of stress. Eur. J. Appl. Physiol. 85, 287–298 (2001)
12. Ryan, S.M., Goldberger, A.L., Pincus, S.M., Mietus, J., Lipsitz, L.A.: Gender- and age-related differences in heart rate: are women more complex than men? J. Am. Coll. Cardiol. 24, 1700–1707 (1994)
13. Beckers, F., Verheyden, B., Aubert, A.E.: Aging and nonlinear heart rate control in a healthy population. Am. J. Physiol. Heart Circ. Physiol. 290, H2560–H2570 (2006)
14. Pikkujamsa, S.M., Makikallio, T.H., Sourander, L.B., Raiha, I.J., Puukka, P., Skytta, J., Peng, C.K., Goldberger, A.L., Huikuri, H.V.: Cardiac interbeat interval dynamics from childhood to senescence: comparison of conventional and new measures based on fractals and chaos theory. Circulation 100, 393–399 (1999)
15. Karimui, R.Y., Azadi, S.: Cardiac arrhythmia classification using the phase space sorted by Poincare sections. Biocybern Biomed Eng 37, 690–700 (2017)
16. Parvaneh, S., Golpayegani, M.R.H., Firoozabadi, M., Haghjoo, M.: Predicting the spontaneous termination of atrial fibrillation based on Poincare section in the electrocardiogram phase space. Proc Inst Mech Eng H 226(1), 3–20 (2011)
17. Fang, S.C., Chan, H.L.: Human identification by quantifying similarity and dissimilarity in electrocardiogram phase space. Pattern Recogn. 42, 1824–1831 (2009)
18. Yang, S.: Nonlinear signal classification in the framework of high-dimensional shape analysis in reconstructed state space. IEEE Trans. Circuits Syst. II Express Briefs 52, 512–516 (2005)
19. Yang, S.: Nonlinear signal classification using geometric statistical features in state space. Electron. Lett. 40, 780–781 (2004)
20. Lugovaya, T.S.: Biometric human identification based on electrocardiogram. Master’s Thesis. Faculty of Computing Technologies and Informatics, Electrotechnical University “LETI”, Saint-Petersburg (June 2005)
21. Najarian, K., Splinter, R.: Biomedical signal and image processing, 2nd edn, pp. 1–405. Taylor & Francis Group, LLC, CRC Press, New York (2012)
22. Wang, X.W., Nie, D., Lu, B.L.: EEG-based emotion recognition using frequency domain features and support vector machines. In: Lu, B.L., Zhang, L., Kwok, J. (eds) Neural Information Processing, ICONIP 2011, Lecture Notes in Computer Science, Vol. 7062, pp. 734–743. Springer, Berlin (2011)
23. Rijnbeek, P.R., van Herpen, G., Bots, M.L., Man, S., Verweij, N., Hofman, A., Hillege, H., Numans, M.E., Swenne, C.A., Witteman, J.C., Kors, J.A.: Normal values of the electrocardiogram for ages 16–90 years. J. Electrocardiol. 47(6), 914–921 (2014)
24. Goshvarpour, A., Abbasi, A., Goshvarpour, A.: Sleep loss effects on affective responses of women and men using ECG characteristics. Biomed. Eng. Appl. Basis Commun. 29(5), 1750032 (2017)
25. Tripathy, R.K., Acharya, A., Choudhary, S.K.: Gender classification from ECG signal analysis using least square support vector machine. Am. J. Signal Process. 5(2), 145–149 (2012)
26. Lin, F., Wu, Y., Zhuang, Y., Long, X., Xu, W.: Human gender classification: a review. Int. J. Biom. 8(3/4), 275–300 (2016)
27. Wigging, M., Saad, A., Litt, B., Vachtsevanos, G.: Evolving a Bayesian classifier for ECG-based age classification in medical applications. Appl. Soft Comput. 8(1), 599–608 (2008)

Author information

Authors and Affiliations

Department of Biomedical Engineering, Imam Reza International University, Mashhad, Razavi Khorasan, Iran
Ateke Goshvarpour
Department of Biomedical Engineering, Faculty of Electrical Engineering, Sahand University of Technology, Tabriz, Iran
Atefeh Goshvarpour
Mashhad, Iran
Atefeh Goshvarpour

Corresponding author

Correspondence to Atefeh Goshvarpour.

About this article

Goshvarpour, A., Goshvarpour, A. Gender and age classification using a new Poincare section-based feature set of ECG. SIViP 13, 531–539 (2019). https://doi.org/10.1007/s11760-018-1379-5

Received: 06 March 2018
Revised: 23 August 2018
Accepted: 16 October 2018
Published: 31 October 2018
Issue Date: 03 April 2019
DOI: https://doi.org/10.1007/s11760-018-1379-5
