1 Introduction

Heart rate variability (HRV) is a non-invasive measure related to the balance between the activities of the sympathetic and parasympathetic divisions of the autonomic nervous system [1]. This variability is normal and indicates the heart's ability to respond to environmental and physiological stimuli [2]. The balance of nervous system activities results in a nonlinear behaviour of the HRV time series. In general, autonomic and parasympathetic activities attenuate with age [3], which is associated with a reduction of the HRV [4] (comparing normal healthy adults with older adults).

There are several methods for HRV analysis [5], for example, standard linear techniques (time and frequency domain analysis) and nonlinear methods (correlation dimension analysis, largest Lyapunov exponent, central tendency measure, Poincaré plot). However, to our knowledge, none of them is regarded as universally applicable or effective for all cases of HRV analysis. In this study we propose a methodology based on the recurrence plot and recurrence quantification analysis. In recent years, the recurrence plot (RP) and recurrence quantification analysis (RQA) have been applied to study different dynamical systems [6, 7] in natural science, physics [8], biological systems, and physiological processes involving heart rate variability [9–12]. Given its intrinsic discrete character, RQA is particularly suited to the analysis of HRV time series and allows for a direct quantification of the complex dynamics of heart rhythm modulation [13, 14].

RQA is a useful tool that helps in understanding the variation of autonomic nervous system activity over time. The major advantages of RQA and recurrence plots (RPs) over standard HRV analysis are their applicability to nonstationary data and their sensitivity to subtle changes in the dynamics of the cardiovascular system. These aspects enable RPs to be used to characterize changes in the basic cardiovascular parameters under both physiological and pathological conditions. However, RQA statistics alone are known not to provide sufficiently consistent information for a suitable classification of HRV time series. Our goal here is an effective method that is sensitive enough to differentiate systems with very similar dynamics. The gain in discrimination sensitivity obtained by combining a support vector machine (SVM) with RQA is shown in the subsequent sections. In this work we evaluated RQA measures, combined with SVM, to discriminate and identify groups of different ages, including information about the system.

2 Materials and Methods

2.1 Experimental Database

The study comprised a total of \(148\) tachograms divided into four groups: \(26\) full-term newborns (FNB) (\(8\) days on average), \(48\) premature newborns (PNB) (\(\pm 27.4\) days), \(61\) healthy young adults (HYA) (\(20.7\pm 1.6\) years), and \(61\) adults in preoperative evaluation for coronary artery bypass grafting for severe coronary disease (SCD) (\(58.4\pm 10.2\) years). All tachograms come from the databases of previous studies of the Transdisciplinar Nucleus for Chaos and Complexity Study (NUTECC/Brazil) [15, 16]. The time series have recording periods from 15 min up to 1 h and were obtained from subjects in a supine resting position without visual or sound stimulation.

The equipment used to collect the signals was a Polar monitor (S810i or RS800), which has been shown [17–19] to be feasible and reliable for measuring HRV according to recognized standards [20]. At a sampling rate of 1000 Hz, this device captures the successive intervals between heartbeats, namely NN intervals, in normal sinus rhythm (i.e., initiated by the sinoatrial node). All these studies were approved by the respective ethics committees. All NN time series were filtered to remove artifacts using an adaptive filter that takes into account the peculiarities of the signal to be analyzed [5].

2.2 Recurrence Quantification Analysis

Defined as the repeated occurrence in time of a given state of a system, recurrence is a basic attribute of many dynamical systems. It means that, as time evolves, a trajectory repeatedly comes close to points of the state space that it has previously visited. Embedding the time series in a space of appropriate dimension and then marking in a matrix the recurrences that satisfy a tolerance rule results in a recurrence plot (RP), which is a graphical representation of the recurrences in the dynamical system. The visual features of such plots are appealing and reveal patterns that are not apparent in the original series [13].
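The construction just described can be illustrated with a minimal Python sketch. It assumes a time-delay embedding with dimension `m` and delay `tau`, and a tolerance rule of the form "Euclidean distance not larger than `eps`"; the choice of norm and the function names `embed` and `recurrence_matrix` are assumptions made for illustration and are not taken from the source.

```python
import math


def embed(series, m, tau):
    """Time-delay embedding: state i is (x[i], x[i+tau], ..., x[i+(m-1)*tau])."""
    n_states = len(series) - (m - 1) * tau
    if n_states <= 0:
        raise ValueError("series too short for this m and tau")
    return [tuple(series[i + k * tau] for k in range(m)) for i in range(n_states)]


def recurrence_matrix(series, m=1, tau=1, eps=0.1):
    """R[i][j] = 1 when embedded states i and j are within eps (Euclidean norm assumed)."""
    states = embed(series, m, tau)
    n = len(states)
    return [
        [1 if math.dist(states[i], states[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]
```

The matrix is symmetric and has ones on its main diagonal, since every state trivially recurs with itself. The pure-Python double loop is quadratic in the series length and is meant only to make the definition concrete.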

The RP represents the autocorrelation in the signal at all possible time scales. Since the main diagonal marks the identity in time, long-range correlations are associated with points far from the main diagonal, whereas the elements near the main diagonal correspond to short-range correlations. Diagonal lines reflect the repeated occurrence of similar states in the system dynamics and express the similarity of the system behavior in two distinct time sequences. To quantify such features, recurrence quantification analysis (RQA) was introduced to measure the quantitative information contained in recurrence plots [21].

For instance, the density of recurrence points in a recurrence plot is defined as the recurrence rate (RR), which gives the probability that a specific state will recur. The parameters based on diagonal lines are determinism (DET, the percentage of recurrence points that form diagonal lines among all recurrence points), average diagonal line length (L), maximal diagonal line length (Lmax), and entropy (the Shannon entropy of the histogram of diagonal line lengths, which indicates the complexity of the deterministic structure of the system).
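As a rough illustration of these definitions, the following Python sketch computes the diagonal-based measures from a recurrence matrix given as a list of 0/1 rows. Several conventions are assumptions here rather than statements from the source: the main diagonal is excluded, a diagonal line must contain at least `lmin = 2` points, and the entropy uses the natural logarithm. The function names are hypothetical.

```python
import math
from collections import Counter


def diagonal_line_lengths(R):
    """Lengths of all diagonal runs of 1s, skipping the main diagonal."""
    n = len(R)
    lengths = []
    for offset in range(-(n - 1), n):
        if offset == 0:
            continue  # line of identity, excluded by assumption
        run = 0
        for i in range(max(0, -offset), min(n, n - offset)):
            if R[i][i + offset]:
                run += 1
            else:
                if run:
                    lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return lengths


def diagonal_measures(R, lmin=2):
    """RR, DET, L, Lmax and Shannon entropy of the diagonal line lengths."""
    n = len(R)
    lengths = diagonal_line_lengths(R)
    recurrent = sum(lengths)  # every off-diagonal recurrence point lies in one run
    lines = [length for length in lengths if length >= lmin]
    hist = Counter(lines)
    entropy = -sum(
        (count / len(lines)) * math.log(count / len(lines)) for count in hist.values()
    )
    return {
        "RR": recurrent / (n * n - n) if n > 1 else 0.0,
        "DET": sum(lines) / recurrent if recurrent else 0.0,
        "L": sum(lines) / len(lines) if lines else 0.0,
        "Lmax": max(lines, default=0),
        "ENTR": entropy if lines else 0.0,
    }
```

For a strictly periodic signal every off-diagonal recurrence point belongs to a diagonal line, so DET equals 1, while an uncorrelated signal produces mostly isolated points and a DET close to 0.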

Vertical lines are also important structures in an RP because they reflect the persistence of a state during some time interval. The parameters derived from vertical lines are laminarity (LAM, the proportion of recurrence points that form vertical lines), trapping time (TT, the mean length of the vertical lines), and the maximal length of a vertical line (Vmax). Low TT, LAM, and Vmax values imply high complexity in the system's dynamics, since the system stays only for a short time in a state similar to a previously occurring one. Theoretically, diagonal and vertical structures would not occur in random (stochastic) processes, as opposed to deterministic ones [7].
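The vertical-line measures can be sketched in the same spirit. The Python fragment below takes a recurrence matrix as a list of 0/1 rows and assumes a minimum vertical line length of `vmin = 2` and that points on the main diagonal are counted; both conventions, as well as the function names, are illustrative assumptions and not taken from the source.

```python
def vertical_line_lengths(R):
    """Lengths of all runs of consecutive 1s within each column of R."""
    n = len(R)
    lengths = []
    for j in range(n):
        run = 0
        for i in range(n):
            if R[i][j]:
                run += 1
            else:
                if run:
                    lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return lengths


def vertical_measures(R, vmin=2):
    """Laminarity (LAM), trapping time (TT) and maximal vertical length (Vmax)."""
    lengths = vertical_line_lengths(R)
    recurrent = sum(lengths)  # total number of recurrence points in R
    lines = [length for length in lengths if length >= vmin]
    return {
        "LAM": sum(lines) / recurrent if recurrent else 0.0,
        "TT": sum(lines) / len(lines) if lines else 0.0,
        "Vmax": max(lines, default=0),
    }
```

A signal that lingers near one value for several consecutive samples produces long column runs and therefore large TT and Vmax, which is the persistence that the text associates with lower complexity.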

2.3 SVM Classifier

Support Vector Machines (SVMs), developed by [22], are supervised learning techniques used for classification, regression analysis and learning tasks. Such techniques can be applied to the solution of problems related to text categorization, image analysis, and bioinformatics [23]. The main idea behind this classifier is to construct a hyperplane that maximizes the distance (so-called margin) to the nearest data points pertaining to two classes as pictured in Fig. 1.

Fig. 1 Examples of the separation of two classes using an SVM classifier: (a) two linearly separable classes, (b) two classes with nonlinear separation, and (c) separation achieved by a hyperplane in a high-dimensional space
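As a reminder of the standard textbook formulation (the source does not state which kernel or regularization setting was used, so the symbols below are illustrative), for training pairs \((\mathbf{x}_i, y_i)\) with labels \(y_i \in \{-1, +1\}\), the maximum-margin hyperplane \(\mathbf{w}\cdot\mathbf{x} + b = 0\) in the linearly separable case solves

\[
\min_{\mathbf{w},\, b} \; \tfrac{1}{2}\,\|\mathbf{w}\|^{2}
\quad \text{subject to} \quad
y_i\,(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1 \quad \text{for all } i ,
\]

so that the margin between the two classes equals \(2/\|\mathbf{w}\|\). In the nonlinear case the inner product is replaced by a kernel function, which corresponds to finding a separating hyperplane in a higher-dimensional feature space.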

Table 1 Number of training and test sets employed in the SVM classifier

The classifier was trained on the dataset described above, which was divided empirically into the training and test sets enumerated in Table 1. The class label (PNB, FNB, HYA, SCD) of each NN interval time series was assigned by a cardiologist.

For each time series, eight RQA features were extracted to form the input for the classification step (Fig. 2). The SVM classifier was accessed through the LIBSVM open library [23] and executed \(100\) times for each RQA feature in each comparison between two clinical groups. Detailed information about the learning and classification algorithms can be found in [22, 23]. For each execution of the code, the training and test cases were randomly selected, and the average accuracy over the executions was then computed. Accuracy is defined as the ratio, expressed as a percentage, of the number of correctly classified cases to the total number of cases used for classification.

Fig. 2 Structure of the methodology for discrimination of HRV clinical groups
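The evaluation protocol described above (repeated random training/test splits, one RQA feature at a time, two groups per comparison) can be sketched as follows. To keep the example self-contained, a simple threshold placed midway between the two class means stands in for the SVM; this is explicitly not the LIBSVM classifier used in the study, and the per-class training size `n_train`, the seed, and all function names are assumptions made for illustration.

```python
import random
import statistics


def fit_threshold(train_x, train_y):
    """Stand-in for the SVM: cut one feature midway between the two class means."""
    classes = sorted(set(train_y))
    means = {
        c: statistics.fmean(x for x, y in zip(train_x, train_y) if y == c)
        for c in classes
    }
    low, high = sorted(classes, key=means.get)
    cut = (means[low] + means[high]) / 2
    return lambda x: high if x > cut else low


def average_accuracy(feature_by_class, n_train, n_runs=100, fit=fit_threshold, seed=0):
    """Mean and standard deviation of the test accuracy over random splits.

    feature_by_class maps each of the two class labels to the list of values
    of a single RQA feature, one value per time series.
    """
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_runs):
        train_x, train_y, test_x, test_y = [], [], [], []
        for label, values in feature_by_class.items():
            shuffled = rng.sample(values, len(values))  # random split per class
            train_x += shuffled[:n_train]
            train_y += [label] * n_train
            test_x += shuffled[n_train:]
            test_y += [label] * (len(values) - n_train)
        predict = fit(train_x, train_y)
        correct = sum(predict(x) == y for x, y in zip(test_x, test_y))
        accuracies.append(correct / len(test_y))
    return statistics.fmean(accuracies), statistics.pstdev(accuracies)
```

The accuracy is returned as a fraction between 0 and 1. Any classifier with the same `fit(train_x, train_y)` interface, including a wrapper around an SVM library, could be passed in place of the stand-in.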

3 RQA Plus SVM: Discriminating Almost Similar Dynamics

To show the effectiveness of the proposed RQA \(+\) SVM methodology, we used time series of the logistic map (\(x_{n+1} = r\,x_n\,(1-x_n)\)) for the values \(r=3.68\), \(r=3.7\) and \(r=3.9\) (see Fig. 3). For each value of the dynamic parameter \(r\), \(30\) time series were generated, each one with 2,000 points (the first 200 points were discarded to allow transients to die out), with initial conditions \(x(0)\in [0.1, 0.8]\) taken at an incremental step \(\varDelta {x(0)}=0.0241\).

Fig. 3 Bifurcation diagram of the logistic map with a zoomed-in view of the region of the chosen \(r\) parameters: \(r=3.68\), \(r=3.7\) and \(r=3.9\)
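The generation of the logistic map ensembles can be sketched in a few lines of Python. One point is an assumption: the text does not say whether the 200 discarded points are part of the 2,000 generated points, and the sketch below takes them to be, so each series keeps 1,800 points. The function names are hypothetical.

```python
def logistic_series(r, x0, n_points=2000, n_discard=200):
    """Iterate x_{n+1} = r * x_n * (1 - x_n) and drop the initial transient."""
    x = x0
    values = []
    for _ in range(n_points):
        values.append(x)
        x = r * x * (1.0 - x)
    return values[n_discard:]


def logistic_ensemble(r, n_series=30, x0_start=0.1, x0_step=0.0241):
    """One series per initial condition x0_start + k * x0_step, k = 0..n_series-1."""
    return [logistic_series(r, x0_start + k * x0_step) for k in range(n_series)]


# Three groups, one per value of the dynamic parameter r used in the text.
groups = {r: logistic_ensemble(r) for r in (3.68, 3.7, 3.9)}
```

With 30 series and a step of 0.0241 the initial conditions run from 0.1 to about 0.799, which stays inside the interval \([0.1, 0.8]\) quoted in the text.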

For the study of the RQA measures, the RP parameters selected for the logistic map were embedding dimension \(m=1\), delay \(\tau =1\) and threshold radius \(\varepsilon = 0.1\). Details about these values are given in [7].

Three groups were assigned to the SVM classifier according to the \(r\) values (\(r=3.68\), \(r=3.7\) and \(r=3.9\)). For each group, \(21\) time series were used for training and \(9\) for testing. The average accuracy and standard deviation obtained by the SVM are displayed in Fig. 4. In this figure an accuracy equal to \(1\) means that all of the tested cases (\(100\,\%\)) were correctly classified, while a value of zero means that none of the cases was correctly classified. An accuracy of \(75\,\%\) or higher is taken as the proper level of discrimination; for accuracy values below this threshold, the dynamics of the analyzed groups are considered to be similar. We can observe that for the pair of groups (\(r=3.68\), \(r=3.7\)) the RQA features are more similar than for the pairs (\(r=3.68\), \(r=3.9\)) and (\(r=3.7\), \(r=3.9\)). These results demonstrate the ability of the RQA \(+\) SVM methodology to differentiate groups with almost similar dynamics.

Fig. 4 Average accuracy and standard deviation (\(\langle a\rangle \pm \delta a\)) obtained by the SVM when comparing two groups of logistic map time series: (A) \(r = 3.68\) and \(r = 3.7\); (B) \(r = 3.7\) and \(r = 3.9\) (red line), and \(r = 3.68\) and \(r = 3.9\) (blue line). The proper level of accuracy (\(75\,\%\) or higher) is indicated by points to the right of the dashed vertical line

4 Application: Using HRV to Discriminate Physiological Age

The main objective of this study was to analyze RQA measures as a tool to discriminate HRV time series recorded from different clinical groups. Typical HRV time series and RP patterns are shown in Figs. 5 and 6, in which the peculiarities of each recurrence plot and of the corresponding HRV series are noticeable. For these plots and throughout the present study the RP parameters were set to \(m=3\), \(\tau =3\), and \(\varepsilon =8\). The choice of the embedding dimension (\(m=3\)) was based on results from the false nearest neighbor method [24]: we chose the smallest value of \(m\) that gave the minimum percentage of false neighbors. This value was adopted for all the time series analyzed, so that the whole data set was treated uniformly. The time delay for the embedding was set at the first minimum of the mutual information function [25], since at this delay the embedded signals have minimum overlapping information. The tolerance level, following the recommendation of [26], was selected so that the percentage of recurrence points lies between \(0.1\) and \(0.2\,\%\), in order to obtain reliable values for the RP parameters. Detailed discussions of the RP parameters are found in [13, 14, 26].
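One possible way to pick a tolerance that yields a prescribed percentage of recurrence points is to take a quantile of the pairwise distances between embedded states. The Python sketch below follows that idea; it is not claimed to be the procedure of [26] or of the authors, and the Euclidean norm, the exclusion of the main diagonal, and the function names are assumptions.

```python
import math


def embed(series, m, tau):
    """Time-delay embedding: state i is (x[i], x[i+tau], ..., x[i+(m-1)*tau])."""
    n_states = len(series) - (m - 1) * tau
    if n_states <= 0:
        raise ValueError("series too short for this m and tau")
    return [tuple(series[i + k * tau] for k in range(m)) for i in range(n_states)]


def radius_for_recurrence_rate(series, m, tau, target_rr):
    """Smallest eps such that at least a fraction target_rr of state pairs recur.

    target_rr is a fraction (0.001 for 0.1 %); the rule is distance <= eps.
    """
    states = embed(series, m, tau)
    distances = sorted(
        math.dist(states[i], states[j])
        for i in range(len(states))
        for j in range(i + 1, len(states))
    )
    if not distances:
        raise ValueError("need at least two embedded states")
    k = max(1, math.ceil(target_rr * len(distances)))
    return distances[k - 1]
```

For example, `radius_for_recurrence_rate(nn_intervals, m=3, tau=3, target_rr=0.001)` would return a radius giving roughly 0.1 % recurrence points for a hypothetical list `nn_intervals`. Sorting all pairwise distances is quadratic in the series length, so for long recordings a subsample of pairs would be a more practical basis for the quantile.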

Fig. 5 Examples of NN time series and RPs for the (a) FNB and (b) PNB groups (embedding dimension \(= 3\), delay \(= 3\) and threshold \(= 8\))

Fig. 6 Examples of NN time series and RPs for the (a) HYA and (b) SCD groups (embedding dimension \(= 3\), delay \(= 3\) and threshold \(= 8\))

For each group, the extracted RQA features are displayed in Figs. 7 and 8. We can notice that the RQA features are similar within the pairs of groups (SCD, HYA) and (FNB, PNB). To further examine the ability of RQA features to differentiate groups of different ages, we therefore applied SVM classification.

Fig. 7 Average values and standard deviations of diagonal-based RQA measures for each group

Fig. 8 Average values and standard deviations of vertical-based RQA measures for each group

A plot similar to those for the RQA measures (Figs. 7 and 8) is obtained when using the mean value and standard deviation of the HRV time series. We see in Fig. 9a that the groups FNB and PNB can be distinguished from the groups SCD and HYA in terms of the average values of the NN intervals. But since an NN interval gives the duration of a heartbeat (in milliseconds), the values displayed in Fig. 9a are only correlated with the mean heart rate for the time series of each group. We emphasize, however, that the mean value of the NN interval, i.e., the average heartbeat duration, is not enough to characterize the homeostasis of an individual, which is a dynamical process that is reflected in the heart rate variability. On the other hand, when the set of series is analyzed in terms of beat-to-beat NN interval variability, the separation between groups is no longer possible, as demonstrated in Fig. 9b.

Fig. 9 Average values for the full set of HRV time series, taking for each series (a) the NN interval and (b) the beat-to-beat NN interval variability

The average accuracy values of the RQA indices obtained from the SVM through comparisons between groups of different ages are reported in Fig. 10. It is seen that the RQA indices distinguish groups better the larger the age difference is. In fact, for small age differences, as in Fig. 10a, b, the average accuracies are restricted to 50 and 60 %, respectively. Nevertheless, this might indicate that the age difference between the HYA and SCD groups is more significant than that between the FNB and PNB groups. In support of this conclusion, we see in Fig. 5 that the recurrence plots for the groups FNB and PNB look more similar than the RPs for the HYA and SCD groups (Fig. 6).

In addition, the comparison of newborns with older individuals yields higher accuracy, namely \(80\,\%\), as demonstrated in Fig. 10d, and \(90\,\%\) in Fig. 10c. It should be noted, however, that a larger age difference does not necessarily imply a larger accuracy: the larger accuracy in Fig. 10c corresponds to an age difference smaller than that in Fig. 10d.

Fig. 10 Average accuracy and standard deviation (\(\langle a\rangle \pm \delta a\)) of the RQA indices obtained by the SVM from the comparison between two groups of NN interval time series: (a) full-term newborn (FNB) and premature newborn (PNB); (b) healthy young adult (HYA) and adult with severe coronary disease (SCD); (c) premature newborn and healthy young adult (red line), and full-term newborn and healthy young adult (blue line); (d) premature newborn and adult with severe coronary disease (blue line), and full-term newborn and adult with severe coronary disease (red line)

5 Conclusion

The present study was concerned with recurrence quantification analysis (RQA) of HRV time series for groups of individuals of different ages. RQA proved to be a powerful discriminatory tool for detecting the degree of determinism of the systems examined. Among the four groups studied, all the RQA measures (Figs. 7 and 8) were lowest in the healthy young adults (HYA). Low TT, LAM, and Vmax, for instance, imply high complexity in the system's dynamics. This result is in line with the concept that high complexity is a general feature of healthy dynamics compared to pathological conditions.

We also verified that RQA measures were able to differentiate groups, with the results indicating that better discrimination is generally achieved the larger the age difference is. It was noted in Fig. 10c, d, however, that a larger age difference does not necessarily imply a higher discriminatory accuracy. The closeness found in the comparison of the SCD group with the newborns (PNB and FNB) and the higher degree of dissimilarity between the HYA group and the newborns reflect the fact that the comparisons were quantified in terms of HRV, which is age dependent. This result shows that the HRV decreases with age, as described in [3, 4].

Given that HRV time series reflect the complex interactions of different control loops of the cardiovascular control system, the results obtained here provide important information on the autonomic control of circulation in normal and diseased conditions. In addition, the approach discussed here permits an automatic analysis of a large number of time series, thus making the method useful in clinical settings and in epidemiological studies for analyzing HRV series or other biomedical signals.