1 Introduction

Heart rate variability (HRV) is the variation in the time interval between successive R-peaks of an electrocardiogram (ECG) signal. The study of HRV is useful in the diagnosis and prognosis of various physiological and pathophysiological conditions [1,2,3,4]. HRV results from the dynamic interactions between several feedback loops that regulate the cardiovascular system and operate at variable rates. This leads to dynamic complexity in the HRV that is altered under different physiological and pathophysiological conditions [5]. It has been established that HRV is altered by several factors such as respiratory sinus arrhythmia (RSA); the Valsalva maneuver; decreases in venous return, the baroreflex, and the vasovagal reaction; exercise; thermo-regulation; embolisms; intra-venous (IV) injections; circadian rhythms; inter-patient factors like genetics and family history, sex, age, medical condition, and level of fitness; emotion; stress; sleep; body posture; smoking; caffeine; and humoral factors [1,2,3,4,5]. The genesis of HRV is a highly interdependent and complex phenomenon that involves interactions among the parasympathetic and sympathetic branches of the autonomic nervous system (ANS) along with inputs from the hemodynamic, electrophysiological, and humoral systems [6]. It has been established that measurement and evaluation of cardiovascular complexity from HRV provide useful prognostic indicators [2, 5, 7]. The complexity of beat-to-beat HRV varies with different physiological situations including disease [8], pharmaceutical interventions [9, 10], and postural changes [11, 12]. Several linear and non-linear methods have been employed in the past to assess the dynamic properties of this physiological time series [13,14,15]. Linear methods based on the time domain [1], frequency domain [16], or time-frequency domain [13, 14, 17] were initially used for the analysis of HRV.
Although these methods were able to capture the steady-state relation between the parasympathetic and sympathetic branches of the ANS that gives rise to the HRV pattern, they were not able to quantify the dynamic behavior of HRV that involves non-linear components of signal generation. Later, non-linear methods like the Poincaré plot [18] and entropy measures like approximate entropy (ApEn), sample entropy (SampEn) [19], and transfer entropy (TE) [11] were used to characterize the complexity of the physiological time series. Lake [20] discovered Gaussianity of HR, which is a measure of physiological complexity, using Shannon or differential and conditional Renyi entropy rate. Other non-linear methods like phase synchronization, fractal dimension [21], de-trended fluctuation analysis (DFA) [22], and recurrence quantification analysis (RQA) [23] were also used to give deeper insight into the dynamic interactions of HRV.

Beckers [24] concluded that non-linear heart rate fluctuations decline with age due to decreased autonomic modulation. This provided evidence for the involvement of the autonomic nervous system in the generation of the complex fluctuations of HRV. Iyengar et al. [25] showed that young subjects have stronger stability among the many physiological inputs that operate over different time scales to regulate cardiac cycle times. In contrast, elderly subjects exhibit crossover behavior due to degradation of some of these inputs and dominance of others. Y. Shiogai et al. [7] confirmed that the SDNN decreases significantly with age irrespective of gender. Also, the total energy of HRV decreases with age, as the influence of respiratory activity and myogenic activity decreases while the neurogenic control of HR becomes more prominent with increasing age. Kampouraki et al. [26] extracted various statistical and wavelet features and utilized SVM for the successful classification of HRV on age-stratified data.

From these studies, it can be inferred that it is very important to take age into consideration for the HRV indices to produce an accurate interpretation in a clinical condition.

The reported studies utilized established techniques for the quantification of HRV. This work focuses on providing new insights into the quantification of HRV indices of healthy elderly and young subjects by tuning the conventional techniques for optimum results. Time domain indices, descriptive statistics, complexity indices based on ApEn, and non-linear indices based on RQA are analyzed and combined to develop new indices to quantify HRV. Further, a classification method based on support vector machine (SVM) and multilayer perceptron neural network (MLPNN) is presented to classify the elderly and young subjects.

2 Materials and methods

2.1 Experimental data

Twenty young subjects aged 28 ± 8 years (mean ± SD) along with 20 elderly subjects aged 65 ± 5 years (mean ± SD) participated in the study. They abstained from any kind of prescribed medicine, alcohol, tobacco, and caffeine for 12 h prior to the recording. Recordings were made in a quiet and dark room. All subjects rested for 10 min before the start of recording. All subjects were healthy and reported taking no medication. A continuous ECG signal was recorded for each subject using the MP150 Biopac® System for a duration of 30 min at a sampling frequency of 250 Hz.

2.2 The Fantasia dataset

The Fantasia database from PhysioBank [25] is an age-stratified data repository for studying the effect of age on HRV. It consists of 40 subjects (20 young and 20 elderly) for whom 120-min ECG recordings were performed. The 20 young subjects are aged between 21 and 34 years, and the 20 elderly subjects between 68 and 85 years. Each group consists of healthy subjects and comprises equal numbers of men and women. While the ECG was recorded, all subjects remained in a resting supine position in sinus rhythm and watched the movie Fantasia (Disney 1940) to help maintain wakefulness. The continuous ECG was sampled at a frequency of 250 Hz.

2.3 Extraction of beat-to-beat HRV series

R-peak detection from the ECG was done by an algorithm based on Shannon entropy and Hilbert transform [27]. Ectopic beats, if present, were removed using zero-degree interpolation. From the identified R-peaks of the ECG, a time series of RR intervals (RRi), known as a tachogram, is formed. The RRi thus obtained is a function of the number of heartbeats rather than their time of occurrence.
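The conversion from detected R-peaks to an RRi series can be sketched as follows. This is a minimal illustration, assuming the R-peak locations are already available as sample indices; the function name `rr_intervals_ms` and the choice of milliseconds are assumptions made here, and neither the R-peak detector of [27] nor the ectopic beat handling is reproduced.

```python
def rr_intervals_ms(r_peak_samples, fs=250.0):
    """Convert R-peak sample indices into RR intervals in milliseconds.

    r_peak_samples: increasing sample indices of detected R-peaks (hypothetical input)
    fs: ECG sampling frequency in Hz (250 Hz in this study)
    """
    return [1000.0 * (later - earlier) / fs
            for earlier, later in zip(r_peak_samples, r_peak_samples[1:])]


# The series is indexed by beat number, not by time of occurrence.
rri = rr_intervals_ms([0, 200, 410, 605], fs=250.0)
```

At 250 Hz, each RR interval obtained this way is resolved in steps of one sampling period, i.e., 4 ms.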

2.4 Calculation of descriptive statistical features of HRV

In this paper, three descriptive statistical features are evaluated for the HRV. Mean RR is defined as the average value of the RRi. SDNN is the standard deviation of the RRi, and RMSSD is defined as the root mean square of successive differences of the RRi.
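The three features can be computed as in the following sketch. It assumes the RRi is given as a list of intervals in milliseconds and uses the sample standard deviation (N − 1 in the denominator) for SDNN; the function name `hrv_descriptive` and that denominator choice are assumptions, since the text does not specify them.

```python
import math


def hrv_descriptive(rr_ms):
    """Return (mean RR, SDNN, RMSSD) for a list of RR intervals in ms."""
    n = len(rr_ms)
    if n < 2:
        raise ValueError("need at least two RR intervals")
    mean_rr = sum(rr_ms) / n
    # SDNN: standard deviation of the RR intervals (sample SD is an assumption)
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))
    # RMSSD: root mean square of successive differences
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mean_rr, sdnn, rmssd


mean_rr, sdnn, rmssd = hrv_descriptive([800, 810, 790, 800])
```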

2.5 Calculation of approximate entropy

Approximate entropy (ApEn) is an entropy-based technique, governed by the parameters such as tolerance threshold (r), lag (τ), embedding dimension (m), and data length (N). All these inputs need to be fixed before the calculation of ApEn. This technique is used to quantify the similarity in any time series [19].

Given a time series, {d (i) : 1 ≤ i ≤ N}, template vectors \( {X}_1^m,{X}_2^m,{X}_3^m\cdots, {X}_{N-m+1}^m \) are formed where:

$$ {X}_i^m=\left\{d(i),d\left(i+\tau \right),\dots, d\left(i+\left(m-1\right)\times \tau \right)\right\} $$
(1)

for i = 1, 2, …, N − m + 1. The conditional measure (R), which indicates whether the distance between two such vectors is within the threshold (r), is given by:

$$ {R}_{\mathrm{ij}}=\theta \left(r-\left\Vert {X}_i^m-{X}_j^m\right\Vert \right) $$
(2)

where ‖.‖ is the maximum norm distance between the two vectors \( {X}_i^m \) and \( {X}_j^m \), and θ (.) is the Heaviside function. The conditional probability, \( {C}_i^m(r) \), defined as the fraction of such vectors \( {X}_j^m \) within r of \( {X}_i^m \), is hence given by

$$ {C}_i^m(r)=\frac{\sum_{j=1}^{N-m+1}{R}_{ij}}{N-m+1} $$
(3)

where j ranges from 1 to N − m + 1.

ApEn is computed using the conditional probabilities for m and m + 1 embedding dimension, given by

$$ \mathrm{ApEn}\left(m,r\right)=\frac{\sum_{i=1}^{N-m+1}\ln {C}_i^m(r)}{N-m+1}-\frac{\sum_{i=1}^{N-m}\ln {C}_i^{m+1}(r)}{N-m} $$
(4)

ApEn is a biased technique because self-matching templates are included when the conditional probabilities are calculated. In order to reduce this bias, self-matches are excluded and any resulting undefined conditional probability \( {C}_i^m(r) \) is set to 0.5 as a correction strategy [28]. This strategy ensures that the bias can be reduced even for small data sets.
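The computation in Eqs. 1 to 4 can be illustrated with the following sketch. It implements the standard estimator in which self-matches are counted, so the correction strategy of [28] is not reproduced here. The function name `approximate_entropy`, the convention that a distance exactly equal to r counts as a match, and the use of an absolute (not SD-scaled) r are assumptions made for illustration.

```python
import math


def approximate_entropy(d, m=2, r=0.2, tau=1):
    """ApEn(m, r) of the series d, with self-matches counted (biased estimator)."""
    n = len(d)

    def phi(dim):
        # Number of template vectors of dimension `dim` with lag tau (Eq. 1)
        count = n - (dim - 1) * tau
        templates = [[d[i + k * tau] for k in range(dim)] for i in range(count)]
        total = 0.0
        for xi in templates:
            # R_ij = 1 when the maximum-norm distance is within r (Eq. 2)
            matches = sum(
                1 for xj in templates
                if max(abs(a - b) for a, b in zip(xi, xj)) <= r
            )
            # C_i^m(r) is the fraction of templates within r of X_i (Eq. 3)
            total += math.log(matches / count)
        return total / count

    # Difference of the averaged log-probabilities for m and m + 1 (Eq. 4)
    return phi(m) - phi(m + 1)


# A strictly alternating series is highly regular, so its ApEn is close to zero.
apen_regular = approximate_entropy([1.0, 2.0] * 20, m=2, r=0.1)
```

Because every template matches itself, the logarithm is always defined in this version; the price is the bias toward regularity discussed above.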

2.6 Recurrence quantification analysis

Recurrence quantification analysis (RQA) is a technique of analysis of non-linear data which quantifies the count and period of recurrences of a dynamic system given by its state-space trajectory. Quantification analysis of recurrence plots was first performed by Zbilut and Webber Jr. [29] and was complemented with new complexity measures by Marwan et al. [30].

The calculation of Rij, when both i and j range from 1 to N − m + 1, results in a two-dimensional binary M × M matrix, where M = N − m + 1. This two-dimensional matrix is called a recurrence plot (RP). Hence, the RP represents the recurrence at time j of a state that occurred at time i, shown as dots within a two-dimensional square matrix Rij, as in Fig. 1, where both axes are time axes with i and j representing time instants [29].

Fig. 1 RRi series of a typical subject and its corresponding recurrence plot (m = 2, τ = 1, r = 0.2 × SD of time series). Self-matches form diagonal line as shown

RQA of recurrence plots is done by measuring the various indices. The typical indices are recurrence rate (%REC), determinism (%DET), and laminarity (%LAM). Recurrence rate (%REC) is the density measure of the points of recurrence in the RP. It is calculated simply by counting the black dots in the RP.

$$ \%\mathrm{REC}=\frac{1}{M\times M}\sum \limits_{i,j=1}^M{R}_{\mathrm{ij}}\times 100 $$
(5)

Determinism (%DET) is developed to measure the deterministic nature of the signal. In an RP, the diagonal points represent the repeating dynamics of the signal. %DET is defined as the fraction of recurrence points that make diagonal line segments.

$$ \%\mathrm{DET}=\frac{\sum_{l={L}_{\mathrm{min}}}^Ml\,{P}_l(l)}{\sum_{i,j=1}^M{R}_{ij}}\times 100 $$
(6)

where Pl(l) gives the number of diagonal lines of length l, while Lmin is the minimum length of the diagonal lines that are considered.

%LAM is defined as the proportion of recurrence points that constitute vertical lines in the RP. Laminarity represents random dynamics in the signal.

$$ \%\mathrm{LAM}=\frac{\sum_{v={V}_{\mathrm{min}}}^Mv\,{P}_v(v)}{\sum_{i,j=1}^M{R}_{\mathrm{ij}}}\times 100 $$
(7)

where Pv(v) is the count of the vertical lines of length v and Vmin is the minimum length of the vertical lines that have been considered.
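Equations 5 to 7 can be sketched in code as follows. The sketch follows the equations literally, so the main diagonal (self-matches) is included in all counts; many RQA implementations exclude it, and the text does not state which convention was used. The names `recurrence_matrix`, `run_lengths`, and `rqa_indices` are hypothetical, and r is taken as an absolute threshold.

```python
def recurrence_matrix(d, m=2, tau=1, r=0.2):
    """Binary M x M matrix R_ij from Eq. 2, with M = N - (m - 1) * tau."""
    count = len(d) - (m - 1) * tau
    x = [[d[i + k * tau] for k in range(m)] for i in range(count)]
    return [[1 if max(abs(a - b) for a, b in zip(xi, xj)) <= r else 0
             for xj in x] for xi in x]


def run_lengths(seq):
    """Lengths of the consecutive runs of 1s in a 0/1 sequence."""
    runs, n = [], 0
    for v in seq:
        if v:
            n += 1
        elif n:
            runs.append(n)
            n = 0
    if n:
        runs.append(n)
    return runs


def rqa_indices(R, lmin=3, vmin=3):
    """Return (%REC, %DET, %LAM) for a recurrence matrix R."""
    M = len(R)
    total = sum(map(sum, R))  # never zero: the main diagonal always recurs
    # Points on diagonal line segments of length >= lmin (all 2M - 1 diagonals)
    diag_pts = 0
    for k in range(-(M - 1), M):
        diag = [R[i][i + k] for i in range(max(0, -k), min(M, M - k))]
        diag_pts += sum(l for l in run_lengths(diag) if l >= lmin)
    # Points on vertical line segments of length >= vmin
    vert_pts = 0
    for j in range(M):
        column = [R[i][j] for i in range(M)]
        vert_pts += sum(v for v in run_lengths(column) if v >= vmin)
    rec = 100.0 * total / (M * M)
    det = 100.0 * diag_pts / total
    lam = 100.0 * vert_pts / total
    return rec, det, lam


R = recurrence_matrix([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], m=2, tau=1, r=0.2)
rec, det, lam = rqa_indices(R)
```

For this monotonic ramp only the self-matches recur, so the matrix is the identity and no vertical structures appear.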

2.7 Multilayer perceptron neural network

For short-term datasets, a multilayer perceptron neural network (MLPNN) classifier is employed; it is a commonly used feed-forward neural network-based classifier that is simple to implement and fast in operation [31]. The MLPNN consists of three layers in series: the input layer, the hidden or concealed layer, and the output layer. The objective of the hidden layer is to receive information from the input layer, process it, and forward it to the output layer. For the MLPNN, the number of neurons in the hidden layer is critical, as too few or too many neurons can cause fitting problems such as over-fitting [31]. The number of neurons in the hidden layer is not determined analytically; rather, it is chosen by trial and error [31,32,33,34]. For this study, we used an MLPNN model with a single hidden layer of five hidden neurons, as employed in some previous studies [31, 35].

The neurons which are in the middle layer multiply the input Xi with their connection weights Wij and sum them up as per the following equation (Fig. 2).

$$ {Y}_{\mathrm{j}}=\varnothing \left(\sum {X}_{\mathrm{i}}{W}_{\mathrm{i}\mathrm{j}}\right) $$
(8)

Fig. 2 The structure of the MLPNN model

Here, ∅ is an activation function which can be threshold function, sigmoidal function, or hyperbolic tangent function [31]. In this study, a hyperbolic tangent function has been employed as the activation function [31]. In MLPNN, each weight Wij is adjusted iteratively so as to reduce the error (E) between the actual response Yj and desired response Ydj. E is defined as

$$ E=\frac{1}{2}{\left({Y}_{\mathrm{dj}}-{Y}_{\mathrm{j}}\right)}^2 $$
(9)

For adjusting the weights and minimizing the error, many training algorithms have been employed and out of these, a commonly used one is backpropagation (BP) training algorithm. In this paper, backpropagation supported by the Levenberg–Marquardt (LM) algorithm [31,32,33,34] has been employed to address the problem of slow convergence of conventional BP algorithm.
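As a small illustration of Eqs. 8 and 9, the following sketch evaluates one layer with a hyperbolic tangent activation and the squared error for a single output. It does not implement backpropagation or the Levenberg–Marquardt update; the function names and the weight values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def layer_output(x, W):
    """Y_j = tanh(sum_i X_i * W_ij), where W[i][j] links input i to neuron j (Eq. 8)."""
    n_out = len(W[0])
    return [math.tanh(sum(x[i] * W[i][j] for i in range(len(x))))
            for j in range(n_out)]


def squared_error(y_desired, y_actual):
    """E = 0.5 * (Y_dj - Y_j)^2 for one output neuron (Eq. 9)."""
    return 0.5 * (y_desired - y_actual) ** 2


# Two inputs feeding two neurons; weights are illustrative, not trained values.
outputs = layer_output([1.0, 2.0], [[0.5, 1.0], [-0.25, 0.0]])
error = squared_error(1.0, outputs[0])
```

In training, the weights would be adjusted iteratively to reduce this error over all training examples.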

2.8 Support vector machine

Support vector machine (SVM) is a machine learning algorithm that is used for classification and regression purposes [36]. It is a form of supervised learning that is based on statistical learning theory. SVM is based on the idea of finding a hyper plane that discriminates the data into distinct classes, where the data is projected into a higher dimensional feature space. SVM distinguishes the data by maximizing the margin and minimizing the class error ratio [37]. SVM has many reliable properties for learning and provides good experimental results, so it finds applications in various fields [26, 36, 38, 39].

Figure 3 presents the basic idea of SVM. The data points are classified as positive or negative by finding a hyper plane that separates the data points by the maximum margin.

Fig. 3 SVM classification

For further explanation, suppose x is a vector which denotes a pattern to be classified and d denotes its class (d ∈ {± 1}). Also let ({xi, di}, i = 1, 2, …, k) denote a set of k training examples. In SVM, the challenge lies in the creation of a decision variable, f(x), that correctly categorizes the data into two classes. For linear SVM classifiers, the decision variable is given as [26]

$$ f(x)={W}^Tx+b $$
(10)

such that f(xi) > 0 for di = +1 and f(xi) < 0 for di = −1, i.e., dif(xi) > 0 for every correctly classified example, where W is the vector of weights and b defines the bias that forms the hyper plane, f(x) = 0. In SVM, the optimal hyper plane with maximum class separation can be found by minimizing the following cost function [26]

$$ J(W)=\frac{1}{2}{W}^TW=\frac{1}{2}{\left\Vert W\right\Vert}^2 $$
(11)

subject to the constraints of separation

$$ {d}_{\mathrm{i}}\left({W}^T{x}_{\mathrm{i}}+b\right)\ge 1\kern0.5em \mathrm{for}\kern0.5em i=1,2,\dots, k $$
(12)

Solution to (11) is given by

$$ W=\sum \limits_{i=1}^k\left({\alpha}_{\mathrm{i}}{d}_{\mathrm{i}}{x}_{\mathrm{i}}\right) $$
(13)

Hence, the final decision variable can be obtained by

$$ f(x)=\operatorname{sign}\left(\sum \limits_{i=1}^k{\alpha}_{\mathrm{i}}{d}_{\mathrm{i}}{x}_{\mathrm{i}}^Tx+b\right) $$
(14)

where xi is the training vector, x is the classification vector, and αi are the Lagrange multipliers for enhancing separation.

For the classes which are not linearly separable, kernel function k(x, xi) is used, which facilitates the classification using linear hyper plane. The final decision function given in (14) is adapted to

$$ f(x)=\operatorname{sign}\left(\sum \limits_{i=1}^k{\alpha}_{\mathrm{i}}{d}_{\mathrm{i}}\mathrm{k}\left(x,{x}_{\mathrm{i}}\right)+b\right) $$
(15)

The kernel function of the SVM can be linear, polynomial, Gaussian, radial, etc. In this paper, SVMs with various kernels, namely linear SVM, cubic SVM, quadratic SVM, fine Gaussian SVM, medium Gaussian SVM, and coarse Gaussian SVM, were tested for the classification, and it was found that the SVM with quadratic kernel provides the highest accuracy [23].
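The kernel decision rule can be illustrated with the following sketch, in which the bias b is added once to the weighted kernel sum. The quadratic kernel is assumed here to have the form (x·xi + c)² with c = 1, which the text does not specify. The support vectors, multipliers, and bias below are hypothetical values rather than the output of a trained model, and an argument of exactly zero is assigned to the positive class by convention.

```python
def quadratic_kernel(x, xi, c=1.0):
    """Second-degree polynomial kernel (x . xi + c)^2; the form is an assumption."""
    return (sum(a * b for a, b in zip(x, xi)) + c) ** 2


def svm_decision(x, support_vectors, labels, alphas, b, kernel=quadratic_kernel):
    """sign(sum_i alpha_i * d_i * k(x, x_i) + b), returning +1 or -1."""
    s = sum(a * d * kernel(x, xi)
            for a, d, xi in zip(alphas, labels, support_vectors)) + b
    return 1 if s >= 0 else -1


# Hypothetical one-dimensional example with one support vector per class.
support_vectors = [[1.0], [-1.0]]
labels = [+1, -1]
alphas = [0.5, 0.5]
predicted = svm_decision([1.0], support_vectors, labels, alphas, b=0.0)
```

Swapping the `kernel` argument is enough to compare kernel families, which mirrors how the different SVM variants were screened in this study.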

2.9 Statistical analysis

A comparison among the different indices for the elderly and young subjects is performed by first testing for normal distribution and homogeneity of variance. If both tests are passed, an independent samples t test is applied; otherwise, the Wilcoxon rank test is used. A significance level of 0.05 is adopted.

2.10 Selection of optimum threshold (ropt)

For the calculation of the various indices of RQA and ApEn, the selection of the threshold “r” is very significant. For RQA, researchers have used an empirically defined “r” of 0.20–0.25 times the standard deviation of the signal [23, 40]. For calculating ApEn in the case of slow dynamic signals, researchers have prescribed “r” within 0.01–0.2 times the standard deviation of the signal [41]. Further, Chon et al. [42] illustrated that, instead of strictly following the range recommendation, the selected “r” should correspond to ApEnmax. This choice eliminates the problem of underestimating ApEn due to a lower tolerance as well as the intrusion of self-matches in ApEn calculations due to a higher tolerance threshold. The selected “r” corresponding to ApEnmax is the tipping point where self-matches begin to dominate other matches. Hence, it is the most appropriate measure to quantify the complexity of any time series. The corresponding “r” is considered the optimum tolerance threshold value, i.e., ropt [41].

In this paper, the ropt that corresponds to ApEnmax is used for the calculation of RQA indices such as %REC, %DET, and %LAM. The corresponding calculations of ApEn and RQA have been made by selecting a low embedding dimension, m = 2, and τ = 1 [43]. Figure 4 shows the variation of ApEn with variation in “r” with a step size of 0.01. Corresponding to ApEnmax, \( {r}_{\mathrm{opt}}^{\mathrm{min}} \) and \( {r}_{\mathrm{opt}}^{\mathrm{max}} \) define the range of the values for the choice of ropt.
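The scan over “r” can be sketched as follows. The sketch assumes that “r” is expressed as a factor of the sample SD of the series, scanned from 0.01 to 0.70 in steps of 0.01, and that ApEn is computed with self-matches counted; the function names `apen` and `apen_curve` are hypothetical, and the series used at the end is synthetic rather than real RRi data.

```python
import math
import statistics


def apen(d, m, r, tau=1):
    """Standard ApEn(m, r) with self-matches counted; r is an absolute threshold."""
    n = len(d)

    def phi(dim):
        count = n - (dim - 1) * tau
        x = [[d[i + k * tau] for k in range(dim)] for i in range(count)]
        acc = 0.0
        for xi in x:
            c = sum(1 for xj in x
                    if max(abs(a - b) for a, b in zip(xi, xj)) <= r)
            acc += math.log(c / count)
        return acc / count

    return phi(m) - phi(m + 1)


def apen_curve(d, m=2, tau=1, r_start=0.01, r_stop=0.70, r_step=0.01):
    """ApEn evaluated at r = factor * SD for each factor in the scan range."""
    sd = statistics.stdev(d)  # sample SD; the choice of estimator is an assumption
    steps = int(round((r_stop - r_start) / r_step)) + 1
    factors = [round(r_start + k * r_step, 10) for k in range(steps)]
    values = [apen(d, m, f * sd, tau) for f in factors]
    return factors, values


# Synthetic, quantized series standing in for a short RRi segment.
series = [800 + 4 * ((i * 37) % 11) for i in range(40)]
factors, values = apen_curve(series)
apen_max = max(values)
```

In pure Python the cost grows with the square of the data length for every value of “r”, so a vectorized implementation would be preferable for N = 300 and many segments.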

Fig. 4 ApEn and %REC values over the range of “r” varying from 0 to 0.7 in steps of 0.01 for an RRi. \( {r}_{\mathrm{opt}}^{\mathrm{min}} \) and \( {r}_{\mathrm{opt}}^{\mathrm{max}} \) correspond to the maximum value of ApEn, depicted as ApEnmax. The corresponding calculation is made by choosing m as 2 and τ as 1

RRi is a time series with finite resolution that is acquired from an ECG signal sampled at a finite sampling frequency; therefore, sampling and quantization errors in the discrete RRi may lead to erroneous ApEn and %REC calculations [44]. This is reflected in the outcomes depicted in Figs. 4 and 5, where the relationship between the number of neighborhood points and the radius of the neighborhood appears as a stepped line in the variation of ApEn and %REC. These steps lead to different values of ropt, i.e., ranging between \( {r}_{\mathrm{opt}}^{\mathrm{min}} \) and \( {r}_{\mathrm{opt}}^{\mathrm{max}} \). However, for infinite resolution, this stepped response would be replaced by a smooth line, and \( {r}_{\mathrm{opt}}^{\mathrm{min}} \) and \( {r}_{\mathrm{opt}}^{\mathrm{max}} \) would coincide at a single value of ropt.

Fig. 5 Selection of ropt for the elderly subject (f2o02) and young subject (f2y02) and random noise (RN), N = 300

Figure 5 shows the behavior of ropt for a typical elderly subject, a typical young subject, and a random noise (RN) series for a data length (N) of 300. Due to the higher resolution of the RN series, ropt corresponds to a single value, while for the RRi series, ropt is spread over a range of values. It is also seen that for the elderly subject, the minimum value of ropt, i.e., \( {r}_{\mathrm{opt}}^{\mathrm{min}} \), is lower than that of the young counterpart. On the other hand, \( {r}_{\mathrm{opt}}^{\mathrm{max}} \) of the elderly subject is greater than that of the young subject, which results in a greater difference between \( {r}_{\mathrm{opt}}^{\mathrm{min}} \) and \( {r}_{\mathrm{opt}}^{\mathrm{max}} \) for the elderly subject than for the young one.

2.11 Selection of appropriate RR data length (N)

Calculating ApEn requires comparison of the various data templates derived from a larger dataset having length N. Hence, ApEnmax and ropt are critically dependent on N. In this work, N is carefully chosen so that ropt corresponding to ApEnmax lies within the suggested range between 0.1 and 0.25 times SD of RRi [45].

To ascertain the optimum data length to be chosen for ropt, 50 realizations of random noise (RN) of N varying from 50 to 700 in steps of 50 are simulated and the same is performed on real RRi for young and elderly subjects. Figure 6 shows the variation of ropt with respect to N for RN series. It is observed that ropt decreases with increase in N and becomes nearly constant for N ≥ 300 for RN series.

Fig. 6 Variation of mean of “ropt” with data length (N). “ropt” value is mean over 50 random noise (RN) series

Figures 7 and 8 show the variation of ropt with respect to N for the RRi obtained from the young and the elderly subjects. It is seen that the minimum value of N for which ropt (\( {r}_{\mathrm{opt}}^{\mathrm{min}} \)) is within the recommended range is 300 for the elderly as well as the young subjects. In this case also, ropt remains almost constant for N ≥ 300 and remains within the suggested range for RRi.

Fig. 7 Variation of mean of lower value of “ropt” (\( {r}_{\mathrm{opt}}^{\mathrm{min}} \)) with data length N for RRi

Fig. 8 Variation of mean of upper value of “ropt” (\( {r}_{\mathrm{opt}}^{\mathrm{max}} \)) with data length N for RRi

From Figs. 7 and 8, it is also observed that the mean value of \( {r}_{\mathrm{opt}}^{\mathrm{min}} \) remains lower for the elderly subjects irrespective of data length, while the mean value of \( {r}_{\mathrm{opt}}^{\mathrm{max}} \) always remains higher.

2.12 Calculation of radius differential (RD)

It has been established that due to the finite sampling frequency of ECG, the return map of RRi as shown in Fig. 9 has points lying at least one sampling period (TS) apart [44]. This separation increases further if the RRi time series has lower complexity, due to the reduced probability of finding a point at minimum resolution. Based on this, an index, namely radius differential (RD), is proposed for the assessment of complexity. RD is defined as the range of values of ropt that corresponds to the same value of ApEn, i.e., ApEnmax.

$$ {R}_{\mathrm{D}}=\left({r}_{\mathrm{opt}}^{\mathrm{max}}-{r}_{\mathrm{opt}}^{\mathrm{min}}\right) $$
(16)

Fig. 9 Poincaré plot of an RRi, inner and outer circles are drawn with threshold (radii) \( {r}_{\mathrm{opt}}^{\mathrm{min}} \) and \( {r}_{\mathrm{opt}}^{\mathrm{max}} \) respectively. Both these circles have encircled the same number of points

Hence, RD is the amount of uncertainty involved in calculating ropt within the plateau range \( \left({r}_{\mathrm{opt}}^{\mathrm{min}},{r}_{\mathrm{opt}}^{\mathrm{max}}\right) \).
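Given an ApEn curve sampled over a grid of “r” values, the plateau limits and RD of Eq. 16 can be extracted as in the following sketch. The function name `radius_differential` and the tolerance `tol` used to decide that two ApEn values are the same are assumptions; the curve in the example is made up for illustration and is not taken from the recorded data.

```python
def radius_differential(r_values, apen_values, tol=1e-9):
    """Return (r_opt_min, r_opt_max, R_D) for the plateau at the ApEn maximum.

    r_values and apen_values are parallel lists; tol is a hypothetical
    tolerance for treating neighbouring ApEn values as equal to the maximum.
    """
    peak = max(apen_values)
    lo = hi = apen_values.index(peak)
    # Extend the plateau over contiguous neighbours that share the maximum
    while lo > 0 and abs(apen_values[lo - 1] - peak) <= tol:
        lo -= 1
    while hi < len(apen_values) - 1 and abs(apen_values[hi + 1] - peak) <= tol:
        hi += 1
    return r_values[lo], r_values[hi], r_values[hi] - r_values[lo]


# Made-up stepped curve with a three-point plateau at the maximum.
r_min, r_max, rd = radius_differential(
    [0.1, 0.2, 0.3, 0.4, 0.5], [0.5, 0.9, 0.9, 0.9, 0.7]
)
```

A curve with a single sharp maximum, as expected for a high-resolution series, gives RD = 0 with this definition.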

Figure 9 shows the Poincaré plot of RRi where threshold \( {r}_{\mathrm{opt}}^{\mathrm{min}} \) and \( {r}_{\mathrm{opt}}^{\mathrm{max}} \) are represented by two concentric circles. From Fig. 9, it is assumed that the inner and outer circles encompass the same number of points, and hence they correspond to the same value of ApEn. Hence, RD can be taken as the radial difference between these two concentric circles.

Figure 10 shows the Poincaré plot of an elderly and a young subject from the Fantasia database. The distribution of points is dense for the young subject, which results in a lower value of RD. It is also worth mentioning here that the minimum value of RD is limited by the sampling interval (TS). Hence, the changes in HRV with age, captured by RD, are apparent from Figs. 5, 9, and 10.

Fig. 10 Poincaré plot of an elderly and a young subject from the Fantasia database

3 Results

3.1 Descriptive statistics (mean ± SD)

The results presented in Table 1 are calculated by randomly extracting 1040 data segments of RRi with a preset length of 300 from the recorded dataset and the standard Fantasia database of the elderly and young subjects. It is observed that the mean SDNN is greater for the young subjects than for the elderly subjects. The heart rate, represented by the reciprocal of mean RR, is higher in the young subjects than in the elderly subjects. These results were found significant with a p value less than 0.05. Figure 11 also supports these results.

Table 1 Mean and SD of various indices of the young and the elderly subjects calculated for 1040 data segments with N = 300, extracted from the recorded and Fantasia datasets

Fig. 11 Indices computed for RRi time series of the elderly and young subjects from the Fantasia database [43]

3.2 ApEn results

In this paper, for the calculation of ApEn- and RQA-related indices, m = 2, τ = 1, and N = 300 are used. The indices calculated from ApEn are tabulated next to the descriptive statistics indices in Table 1. ApEnmax, \( {r}_{\mathrm{opt}}^{\mathrm{min}} \), \( {r}_{\mathrm{opt}}^{\mathrm{max}} \), and RD are computed from the RRi time series of the elderly and young subjects. The mean and SD of each parameter along with the p value are reported. The mean value of ApEnmax is nearly the same for the elderly and young subjects, with only a marginal difference. The averaged \( {r}_{\mathrm{opt}}^{\mathrm{min}} \) is lower for the elderly subjects than for the young ones, while the mean value of \( {r}_{\mathrm{opt}}^{\mathrm{max}} \) is slightly higher for the elderly subjects. RD has a significantly higher mean value for the elderly subjects than for the young subjects. All indices are found significant (p < 0.05), except ApEnmax, which has a p value of 0.118.

3.3 RQA results

The indices calculated from RQA are presented next to the ApEn indices in Table 1. The mean and SD of %REC, %DET, and %LAM along with their respective p values are presented. In this paper, for the calculation of %DET and %LAM, the minimum line length is set to 3, as this reduces the influence of noise [43]. The mean of %REC is almost the same for the two classes, with a slight difference; however, the SD of %REC shows a substantial change. A similar trend is observed for %DET and %LAM. These indices are found significant (p < 0.05).

3.4 Effect of sampling frequency of ECG on ApEn indices

The values of ApEn indices \( {r}_{\mathrm{opt}}^{\mathrm{min}} \), \( {r}_{\mathrm{opt}}^{\mathrm{max}} \), and RD may be influenced by the resolution of RR intervals and hence depend upon sampling frequency of the ECG signal acquired. To investigate this, these indices are computed for the ECG signals sampled at 250 Hz, 500 Hz, and 1000 Hz for the elderly and young subjects respectively. To obtain high-resolution ECG signals, the already acquired ECG signals (sampled at 250 Hz) were interpolated to reflect a sampling frequency of 500 Hz and 1000 Hz using cubic spline interpolation [46]. The results are presented in Table 2.

Table 2 Effect of sampling frequency of ECG on ApEn indices

3.5 Correlation analysis of descriptive statistics, RQA, and ApEn

A correlation analysis between descriptive statistics, RQA, and ApEn indices is carried out for the young and elderly subjects using Pearson’s cross correlation coefficient and the results are tabulated in Tables 3, 4, and 5 respectively. Table 3 shows the Pearson cross correlation (CC) between descriptive statistics and ApEn indices for the elderly and young subjects. Table 4 shows the cross correlation between ApEn and RQA indices. Table 5 shows the cross correlation between RQA indices and descriptive statistics.

Table 3 Pearson cross correlation (CC) between descriptive statistics and ApEn indices for the elderly and young subjects
Table 4 Pearson cross correlation (CC) between ApEn and RQA indices for the elderly and young subjects
Table 5 Pearson cross correlation (CC) between descriptive statistics and RQA indices for the elderly and young subjects

3.6 Classification by MLPNN and SVM

In this work, the commonly used 10-fold cross-validation has been employed to classify the samples. The significant indices stated in Table 1, derived from 1040 (520 each) data segments for the elderly and young subjects, were used. To examine the effect of the selection of indices on the classification performance, the indices were grouped into three categories. Category-I comprises descriptive indices such as mean RR, SDNN, and RMSSD. Category-II comprises RQA indices such as %REC, %DET, and %LAM, while Category-III consists of ApEn indices such as \( {r}_{\mathrm{opt}}^{\mathrm{min}} \), \( {r}_{\mathrm{opt}}^{\mathrm{max}} \), and RD. For comparing the performance of the two classifiers, the following performance measures were employed:

$$ \mathrm{Recall}=\frac{\mathrm{True}\ \mathrm{Positive}\left(\mathrm{TP}\right)}{\mathrm{True}\ \mathrm{Positive}\left(\mathrm{TP}\right)+\mathrm{False}\ \mathrm{Negative}\ \left(\mathrm{FN}\right)} $$
(17)
$$ \mathrm{Precision}=\frac{\mathrm{True}\ \mathrm{Positive}\ \left(\mathrm{TP}\right)}{\mathrm{True}\ \mathrm{Positive}\ \left(\mathrm{TP}\right)+\mathrm{False}\ \mathrm{Positive}\left(\mathrm{FP}\right)} $$
(18)
$$ \%\mathrm{Accuracy}=\frac{\mathrm{No}\ \mathrm{of}\ \mathrm{Correct}\ \mathrm{decisions}}{\mathrm{Total}\ \mathrm{no}\ \mathrm{of}\ \mathrm{decisions}}\times 100 $$
(19)
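The three measures follow directly from the entries of a two-class confusion matrix, as in the following sketch. The function name `classification_metrics` and the counts in the example are hypothetical and do not correspond to the results of this study; accuracy is expressed as a percentage.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (recall, precision, % accuracy) from confusion-matrix counts."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    # Correct decisions are the true positives plus the true negatives
    accuracy = 100.0 * (tp + tn) / (tp + fp + tn + fn)
    return recall, precision, accuracy


# Hypothetical counts for 200 decisions.
recall, precision, accuracy = classification_metrics(tp=90, fp=5, tn=95, fn=10)
```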

Figure 12 presents the variation of the performance measures with the choice of features. With all the features as input to the classifier, SVM performs better than MLPNN, with a maximum accuracy of 99.7%, recall of 0.998, and precision of 0.996. When Category-I is considered along with Category-III, viz., leaving out the RQA features, the classification performance decreases significantly. Similarly, the combination of Category-II and Category-III produced an accuracy of 90%, recall of 0.896, and precision of 0.903 for SVM, while an accuracy of 85.1%, recall of 0.862, and precision of 0.864 was observed for MLPNN. Further, a significant drop in accuracy, recall, and precision was observed for both SVM and MLPNN when the ApEn-derived features were omitted, viz., when Category-I along with Category-II was employed. Moreover, to assess the individual classification performance of the three categories of feature vectors, each category was tested separately. Category-III, which contains the ApEn-derived features, gave the best results, i.e., an accuracy of 84.6% for SVM and 79.6% for MLPNN. The separate test of Category-I resulted in accuracies of 81.2% and 78% for SVM and MLPNN respectively. The RQA-derived features of Category-II were only able to discriminate the classes with an accuracy of 68.8% for SVM and 62.1% for MLPNN.

Fig. 12 Classification performance measures as a function of combination of features

4 Discussion

In the clinical setting, the use of HRV for predictive purposes must account for a number of physiological factors such as age and gender. HRV is known to decrease with the normal aging process. This is indicated by the linear and non-linear indices that reduce with age. This can be related to the concept of decreasing autonomic modulation with advancing age. Moreover, the reduction of the magnitude of heart period fluctuations and the decrease of complexity of the heart period dynamics are interpreted as a sign of the reduction of respiratory sinus arrhythmia and the increased activation of sympathetic control with age [4, 7, 24, 47, 48]. Therefore, it is important to take age into consideration for the HRV indices to produce an accurate interpretation in a clinical condition. The present work builds on the previous studies and uses complexity indices that correlate with the descriptive statistics. In earlier works [4, 12, 24, 47], it has been found that young and elderly subjects can be differentiated based on HRV statistics; however, the present work emphasizes the development of complexity-related indices that can significantly differentiate the two groups even for shorter data sets.

The autonomic nervous system (ANS) possesses a regulatory structure governed by non-linear processes and mechanisms controlled by the brain stem [7]. From Table 1, it is evident that the averaged mean RR of the elderly subjects is higher than that of the young subjects, implying that the heart rate of the young subjects is greater than that of the elderly subjects. Also, the data segments derived from the young subjects have a higher averaged SDNN than those of the elderly subjects, confirming greater HRV in the young subjects. Physiologically, this is due to the sluggishness that aging introduces in the control mechanisms governing HRV. This is further confirmed by the higher value of RMSSD in the young subjects. These descriptive statistics alone are not enough to capture the real non-linear characteristics of the ANS processes and controls. To address this issue, the present work utilizes features obtained from the non-linear methods of RQA and ApEn to quantify and classify the HRV of the young and the elderly subjects. The bias introduced in ApEn by the “not defined” conditional probability (CP) is addressed by setting the CP to 0.5 [28]. A unification approach utilizing the indices obtained from RQA and ApEn is presented to refine the RQA based on the optimum threshold values \( {r}_{\mathrm{opt}}^{\mathrm{min}} \) and \( {r}_{\mathrm{opt}}^{\mathrm{max}} \) and a newly proposed index RD.
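The descriptive statistics discussed above (mean RR, SDNN, and RMSSD) can be computed directly from a sequence of RR intervals. The following is a minimal sketch under stated assumptions: the function name `descriptive_hrv` is hypothetical, SDNN is taken as the sample standard deviation (denominator n - 1), and the four RR values are made up for illustration rather than drawn from the study data.

```python
import math


def descriptive_hrv(rr):
    """Return (mean RR, SDNN, RMSSD) for a list of RR intervals.

    All three outputs are in the same unit as the input (e.g., seconds).
    """
    n = len(rr)
    if n < 2:
        raise ValueError("need at least two RR intervals")
    mean_rr = sum(rr) / n
    # SDNN: standard deviation of the intervals (sample form, n - 1).
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr) / (n - 1))
    # RMSSD: root mean square of successive differences.
    diffs = [b - a for a, b in zip(rr, rr[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mean_rr, sdnn, rmssd


# Made-up RR intervals in seconds, for illustration only.
mean_rr, sdnn, rmssd = descriptive_hrv([0.80, 0.82, 0.78, 0.80])
print(mean_rr, sdnn, rmssd)
```

SDNN reflects the overall spread of the intervals, whereas RMSSD responds only to beat-to-beat changes; neither depends on the order of longer patterns in the series, which is why they say little about its complexity.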

Considering this, an effort has been made to extract these non-linear features, derived from the indices listed in Table 1. In this work, the traditional RQA method is fine-tuned by selecting an appropriate threshold based upon the maximum value of ApEn. From Table 1, it can be seen that there is no substantial difference in the value of ApEnmax between the elderly and young subjects, which is further confirmed by a high p value. On the contrary, \( {r}_{\mathrm{opt}}^{\mathrm{min}} \), \( {r}_{\mathrm{opt}}^{\mathrm{max}} \), and RD are found to be significant, having considerably different values for the elderly subjects and the young ones. This is also depicted in the box and whisker plot shown in Fig. 11.
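The idea of selecting a threshold from the maximum of ApEn can be illustrated by evaluating ApEn over a grid of radii and keeping the radius at which it peaks. The sketch below uses the textbook form of ApEn with self-matches counted; it does not reproduce the CP = 0.5 correction used in this work, nor the exact definitions of \( {r}_{\mathrm{opt}}^{\mathrm{min}} \), \( {r}_{\mathrm{opt}}^{\mathrm{max}} \), and RD. The function names, the absolute (unnormalized) radius, the radius grid, and the synthetic test signal are all assumptions made for illustration.

```python
import math


def _phi(x, m, r):
    """Average log-fraction of length-m templates within Chebyshev distance r."""
    n = len(x) - m + 1
    templates = [x[i:i + m] for i in range(n)]
    total = 0.0
    for a in templates:
        # Self-match is included, so count >= 1 and the log is always defined.
        count = sum(
            1 for b in templates
            if max(abs(p - q) for p, q in zip(a, b)) <= r
        )
        total += math.log(count / n)
    return total / n


def apen(x, m, r):
    """Textbook approximate entropy ApEn(m, r) of the sequence x."""
    return _phi(x, m, r) - _phi(x, m + 1, r)


def scan_radius(x, m, radii):
    """Return (radius, ApEn) at which ApEn is largest over the given grid."""
    values = [apen(x, m, r) for r in radii]
    best = max(range(len(radii)), key=values.__getitem__)
    return radii[best], values[best]


# Synthetic signal and radius grid, for illustration only.
signal = [math.sin(0.7 * i) + 0.3 * math.sin(2.3 * i) for i in range(60)]
r_best, apen_max = scan_radius(signal, 2, [0.05, 0.1, 0.2, 0.4, 0.8, 1.6])
print(r_best, apen_max)
```

Because self-matches are counted, ApEn tends towards zero both for very small radii (only self-matches remain) and for very large radii (everything matches), so a maximum at an intermediate radius is expected.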

It is observed from Fig. 7 that, irrespective of the data length N, \( {r}_{\mathrm{opt}}^{\mathrm{min}} \) has a lower mean value for the elderly subjects than for the young subjects. This is because, for the elderly subjects, self-matches overpower the other matches more rapidly than for the young subjects, owing to the lesser variability in the heart rate of the elderly subjects.

From the results shown in Table 2, it can be seen that \( {r}_{\mathrm{opt}}^{\mathrm{min}} \), \( {r}_{\mathrm{opt}}^{\mathrm{max}} \), and RD are influenced by the sampling frequency of the ECG. However, the indices calculated for a relative study of elderly and young subjects under similar data acquisition techniques still provide significant distinguishing features. The lowest value of RD, i.e., the mean minus the SD, is 0.04 for young subjects for the ECG signal sampled at 250 Hz. This value is in fact the resolution of the ECG signal at this sampling rate. However, the value of RD for elderly subjects is higher than this lower limit of 0.04. Similar results are obtained for ECG sampled at 500 Hz and 1000 Hz. For a more generalized study, it is proposed to sample the ECG signal at a rate higher than 250 Hz to obtain good resolution for fast as well as slowly changing signals.

From Table 3, it is observed that RD is significantly correlated (p value < 0.05) with SDNN, with a CC value of −0.8927 and −0.8597 for the elderly and younger subjects respectively. Similarly, it is significantly correlated (p value < 0.05) with RMSSD, with a CC value of −0.6893 and −0.6632 for the elderly and younger subjects respectively. The drop in SDNN is due to the decrease in HRV with age. RD, being strongly correlated with SDNN and RMSSD, shows similar behavior. However, signal amplitude (the amount of RR variability as sized by SDNN) is not coincident with complexity. Linear measures like SDNN do not capture the dynamics involved in the genesis of HRV. Non-linear measures like RQA face the limitation of dimensionality. Complexity-based measures like ApEn are used to characterize these dynamics quantitatively. The utility of RD lies in unifying RQA and ApEn and in highlighting the uncertainty in calculating \( {r}_{\mathrm{opt}} \), hence providing an alternative index to measure the complexity of HRV. The radius indices \( {r}_{\mathrm{opt}}^{\mathrm{min}} \) and \( {r}_{\mathrm{opt}}^{\mathrm{max}} \) show moderate positive and negative correlation with SDNN and RMSSD respectively, which conforms to the results obtained in Fig. 5. From Figs. 5, 9, and 10, it is seen that the index RD, which signifies the uncertainty in calculating \( {r}_{\mathrm{opt}} \), shows a relative difference between the younger and elderly subjects. There is no significant correlation between mean RR and the ApEn indices, i.e., \( {r}_{\mathrm{opt}}^{\mathrm{min}} \) and \( {r}_{\mathrm{opt}}^{\mathrm{max}} \). This is because mean RR provides no information about the complexity of the signal. Table 4 shows the cross correlation between the ApEn and the RQA indices. A moderate correlation is observed between the %REC, %LAM, \( {r}_{\mathrm{opt}}^{\mathrm{min}} \), and \( {r}_{\mathrm{opt}}^{\mathrm{max}} \) indices. The cross correlation between the RQA indices and the descriptive statistics is tabulated in Table 5. A significantly lower correlation is observed between these two classes of indices. This is because complexity is a different phenomenon from the amount of variability as measured by the variance and the standard deviation of a signal. Though SDNN and RMSSD quantify the variations in a signal, they do not quantify the information about the recurrence of samples of a given signal as measured by the RQA indices, viz., %REC, %LAM, and %DET.
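The CC values discussed above are correlation coefficients between pairs of indices. Assuming that CC denotes the Pearson correlation coefficient, it can be sketched as follows; the function name `pearson_cc` and the paired values are hypothetical and serve only to illustrate the sign convention (a negative CC means one index falls as the other rises). The associated p value is not computed in this sketch.

```python
import math


def pearson_cc(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Made-up paired index values per segment, for illustration only.
rd_values = [0.10, 0.08, 0.06, 0.05]
sdnn_values = [0.02, 0.03, 0.05, 0.06]
print(pearson_cc(rd_values, sdnn_values))
```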

A classification accuracy of 81.2% is obtained by feeding the descriptive features (Category-I) to SVM, compared with 78% using the MLPNN classifier, as shown in Fig. 12a. Moreover, from Fig. 12b, c, the recall values are 0.70 and 0.684 and the precision values are 0.667 and 0.611 for SVM and MLPNN respectively. The classification indices for the quantification of the young vs. the elderly subjects can be enhanced if the non-linear characteristics of ANS control are captured using non-linear and information theory-based techniques such as RQA and ApEn respectively. A similar classification was done using the RQA indices (Category-II), and a reasonable accuracy was obtained.

A significant improvement in the classification accuracy is observed when the \( {r}_{\mathrm{opt}}^{\mathrm{min}} \), \( {r}_{\mathrm{opt}}^{\mathrm{max}} \), and RD features (Category-III) are used for the classification, with an accuracy of 84.6% and 79.6% for SVM and MLPNN respectively. The classification accuracy improved further when a combination of two categories of features was used. The best of these results were obtained when the descriptive indices (Category-I) were used along with the ApEn features (Category-III), with an accuracy of 92.8% and 87.2% using SVM and MLPNN respectively.

Lastly, all the features were combined for the classification of the data. A classification accuracy of 99.7% was achieved using SVM, with recall and precision values of 0.998 and 0.996 respectively. Using MLPNN resulted in an accuracy of 96.6%, with a recall value of 0.971 and a precision value of 0.962.

The results of this work are in line with the previous studies reported [7, 24,25,26, 39, 47, 49]. The decrease in the value of RD indicates the decreased complexity of the HRV series with advancing age, as concluded by Y. Shiogai et al. [7], Iyengar et al. [25], Voss et al. [49], and many other researchers. The optimized data length N of 300 for the quantification of HRV by ApEn and RQA confirms the choice of length of the RR time series for HRV analysis as reported in [13, 50, 51].

5 Conclusion

Non-linear physiological control mechanisms associated with the ANS are captured by adding the indices obtained from non-linear and information-based methods to the traditional descriptive statistics used for the quantification of HRV. Enhanced classification accuracy is observed when the combination of these indices is used to segregate the young from the elderly subjects. Compared with the classification performed using the descriptive statistics alone, the addition of indices like %REC, %DET, %LAM, \( {r}_{\mathrm{opt}}^{\mathrm{min}} \), \( {r}_{\mathrm{opt}}^{\mathrm{max}} \), and the newly defined RD yields a significant improvement in the quantification and classification of HRV. Feature classification methods can be employed in the future to optimize the choice of extracted features to successfully classify the HRV.