1 Introduction

In the modern digital era, a unique and accurate identity is an essential need of society. Traditional recognition strategies, including PINs, tokens, passwords and ID cards, raise serious security concerns about identity theft. The major benefit of biometric security systems is that they rely on an individual's physiological or behavioral characteristics. One flaw of commonly used biometrics, however, is the ease with which credentials can be falsified. For example, a face can be counterfeited with a photo, an iris can be falsified with contact lenses, and even a fingerprint can be fooled with a gel or latex finger.

To overcome these issues of conventional biometrics, bioelectric signals are one of the better choices. They are specific to an individual and are therefore harder to mimic, which makes them highly secure against imitation. The electrocardiogram (ECG) is a well-known bioelectric signal used to monitor the health of an individual's heart. An ECG records changes in the electric potential of cardiac cells and possesses unique characteristics. It records the electrophysiological pattern of depolarization and repolarization during each heartbeat, as shown in Fig. 1. Studies show that the ECG exhibits discriminatory patterns among individuals [5,6,7,8,9,10,11,12,13,14,15,16].

Fig. 1. ECG waveform

Conventional medicine has sought to universalize the ECG signal in order to produce a common diagnostic method applicable to most individuals [1], but the uniqueness of the ECG among individuals is an advantage in biometrics as well as a challenge in medicine [17]. Several studies have demonstrated that ECG-based recognition is a robust biometric method. To ascertain whether individuals can be identified using the ECG, Biel et al. extracted features from the P, QRS and T waveforms and evaluated the feasibility of the ECG signal for human recognition [5]. They performed multivariate analysis for classification and achieved a 100% recognition rate. Israel et al. demonstrated the Wilks' Lambda technique for feature selection and linear discriminant analysis for classification [10]. This framework was tested on a database of 29 subjects and achieved a 100% recognition rate. Shen et al. presented one-lead ECG-based identity verification with seven fiducial-based features related to the QRS complex [6]. The identity verification rate was found to be 95% using template matching, 80% using a decision-based neural network and 100% when combining the two methods on a group of 20 people. Singh and Gupta proposed P and T wave delineators along with the QRS complex to extract different features from the dominant fiducials of the electrocardiogram on each heartbeat [16]. Their system was tested on 50 subjects and achieved a classification accuracy of 98%.

In this paper, a robust and efficient method of ECG biometric recognition is proposed. To denoise the ECG signal, an FIR equiripple highpass filter removes baseline noise and an FIR equiripple lowpass filter removes power line interference noise. The Haar wavelet transform is used for accurate detection of the R peaks (\(R_{peak}\)). All other dominant features of the ECG waveform are detected with respect to the R peaks by setting windows whose sizes depend on the duration and location of the corresponding wave. Interval, amplitude, angle and area features of the ECG signal are then extracted. The algorithm has been applied to 100 ECG signals of the PTB database from the PhysioNet bank and detects 39 features from every ECG signal. The PCA and kernel PCA reduction methods are then applied to these 39 features. Finally, the similarities between the components of the feature sets are calculated on the basis of Euclidean distance.

The rest of the paper is organized as follows. Section 2 presents the methodology of the ECG-based recognition system. The delineation techniques for the P and T waves are demonstrated together with a detailed description of the ECG data. The experimental results of the recognition system are presented in Sect. 3. Finally, conclusions are drawn in Sect. 4.

2 Methodology

The framework of the ECG biometric recognition system is shown in Fig. 2. The method is implemented in a series of steps: (1) ECG data preprocessing, which corrects the signal for noise artifacts; (2) data representation, which delineates the dominant waveforms and extracts the dominant features between the diagnostic points; and (3) recognition, which matches the test template with the templates stored in the database using a suitable technique.

2.1 ECG Data Preprocessing

An electrocardiogram exhibits the electric potential, that is, the electrical voltages generated in the heart, and can be characterized by its P, Q, R, S and T waves. When an ECG is recorded, it is contaminated by several kinds of noise. Artifacts such as baseline wander and power line interference may change the amplitude levels and time periods of the ECG waveform, respectively.

An equiripple highpass filter is capable of removing baseline wander noise without affecting the dominant fiducials of the ECG. The equiripple highpass filter uses a filter order of 2746, a cutoff frequency of 1 Hz, a stop frequency of 2 Hz and a stopband attenuation of 80 dB. Power line interference appears as a spike at 50 Hz in the frequency component analysis. This frequency component can be removed using a notch filter. An FIR equiripple lowpass filter is used with a filter order of 508 and a cutoff frequency of 40 Hz. This filter is followed by an IIR filter to achieve a sharp frequency notch and avoid phase distortion.
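The two filtering stages can be illustrated with a minimal sketch. It does not reproduce the equiripple design: a Hamming-windowed sinc design is swapped in because it needs no iterative optimizer, so the 80 dB stopband attenuation is not reached. Only the filter orders (508 and 2746, giving 509 and 2747 taps), the cutoff frequencies and the 1000 Hz sampling rate of the PTB recordings come from the text; all function names are hypothetical.

```python
import cmath
import math

def windowed_sinc_lowpass(num_taps, cutoff_hz, fs_hz):
    """Linear-phase low-pass FIR from a Hamming-windowed sinc (not equiripple)."""
    fc = cutoff_hz / fs_hz                  # cutoff in cycles per sample
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2.0 * fc if x == 0 else math.sin(2.0 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]        # normalize to unity gain at 0 Hz

def spectral_inversion(lowpass_taps):
    """Complementary high-pass of a unity-DC-gain low-pass with an odd tap count."""
    highpass = [-t for t in lowpass_taps]
    highpass[(len(highpass) - 1) // 2] += 1.0
    return highpass

def gain_at(taps, freq_hz, fs_hz):
    """Magnitude of the frequency response at one frequency."""
    w = 2.0 * math.pi * freq_hz / fs_hz
    return abs(sum(h * cmath.exp(-1j * w * n) for n, h in enumerate(taps)))

FS = 1000.0                                               # PTB sampling rate
lowpass = windowed_sinc_lowpass(509, 40.0, FS)            # order 508, 40 Hz cutoff
highpass = spectral_inversion(windowed_sinc_lowpass(2747, 1.0, FS))  # order 2746
```

Evaluating `gain_at` at 50 Hz for `lowpass` and at 0 Hz for `highpass` gives a quick check that power line interference and the baseline offset are suppressed.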

Fig. 2. ECG biometric recognition system [16].

Fig. 3. Detection of ECG waveforms, Ponset, P, Poffset, Q, R, S, Tonset, T and Toffset.

2.2 Data Representation

The ECG signal is now ready for feature extraction. In this stage, a systematic analysis of the ECG is done using different techniques. The Haar wavelet transform is used to extract the ECG features, since it gives promising performance in delineating the P-QRS-T wave fiducials.

Peak Detection. Using the Haar wavelet transform, R peaks are easily detected owing to the multiresolution analysis of the ECG signal. The P, Q, S and T waveforms are then detected with reference to the R peak location. The heart rate is calculated using the following formula:

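Number of heartbeats per minute = Number of R peaks/(Length of signal/(Frequency * 60)), where the length of the signal is in samples and the frequency is the sampling frequency in samples per second.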
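As a small worked example, the heart rate can be computed by reading the formula as the number of R peaks divided by the recording duration in minutes. This reading, and the function and parameter names, are assumptions made for illustration.

```python
def heart_rate_bpm(num_r_peaks, num_samples, fs_hz):
    """Beats per minute from an R-peak count and the recording length."""
    duration_min = num_samples / (fs_hz * 60.0)   # samples -> minutes
    return num_r_peaks / duration_min

# A 30 s window at 1000 samples per second containing 36 R peaks.
rate = heart_rate_bpm(36, 30000, 1000.0)
```

For this 30 s window the result is 72 beats per minute.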

Fig. 4. Interval features of ECG waveform [16].

The procedure to detect the peaks is as follows. To detect the \(P_{peak}\) location, a window of 160 samples is set, extending from 200 to 60 samples to the left of \(R_{peak}\). Within this window, \(P_{peak}\) is located at the sample that has the maximum amplitude value. Another window of 90 samples is set, extending from 100 to 10 samples to the left of the \(R_{peak}\) location; \(Q_{peak}\) is located where the minimum amplitude value is found within this window. For \(S_{peak}\), a window of 95 samples is set, extending from 5 to 100 samples to the right of the \(R_{peak}\) location; the minimum amplitude value within this window marks the \(S_{peak}\) location. The T waves are the farthest from \(R_{peak}\). \(T_{peak}\) is detected using a window 300 samples wide, which starts 100 samples to the right of \(R_{peak}\) and ends 400 samples away from it; \(T_{peak}\) is located at the maximum amplitude value within this window. A further window of 300 samples is set around \(T_{peak}\): the minimum amplitude value within 150 samples to the left of \(T_{peak}\) is the \(T_{onset}\) location, and the minimum amplitude value within 150 samples to the right of \(T_{peak}\) is the \(T_{offset}\) location. Thus all peaks are detected. Figure 3 shows the detected P, Q, R, S, T, \(P_{offset}\), \(P_{onset}\), \(T_{offset}\) and \(T_{onset}\) fiducials.
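The window search described above can be sketched as follows. The sketch assumes that the \(R_{peak}\) index is already known (the Haar wavelet detection is not shown), that the beat lies fully inside the recording, and that the window offsets are the sample offsets stated in the text. The function names and the dictionary keys are hypothetical.

```python
def arg_extreme(signal, start, stop, find_max):
    """Index of the largest (or smallest) sample in signal[start..stop], inclusive."""
    start = max(start, 0)
    stop = min(stop, len(signal) - 1)
    pick = max if find_max else min
    return pick(range(start, stop + 1), key=lambda i: signal[i])

def locate_fiducials(signal, r):
    """Locate P, Q, S, T peaks and T onset/offset relative to the R-peak index r."""
    p = arg_extreme(signal, r - 200, r - 60, True)     # maximum left of R
    q = arg_extreme(signal, r - 100, r - 10, False)    # minimum left of R
    s = arg_extreme(signal, r + 5, r + 100, False)     # minimum right of R
    t = arg_extreme(signal, r + 100, r + 400, True)    # maximum right of R
    t_on = arg_extreme(signal, t - 150, t - 1, False)  # minimum before T peak
    t_off = arg_extreme(signal, t + 1, t + 150, False) # minimum after T peak
    return {"p_peak": p, "q_peak": q, "r_peak": r, "s_peak": s,
            "t_peak": t, "t_onset": t_on, "t_offset": t_off}

# Synthetic beat: a flat baseline with one marked sample per fiducial.
beat = [0.0] * 1000
for index, value in [(380, 0.2), (480, -0.1), (500, 1.0), (530, -0.3),
                     (620, -0.05), (700, 0.4), (790, -0.05)]:
    beat[index] = value
fiducials = locate_fiducials(beat, 500)
```

On real recordings the same search would be repeated for every detected R peak, and beats too close to the start or end of the signal would need to be skipped.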

2.3 Feature Extraction

Once the ECG is delineated, the peaks and limits of the QRS complex, P wave and T wave are known. From these known fiducials, 39 features are extracted from each heartbeat, each belonging to one of the following classes:

Fig. 5. Amplitude features of ECG waveform [16].

Interval Features. The following features related to heartbeat intervals are computed. \(PR_{I}\) is the time interval between the \(P_{peak}\) and \(R_{peak}\) fiducials. \(PR_{S}\) is the time interval from the \(P_{offset}\) to the \(QRS_{onset}\) fiducial. QT is the time interval from the \(QRS_{onset}\) to the \(T_{offset}\) fiducial, corrected according to Bazett’s formula. \(ST_{S}\) is the time interval from the \(QRS_{offset}\) to the \(T_{onset}\) fiducial and \(ST_{I}\) is the time interval from the \(QRS_{offset}\) to the \(T_{offset}\) fiducial. Other interval features are computed relative to the \(R_{peak}\) fiducial. The time intervals from \(R_{peak}\) to the P wave fiducials \(P_{offset}\), \(P_{peak}\) and \(P_{onset}\) are defined as \(P_{offset}R\), PR and PonR, respectively. The time interval from \(R_{peak}\) to \(Q_{peak}\) is defined as QR and the time interval from \(R_{peak}\) to \(S_{peak}\) is defined as RS. Similarly, the time intervals from \(R_{peak}\) to the T wave fiducials \(T_{peak}\), \(T_{onset}\) and \(T_{offset}\) are defined as RT, \(RT_{onset}\) and \(RT_{offset}\), respectively.

The computed time interval features are shown in Fig. 4. In addition to these interval features within a beat, three interbeat interval features, RR, PP and TT, are also extracted. RR is defined as the time interval between two successive R peaks; PP and TT are defined similarly. The RR feature is also used to correct the QT interval for the effects of changes in heart rate [16].
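A few of these interval features can be sketched as below, including the heart rate correction of QT. Bazett's formula divides the QT interval by the square root of the RR interval, with both expressed in seconds. The function name, the dictionary keys and the example sample indices are hypothetical.

```python
import math

def interval_features(fid, fs_hz, rr_s):
    """Interval features in seconds from fiducial sample indices of one beat.

    fid maps fiducial names to sample indices; rr_s is the RR interval in seconds.
    """
    def seconds(start, end):
        return (fid[end] - fid[start]) / fs_hz

    qt = seconds("qrs_onset", "t_offset")
    return {
        "QT": qt,
        "QTc": qt / math.sqrt(rr_s),              # Bazett's correction
        "ST_S": seconds("qrs_offset", "t_onset"),
        "ST_I": seconds("qrs_offset", "t_offset"),
    }

# Illustrative indices at 1000 samples per second, RR interval of 0.64 s.
example = interval_features(
    {"qrs_onset": 1000, "qrs_offset": 1100, "t_onset": 1200, "t_offset": 1400},
    1000.0, 0.64)
```

With these values QT is 0.4 s and the corrected QT is 0.5 s, since the square root of 0.64 is 0.8.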

Amplitude Features. The following amplitude features are computed relative to the amplitude of the R peak. This class of features depends on the QRS complex, which is usually invariant to changes in heart rate. The QRa feature is defined as the difference in amplitude between the R and Q waves. The SRa feature is defined as the difference in amplitude between the R and S waves. Similarly, the differences in amplitude of the P wave and the T wave relative to the R wave are defined as PRa and TRa, respectively [16]. These amplitude features are shown in Fig. 5.

Fig. 6. Angle features of ECG waveform [16].

Angle Features. The following features related to the angular displacement between different peak fiducials of the P, Q, R, S and T waves are extracted from each heartbeat. The aim is to extract a class of features that is stable and robust to changes in heart rate. \(\angle Q\) is defined as the angular displacement between the directed lines joining \(Q_{peak}\) to \(P_{peak}\) and \(Q_{peak}\) to \(R_{peak}\) [43]. Using the cosine rule, \(\angle Q\) can be computed as follows:

$$\begin{aligned} \cos Q = \frac{{PQ}^2+{QR}^2-{PR}^2}{2*PQ*QR} \end{aligned}$$
(1)

\(\angle R\) is defined as the angular displacement between directed lines joining from \(R_ {peak}\) to \(Q_{peak}\) and from \(R_ {peak}\) to \(S_ {peak}\) fiducials. Similarly, \(\angle S\) is defined as the angular displacement between directed lines joining from \(S_ {peak}\) to \(R_ {peak}\) and from \(S_ {peak}\) to \(T_ {peak}\) fiducials. \(\angle P\) is defined as the angular displacement between directed lines joining from \(P_{onset}\) to \(P_{peak}\) and from \(P_{peak}\) to \(P_{offset}\) fiducials. \(\angle T\) is defined as the angular displacement between directed lines joining from \(T_{onset}\) to \(T_{peak}\) and from \(T_{peak}\) to \(T_{offset}\) fiducials. These angle features are shown in Fig. 6.
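The cosine rule of Eq. (1) can be sketched for any vertex as below. Each fiducial is treated as a point (time, amplitude) in a plane; how the two axes are scaled relative to each other is not stated in the text and is left to the caller. The function name is hypothetical.

```python
import math

def angle_at(vertex, a, b):
    """Angle in degrees at `vertex` between the lines vertex->a and vertex->b."""
    def squared_distance(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    va2 = squared_distance(vertex, a)
    vb2 = squared_distance(vertex, b)
    ab2 = squared_distance(a, b)
    # Cosine rule: cos(V) = (VA^2 + VB^2 - AB^2) / (2 * VA * VB)
    cos_v = (va2 + vb2 - ab2) / (2.0 * math.sqrt(va2) * math.sqrt(vb2))
    cos_v = max(-1.0, min(1.0, cos_v))   # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos_v))

# Angle Q would be angle_at(q_point, p_point, r_point); a right angle as a check:
right_angle = angle_at((0.0, 0.0), (0.0, 1.0), (1.0, 0.0))
```

The same function covers \(\angle R\) and \(\angle S\) by changing which fiducial is passed as the vertex.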

Area Features. Another set of features, called area features, is computed from the triangles formed among the ECG wave fiducials, as follows (Table 1):

The procedure used to compute the area of a triangle having known vertices \((A_{x},A_{y}),(B_{x},B_{y})\) and \((C_{x},C_{y})\) in a 2D space is given as follows [44]:

$$\begin{aligned} \mathbf{Area~of~Triangle~ABC } =\frac{A_{x}(B_{y}-C_{y})+B_{x}(C_{y}-A_{y})+C_{x}(A_{y}-B_{y})}{2} \end{aligned}$$
(2)
Table 1. Area features of a heartbeat
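Eq. (2) can be written directly as a short function. The expression in Eq. (2) is signed and changes sign with the order of the vertices, so this sketch takes the absolute value; that choice and the function name are assumptions.

```python
def triangle_area(a, b, c):
    """Area of the triangle with 2D vertices a, b and c, following Eq. (2)."""
    signed = (a[0] * (b[1] - c[1])
              + b[0] * (c[1] - a[1])
              + c[0] * (a[1] - b[1])) / 2.0
    return abs(signed)   # vertex order only affects the sign

area = triangle_area((0.0, 0.0), (4.0, 0.0), (0.0, 3.0))
```

The example is a right triangle with legs 4 and 3, whose area is 6.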

3 Recognition Results

3.1 Database

Physikalisch-Technische Bundesanstalt (PTB), the National Metrology Institute of Germany, has provided digitized ECGs for research [41]. The ECG signals were collected from healthy volunteers and patients with different heart diseases by Professor Michael Oeff, M.D., at the Department of Cardiology of University Clinic Benjamin Franklin in Berlin, Germany. The PTB database contains a total of 549 records from 290 subjects. Each record includes the conventional 12 leads (i, ii, iii, avr, avl, avf, v1, v2, v3, v4, v5, v6) together with the three Frank ECG leads (vx, vy, vz). Each signal is digitized at 1000 samples per second, with 16-bit resolution over a range of ±16.384 mV. The performance of the ECG biometric recognition system is evaluated on the ECG recordings of 100 subjects from the PTB database. For each of these subjects, 6 windows of 30 s are created, giving a feature matrix of dimension 600 \(\times \) 39. The 6 windows from each subject are used as training templates from which the distance is calculated for each subject.

3.2 Feature Selection

Feature selection is the process of selecting a subset of relevant features from the feature vector collected from the ECG identification model. In this paper, two dimensionality reduction methods are used: principal component analysis (PCA) and kernel principal component analysis (KPCA) [3]. PCA is a very popular technique for dimensionality reduction. Given a data set of n dimensions, the aim of PCA is to find a linear subspace of dimension k, lower than n, such that the data points lie mainly on this linear subspace. Such a reduced subspace attempts to maintain the variability of the data. The PCA approach can be described in five steps: (1) Calculate the covariance matrix of the given n-dimensional data set. (2) Calculate the eigenvalues and eigenvectors of the covariance matrix and sort the eigenvalues in decreasing order. (3) Select the k eigenvectors that belong to the k largest eigenvalues, where k is the dimension of the new feature space. (4) Construct the projection matrix W from the k selected eigenvectors. (5) Finally, transform the given data set X to obtain the k-dimensional feature subspace Y:

$$\begin{aligned} Y= W^T.\,X \end{aligned}$$
(3)
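The five steps can be sketched in plain Python as below. Two choices here are assumptions rather than part of the method as described: power iteration with deflation is swapped in for a full eigendecomposition, and the data are mean-centered before projection. All function names are hypothetical.

```python
import math

def mean_and_covariance(rows):
    """Step 1: per-feature mean and sample covariance matrix."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return mean, cov

def top_eigenpairs(matrix, k, iterations=1000):
    """Steps 2-3: the k largest eigenpairs by power iteration with deflation."""
    d = len(matrix)
    a = [row[:] for row in matrix]
    pairs = []
    for _ in range(k):
        v = [1.0 / (i + 1) for i in range(d)]          # fixed start vector
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = [sum(a[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        value = sum(v[i] * sum(a[i][j] * v[j] for j in range(d)) for i in range(d))
        pairs.append((value, v))
        for i in range(d):                              # remove the found component
            for j in range(d):
                a[i][j] -= value * v[i] * v[j]
    return pairs

def pca_project(rows, k):
    """Steps 4-5: project each centered sample onto the k leading eigenvectors."""
    mean, cov = mean_and_covariance(rows)
    w = [vec for _, vec in top_eigenpairs(cov, k)]
    return [[sum((r[j] - mean[j]) * vec[j] for j in range(len(mean))) for vec in w]
            for r in rows]

# Four samples lying on the line y = 2x reduce to a single component.
projected = pca_project([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]], 1)
```

In this toy data set all variance lies along one direction, so one component retains it completely.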

PCA is designed to capture linear structure in high-dimensional data sets. However, high-dimensional data sets are often nonlinear [3]. In some cases the high-dimensional data lie on or near a nonlinear manifold, and PCA then cannot model the variability of the data correctly. In kernel PCA, a kernel is used to compute the high-dimensional feature vectors of a nonlinear mapping of the input data set efficiently. Kernel PCA is formulated as follows:

$$\begin{aligned} \sum _{i}^{t}{\varTheta (x_{i}) -Z_{q}Z_{q} ^{T}\varTheta (x_{i})} \end{aligned}$$
(4)

where \(Z_{q}\) consists of the eigenvectors, \(\varTheta \) is the nonlinear mapping and \(x_{i}\) are the points of the data set.
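In practice the mapped vectors are never formed explicitly. Kernel PCA works on a kernel matrix of pairwise similarities, which is centered before its eigenvectors are computed. The sketch below shows only this first stage. The text does not name the kernel, so the Gaussian (RBF) kernel and its `gamma` parameter are assumptions, as are the function names.

```python
import math

def rbf_kernel_matrix(rows, gamma):
    """Pairwise Gaussian kernel values exp(-gamma * squared distance)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
             for y in rows] for x in rows]

def center_kernel(k):
    """Center the kernel matrix, i.e. center the data in the mapped feature space."""
    n = len(k)
    row_mean = [sum(row) / n for row in k]      # equals the column means (k is symmetric)
    total_mean = sum(row_mean) / n
    return [[k[i][j] - row_mean[i] - row_mean[j] + total_mean for j in range(n)]
            for i in range(n)]

points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
centered = center_kernel(rbf_kernel_matrix(points, 0.5))
```

The leading eigenvectors of the centered matrix then play the role that the covariance eigenvectors play in linear PCA.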

3.3 Recognition Performance

For recognition, genuine and imposter matching scores are generated. A matching score is a similarity measure between the features derived from the test and training templates. For different individuals, the test template is compared to the templates stored in the gallery set using Euclidean distance as the similarity measure to generate the matching scores (Table 2).
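A minimal sketch of score generation follows. It assumes one stored template per subject in the gallery and labels a score as genuine when the test and gallery identities match. The function name, the data layout and the toy vectors are hypothetical.

```python
import math

def matching_scores(gallery, probes):
    """Split Euclidean distances into genuine and imposter scores.

    gallery maps a subject id to its stored feature vector;
    probes is a list of (subject id, test feature vector) pairs.
    """
    genuine, imposter = [], []
    for probe_id, vector in probes:
        for gallery_id, template in gallery.items():
            distance = math.dist(vector, template)   # Euclidean distance
            if gallery_id == probe_id:
                genuine.append(distance)
            else:
                imposter.append(distance)
    return genuine, imposter

gallery = {"a": [0.0, 0.0], "b": [3.0, 4.0]}
genuine, imposter = matching_scores(gallery, [("a", [0.0, 1.0]), ("b", [3.0, 3.0])])
```

Because the score is a distance, smaller values indicate a closer match.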

Table 2. Evaluation of recognition performance using different methods

Fig. 7. ROC curves representing equal error rate.

The receiver operating characteristic (ROC) curve is plotted as a function of the decision threshold and shows the false acceptance rate against the false rejection rate. The equal error rate (EER) is defined as the rate at which the false acceptance rate equals the false rejection rate. The accuracy of the recognition system, in percent, is obtained by subtracting the EER value from 100.
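The EER can be estimated from the two score lists by sweeping the decision threshold, as in the sketch below. It assumes distance scores (a test is accepted when its distance does not exceed the threshold) and takes the threshold at which the two error rates are closest, since on finite data they rarely coincide exactly. The function names and the toy scores are hypothetical.

```python
def far_frr(genuine, imposter, threshold):
    """False acceptance and false rejection rates at one distance threshold."""
    far = sum(d <= threshold for d in imposter) / len(imposter)
    frr = sum(d > threshold for d in genuine) / len(genuine)
    return far, frr

def equal_error_rate(genuine, imposter):
    """Return (EER, threshold) where FAR and FRR are closest to each other."""
    best = None
    for threshold in sorted(set(genuine) | set(imposter)):
        far, frr = far_frr(genuine, imposter, threshold)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, threshold)
    return best[1], best[2]

eer, threshold = equal_error_rate([0.1, 0.2, 0.3, 0.4], [0.35, 0.5, 0.6, 0.7])
accuracy_percent = 100.0 - 100.0 * eer
```

For these toy scores both error rates equal 0.25 at a threshold of 0.35, which corresponds to an accuracy of 75%.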

Table 3. Comparison of proposed method with other known methods

The equal error rate (EER) of the identification system is found to be 2.88%, with an accuracy of 97.12%, when kernel PCA is applied for dimensionality reduction. With PCA the EER is 8.86% and the accuracy is 91.14%, and with Euclidean distance the EER is 8.98% and the accuracy is 91.01%. The performance of the ECG biometric recognition system is represented by the receiver operating characteristic (ROC) curves shown in Fig. 7. They show that the genuine acceptance rate (GAR) reaches 100% at a 19.67% false acceptance rate (FAR) for kernel PCA, at 25.06% FAR for PCA and at 24.57% FAR for Euclidean distance. The recognition performance is best for the kernel PCA reduction method. In comparison with other methods, the proposed ECG biometric recognition system performs well on the PTB database, as shown in Table 3.

4 Conclusion

This study has proposed a method for biometric recognition of individuals using their heartbeats. The method delineates the dominant fiducials of the ECG waveform and then computes interval, amplitude, angle and area features. The recognition results show that the proposed method of ECG biometric recognition is useful for distinguishing the heartbeats of normal as well as inpatient subjects.

Every individual has a heart, and the way it beats, once used as a biometric, naturally proves that the user is alive. Therefore, no test of liveness is required. Finally, each individual has a unique set of heartbeat features. Thus, the proposed technique can be used as a potential biometric for human recognition that is very secure and robust against falsification.