1 Introduction

According to the World Health Organization, 1.13 billion people worldwide currently have high blood pressure (BP), and fewer than 1 in 5 of them have taken remedial measures to keep it under control. Hypertension is one of the leading causes of life-threatening cardiovascular diseases such as stroke and heart attack. Cardiovascular disease caused 17.9 million deaths worldwide (31% of all deaths) in 2016. Hence, blood pressure needs to be checked regularly and, if found high, remedial measures such as medication, a healthy diet and physical activity need to be taken to keep it under control [1].

Hypertension, also known as high blood pressure, is a condition in which the blood vessels have constricted due to deposits of fats and free radicals and the individual has persistently ‘raised’ blood pressure. Blood is carried from the heart to all parts of the body through the blood vessels. Each time the heart beats, it pumps blood into the vessels. Blood pressure is the force that the blood exerts against the walls of the blood vessels as it is pumped by the heart. The unit for measuring blood pressure is millimetres of mercury (“mmHg”). Blood pressure is recorded with the systolic reading followed by the diastolic reading. Systolic blood pressure (SBP) is the pressure created when the heart pumps out blood. Diastolic blood pressure (DBP) is the pressure while the heart muscle is resting between beats and is being refilled with blood. Mean arterial pressure (MAP) refers to the average arterial pressure over one cardiac cycle and is derived from the SBP and DBP values.

The sphygmomanometer is the gold standard for measuring blood pressure. However, it is not convenient for continuous monitoring of blood pressure, and it requires trained medical practitioners to take the measurement. Recently, measuring blood pressure using photoplethysmography (PPG) has been gaining popularity, owing to the convenience it offers for continuous monitoring without the need for any special training. Several methods have been proposed to estimate blood pressure based on photoplethysmography [2,3,4,5,6,7,8,9]. The most well-known method is the estimation of systolic blood pressure (SBP) and diastolic blood pressure (DBP) from pulse transit time (PTT), which requires measurements from two different spots on the human body and special training to assess the readings. Hence, it is inconvenient and difficult to use for regular measurement, even for a trained person. The process is quite complicated, as it involves synchronizing two different signals that are captured simultaneously.

In this paper, a novel approach is presented that uses features extracted by wavelet scattering, together with a deep learning algorithm, to estimate blood pressure accurately. The technique uses only photoplethysmograph (PPG) signals measured at a single site, the fingertip, and consists of four steps. The first step is preprocessing of the PPG signals. The second step is extraction of wavelet scattering features from the preprocessed signals. The third step is training the regression model; in this work, a support vector regression model and a long short-term memory network regression model are used. The fourth step is evaluation of the learned model's performance on a hold-out dataset. The wavelet scattering transform is a suitable feature extraction technique because it computes a time-shift-invariant representation, remains stable to time-warping deformation, and retains vital information such as frequency content. Using the wavelet scattering transform, a new representation of the original PPG signal is generated that contains time and local frequency information [10].

The rest of this paper is organized as follows. Section 2 presents an overview of photoplethysmography and explores the related research in this area. Section 3 provides an overview of the database used and elaborates the wavelet scattering-based feature extraction technique and the derivation of the BP estimation model. Section 4 presents the experimental results. Section 5 concludes the paper.

2 Underlying technology and related research

2.1 Principle of photoplethysmography

Photoplethysmography is a technique in which blood pressure is estimated from the blood volume changes observed at the surface of the skin, especially in areas where the skin is sensitive (nerve endings) and its color changes with the amount of blood flow. Measuring blood pressure using PPG is noninvasive and user friendly; hence it can be seamlessly integrated into portable devices such as smartphones, smart watches and smart bracelets [2, 9, 11]. In photoplethysmography, low-intensity infrared light is used to measure blood volume changes in the peripheral blood circulation. When light passes through the skin, it is absorbed by the tissues, blood, bones and skin pigments. Blood absorbs more light than the surrounding tissues; hence it is quite easy to detect changes in blood flow using this technique. Thus, by measuring the blood volume changes, blood pressure can be assessed. The accuracy of the results depends on factors in the measurement environment, such as the intensity of the surrounding light and the ability of the subject to hold still. The most commonly used measuring sites are the fingertips, earlobes and forehead, as the blood flow can be easily measured in these light-sensitive areas. The light source illuminates the skin and the photodetector captures the variations in light intensity from a specified area [3]. These light variations, in conjunction with the time differences between the signals, are used to calculate the blood pressure.

2.2 Related research

Over the past few years, a considerable amount of research has been carried out on estimating blood pressure from PPG signals. The common method typically consists of three steps: (1) the PPG signal is analysed and features are extracted, (2) machine learning algorithms are used to study how well the extracted features correlate with the actual blood pressure values obtained using standard medical devices, and (3) a prediction model for blood pressure is derived using the correlated features. The most widely accepted feature used to estimate blood pressure is pulse transit time (PTT). PTT is the time taken by the pulse wave to travel from the heart to an extremity of the body. Pulse wave velocity (PWV) is related to blood pressure and is inversely proportional to PTT. This method requires two sources to calculate the time interval. In most studies, ECG signals and PPG signals are used to calculate PTT [8, 12,13,14,15]. In some other studies, two PPG signals captured at two different peripheral sites are used [16]. BP and PTT were observed to be negatively correlated. Hence, linear regression is used to determine the correlation between PTT and blood pressure and to obtain regression equations. The drawback is that it is difficult to synchronize the ECG and PPG signal data to calculate PTT [11]. In addition, positioning the PPG sensor incorrectly on the wrist may distort the PPG signals, which affects the accuracy.
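The PTT-based regression described above can be illustrated with a minimal sketch. The data below are synthetic and purely hypothetical (the coefficients 180 and 250, the noise level and the PTT range are assumptions chosen for illustration, not values from the cited studies); the sketch only shows how a negative correlation between PTT and BP leads to a linear regression equation.

```python
import numpy as np

# Synthetic, hypothetical data: 200 beats with PTT in seconds and SBP in mmHg.
# The generating line (180 - 250 * PTT) is an assumption made for illustration.
rng = np.random.default_rng(0)
ptt = rng.uniform(0.15, 0.35, size=200)
sbp = 180.0 - 250.0 * ptt + rng.normal(0.0, 3.0, size=200)

# Least-squares regression line: SBP = slope * PTT + intercept.
slope, intercept = np.polyfit(ptt, sbp, deg=1)

# Pearson correlation between PTT and SBP (negative for this data).
r = np.corrcoef(ptt, sbp)[0, 1]

def estimate_sbp(ptt_seconds):
    """Estimate SBP (mmHg) from PTT (s) using the fitted regression line."""
    return slope * ptt_seconds + intercept

print(f"slope={slope:.1f} mmHg/s, intercept={intercept:.1f} mmHg, r={r:.2f}")
```

In practice, such regression equations are usually calibrated per subject, which is one reason the two-signal PTT approach is inconvenient for regular use.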

Blood pressure has also been estimated from vascular transit time (VTT), which is measured from heart sound and finger pulse [17, 18]. VTT is defined as the time it takes for the blood to propagate from the heart to the body peripherals in one cardiac cycle. Two mobile phones are used to record the heart sound and the finger pulse. The clocks in both mobile phones must be synchronized, which is a challenging task. Another challenge is finding the best spot to record the heart sound.

Many studies have estimated blood pressure using only PPG signals [4, 5, 7, 19,20,21,22]. In these studies, the shape of the PPG signal is analyzed and features are extracted. Such features include peak width, peak height, peak area, and the distance between a consecutive peak and valley. However, achieving accuracy equivalent to that of a standard medical device remains challenging.
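As an illustration of such shape-based features, the following sketch extracts peak height, rise time (valley-to-peak distance) and peak-to-peak interval from a clean synthetic pulse train. The function name `ppg_shape_features`, the simple local-maximum peak detector and the Gaussian test pulse are assumptions made for this example; they are not the feature extractors used in the cited works.

```python
import numpy as np

def ppg_shape_features(ppg, fs=125):
    """Per-beat shape features from a clean PPG segment (illustrative only)."""
    x = np.asarray(ppg, dtype=float)
    mid = 0.5 * (x.max() + x.min())
    # A peak is a local maximum lying above the mid-level of the signal.
    is_peak = (x[1:-1] > x[:-2]) & (x[1:-1] >= x[2:]) & (x[1:-1] > mid)
    peaks = np.flatnonzero(is_peak) + 1
    features = []
    for prev, cur in zip(peaks[:-1], peaks[1:]):
        # The valley is the minimum between two consecutive peaks.
        valley = prev + int(np.argmin(x[prev:cur]))
        features.append({
            "peak_height": x[cur] - x[valley],
            "rise_time_s": (cur - valley) / fs,
            "peak_interval_s": (cur - prev) / fs,
        })
    return features

# Synthetic pulse train: one Gaussian-shaped pulse per second for 10 s.
fs = 125
n = np.arange(10 * fs)
phase = (n % fs) / fs
ppg = np.exp(-(((phase - 0.32) / 0.1) ** 2))
beats = ppg_shape_features(ppg, fs)
print(len(beats), beats[0])
```

Real PPG recordings contain noise and motion artifacts, so a practical extractor would need a more robust peak detector than the one sketched here.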

The proposed BP estimator also uses only PPG signals for BP estimation. PPG features are extracted using the wavelet scattering transform, and these features are used to train the learning model from which the BP estimator is derived.

3 Proposed method

In this paper, a system is proposed that computes blood pressure noninvasively from PPG signals captured from the fingertip of a person. The proposed system is developed in five phases: (1) dataset collection, (2) data preprocessing, (3) feature extraction, (4) training the machine learning model and (5) model evaluation. These phases are depicted in Fig. 1.

Fig. 1 Block diagram depicting the phases of BP estimation using wavelet scattering and regression model

3.1 Dataset collection

The dataset required for the design of the PPG-based BP estimation system was collected from the Multi-parameter Intelligent Monitoring in Intensive Care (MIMIC) II online database provided by PhysioNet [6]. The database contains preprocessed and cleaned waveform signals. Ten thousand records were extracted from this database. Each record consists of three rows: the first row corresponds to the PPG signal recorded at the fingertip, the second row corresponds to the invasive arterial blood pressure (ABP) signal (in mmHg) and the third row corresponds to the electrocardiogram (ECG) signal. The sampling frequency of each signal is 125 Hz. The PPG and ABP signals were used in this work. Target systolic and diastolic values were derived from the ABP signals; they were used to train the machine learning model and to evaluate the accuracy of the proposed system by comparison with its estimated BP values.

3.2 Data selection

On analyzing the dataset collected from the database, a few PPG signals were found to have an insufficient record duration and were therefore not suitable for BP estimation. Those signals were detected by analyzing the pulse onsets in the ABP waveform and were then eliminated. A pulse onset indicates the arrival of the ABP pulse at the site of recording. Pulse onsets were detected by applying the following three steps to the ABP signal [23]: (1) suppression of high frequency noise that might affect ABP onset detection, using a low pass filter; (2) conversion of the filtered ABP signal into a slope sum function signal, in which the upslope of the ABP pulse is enhanced and the remainder of the pressure waveform is suppressed; (3) detection of the pulse onset from the slope sum function signal by applying adaptive thresholding and a local search strategy.
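The three steps above can be sketched as follows. This is a simplified illustration, not the algorithm of [23] as published: the moving-average low pass filter, the fixed threshold at half of the maximum slope sum (standing in for adaptive thresholding), the 128 ms window, the 300 ms refractory period and the function name `detect_onsets` are all assumptions made for this example.

```python
import numpy as np

def detect_onsets(abp, fs=125, window_s=0.128, refractory_s=0.3):
    """Simplified pulse onset detection on an ABP signal (illustrative)."""
    abp = np.asarray(abp, dtype=float)
    # (1) Low pass filter: 5-point moving average with edge padding
    #     (an assumed stand-in for the filter used in the reference).
    padded = np.pad(abp, 2, mode="edge")
    smooth = np.convolve(padded, np.ones(5) / 5, mode="valid")
    # (2) Slope sum function: sum of the positive slopes over the last
    #     `w` samples, which enhances the upslope of each pulse.
    slope = np.diff(smooth, prepend=smooth[0])
    upslope = np.clip(slope, 0.0, None)
    w = int(round(window_s * fs))
    ssf = np.convolve(upslope, np.ones(w), mode="full")[: len(abp)]
    # (3) Thresholding (fixed fraction of the maximum, a simplification)
    #     followed by a local search for the minimum before the crossing.
    above = ssf > 0.5 * ssf.max()
    crossings = np.flatnonzero(above[1:] & ~above[:-1]) + 1
    onsets, last_crossing = [], -np.inf
    for c in crossings:
        if c - last_crossing > refractory_s * fs:
            start = max(0, c - w)
            onsets.append(start + int(np.argmin(smooth[start:c + 1])))
            last_crossing = c
    return np.array(onsets, dtype=int)

# Synthetic ABP: 40 s at 125 Hz, one pulse per second between 80 and 120 mmHg.
fs = 125
n = np.arange(40 * fs)
phase = (n % fs) / fs
abp = 80.0 + 40.0 * np.exp(-(((phase - 0.2) / 0.08) ** 2))
onsets = detect_onsets(abp, fs)
print(len(onsets), "onsets detected")
```

The onset count returned by such a detector is what the record selection step operates on.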

The number of ABP pulse onsets in the ABP signal of each record was counted. Records with 30 or fewer pulse onsets were eliminated because they contain signals of insufficient length that are not suitable for analysis and estimation of BP. After this elimination, the resulting dataset contained 8271 records. Subsequently, signals with BP values outside the scope, i.e., very high or very low BP values, were eliminated. As a result of eliminating such invalid signals, the final dataset contained 4314 records. The distribution of systolic blood pressure (SBP) and diastolic blood pressure (DBP) values in the final dataset is depicted in Fig. 2.

3.3 Data preprocessing

The signals from the MIMIC database were found to have certain blocks degraded by various distortions and artifacts [24], which may lead to incorrect results when processed for BP estimation. To remove the noise and other artifacts, the wavelet denoising technique used in [24] was adapted in this work. Preprocessing involves resampling the signals at a fixed frequency of 1000 Hz, wavelet decomposition, zeroing of the 0 to 0.25 Hz band, zeroing of the 250 to 500 Hz band, wavelet reconstruction, threshold selection and wavelet thresholding. The original signal and the denoised signal are shown in Fig. 3. The resulting preprocessed signals were used for feature extraction and for training the learning model.
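A minimal sketch of this kind of wavelet denoising is given below. It makes several assumptions that are not stated in the paper: a Haar wavelet implemented directly in NumPy, linear interpolation for resampling, an 11-level decomposition (at 1000 Hz the level-11 approximation covers roughly 0 to 0.244 Hz and the level-1 detail covers 250 to 500 Hz), and soft thresholding with the universal threshold. The function names are hypothetical, and the input must contain at least 256 samples at 125 Hz.

```python
import numpy as np

def haar_dwt(x, levels):
    """Multi-level Haar decomposition; details[0] is the finest band."""
    approx, details = np.asarray(x, dtype=float), []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2))
        approx = (even + odd) / np.sqrt(2)
    return approx, details

def haar_idwt(approx, details):
    """Inverse of haar_dwt."""
    for d in reversed(details):
        out = np.empty(2 * len(d))
        out[0::2] = (approx + d) / np.sqrt(2)
        out[1::2] = (approx - d) / np.sqrt(2)
        approx = out
    return approx

def denoise_ppg(ppg, fs_in=125, fs_out=1000, levels=11):
    """Resample, zero the lowest and highest bands, threshold the rest."""
    ppg = np.asarray(ppg, dtype=float)
    # Resample to fs_out by linear interpolation; truncate the length to a
    # multiple of 2**levels so that the Haar transform is exact.
    n_out = (int(len(ppg) * fs_out / fs_in) // 2 ** levels) * 2 ** levels
    t_in = np.arange(len(ppg)) / fs_in
    t_out = np.arange(n_out) / fs_out
    x = np.interp(t_out, t_in, ppg)
    approx, details = haar_dwt(x, levels)
    approx[:] = 0.0        # remove baseline wander (about 0 to 0.25 Hz)
    details[0][:] = 0.0    # remove the 250 to 500 Hz band
    # Universal threshold from the median absolute deviation of the
    # finest remaining band, applied as soft thresholding.
    sigma = np.median(np.abs(details[1])) / 0.6745
    thr = sigma * np.sqrt(2.0 * np.log(n_out))
    details = [np.sign(d) * np.maximum(np.abs(d) - thr, 0.0) for d in details]
    return haar_idwt(approx, details)

# Synthetic noisy pulse-like signal with a constant offset, 10 s at 125 Hz.
fs = 125
t = np.arange(10 * fs) / fs
noise = 0.05 * np.random.default_rng(0).normal(size=t.size)
raw = 2.0 + np.sin(2 * np.pi * 1.2 * t) + noise
clean = denoise_ppg(raw)
print(clean.size, float(clean.mean()))
```

A smoother wavelet family would normally be preferred for PPG signals; the Haar wavelet is used here only to keep the sketch self-contained.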

Fig. 2 Distribution of SBP and DBP values in the final dataset. a SBP values and b DBP values

Fig. 3 Original PPG signal and the preprocessed signal

3.4 Feature extraction

In this work, a signal analysis approach called the wavelet scattering transform [10] is applied to extract features from the PPG signal. To extract features, wave-like oscillations called wavelets are used, which can be scaled and shifted to best fit the signal. By creating a linear combination of wavelets, a new signal representation is created. The wavelet is applied to the signal to generate a set of coefficients that express the similarity between the wavelet and the signal. These coefficients form a new representation of the original signal containing time and local frequency information. This process is referred to as the wavelet transform, and it is the foundation of the wavelet scattering transform.

A wavelet scattering transform builds translation-invariant representations that are stable to deformation by applying convolution, nonlinearity and scaling functions. The scattering transform delocalizes the signal data y into scattering decomposition paths. Let the original signal be segmented into equal-sized time windows. If \(p=\left( \lambda _{1},\lambda _{2},\ldots ,\lambda _{m}\right) \) is a wavelet scattering path of length m, w is the time window position and \(2^{k}\) is the window size, then the scattering coefficient of order m at the scale \(2^{k}\), denoted by \(S_{k}\left[ p\right] y\left( w\right) \), is computed as in Eq. (1) [10].

$$\begin{aligned} S_{k}\left[ p\right] y\left( w\right) =\left| \left| \left| y*\psi _{\lambda _{1}}\right| *\psi _{\lambda _{2}}\right| \ldots *\psi _{\lambda _{m}}\right| *\phi _{2^{k}}(w) \end{aligned}$$
(1)

where \(\psi \left( w\right) \) is the Morlet wavelet that forms the building block of wavelet scattering and is given by Eq. (2)

$$\begin{aligned} \psi \left( w\right) = c_{1}\left( \mathrm{e}^{iw\cdot \nu } - c_{2}\right) \mathrm{e}^{\frac{-\left| w\right| ^2}{2\sigma ^2}} \end{aligned}$$
(2)

where \(\nu \) is the frequency, \(\sigma \) is the measure of spread, and \(c_{1}\) and \(c_{2}\) are constants that are adjusted so that Eqs. (3) and (4) are satisfied

$$\begin{aligned}&\int \psi \left( w\right) \mathrm{d}w = 0 \end{aligned}$$
(3)
$$\begin{aligned}&\int \left| \psi \left( w\right) \right| ^{2} \mathrm{d}w = 1 \end{aligned}$$
(4)

Scaling function \(\phi _{2^{k}}\left( w\right) \) is given by Eq. (5)

$$\begin{aligned} \phi _{2^{k}}\left( w\right) = 2^{-2k}\phi \left( 2^{-k}w\right) \end{aligned}$$
(5)

The original PPG signal Y is segmented into equal-sized time windows, \(Y=y_{1},y_{2},y_{3},\ldots ,y_{n}\). A vector of scattering coefficients is computed from each time window as follows. First, the segmented slice of the signal, \(y_{1}\), is filtered with the scaling function \(\phi _{2^{k}}\), which yields an averaging of the signal. The averaged signal is represented by \(y_{1}*\phi _{2^{k}}\) and provides invariance to local time shifting. The averaging removes high frequencies and hence loses information. The original signal is therefore also filtered with a high pass filter, the wavelet \(\psi _{\lambda _{1}}\), which yields a new representation of the signal given by \(y_{1}*\psi _{\lambda _{1}}\). High pass filtering retains detailed information about the signal and recovers the information lost during low pass filtering. The modulus of the high pass filtered output is then taken, resulting in \(\left| y_{1}*\psi _{\lambda _{1}}\right| \); the modulus computes the low frequency envelope. Next, this output from the previous layer is filtered with the low pass filter, giving \(\left| y_{1}*\psi _{\lambda _{1}}\right| *\phi _{2^{k}}\), and with high pass filters, giving \(\left| y_{1}*\psi _{\lambda _{1}}\right| *\psi _{\lambda _{2}}\), and the modulus of the high pass filtered output is again taken. This process is continued for the desired number of layers. The output of the low pass filtering at every layer yields the scattering coefficients that represent the signal. The next time window is then selected and the process is repeated. This process, depicted in Fig. 4, extracts the wavelet scattering features.

The extraction of wavelet scattering features was implemented in MATLAB by executing the following steps: (i) construction of a wavelet time scattering decomposition framework with default filter banks, an adjusted invariance scale and the sampling frequency set to 125 Hz; (ii) extraction of the scattering coefficients from the PPG signal.
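The layered computation of Eq. (1) can be illustrated with a small NumPy sketch. It is not the MATLAB framework used in this work: the dyadic filter bank, the Gaussian low pass filter, the bandwidths, circular convolution through the FFT and the function names `morlet_hat` and `scattering` are assumptions made for illustration. Frequencies are expressed in cycles per sample, and only second-order paths with a decreasing centre frequency are kept.

```python
import numpy as np

def morlet_hat(n, xi, sigma):
    """Fourier transform of a Morlet wavelet on an n-point grid.

    The subtracted Gaussian plays the role of c2 in Eq. (2): it makes the
    response at zero frequency vanish, i.e. the wavelet has zero mean.
    """
    f = np.fft.fftfreq(n)
    kappa = np.exp(-(xi ** 2) / (2 * sigma ** 2))
    return (np.exp(-((f - xi) ** 2) / (2 * sigma ** 2))
            - kappa * np.exp(-(f ** 2) / (2 * sigma ** 2)))

def scattering(y, k=5, n_filters=4):
    """Scattering coefficients of orders 0, 1 and 2 (rows: paths)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    f = np.fft.fftfreq(n)
    # Low pass filter phi_{2^k}: Gaussian whose width shrinks with 2^k.
    phi_hat = np.exp(-(f ** 2) / (2 * (0.15 / 2 ** k) ** 2))
    # Dyadic bank of wavelets psi_lambda, from high to low frequency.
    psis = [morlet_hat(n, 0.25 / 2 ** j, 0.0625 / 2 ** j)
            for j in range(n_filters)]

    def conv(x, h_hat):
        return np.fft.ifft(np.fft.fft(x) * h_hat)

    def average(x):
        # Averaging by phi followed by subsampling at the scale 2^k.
        return conv(x, phi_hat).real[:: 2 ** k]

    rows = [average(y)]                            # order 0: y * phi
    u1 = [np.abs(conv(y, p)) for p in psis]        # |y * psi_1|
    rows += [average(u) for u in u1]               # order 1
    for i, u in enumerate(u1):
        for p in psis[i + 1:]:                     # lower frequency only
            rows.append(average(np.abs(conv(u, p))))   # order 2
    return np.array(rows)

# A constant signal keeps its value at order 0 and vanishes at orders 1, 2.
S_const = scattering(np.full(1024, 3.0), k=5, n_filters=4)
print(S_const.shape)
```

With four wavelets this toy bank yields 1 + 4 + 6 = 11 paths per time window; a production filter bank is much denser, which is how a larger number of paths arises.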

Fig. 4 Process of extracting wavelet scattering features from a time window, i.e., a slice \(y_{i}\) of the original signal Y

Fig. 5 Wavelet scattering coefficients for the first 50 consecutive time windows of layer 1, layer 2 and layer 3, extracted from three PPG signals

Table 1 LSTM forecasting architecture based on Keras

Wavelet scattering features were extracted from the preprocessed PPG signals of all 4314 records, one record at a time. For each PPG signal, this extraction yields a set of robust features in a two-dimensional matrix of size \(157\times N\): scattering coefficients are obtained across \(M=157\) scattering paths, and N is the number of time windows, which depends on the length of the PPG signal. Figure 5 shows the wavelet scattering coefficients computed for the first 50 consecutive time windows of three PPG signals. The coefficients at layers 0, 1 and 2 contain most of the energy [25]; hence, the scattering coefficients derived at these layers were selected for training the learning model in this work. The wavelet scattering coefficients (SC) obtained at layers 0, 1 and 2 are represented by Eq. (6).

$$\begin{aligned} \mathrm{SC} = \left\{ Y*\phi _{2^{k}},\; \left| Y*\psi _{\lambda _{1}}\right| *\phi _{2^{k}},\; \left| \left| Y*\psi _{\lambda _{1}}\right| *\psi _{\lambda _{2}}\right| *\phi _{2^{k}} \right\} \end{aligned}$$
(6)
Fig. 6 Error histogram from SVR. a Histogram of relative error in calculated SBP. b Histogram of relative error in calculated DBP. c Histogram of relative error in calculated MAP

Fig. 7 Error histogram from LSTM. a Histogram of relative error in calculated SBP. b Histogram of relative error in calculated DBP. c Histogram of relative error in calculated MAP

3.5 Support vector regression model

The dataset of 4314 records was partitioned into a training set of 3883 records and a testing set of 431 records. The training and testing sets were selected using the hold-out technique, and there is no overlap between them. The features obtained using the wavelet scattering transform are given as input to the support vector regression (SVR) model. SVR is a supervised machine learning technique that relies on a kernel function and can predict data accurately [26]. It has good generalization capability, handles both linear and nonlinear data efficiently, and is highly noise tolerant. The predictor model was constructed by training the SVR model with the feature set and the systolic and diastolic values obtained from the ABP signal. The derived predictor model was then evaluated using the testing dataset.
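A minimal sketch of this hold-out training procedure with the scikit-learn library is shown below. The feature matrix and targets are synthetic placeholders, and the fixed-length feature vector per record, the RBF kernel and the values of `C` and `epsilon` are assumptions made for illustration; the paper does not state these settings.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic stand-in for the real data: one fixed-length feature vector per
# record (assumed), with SBP and DBP targets in mmHg.
rng = np.random.default_rng(0)
n_records, n_features = 4314, 10
X = rng.normal(size=(n_records, n_features))
sbp = 120.0 + 10.0 * X[:, 0] + rng.normal(0.0, 5.0, n_records)
dbp = 70.0 + 6.0 * X[:, 1] + rng.normal(0.0, 3.0, n_records)

# Hold-out split: 431 records for testing, the remaining 3883 for training.
X_tr, X_te, sbp_tr, sbp_te, dbp_tr, dbp_te = train_test_split(
    X, sbp, dbp, test_size=431, random_state=0)

# One regressor per target; kernel and hyperparameters are assumed values.
sbp_model = SVR(kernel="rbf", C=10.0, epsilon=0.5).fit(X_tr, sbp_tr)
dbp_model = SVR(kernel="rbf", C=10.0, epsilon=0.5).fit(X_tr, dbp_tr)

sbp_pred = sbp_model.predict(X_te)
dbp_pred = dbp_model.predict(X_te)
print("SBP MAE:", np.mean(np.abs(sbp_te - sbp_pred)))
print("DBP MAE:", np.mean(np.abs(dbp_te - dbp_pred)))
```

Because the split is made once before training, no testing record influences the fitted model, which matches the non-overlap condition stated above.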

3.6 Long short-term memory (LSTM) network model

The LSTM network [27] is a recurrent neural network that has been used in several time series forecasting tasks and has shown remarkable results. Once trained, LSTM networks perform prediction quickly [28]. The predictive model using the LSTM network was developed with the Keras deep learning package. Table 1 shows the architecture of the LSTM. The dataset was partitioned into a training set of 3883 records and a testing set of 431 records. The extracted wavelet scattering features were fed into the LSTM network for learning. The learned model was evaluated using the testing set and obtained an RMSE of 10.95 mmHg for diastolic BP estimation and 19.36 mmHg for systolic BP estimation.

4 Results and discussion

4.1 Analysis of error distribution

Error histograms of the estimated SBP, DBP and MAP for the SVR model and the LSTM regression model are shown in Figs. 6 and 7, respectively. The results for DBP and MAP are comparatively good in both models, which indicates that the models capture the relationship between the wavelet scattering features and these BP values well. Training the models with a large number of samples enabled the machine learning algorithms to build an accurate model. The error rate is found to be high for the SBP targets.

4.2 Comparison with the grading criteria of the British Hypertension Society

Table 2 shows an evaluation of the predictive models built with support vector regression and the long short-term memory network against the British Hypertension Society (BHS) standard. The BHS standard is designed to evaluate the accuracy of proposed BP monitoring devices based on the cumulative percentage of error readings (i.e., the absolute difference between the BP values measured by a standard device and those estimated by the proposed one) falling within three thresholds: 5 mmHg, 10 mmHg and 15 mmHg [31]. It is observed from Table 2 that both learned models, the SVR model and the LSTM regression model, achieve grade B for diastolic blood pressure and grade C for mean arterial pressure according to the BHS protocol.
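The grading rule can be sketched as follows. The cumulative percentage requirements used here are the commonly cited BHS criteria (grade A: 60%, 85% and 95% of readings within 5, 10 and 15 mmHg; grade B: 50%, 75% and 90%; grade C: 40%, 65% and 85%; grade D otherwise); the function name `bhs_grade` and the example data are hypothetical.

```python
import numpy as np

# Minimum cumulative percentages within 5, 10 and 15 mmHg for each grade.
BHS_GRADES = [("A", (60, 85, 95)), ("B", (50, 75, 90)), ("C", (40, 65, 85))]

def bhs_grade(reference, estimate):
    """Return the BHS grade and the cumulative percentages of readings
    whose absolute error is within 5, 10 and 15 mmHg."""
    err = np.abs(np.asarray(reference, float) - np.asarray(estimate, float))
    pct = tuple(100.0 * float(np.mean(err <= t)) for t in (5, 10, 15))
    for grade, minimums in BHS_GRADES:
        if all(p >= m for p, m in zip(pct, minimums)):
            return grade, pct
    return "D", pct

# Hypothetical example: 100 readings with absolute errors of 3, 8, 12 and
# 20 mmHg in the proportions 55 : 25 : 12 : 8.
reference = np.full(100, 80.0)
estimate = reference + np.repeat([3.0, 8.0, 12.0, 20.0], [55, 25, 12, 8])
grade, pct = bhs_grade(reference, estimate)
print(grade, pct)
```

In this example 55%, 80% and 92% of the readings fall within the three thresholds, which satisfies the grade B requirements but not those of grade A.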

Table 2 Comparison with BHS standard
Fig. 8 Bland Altman plot for the difference between the actual values and the values obtained from the proposed method for 431 observation pairs. a SBP predicted using SVR model, b DBP predicted using SVR model, c MAP predicted using SVR model, d SBP predicted using LSTM model, e DBP predicted using LSTM model and f MAP predicted using LSTM model

Table 3 Comparison with existing works
Table 4 Comparison of prediction accuracy produced by proposed model and various benchmark regression models

Figure 8 presents Bland Altman plots for the SBP, DBP and MAP (mean arterial pressure) targets. A Bland Altman plot shows how closely the values obtained using the proposed model agree with those measured by a standard device [32]. In the Bland Altman plot, the mean of the BP values obtained using the standard device and the proposed model is plotted on the x axis, and the difference between the two values is plotted on the y axis. The number of observation pairs for which the SBP, DBP and MAP targets are plotted is 431. The results are found to be satisfactory for DBP and MAP, as most of the points are tightly scattered about the bias line and the limits of agreement are appreciably narrow. It can be observed from the plots that BP samples with very high or very low values produced poor results. This is because the training set contains only a limited number of samples with very high or very low values. It can be deduced from these results that both regression algorithms perform poorly on infrequent samples, which is a limitation of both algorithms.
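The quantities shown in a Bland Altman plot can be computed as in the following sketch. The limits of agreement are taken as the bias plus or minus 1.96 standard deviations of the differences, which is the usual convention; the function name `bland_altman` and the four example readings are hypothetical.

```python
import numpy as np

def bland_altman(reference, estimate):
    """Return per-pair means and differences, the bias, and the limits of
    agreement (bias +/- 1.96 standard deviations of the differences)."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    mean = (ref + est) / 2.0          # x axis of the plot
    diff = ref - est                  # y axis of the plot
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))
    return mean, diff, bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# Hypothetical reference and estimated readings in mmHg.
reference = [120.0, 130.0, 110.0, 125.0]
estimate = [118.0, 133.0, 109.0, 127.0]
mean, diff, bias, (lower, upper) = bland_altman(reference, estimate)
print(f"bias={bias:.2f} mmHg, limits of agreement=({lower:.2f}, {upper:.2f})")
```

Plotting `diff` against `mean`, with horizontal lines at `bias`, `lower` and `upper`, reproduces the layout described above.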

4.3 Comparison with existing work

The results of the proposed work are compared with the results of prior studies in Table 3. The metrics used for evaluation are mean absolute error (MAE) and standard deviation (SD). For each work, Table 3 presents the input signals used, the number of subjects or samples used for BP measurement, the features extracted from the input signals, the techniques used for analysis and learning, and \(\mathrm{MAE}\pm \mathrm{SD}\) for the SBP, DBP and MAP targets. In the proposed work, only PPG signals captured from a single site are used, from 4314 samples collected from the MIMIC II database, and the results achieved are comparable to those of existing works in the literature.

4.4 Comparison of BP prediction accuracy of proposed model with that of various benchmark regression algorithms

The results of the proposed method are compared with three benchmark prediction algorithms: artificial neural network (ANN) regression, random forest regression (RFR) and K-nearest neighbour (K-NN) regression. These algorithms were implemented using the Scikit-learn Python library. Their performance is compared with that of the proposed system, and the results are presented in Table 4. MAE, SD, mean squared error (MSE), relative absolute error (RAE) and root relative squared error (RRSE) are the metrics used for comparison. From the table, it is evident that SVR marginally outperforms the other methods in the calculation of systolic blood pressure, and that LSTM and random forest outperform the other methods by a margin in the calculation of diastolic blood pressure. On the whole, these algorithms produced acceptable accuracy for systolic blood pressure estimation and appreciable accuracy for diastolic blood pressure estimation.

The errors observed in the prediction of diastolic blood pressure by the SVR and LSTM models are 9.8% and 10%, respectively, and those for mean arterial pressure are 9.5% and 9.0%, respectively. The formula used for the calculation of the error is given in Eq. (7).

$$\begin{aligned} \mathrm{Error} = \frac{\left| \mathrm{Actual} - \mathrm{Predicted}\right| }{\mathrm{Actual}}\times 100 \end{aligned}$$
(7)
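The comparison metrics and the error of Eq. (7) can be computed as in the sketch below. The definitions of RAE and RRSE follow the usual convention of normalizing by the error of a predictor that always outputs the mean of the actual values; taking SD as the standard deviation of the signed errors and averaging Eq. (7) over all samples are assumptions, since these details are not stated explicitly. The function name and the example values are hypothetical.

```python
import numpy as np

def regression_metrics(actual, predicted):
    """MAE, SD, MSE, RAE, RRSE and the mean percentage error of Eq. (7)."""
    y = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    err = p - y
    return {
        "MAE": float(np.mean(np.abs(err))),
        "SD": float(np.std(err, ddof=1)),     # SD of signed errors (assumed)
        "MSE": float(np.mean(err ** 2)),
        # Errors relative to always predicting the mean of the actual values.
        "RAE": float(np.sum(np.abs(err)) / np.sum(np.abs(y - y.mean()))),
        "RRSE": float(np.sqrt(np.sum(err ** 2) / np.sum((y - y.mean()) ** 2))),
        # Eq. (7), averaged over all samples.
        "Error_percent": float(np.mean(np.abs(y - p) / y) * 100.0),
    }

# Hypothetical example with three readings in mmHg.
metrics = regression_metrics([80.0, 100.0, 120.0], [90.0, 100.0, 110.0])
print(metrics)
```

An RAE or RRSE below 1 indicates that the model performs better than the mean predictor.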

5 Conclusion

This paper describes blood pressure estimation from photoplethysmograph signals captured from the fingertip. A new signal analysis method for extracting novel features from PPG signals has been introduced. The derivation of the predictor model using machine learning and its evaluation on a testing set were described. The testing results were assessed against the BHS standard, and it is shown that the proposed model achieved grade B for DBP and grade C for MAP. A comparative analysis of the results produced by the proposed models and by various benchmark regression models was performed, and it showed a marginal improvement in the accuracy of the proposed model. The results were also compared with existing works in the literature and were found to be comparable. The MIMIC II database used for this study contains signal segments that are degraded by noise; hence, improving the preprocessing step for noise removal would improve the results. This work differs from existing works in its use of wavelet scattering techniques.