1 Introduction

Cardiovascular diseases (CVDs) are the leading cause of death globally, claiming 17.9 million lives in 2019. Hypertension, high cholesterol levels, and diabetes are major risk factors for CVDs. Continuous monitoring of blood pressure (BP) is crucial for preventing heart attacks in intensive care unit (ICU) patients. Invasive and non-invasive devices have been developed for BP measurement. The cuff-based sphygmomanometer is the most widely used method for measuring BP. It provides systolic and diastolic blood pressure readings.

Several research articles have explored new methods for using photoplethysmography (PPG) signals to measure BP and heart rate. Elgendi et al. analyzed PPG characteristics [1]. Lee et al. demonstrated webcam footage-based heart rate measurement using joint blind source separation (JBSS) and ensemble empirical mode decomposition (EEMD) techniques [2]. S. Lee et al. proposed a method for identifying optimal measurement locations for PPG sensors on the wrist [3]. Sangurmath et al. suggested using a reflectance optical sensor to obtain real-time PPG signals and monitor heart function [4].

Pulse oximeter-based wearable BP measurement has also been proposed [5]. Other studies have explored the use of PPG for various physiological monitoring applications [6]. Pourush Sood et al. developed an algorithm to detect characteristics and rhythms in PPG data, such as dicrotic notches, ectopic beats, and apertures [7]. Piyush Jain proposed a two-stage CNN approach for classifying hypertension risk using multi-lead ECG data [8].

I-Ping Yao proposed a method to measure BP from the combination of PPG and ECG signals using machine learning algorithms [10]. A systematic review by [11] focused on heart rate variability in bipolar disorder based on PPG signals.

Hendrana Tjahjadi et al. proposed a method for classifying BP using the K-nearest neighbors (KNN) algorithm based on PPG signals [12]. PTT-based BP measurement was investigated using a multi-layer neural network [15]. Tine Proesmans et al. used the PPG approach to estimate heart rate from fingertip recordings taken with a camera phone [16]. Mathematical analyses such as error convergence analysis, attitude, and its order degree have also been applied to disease prediction [19, 20].

These studies highlight the potential of PPG as a non-invasive and cost-effective method for monitoring cardiovascular health. Further research is needed to validate and refine these methods for clinical use.

From the literature, it is observed that most existing BP measurement algorithms lack real-world validation and focus on invasive methods for ICU patients. Cuff-based measuring devices, though widely used, have several limitations: the arm pressure causes discomfort, especially during IV administration and for ICU patients; they are unsuitable for certain conditions; and cuff inflation is disruptive. Hence there is a need for accurate, non-invasive alternatives, especially when medical staff are unavailable. To address this gap, this paper proposes a novel, non-invasive method for measuring BP and heart rate from the PPG signal using machine learning.

2 Experimental Framework

The proposed workflow comprises four main stages: data acquisition, smoothing, preprocessing, and machine learning for prediction. The simulation workflow is illustrated in Fig. 1, and the hardware implementation steps are shown in Fig. 2. The blocks depicted in Fig. 1 are explained in the following sections.

Fig. 1 Simulation workflow of the proposed system

Fig. 2 Hardware implementation of the proposed system

2.1 Data Acquisition

Photoplethysmography operates on the principle of light reflection or transmission. A pulse sensor consists of an infrared LED placed on the body and a photodetector that detects blood volume changes; the photodetector is positioned beside the LED (reflectance mode) or opposite it (transmission mode). Light passes through tissue, skin pigments, and blood, with blood absorbing the most light. Variations in blood volume during each heartbeat enable the calculation of heart rate [9] and blood pressure. As the heart relaxes, it fills with blood; as it contracts, blood is pumped throughout the body via arteries, capillaries, and veins. The filling phase corresponds to Diastolic Blood Pressure, detected as a diastolic peak, while the contraction phase relates to Systolic Blood Pressure, indicated by a systolic peak [14]. Figure 3 shows the acquired real PPG signal, and Fig. 4 illustrates the PPG signal parameters. The MIMIC II online waveform database [18] from PhysioNet serves as the dataset for training, testing, and validation of the ML algorithms. It features fingertip PPG signals collected from 1000 ICU patients, sampled at 125 Hz with 8-bit resolution.

Fig. 3 PPG signal acquisition

Fig. 4 Photoplethysmogram details

2.2 Signal Smoothing, Data Preprocessing and Heart Rate Calculation

Noise in the PPG dataset was reduced using a Savitzky–Golay filter with a polynomial order of 3 and a window size of 51 samples, which fits a polynomial to adjacent data points to estimate each smoothed value. Outlier data points in the filtered PPG were removed through amplitude comparison. Principal Component Analysis was then employed to extract the 43 dominant PPG features from the signal peaks, reducing the feature set size.
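The smoothing step can be illustrated with a minimal pure-Python sketch. It assumes the closed-form Savitzky–Golay weights for a quadratic/cubic fit and simple edge handling by repeating the end samples; the function names and the synthetic test signal are hypothetical and are not taken from the authors' implementation, which may differ (for example in how the signal edges are treated).

```python
import math
import random


def savgol_cubic_coeffs(window):
    """Closed-form Savitzky-Golay smoothing weights for a quadratic/cubic fit."""
    if window % 2 == 0 or window < 5:
        raise ValueError("window must be odd and at least 5")
    m = window // 2
    denom = (2 * m - 1) * (2 * m + 1) * (2 * m + 3)
    return [3 * (3 * m * m + 3 * m - 1 - 5 * k * k) / denom
            for k in range(-m, m + 1)]


def savgol_smooth(signal, window=51):
    """Smooth `signal`; edges are handled by repeating the end samples (assumption)."""
    coeffs = savgol_cubic_coeffs(window)
    m = window // 2
    padded = [signal[0]] * m + list(signal) + [signal[-1]] * m
    # Output sample i is the weighted sum of the window centred on padded[i + m].
    return [sum(c * padded[i + j] for j, c in enumerate(coeffs))
            for i in range(len(signal))]


# Synthetic stand-in for a PPG trace: 1.2 Hz sine at 125 Hz plus Gaussian noise.
random.seed(0)
fs = 125
noisy = [math.sin(2 * math.pi * 1.2 * n / fs) + random.gauss(0, 0.1)
         for n in range(4 * fs)]
smoothed = savgol_smooth(noisy, window=51)
```

Because the weights come from a cubic least-squares fit, any cubic trend passes through the interior of the filter unchanged, while uncorrelated noise is averaged down.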

2.2.1 Principal Component Analysis Algorithm

Principal Component Analysis (PCA) is used to extract an optimal feature set from the PPG signal. It is primarily a dimensionality reduction technique that transforms a high-dimensional feature set into a lower-dimensional space while preserving the most significant components of the data set. The algorithm steps are given in Table 1.

Table 1 Algorithm of Principal Component Analysis (PCA) for dimensionality reduction
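As an illustration of PCA-based dimensionality reduction, the following sketch follows the textbook procedure (centre the data, form the covariance matrix, take the leading eigenvectors, project). It is an assumption that this matches the steps of Table 1; the function name `pca_reduce` and the random stand-in data are hypothetical, with only the sizes 123 and 43 taken from the paper.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # step 1: centre each feature
    cov = np.cov(Xc, rowvar=False)           # step 2: feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # step 3: eigen-decomposition (ascending)
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]           # step 4: keep the leading directions
    explained = eigvals[order] / eigvals.sum()
    return Xc @ components, components, explained  # step 5: project


# Synthetic stand-in: 500 beats with 123 statistical features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 123))
Z, W, ratio = pca_reduce(X, n_components=43)
```

Here `Z` holds the reduced feature set (one row per sample), and `ratio` gives the fraction of total variance carried by each retained component, which can help judge whether 43 components are sufficient for a given dataset.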

2.3 Machine Learning Algorithm for Prediction of Blood Pressure

This section discusses the optimization of machine learning algorithms for blood pressure prediction. Five distinct algorithms were evaluated: Decision Tree Regressor, AdaBoost Regressor, Support Vector Regressor (SVR), hyperparameter-tuned SVR, and Random Forest. Among these, SVR demonstrated the highest accuracy.

2.3.1 Decision Tree Regressor

Decision Tree is a versatile supervised learning method, applicable for both classification and regression tasks. It analyzes dataset features to provide solutions, resembling a tree structure with branches originating from a root node.

2.3.2 Support Vector Regressor

Support Vector Regressor (SVR) is the regression variant of the Support Vector Machine (SVM), operating in a multidimensional feature space. SVR uses a margin of tolerance (ε) within which errors are not penalized, while keeping the model coefficients small. It forms an optimization problem with a convex ε-insensitive loss function and seeks the flattest tube that encloses the training cases; the solution is represented by support vectors.
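For reference, the standard textbook form of the linear ε-SVR primal problem is sketched below. The symbols w (weights), b (bias), ξ and ξ* (slack variables), ε (tube half-width), and C (regularization parameter) are introduced here for illustration; this is the general formulation, not a statement of the specific configuration used in this work.

$$\min_{w,\,b,\,\xi,\,\xi^{*}}\ \frac{1}{2}{\left\| w \right\|}^{2} + C\sum\nolimits_{i = 1}^{N} \left( \xi_i + \xi_i^{*} \right)$$

$$\text{subject to}\quad y_i - w^{\top}x_i - b \le \varepsilon + \xi_i,\quad w^{\top}x_i + b - y_i \le \varepsilon + \xi_i^{*},\quad \xi_i,\ \xi_i^{*} \ge 0$$

Minimizing the norm of w gives the "flattest" function, while C controls how strongly points falling outside the ε-tube are penalized.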

2.3.3 Hyper Parameters of Support Vector Regression

Tuning hyperparameters such as the regularization parameter (C) and the kernel type enhances the performance and flexibility of the SVR model for regression tasks.

2.3.4 Adaboost

Adaptive Boosting (AdaBoost) is an ensemble boosting algorithm in machine learning that combines weak learners into an aggregated model to improve accuracy.

2.3.5 Random Forest

Random Forest, a popular ML algorithm, combines multiple decision trees to enhance forecasting accuracy by averaging results, thus improving overall model performance.

2.4 Hardware Implementation of Proposed System

The hardware model, depicted in Fig. 5, comprises a pulse sensor, an Arduino embedded board, a Raspberry Pi board, and an LCD display. Using a non-invasive optical technique [13], the pulse sensor detects the PPG signal by illuminating the finger with infrared light and measuring light intensity variations with a photodetector. The Arduino UNO, built around the ATmega328P microcontroller, offers 14 digital I/O pins (6 of which support PWM output), 6 analog inputs, USB connectivity, and a reset button. The Raspberry Pi 3, a Linux-based single-board computer, powers the system.

Fig. 5 Implementation of the proposed model in hardware

3 Mathematical Background for Heart Rate Calculation, BP Prediction and Performance Analysis

This section provides an overview of heart rate calculation, BP prediction, and the parameters used to analyze the performance of the ML algorithms.

3.1 Heart Rate Calculation, BP Prediction

The implemented system shown in Fig. 5 is used for HR calculation and BP prediction from the acquired PPG signal. Peak values of the PPG signal were determined by identifying local maxima. Heart rate was calculated using the formula given in Eq. (1).

$${\rm{Heart\,Rate}} = 60 \times {{{\rm{Sampling\,Rate}}} \over {{\rm{Consecutive\,difference\,between\,peaks}}}}$$
(1)
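Eq. (1) can be illustrated with a short sketch that finds local maxima and converts the mean peak-to-peak spacing (in samples) into beats per minute. The function names and the synthetic 1.2 Hz test signal are hypothetical; a real PPG trace also contains diastolic peaks and noise, so a practical detector would likely need a minimum peak distance or amplitude threshold, which is omitted here.

```python
import math


def find_peaks(signal):
    """Indices of local maxima (greater than the left sample, not less than the right)."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i - 1] < signal[i] >= signal[i + 1]]


def heart_rate_bpm(signal, sampling_rate):
    """Heart rate from Eq. (1), using the mean spacing between consecutive peaks."""
    peaks = find_peaks(signal)
    if len(peaks) < 2:
        raise ValueError("need at least two peaks to compute a heart rate")
    intervals = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_interval = sum(intervals) / len(intervals)  # samples per beat
    return 60 * sampling_rate / mean_interval


# Synthetic stand-in: a clean 1.2 Hz pulse (about 72 beats per minute) at 125 Hz.
fs = 125
ppg = [math.sin(2 * math.pi * 1.2 * n / fs) for n in range(10 * fs)]
hr = heart_rate_bpm(ppg, fs)
```

For this idealized signal, `hr` comes out close to 72 beats per minute, matching the 1.2 Hz pulse frequency.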

The machine learning regressor predicts the mean arterial blood pressure (ABP or MBP) from the PPG signal. Arterial BP is the pressure exerted on the vessel walls as blood flows through the blood vessels. Each heartbeat causes the arterial BP to vary between systolic BP (SBP), a rise in pressure induced by the systolic contraction of the left ventricle, and diastolic BP (DBP), a drop in arterial pressure occurring during the diastolic rest of the heart between two contractions. Systolic Blood Pressure and Diastolic Blood Pressure are calculated using Eq. (2) and Eq. (3) with respect to the PPG features in Fig. 6.

Fig. 6 SBP/DBP and MBP

$$ DBP = MBP - 0.67 \times PP_{0}{\left( \frac{PTT_{0}}{PTT} \right)}^{2}$$
(2)

Systolic Blood Pressure can be calculated from the following formula [17]:

$$ SBP = DBP + PP_{0}{\left( \frac{PTT_{0}}{PTT} \right)}^{2}$$
(3)

Here, PTT is the pulse transit time (the time taken for each pulse to travel), PTT0 is the calibrated value of PTT, and PP0 is the difference between the systolic peak and the diastolic peak. To limit movement artifacts as much as possible, it is critical to match the photoplethysmographic signal to the instrumentation circuits. Because the PPG signal has a low amplitude, noise has a significant impact on its quality and readability; the signal is affected by numerous noise sources such as the surroundings, the patient's condition, breathing, and movement.
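Eqs. (2) and (3) can be turned into a small worked example. The sketch below applies the two equations exactly as written; the function name and the input values (an MBP of 93 mmHg, a calibrated pulse pressure PP0 of 40 mmHg, and PTT values in seconds) are hypothetical and chosen only to show the arithmetic.

```python
def estimate_sbp_dbp(mbp, ptt, ptt0, pp0):
    """Return (SBP, DBP) from Eqs. (2) and (3) for a predicted mean BP."""
    if ptt <= 0:
        raise ValueError("ptt must be positive")
    scaled_pp = pp0 * (ptt0 / ptt) ** 2   # PP0 * (PTT0 / PTT)^2
    dbp = mbp - 0.67 * scaled_pp          # Eq. (2)
    sbp = dbp + scaled_pp                 # Eq. (3)
    return sbp, dbp


# Hypothetical inputs: PTT equal to its calibrated value, so the ratio is 1.
sbp, dbp = estimate_sbp_dbp(mbp=93.0, ptt=0.25, ptt0=0.25, pp0=40.0)

# A shorter transit time (PTT < PTT0) scales the pulse pressure up.
sbp_fast, dbp_fast = estimate_sbp_dbp(mbp=93.0, ptt=0.20, ptt0=0.25, pp0=40.0)
```

With PTT equal to PTT0, the example gives a DBP of about 66.2 mmHg and an SBP of about 106.2 mmHg. With the shorter PTT, the gap between SBP and DBP widens to 62.5 mmHg.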

3.2 Performance Analysis

The proposed method was evaluated using the following performance metrics: accuracy of SBP and DBP prediction, root mean square error, mean absolute error, R-squared score, and explained variance score. These determine the performance of the selected regression algorithm and are calculated as follows.

(i) Accuracy:

$${\rm{ACC}} = {{{\rm{TP}} + {\rm{TN}}} \over {{\rm{TP}} + {\rm{TN}} + {\rm{FP}} + {\rm{FN}}}}$$
(4)

where TN = True Negative, TP = True Positive, FN = False Negative, and FP = False Positive.

(ii) Root Mean Square Error (RMSE):

$${\rm{RMSE}} = \sqrt{\frac{\sum\nolimits_{i = 1}^{N} {\left( {\rm{Predicted}}_i - {\rm{Actual}}_i \right)}^{2}}{N}}$$
(5)
(iii) Mean Absolute Error (MAE):

$${\rm{MAE}} = \frac{1}{N}\sum\nolimits_{i = 1}^{N} \left| {\rm{Predicted}}_i - {\rm{Actual}}_i \right|$$

(iv) R-Squared Score:

$${\rm{R}}^{2} = 1 - {{{\rm{RSS}}} \over {{\rm{TSS}}}}$$
(6)

where R2 = coefficient of determination, RSS = residual sum of squares, and TSS = total sum of squares.

(v) Explained Variance Score:

$${\rm{Explained\,Variance}} = 1 - \frac{{\rm{Var}}\left( y - y_{\rm{predicted}} \right)}{{\rm{Var}}(y)}$$
(7)

where y = actual output and y predicted = value predicted by the machine learning algorithm.
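The regression metrics above can be computed directly from paired actual and predicted values, as in the following sketch. The function names and the four sample readings are hypothetical. MAE is taken here as the mean of the absolute errors (the standard definition), and the variances in Eq. (7) are population variances.

```python
import math


def rmse(actual, predicted):
    """Root mean square error, Eq. (5)."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def r2_score(actual, predicted):
    """Coefficient of determination, Eq. (6): 1 - RSS / TSS."""
    mean_y = sum(actual) / len(actual)
    rss = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    tss = sum((a - mean_y) ** 2 for a in actual)
    return 1 - rss / tss


def _variance(values):
    """Population variance of a list of numbers."""
    mean_v = sum(values) / len(values)
    return sum((v - mean_v) ** 2 for v in values) / len(values)


def explained_variance(actual, predicted):
    """Explained variance score, Eq. (7)."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return 1 - _variance(residuals) / _variance(actual)


# Hypothetical readings in mmHg, for illustration only.
actual = [120, 80, 110, 90]
predicted = [118, 82, 113, 89]
scores = {
    "RMSE": rmse(actual, predicted),
    "MAE": mae(actual, predicted),
    "R2": r2_score(actual, predicted),
    "EV": explained_variance(actual, predicted),
}
```

For these sample values the RMSE is about 2.12 mmHg and the MAE is 2.0 mmHg. R-squared (0.982) and explained variance (0.983) differ slightly because the latter ignores the mean offset of the residuals.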

4 Results and Discussion

The MIMIC data set was preprocessed using a Savitzky–Golay (Savgol) filter for noise removal, with a filter order of 3 and a frame length of 51. The raw and filtered PPG signals are shown in Fig. 7(a) and (b).

Fig. 7 (a) Raw PPG signal in time domain. (b) Savgol filtered signal in time domain

Dimensionality reduction was achieved through the PCA algorithm. In total, 43 features were extracted from the 123 statistical features.

The proposed optimized model predicts Heart Rate and Blood Pressure, compared with real-time data for 10 subjects (Table 2). Figure 8 illustrates real-time heart rate monitoring from fingertip using a pulse sensor, alongside a standard wristwatch comparison, with results in Table 3. The hardware system is remotely accessible via mobile, tablet, or laptop using Virtual Network Computing (VNC) viewer.

Table 2 Comparison of actual and predicted values of BP using Support Vector Regressor for 10 different subjects
Table 3 Heart rate comparison
Fig. 8 Pulse sensor vs. smart watch

Five types of regression algorithms were tested to predict SBP and DBP values. The prediction accuracy of the algorithms is listed in Table 2. In this analysis, the Support Vector Regression algorithm achieved higher accuracy than the other algorithms. The accuracy of the various ML algorithms for BP prediction is provided in Table 4. The performance metrics of the regression algorithms in predicting SBP and DBP values are shown in Fig. 9.

Table 4 Accuracy of various ML algorithms for BP prediction
Fig. 9 Accuracy comparison of ML algorithms

Fig. 10 Performance metrics of regression algorithms

The statistical parameter analysis of the various machine learning algorithms is given in Fig. 10. Based on its statistical performance, SVR is selected for BP prediction because it is lightweight and easy to implement in hardware.

5 Conclusion and Future Extensions

In our proposed system, we utilized a pulse sensor for Blood Pressure (BP) and Heart rate monitoring and prediction. Given the significance of heart rate and BP variations in ICU health assessment, we aimed for precise data collection while prioritizing patient comfort. By exclusively utilizing a Photoplethysmography (PPG) dataset and applying a Savgol filter for noise reduction, we prepared the data for ML algorithms. To address BP estimation complexities, we employed Principal Component Analysis (PCA) for feature extraction, yielding 43 features with high variance. The SVR algorithm accurately predicted SBP and DBP values. Future extensions may include refining the wireless API for enhanced accessibility and exploring advanced ML techniques for even greater accuracy.