Introduction

According to the World Health Organization (WHO), an estimated 17.9 million deaths occur globally each year from cardiovascular disease (CVD), accounting for 31% of all deaths worldwide [1, 2]. The United Nations’ Sustainable Development Goals (SDGs) include a target to reduce premature mortality from non-communicable diseases, including cardiovascular diseases, by one-third by 2030. CVD includes cerebrovascular disease, rheumatic heart disease, coronary heart disease, and other conditions related to the heart and blood vessels [1].

Photoplethysmography (PPG) is a non-invasive optical technique that measures variations in the intensity of transmitted and reflected light [3]. It is a low-cost method, and the resulting signal has several components, such as blood vessel wall movement, blood volume, and blood flow in arteries, which are associated with cardiac activity [4, 5]. PPG has many applications in healthcare. Many studies have claimed that PPG has potential for the screening, monitoring, and diagnosis of respiratory and cardiovascular diseases, neurological disorders, and other conditions. PPG is one of the best options for developing effective and affordable tools for real-time monitoring, screening, and diagnosis [4, 6]. Physiological measurements and estimations play crucial roles in healthcare, research, and personal health management. They help doctors make decisions, track treatment progress, and identify abnormalities or deviations from normal physiological functioning.

A comparative analysis of PPG publications over the past 23 years was conducted. The data were obtained from the University of Toronto libraries, accessed on 06 June 2023. We searched for “photoplethysmography” and recorded the number of articles per publication year, with the search set to “everything,” the search scope set to “All libraries,” and the language set to “Any language.” Two different filters were applied to determine the number of published articles. For filter 1, we selected the search filters “Any Field” and “contains” with the search term “photoplethysmography.” As shown in Fig. 1, the number of published papers has increased significantly over the years, and an exponential increase has been observed in the last 10 years. For filter 2, we selected the search filters “Title” and “contains exact phrase.” As shown in Fig. 2, the number of articles again increased significantly, but the total number of articles was lower than that in Fig. 1.

Fig. 1 Published articles trend on the University of Toronto libraries with filter 1 (“Any Field” and “contains”)

Fig. 2 Published articles trend on the University of Toronto libraries with filter 2 (“Title” and “contains exact phrase”)

Figure 3 shows a comparative analysis of PPG publications over the past 23 years based on PubMed, a well-known and trusted database. PubMed is maintained by the National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM), which is located at the National Institutes of Health (NIH) in the USA. We searched for “photoplethysmography” in the title. Figure 3 clearly shows a significant increase in the number of articles over the years. Moreover, Figs. 1, 2 and 3 show a strong positive correlation between the number of published articles on photoplethysmography and the publication year over the period 2010–2022. Photoplethysmography is popular because of its important applications in the evaluation of cardiac activity, variations in venous blood volume, blood oxygen saturation, blood pressure, heart rate variability, etc.

Fig. 3 Published articles trend on PubMed (NCBI data)

The regulation of blood pressure in the human body is a complex and multivariate physiological process; therefore, PPG-based blood pressure estimation may not be sufficiently precise [7]. In other words, extracting accurate and precise information from the PPG signal is not easy. Numerous studies have been conducted to extract information from PPG signals, and some authors have combined artificial intelligence (AI) with signal processing algorithms. AI models can learn complex patterns and waveform variations, enabling more accurate and efficient feature extraction than traditional signal-processing methods. Machine learning (ML) and deep learning (DL) algorithms are subsets of AI that can be employed to process PPG signals and extract relevant features automatically.

The primary contribution of this article is a comprehensive discussion of the physiological parameters estimated using PPG and AI/ML, and of the healthcare applications built on them, based on a systematic review of recently published scientific literature. Although the objectives of the study are diverse, we place specific emphasis on the following points:

  (a) Reviewing the use of PPG techniques in the field of healthcare.

  (b) Presenting an in-depth exposition of the distinct physiological measurements and estimations obtained using PPG.

  (c) Discussing recent advancements in the use of machine learning techniques for blood pressure measurement via PPG.

  (d) Identifying and discussing the different machine learning models used for hypertension classification based on PPG.

  (e) Outlining prospective research directions in the field of PPG.

Following this introduction and statement of objectives, the remainder of the article is structured into four additional sections. “Research Methodology” describes our research methodology, including the data sources and their identification and search processes, the selection criteria, and the eligibility parameters. “Results” presents the results of the study and discusses in detail the different physiological measurements and estimations obtained from PPG, the machine learning models used for blood pressure estimation, and hypertension classification based on PPG. “Discussion” examines the limitations and challenges inherent in these results. Finally, “Conclusion and Future Research Recommendations” provides concluding remarks and recommendations for future research.

Research Methodology

The Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) is a widely recognized and accepted guideline for reporting systematic reviews and meta-analyses [2, 4]. Therefore, to ensure a valid systematic review, the PRISMA 2020 guidelines were followed in this study [8].

Data Source: Identification and Search

Many scientific databases and search engines meet the needs of our research. These include Web of Science, PubMed, Scopus, ProQuest, Elsevier, IEEEXplore, ScienceDirect, and ACM Digital Library. To identify all relevant articles, the keywords “Photoplethysmography” or “PPG” or “Photoplethysmogram” were used. The flowchart of the systematic review according to PRISMA 2020 is shown in Fig. 4. A total of 1474 records were identified from the databases and registers.

Fig. 4 PRISMA 2020 flowchart illustrating the steps undertaken for the systematic review

Eligibility Criteria

To ensure the inclusion of relevant studies, the following criteria were used.

  (a) Studies had to be published in peer-reviewed journals or conference proceedings.

  (b) Studies had to be published in English.

  (c) The review focused on recent studies, mostly published from 2018 to July 2023.

  (d) Studies that were not available in full text were excluded.

  (e) Studies in which the keyword did not appear in the title or abstract were excluded.

  (f) Studies that were irrelevant to healthcare were excluded.

Selection Process

The abstracts of all included studies were carefully read and analyzed to determine their relevance to our objectives. No automation tool was used in the selection process. The final study selection was performed using the following stepwise criteria:

  (a) Studies had to be original; similar or duplicate studies were excluded.

  (b) Studies that demonstrate the use of photoplethysmography in healthcare were considered.

  (c) Only studies using photoplethysmography on humans were considered.

  (d) Studies of machine learning models used for hypertension classification based on photoplethysmography were considered.

  (e) Studies focusing on blood pressure measurement from photoplethysmography using machine learning were considered.

Results

In this systematic review, a total of 68 studies were included after a careful screening process that followed the elimination of 987 of the initially identified studies. Each of these studies was examined thoroughly and then classified according to its thematic alignment with our research objectives. The following sections present the outcomes of this study.

Physiological Parameters Using Photoplethysmography

Table 1 summarizes the different physiological parameters studied using PPG. Many clinical parameters have been studied using PPG, which can be used in healthcare for diagnosis, monitoring, and screening. Features were extracted from different PPG waveforms, such as image PPG (iPPG), the first derivative of PPG (FDPPG), the velocity waveform of the PPG signal (VPG), the second derivative of PPG (SDPPG), and the acceleration waveform of the PPG signal (APG). We surveyed the recent literature on physiological parameters measured using PPG in healthcare, such as arterial stiffness [9], jugular venous pulse [10], heart rate [11,12,13,14,15], heart rate variability, blood pressure [16,17,18,19], blood glucose [20, 21], venous function [22, 23], oxygen saturation [24,25,26,27], fetal oxygen saturation [28, 29], respiratory rate [6], lipid profiling [30, 31], cardiac output [32, 33], and ankle brachial pressure [10, 34]. Jugular venous pulse is used to detect right atrial and central venous pressure abnormalities in CVD diagnosis [10].

Table 1 Comparative analysis of various physiological parameters studied using photoplethysmography (PPG)

Blood Pressure Estimation Based on PPG Using Machine Learning Models

Table 2 compares different studies on blood pressure estimation based on PPG using machine learning models. It shows the performance of each model for systolic blood pressure (SBP) and diastolic blood pressure (DBP), together with the database or number of subjects used in each study. Different evaluation metrics have been used to assess model performance: mean error (ME), mean absolute error (MAE), mean relative error (MRE), root mean square error (RMSE), standard deviation (STD or SD), and ME ± SD. In certain studies, such as [37] and [38], Pearson’s correlation coefficient (r) is employed as an additional performance criterion.

Table 2 Comparison of blood pressure estimation models used in different studies

Let \({E}_{i}\) be the error between the reference (true) blood pressure \({BP}_{{True}_{i}}\) and the predicted blood pressure \({BP}_{{Pred}_{i}}\) in the \(i\)th observation. The error is then defined as follows:

$${E}_{i}= {BP}_{{True}_{i}}- {BP}_{{Pred}_{i}} ;\quad i=1, 2, \dots , N$$
(1)

The absolute error \(A{E}_{i}\) and relative error \(R{E}_{i}\) in the \(i\)th observation are defined as:

$$A{E}_{i}= \left|{BP}_{{True}_{i}}- {BP}_{{Pred}_{i}}\right| ;\quad i=1, 2, \dots , N$$
(2)
$$R{E}_{i}=\frac{\left|{BP}_{{True}_{i}}- {BP}_{{Pred}_{i}}\right|}{{BP}_{{True}_{i}}}= \frac{A{E}_{i}}{{BP}_{{True}_{i}}} ;\quad i=1, 2, \dots , N$$
(3)

The ME, MAE, MRE, RMSE, and STD are defined as follows:

$$ME= \frac{1}{N}\sum_{i=1}^{N}{E}_{i}$$
(4)
$$MAE= \frac{1}{N}\sum_{i=1}^{N}\left|{E}_{i}\right| = \frac{1}{N}\sum_{i=1}^{N}A{E}_{i}$$
(5)
$$MRE= \frac{1}{N}\sum_{i=1}^{N}R{E}_{i}$$
(6)
$$RMSE=\sqrt{\frac{1}{N}\sum_{i=1}^{N}{\left({E}_{i}\right)}^{2}}$$
(7)
$$\mathrm{STD}\;\mathrm{or}\;\mathrm{SD}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}{\left({E}_{i}-ME\right)}^{2}}$$
(8)

where N is the total number of observations (test segments or samples). The above performance criteria were used to measure systolic blood pressure (SBP) as well as diastolic blood pressure (DBP).
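The metrics in Eqs. (1)–(8) can be computed directly from paired reference and predicted values. The following is a minimal sketch rather than code from any of the reviewed studies: the function name `bp_error_metrics` and the example readings are hypothetical and serve only to illustrate the definitions.

```python
import math


def bp_error_metrics(bp_true, bp_pred):
    """Compute ME, MAE, MRE, RMSE and SD as defined in Eqs. (1)-(8).

    bp_true and bp_pred are equal-length sequences of reference and
    predicted blood pressure values in mmHg (hypothetical inputs).
    """
    if len(bp_true) != len(bp_pred) or len(bp_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(bp_true)

    errors = [t - p for t, p in zip(bp_true, bp_pred)]            # Eq. (1)
    abs_errors = [abs(e) for e in errors]                          # Eq. (2)
    rel_errors = [ae / t for ae, t in zip(abs_errors, bp_true)]    # Eq. (3)

    me = sum(errors) / n                                           # Eq. (4)
    mae = sum(abs_errors) / n                                      # Eq. (5)
    mre = sum(rel_errors) / n                                      # Eq. (6)
    rmse = math.sqrt(sum(e * e for e in errors) / n)               # Eq. (7)
    # Population form (division by N), matching Eq. (8).
    sd = math.sqrt(sum((e - me) ** 2 for e in errors) / n)         # Eq. (8)

    return {"ME": me, "MAE": mae, "MRE": mre, "RMSE": rmse, "SD": sd}


# Made-up SBP readings in mmHg, for illustration only.
reference = [120.0, 130.0, 110.0, 140.0]
predicted = [118.0, 133.0, 111.0, 136.0]
print(bp_error_metrics(reference, predicted))
```

With these definitions, RMSE² = SD² + ME², so a model with a small SD can still have a large RMSE if its estimates are biased. This is one reason why studies often report ME ± SD alongside MAE.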

Several types of machine learning models can be used for blood pressure estimation using photoplethysmography signals. Linear regression [64], polynomial regression, or a support vector machine (SVM) [60, 61, 63] can be used to establish a relationship between the extracted features from PPG signals and blood pressure values. These models learn a mapping function between the input features and the target output (blood pressure) and can provide continuous blood pressure estimates. Feedforward Neural Networks [40], multilayer perceptron (MLP), autoencoder [45, 47], modified U-net, Cycle Generative Adversarial Network [43], Residual Network (ResNet), and Gaussian Process Regression (GPR) [38] can be used for blood pressure estimation. Convolutional Neural Networks (CNN) [41] and Artificial Neural Networks (ANN) [46] are effective for learning spatial features from PPG signals. They are particularly useful for analyzing the temporal patterns and morphological characteristics of PPG waveforms. Recurrent Neural Networks (RNN) [46], Long Short-Term Memory (LSTM) [42, 45], Bidirectional LSTM (Bi-LSTM) [48, 49], and gated recurrent units (GRU) [48] are suitable for modeling temporal dependencies in PPG signals. Ensemble models such as Gradient Boosting, CatBoost [38], and Adaptive Boosting (AdaBoost) [53, 54, 59] leverage the diversity of different models to obtain more accurate blood pressure estimates using PPG signals. Deep Neural Networks (DNN) [51] can be utilized to build complex models with multiple layers for blood pressure estimation.

Each ML model was applied to one of the well-known databases for BP measurement, such as MIMIC, MIMIC II [66], MIMIC III [67], PPG-BP [68], VitalDB, and Queensland. As shown in Table 2, the studies in [40, 44, 49, 52] used different databases with smaller numbers of subjects. A recent study [39] applied SVR, CatBoost, LightGBM, and XGBoost machine learning models to predict BP. It used PPG waveforms from the MIMIC III database and extracted 38 features from three categories, namely semi-classical signal analysis (SCSA), SDPPG, and PPG. The authors showed that the CatBoost algorithm outperformed the SVR, LightGBM, and XGBoost algorithms, achieving a mean absolute error of 5.37 mmHg with a standard deviation of 5.56 mmHg for SBP and 2.96 mmHg with a standard deviation of 3.13 mmHg for DBP. A previous study [41] reported mean absolute errors of 5.73 mmHg and 3.45 mmHg for SBP and DBP, respectively, using a CNN model. Other studies [42, 49, 52, 56] used both electrocardiogram (ECG) and PPG waveforms to measure SBP and DBP.

Machine Learning Model Used for Hypertension Classification

Hypertension (HT), also known as high blood pressure (high BP), is the primary risk factor for CVD. In 2015, a report by the World Health Organization (WHO) indicated that 1.13 billion people were suffering from hypertension [69]. The 7th report of the US Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure (JNC7) categorized the blood pressure level of adults into normotension (NT), prehypertension (PHT), and hypertension (HT) [70]. Many studies have performed three classification trials, namely NT vs. PHT, NT vs. HT, and (NT + PHT) vs. HT. Many assessment criteria are available in the literature, such as specificity, sensitivity, accuracy, F1-score, precision, the receiver operating characteristic (ROC) curve and the area under the curve (AUC), the Matthews correlation coefficient, and Cohen’s kappa coefficient. Most authors used the F1-score as the performance criterion for ML models. Table 3 therefore reports the F1-score achieved by each ML classifier for hypertension classification. The F1-score was calculated using the following formula [71]:

$${F}_{1} Score=2 \times \frac{Precision \times Recall}{Precision +Recall}= \frac{TP}{TP+\frac{1}{2}\left(FP+FN\right)}$$
(9)

where

$$Precision= \frac{TP}{TP+FP}$$
(10)
$$Recall= \frac{TP}{TP+FN}$$
(11)

where TP = true positive, FP = false positive, FN = false negative.
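Equations (9)–(11) can be evaluated from the counts of a confusion matrix. The sketch below is illustrative only: the function names are hypothetical, the counts are made up, and returning 0.0 when a denominator is zero is an assumed convention, not one stated in the reviewed studies.

```python
def precision(tp, fp):
    """Eq. (10): fraction of predicted positives that are true positives."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0  # 0.0 is an assumed convention


def recall(tp, fn):
    """Eq. (11): fraction of actual positives that are detected."""
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0  # 0.0 is an assumed convention


def f1_score(tp, fp, fn):
    """Eq. (9), using the count-based form TP / (TP + (FP + FN) / 2)."""
    denom = tp + 0.5 * (fp + fn)
    return tp / denom if denom > 0 else 0.0


# Made-up counts for a hypothetical NT vs. HT trial.
tp, fp, fn = 30, 10, 5
p, r = precision(tp, fp), recall(tp, fn)
print(p, r, f1_score(tp, fp, fn))
# The harmonic-mean form of Eq. (9) gives the same value:
print(2 * p * r / (p + r))
```

Because the F1-score ignores true negatives, it is less affected than accuracy by an imbalance between the normotensive and hypertensive classes.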

Table 3 Analytical comparison of hypertension classification models used in various studies

Table 3 gives detailed descriptions of the hypertension classification models, such as the database, required signal, and features used by the researchers. A variety of ML classifiers have been proposed for hypertension classification. Some of the widely used algorithms are k-Nearest Neighbors (KNN) [7, 75], Naïve Bayes [74], Logistic Regression [75], Random Forest [74], SVM [72], AdaBoost [75], and Bagged Tree [75]; these make it easy to model a complex problem. We also found that some authors proposed more recent models, such as the gradient-boosting framework LightGBM [70] and the neural network architectures AlexNet [73], DenseNet [73], GoogLeNet [73, 76], and ResNet [73]. The study in [69] used a hybrid classifier known as the Adaptive Neuro-fuzzy Inference System (ANFIS), a combination of ANN and fuzzy-set theory that is widely used in medical diagnostic systems.

Discussion

The standards of the British Hypertension Society (BHS) and the Association for the Advancement of Medical Instrumentation (AAMI) are two performance benchmarks used globally for BP monitoring [52]. As shown in Table 4, the BHS standard assigns grades A, B, C, and D according to different ranges of absolute error and the corresponding proportion of subjects. According to the BHS standard, the model in [47] achieved at least grade B for SBP and DBP predictions.

Table 4 Grade and MAE value with BHS Standard
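The BHS grading can be sketched in code as follows. The thresholds used here are the commonly cited BHS cut-offs, based on the cumulative percentage of absolute errors within 5, 10, and 15 mmHg; they are assumed to correspond to Table 4 and should be checked against it. The function name `bhs_grade` and the example errors are hypothetical.

```python
# Commonly cited BHS cut-offs (assumed to match Table 4): minimum cumulative
# percentage of absolute errors within 5, 10 and 15 mmHg for each grade.
BHS_THRESHOLDS = [
    ("A", (60.0, 85.0, 95.0)),
    ("B", (50.0, 75.0, 90.0)),
    ("C", (40.0, 65.0, 85.0)),
]


def bhs_grade(abs_errors):
    """Return the BHS grade (A-D) for a sequence of absolute errors in mmHg."""
    if len(abs_errors) == 0:
        raise ValueError("abs_errors must be non-empty")
    n = len(abs_errors)
    # Cumulative percentages of errors within 5, 10 and 15 mmHg.
    pct = tuple(
        100.0 * sum(1 for e in abs_errors if e <= limit) / n
        for limit in (5.0, 10.0, 15.0)
    )
    # All three percentages must reach the minimum for a grade to be awarded.
    for grade, minimums in BHS_THRESHOLDS:
        if all(p >= m for p, m in zip(pct, minimums)):
            return grade
    return "D"  # worse than grade C


# Made-up absolute errors in mmHg, for illustration only.
print(bhs_grade([1, 2, 3, 4, 6, 7, 8, 9, 12, 20]))
```

A grade is awarded only when all three cumulative percentages are met, so a model with a low average error can still receive a poor grade if a minority of its estimates are far from the reference.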

As shown in Table 5, the AAMI standard establishes the acceptable range for the mean absolute error (MAE), the standard deviation (SD), and the size of the population (subjects or samples).

Table 5 AAMI International Standard Range

The performance of the SBP and DBP estimation models was evaluated using MAE ± SD. As indicated in Table 5, the AAMI standard requires the MAE and SD to be less than 5 mmHg and 8 mmHg, respectively. As shown in Table 2, many ML models, such as those in [39, 40, 53, 60, 61, 63], do not fulfill the AAMI standard for SBP estimation. Moreover, as shown in Table 1, many studies were performed with fewer than 15 subjects, which is insufficient because the AAMI standard requires at least 85 subjects.
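The AAMI check described above reduces to a simple threshold test. The sketch below takes the limits as stated in this section (MAE below 5 mmHg, SD below 8 mmHg, at least 85 subjects). The function name `meets_aami` is hypothetical, and the subject-count check is skipped when the number of subjects is unknown. The example inputs are the CatBoost results of [39] reported in the Results section.

```python
def meets_aami(mae, sd, n_subjects=None,
               max_mae=5.0, max_sd=8.0, min_subjects=85):
    """Check MAE and SD (mmHg), and optionally cohort size, against AAMI limits.

    The limits are those stated in the text; the strict inequalities follow
    its wording ("less than"). n_subjects=None skips the cohort-size check.
    """
    error_ok = mae < max_mae and sd < max_sd
    size_ok = True if n_subjects is None else n_subjects >= min_subjects
    return error_ok and size_ok


# CatBoost results from [39]: MAE and SD in mmHg.
print("SBP:", meets_aami(5.37, 5.56))  # MAE exceeds 5 mmHg
print("DBP:", meets_aami(2.96, 3.13))  # both values are within the limits
```

Under these limits, the same model can satisfy the standard for DBP while failing it for SBP, which is consistent with the observation that SBP is the harder target in many of the studies in Table 2.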

A recent study [70] achieved a higher F1-score for hypertension classification with the LightGBM classifier when using 189 features from PPG, 200 features from VPG, and 190 features from APG. Another study [73] applied the deep learning architectures AlexNet, ResNet, and GoogLeNet, based on the Hilbert–Huang Transform (HHT) method, to predict the hypertension level and achieved higher F1-scores with AlexNet than with ResNet and GoogLeNet. The models were applied to the MIMIC dataset using PPG, VPG, and APG features. In another study [7], a KNN model was proposed to classify BP using the PPG–BP figshare data. The PPG–BP figshare database [68] was collected from 219 subjects, of whom 121 were used; these were divided into normotensive (46 subjects), pre-hypertensive (41 subjects), and hypertensive (34 subjects) groups. The F1-scores for the three classification trials, NT vs. PHT, NT vs. HT, and (NT + PHT) vs. HT, were 100%, 100%, and 90.90%, respectively. The authors showed that their KNN model is superior to the models proposed in [75] (KNN, AdaBoost, Bagged Tree, Logistic Regression), although Liang et al. 2018a used the MIMIC database with PPG and pulse arrival time (PAT) features. A disadvantage of PPG feature extraction is that it requires a PPG waveform of very high quality [72]. Moreover, calculating PAT features from PPG and ECG signals is complicated because it requires stable, high-quality, synchronized waveforms.

Conclusion and Future Research Recommendations

This study offers a systematic review of physiological measurements and estimations obtained using photoplethysmography (PPG), which have significant clinical relevance in healthcare domains such as diagnosis, monitoring, and screening. It covers the capabilities and constraints of PPG through a systematic analysis of diverse healthcare research and of advances in machine learning methods for estimation and classification. The study thus provides researchers and clinicians with the knowledge required to make informed decisions regarding the use of PPG. The results show that PPG holds potential for healthcare diagnosis, monitoring, and screening, particularly when combined with machine learning (ML) or deep learning (DL) algorithms to achieve higher accuracy. However, certain gaps in the existing literature remain evident and should be addressed by future studies. According to our findings, many studies have compared the accuracy of ML models without considering the database and features used by each model. To achieve higher accuracy with an ML/DL model, the data should be better aligned, more accurate, and more precise, which is a significant challenge for PPG. Advanced models, such as LightGBM and the deep learning architectures AlexNet, DenseNet, GoogLeNet, and ResNet, have the potential to yield superior accuracy. However, they require extensive training time, greater processing capabilities, and more resources. Consequently, these models can substantially increase the computational complexity of the system [39].

Based on the literature review, we propose the following recommendations for solving clinical problems using PPG: (i) early and precise detection and prediction: investigate the potential of PPG-derived features and machine learning algorithms for early detection and prediction of cardiovascular diseases, such as heart failure, hypertension, and arrhythmias; (ii) multi-modal analysis: explore the integration of PPG with other physiological signals, such as electrocardiography (ECG) and accelerometer data, to develop a comprehensive and accurate cardiovascular health monitoring system; (iii) novel biomarkers: identify novel PPG-based biomarkers that can provide insights into cardiovascular health, such as arterial stiffness, pulse wave velocity, and cardiac output, and correlate these biomarkers with disease progression; (iv) personalized risk assessment: develop a personalized risk assessment model that utilizes PPG data along with patient-specific information (age, sex, medical history) to estimate an individual’s risk of developing cardiovascular diseases; (v) ambulatory monitoring: design wearable PPG devices for continuous ambulatory monitoring, allowing for real-time tracking of cardiovascular parameters during daily activities and sleep, which could aid in understanding disease patterns; (vi) stress and emotional analysis: explore the relationship between PPG signals and stress levels or emotional states, as chronic stress is a risk factor for cardiovascular diseases, and develop algorithms to detect stress-induced changes in PPG patterns; and (vii) large-scale data analysis: conduct large-scale data analysis using PPG data from diverse populations to uncover potential disparities and variations in cardiovascular health and disease outcomes.

While PPG is a valuable and convenient tool for physiological measurement and estimation, it may not be as accurate or reliable as more direct and invasive measurement methods in certain clinical scenarios. Factors such as motion artifacts, ambient light interference, skin pigmentation, and device calibration can affect the accuracy of PPG measurements. Therefore, proper calibration and consideration of these factors are essential for obtaining meaningful and reliable physiological data from PPG. The monitoring of mental health based on PPG technology is an exciting field, but closer collaboration between software engineers, sensor manufacturers, and medical practitioners is needed to give it a jumpstart [4].