1 Introduction

Monitoring cardiovascular health for early diagnosis is one of the leading research areas. Heart rate analysis is a prominent approach to assessing cardiovascular health status during daily routine [1]. Owing to its simplicity, accuracy, and low cost, photoplethysmography (PPG) is gaining importance as an alternative approach to monitoring and studying vital body signs. PPG uses optical sensors and is popular because the hardware is lightweight, simple, and fashionable and, more importantly, can be built into wearable devices such as smart fitness bands [2]. Generally, abnormalities in the functionality of the heart are identified using heart rate and the percentage of oxygen in the blood. PPG was initially used in pulse oximetry to monitor oxygen levels in the blood. Owing to its noninvasive nature, PPG-based pulse oximetry has now become a standard of care in the operating theatre and the intensive care unit [3], where it allows body vitals to be observed both qualitatively and quantitatively. PPG is a noninvasive tool that can continuously monitor heart rate, respiratory rate, cardiac output, and blood pressure.

Although PPG has many advantages, its major drawback is erroneous data in certain circumstances, mainly caused by motion artifacts. Hence, the accuracy of PPG depends on how well this noise is suppressed [4].

1.1 Principle of PPG

PPG measures the change in blood volume in the tissue during a heart cycle using a light source. This volumetric measurement provides important information about the cardiovascular system. A PPG sensor consists mainly of two electronic components: a light emitter and a light-intensity sensing component. Typically, an LED is used as the light emitter and a photodetector senses the change in light intensity [5]. A PPG pulse corresponding to one heartbeat includes a systolic and a diastolic phase. During the systolic phase, the heart contracts and pushes oxygen-rich blood to the tissues and organs, so the volume of blood in the arteries is higher. More light is therefore absorbed by the blood, and the amount of light detected by the photodetector is low. During the diastolic phase, blood flows back into the heart, the blood volume in the tissue decreases, and the light detected by the photodetector increases. Depending on the application and sensor placement, PPG can be used in either transmissive mode or reflection mode, as shown in Fig. 1 [6].

Fig. 1 Placement of sensor in transmissive mode PPG (left) and reflection mode PPG (right)

When the photodetector and LED are placed on opposite sides of a finger to detect the transmitted light, the arrangement is known as transmissive mode. In this mode, the probe is arranged so that the photodetector and LED face each other with a layer of tissue between them [7]. Detection depends on light passing through the body part, so thin structures such as the earlobe and finger are preferred. When the photodetector and LED are placed on the same side of a finger to detect the reflected light, the arrangement is known as reflection mode. Here the two components are placed next to each other with an approximate spacing of 3 cm, so reflection mode can be used at any body site, such as the forehead or wrist. The choice of site for the PPG sensor depends on the patient's blood perfusion, the comfort of the subject, and the application [8].

The role of the photodetector is to detect and quantify the light absorbed during pulsatile and non-pulsatile flow [9]. During pulsatile flow, light is absorbed by the change in blood flow inside the arteries, which is synchronous with a heartbeat. During the non-pulsatile flow, light is absorbed by background tissues. Therefore, a photodetector detects the volumetric change in blood flow in arteries by detecting the light intensity difference [10]. Measurement of this change in light intensity thus helps to analyze the functionality of the heart.

A PPG signal mainly consists of AC and DC components. The AC component of the PPG output waveform reflects the change in light intensity during the systolic and diastolic phases caused by blood in the arteries [11]. The steady DC part of the PPG waveform reflects the light absorbed by tissues, skin, and bone, as shown in Fig. 2. Analysis of the DC component provides valuable information regarding venous blood flow, respiration, and thermoregulation. The variation in detected light intensity due to arterial blood flow, which carries the information on the heart's functionality, is only around 1% [12].

Fig. 2 Variation in light intensity during pulsatile and non-pulsatile flow
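The AC/DC split described above can be illustrated with a minimal sketch. It assumes the DC baseline is estimated with a simple moving average; the function name `split_ac_dc`, the 2-s window, and the synthetic signal are illustrative assumptions and are not taken from the cited works.

```python
import numpy as np

def split_ac_dc(ppg, fs, window_s=2.0):
    """Split a PPG trace into a slowly varying DC baseline and a pulsatile AC part.

    Assumption: a moving average of length `window_s` seconds is used as the
    DC estimate; real devices may use other low-pass filters.
    """
    ppg = np.asarray(ppg, dtype=float)
    win = max(1, int(round(window_s * fs)))
    kernel = np.ones(win) / win
    # Reflect-pad the edges so the moving average does not sag at the ends.
    pad = win // 2
    padded = np.pad(ppg, pad, mode="reflect")
    dc = np.convolve(padded, kernel, mode="same")[pad:pad + ppg.size]
    ac = ppg - dc  # pulsatile component riding on the baseline
    return ac, dc

# Synthetic example: a baseline of 1.0 with a pulsatile swing of about 1%.
fs = 100.0
t = np.arange(0, 10, 1 / fs)
ppg = 1.0 + 0.01 * np.sin(2 * np.pi * 1.2 * t)
ac, dc = split_ac_dc(ppg, fs)
```

In this sketch the ratio of the AC swing to the DC level stays in the range of a few percent, in line with the small pulsatile fraction noted in the text.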

1.2 PPG Analysis Using Multiple Wavelengths

Light absorption during the systolic and diastolic phases of a heart cycle follows Beer's law and Lambert's law, jointly known as the Beer–Lambert law. According to Beer's law, the light absorbed by the blood is proportional to the concentration of oxygenated and deoxygenated hemoglobin. According to Lambert's law, light absorption is proportional to the distance the light travels through the skin [13].

Therefore, according to the Beer–Lambert law, the light absorption (Aλ) of a substance, given by Eq. (1), is directly proportional to the concentration of the light absorber (C), the optical path length traversed by the light (L), and the absorptivity at the particular wavelength (ελ):

$$A_{\lambda} = \varepsilon_{\lambda } CL$$
(1)
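As a small worked illustration of Eq. (1), the sketch below evaluates the absorbance and the corresponding transmitted light fraction. The function names and numerical values are illustrative assumptions, and the decadic convention \(I/I_0 = 10^{-A}\) is assumed for converting absorbance to transmitted intensity.

```python
def absorbance(epsilon, concentration, path_length):
    """Eq. (1): A = epsilon * C * L (units must be mutually consistent)."""
    return epsilon * concentration * path_length

def transmitted_fraction(a):
    """Fraction of incident light that is transmitted, assuming I/I0 = 10**(-A)."""
    return 10.0 ** (-a)

# Illustrative numbers only: doubling the path length doubles the absorbance.
a_short = absorbance(epsilon=2.0, concentration=0.5, path_length=1.0)
a_long = absorbance(epsilon=2.0, concentration=0.5, path_length=2.0)
```

Because the transmitted fraction falls exponentially with absorbance, a modest increase in path length or absorber concentration produces a disproportionately large drop in detected light.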

Body skin mainly consists of three layers, as shown in Fig. 3. Due to absorption, only light waves with a larger wavelength can penetrate through all three layers.

Fig. 3 A schematic representation of the penetration of light through the skin at different wavelengths

Therefore, the measurement mode and the body vitals to be monitored determine the choice of LED. Oxygenated hemoglobin absorbs more light at near-infrared (NIR) wavelengths, while deoxygenated hemoglobin absorbs more light at red wavelengths. Hence, PPG employing NIR and red LEDs and photodetectors is generally used in clinical checkups to calculate the hemoglobin concentration. The effect of motion artifacts on the PPG signal also depends on the wavelength of the light used. Longer-wavelength light such as infrared is affected more by motion artifacts because it penetrates deeper into the tissue.

On the other hand, shorter-wavelength light (green light) penetrates less deeply into the body tissue and is generally less affected by motion artifacts. Thus, to mitigate the effect of motion artifacts and the absorption of light by body tissues, PPG based on multi-wavelength optical sensors has been proposed to detect blood flow variations at different skin depths [14].

The light emitted by the diode is absorbed by tissues, and the photodetector measures the amount of absorption in terms of detected light intensity [15]. When used as a pulse oximeter, PPG uses two LEDs of different wavelengths. One LED emits red light at around 660 nm, at which deoxyhemoglobin absorbs more light than oxyhemoglobin. The other LED emits infrared light at a wavelength of 940 nm, at which oxyhemoglobin absorbs more light than deoxyhemoglobin. Accurate information on blood circulation during a heart cycle is obtained by fixing the LED wavelengths between 660 and 940 nm [16]. Finally, a microprocessor unit analyzes the light absorption at each wavelength to determine the concentrations of oxyhemoglobin and deoxyhemoglobin.
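One common way to combine the red and infrared measurements is the so-called ratio of ratios, sketched below. This step is not spelled out in the text, so it should be read as an assumption about how the microprocessor stage might work. The linear calibration constants are hypothetical placeholders; practical devices rely on empirically fitted, device-specific calibration curves.

```python
def ratio_of_ratios(ac_red, dc_red, ac_ir, dc_ir):
    """Normalized red pulsatile amplitude divided by the normalized infrared one."""
    return (ac_red / dc_red) / (ac_ir / dc_ir)

def spo2_estimate(r, a=110.0, b=25.0):
    """Hypothetical linear calibration SpO2 = a - b * R (percent).

    The constants a and b are illustrative assumptions, not values from the
    cited works.
    """
    return a - b * r

# Illustrative amplitudes only (arbitrary units).
r = ratio_of_ratios(ac_red=0.01, dc_red=1.0, ac_ir=0.02, dc_ir=1.0)
spo2 = spo2_estimate(r)
```

Dividing each AC amplitude by its DC level removes the dependence on LED brightness and tissue thickness, which is why the ratio rather than the raw amplitudes is used.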

The rest of the paper is organized as follows. Estimation of heart rate from PPG is outlined in Sect. 2. Section 3 describes the methodologies proposed to date to remove motion artifacts. Section 4 highlights the datasets available for heart rate estimation using PPG. A literature survey of the algorithms and methods proposed for heart rate identification is presented in Sect. 5. Challenges and discussion are presented in Sect. 6, and Sect. 7 summarizes the work.

2 Heart Rate Estimation Using PPG

Real-time estimation of heart rate using a wearable device is an in-demand application in health care for the early diagnosis of cardiovascular diseases. Heart rate is the average number of heartbeats per minute. The fluctuation in the time interval between successive heartbeats, measured in milliseconds, is called heart rate variability (HRV). Heart rate and HRV are standard markers of health status. In the human body, the behavior of the sympathetic and parasympathetic branches of the autonomic nervous system (ANS) is reflected in the HRV [17]. The sympathetic branch is associated with the active state of the body, and the parasympathetic branch with the resting and digesting state. Depending on day-to-day activities, the brain sends signals through the ANS to the rest of the body, in response to which the body either reacts or stays relaxed. The body normally handles the signals received through the ANS in a balanced way [18]. However, a persistently unhealthy diet, irregular sleep, stress, or inactivity may disturb the balance between the branches of the ANS.

A high HRV means that the ANS is in balance and responding to both sympathetic and parasympathetic inputs. A low HRV indicates that the subject is under stress or fatigue and that the sympathetic branch dominates the parasympathetic branch. High HRV is thus associated with a healthy state, whereas low HRV indicates more stress, which may increase the risk of cardiovascular disease. Consequently, over the last few years HRV analysis has become a valuable tool for the early diagnosis of cardiovascular disease, and both heart rate and HRV are used to measure cardiovascular health status. Heart rate and HRV are determined by passing light through the skin and measuring the volumetric change in blood during a heart cycle. The PPG output waveform shown in Fig. 4 depicts the fluctuation in light absorption during the systolic and diastolic phases of the heart. When the heart contracts, the blood volume in the tissue increases, which increases the amount of hemoglobin in the light path; light absorption therefore increases, and the amount of light detected by the photodetector decreases. In the diastolic phase, the blood volume and hence the amount of hemoglobin decrease. The amount of absorbed light therefore decreases, and the light detected by the photodetector increases. As a result, a pulsatile waveform in response to each cardiac cycle is observed as the PPG waveform [19].

Fig. 4 Different feature points related to the PPG waveform

The volumetric change of blood in tissue is synchronous with the heartbeat and is therefore used to estimate the heart rate. A PPG waveform mainly consists of four points, O–S–N–D. As shown in Fig. 4, the S-point (systolic peak) represents the peak value in a PPG pulse. The peak-to-peak interval (S–S) between consecutive PPG pulses provides the heart rate and correlates closely with the R-R interval in an ECG waveform. Analysis of the pulse interval (O–O) provides information about HRV.

For the estimation of heart rate and HRV using PPG, it is necessary to analyze different properties of the pulsatile PPG waveform, such as the time interval between two consecutive systolic peaks (tS–S), the systolic peak amplitude (Ps), and the diastolic peak amplitude (Pd) [20]. Once tS–S has been measured accurately, the instantaneous heart rate for a single heartbeat, in beats per minute with tS–S expressed in seconds, is calculated using Eq. (2).

$$HR_{i} = \,\frac{60}{{t_{S - S} }}$$
(2)
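A minimal sketch of Eq. (2) follows. It assumes a clean PPG trace in which systolic peaks can be found as local maxima above the signal mean that are separated by a refractory gap; the detector, its 0.3-s gap, and the synthetic signal are illustrative assumptions and would not be robust on motion-corrupted data.

```python
import numpy as np

def systolic_peaks(ppg, fs, min_gap_s=0.3):
    """Indices of local maxima above the signal mean, at least min_gap_s apart."""
    ppg = np.asarray(ppg, dtype=float)
    # A sample is a candidate if it rises above its left neighbour and does
    # not fall below its right neighbour.
    candidates = np.flatnonzero((ppg[1:-1] > ppg[:-2]) & (ppg[1:-1] >= ppg[2:])) + 1
    candidates = candidates[ppg[candidates] > ppg.mean()]
    peaks = []
    for idx in candidates:
        if not peaks or (idx - peaks[-1]) / fs >= min_gap_s:
            peaks.append(idx)
    return np.array(peaks, dtype=int)

def instantaneous_hr(peaks, fs):
    """Eq. (2): HR_i = 60 / t_SS, with t_SS in seconds, giving beats per minute."""
    t_ss = np.diff(peaks) / fs
    return 60.0 / t_ss

# Synthetic pulse train at 1.25 Hz, i.e. one beat every 0.8 s.
fs = 100.0
t = np.arange(0, 10, 1 / fs)
ppg = np.sin(2 * np.pi * 1.25 * t)
hr = instantaneous_hr(systolic_peaks(ppg, fs), fs)
```

Each entry of `hr` corresponds to one S–S interval, so the spread of these values also gives a first impression of beat-to-beat variability.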

For a time window H, the heart rate is calculated by using Eq. (3).

$$HR_{true} = \frac{60H}{{t_{S - S} }}$$
(3)

A PPG waveform recorded from a healthy subject contains three feature points: the systolic peak (S), the diastolic peak (D), and the dicrotic notch (N). However, because the morphology of a PPG wave depends on age, gender, and health status, some of the feature points may be missing from a recorded PPG signal. The accuracy of cardiovascular functionality estimation depends on the accurate analysis of these features. The first and second derivatives of a PPG signal help to identify the PPG feature points [21]. By analyzing the features extracted from these three waveforms, namely the PPG signal, its first derivative, and its second derivative, adequate information related to cardiac function can be obtained [7]. A schematic representation of these three waveforms is shown in Fig. 5. Reliable estimation of the heart rate and HRV is only possible if point S is detected in the PPG signal.

Fig. 5 Schematic representation of a PPG waveform, its First Derivative PPG (FDPPG) and Second Derivative PPG (SDPPG)
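The two derivative waveforms can be approximated numerically, as in the sketch below. It assumes a uniformly sampled and already smoothed PPG signal, since differentiation amplifies noise; the function name `ppg_derivatives` and the test signal are illustrative.

```python
import numpy as np

def ppg_derivatives(ppg, fs):
    """Return finite-difference estimates of the first (FDPPG) and second (SDPPG) derivatives."""
    ppg = np.asarray(ppg, dtype=float)
    dt = 1.0 / fs
    fdppg = np.gradient(ppg, dt)    # central differences in the interior
    sdppg = np.gradient(fdppg, dt)  # derivative of the first derivative
    return fdppg, sdppg

# Check against a signal with known derivatives.
fs = 1000.0
t = np.arange(0, 2, 1 / fs)
x = np.sin(2 * np.pi * t)
fd, sd = ppg_derivatives(x, fs)
```

Zero crossings of the first derivative mark candidate peaks and notches, while the extrema of the second derivative sharpen inflections that are hard to see in the raw waveform.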

In a healthy subject, the subsequent cardiac cycle's morphological structure possesses almost similar properties as its predecessor. A missing feature point indicates a sign of abnormality in cardiac function. To accurately locate feature points in a PPG signal, the derivatives of a PPG signal are used.

Of the first and second derivatives of a PPG signal, the second derivative is more widely used to locate the missing feature points. A PPG waveform, along with its first and second derivatives, is shown in Fig. 5. The normalized amplitude values \(b/a\), \(c/a\), \(d/a\), and \(e/a\) can be used to detect arterial stiffness [22].

In elderly subjects, the normalized amplitude \(b/a\) increases and the other normalized amplitudes decrease. This change in amplitude values is used to measure the subject's cardiovascular age through the aging index (AGI), given by Eq. (4) [23].

$$AGI\, = \,\frac{b - c - d - e}{a}$$
(4)
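Eq. (4) reduces to a one-line computation once the amplitudes a to e have been read off the second-derivative waveform. The sketch below assumes those amplitudes are already available; the numerical values are illustrative only and do not come from any dataset.

```python
def aging_index(a, b, c, d, e):
    """Eq. (4): AGI = (b - c - d - e) / a."""
    return (b - c - d - e) / a

# Illustrative amplitudes only (arbitrary units).
agi = aging_index(a=1.0, b=-0.8, c=-0.2, d=-0.3, e=0.1)
```

Because every term is divided by a, the index is insensitive to the overall signal gain, which makes it comparable across recordings.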

Moreover, the intervals between different peaks of the second derivative of the PPG signal are used to identify subjects with abnormalities [24].

Using the correlation between consecutive heartbeats within a time window, pulse transit time (PTT) and pulse wave velocity (PWV) provide vital information about heart rate and HRV. PTT is defined as the time required by an arterial pulse wave to travel from the aortic valve to the body site illuminated by the optical sensor [20]. With reference to the ECG waveform, PTT is the time interval between the R-wave peak and a chosen feature point on the PPG signal. PWV is the velocity of the pressure wave as blood flows through the arteries, and it is used to assess heart rate and heart rate variability. PWV is inversely related to PTT, as given by Eq. (5). Therefore, PTT and PWV provide a noninvasive means of analyzing cardiac functionality.

$$PWV = \frac{D}{PTT}$$
(5)

Here D is the vessel length traversed by the pressure pulse.
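The relation in Eq. (5) can be sketched as follows, taking PTT as the delay between an ECG R-wave peak and a PPG feature point as described above. The function names, the time stamps, and the vessel length are illustrative assumptions.

```python
def pulse_transit_time(r_peak_time_s, ppg_feature_time_s):
    """PTT: delay from the ECG R-wave peak to the chosen PPG feature point, in seconds."""
    return ppg_feature_time_s - r_peak_time_s

def pulse_wave_velocity(vessel_length_m, ptt_s):
    """Eq. (5): PWV = D / PTT, in metres per second."""
    return vessel_length_m / ptt_s

# Illustrative values only: R peak at 1.00 s, PPG feature at 1.20 s, D = 0.6 m.
ptt = pulse_transit_time(1.00, 1.20)
pwv = pulse_wave_velocity(0.6, ptt)
```

A shorter transit time over the same vessel length gives a higher velocity, which is the inverse relation stated in the text.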

PPG signal analysis is also affected by various noise sources, such as motion artifacts, baseline drift, and ambient light noise caused by variation in sensor position. Of these, motion artifacts have the most significant effect on heart rate analysis because their frequency content lies inside the band that carries the heart rate information. Hence, accurate heart rate peak identification while the PPG sensor is in motion is challenging. For accurate heart rate estimation, the effect of motion artifacts on the PPG signal must be removed. The following section describes the motion artifact reduction techniques proposed to date and their properties.

3 Motion Artifacts Removal Techniques

Accurate and reliable peak detection with wearable PPG sensors for heart rate estimation has become a demanding application in the health care industry. Physical motion during daily activities drastically reduces the accuracy of heart rate identification using a PPG sensor. In this section, several approaches proposed to date to mitigate the effect of motion artifacts from raw PPG signals are summarized.

Physical movement causes the sensor light passing through the body tissue to deviate from its path, which produces erroneous data. The frequency spectrum of motion artifacts lies above 0.1 Hz and usually overlaps the desired spectrum of the heart signal [25]. Hence, motion artifacts are a leading noise source that influences various factors in PPG signal analysis, potentially limiting the use of PPG sensors for studying and monitoring the cardiac system. Thus, the suppression of the noise spectrum in PPG signals is one of the leading research topics in the healthcare industry.

In [26], decomposition-based independent component analysis (ICA) is proposed to suppress the motion artifact components in a PPG spectrum. However, the ICA-based approach provides reliable output only if the noise and the information signal have mutually exclusive spectra. In real-time cardiac health monitoring, this condition is not met, so the ICA approach becomes suboptimal. Another widely used approach to suppress motion artifacts is adaptive filtering. As motion artifact behavior is random, a fixed-coefficient filter is not suitable, so adaptive-filtering-based motion artifact removal was proposed in [25]. However, the performance of adaptive filtering depends on the nature of the reference noise signal [27]. Adaptive filters can provide reliable noise suppression only when the correlation between the reference accelerometer signal and the motion artifact is high, which cannot be guaranteed in real time.
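To make the adaptive filtering idea concrete, the following is a minimal sketch of adaptive noise cancellation with a normalized LMS update. It is a generic textbook formulation, not the specific filter of any cited work. The filter order, step size, and synthetic signals are illustrative assumptions, and the reference is assumed to be well correlated with the artifact, which is exactly the condition the text identifies as hard to guarantee.

```python
import numpy as np

def nlms_cancel(ppg, ref, order=8, mu=0.05, eps=1e-8):
    """Subtract from `ppg` the part that is linearly predictable from `ref`.

    `ref` plays the role of the accelerometer reference; the returned error
    signal is the artifact-reduced PPG estimate.
    """
    ppg = np.asarray(ppg, dtype=float)
    ref = np.asarray(ref, dtype=float)
    w = np.zeros(order)
    cleaned = np.zeros_like(ppg)
    for n in range(ppg.size):
        # Most recent `order` reference samples, newest first, zero-padded at the start.
        x = ref[max(0, n - order + 1):n + 1][::-1]
        x = np.pad(x, (0, order - x.size))
        e = ppg[n] - w @ x               # PPG minus the current artifact estimate
        w += mu * e * x / (eps + x @ x)  # normalized LMS weight update
        cleaned[n] = e
    return cleaned

# Synthetic test: 1.2 Hz cardiac component plus a 2.5 Hz motion component.
fs = 50.0
t = np.arange(0, 40, 1 / fs)
cardiac = np.sin(2 * np.pi * 1.2 * t)
motion_ref = np.sin(2 * np.pi * 2.5 * t)
ppg = cardiac + 0.8 * np.sin(2 * np.pi * 2.5 * t + 0.7)
cleaned = nlms_cancel(ppg, motion_ref)
```

The step size trades convergence speed against distortion of the cardiac component: a larger `mu` adapts faster but widens the band around the motion frequency that the filter removes.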

Moreover, the high computational complexity of adaptive filters limits their use in wearable PPG devices. To make the system computationally efficient and to remove the need for an additional accelerometer, a deep learning convolutional neural network (CNN) has been proposed to detect noise in a PPG signal. The CNN-based PPG signal classification proposed in [25] uses a 1-D CNN and lets the user select any PPG segment of 5-s duration in which to detect motion artifacts [28]. A CNN extracts features automatically during classification, which reduces the need for threshold setting and segmentation. The correlation between PPG signals from the left and right hands was used to detect motion artifacts without an additional accelerometer sensor [29]. Since the PPG signal is nonlinear and varies between subjects, the work in [28] uses an artificial neural network to analyze PPG signal characteristics and detect motion artifacts, and an ANFIS-based algorithm to retrieve the part of the PPG signal lost to noise. In [30], a neural-network-based classification method was proposed to detect the PPG signal accurately.

Based on the penetration depth of different light wavelengths, another approach to removing motion artifacts without accelerometer sensors was proposed in [31]. A shorter-wavelength green light source is used to estimate heart rate, and a longer-wavelength infrared light source provides the reference noise signal. Light sources with different wavelengths also detect noise that arises from micro-motions. In [29], to reduce the computational complexity, a multi-sensor method with multiple wavelengths is proposed that analyzes only the affected frame instead of the whole PPG signal. Because a PPG signal is pulsatile, the most pulsatile signal is used to extract a clean PPG signal: red, green, and infrared light have different penetration depths, and ICA is used to extract the pulsatile component. A method based on the fusion of signals from multiple sensors was proposed in [32] to remove motion artifacts from the PPG signal. The method in [33] extracts the reference signal from the PPG signal itself, thus reducing the hardware cost.

Most of the proposed motion artifact removal methods deal with simple exercise or limited physical movement. To remove strong motion artifacts, approaches based on discrete wavelet decomposition and thresholding have been proposed to remove the noise spectrum from the PPG signal [34]. An empirical mode decomposition (EMD) approach was implemented to extract the correct PPG segment from the corrupted PPG signal. A modified nonlinear approach named ensemble empirical mode decomposition (EEMD) was proposed in [35] to reduce motion artifacts in the PPG signal and to resolve the mode-mixing problems that arise during time–frequency distribution. In the EEMD method, reference noise is added to decompose the given PPG signal into intrinsic mode functions (IMFs), without any prerequisite selection criterion on window width.

Principal component analysis (PCA) was combined with the EEMD method for accurate extraction of vital sign information from the PPG signal. Motion artifact removal techniques are generally based on either time-domain or frequency-domain analysis, each of which has inherent limitations. A time–frequency based approach was therefore proposed in [35]. However, time–frequency based approaches fail to provide reliable results when the motion noise is periodic and strong, in which case extracting a clean PPG signal becomes very difficult. Accurate and reliable motion artifact removal for vital sign analysis therefore remains an important research topic.

4 PPG Database

Several data sets are publicly available for testing proposed algorithms. Table 1 lists the publicly available databases recorded with PPG-enabled wrist-worn devices. One of the most widely used is the IEEE Signal Processing Cup (SPC) 2015 dataset, which was first used in [36]. It consists of recordings from 23 subjects. The first 12 subjects performed simple physical exercises such as walking (IEEE SPC-12 Training), and subjects 13–23 performed arm exercises to introduce motion noise (IEEE SPC-11 Testing). Two PPG signals and three-axis accelerometer signals were recorded at the wrist. To test the efficacy of the work, the IEEE SPC dataset also includes ECG signals recorded while the subject is at rest. A more recent publicly available dataset is the PPG dataset for heart rate estimation in daily life activities (PPG DaLiA) [37], which was introduced to overcome the limitation of the low physical activity in the IEEE SPC recordings. In PPG DaLiA, fifteen subjects performed physical activities similar to daily activities. The dataset is specifically designed for heart rate estimation in the presence of motion noise. Despite this real-life activity coverage, PPG DaLiA has limited information on the age group.

Table 1 Summary of publicly available PPG databases

The limitations that accelerometer-only recordings impose on the accuracy of a PPG data set are addressed by adding a gyroscope to the accelerometers in the PPG signal recorder. During the signal recordings, the subjects performed physical exercise such as walking and running on a treadmill [39].

5 Literature Survey Based on Heart Rate Estimation

Accurate estimation of heart rate is essential to detect any abnormalities in body function. The reliability of heart rate estimation is always affected due to the presence of motion artifacts. Therefore denoising motion artifacts and correct heart rate estimation in realtime are current research areas while designing smart wearable healthcare devices. This motivates researchers to develop and implement a faster and reliable way to identify the correct heart rate during physical activities. The majority of the proposed work to date related to heart rate detection follows a four-step approach, as shown in Fig. 6.

Fig. 6 Flowchart indicating the four main stages in heart rate estimation

Input to the preprocessing stage consists of sensor information like accelerometer, PPG, and gyroscope [39]. The role of the preprocessing stage is to filter out undesired frequency spectrum (out of the desired window) by using bandpass filters. For reliable and correct estimation of heart rate, the role of the denoising stage is crucial. Using a reference noise signal (output of the accelerometer sensor) while recording a PPG signal helps the denoising algorithm remove the noise spectrum from the information signal. After removing motion artifacts, by identifying the correct peak, the heart rate is estimated in stage-3. A post-processing stage known as the heart rate tracking stage is used to provide exact information. The algorithms proposed to date showed a tradeoff between complexity and accuracy.
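The preprocessing and peak-identification stages can be illustrated with a simple spectral sketch. It restricts the spectrum to an assumed heart rate band and picks the strongest peak; the 0.5 to 3.5 Hz band (30 to 210 BPM), the Hann window, and the synthetic signal are assumptions. The denoising and tracking stages are deliberately left out, so this would not cope with motion artifacts that fall inside the band.

```python
import numpy as np

def spectral_heart_rate(ppg, fs, band=(0.5, 3.5)):
    """Estimate heart rate (BPM) from the dominant in-band spectral peak of one window."""
    ppg = np.asarray(ppg, dtype=float)
    ppg = ppg - ppg.mean()                        # remove the DC level
    windowed = ppg * np.hanning(ppg.size)         # reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(ppg.size, d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])  # crude band-pass selection
    peak_freq = freqs[in_band][np.argmax(spectrum[in_band])]
    return 60.0 * peak_freq

# Synthetic 8-s window: 1.5 Hz pulse, strong 0.1 Hz drift, weak 5 Hz interference.
fs = 125.0
t = np.arange(0, 8, 1 / fs)
ppg = (np.sin(2 * np.pi * 1.5 * t)
       + 2.0 * np.sin(2 * np.pi * 0.1 * t)
       + 0.3 * np.sin(2 * np.pi * 5.0 * t))
hr_bpm = spectral_heart_rate(ppg, fs)
```

The frequency resolution of this estimate is the reciprocal of the window length, so an 8-s window resolves heart rate only to within 7.5 BPM unless the spectrum is interpolated or a higher-resolution estimator is used.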

This literature review summarizes the research articles related to heart rate estimation using the Photoplethysmography (PPG) method. The heart rate estimation performance is studied in the literature by evaluating average absolute error (AAE), absolute error percentage (AEP), and Pearson correlation coefficient. AAE and AEP are computed using a reference ground truth heart rate value, estimated using ECG. The performance of the heart rate algorithm is estimated using the following indexing. HRtrue(i) represents the ground truth ECG heart rate in the ith time window, and HRest(i) is the estimated heart rate value using the proposed method. The output of each proposed work was analyzed and compared in terms of mean absolute error, error percentage, and Pearson correlation coefficient.

The average absolute error is calculated by using Eq. (6).

$${\text{AAE}} = \frac{1}{W}\sum\limits_{i = 1}^{W} {\left| {HR_{est}\left( i \right) - HR_{true}\left( i \right)} \right|}$$
(6)

For a total number of windows W, the average absolute error percentage (AEP) is calculated using Eq. (7).

$$\text{AEP}\, = \,\frac{1}{W}\sum\limits_{i = 1}^{W} {\frac{{\left| {HR_{est}\left( i \right) - HR_{true}\left( i \right)} \right|}}{{HR_{true}\left( i \right)}}} \times 100$$
(7)
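The error metrics can be computed directly from per-window estimates, as in the sketch below. It reads Eqs. (6) and (7) as the per-window absolute error, and the per-window relative error in percent, averaged over W windows, and adds the Pearson correlation coefficient mentioned earlier. The heart rate values are illustrative, not results from any cited work.

```python
import numpy as np

def average_absolute_error(hr_est, hr_true):
    """AAE: mean of |HR_est(i) - HR_true(i)| over all windows, in BPM."""
    hr_est, hr_true = np.asarray(hr_est, float), np.asarray(hr_true, float)
    return float(np.mean(np.abs(hr_est - hr_true)))

def average_error_percentage(hr_est, hr_true):
    """AEP: mean of |HR_est(i) - HR_true(i)| / HR_true(i), in percent."""
    hr_est, hr_true = np.asarray(hr_est, float), np.asarray(hr_true, float)
    return float(np.mean(np.abs(hr_est - hr_true) / hr_true) * 100.0)

def pearson_r(hr_est, hr_true):
    """Pearson correlation between estimated and reference heart rates."""
    return float(np.corrcoef(hr_est, hr_true)[0, 1])

# Illustrative values for three windows (BPM).
hr_true = [70.0, 80.0, 90.0]
hr_est = [72.0, 80.0, 91.0]
aae = average_absolute_error(hr_est, hr_true)
aep = average_error_percentage(hr_est, hr_true)
r = pearson_r(hr_est, hr_true)
```

For these three windows the absolute errors are 2, 0, and 1 BPM, so the AAE is 1 BPM.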

The other set of parameters used in some works in the literature include accuracy (ACC), sensitivity (SCC), and specificity, given by Eqs. (8)–(10), respectively.

$${\text{Accuracy}}\left( {{\text{ACC}}} \right) = \frac{{\left( {TP + TN} \right)}}{{\left( {TP + TN} \right) + \left( {FP + FN} \right)}}$$
(8)
$${\text{Sensitivity}}\left( {\text{SCC}} \right) = \frac{TP}{{TP + FN}}$$
(9)
$${\text{Specificity}} = \,\frac{TN}{{FP + TN}}$$
(10)
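Eqs. (8)–(10) translate into a few lines of code once the four segment counts are known. The counts used below are illustrative only.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) as defined in Eqs. (8)-(10)."""
    accuracy = (tp + tn) / ((tp + tn) + (fp + fn))
    sensitivity = tp / (tp + fn)
    specificity = tn / (fp + tn)
    return accuracy, sensitivity, specificity

# Illustrative segment counts.
acc, sens, spec = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

Reporting sensitivity and specificity alongside accuracy matters when clean and artifact-affected segments are unevenly represented, because accuracy alone can hide poor performance on the rarer class.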

In Eqs. (8)–(10), a true positive (TP) is a clean segment that is correctly classified as clean, and a true negative (TN) is a segment affected by motion artifacts that is correctly classified as affected. A false positive (FP) is an artifact-affected segment that is incorrectly classified as clean, and a false negative (FN) is a clean segment that is incorrectly classified as artifact-affected. Table 2 summarizes the techniques proposed, along with their evaluation results.

Table 2 Summary of heart rate estimation techniques using wrist PPG technology

An algorithm to minimize the effect of motion artifacts on heart rate estimation is proposed in [69]. Owing to its lower complexity and normalization feature, a normalized least mean square (NLMS) adaptive filter is used to remove motion artifacts. The heart rate is then calculated by an autocorrelation-based fundamental period extraction unit, and a threshold-based approach is used as a post-processing step to extract heart rate information. The algorithm extracts heart rate with a correlation of more than 0.98. The accuracy of adaptive-filter denoising, however, always depends on the accuracy of the reference noise signal recorded by the accelerometer.

TROIKA [36], an algorithm consisting of signal decomposition for denoising, sparse signal reconstruction for high-resolution spectrum estimation, and spectral peak tracking, is proposed for wearable PPG devices and does not require a reference signal to estimate heart rate. TROIKA estimates heart rate in three steps. Step 1 uses signal decomposition to remove the motion artifact components. Step 2 uses sparsity-based spectrum estimation to estimate heart rate. Step 3 is a post-processing step that tracks and verifies the spectral peak related to heart rate. An AAE of 2.34 ± 0.82 BPM and an AEP of 1.80% were obtained on the IEEE SPC 12-candidate dataset. TROIKA also showed good results during physical activity.

To further improve performance, [41] proposed joint sparse spectrum reconstruction (JOSS), which modifies the TROIKA procedure to improve its accuracy. It uses the PPG and acceleration signals jointly for heart rate spectrum estimation under the multiple measurement vectors model. Motion noise is removed from the PPG signal by spectral subtraction instead of signal decomposition. Peak selection and verification are used as a post-processing step to track heart rate. The authors reported an AAE of 1.28 ± 2.61 BPM and an AEP of 1.01% ± 2.29%. JOSS thus reduces the error compared with TROIKA on the IEEE SPC 12-candidate dataset.

Despite the improved results of [36, 41], both approaches are limited by their computational complexity. A method called spectrum subtraction, peak tracking, and post-processing (SPECTRAP) is proposed in [43] to reduce the computational complexity. An asymmetric least squares spectrum subtraction approach is used to denoise the PPG signal, and Bayesian decision theory, instead of heuristic spectral peak tracking rules, is used for reliable estimation of heart rate. An AAE of 1.50 ± 1.95 BPM and an AEP of 1.12 ± 1.47% were obtained on the IEEE SPC 12-candidate dataset. SPECTRAP thus reduces computational complexity at the expense of an increase in the AEP. A method that reduces both computational complexity and AAE using a random-forest-based spectral peak tracking algorithm is proposed in [46]. The power spectral densities of the PPG data segment and the accelerometer signal are compared to remove motion artifacts. The method in [46] achieved an AAE of 1.23 ± 0.80 BPM on the IEEE SPC 12-candidate dataset and 1.65 ± 1.56 BPM on the IEEE SPC 22-candidate dataset, with reduced computational complexity and reduced AEP.

Like TROIKA, the method proposed in [42] estimates the heart rate using spectral peak tracking. It tracks multiple heart rate trajectories, whereas TROIKA uses a single heart rate trajectory. Using an adaptive noise cancellation (ANC) filter for denoising, this work obtained an AAE of 1.11 ± 2.33 BPM on the same PPG dataset used in [37]. A major disadvantage of TROIKA is that it uses singular spectrum analysis (SSA) for signal decomposition, which only partially removes the motion artifacts from a raw PPG signal. An EEMD-based algorithm to minimize the motion artifacts caused by strong physical exercise is proposed in [44]. There, EEMD is used for signal decomposition to partially remove the motion artifacts from a raw PPG signal. After this denoising step, spectrum subtraction uses the correlation between the motion-affected PPG signal and the reference noise signal to remove the remaining motion artifacts from the spectrum. This approach outperformed TROIKA in terms of AAE and AEP, which are 1.83 ± 1.21 BPM and 1.40%, respectively.

Instead of signal decomposition, an adaptive motion artifact reduction approach using an NLMS adaptive filter was proposed in [45]. A time-varying bandpass filter is used for accurate heart rate estimation; the filter updates its coefficients at constant intervals so that it can efficiently track the frequency component. With this method, the AAE and AEP were 1.71 ± 0.49 BPM and 1.41%, respectively, a 27% reduction in AAE. This approach works well when the motion artifact is weak. To improve the performance of the adaptive filter in suppressing strong motion artifacts, [47] introduces a singular value decomposition (SVD) stage before the adaptive filter to decompose the three-axis accelerometer signal, which contains different periodic components. SVD eases the convergence of the adaptive filters. The decomposed output and the reference noise signal are used by the adaptive filter to suppress the motion artifact in the PPG signal. An AAE of 1.25 ± 0.6 BPM and an AEP of 0.99% were obtained.

One major issue faced by benchmark techniques like TROIKA and JOSS is the runaway error problem. To overcome it, a hybrid approach that removes the dependence of heart rate estimation on the previous window is proposed in [33]. In this method, a two-channel PPG signal is used to estimate heart rate. The EEMD approach is used to obtain a noise-free PPG signal, and an RLS adaptive filter is used to remove motion artifacts and identify the heart rate peak. An AAE of 1.15 ± 2.37 BPM using a single channel and 1.02 ± 1.79 BPM using two channels is reported.

Despite its advantages, the method proposed in [33] does not denoise the signal effectively when the motion artifact frame lies close to the heart rate frame. A method named precise heart rate tracking (PREHEAT) is proposed in [57]; it introduces a dynamic-order correlation-based recursive least-squares (cRLS) adaptive filter to minimize the effect of motion artifacts. After denoising, wavelets are used in addition to the Fourier transform to detect the correct heart rate peak. PREHEAT achieves an improved AAE of 0.83 ± 0.96 BPM. A motion artifact removal approach based on PPG time–frequency features, named time–frequency spectra of PPG signal (TifMA), was proposed in 2016 [51] for realtime heart signal analysis. Unlike other published work on motion artifact removal and heart rate detection, TifMA tests whether noisy frames are usable for heart rate peak detection instead of deleting them. Using frequency-modulated and amplitude-modulated data from the usable signal, the algorithm accurately estimates the heart rate using a subsequent window approach. The effectiveness of TifMA was tested in terms of specificity and selectivity.

Various methods proposed in the literature to denoise a PPG signal are based on signal decomposition or adaptive filtering and fail to provide reliable results in realtime applications. An approach based on a cascaded RLS adaptive filter and EEMD is proposed in [35] to overcome the limitations posed by realtime PPG applications. The authors reported an AAE of 1.16 ± 2.23 BPM and an AEP of 0.93% on the IEEE SPC 12 candidates' dataset.

The particle filter-based algorithm for heart rate estimation using photoplethysmographic signals (PARHELIA) [48] estimates heart rate with a particle filter that tracks multiple candidates. A particle filter can help recover from an incorrect track to the correct one. PARHELIA uses the acceleration signals to update the weights of the particles in order to reduce the effects of motion artifacts. The weight update consists of three steps, namely prediction, weight calculation, and resampling. An AAE of 1.17 BPM was obtained with PARHELIA, an improvement of 8.6% over TROIKA. Another particle filter-based work [59] tracks the heart rate peak by focusing on the peaks that are consistent over time. Instead of a three-axis reference noise signal, a single reference noise signal, the one having the highest peak frequency, was used to reduce computational complexity. Instead of relying on reference characteristic points for measurement, the proposed filter takes the noisy signals as input and modifies the weights of selected particles to analyze heart information. Heart rate was estimated by detecting the highest-weight particle assigned to each window. To further refine the heart rate estimate, a fusion method was used, yielding an AAE of 1.4 ± 1.55 BPM [59].
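The three steps named above (prediction, weight calculation, and resampling) can be sketched with a generic bootstrap particle filter. This is not the PARHELIA algorithm: PARHELIA derives its weights from the acceleration signals, whereas the sketch below assumes a simple Gaussian likelihood around an observed spectral peak. The function name, the noise parameters, and the observation sequence are hypothetical.

```python
import math
import random


def particle_filter_step(particles, observed_bpm, rng,
                         process_std=3.0, obs_std=5.0):
    """One update of a bootstrap particle filter tracking heart rate (BPM)."""
    # 1) Prediction: heart rate drifts slowly between consecutive windows.
    predicted = [p + rng.gauss(0.0, process_std) for p in particles]

    # 2) Weight calculation: assumed Gaussian likelihood of the observed peak.
    weights = [math.exp(-0.5 * ((observed_bpm - p) / obs_std) ** 2)
               for p in predicted]
    total = sum(weights)
    if total == 0.0:  # all weights underflowed: fall back to uniform weights
        weights = [1.0] * len(predicted)
        total = float(len(predicted))
    weights = [w / total for w in weights]
    estimate = sum(w * p for w, p in zip(weights, predicted))

    # 3) Resampling: draw particles in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


rng = random.Random(0)
particles = [rng.uniform(60.0, 180.0) for _ in range(500)]

# Hypothetical spectral-peak observations; 175 BPM mimics a motion-induced peak.
observations = [120.0] * 10 + [175.0] + [121.0] * 10
estimates = []
for z in observations:
    particles, est = particle_filter_step(particles, z, rng)
    estimates.append(est)
```

In this toy run the single outlier pulls the estimate away from the true value, and the following consistent observations bring it back, which is the recovery behavior attributed to particle filters above.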

An algorithm named multiple reference adaptive noise cancellation technique (MURAD) is proposed in [49] to improve the effectiveness of adaptive filters for accurate heart rate estimation. In this method, the three-axis accelerometer signals and the difference between two PPG signals are used as reference noise signals. Instead of using a fixed reference noise signal for each window, the method provides the flexibility to select a reference noise signal in realtime for accurate and reliable heart rate estimation. An AAE of 0.97 ± 1.83 BPM and an AEP of 0.76 ± 1.5% were obtained with the MURAD algorithm. In [50], a different approach to separate the motion artifact spectrum and the PPG spectrum from raw PPG data is proposed. A harmonic sum model retrieves the fundamental frequency component of the reference acceleration noise signal within a short window to estimate the heart rate spectrum from a raw PPG signal. An AAE of 0.73 ± 0.83 BPM was obtained, an improvement in error performance over the methods already reported.

As observed from the literature, frequency-domain approaches like EMD [33, 44] increase the computational complexity. To overcome this issue, [70] proposes a modified EMD approach with variance characterization that identifies motion-affected periods of the PPG signal within a predefined time window. An AEP of 1.03% was obtained, demonstrating the applicability of the modified EMD approach to wearable devices [70]. A method that detects the heart rate peak frequency in a realtime environment with reduced system complexity is proposed in [53]. A distinctive property of this work is that it does not rely on heart rate information recorded in the previous window. To avoid detecting large-amplitude reference noise components as heart rate, a spectral division approach is used to separate the reference accelerometer spectra from the PPG signal. A composition of all frequency components is used to measure the highest peak frequency within the desired range. Finally, a constant-value-based jump procedure is introduced to track the heart rate in the noisy spectrum.

A new approach based on a Wiener filter and a phase vocoder, named WFPV, is proposed in [53] to overcome the computational complexity of methods that rely on heuristic rules or threshold detection for heart rate estimation. A Wiener filter is used to attenuate the effect of strong motion artifacts; it uses the reference noise signals from all three accelerometer axes. A phase vocoder allows the heart rate to be estimated over a short period. Compared with previously presented methods, WFPV improved the AAE to 1.02 BPM and the AEP to 0.81%.

In [64], a modified method to remove motion artifacts using a three-axis acceleration reference noise signal is proposed. Zeros were appended to the end of the signal so that the heart rate resolution is finer than 1 BPM when identifying the heart rate peak frequency. The heart rate is further tracked by comparing the estimated result with a predefined threshold. An AAE of 1.02 ± 0.44 BPM was obtained, which is better than most previously proposed work. Conceptually similar work was also presented in [56] to estimate the correct heart rate peak. A one-variable Kalman filter was employed to refine the heart rate value. To reduce the effect of motion noise, an SVD technique filters out a subset matrix of the noise-free PPG signal. To assess their work against [53], the authors calculated two additional parameters, the maximum absolute deviation and the standard deviation. The maximum absolute deviation assesses the algorithm's accuracy at each point in a window, whereas the standard deviation is computed over the whole window.
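The zero-padding step can be checked with a short calculation. For a sampling rate of f_s Hz and an N-point FFT, neighboring frequency bins are f_s/N Hz apart, that is 60·f_s/N BPM. The sketch below assumes a sampling rate of 125 Hz and 8 s windows purely for illustration; these values and the function names are not taken from [64].

```python
def bin_spacing_bpm(fs_hz, n_fft):
    """Spacing between neighboring FFT bins, expressed in beats per minute."""
    return 60.0 * fs_hz / n_fft


def padded_length(fs_hz, n_samples, max_spacing_bpm=1.0):
    """Smallest power-of-two FFT length with bin spacing below the target."""
    n_fft = 1
    while n_fft < n_samples or bin_spacing_bpm(fs_hz, n_fft) >= max_spacing_bpm:
        n_fft *= 2
    return n_fft


fs = 125.0                 # assumed sampling rate in Hz
window = int(8 * fs)       # assumed 8 s window, 1000 samples

spacing_unpadded = bin_spacing_bpm(fs, window)
n_fft = padded_length(fs, window)
spacing_padded = bin_spacing_bpm(fs, n_fft)
zeros_appended = n_fft - window
```

Zero-padding only interpolates the spectrum onto a finer grid. The ability to separate two closely spaced peaks is still governed by the window duration, so a motion peak close to the heart rate peak remains difficult to resolve.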

To exploit the advantages of time-domain and frequency-domain approaches simultaneously, a time–frequency based short-time Fourier spectral tracking (SFST) approach was proposed to estimate heart rate over a short period. Because the FFT provides limited resolution for studying heart rate, [54] replaced the FFT with the STFT for realtime heart rate estimation. After preprocessing, the signal is divided into short time windows using the STFT approach to reduce motion artifacts. A cyclic moving average filter is used to filter out unexpected variations in heart rate caused by complex motion artifacts. On the IEEE SPC 12 candidates' dataset, the results showed an improved AAE of 1.06 ± 0.69 BPM and an AEP of 0.94% ± 0.53%. In [55], a new method that exploits the potential of a time–frequency based approach for heart rate estimation is proposed. A combination of the outputs of an RLS adaptive filter (a time-domain approach) and SSA (a frequency-domain approach) was used to minimize the motion artifacts in [58]. By considering the previous heart rate time window, a conditional sum approach was used to avoid false estimation of heart rate. For reliable heart rate peak detection, tracking of the heart rate within a search range is implemented as a post-processing step, which resulted in an AAE of 1.16 ± 1.74 BPM.

In recent years, researchers have devoted considerable effort to low-complexity approaches for accurate heart rate estimation on wearable devices. In [58], an approach based on the random forest binary decision algorithm for accurate heart rate estimation is proposed. The binary decision algorithm decides between two algorithms used for motion artifact removal. Wavelet-based techniques were used for feature extraction. Compared with the result of a similar approach, this work obtained an AAE of 1.23 BPM with low computational complexity.

Another concern in developing wearable devices is the accurate estimation of heart rate during intensive physical activity. In [60], an algorithm to identify heart rate in a realtime environment is proposed. The main objective of this work is to remove the motion artifact spectrum that arises from physical movement across the sensor. The Wiener filtering approach was used for denoising. To handle the difficulties of heart rate estimation during intensive exercise, a finite state machine (FSM) based algorithm was used in the post-processing step to ignore inaccurate estimates. Compared with the previously reported method, an improved AAE of 0.79 ± 0.6 BPM was obtained on the IEEE SPC 23 candidates' dataset. Even though the accelerometer signals cancel out the motion artifact, they introduce gravitational acceleration error. To solve the problem of gravitational acceleration, a gyroscope is used to record the reference noise signal [40].

For heart rate estimation using wearable devices, properties like tracking ability, robustness, and computational cost are important design parameters and can be realized by a combination of adaptive filters [71]. By assigning different weights to the combined layers of an adaptive filter, the adaptive filter's denoising performance can be improved [61]. The outputs of two parallel cascaded networks were combined using a convex combination to improve the output efficiency, which depends on the choice of filters and adaptive filter parameters. A three-stage cascaded network model was proposed to filter out motion artifacts in three directions. The outputs of the cascaded RLS and cascaded LMS stages were combined using the convex combination. An AAE of 1.12 BPM was obtained on the same dataset used in [36]. Using the properties of the LMS filter, a method to minimize motion artifacts was also introduced to estimate heart rate accurately. A notch filter was used to reproduce the PPG signal from the detected heart rate peak [63]. An AAE of 0.92 BPM was obtained, an improvement over the state-of-the-art techniques.

Despite this improvement in error performance, the performance of the LMS filter depends on the adjustment of the tap weights, which is directly related to the input vector. If the input vector is not bounded, the LMS filter may suffer from gradient noise amplification due to an incorrect choice of step size. To avoid the gradient noise amplification and step size issues, an approach based on three-stage cascaded RLS, NLMS, and LMS adaptive filters is proposed in [72]. In [72], two different pairs of adaptive filters are combined using a convex combination to denoise the PPG signal effectively. Sigmoid function-based parameters assigned to each pair of adaptive filters are updated at each iteration to improve the filtering performance. An FFT-based approach is used to estimate the heart rate. The convex combination assigns a value to each combinational layer, which consists of different output combinations of the adaptive filters, and gives the maximum value to the layers that perform well in that iteration. Using the IEEE SPC 12 candidates' dataset, an AAE of 0.92 BPM is obtained. For reliable denoising and heart rate estimation, in [73] the outputs of three stages of cascaded adaptive filters are combined using the softmax normalized function. An FFT approach together with a phase vocoder estimates the heart rate value. An AAE of 1.86 BPM was obtained on large datasets, a lower error than other techniques tested on the same datasets. By combining the outputs of adaptive filters, heart rate estimation becomes more accurate, but the computational time increases. In [68], a new denoising algorithm named combination of adaptive filters using single noise reference signal (CASINOR) is proposed to reduce computational time and error values. Only RLS and NLMS adaptive filters are used to denoise the signal, and a sigmoid function is used to combine the outputs of both filters. The main feature of CASINOR is that it requires only a single reference acceleration noise signal instead of a three-direction reference noise signal. The accelerometer signal with maximum power is chosen as the reference noise signal. After spectral estimation, a phase vocoder is used to refine the heart rate peak values. Using CASINOR, an AAE of 1.92 BPM is obtained on the IEEE SPC 23 candidates' dataset.
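The idea of combining two adaptive filter outputs through a sigmoid-controlled weight can be sketched as follows. The update rule is the standard convex-combination scheme from the adaptive filtering literature and is an assumption here; it is not claimed to be the exact rule of [72] or [68]. The mixing weight is lambda = sigmoid(a), and the auxiliary variable a follows a gradient step on the squared error of the combined output. All names, signals, and parameter values are hypothetical.

```python
import math


def convex_combine(desired, out_a, out_b, mu_a=1.0, a_max=4.0):
    """Mix two filter outputs as lam * out_a + (1 - lam) * out_b."""
    a = 0.0  # auxiliary variable; lam = sigmoid(a) starts at 0.5
    combined, lambdas = [], []
    for d, ya, yb in zip(desired, out_a, out_b):
        lam = 1.0 / (1.0 + math.exp(-a))
        y = lam * ya + (1.0 - lam) * yb
        e = d - y
        # Gradient step on e**2 with respect to a; d(lam)/da = lam * (1 - lam).
        a += mu_a * e * (ya - yb) * lam * (1.0 - lam)
        a = max(-a_max, min(a_max, a))  # keep the sigmoid away from saturation
        combined.append(y)
        lambdas.append(lam)
    return combined, lambdas


def mse(x, y):
    return sum((p - q) ** 2 for p, q in zip(x, y)) / len(x)


# Toy signals: filter A leaves ten times more residual than filter B.
n_samples = 2000
desired = [math.sin(0.2 * n) for n in range(n_samples)]
residual = [math.cos(1.3 * n) for n in range(n_samples)]
out_a = [d + 0.5 * r for d, r in zip(desired, residual)]
out_b = [d + 0.05 * r for d, r in zip(desired, residual)]

combined, lambdas = convex_combine(desired, out_a, out_b)
```

In this toy case the weight moves steadily toward the better filter, so the combined output approaches the quality of filter B without knowing in advance which filter is better. The cost is that both filters must run in parallel, which is consistent with the increase in computational time noted above.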

Following the decomposition approach for denoising, a method based on VMD is introduced in [73] to study the PPG signal over short data lengths and improve heart rate estimation accuracy. To identify the heart rate peak, the PCA approach was used to select the mode most relevant to heart rate. With shorter data lengths, the decomposition method proposed in [73] identified the heart rate peak with less error. Further, to identify the accurate heart rate spectrum peak during physical exercise, a personalized deep learning approach was introduced in [74]. For accurate estimation, the algorithm was trained according to the realtime situation. An AAE of 1.47 ± 3.37 BPM was obtained on the IEEE SPC 23 candidates' dataset. In realtime, the nature of the noise cannot be predicted, so a fixed reference noise model may not work effectively for analyzing the signal. In [75], a neural network-based classification approach that separates clean segments without reference acceleration noise signals is proposed for realtime applications. The main feature of this work is that, instead of assessing the complete PPG frame, it assesses individual pulse behavior. The efficacy of the work depends on the accuracy of the reference template. In [66], a hybrid approach comprising VMD and neural network classification to estimate heart rate in a realtime environment is proposed. Besides heart rate estimation, this work identifies the morphological structure of each beat using a neural network model-based template matching feature. An AAE of 0.53 BPM was obtained on the IEEE SPC 23 candidates' dataset, an improvement over the state-of-the-art techniques. In [76], a hybrid approach to jointly estimate heart and respiratory information from the IMF spectrum is proposed. In this method, the EEMD approach is used to generate the IMFs of the desired frequency window. The PCA technique was used to extract the most relevant feature for heart rate estimation. The method showed similar results on the IEEE SPC 23 candidates' dataset, but its accuracy and reliability are reported to be far greater than those of the EEMD approach. The effectiveness of the work is reported in terms of mean and variance, with values of 99.95% and 0.0010%, respectively.

Most of the techniques presented were tested on the common IEEE SPC 2015 dataset. However, this dataset was recorded with little physical exercise, and each recording lasts less than one hour. In [75], to design a more robust system, a new dataset, PPG DaLiA, is introduced, which contains recordings of real-life daily activities with a duration of more than 36 h. The two-channel PPG signal and the three-axis accelerometer signal are first segmented into short windows of eight seconds. A fast Fourier transform is then applied to each window for heart rate estimation. A tracking step is introduced in the CNN layer to improve accuracy and reliability; it relies heavily on the correlation between subsequent windows of the heart cycle.
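The windowing and per-window spectral estimation described above can be sketched as follows. Only the eight-second window length comes from the text; the two-second shift between windows, the 25 Hz sampling rate, and the 40 to 200 BPM search range are assumptions. To stay free of external dependencies, the sketch evaluates the discrete Fourier sum directly on a 1 BPM grid instead of calling an FFT routine, and it omits the CNN-based tracking step entirely.

```python
import cmath
import math


def spectral_peak_bpm(window, fs_hz, bpm_min=40, bpm_max=200):
    """Return the candidate heart rate (BPM) with the largest spectral power."""
    mean = sum(window) / len(window)
    centered = [x - mean for x in window]  # remove the DC component
    best_bpm, best_power = bpm_min, -1.0
    for bpm in range(bpm_min, bpm_max + 1):
        omega = 2.0 * math.pi * (bpm / 60.0) / fs_hz  # radians per sample
        coeff = sum(x * cmath.exp(-1j * omega * n)
                    for n, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


def windowed_heart_rate(signal, fs_hz, window_s=8.0, hop_s=2.0):
    """Estimate one heart rate value per sliding window."""
    win = int(round(window_s * fs_hz))
    hop = int(round(hop_s * fs_hz))
    return [spectral_peak_bpm(signal[s:s + win], fs_hz)
            for s in range(0, len(signal) - win + 1, hop)]


# Synthetic 30 s pulse at 1.5 Hz (90 BPM), free of motion artifacts.
fs = 25.0
ppg = [math.sin(2 * math.pi * 1.5 * n / fs) for n in range(750)]
estimates = windowed_heart_rate(ppg, fs)
```

On real recordings the strongest peak in a window is often a motion component rather than the pulse, which is why the reviewed methods add denoising before this step and tracking after it.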

In [62], to reduce the computational complexity of benchmark techniques [44], an SVD-based algorithm to estimate heart rate from a motion-corrupted raw PPG signal is introduced. A genetic algorithm was used to optimize the parameters of the heart rate tracking step so that it can deal with different motion artifact cases. A KNN classifier is used to detect the intensity of physical activity from the acceleration signal. The approach proposed in [62] produced comparable results but required less complex processing stages. An AAE of 2.17 BPM was obtained on the same dataset as [42]. One more neural network-based technique for heart rate estimation is introduced in [38]; it uses an eight-layer filter model to track the heart rate. A Gaussian distribution function is used to improve the accuracy of the estimated signal. Complex mathematical calculations limit the use of the eight-layer filter model in realtime heart rate analysis.

A method based on the power spectrum of the desired signal is proposed in [65] to improve mean absolute error performance. This approach uses the signal's power to measure an accurate heart rate peak during body movement. Estimating the true heart rate of the present window depends on the accuracy of the previous window; hence the crest factor property of FSM is used to check the response of heart rate in the subsequent window. The mean value of the previous heart rate window, in terms of the Gaussian kernel function, is multiplied by the current time window to improve the SNR. Improved results, an AAE of 1.20 BPM and an AEP of 1.05%, were obtained on the IEEE SPC 23 candidates' dataset. After considering the problems faced by time- and frequency-based approaches, [67] proposes a modified approach that uses both PPG modes simultaneously to reduce the effect of noise. The effect of noise on the PPG signal depends on the penetration depth of the light used to capture the signal. A total of six sensors of different wavelengths were used to illuminate the skin. Of the six sensors, four were used in reflection mode and two in transmissive mode. Separate LEDs were used because the transmissive mode needs a light source that penetrates deeper into the skin. Blue, green, and infrared light showed superior results compared to other light sources for estimating heart signals.

6 Challenges and Discussion

In the last decade, monitoring cardiovascular health has become an essential feature for the early diagnosis and prevention of cardiovascular diseases. Due to a lack of efficient monitoring tools, the mortality rate due to cardiovascular diseases increases year by year. To prevent any accidents related to cardiovascular disease, personal health monitoring devices are gaining importance. Therefore the demand for battery-operated wearable sensing devices is ever increasing. Wearable devices with PPG sensor technology will give people the flexibility to measure their health status at any time and any place.

Based on the literature review, PPG technology can monitor heart rate in wearable devices like bands and watches. The accuracy of wearable PPG-based monitoring tools suffers from motion artifacts. Researchers have devoted a lot of effort to tackling motion artifacts in order to design accurate and reliable monitoring tools for the healthcare system. We have highlighted the algorithms proposed to reduce the effect of motion artifacts on the PPG signal. In the literature, time-domain approaches like adaptive filtering and frequency-domain approaches like signal decomposition are used for denoising. Later, some methods combined the positive features of both techniques to provide accurate results. Signal decomposition-based techniques can give noise-free signals, but they face computational complexity problems.

On the other hand, adaptive noise cancellation shows reliable results only when the reference noise signals correlate highly with the motion spectrum, which is not possible in realtime. In addition, the methods proposed for heart rate estimation using PPG give inaccurate results if the noise spectrum lies close to the heart rate peak. Moreover, due to the non-stationary nature of biological signals, Fourier-based heart rate estimation does not provide reliable results either.

Despite the outstanding progress of the past few years in removing motion artifacts from the PPG signal, discussed in section (III), an effective and computationally efficient motion artifact removal algorithm is still in great demand. Many issues therefore remain to be resolved before a realtime continuous PPG-based method can be implemented to monitor cardiovascular behavior during physical activities.

7 Conclusion

This paper presented a comprehensive review of the potential of Photoplethysmography technology in biomedical signal processing, covering state-of-the-art research on suppressing motion artifacts and estimating heart rate with PPG-enabled wearable devices. Over the last decade, the death rate worldwide due to cardiovascular diseases has been increasing. This rise is due to rapidly changing lifestyles, stress levels, and food habits across the world. To reduce the risk of cardiovascular diseases, frequent medical checkups are needed for continuous assessment, so regular monitoring of cardiovascular health status is important for early diagnosis and timely treatment. The need for portable and wearable devices for early diagnosis is therefore growing. Due to their small size and low cost, PPG sensor-based wearable devices have shown their potential as future health monitoring devices. This review summarized different techniques proposed in the last ten years for noise suppression and heart rate estimation with PPG technology. Some of the methods were computationally inefficient, and others were inefficient under realtime monitoring. Despite the many advantages of the Photoplethysmography sensor, it can produce erroneous data in certain circumstances, and one of the main causes is the occurrence of motion artifacts. The role of the PPG sensor in extracting vital information is therefore limited by motion artifacts. A reliable health monitoring device in a realtime environment requires signal processing algorithms that effectively remove motion artifacts and are computationally efficient.