1 Introduction

During labor, a decrease in fetal oxygen supply can lead to fetal acidosis, which can then lead to serious neonatal complications and neurological damage, such as intellectual or motor impairments. One such complication, cerebral palsy, is the most commonly associated long-term neurological complication [1], and 10–20% of the incidence of cerebral palsy is secondary to intrapartum hypoxia [2].

In clinical practice, fetal well-being is usually assessed through visual analysis of a fetal heart rate (FHR) tracing. This analysis is subjective and prone to high inter- and intraobserver variability, which results in frequent false-positive interpretations [3,4,5] despite the publication of new guidelines to harmonize FHR interpretation [6]. Second-line methods can also be used, such as fetal blood sampling or ST analysis, but these invasive methods have not been shown to reduce the need for an operative delivery or neonatal morbidity [7, 8]. Therefore, it seems important to develop new noninvasive and automated methods to avoid interobserver variability.

One new method is the analysis of the fetal autonomic nervous system (ANS). Knowledge about fetal physiology is crucial for improving the interpretation of FHR changes during labor [9]. The FHR is affected by the ANS through chemoreceptors and baroreceptors that are sensitive to fetal hypoxia [10, 11]. Analysis of heart rate variability (HRV) explores the changes in ANS activity [12] that regulate the cardiovascular system. Impairment of ANS activity in cases of fetal acidosis has been demonstrated [11].

Several HRV analysis methods are used, such as frequency- or time-domain analysis [13]. Time-domain analysis includes indexes such as long-term variability (LTV) and short-term variability (STV), which have been specifically elaborated for fetal HRV analysis [14]. These beat-to-beat indexes are favored in clinical practice, and STV is widely accepted as a significant index for assessing fetal well-being [13, 15]. STV is used mainly as a predictor of hypoxia during the antenatal period [16]; a few studies have analyzed STV during labor but have produced discordant results [17, 18].

The ANS is one of the main mediators of increased fetal HRV [19]. Our team has developed a new index, the fetal stress index (FSI), which is based on an original method of HRV analysis. In previous experimental studies, we have shown that this index correlated well with fetal acidosis and reflected parasympathetic fluctuations more specifically than the usual HRV markers [11, 20, 21]. Those studies were performed in a sheep model in which hypoxia was induced by intermittent cord compression, with neither uterine contractions nor maternal pushing, both of which could interfere with FHR signal quality. We were interested in evaluating the accuracy of the FSI and other markers during labor.

The primary objective of this study was to evaluate the ability of FSI to predict neonatal acidosis. Secondary objectives were to evaluate the ability of other commonly used HRV methods (LTV and STV) to predict acidosis and to develop a multiparameter model to improve this predictive ability.

2 Methods

2.1 Database

We used the open-access intrapartum cardiotocography (CTG) database published by Chudáček et al. [22]. The database comprises 552 intrapartum recordings along with additional maternal and neonatal outcome data. The recordings started no more than 90 min before delivery and were at most 90 min long. The second stage of labor did not exceed 30 min.

The CTG data were recorded using Avalon FM40 and FM50 (Philips Healthcare, Andover, MA) and STAN S21 and S31 (Neoventa Medical, Mölndal, Sweden) fetal monitors [22]. Each CTG record contains time information and signals for the FHR and uterine contractions, both of which were sampled at 4 Hz.

2.2 HRV analysis

We have previously described the algorithm used for the calculation of our new FSI index [23,24,25]. It uses a spectral analysis (wavelet transform) to filter the signal and keeps only high-frequency oscillations. It then computes the magnitude of the oscillations in the time domain. The method includes detection of each heartbeat to construct the RR series, which is isolated in a 64-s moving window, normalized, and high-pass filtered above 0.15 Hz using a wavelet-based numerical filter. The remaining oscillation magnitudes are computed by plotting the local minima and maxima. The area between the upper and lower envelopes is divided into four subareas A1, A2, A3, and A4, and the area under the curve (AUC) minimum (AUCmin) is defined as the minimum of the four subareas. FSI is then computed as FSI = a × AUCmin + b, where a = 39.84 and b = 9.38, and a and b are two constants determined using a dataset of 200 RR series records (Fig. 1).
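The last steps of the computation described above (envelopes, four subareas, minimum, linear rescaling) can be illustrated with a minimal Python sketch. This is not the published implementation. It assumes that the 64-s window has already been normalized and high-pass filtered, that the series is uniformly sampled at a hypothetical rate fs, that the envelopes are linear interpolations through the local extrema anchored at the window edges, and that the four subareas are equal-duration quarters of the window. Only the constants a and b are taken from the text; the function names are illustrative.

```python
from typing import List, Sequence, Tuple

A_COEF = 39.84  # constant a, as given in the text
B_COEF = 9.38   # constant b, as given in the text


def _envelope(x: Sequence[float], keep_max: bool) -> List[float]:
    """Linear interpolation through the local maxima (or minima) of x.

    Assumption: the first and last samples anchor the envelope.
    """
    n = len(x)
    idx = [0]
    for i in range(1, n - 1):
        is_peak = x[i] >= x[i - 1] and x[i] > x[i + 1]
        is_trough = x[i] <= x[i - 1] and x[i] < x[i + 1]
        if (keep_max and is_peak) or (not keep_max and is_trough):
            idx.append(i)
    idx.append(n - 1)
    env = [0.0] * n
    for left, right in zip(idx, idx[1:]):
        for i in range(left, right + 1):
            w = (i - left) / (right - left)
            env[i] = (1 - w) * x[left] + w * x[right]
    return env


def fsi_from_window(x: Sequence[float], fs: float) -> Tuple[float, List[float]]:
    """Return (FSI, [A1, A2, A3, A4]) for one pre-filtered window.

    x  : normalized, high-pass-filtered RR series (assumed uniform sampling)
    fs : sampling rate of x in Hz (hypothetical parameter)
    """
    upper = _envelope(x, keep_max=True)
    lower = _envelope(x, keep_max=False)
    gap = [u - l for u, l in zip(upper, lower)]
    n = len(gap)
    # Assumption: four equal-duration quarters of the window.
    bounds = [round(k * (n - 1) / 4) for k in range(5)]
    areas = []
    for start, stop in zip(bounds, bounds[1:]):
        # Trapezoidal rule on the gap between the two envelopes.
        area = sum((gap[i] + gap[i + 1]) / 2 for i in range(start, stop)) / fs
        areas.append(area)
    auc_min = min(areas)
    return A_COEF * auc_min + B_COEF, areas
```

For a perfectly flat window the two envelopes coincide, every subarea is zero, and the function returns b. The units of the subareas depend on the normalization used upstream, which this sketch does not reproduce.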

Fig. 1 FSI computation. Normalized and filtered RR series. Areas A1, A2, A3, and A4 are computed between the lower and upper envelopes (grid area). The smallest area is then selected (A1) and the FSI is computed

The STV and LTV indexes for HRV were calculated using the algorithm published by Dawes and Redman [26]. STV is defined as the mean FHR difference between successive 3.75-s R–R interval epochs for 1 min. LTV, or mean minute range, refers to fluctuations in the FHR over seconds and is defined as the difference between the minimum and maximum value of the mean FHR of the different epochs for 1 min [27].
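As an illustration of these two definitions, the following Python sketch computes STV and LTV for one minute of FHR sampled at 4 Hz, so that each 3.75-s epoch holds 15 samples and one minute holds 16 epochs. It is a simplified reading of the definitions above, not the full Dawes and Redman algorithm. In particular, taking the absolute value of the successive differences is an assumption, and the function names are illustrative.

```python
from statistics import mean
from typing import List, Sequence, Tuple

FS_HZ = 4                                  # sampling rate of the database signals
EPOCH_S = 3.75                             # epoch length from the text
SAMPLES_PER_EPOCH = int(FS_HZ * EPOCH_S)   # 15 samples per epoch
EPOCHS_PER_MIN = 16                        # 60 s / 3.75 s


def epoch_means(fhr_1min: Sequence[float]) -> List[float]:
    """Mean FHR of each 3.75-s epoch in exactly one minute of signal."""
    expected = SAMPLES_PER_EPOCH * EPOCHS_PER_MIN
    if len(fhr_1min) != expected:
        raise ValueError(f"expected {expected} samples, got {len(fhr_1min)}")
    return [
        mean(fhr_1min[k * SAMPLES_PER_EPOCH:(k + 1) * SAMPLES_PER_EPOCH])
        for k in range(EPOCHS_PER_MIN)
    ]


def stv_ltv(fhr_1min: Sequence[float]) -> Tuple[float, float]:
    """Return (STV, LTV) for one minute, in the units of the input."""
    m = epoch_means(fhr_1min)
    # STV: mean absolute difference between successive epoch means
    # (the absolute value is an assumption of this sketch).
    stv = mean([abs(b - a) for a, b in zip(m, m[1:])])
    # LTV: range of the epoch means over the minute.
    ltv = max(m) - min(m)
    return stv, ltv
```

Both indexes are returned in the units of the input series. The sketch does not handle signal loss or exclude decelerations, which a complete implementation would need to address.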

Continuous computation of the HRV indexes was achieved by sliding the analysis window in 1-s steps. The indexes were then averaged over 3 min.
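The sliding-window scheme can be sketched as follows. The text does not specify whether the 3-min averages are running or block averages; this sketch assumes non-overlapping blocks. The function names and the index_fn callback are hypothetical.

```python
from statistics import mean
from typing import Callable, List, Sequence


def sliding_index(
    signal: Sequence[float],
    fs: float,
    window_s: float,
    index_fn: Callable[[Sequence[float]], float],
    step_s: float = 1.0,
) -> List[float]:
    """Evaluate index_fn on windows of window_s seconds, moved by step_s seconds."""
    win = int(round(window_s * fs))
    step = int(round(step_s * fs))
    return [
        index_fn(signal[start:start + win])
        for start in range(0, len(signal) - win + 1, step)
    ]


def average_over(
    values: Sequence[float], span_s: float = 180.0, step_s: float = 1.0
) -> List[float]:
    """Average index values over non-overlapping blocks of span_s seconds.

    Assumption: block averaging; a running mean would be an equally
    plausible reading of the text.
    """
    n = int(round(span_s / step_s))
    return [mean(values[i:i + n]) for i in range(0, len(values) - n + 1, n)]
```

With a 1-s step, each 3-min block contains 180 consecutive index values.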

2.3 Data analysis

At the end of labor, uterine contractions are stronger and more frequent, which causes marked FHR decelerations. To avoid any mathematical artifacts generated by those sudden, large FHR changes, we performed our analysis in the last hour before the beginning of maternal pushing during the second stage or before the decision to initiate cesarean delivery. Our analysis was performed on a 5-min uninterrupted stable period. The choice to use 5 min of continuous measurement was based on other studies [28, 29] and was considered necessary for reliable computation of HRV. The stable period was selected visually to exclude false identification of R peaks and to limit the effect of poor signal quality. It was defined as a period with no accelerations or decelerations in FHR and with an LTV < 50, which reflects the stability of FHR during the period.

We computed the mean FHR, FSI, STV, and LTV in this 5-min period. We also analyzed the maximum FSI peak in the record. In an experimental sheep model, the FSI increased during fetal acidosis because of increased parasympathetic nervous system activity [21, 23]. We, therefore, decided to compute the mean FHR, FSI, STV, and LTV at the FSI max peak to provide better discrimination. The FHR was also analyzed using the International Federation of Gynecology and Obstetrics (FIGO) classification. The FIGO evaluator was blind to the neonatal outcomes. To evaluate the discriminative performance of the multivariate associations, we used a factorial discriminant analysis for both stable and FSI max periods.

2.4 Statistical analysis

Records were separated into two groups according to a pH threshold of 7.10. The data were compared between groups using Student’s t test. A p-value < 0.05 was considered significant. The area under the receiver-operating characteristic (ROC) curve was computed for variables showing significant differences. The data are presented as mean ± standard deviation.
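The grouping and the two statistics can be illustrated with a short Python sketch using only the standard library. It is not the analysis code used in the study. It assumes the pooled-variance form of Student's t statistic and computes the ROC AUC through its equivalence with the Mann-Whitney statistic, counting ties as one half. The p-value is not computed here and would normally be obtained from a statistics package. The function names are illustrative.

```python
from math import sqrt
from statistics import mean, variance
from typing import List, Sequence, Tuple

PH_THRESHOLD = 7.10  # threshold from the text


def split_by_ph(
    values: Sequence[float], ph: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Split index values into (pH <= 7.10, pH > 7.10) groups."""
    acidotic = [v for v, p in zip(values, ph) if p <= PH_THRESHOLD]
    control = [v for v, p in zip(values, ph) if p > PH_THRESHOLD]
    return acidotic, control


def student_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample Student's t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def roc_auc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """AUC as the probability that a positive case scores above a negative one.

    Ties count as one half (Mann-Whitney formulation).
    """
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Here the acidotic group (pH ≤ 7.10) is treated as the positive class, so an AUC above 0.5 means that higher index values are associated with acidosis.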

2.5 Ethical approval

The ethics statement is explained in the paper by Chudáček et al. [22]. The CTG recordings and clinical data were matched using anonymized unique identifiers generated by the hospital information system. The timings of the CTG records were matched to the stages of labor (first and second stage) and were made relative to the time of birth, and were also deidentified. This study was approved by the Institutional Review Board of University Hospital Brno, and all women signed an informed consent form.

3 Results

The database included 552 intrapartum recordings. Eighty (14.5%) were excluded because of poor FHR signal quality and 33 (6.0%) because of the absence of a 5-min stable period (i.e., a period without any acceleration or deceleration and with LTV < 50). The 439 (79.5%) remaining records were then separated into two groups based on the pH value. The group with pH > 7.10 included 396 (90.2%) records, and the group with pH ≤ 7.10 included 43 (9.8%) records (Fig. 2). Table 1 presents the patients’ and infants’ characteristics for the two groups.

Fig. 2 Flow chart

Table 1 Characteristics of the two pH groups

FHR, FSI, STV, and LTV did not differ significantly between the two groups when calculated during the 5-min stable period (Table 2). At the FSI max peak, FHR, LTV, and STV were significantly higher for the group with pH ≤ 7.10 (p = 0.012, p = 0.010, and p = 0.037, respectively). FSI at the FSI max peak did not differ significantly between groups (Table 3).

Table 2 Comparison between HRV markers and pH groups during the stable period
Table 3 Comparison between HRV markers and pH groups during FSI max peak

The AUC for FIGO to distinguish between the pH ≤ 7.10 and pH > 7.10 groups was 0.569. The AUC values were 0.595 for STV and 0.622 for LTV.

A factorial discriminant analysis was also conducted to seek a better predictor of acidosis. The combination of FHR, FSI, LTV, and STV at the FSI max peak had an AUC of 0.713 for the prediction of acidosis (Fig. 3). Adding the FIGO score to this multivariate model increased the AUC to 0.719 (Fig. 4).

Fig. 3 ROC curve analysis in the multivariable model (FSI, FHR, LTV, STV) to predict acidosis for pH ≤ 7.10 vs pH > 7.10

Fig. 4 ROC curve analysis in the multivariable model (FIGO, FSI, FHR, LTV, STV) to predict acidosis for pH ≤ 7.10 vs pH > 7.10

4 Discussion

Fetal monitoring during labor aims to detect fetuses at risk of acidosis. Analysis of the ANS using HRV is one promising solution. We evaluated our new index, which is based on the analysis of ANS activity, together with other automated HRV analysis methods. We found no significant differences in the FSI between pH groups. Although none of the analysis methods showed significant differences between pH groups when calculated during the stable period, the FHR, LTV, and STV calculated during the FSI max peak were significantly higher in the group with pH ≤ 7.10, and STV and LTV had a better ability to predict acidosis than the FIGO classification. The multiparametric model that included FHR, FSI, STV, and LTV provided even better discrimination.

These results for our new index were disappointing. In previous experimental studies in a sheep model, we showed that FSI correlated strongly with fetal acidosis [21, 23]. The main reason for this lack of conclusive results in our current data may be the lack of precision in the FHR signal. The FSI evaluates high-frequency content, which requires highly accurate beat-to-beat FHR series [30]. However, most of the analyzed records were from the CTG (320/439), which provides averaged and resampled FHR values (averaged over 3–5 beats and sampled at 4 Hz). Van Laar et al. stated that complete spectral information about HRV is reliable only if FHR data are acquired on a beat-to-beat basis and from direct fetal electrocardiogram (ECG) signals measured with a scalp electrode [30]. A reliable beat-to-beat measurement of FHR may therefore be necessary to study the FSI fully.

STV and LTV are commonly used during the antenatal period to monitor growth-restricted fetuses. The value of STV during labor is debated [31]. Our study showed no significant differences between groups when the indexes were calculated during the stable period. When calculated during the FSI max peak, STV and LTV were significantly higher in the pH ≤ 7.10 group and had a better ability to predict acidosis than the FIGO classification. These results are consistent with the study by Lu et al., which reported an STV elevation when lactate level increased [17]. In their study, they used a modified algorithm that excluded decelerations because they hypothesized that these would affect the results. The FSI max period reflects the peak of parasympathetic ANS activity, which may explain why we found significant results only during this period, especially for STV, which also reflects high-frequency variability.

In another study with FHR signals obtained with CTG, Butruille et al. compared the FSI between infants with normal and low pH values at birth. Even with its low predictive capacity, the FSI was significantly lower in the group with the lower pH, which may be interpreted to indicate a decrease in parasympathetic tone in the acidotic fetus [24]. These results were consistent with those of Van Laar et al., who reported a decrease in the normalized high-frequency content in acidotic fetuses [11]. These two studies were performed during the final 30 min of labor, which usually includes many FHR decelerations, especially in the acidotic fetus, and this may explain the decrease in the normalized high-frequency content. By contrast, we found an increase in STV, which is also a marker of high-frequency variability (i.e., parasympathetic tone), in the acidotic fetus. However, our analysis was performed during an earlier stage of labor, which does not normally include any heart rate deceleration or acceleration. This result suggests that the HRV indexes should be interpreted carefully in the context of the period of measurement and that HRV analysis may be more relevant physiologically when evaluated during periods without FHR deceleration or acceleration.

One way to improve the prediction of neonatal acidosis may be the use of a multiparametric model. By combining FHR, FSI, STV, and LTV, with or without the FIGO classification, we observed a better prediction of neonatal acidosis than with the use of the FIGO classification, STV, or LTV alone. Signorini et al. proposed a multiparametric FHR analysis that included spectral parameters from autoregressive models and nonlinear algorithms, and found that their model was better for distinguishing normal from abnormal fetuses in the antenatal period [32]. The use of a multiparametric model seems to improve the prediction of acidosis. However, these results should be confirmed in a larger population with additional HRV analysis methods such as spectral and nonlinear methods.

This study has some limitations. The FIGO classification of FHR tracings was performed by only one observer. However, this observer was blinded to the gasometric and neonatal outcomes. Ayres de Campos et al. showed that knowledge of an adverse neonatal outcome leads to significantly more frequent identification of abnormal CTG features and therefore to a more severe classification of the intrapartum CTG [33]. For the HRV computation period, the indexes were measured during a 5-min period without any deceleration or acceleration to eliminate any risk of a mathematical artifact. In our study, we selected the stable period visually, and it would be interesting to implement a new algorithm that could detect these periods automatically and fully automate the computation process. However, the need for 5-min periods without decelerations is a major limitation in terms of clinical usefulness, especially during the second stage of labor, and this duration should be reduced to 2–3 min. Moreover, other factors such as medication, maternal characteristics (parity, obesity, and scarred uterus), and labor characteristics (temperature, duration, and meconium) were not included in our prediction model because of the sample size of our population. We also chose neonatal acidosis as the endpoint, which could be debated because the majority of severely acidotic fetuses at birth are likely to be normal, require no admission to the ICU, and have normal Apgar scores. Similarly, most newborns delivered for “fetal distress” are not acidotic at delivery [34]. However, this database does not include complete neonatal outcomes (NICU admission, intubation, respiratory distress…), and only 16 neonates had an Apgar score lower than 7 at 5 min.

5 Conclusion

Although we found no differences in the FSI between acidotic and control fetuses, this study suggests that measuring fetal HRV provides relevant information about acidosis during labor and that a multiparametric approach combining several indexes significantly increases the ability to predict acidosis. The commonly used Doppler ultrasound technique does not reflect the real beat-to-beat variability, which must be known for efficient HRV analysis. The use of scalp electrodes or noninvasive transabdominal fetal ECG allowing beat-to-beat HRV analysis may improve the ability to monitor fetal pH [35].